Get more information about the benefits of public vs. private vs. hybrid cloud options.
Find Out:
• What is hybrid cloud and how you can use it.
• Loggly's use case and learnings
• Why go hybrid?
This deck was originally presented at #StackWorld16.
Why Go Hybrid Cloud?
1. | Log management as a service Simplify Log Management #LDFE
Manoj Chaudhary
CTO & VP of Engineering at
Loggly
2. | Log management as a service Reveal What Matters | @loggly - #StackWorld16
Today’s Talk
Hybrid Cloud
Our Use Case
and Learnings
Go Hybrid
3. | Log management as a service Reveal What Matters | @loggly - #StackWorld16
Hybrid Cloud: Private Cloud + Connect + Public Cloud
Private Cloud
Operated in your own data center. Typically within a firewall for organizations.
Public Cloud
Operated by a 3rd party provider. Utilized for the workloads that are not core for the private cloud.
Delivered as a cloud service over the Internet. Sold on demand, typically by the minute or the hour. Customers only pay for the CPU cycles, storage or bandwidth they consume.
Connect
Controlled and maintained by Ops to ensure interoperability with private cloud.
Hybrid Cloud
- BEST OF BOTH WORLDS -
This model offers versatility and convenience, while preserving management, control and security.
A cloud computing environment that uses a mix of on-premises, private cloud and third-party, public cloud services with orchestration between the two platforms.
4. | Log management as a service Reveal What Matters | @loggly - #StackWorld16
Loggly’s Use Case
Use Case
• Centralized log management
• Allow customers to analyze large amounts of data
• Real-time processing of data
Our Challenges
• Massive incoming event stream
• Fundamentally multi-tenant
• Near real-time indexing
• Near real-time searches
• Near real-time alerts
Our Data Processing Stack
• Collector
• Kafka
• Elasticsearch
• Redis/Memcached
• Nginx
5. | Log management as a service Reveal What Matters | @loggly - #StackWorld16
Loggly’s Big Data Pipeline
[Diagram: multiple Event Processing stages connected through a Kafka queue]
6. | Log management as a service Reveal What Matters | @loggly - #StackWorld16
Loggly’s Big Data Pipeline
[Diagram: data flows from Kafka into multi-tiered Elasticsearch clusters]
7. | Log management as a service Reveal What Matters | @loggly - #StackWorld16
Loggly’s Learnings: 1st Deployment
Our First Attempt Was Public Cloud
Why Did We Try Public Cloud?
• Fastest route to go live
• Less upfront cost
• We could learn about application behavior and needs
Findings
• Our resources grow consistently
• Our workload requires extensive compute, NIO and IO
• Reliability and consistent performance are key
• Public Cloud - Not cost efficient for our model
8. | Log management as a service Reveal What Matters | @loggly - #StackWorld16
Go Hybrid
Hybrid Cloud: Private Cloud, Connect, Public Cloud
Why Hybrid?
• Uses the right cloud for the right kind of workload
• Gives us more control over hardware components, compute and failover options
• Reliability - use of resources is predictable
• Suits our infrastructure growth
• Cost efficient for big data workloads
Public vs. Private Cloud Services
• There isn’t one answer on how to decide. It is entirely based on needs.
Recommendations
Public Cloud
• All internet-facing services
• All services which can burst and need high elasticity
Private Cloud
• Big data processing services
• Services that process sensitive data
9. | Log management as a service Reveal What Matters | @loggly - #StackWorld16
Reach Out to Me!
Reach me at manoj@loggly.com
Blogs at http://bit.ly/ManojBlogs
About Us:
Loggly is the world’s most popular cloud-based log management solution, used by more
than 5,000 happy customers to effortlessly spot problems in real-time, easily pinpoint root
causes and resolve issues faster to ensure application success.
Try Loggly for Free! → https://www.loggly.com/
Visit us at loggly.com or follow @loggly on Twitter.
Editor's Notes
What we will talk about today:
What hybrid cloud is and how it is used.
Loggly’s use case and learnings.
Why go hybrid.
A private cloud generally runs in a company’s own IT-owned data center, behind the firewall, and serves a single organization, whereas a public cloud is operated by a third-party provider such as Amazon AWS or Google App Engine.
A public cloud is one based on the standard cloud computing model, in which a service provider makes resources, such as applications and storage, available for consumption over the Internet.
The main benefits of using a public cloud service are:
• Easy and inexpensive set-up because hardware, application and bandwidth costs are covered by the provider.
• Scalability to meet needs.
• No wasted resources because you pay for what you use.
Hybrid cloud is a combination of public cloud services and on-premises private cloud – with orchestration and automation between the two.
For example, an enterprise can deploy an on-premises private cloud to host sensitive or critical workloads, but use a third-party public cloud provider, such as Google Compute Engine, to host less-critical resources, such as test and development workloads. To hold customer-facing archival and backup data, a hybrid cloud could also use Amazon Simple Storage Service (Amazon S3).
As another example, companies can run mission-critical workloads or sensitive applications on the private cloud while using the public cloud for bursty workloads that must scale on demand.
The goal of hybrid cloud is to create a unified, automated, scalable environment which takes advantage of all that a public cloud infrastructure can provide, while still maintaining control over mission-critical data.
Loggly Use Case
We offer our customers a centralized log management system. What does that mean?
It means we allow our customers to debug their applications from one centralized place, the browser, and to analyze large amounts of data there, so that customers don’t have to write custom scripts and maintain those scripts as the logs change.
Our use case differs from the typical big data use case, where companies collect data, dump it into big data systems like HDFS or Hadoop, and then process it offline. Our use case is to process the data in real time. Our customers expect to see their data, ready to analyze, as soon as it leaves their premises.
A few high-level challenges for us:
We are a fundamentally multi-tenant system with a massive incoming stream of data. Our customers send an enormous amount of logs to us every day.
As I mentioned, we have to be real time for indexing, searches and alerts. When our customers face an issue in production, if the logs or log searches lag behind, they no longer match production and are not very useful to them.
First Attempt:
Our first attempt was public cloud. We deployed the application on public cloud. It was the best course of action for a few reasons:
It was the fastest way to get to production.
Upfront cost was lower and TCO was better.
It helped us understand the infrastructure.
Our Findings:
Our resources grow consistently. Our workload always goes up and our growth is predictable, so planning can be done easily. It is not very elastic in nature; seasonality is not a big factor for us, unlike e-commerce sites, where demand spikes around Thanksgiving and New Year.
Ours is a big data workload, so it requires very high compute, IO and NIO resources.
Consistent performance and reliability of the stack are key. This is of utmost importance: when customers have an issue in production, that is when they need us most to debug their application.
Since we need very high compute and IO, public cloud was not very cost efficient for our workload.
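The cost finding above can be illustrated with some simple arithmetic. This is only a sketch: the function names and every number below are hypothetical and are not Loggly's figures. It assumes a workload that keeps all nodes busy all month, which is the steady, non-seasonal pattern the notes describe.

```python
def monthly_cost_on_demand(nodes: int, hourly_rate: float, hours: float = 730.0) -> float:
    """Pay-per-use cost: every node is billed for every hour it runs."""
    return nodes * hourly_rate * hours


def monthly_cost_owned(nodes: int, purchase_price: float,
                       amortization_months: int,
                       monthly_opex_per_node: float) -> float:
    """Owned hardware: purchase price spread over its lifetime plus running costs."""
    return nodes * (purchase_price / amortization_months + monthly_opex_per_node)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
nodes = 100
on_demand = monthly_cost_on_demand(nodes, hourly_rate=1.0)
owned = monthly_cost_owned(nodes, purchase_price=7200.0,
                           amortization_months=36,
                           monthly_opex_per_node=100.0)
print(f"on-demand: {on_demand:.0f}  owned: {owned:.0f}")
```

With an always-on workload the on-demand bill grows with every hour of use, while the owned-hardware figure stays flat. The comparison could go the other way for a workload that is idle most of the month.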
This is our log ingestion architecture.
The first point the data hits when it arrives from customers is the collector. Collectors sit on the edge of our network. They are designed so that customers can throw logs at us and the logs get collected and persisted. In short, the collector has two design goals: collect logs as fast as possible and persist the data to disk equally fast.
Logs are ingested by the collector over TCP, UDP, HTTP or HTTPS. The collector literally works at network speed. It is written in C++. Once a log is collected we process it, and most of our secret sauce is here in event processing. While we process, we make sure that no event gets dropped at any point. Once processed, we put the log back into Kafka.
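The "collect, then persist to disk" goal of the collector can be sketched as follows. This is not Loggly's collector (which the notes say is written in C++); the class name `SpoolingCollector`, the spool file name and the sample events are all hypothetical, and network handling is left out so that only the persistence step is shown.

```python
import os
import tempfile


class SpoolingCollector:
    """Hypothetical collector: append each event to a spool file before returning."""

    def __init__(self, spool_path: str) -> None:
        self._file = open(spool_path, "ab")

    def ingest(self, raw_event: bytes) -> int:
        """Persist one event and return the number of bytes written."""
        record = raw_event.rstrip(b"\n") + b"\n"   # one event per line
        self._file.write(record)
        self._file.flush()
        os.fsync(self._file.fileno())              # force the bytes to disk
        return len(record)

    def close(self) -> None:
        self._file.close()


spool_dir = tempfile.mkdtemp()
spool_path = os.path.join(spool_dir, "events.spool")

collector = SpoolingCollector(spool_path)
collector.ingest(b"app=web level=error msg=timeout")
collector.ingest(b"app=db level=info msg=started\n")
collector.close()

with open(spool_path, "rb") as spool:
    persisted = spool.read().splitlines()
print(persisted)
```

Calling `fsync` for every event is the simplest way to show durability, but it is slow. A collector working at network speed would more plausibly batch its writes; that trade-off is outside this sketch.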
Writers then read data from Kafka and write it to Elasticsearch (ES) in near real time (NRT). Customers search these logs in NRT from ES. The writers constantly pull data from the Kafka queues in batches and update ES using the ES bulk API, which helps us with NRT. The key is that it is a pull model: if the producers produce more than the writers can write, the data stays in Kafka a little longer, but data never gets dropped.
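The pull model described above can be sketched with an in-memory queue standing in for Kafka and a plain list standing in for the ES bulk API. The function `drain_in_batches` and its parameters are hypothetical names for illustration; no Kafka or Elasticsearch client is used here.

```python
from collections import deque
from typing import Callable, Deque, List


def drain_in_batches(queue: Deque[str],
                     bulk_write: Callable[[List[str]], None],
                     batch_size: int,
                     max_batches: int) -> int:
    """Pull up to max_batches batches from the queue and hand each to bulk_write.

    Events leave the queue only after bulk_write returns, so a slow or
    failing writer leaves a backlog instead of losing data.
    """
    written = 0
    for _ in range(max_batches):
        if not queue:
            break
        # Peek at the next batch without removing it yet.
        batch = [queue[i] for i in range(min(batch_size, len(queue)))]
        bulk_write(batch)            # stand-in for a bulk indexing request
        for _ in batch:              # commit: remove only what was written
            queue.popleft()
        written += len(batch)
    return written


queue: Deque[str] = deque(f"event-{n}" for n in range(10))
indexed: List[str] = []

# In this round the writer can only keep up with 2 batches of 3 events.
written = drain_in_batches(queue, indexed.extend, batch_size=3, max_batches=2)
print(written, len(queue))
```

The four events the writer did not get to are still in the queue for the next round, which is the "stays in Kafka a little longer" behavior from the notes.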
You need to be very careful with ES once you push it to index a really high volume of data. Memory can be an issue, and CPU can be an issue. Our ES deployment has grown pretty big (I would say one of the biggest in the ES customer base). I can’t give numbers because it grows and shrinks elastically.
It is a really fast and scalable system. An event log becomes searchable by the customer in ES less than 10 seconds after it hits the collector, which is really fast at this number of events.
We also have the concept of a deferred event: if for some reason a log doesn’t get processed, we keep it in the deferred Kafka. If you notice, both the writers and the event processing components provide a metrics API and an action API. The same is true of the collector: it provides metrics and action APIs.
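One way to picture the deferred-event idea is a processor that routes every event to exactly one of two queues. This is a sketch under assumptions: in-memory deques stand in for the Kafka queues, and `process_events` and the toy `parse_key_value` parser are hypothetical, not Loggly's event processing.

```python
from collections import deque
from typing import Callable, Deque, Dict


def process_events(incoming: Deque[str],
                   processed: Deque[Dict[str, str]],
                   deferred: Deque[str],
                   parse: Callable[[str], Dict[str, str]]) -> None:
    """Route every event to exactly one queue: processed or deferred."""
    while incoming:
        raw = incoming.popleft()
        try:
            processed.append(parse(raw))
        except ValueError:
            deferred.append(raw)     # keep the raw event for a later retry


def parse_key_value(raw: str) -> Dict[str, str]:
    """Toy parser for 'k=v k=v' lines. Raises ValueError on tokens without '='."""
    fields: Dict[str, str] = {}
    for token in raw.split():
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"unparseable token: {token!r}")
        fields[key] = value
    return fields


incoming = deque(["level=error app=web", "garbled-line", "level=info app=db"])
processed: Deque[Dict[str, str]] = deque()
deferred: Deque[str] = deque()

process_events(incoming, processed, deferred, parse_key_value)
print(len(processed), list(deferred))
```

Because a failed event is moved rather than discarded, the count of processed plus deferred events always equals the count that came in.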
Since we have
a big data workload with real-time needs,
a need for reliable and consistent performance, and
a need to use every bit of compute, IO and NIO,
the best course of action for us is a hybrid cloud.
There is no single answer for deciding which resources to run on public cloud vs. private cloud.
The way we decided:
Public cloud
All the resources that face the internet.
The resources that are elastic and can burst, or have the potential to burst suddenly.
Private cloud
All the services that do big data crunching and all the heavy lifting.
All the services that process sensitive data.
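The rules of thumb above can be written down as a small decision function. This is a sketch only: the `Service` fields, the example service names and the tie-breaking order are assumptions made for illustration, not rules stated in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    internet_facing: bool = False
    bursty: bool = False
    big_data: bool = False
    sensitive_data: bool = False


def place(service: Service) -> str:
    """Return 'private', 'public' or 'either' following the rules of thumb."""
    # Assumption: heavy or sensitive workloads win when the rules conflict.
    if service.big_data or service.sensitive_data:
        return "private"
    if service.internet_facing or service.bursty:
        return "public"
    # No rule applies; the decision depends on other needs.
    return "either"


# Hypothetical services, named only for this example.
services = [
    Service("web-frontend", internet_facing=True),
    Service("report-export", bursty=True),
    Service("indexing", big_data=True),
    Service("billing", internet_facing=True, sensitive_data=True),
    Service("nightly-cron"),
]
placements = {service.name: place(service) for service in services}
print(placements)
```

The "either" result reflects the point that there is no single answer: when none of the listed traits applies, the choice comes down to needs the function does not model.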
One last thing: going hybrid saved us $$$, since TCO dropped significantly.
Some companies delineate the workload between public and private cloud by type of deployment, for example dev and staging in public cloud and production in private cloud.
Some companies also divide by SLA: services with a high SLA go on private cloud and services with a lower SLA go on public cloud.