Centralized Logging System Using ELK Stack

Centralized Logging System
By:- Rohit Sharma
Email:- rohitrsh@gmail.com

Agenda
This session covers the following topics:
a. An overview of CLS
b. Centralized logging tools
c. ELK Stack : Introduction
d. Implementation and configuration of ELK stack

What is CLS?
• CLS stands for Centralized Logging System. A CLS collects and manages
log data from operating systems and/or applications. A central management
system can then process this data to generate information for auditing and
reporting.
• With a Centralized Logging System, your company can analyze log data
quickly. The system automates control processes, giving users more time to
respond effectively to any anomalies. With proper configuration, events are
escalated automatically, for example according to predefined procedures.

Why CLS?
– Logs are a critical part of any system. They provide vital information about the application
and answer questions about what the system is doing and what has happened. Most processes
running on a system generate logs in one form or another. For convenience, these logs are
often collected in files on a local disk, with log rotation enabled. When the system is hosted
on one machine, file logs are easy to access and analyze, but when the system grows to multiple
hosts, log management becomes a nightmare. It is difficult to look up a particular error
across thousands of log files on hundreds of servers without the help of specific tools. A
common approach to this issue is to deploy and configure a centralized logging system, so
that data from each log file on each host is pushed to a central location.
• Benefits for organization and IT department
– Fulfillment of auditing/compliance requirements
– Optimization of time and resources
– Systems status information
– Single point of control
– Archived history of your activities
– Universality and scalability of your systems
– Historical log database

CLS Tools in Market
• Splunk
• Splunk, an industry-leading platform for machine data, automatically indexes all your log
data, including structured, unstructured and complex multi-line application log data. Splunk
aims to provide a deeper understanding of real-time data.
• Loggly
• A cloud-based log management service, Loggly makes the log management process much
less cumbersome. With a simple set-up process and intuitive tools, Loggly doesn’t require a
ton of on-ramping. Loggly provides immediate value by interpreting and making sense of
data pouring in from your applications, platforms and systems instantly.
• Graylog2
• An open-source data analytics system that’s been field-tested around the globe, Graylog2
collects and aggregates events from a multitude of sources and presents your data in a
streamlined, simplified interface where you can drill down to important metrics, identify key
relationships, generate powerful data visualizations and derive actionable insights.
• Fluentd
• An open-source data collector for processing data streams, fluentd offers more than 150
plugins for extended functionality, more robust log management and additional uses. It
works with more than 125 types of systems and is designed for high-volume data streams.
You don’t need any ad-hoc scripts to use fluentd; the functionality is built in out of the box.
It’s similar to syslogd but uses JSON for log messages.

What is ELK Stack?
– The Elasticsearch ELK Stack offers a set of applications and utilities, each
serving a distinct purpose, which combine to create a powerful, end-to-end
search and analytics platform. (L)ogstash captures log data in a central
location, (E)lasticsearch takes it a step further with real-time analysis,
and (K)ibana transforms data into powerful visualizations for actionable
insights. This comprehensive platform is built on Apache Lucene and offered
under an Apache 2 Open-Source License.
• Key Features:
– Stacked solution with powerful components
– Powerful analytics with instant insights
– Visualize data with Kibana
– Resilient clusters for security and reliability
– Document-oriented
– No Schema; automatic interpretation
– Conflict management with optimistic version control
– Multi-tenancy with individual or group queries
– Redundancy for data security

ELK Solution Architecture
• The Shippers, usually known as agents, forward to the broker all the logs that are
configured in syslog to be forwarded. I have used the Logstash lumberjack shipper agent.
• The Broker is set up just like a shipper agent, but configured as a broker (collector). It
stores the logs forwarded by the shipper agents in local storage.
• Elasticsearch indexes all the logs collected by the broker. For indexing, it converts the logs
to JSON, so they can easily be stored in any non-structured data store (e.g. MongoDB, Hadoop).
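
The flow above can be sketched in a few lines of Python. This is only an illustration of the shipper, broker and indexer roles, not how Logstash or Elasticsearch are implemented: the function names `ship` and `index_all`, the host names and the in-memory `deque` and list that stand in for the broker storage and the index are all assumptions.

```python
import json
from collections import deque


def ship(lines, host, broker_queue):
    """Shipper (agent): forward raw log lines, tagged with the source host."""
    for line in lines:
        broker_queue.append({"host": host, "message": line})


def index_all(broker_queue, index):
    """Indexer: drain the broker buffer and store each event as a JSON document."""
    while broker_queue:
        event = broker_queue.popleft()
        index.append(json.dumps(event, sort_keys=True))


broker = deque()  # stands in for the broker's local storage
index = []        # stands in for an Elasticsearch index

# Two hypothetical hosts ship their logs to the same broker.
ship(["sshd: session opened", "cron: job started"], "web01", broker)
ship(["nginx: GET /health 200"], "web02", broker)
index_all(broker, index)

for doc in index:
    print(doc)
```

The broker decouples the hosts from the indexer: shippers only append, and the indexer drains the buffer at its own pace.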

Logstash
– Logstash is a tool for managing events and logs. It is written in JRuby and
requires a JVM to run. Usually one client is installed per host, and it can listen
to multiple sources including log files, Windows events, syslog events, etc. The
downside of using the JVM is that memory usage can be higher than you would
expect for log transportation. To address this, the community has developed
Lumberjack, which is deployed on each host. It collects and ships logs to
Logstash running on the centralized log hosts. Logstash itself is only a client
(shipper) that sends log messages to centralized storage.
• Input: Inputs can be file, syslog, Redis, logstash-forwarder (Lumberjack).
• Filters: Format the logs into the required format, e.g. Apache, syslog.
We can also create custom filters using GROK patterns.
• Output: Filtered log output can be stored in Elasticsearch, a file, or Graphite.
• Log processing
Input → Filters → Codecs → Output
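
To make the Input → Filters → Codecs → Output chain concrete, here is a minimal Python sketch of the same stages. It is not Logstash code: the function names and the regular expression `ACCESS_LINE` are assumptions, written to resemble what a GROK pattern for an Apache-style access line would match. The failure tag mirrors the `_grokparsefailure` tag that Logstash's grok filter adds by default.

```python
import json
import re

# Hypothetical regex, roughly what a GROK pattern for an Apache-style access line expands to.
ACCESS_LINE = re.compile(
    r'(?P<clientip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) \S+" (?P<response>\d{3}) (?P<bytes>\d+|-)'
)


def read_input(lines):
    """Input stage: wrap each raw line in an event."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def apply_filter(events):
    """Filter stage: extract named fields, or tag the event when parsing fails."""
    for event in events:
        match = ACCESS_LINE.match(event["message"])
        if match:
            event.update(match.groupdict())
        else:
            event["tags"] = ["_grokparsefailure"]
        yield event


def encode(event):
    """Codec stage: serialize the event as JSON."""
    return json.dumps(event, sort_keys=True)


def run_pipeline(lines):
    """Output stage: here the encoded events are simply collected in a list."""
    return [encode(event) for event in apply_filter(read_input(lines))]


sample = ['127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326']
for doc in run_pipeline(sample):
    print(doc)
```

Lines that do not match are kept and tagged rather than dropped, so parsing problems stay visible downstream.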

Elasticsearch
– Elasticsearch, built on top of Apache Lucene, is a search engine with a focus on
real-time analysis of data, and it is based on a RESTful architecture. It
provides standard full-text search functionality and powerful query-based
search. Elasticsearch is document-oriented, and you can store anything you
want as JSON. This makes it powerful, simple and flexible.
• Indexing: Elasticsearch achieves fast search responses because it searches
an index instead of searching the text directly.
• DSL Query: The Query DSL is Elasticsearch's way of making Lucene's query
syntax accessible to users, allowing complex queries to be composed using
a JSON syntax.
• Visualize: It can be integrated with any frontend tool that visualizes JSON
data.
• NoSQL Integration: Usually it indexes and stores all the data on local disk,
but in a big infrastructure it can be integrated with a NoSQL DB, e.g.
Cassandra, MongoDB, Hadoop.
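
The indexing point can be illustrated with a toy inverted index in Python. This is a simplified sketch of the general idea (a map from each term to the documents that contain it), not Lucene's actual data structures. The function names and sample documents are assumptions.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Connection timeout on db01",
    2: "User login failed",
    3: "Connection reset on web02",
}
index = build_inverted_index(docs)
print(sorted(search(index, "connection on")))
```

A search only touches the entries for the query terms, so its cost does not depend on scanning the full text of every stored log line.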

Kibana
– Kibana is the frontend part of the ELK stack. It presents the data that Logstash has
stored in Elasticsearch in a very customizable interface, with histograms and
other panels that give you a broad overview. It is great for real-time analysis
and search of the data you have parsed into Elasticsearch, and very easy to implement.
• Query Dashboard: Used to fetch data for the analysis of any request or
incident, on the basis of a custom query and timestamp.
• Monitoring Dashboard: A static dashboard that provides various monitoring
graphs, such as histograms and pie charts, on the basis of configured queries.
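
As a rough illustration of what a histogram panel displays, the following Python sketch counts events per fixed time interval. In a real deployment Elasticsearch computes such buckets on the server side; the function name `date_histogram` and the sample timestamps here are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime, timezone


def date_histogram(timestamps, interval_seconds):
    """Count events per fixed-size time bucket, keyed by bucket start (epoch seconds)."""
    counts = Counter()
    for ts in timestamps:
        epoch = int(ts.timestamp())
        bucket_start = epoch - (epoch % interval_seconds)
        counts[bucket_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical event times, in UTC so the bucket boundaries are unambiguous.
events = [
    datetime(2015, 1, 1, 10, 0, 5, tzinfo=timezone.utc),
    datetime(2015, 1, 1, 10, 0, 40, tzinfo=timezone.utc),
    datetime(2015, 1, 1, 10, 1, 10, tzinfo=timezone.utc),
]

for bucket_start, count in date_histogram(events, 60).items():
    print(datetime.fromtimestamp(bucket_start, tz=timezone.utc).isoformat(), count)
```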

Enhancements?
– As it is open source, the following future enhancements are possible:
• Email alerting: Currently, Kibana doesn't support email alerting; however,
some plugins are available on GitHub through which email alerting can
be integrated.
• GROK Patterns: Using GROK patterns we can easily parse any log format in
Logstash. GROK uses regular expressions to read the log files and can
capture complete exception traces. A GROK debugger is available that reads
a log format and creates the GROK patterns.
– http://grokdebug.herokuapp.com/
• PacketBeat Integration: PacketBeat is a network packet analyzer that ships
its data to Elasticsearch, where it can be visualized alongside the logs. It
provides enhanced capabilities for monitoring and analysis.
– http://packetbeat.com/
• Kibana Queries: Kibana uses the Elasticsearch Query DSL (Domain Specific
Language) to analyze the data, so we need good hands-on knowledge of the DSL.
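
To give a feel for the Query DSL mentioned in the last point, here is a minimal Python sketch that builds a search body as JSON. The helper name `build_error_query` and its defaults are assumptions. The `message` and `@timestamp` field names are the ones Logstash typically produces, and the `bool` query with a `filter` clause assumes a reasonably recent Elasticsearch version. The body would be sent to an index's `_search` endpoint; nothing is sent here.

```python
import json


def build_error_query(message_text, since="now-15m", size=20):
    """Build a Query DSL body: full-text match on the message, limited to a recent time range."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "must": [{"match": {"message": message_text}}],
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
            }
        },
    }


body = build_error_query("timeout")
print(json.dumps(body, indent=2))
```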

Other Solutions
– Other open source solutions similar to the ELK stack:
• Fluentd: Fluentd is an open source data collector, which lets you unify the
data collection and consumption for a better use and understanding of data
– http://www.fluentd.org/architecture
• Apache Flume: Flume is a distributed, reliable, and available service for
efficiently collecting, aggregating, and moving large amounts of log data. It
has a simple and flexible architecture based on streaming data flows.
– http://flume.apache.org/
• Socket Appenders: With log4j we can use a socket appender, which forwards
logs directly to the Logstash broker node, so we can remove logstash-forwarder.
– https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/net/SocketAppender.html
• MongoDB Appenders: These forward log4j logs directly into a MongoDB
database, so there is no requirement for Logstash; we can directly
configure Elasticsearch with a MongoDB plugin.
– https://github.com/log4mongo/log4mongo-net

ELK Stack
Thank You!
Rohit Sharma
