TALLINN UNIVERSITY OF TECHNOLOGY
Faculty of Information Technology
Department of Computer Science
Chair of Network Software
CHOOSING AN OPEN-SOURCE LOG
MANAGEMENT SYSTEM FOR SMALL
BUSINESS
Master’s Thesis
ITI70LT
Student: Artyom Churilin
Student code: 113832IVCMM
Advisor: Risto Vaarandi, Ph.D
Tallinn, 2013
Declaration
I hereby declare that I am the sole author of this thesis. The work is original and has not been
submitted for any degree or diploma at any other University. I further declare that the material
obtained from other sources has been duly acknowledged in the thesis.
……………………………………. ………………………………
(date) (signature)
List of Acronyms and Abbreviations
AMQP Advanced Message Queuing Protocol
APT Advanced Persistent Threat
CERT Computer Emergency Response Team
CIRT Critical Incident Response Team
CPU Central Processing Unit
DNS Domain Name System. Often used to refer to a DNS server
ELSA Enterprise Log Search and Archive. Open-source log management
system created by Martin Holste – former Security Incident
Response Team Lead specializing in network security monitoring
and open-source tools
FIFO First In First Out, in this paper used as named or unnamed pipe. A
pipe is a mechanism for inter-process communication; data written
to the pipe by one process can be read by another process
GELF Graylog Extended Log Format
GNU A recursive acronym for GNU's Not Unix
GNU GPL GNU General Public License, widely used license for free software
GUI Graphical User Interface
LDAP Lightweight Directory Access Protocol
AD Microsoft Active Directory
PCAP Packet Capture. Application programming interface for capturing
network traffic.
PRI Priority field in Syslog message
RFC Request for Comments
RPM A package management system for many Linux distributions.
sendmail mail server application used on Unix platforms
TCP Transmission Control Protocol.
URL Uniform Resource Locator. Sometimes referred to as a “web link”
UDP User Datagram Protocol.
VHD File format supported by many virtual platforms. Virtual Hard Disk
Abstract
This thesis compares three popular open-source log management systems. Its purpose is to give an
overview of these systems and to provide guidelines for choosing the best-suited one for a small
company.
The choice was based on a comparative analysis as well as on performance and usability testing.
ELSA is a high-performance open-source log management system that can challenge enterprise-grade
commercial solutions. It was designed for effective incident response and for fighting
APT (Advanced Persistent Threat).
Kibana is a log analysis front end for Logstash and Elasticsearch. It can also be used with other
back ends that support formatted output into Elasticsearch, such as Rsyslog with the
omelasticsearch module.
Graylog2 is an alternative log management tool with its own web GUI. A speciality of Graylog2 is
that logs can easily be divided into different streams, so that different users can be given access
to specific types of logs.
Performance testing showed that ELSA is the fastest and can handle on average 14285,7 logs per
second with the modest hardware resources used for testing. As the solution is meant for small
business, performance is not a crucial factor, so Graylog2 and Kibana could very well compete
with ELSA in the given conditions.
According to the usability test results, Kibana is the most usable system.
Kibana with Rsyslog was chosen as the best-fitting solution for a small company. It has some
shortcomings with authentication and saved searches, but its usability, ease of installation and
universality make it an outstanding solution for small business. The lacking functions are
under development; meanwhile, external mechanisms and workarounds can be used.
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1. Event logs
1.2. Central log management
1.3. Purpose of the thesis
1.4. Outline of the thesis
2. Log collection
2.1. Logging protocols
2.1.1. BSD Syslog protocol
2.1.2. IETF Syslog protocol
2.2. Non GUI logging solutions
2.2.1. Unix Syslogd software suite
2.2.2. Syslog-ng framework
2.2.3. Rsyslog software suite
2.3. Graphical log management solutions
2.3.1. Graylog2
2.3.2. Kibana
2.3.3. ELSA
3. Comparative analysis
3.1. Structure
3.1.1. Graylog2 structure
3.1.2. Kibana structure
3.1.3. ELSA structure
3.2. Input and output
3.2.1. Graylog2 input and output
3.2.2. Kibana input and output
3.2.3. ELSA input and output
3.3. Interface
3.3.1. Graylog2 interface
3.3.2. Kibana interface
3.3.3. ELSA interface
3.4. Features
3.4.1. Graylog2 features
3.4.2. Kibana features
3.4.3. ELSA features
3.5. Search
3.5.1. Graylog2 search
3.5.2. Kibana search
3.5.3. ELSA search
3.6. Conclusion based on comparative analysis
4. Choosing a log management solution
4.1. Logging requirements for small business
4.2. Testing
4.2.1. Testing environment
4.2.2. Performance testing
4.2.3. Usability testing
5. Implementation
5.1. Production environment
5.2. Implementation of Kibana in production
6. Future research
7. Summary
Resüme
List of References
Appendices
Appendix 1 - Basic Event Log Cycle
Appendix 2 - Logstash Inputs, Filters and Outputs
Appendix 3 - Rsyslog main components installation
Appendix 4 - Kibana setup example scheme
Appendix 5 - TCP and UDP output options in Logstash
Appendix 6 - Graylog2 setup example scheme
Appendix 7 - Graylog2 tweaked settings
Appendix 8 - Graylog2, Kibana and ELSA component details
Appendix 9 - Lucene search
Appendix 10 - Kibana search examples
Appendix 11 - ELSA search examples
Appendix 12 - ELSA performance test details
List of Figures
Figure 1 Log management solution model
Figure 2 Graylog2 software components
Figure 3 Kibana main components
Figure 4 ELSA main components
Figure 5 Performance test results statistics compared
Figure 6 Relative increase in performance with 4 cores
Figure 7 Graylog2 performance test results logs/sec
Figure 8 Kibana and Logstash performance test results logs/sec
Figure 9 Kibana and Rsyslog performance test results logs/sec
Figure 10 ELSA performance test results logs/sec
Figure 11 Scheme of Kibana implementation
List of Tables
Table 1 Basic overview of the log management solution
Table 2 Advantages and disadvantages of log management solutions
Table 3 Usability test score
1. Introduction
Today’s computer networks are very complex. Operating systems have millions of lines of code, and
the amounts of data and data transfer rates are continuously growing to meet the demands of the
market. Even relatively small networks can have millions of events per second. They vary in
importance and are often interconnected.
What are these events, how can they be managed and how can useful information be extracted from
them? What do current popular solutions offer and how does one choose a solution for a small company?
There are various commercial log management solutions available on the market. These solutions
are quite expensive and are hardly affordable for small companies. Fortunately, there are
open-source log management tools which are free of charge, so the only cost is the hardware or
the hardware resources on a virtual platform. As small companies normally have only a few
technicians, it is important that the solution is easy to install, maintain and use. Performance
requirements for small business are normally moderate, but they depend on the specific environment.
1.1. Event logs
An event can be defined as “a relevant change in state” of a system [1] or, alternatively, as an
“action or occurrence detected by a program”. Examples include: a network packet arriving at a
switch or a firewall, a user running an executable, a network link going down, or a user browsing
a website receiving error code 404 because of a broken URL. IT systems handling those events
generate event messages and by default usually store them locally or, if configured specifically,
send them to a remote location. When these messages are recorded they are referred to as event
logs or simply logs.
There are several standards and formats of log messages, but in general all logs consist of two
main parts. First is the timestamp, stating the date and time the event happened. Second part is the
data, containing information about the event itself. Logs can have more distinctive parts like
facility (type of software that generated the event), source IP, severity (e.g. error, info, debug) etc.
A typical event log cycle is presented by a diagram in Appendix 1 at the end of the thesis.
1.2. Central log management
Many server and client operating systems, network switches, routers, firewalls, printers and even
VoIP phones have the capability to produce logs and send them through the network. Depending on
the size and complexity of the IT infrastructure there could be tens, thousands or possibly millions
of events per second. These events vary in importance and urgency but all of them are required to
get the full picture of what is going on in the network and inside the nodes’ operating systems.
By default logs are stored locally. This setup has many drawbacks. Firstly it is not efficient as
each device has to be managed separately. Secondly the logs stored locally can be deleted or
changed. If an attacker or malware managed to infiltrate a network device or a server, logs
including the records about the security breach could be changed or deleted. In this case the attack
might not even be noticed. Thirdly, if a device's memory is corrupted or its hardware fails, the
local logs might not be accessible at all. In this case it might not be possible to find out the
reason for the malfunction. A central log management and event alert system can help solve these issues.
It is of crucial importance for the IT department of any organisation to be able to efficiently
track any event in the network within the needed timeframe. One logical solution to this issue is
to send all logs to a central log server. Modern log protocols support encryption and
authentication to secure the log collection. Software development, website administration, network
administration and incident response are some example activities that require efficient log management.
1.3. Purpose of the thesis
The purpose of this thesis is to give guidelines for choosing a solution for small business and
choose the best suiting open-source log management system for a small target company. The
choice is based on a comparative analysis as well as performance and usability testing.
1.4. Outline of the thesis
Chapter one of the thesis states the problem of the research. The main standards and protocol
suites are described in chapter two.
Comparative analysis based on the features, performance and usability testing is presented in
chapter three.
Chapter four describes the performance and usability testing and presents the results.
Implementation plan of a chosen log management system in a small company is described in
chapter five.
Chapter six offers some ideas for future research.
Chapter seven summarizes the thesis.
2. Log collection
Event logs can be generated by most applications, operating systems and network devices.
Logs can be used for incident investigation, historical reporting, debugging etc. Because event
logs are produced in real time, they can also be used by real-time monitoring systems. Such
monitoring solutions often include a frontend with an analytical module and dashboards that show
the current status as well as historical data. Usually such solutions can send notifications
for specific events (e.g. in the form of email alerts).
2.1. Logging protocols
There are several main standards and protocol suites that are currently used in applications,
operating systems and network devices. New standards were introduced to address the
shortcomings of their predecessors.
2.1.1. BSD Syslog protocol
BSD Syslog was developed in the 1980s by Eric Allman for the sendmail application as an alternative
to appending messages to flat files from programs. According to RFC 3164, the sender sends a
syslog message with a maximum size of 1 KB to the receiver over the UDP protocol; destination
port 514 is used and source port 514 is recommended.
A syslog message is sent in a UDP packet which has the following payload:
<PRI>Timestamp Hostname Content
The formula for calculating PRI:
PRI = 8*Facility + Severity
Facility defines the software component that generated the event. Here are some of the facility
values used for calculating PRI: kern (0), user (1), mail (2), daemon (3), local0..7 (16..23).
Severity defines the relative importance of the event. Here are the severity values used for
calculating PRI: emerg (0), alert (1), crit (2), error (3), warning (4), notice (5), info (6), debug (7).
Timestamp has the following syntax: “MMM DD hh:mm:ss”. The hostname part contains the sender's
hostname or IP address. The first alphanumeric characters in the content field, at most 32, are
regarded as the tag field (name of the logging program), and the rest is regarded as the message
field [3].
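The PRI formula and the payload layout above can be illustrated with a minimal Python sketch. The function names, host name and message text are hypothetical, and the space padding of single-digit days follows RFC 3164 rather than anything stated in this thesis.

```python
from datetime import datetime

# Facility and severity codes as listed in the text above (not the full RFC 3164 set).
FACILITIES = {"kern": 0, "user": 1, "mail": 2, "daemon": 3,
              **{f"local{i}": 16 + i for i in range(8)}}
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "error": 3,
              "warning": 4, "notice": 5, "info": 6, "debug": 7}
# Fixed English month names, so the result does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def encode_pri(facility, severity):
    """PRI = 8 * Facility + Severity."""
    return 8 * FACILITIES[facility] + SEVERITIES[severity]


def decode_pri(pri):
    """Split a PRI value back into (facility code, severity code)."""
    return divmod(pri, 8)


def bsd_syslog_message(facility, severity, when, hostname, content):
    """Build the payload '<PRI>Timestamp Hostname Content'."""
    # Assumption taken from RFC 3164: days below 10 are padded with a space.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{encode_pri(facility, severity)}>{timestamp} {hostname} {content}"


example = bsd_syslog_message("mail", "info", datetime(2013, 5, 7, 14, 3, 9),
                             "mailhost", "sendmail[1234]: example message")
print(example)  # <22>May  7 14:03:09 mailhost sendmail[1234]: example message
```

A receiver can recover both values from the PRI field alone: decode_pri(22) yields facility 2 (mail) and severity 6 (info).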
One of the drawbacks of the BSD Syslog protocol is that it uses UDP only. This means there is no
delivery control, as receipt is not acknowledged [4]. Another limitation is that BSD
syslog does not support encryption, so messages are sent in clear text. It also does not support
authentication. Timestamps carry no time zone information and have a resolution of one second. UTF
encoded characters are not supported. These shortcomings were addressed by the IETF syslog
protocol (Chapter 2.1.2).
2.1.2. IETF Syslog protocol
The IETF Syslog protocol is defined by RFCs 5424-5426. It supports TLS, and the default port for
message reception is 6514/tcp. Both the message sender and receiver must support certificate-based
authentication; however, the administrator chooses which authentication options are used. Messages
are sent as TLS application data which consists of one or more syslog frames.
RFC 5426 sets the requirements for message transmission over UDP: messages are received at
514/udp by default and each message is sent as a single UDP packet. IETF syslog messages are more
structured than those of BSD syslog. Here is the structure of IETF messages:
<PRI>Version Timestamp Hostname Application PID MsgID StructData Message
To sum up: the IETF syslog protocol is more structured, transmission-reliable and secure than BSD
syslog [3].
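As an illustration of the message structure above, the following Python sketch assembles an IETF syslog line. The function name and the sample values are hypothetical; the version number 1, the "-" placeholder for absent fields and the ISO 8601 timestamp with time zone offset are taken from RFC 5424, not from this thesis.

```python
from datetime import datetime, timezone

NIL = "-"  # RFC 5424 placeholder for a field that has no value


def ietf_syslog_message(pri, when, hostname, application,
                        pid=None, msg_id=None, struct_data=None, message=""):
    """Build '<PRI>Version Timestamp Hostname Application PID MsgID StructData Message'."""
    fields = [
        f"<{pri}>1",          # PRI is followed directly by the version number
        when.isoformat(),     # timestamp with sub-second part and time zone offset
        hostname,
        application,
        str(pid) if pid is not None else NIL,
        msg_id or NIL,
        struct_data or NIL,
    ]
    line = " ".join(fields)
    return f"{line} {message}" if message else line


stamp = datetime(2013, 5, 7, 14, 3, 9, 123000, tzinfo=timezone.utc)
example = ietf_syslog_message(22, stamp, "mailhost", "sendmail",
                              pid=1234, message="example message")
print(example)
# <22>1 2013-05-07T14:03:09.123000+00:00 mailhost sendmail 1234 - - example message
```

Compared with the BSD format, the timestamp here carries the year, fractions of a second and the time zone, which addresses the timestamp limitations listed in Chapter 2.1.1.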
2.2. Non GUI logging solutions
Since the 1980s, when the BSD syslog protocol was created, there have been some important
developments in syslog-based solutions. Here are some important events that have formed today’s
non-GUI open-source syslog market:
• 1998 Balabit releases Syslog-ng
• 2004 Rsyslog is released
• 2007 Syslog-ng announces Syslog-ng PE (premium edition)
At the same time that Syslog-ng went partially commercial in 2007 by introducing the PE version,
Rsyslog reached the same level with its features. On February 28th Rsyslog 3.12.0 was released.
According to Rainer Gerhards, from that date on Rsyslog supported all major Syslog-ng features
and in addition had a number of major features exclusive to it. Rainer Gerhards considered Rsyslog
3.12.0 fully superior to the Syslog-ng of the same period, with the exception of platform support [5].
Syslog-ng PE has some additional advanced features like encrypted log storage, Microsoft Windows
support and client-side failover [6]. Judging by popularity, community support and online
discussions, Syslog-ng OSE (open-source edition) and Rsyslog are the most widely used
open-source non-GUI syslog solutions.
2.2.1. Unix Syslogd software suite
UNIX syslogd (syslog daemon) can receive messages from a local file system socket and UDP
port 514, and send output to local files or a remote syslogd instance. Syslogd configuration is
usually stored in /etc/syslog.conf, which contains single-line rules. Each rule consists of a
selector and an action, where the selector is a list of “facility.severity” pairs and the action
specifies a destination for the message. Facility can be set to one of the standard syslog facility
classifiers or to “*”, which means any facility. Severity can be set to one of the standard syslog
severity classifiers or to “none”. Flat files, FIFOs, terminals and remote log servers are usually
supported as destinations. This suite is still used for simple solutions, but generally it has been
superseded by more functional software suites like Syslog-ng and Rsyslog.
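The selector-action rules described above can be sketched in Python as follows. This is a simplified illustration, not syslogd's actual implementation: the function names and the sample rule are hypothetical, and the matching semantics (a pair matches its severity and anything more severe, pairs are evaluated left to right, "none" excludes a facility) are assumptions based on the traditional behaviour of syslogd rather than statements from this thesis.

```python
# Severity names in the order used by the thesis, most severe first.
SEVERITIES = ["emerg", "alert", "crit", "error", "warning", "notice", "info", "debug"]


def parse_rule(line):
    """Split one syslog.conf style line into selector pairs and an action."""
    selector, action = line.split(None, 1)
    pairs = [tuple(pair.split(".", 1)) for pair in selector.split(";")]
    return pairs, action.strip()


def rule_matches(pairs, facility, severity):
    """Decide whether a message with this facility and severity is selected."""
    matched = False
    for rule_facility, rule_severity in pairs:
        if rule_facility not in ("*", facility):
            continue  # this pair says nothing about the message's facility
        if rule_severity == "none":
            matched = False  # explicit exclusion of the facility
        else:
            # Assumed semantics: the named severity and everything more severe.
            matched = SEVERITIES.index(severity) <= SEVERITIES.index(rule_severity)
    return matched


pairs, action = parse_rule("*.info;mail.none\t/var/log/messages")
print(pairs, action)
print(rule_matches(pairs, "kern", "error"))  # True: error is more severe than info
print(rule_matches(pairs, "mail", "emerg"))  # False: mail is excluded by "none"
```

In this sample rule everything at severity info or above is written to /var/log/messages, except messages from the mail facility.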
2.2.2. Syslog-ng framework
Syslog-ng is one of the most prominent syslog frameworks with a very large user base.
It supports logging over both UDP and TCP. In addition to the BSD syslog protocol, it also supports
the IETF syslog protocol, including encryption and authentication. Syslog-ng employs regular
expressions for matching and filtering messages by tag, message text, etc. It supports custom
message templates and allows the user to change the log message format and the set of message
fields that are logged [7].
2.2.3. Rsyslog software suite
Rsyslog is an advanced open-source logging solution. The letter R in the name stands for reliable,
which mainly emphasises the use of TCP as transport and does not point to the unreliability of
predecessors. Rsyslog can be used under the terms of the GPLv3 license, but it can be used for a
non-GPLv3-compatible project in some special cases described in the license agreement [8].
Rsyslog development started in 2004, based on the standard sysklogd package (syslog and klogd; the
latter handles kernel messages). The goal of the Rsyslog project is to provide a feature-richer
and reliable syslog daemon while retaining drop-in replacement capabilities for stock syslogd [5].
It adds a lot of features to Unix syslogd, including support for the IETF syslog protocol,
advanced message filtering and custom message formatting.
Rsyslog configuration is usually stored in /etc/rsyslog.conf. It supports the traditional
selector-action rules of Unix syslogd in order to ease migration from syslogd. Rsyslog has become
the default logging solution for many Linux distributions.
According to Rainer Gerhards, the main author of Rsyslog, the main competitor of Rsyslog is
Syslog-ng. Rsyslog’s advantage is that it is free of charge including all features, whereas the
full-featured Syslog-ng PE (premium edition) has a paid license.
Rsyslog maintains backward compatibility with syslogd because the basic syslog.conf format is
extremely well known, covered in a lot of text books, taught in numerous courses and used in a
myriad of Internet tutorials. Abandoning it would throw away a lot of people's knowledge and
help resources [9].
2.3. Graphical log management solutions
Whatever solution is used for the backend of the log collection, it is important to have the logs
presented in a comprehensive and useful manner. The aim of a Graphical User Interface (GUI) is to
give quick and easy access to an IT system. User management, system configuration, graphs with
historical data and dashboards with real-time statistics are some of the main useful features
that are available in a good GUI frontend. The productivity and user experience of an operator of
such a GUI depend on how flexible, customisable and usable these options are. There is no perfect
GUI for all cases; it is more a question of what best suits the given environment. Open-source
graphical log management solutions are quite flexible and can be used in combination with other
non-GUI solutions. The following main components of a graphical log management solution can be
outlined (see Figure 1):
• log shipper
• log parser
• log storage, indexing and search
• web interface
Figure 1 Log management solution model
[Diagram showing the log shipper, log parser, storage, index, search and web interface, with logs flowing between them]
Most of the components, depending on the solution, could be replaced by some alternative ones.
The log shipper can normally be any log collection software like a syslog daemon. It serves as the
entry point for event logs from local services or the network and applies some action to the logs.
In a log management system it normally sends the logs on for further parsing and filtering.
Examples: Syslogd, Rsyslog, Syslog-ng, Logstash, Graylog2.
The log parser is a separate process or module which is responsible for parsing fields out of raw
log messages and creating structured messages which are suitable for writing into log storage.
Examples: Grok, JSON, Ruby, Syslog4j.
Log storage, indexing and search are performed using databases and indexing software.
Examples: MySQL, MongoDB, Tokyo Cabinet, Elasticsearch, Sphinx search.
The web interface works as a frontend to all of the components and provides the means to manage
the log data. Examples: Log analyser, Kibana, Graylog2 web interface, ELSA web interface.
The distribution of functions among components depends on the architecture of the solution.
Multiple functions can be executed by a single part of the log management solution, e.g. the
Graylog2 server is both log shipper and parser. Logstash is a log shipper and indexer and has its
own integrated web interface. Kibana is a front end web interface and indexer. In many cases the
parsing and storing functionality is implemented inside the log shipper.
A single function can also be divided among multiple components, e.g. Graylog2 storage is done by
Elasticsearch (messages) and MongoDB (statistics, user accounts) [10]. Log management
solutions might include other components like various plugins, filters and middleware. This will
be described in more detail in chapter 3.1 of the thesis.
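The division of work between the components can be illustrated with a toy Python sketch. Everything in it is hypothetical: the names, the regular expression for BSD syslog lines and the in-memory word index only stand in for real shippers, parsers and indexers such as those named above, and the web interface is left out.

```python
import re
from collections import defaultdict

# Assumed input format: '<PRI>MMM DD hh:mm:ss host message' (BSD syslog).
LINE = re.compile(r"<(?P<pri>\d+)>(?P<timestamp>\w{3} [ \d]\d \d\d:\d\d:\d\d) "
                  r"(?P<host>\S+) (?P<message>.*)")


def parse(raw):
    """Log parser: turn a raw line into a structured record."""
    match = LINE.match(raw)
    if match is None:
        return {"message": raw}  # keep unparsable lines as plain messages
    record = match.groupdict()
    record["facility"], record["severity"] = divmod(int(record.pop("pri")), 8)
    return record


class LogStore:
    """Log storage, indexing and search, all kept in memory."""

    def __init__(self):
        self.records = []
        self.index = defaultdict(set)  # word -> positions in self.records

    def add(self, record):
        position = len(self.records)
        self.records.append(record)
        for word in re.findall(r"\w+", record["message"].lower()):
            self.index[word].add(position)

    def search(self, word):
        return [self.records[i] for i in sorted(self.index.get(word.lower(), ()))]


def ship(raw_lines, store):
    """Log shipper: entry point that passes each line on to parser and storage."""
    for raw in raw_lines:
        store.add(parse(raw))


store = LogStore()
ship(["<22>May  7 14:03:09 mailhost sendmail[1234]: relay denied",
      "<131>May 17 14:03:10 fw01 kernel: link down"], store)
print([record["host"] for record in store.search("denied")])  # ['mailhost']
```

Because each stage only exchanges raw lines or structured records with its neighbour, any one of them could be swapped for an alternative, which is the flexibility the model in Figure 1 describes.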
2.3.1. Graylog2
Graylog2 is an open-source GPLv3 licensed log management system that stores logs in
Elasticsearch. It was designed by Lennart Koopmann, a developer at XING AG, and was released
in 2010. It consists of a server written in Java that accepts syslog messages via TCP or UDP and
stores them in Elasticsearch indexes. The second part is a Ruby on Rails web interface.
The Graylog2 web interface allows searching through the logs, applying filters, blacklisting
strings, quickly viewing logs for each monitored host and flexibly managing access to the logs by
authorising users to see specific log “streams”. The main configuration file is graylog2.conf, the
embedded Elasticsearch configuration file is graylog2-elasticsearch.yml and the Elasticsearch
configuration file is elasticsearch.yml.
2.3.2. Kibana
Kibana is a browser-based frontend for Logstash and Elasticsearch written in JavaScript and
Ruby. It was designed and developed in 2012 by Rashid Khan, a developer at the Elasticsearch
project. Its default log shipper, Logstash, is flexible open-source log management software
supporting a long list of inputs, filters and outputs. As an alternative to Logstash, Kibana can
be configured to work with other log management software which supports output to Elasticsearch.
The setup examples described further in the thesis are Kibana with Logstash and Kibana with Rsyslog.
The main configuration file for Kibana is Kibana.Config.rb, for Elasticsearch elasticsearch.yml,
for Logstash logstash.conf and for Rsyslog rsyslog.conf.
2.3.3. ELSA
ELSA stands for Enterprise Log Search and Archive. It is an open-source log management
solution written in C. ELSA was created by Martin Holste, a former Security Incident Response
Team leader currently employed at Mandiant (a company offering information security services).
Its author describes the program in short as a GPLv2 framework around Syslog-ng, MySQL and
Sphinx search [11].
Perl is used as a pipe between the components, e.g. logs are taken from Syslog-ng output and
prepared for batch loading into MySQL.
ELSA was designed to support efficient network incident response. It is oriented towards high
performance and is advertised to handle more than a million logs per minute and to give a billion
results for a query in half a second on modest hardware [12].
ELSA has two main installations: node and web. ELSA nodes that only gather, store and forward
the logs need only the node component installed. Nodes that are used as a gateway for queries need
the ELSA web component installed. In small setup scenarios, like the one used for testing, both
components are installed on the same node. The main configuration files are elsa_node.conf and
elsa_web.conf.
3. Comparative analysis
The comparative analysis is based on the primary data generated during the tests and secondary data
from web resources. A basic overview of the solutions is presented in Table 1.
Name | Graylog2 | Kibana | ELSA
Language | Java, Java Script, Ruby | Java, Java Script, Ruby | C, Perl
Protocols | BSD & IETF syslog, GELF, GELF via http, AMQP | BSD & IETF syslog, AMQP, XMPP… | BSD & IETF syslog
Transport | TCP, UDP | TCP, UDP | TCP, UDP
Log shipper | Graylog2 | Logstash, Rsyslog | Syslog-ng
Log parser | syslog4j | grok, json, syslog4j… 28 filters | perl, PatternDB
Storage | Elasticsearch, MongoDB | Elasticsearch | MySQL
Indexing | Elasticsearch | Elasticsearch | Sphinx search
License | GNU GPLv3 | Apache 2.0 | GNU GPL v2
Documentation | Good: platform independent instructions, official examples for Debian, unofficial for RHEL | Good: platform independent instructions, official examples for Debian, unofficial for RHEL | Excellent
Installation scripts | Script available for Debian based | Script available for Debian based | Multiplatform fully auto
Demo | http://public-graylog2.taulia.com/session | http://demo.kibana.org/#/dashboard | no live demo
Authentication | Local, LDAP | Needs external authentication e.g. with passenger module in Apache or Nginx | none, local or LDAP
Authorisation | Local, LDAP | Under development, passenger can be used | Account or group based, local or LDAP
Performance on modest hardware suitable for | Small and medium sized business | Medium sized business and enterprise | Enterprise
Log lines /second announced | thousands per second | thousands per second | tens of thousands per second
Log lines /second tested | 1428,6 | 5681,82 | 14285,7
Saved Searches | Streams | No | Yes
Search syntax | Lucene + regular expressions | Apache Lucene search | Google syntax
Event triggering and alerts | Regular expressions templates + email alerts | No native alerts or event triggering, done on log shipper side | Scheduled searches, actions + alerts
Table 1 Basic overview of the log management solution
3.1. Structure
3.1.1. Graylog2 structure
Graylog2 consists of two main components: a server written in Java and a web interface written in
Ruby using the Ruby on Rails web framework. The Graylog2 server listens for log messages, receives,
parses and indexes them, and stores the messages in Elasticsearch and the statistical data, graphs and user
accounts in MongoDB. For an overview of how Graylog2 can be implemented please see the
scheme in Appendix 6.
Elasticsearch is a highly scalable, resilient, schema-free, document-oriented NoSQL open-
source database. It is an Apache 2 licensed open-source distributed search engine, built on top of
Apache Lucene [13]. Elasticsearch is used by both Graylog2 and Kibana.
3.1.2. Kibana structure
Kibana is a web interface written in JavaScript and Ruby using the Sinatra web framework [14].
A typical minimal deployment of Kibana consists of Logstash and Kibana. Logstash is used for
receiving log messages from various sources, optionally filtering the logs and sending them
through one of the supported outputs. A simple example would be Logstash listening for IETF
syslog formatted messages on TCP and UDP ports 514 and forwarding the logs to Elasticsearch
without applying additional filters.
Logstash inputs, filters and outputs will be described in more detail in chapter 3.2.2 of the thesis.
As an alternative setup for Kibana, Logstash can be replaced with Rsyslog for sending specifically
parsed logs to Kibana via Elasticsearch.
Rsyslog sends formatted logs into Elasticsearch using the omelasticsearch module. These logs are
formatted as JSON and indexed in a format suitable for Kibana.
Here are the lines in the rsyslog.conf file that are required for this setup:
# Load the Elasticsearch output module
module(load="omelasticsearch")
# JSON document layout expected by Kibana (inner quotes must be escaped, template on one line)
$template Syslog2Kibana,"{\"@timestamp\":\"%timereported:::date-rfc3339%\",\"@message\":\"%rawmsg:::json%\",\"@type\":\"syslog\",\"@tags\":[],\"@fields\":{\"receptiontime\":\"%timegenerated:::date-rfc3339%\",\"host\":\"%HOSTNAME:::json%\",\"tag\":\"%syslogtag:::json%\",\"facility\":\"%syslogfacility-text%\",\"severity\":\"%syslogseverity-text%\",\"msgtext\":\"%msg:::json%\"}}"
# Daily index name built from the first 10 characters of the timestamp, e.g. rsyslog-2013-04-25
$template SyslogIndex,"rsyslog-%timereported:1:10:date-rfc3339%"
# Send all messages to Elasticsearch
*.* action(type="omelasticsearch"
           template="Syslog2Kibana"
           dynSearchIndex="on"
           searchIndex="SyslogIndex"
           server="localhost"
           serverport="9200"
           bulkmode="on")
The first line enables omelasticsearch, the output module for Elasticsearch. The next line defines the
pattern for structuring the message and timestamp, which was given the name “Syslog2Kibana”.
The second template is for the search index, which was given the name “SyslogIndex”. There is a
special setting in the KibanaConfig.rb file that needs to be set for Kibana to index the logs coming
from Rsyslog. The settings are presented below:
Smart_index = true
#Smart_index_pattern = 'logstash-%Y.%m.%d'
Smart_index_pattern = 'rsyslog-%Y-%m-%d'
These lines in KibanaConfig.rb enable the smart index feature and replace the Logstash pattern with the
Rsyslog pattern to allow Kibana to find the Rsyslog data in Elasticsearch. For an overview of how
Kibana can be implemented please see the scheme in Appendix 4.
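The link between the two index settings can be illustrated with a small Python sketch. It is only an illustration, not part of Rsyslog or Kibana: the function names are hypothetical, and it assumes that the Rsyslog property replacer keeps the first 10 characters of the RFC 3339 timestamp and that Kibana expands its pattern in strftime style.

```python
from datetime import datetime, timezone


def rsyslog_index_name(rfc3339_timestamp: str) -> str:
    """Mimic "rsyslog-%timereported:1:10:date-rfc3339%": keep characters 1 to 10."""
    return "rsyslog-" + rfc3339_timestamp[:10]


def kibana_index_name(when: datetime, pattern: str = "rsyslog-%Y-%m-%d") -> str:
    """Expand a Smart_index_pattern style string for a given point in time."""
    return when.strftime(pattern)


stamp = "2013-04-25T22:00:00+00:00"
when = datetime(2013, 4, 25, 22, 0, 0, tzinfo=timezone.utc)

print(rsyslog_index_name(stamp))  # rsyslog-2013-04-25
print(kibana_index_name(when))    # rsyslog-2013-04-25
```

Both calls produce the same index name, which is why the dashes in 'rsyslog-%Y-%m-%d' matter: the default Logstash pattern uses dots and would point Kibana at indices that Rsyslog never creates.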
3.1.3. ELSA structure
ELSA uses Syslog-ng for receiving logs and its PatternDB for parsing, which the designer claims
to be more efficient than computationally intensive regular expressions.
An alternative input is via HTTP, which is used for communication between nodes in a cluster.
Parsed logs are written into a raw file, then batch loaded into the MySQL database and
indexed by Sphinx search. By default a script loads a batch every minute. This setting can be
changed in the elsa_node.conf file by setting a value in seconds for “index_interval”. After each batch
is loaded Sphinx indexes the newly inserted rows in temporary indexes, then again in larger
batches every few hours in permanent indexes [12]. The ELSA flow diagram is presented below:
Network → Syslog-ng (PatternDB) → Raw text file
or
HTTP upload → Raw text file
Batch load (by default every minute):
Raw text file → MySQL → Sphinx
Additional functionality can be added to ELSA using plugins. New plugins can be added by sub-
classing the "Info" Perl class and editing the elsa_web.conf file to include them. Plugins that are
included in ELSA by default are presented below:
• Windows logs from Eventlog-to-Syslog
• Snort/Suricata logs
• Bro logs
• URL logs from httpry_logger
These plugins allow applying specific actions using the log data. For example, if the URL plugin is
configured, any log that has an IP address in it will have a "getPcap" option which will auto-fill
pcap request parameters for one-click access to the traffic related to the log being viewed. This
option is available if a pcap server like OpenFPC or StreamDB is installed and configured in
elsa_web.conf.
3.2. Input and output
3.2.1. Graylog2 input and output
Graylog2 accepts Syslog messages via TCP and UDP. Additionally it accepts messages in its own
Graylog Extended Log Format (GELF) via TCP, UDP and HTTP. GELF logs are basically
JSON-formatted messages compressed with Unix Gzip.
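As a rough illustration of what such a message looks like, the following Python sketch builds a Gzip-compressed JSON payload and offers a helper for sending it over UDP. It is a minimal sketch under stated assumptions, not Graylog2 client code: the helper names are hypothetical, the field set (version, host, short_message, timestamp, level, plus underscore-prefixed additional fields) follows the GELF format as the author of this sketch understands it, and chunking of large messages is not handled.

```python
import gzip
import json
import socket
import time


def build_gelf_payload(host, short_message, level=6, **extra):
    """Return a Gzip-compressed JSON document in GELF style."""
    message = {
        "version": "1.0",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity, 6 = informational
    }
    # Additional user-defined fields are prefixed with an underscore.
    for key, value in extra.items():
        message["_" + key] = value
    return gzip.compress(json.dumps(message).encode("utf-8"))


def send_gelf_udp(payload, server="127.0.0.1", port=12201):
    """Send one payload as a single UDP datagram (no chunking)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))


payload = build_gelf_payload("myhost", "this message is a test", facility="kernel")
decoded = json.loads(gzip.decompress(payload).decode("utf-8"))
print(decoded["short_message"])  # this message is a test
```

Calling send_gelf_udp(payload) would deliver the message to a GELF UDP input listening on the local machine; the server address is an assumption of this sketch.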
Graylog2 also supports AMQP (Advanced Message Queuing Protocol) input via message
queuing middleware such as RabbitMQ, Apache Qpid, OpenAMQ, SwiftMQ etc. Message queuing
software is used to make sure that messages are delivered from point A to point B. It stores
messages in memory or on disk, waits for the buffer to clear after a peak of log traffic and
then offers these messages to the logging system.
The default port for syslog is 514, for GELF 12201 and for AMQP 5672. Graylog2 uses Drools Expert
[16] to check the incoming log messages against a user-defined rule file. Jabber/XMPP is used for
sending alerts. Internal metrics and stream counts can be sent to Graphite [17] and Librato
[18] to visualize these statistics.
3.2.2. Kibana input and output
Kibana reads logs from Elasticsearch. Originally Kibana was designed as a frontend for Logstash.
Logstash supports a wide range of inputs including IETF syslog, GELF, Elasticsearch, snmptrap,
eventlog, Twitter etc. The Logstash homepage currently lists 37 supported inputs, 28
filters and 47 outputs (for a complete list see Appendix 2).
A simple scenario (see configuration below) is receiving logs on TCP and UDP ports 514 and
sending all logs to Elasticsearch. For TCP and UDP inputs “port” and “type” are required fields
(for all options see Appendix 5).
input {
  tcp {
    port => 514
    type => "syslog"
  }
  udp {
    port => 514
    type => "syslog"
  }
}
output {
  elasticsearch { }
}
In a scenario where Rsyslog is used instead of Logstash, all the Rsyslog functionality applies,
including inputs and outputs. Rsyslog receives local messages, including those from the kernel; remote
messages can be received in BSD or IETF syslog format. Received messages can be written to log
files, sent to remote syslog servers, etc. Additional modules can be used in Rsyslog, e.g.
omelasticsearch (Rsyslog to Elasticsearch), for output to specific destinations.
A more advanced Logstash configuration is presented below.
input {
  tcp {
    port => 514
    type => "rsyslog"
  }
  udp {
    port => 514
    type => "rsyslog"
  }
}
filter {
  # Split the raw syslog line into named fields
  grok {
    type => "rsyslog"
    pattern => [ "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{PROG:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" ]
    add_field => [ "received_at", "%{@timestamp}" ]
    add_field => [ "received_from", "%{@source_host}" ]
  }
  # Decode the PRI value into facility and severity
  syslog_pri {
    type => "rsyslog"
  }
  # Use the timestamp from the message as the event time
  date {
    type => "rsyslog"
    syslog_timestamp => [ "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
  }
  # Replace source host and message with the parsed values
  mutate {
    type => "rsyslog"
    exclude_tags => "_grokparsefailure"
    replace => [ "@source_host", "%{syslog_hostname}" ]
    replace => [ "@message", "%{syslog_message}" ]
  }
  # Drop the temporary fields
  mutate {
    type => "rsyslog"
    remove => [ "syslog_hostname", "syslog_message", "syslog_timestamp" ]
  }
}
output {
  elasticsearch_http { }
}
In this scenario Rsyslog ships logs to Logstash, then the “grok”, “syslog_pri”, “date” and “mutate” plugins
are used to parse the logs, which are then sent to Elasticsearch via HTTP [19].
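What decoding the PRI field involves can be shown with a short Python sketch. It follows the syslog convention that PRI equals facility multiplied by 8 plus severity; it is an illustration written for this text, not the code of the Logstash syslog_pri plugin, and the function name is hypothetical.

```python
import re

# Severity names in the order defined by the syslog protocol (0 to 7).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


def decode_pri(line):
    """Return (facility number, severity name) for a syslog line starting with <PRI>."""
    match = PRI_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not start with a PRI field")
    # PRI = facility * 8 + severity
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, SEVERITIES[severity]


print(decode_pri("<6>1 2013-04-25T22:00:00Z myhost kernel - - - this message is a test"))
# (0, 'info')
print(decode_pri("<38>Apr 25 22:00:00 myhost sshd[4242]: Failed password for root"))
# (4, 'info')
```

Facility 0 is the kernel facility and facility 4 is the security/authorization facility, so the first line is an informational kernel message and the second an informational authentication message.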
3.2.3. ELSA input and output
ELSA uses Syslog-ng as the log receiver. All the inputs of Syslog-ng, which are called source
drivers, are valid for ELSA. In syslog-ng.conf a source is configured using this syntax [7]:
source id { driver1(opt1); driver2(opt2); ...; };
Some source drivers:
– file(fname [options]) – read messages from a file fname (usually employed for
reading messages from a special file of kernel messages, for example /proc/kmsg)
– internal() – read Syslog-ng internal messages
– unix-stream(fname [options]), unix-dgram(fname [options]) – read messages from a
UNIX file system socket fname in stream or datagram mode
– tcp([options]), udp([options]) – receive BSD syslog messages from remote hosts over
TCP or UDP
– syslog([options]) – receive IETF syslog messages
An ELSA node can forward log messages to other nodes using SCP and HTTP/HTTPS. Although
ELSA has a certain predefined log flow, it should be possible to send output using Syslog-ng in
parallel to the batch loading into MySQL, e.g. syslog messages to an IP address and a TCP or
UDP port.
3.3. Interface
3.3.1. Graylog2 interface
The Graylog2 interface is arranged into tabs. Menu buttons at the top of the page can be used to
switch between the tabs. These buttons are: messages, streams, hosts, blacklists, settings and
users. With default settings the messages tab is the first one displayed to the administrator after logon.
A regular user can see only the streams tab. The messages tab consists of the following main parts: search field,
menu, overview table and a sidebar.
The search field spans the whole page and contains sample search instructions in transparent
text, which disappear once the field is clicked to place the cursor in it. Next to the search field there is
a dropdown with relative time options ranging from 5 minutes through 1 day and 1 month to
always. The default time is 5 minutes, so recent data is displayed to the administrator right after
the logon.
Overview table contains a list of log message lines. Messages can be clicked to display more
information in the sidebar: permalink, breakdown of the message, full message and stream name
if it belongs to any. Overview table also shows the total amount of logs and has links to toggle
between the view of recent and all log messages. Additionally there is a button in the shape of an
asterisk which can be used for highlighting today’s messages.
By default the sidebar shows a graph of recent incoming logs and a welcome message. Mini graphs of
favourite streams are also shown there. The sidebar basically shows details of the active object
clicked by the user, e.g. a log message in the overview table. When scrolling down the list of logs a
“Back to top” button appears on the screen, which makes it easy to get back to the top of
the page where all the menus and the search field are located.
When the sidebar is displaying a graph it has a “server health” button. It leads to a page with a
dashboard of near-real-time throughput statistics that also shows the recent highest value. (The
current throughput in logs per second is also shown on the main page in the top right corner.)
Apart from the dashboard, the “server health” page contains status information on the Elasticsearch
server and also shows status log messages produced by the main Graylog2 server applications (e.g.
Graylog2 server start-up and shutdown). The streams tab contains controls to create saved searches
and arrange them into categories. The hosts tab contains a list of hosts, which are added automatically
once logs start coming from a source. The blacklist tab has an option to create a blacklist with a set of
regular expression rules; matching content is filtered out and discarded.
The settings tab has subsections which allow defining the length of a message shown to the
user, adding a column to the log list, configuring AMQP settings, adding comments to messages
using regular expressions, defining templates for filtering out sensitive data, enabling or disabling
plugins and checking whether the latest version of Graylog2 is installed. The users tab allows creating user
accounts of two types: admin and reader. A reader user sees only the streams tab, containing the streams
assigned by the admin user.
3.3.2. Kibana interface
Kibana has a very well designed interface. It has exactly what is needed for easily searching
through formatted data and analysing it with a single click. This does not mean that searching through
unformatted logs is impossible; it just requires writing queries manually.
The Kibana web interface home page has the following main sections: search field, field panel containing
message fields (also referred to as the “Show fields” section), graph, and a table panel which is
mainly a list of logs.
The search area is a big black rectangular frame positioned across the top of the page. In its left part
there is a small white Kibana logo serving as a “home” link and a time dropdown. A white search
field is in the middle; when it is blank it shows “Search” written in a thin font. To the right
of the search field there are a blue “Search” button and a red “Reset” button. The far right part of
the search area has a mini dashboard displaying the current number of search hits. The time
dropdown next to the search field has relative time options ranging from 5 minutes through 12
hours and 7 days to “All Time”. The default time is 15 minutes and there is also an option for a custom
time frame.
Interface overall is very dynamic and interactive. All the lines of logs are clickable and
expandable into more detailed fields. Each field can be used for dynamic query building. When
fields in the field panel are clicked - a menu with quick stats appears. Buttons such as “score”,
“trend”, “terms” and “stats” inside this menu can be used for various analytical manipulations like
changes in share, average values, distribution represented in pie charts, stock market type tables
etc. With Kibana 3 it is possible to design a custom interface interactively without any coding. It
is possible to create custom panels and dashboards and save these interfaces.
3.3.3. ELSA interface
The ELSA interface
design is very minimalistic and conservative. In the administrator account there are five dropdowns
which are somewhat reminiscent of Windows 95 menus. In the top left corner above the search field are
the Elsa and Admin menus (Elsa, not ELSA, is used for referring to the menu, as this is the way it is
written in the ELSA interface; the same applies to other menu names). The Elsa menu consists of Query
Log, Saved Results, Alerts, Active Queries, Dashboards, Saved Searches and Preferences. Query
Log contains the list of recent queries and statistics with the time taken to run each query.
The Saved Results section has the list of saved results and allows creating an alert or schedule.
Additionally it allows rerunning the search and presents a permalink for the query results. It is
possible to schedule a rerun of a certain query and apply an action if new events matching the
search criteria are recorded. Here are some of the actions available: save report, send email, send
to CIRT, send to malware analysis sandbox. The Elsa dropdown can also be used to view alerts, saved
searches and active queries. Dashboards can be created and managed through the Dashboards
option in the Elsa dropdown.
The Admin dropdown menu allows managing permissions, viewing stats on a general dashboard,
cancelling livetails and viewing alerts. Livetails are live streams of logs. This function is currently
deprecated in ELSA because of stability issues [12].
Search results are presented in a tab below the search field; by default a new tab is created for
each search. It is possible to use the same tab for updating the search by ticking “reuse current
tab” to the right of the search field. It is possible to change the ordering style inside the tab to
“grid” with a second tick box in the same area.
It is possible to apply an action (e.g. export results, alert or schedule, add to dashboard, save
search etc.) to the search results using the “Results Options” dropdown inside the tab. The search area
consists of a field called “Query” and a “Submit Query” button. There are two separate fields (From and
To) for the start time and end time of the query, which can be filled using a calendar popup or
manually. The “Add term” and “Report on” dropdowns allow using predefined
templates for building specific queries such as BRO, SNORT and Windows messages. There is a
separate dropdown for setting the type of index to search in: Index, Archive, livetail etc.
3.4. Features
3.4.1. Graylog2 features
Streams in Graylog2 are saved searches that allow quick access to an overview of a certain
predefined situation. Streams are defined by rules which can be based on regular expressions, facility,
severity, host or a custom additional field with a certain predefined value. Streams can be sorted
into custom categories. Here is an example of a stream in Graylog2:
Category: security
Stream name: SSH authentication failure
Regular expression rule: sshd\[\d+\]: Failed password for (invalid user )?(\S+)? from
([\d.]+) port (\d+)
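How such a stream rule picks out messages can be tried with a short Python sketch. The pattern is written with the backslash escapes (\[, \d, \S) that a rule of this kind needs; the sample log lines are made up for illustration, and Python's re module is used here only as a stand-in for the regular expression engine inside Graylog2, which may differ in details.

```python
import re

SSH_FAILURE = re.compile(
    r"sshd\[\d+\]: Failed password for (invalid user )?(\S+)? from ([\d.]+) port (\d+)"
)

# Hypothetical sample messages
samples = [
    "Apr 25 22:00:00 myhost sshd[4242]: Failed password for invalid user admin from 192.0.2.10 port 51234 ssh2",
    "Apr 25 22:00:05 myhost sshd[4243]: Failed password for root from 192.0.2.11 port 51300 ssh2",
    "Apr 25 22:00:09 myhost sshd[4244]: Accepted password for alice from 192.0.2.12 port 51400 ssh2",
]

for line in samples:
    match = SSH_FAILURE.search(line)
    if match:
        # group 2 = user name, group 3 = source address, group 4 = source port
        print(match.group(2), match.group(3), match.group(4))
# admin 192.0.2.10 51234
# root 192.0.2.11 51300
```

Only the two failed logins match, so only those messages would be routed into the stream; the accepted login is ignored.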
It is possible to create blacklists containing a set of regular expressions to filter out
certain messages. Messages that match these regular expression patterns will be
dropped by the server.
Once a message is received and accepted by Graylog2 the originating host is automatically added
to the hosts list. The entire logging stream for any monitored host can be quickly accessed in the
hosts section. A host can be easily deleted from this list if it is no longer used. There is a “quick
jump to host” search field that might be very useful if there is a big list of monitored hosts. To
show all the logs that fall within a specific part of the graph, a segment of the graph can
be highlighted and the “Show messages in range” button clicked.
It is possible to assign an alert for all users or for each stream, so that users assigned to
a stream get an email. This is useful in case there is an event that needs urgent
attention from a specific person or group. Log rotation can be achieved by setting
elasticsearch_max_number_of_indices in graylog2.conf: elasticsearch_max_number_of_indices
multiplied by elasticsearch_max_docs_per_index equals the total number of messages held within the
setup.
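The effect of these two settings can be worked through with a few lines of Python. The numbers are hypothetical values chosen for illustration, not defaults of Graylog2 or figures from the tests, and the function names are made up for this sketch.

```python
def max_retained_messages(max_number_of_indices, max_docs_per_index):
    """Total messages kept: number of indices multiplied by documents per index."""
    return max_number_of_indices * max_docs_per_index


def retention_days(total_messages, events_per_second):
    """Approximate time until the oldest index is rotated out, at a constant event rate."""
    return total_messages / events_per_second / 86400


# Hypothetical configuration: 20 indices of 20 million documents each
total = max_retained_messages(20, 20_000_000)
print(total)                                # 400000000
print(round(retention_days(total, 200), 1))  # 23.1
```

Under these assumed values, a steady 200 events per second would fill the configured capacity in roughly 23 days, after which the oldest messages start to be discarded.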
3.4.2. Kibana features
Kibana has a very dynamic interface which allows flexible on-demand data analysis and visual
representation. Each log line can be expanded with one click within the same area to allow access
to details. There are action buttons that can be used for dynamically creating very specific queries.
Each line within the “fields panel” can be clicked to get a multi-purpose menu with quick stats.
Using this menu it is possible to see the distribution of the most popular occurrences, add specific
columns to the logs in the “table panel” (same as clicking a plus sign next to any field in the field
panel), include and exclude certain fields from the query with a single click (same as actions
within log details in “table panel”) use analytical tools on this data and mark all these occurrences
in the table panel with red font.
Functions such as Score, Terms, Trend and Stats can be used for data analysis. This can be done by
pressing the corresponding buttons inside the menu of a field or by piping manually in the search field.
@fields.host:log NOT @fields.facility:"user" | terms severity
This query can be produced dynamically with 6 clicks in the “fields panel”. The first two clicks: one on
@fields.host to open a popup menu, the second on the “include” icon (which looks like a
magnifying glass) next to the hostname “log”. Next, to exclude all messages with the user facility, click on
@fields.facility in the “fields panel” and then on the “exclude” icon (which looks like a “no
parking” sign, a slashed circle). Now the last two clicks: the first on @fields.severity in the
“fields panel” and the second on the “terms” button inside the popup menu. Statistics are based on
the last 2000 logs received, but this amount can be changed by editing the value of
"Analyze_limit" in KibanaConfig.rb.
Kibana does not have its own user management by default, but authentication modules can be
configured in KibanaConfig.rb. It is possible to hold user accounts in Elasticsearch and use LDAP
for authentication [21]. Alternatively user authentication can be done with the help of e.g. Apache
or Nginx. Log rotation can be done by scheduling a script that deletes old Elasticsearch indices, as
they are recorded in separate files by date.
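The selection step of such a rotation script could look like the following Python sketch. It only decides which daily indices are older than a retention period; the function name, the sample index names and the 7-day retention are assumptions made for illustration, and the actual deletion (for example an HTTP DELETE request to each index URL in Elasticsearch) is deliberately left out.

```python
from datetime import date, datetime


def expired_indices(index_names, today, keep_days, pattern="logstash-%Y.%m.%d"):
    """Return the daily indices whose date is more than keep_days before today."""
    expired = []
    for name in index_names:
        try:
            index_day = datetime.strptime(name, pattern).date()
        except ValueError:
            continue  # not a daily log index, leave it alone
        if (today - index_day).days > keep_days:
            expired.append(name)
    return expired


# Hypothetical list of indices present in Elasticsearch
names = ["logstash-2013.04.01", "logstash-2013.04.20", "logstash-2013.04.25", "kibana-int"]
print(expired_indices(names, date(2013, 4, 25), keep_days=7))
# ['logstash-2013.04.01']
```

For the Rsyslog setup the same function can be called with pattern="rsyslog-%Y-%m-%d". Skipping names that do not match the pattern keeps indices that hold other data from being removed by accident.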
Kibana 3, a new version released in 2013, has an extended dashboard and analytics module. Kibana
3 allows creating custom dashboards, comparing ranges of events by combining them into one
graph etc. It is possible to save interfaces and queries into Elasticsearch, or to export them to a file or to a “gist”
on the Github website [22].
3.4.3. ELSA features
ELSA is oriented more towards performance than dashboards and was designed for incident
response and fighting APT. It has a Google-style search and allows sorting search
results by any field and producing custom reports. It is possible to export results as a permalink or in
Excel, PDF, CSV, and HTML. ELSA supports full Active Directory/LDAP integration for
authentication, authorization and email settings. It supports archiving of logs with a better than 10:1
ratio. ELSA supports email alerts and other actions that can be triggered if defined queries get hits
on new log messages. Its architecture is fully distributed and can handle n nodes with all queries
executing in parallel. ELSA ships with normalization for some Cisco logs, Snort/Suricata, Bro,
and Windows via Eventlog-to-Syslog or Snare [23]. Log rotation can be done by size in bytes or
by retention period, with the values set in the elsa_node.conf file.
3.5. Search
3.5.1. Graylog2 search
In earlier versions of the Graylog2 web interface the search was divided into separate fields like
message, timeframe, facility, severity etc. Some of the fields supported Lucene syntax, some
required the use of regular expressions. Starting from version 0.10 Graylog2 applied a more user-
friendly search method. Now there is one search field and Lucene syntax can be used in it. There
is a quick filter option to filter the search results by message, timeframe, facility, severity and
host. The Graylog2 message field is split into terms. Each part of the query delimited by a space
is searched for separately. Apache Lucene syntax allows using wildcards and doing fuzzy and proximity
searches (see Appendix 9 for more details).
3.5.2. Kibana search
Kibana (as well as Graylog2 and Elasticsearch) uses Lucene Query Syntax for search. It is
possible to do a simple full text query across all the lines of log messages, or to use Lucene to be very
specific, target certain fields and add conditions (see Appendix 10 for more details). Its
dynamic interface makes creating new queries very easy.
3.5.3. ELSA search
ELSA syntax is basically the same as Google syntax. There is a possibility to do sub-searches by
piping one search into another. There is an important difference between the way the queries are
done in ELSA and the other two solutions. In ELSA it is not possible to use wildcards in basic
queries. Only special asynchronous queries can contain wildcards. Results for such queries are
sent later by email (see Appendix 11 for more details).
3.6. Conclusion based on comparative analysis
Each of these log management solutions has its strengths and weaknesses. The choice of a system
strongly depends on the environment it will be used in and the goals that are pursued. There is no
perfect solution for every purpose and environment. Table 2 lists the main
advantages and disadvantages of the log management solutions in the author’s opinion.
Graylog2
Advantages:
1. Easy basic user management with possibility of advanced authentication (e.g. LDAP)
2. Saved searches (called streams) can be easily assigned per user
3. Creating blacklists to drop logs that match a pattern from the web interface menu
4. Nice and simple interface
Disadvantages:
1. Insufficient analytical functionality
2. Too many operations needed to see the log details
Kibana
Advantages:
1. Easy point and click analysis
2. Choice between easy integration with Logstash or Rsyslog
3. Really usable and efficient interface
4. Kibana 3 offers easy interface customisation
5. Great dashboards
Disadvantages:
1. No alerts
2. No native user management (in development)
3. No saved searches (in development)
ELSA
Advantages:
1. High-volume receiving/indexing (a single node can receive > 30k logs/sec, sustained)
2. Settings can be changed without restarting services as scheduled script reads the configuration
3. Customisable action of Info field in the logs depending on the log type (plugins needed)
4. Allows scheduling searches and various alerts and actions triggered: email, ticket creation
5. Gathers statistics for queries by user and log size and count
Disadvantages:
1. Not too flexible, designed specifically for Incident response and high scale
2. Web interface very conservative
3. Livetail not available currently
Table 2 Advantages and disadvantages of log management solutions
In the author’s opinion Graylog2 is a great tool for environments that need to give
access to specific logs only. An example would be a company that provides IT services and
has different teams (developers, system administrators, network administrators, supervisors) who
should only have access to a specific part of the logs.
Kibana would be the best choice for environments that benefit from a combination of great
usability, analytics and good performance.
ELSA should be suitable for high-volume and large-scale log management. It is specifically
designed for network incident response and fighting APT. This is a great tool for large network
monitoring; for example ISPs or a CERT could benefit from using ELSA.
4. Choosing a log management solution
In order to choose the most suitable log management solution, primary and secondary data were
collected for a detailed comparison. Secondary data was collected from the official websites of the
log management solutions, configuration files and related web resources, e.g. forums and
discussions including Github, a website for managing development projects [24]. Primary data
was generated by setting up the latest versions of all three log management systems in a virtual
environment and performing a series of tests. The testing process and results will be described in
chapter 4.2 and its subsections.
4.1. Logging requirements for small business
Small companies usually have a wide variety of different systems and devices in their
infrastructure. It can be a mixture of different vendors and different sorts of
operating systems. This sets the requirement that the log management system should be suitable
for mixed types of logs.
As the event rates and log message volumes are normally modest, performance is not the key
factor in the choice of a log management system. The usual rate for a small company might be
100 – 200 events per second. The number can of course be different depending on the size of the
network, the specific environment, the logging level and the tasks solved by log management. This
allows solutions with lower performance like Graylog2 and Kibana to compete with high
performance ones like ELSA in the context of a small company.
As for the target company where the chosen log management solution will be
implemented, the event rate is estimated at around 1000 - 2000 logs per second with hypothetical
peaks of 3000 per second in case debugging is turned on for the main systems. This relatively high
event rate for a small company is expected because all the syslog capable devices in the local network
would be sending logs to the central log management solution and additionally the logs from
critical servers in the cloud might be sent as well. For big companies the event rate could be much
higher: 50 000 – 100 000 logs per second.
4.2. Testing
Performance and usability testing was carried out for gathering primary data which is needed for
comparison. Usability testing results are based on the author’s experience and opinion.
4.2.1. Testing environment
Modest hardware specifications were chosen for the performance testing as small
companies, including the one where the tests were carried out, normally have limited resources.
Additionally, performance on hardware with low specifications might show how efficiently the
system utilizes limited resources. CentOS was chosen as the operating system (rather than, e.g., Debian)
because it is officially supported by Microsoft Hyper-V, which would be the production
environment for the log management solution [25]. Testing was done on virtual machines using
Oracle VirtualBox version 4.2.10 r84104.
The basic specifications of the host used for testing are described below:
Hardware used: Acer TimelineX 5830
OS Microsoft Windows 7 Professional 64-bit SP1
CPU Intel Core i5 2430M @ 2.40Ghz Sandy Bridge 32nm
RAM 6,00GB Dual-Channel DDR3 @ 665MHz (9-9-9-24)
Motherboard Acer JM50_HR (CPU1)
Hard Drive 238GB V4-CT256V4SSD2 (SSD)
NIC Atheros AR8151 PCI-E Gigabit Ethernet Controller
CentOS was chosen as the guest operating system. Here are the hardware resources and exact
version of operating system used:
CentOS 6.4 Kernel 2.6.32-358.2.1.el6.x86_64
Assigned hardware resources per log management server:
1 virtual CPU core (for single-core test)
4 virtual CPU cores (for multi-core test)
2048 Mbytes of RAM
Dynamically allocated VHD disk space
4.2.1.1. Graylog2 software components
The latest version of the Graylog2 log management solution at the time of the performance testing
was 0.11.0. This version of Graylog2 requires Java 1.6 and Ruby 1.9 or higher.
Here is the list of the main components of the Graylog2 solution and the corresponding logos (see
Figure 2 below).
Figure 2 Graylog2 software components
See Appendix 8 for more details.
4.2.1.2. Kibana software components
Kibana was designed as a frontend for Logstash, but it can be used with other backend systems
which can send specially structured logs into Elasticsearch (e.g. Rsyslog with the omelasticsearch
module). The main components of the Kibana solution and the corresponding logos are shown in
Figure 3.
Figure 3 Kibana main components
See Appendix 8 for more details on components.
4.2.1.3. ELSA software components
ELSA can be installed with a fully automated script, install.sh, which installs the program and all
its dependencies from scratch. The main components of the ELSA solution and the
corresponding logos are shown in Figure 4.
Figure 4 ELSA main components
See Appendix 8 for more details on components.
4.2.2. Performance testing
To compare the log management systems, a performance test was carried out. The benchmark used
for stress-testing each system consisted of sending a large batch of 100 000 IETF syslog
messages to the tested system. To ensure reliable delivery of all messages, they were sent
over TCP, without any delay between individual messages. The performance of the system was
measured as the overall test execution time. In other words, the execution time reflects the
event processing speed of the system as observed by the client, and how much log data the client
can realistically transmit to the system in a given time frame.
The script used for sending the IETF formatted logs is presented here:
#!/bin/bash
# Emit the test message 100 000 times and send the batch to the local syslog port over TCP
printf '<6>1 2013-04-25T22:00:00Z myhost kernel - - - this message is a test\n%.0s' {1..100000} | nc -w 1 -t 127.0.0.1 514
In addition to measuring the event processing speed, CPU consumption of the individual parts of
each log management system was investigated in order to identify potential bottlenecks.
Tools used for performance testing: time [26], nc [27] (netcat), htop [28].
A simple test script, logtest.sh, was used. The Unix time utility was used to measure the time it
takes to run the script. The shell command used for running the test is shown below.
/usr/bin/time -f'%E' ./logtest.sh
(-f'%E' shows only the elapsed time, without user or system time)
The Unix “printf” command is used to generate the messages on standard output. The escape sequence
\n marks the end of each line. The conversion %.0s prints its argument with zero width, so each
value produced by the range in curly brackets is consumed without being printed and the format
string is emitted once per value, which generates the corresponding number of lines. These lines
of formatted text are then piped to netcat and sent using TCP or UDP to the required IP address
and port. “-w 1” defines a 1 second timeout, which means that the connection is closed if no more
input is detected for 1 second. “-t” selects TCP, as it was necessary to make sure that the logs
reach the destination before the time is counted. “127.0.0.1 514” is the target IP address and port.
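The same benchmark can also be sketched in Python. This is only an illustrative sketch and not the script used in the tests: the names build_payload and send_payload are assumptions introduced here, and the default target simply mirrors the 127.0.0.1 514 address used by logtest.sh.

```python
import socket
import time

# Same test message as in logtest.sh, terminated by a newline.
MESSAGE = "<6>1 2013-04-25T22:00:00Z myhost kernel - - - this message is a test\n"


def build_payload(count: int) -> bytes:
    """Return `count` copies of the test message, one per line."""
    return (MESSAGE * count).encode("ascii")


def send_payload(payload: bytes, host: str = "127.0.0.1", port: int = 514) -> float:
    """Send the payload over a single TCP connection and return the elapsed seconds."""
    start = time.monotonic()
    with socket.create_connection((host, port)) as conn:
        conn.sendall(payload)
    return time.monotonic() - start
```

Calling send_payload(build_payload(100000)) against a listening syslog server would return a client-side elapsed time. The sketch does not reproduce the netcat timeout behaviour, so its timings are not directly comparable with those reported here.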
4.2.2.1. Performance testing results:
Performance testing showed that in the given configuration the solutions can be put in the
following order, from highest performance to lowest:
1. ELSA
2. Kibana and Rsyslog
3. Kibana and Logstash
4. Graylog2
Figure 5 below compares the performance test results in logs per second.
Figure 5 Performance test results statistics compared
*results for the tweaked setup (described in 4.2.2.1.1) are presented in green
The CPU percentages stated in the test results are based on the indicators in htop, which counts
each virtual CPU core (a thread inside a physical CPU core) as 100% of CPU. Both single-core and
multi-core setups were used for each series of performance tests.
[Figure 5 chart: logs/sec (axis 0 to 16000) for Graylog2*, Kibana and Logstash, Kibana and Rsyslog, and ELSA; series: 1 virtual CPU core, 4 virtual CPU cores, 4 virtual CPU cores tweaked]
Using 4 cores increased the performance of the log management systems: Graylog2 by about 70%,
Kibana and Logstash by 60%, Kibana and Rsyslog by 25% and ELSA by 28,6% (see Figure 6).
Figure 6 Relative increase in performance with 4 cores
Graylog2 and Kibana with Logstash showed a very good increase in the multi-core setup, as both are
multi-threaded and CPU intensive. As Elasticsearch, which is also CPU intensive, was run on the same
machine in this test setup, adding more CPU power increased performance considerably. Kibana
with Rsyslog and ELSA had a smaller increase in performance when more CPU cores were
added. For Rsyslog this can be explained by the performance limits of the omelasticsearch
module, which can send up to 10 000 logs per second via TCP [29]. Achieving more than 50% of
this maximum performance is a good result for such modest hardware. ELSA is
already so efficient that the change in performance was not so big. Additionally, the difference
was hard to measure accurately using an external stopwatch.
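The throughput and relative increase figures follow directly from the average elapsed times reported for each system in 4.2.2.1.1 to 4.2.2.1.4. The arithmetic can be sketched as follows; the function names are illustrative, and the elapsed times are the averages quoted in the text.

```python
BATCH_SIZE = 100_000  # messages sent in every test run

# Average elapsed times in seconds as (1 virtual core, 4 virtual cores).
ELAPSED = {
    "Graylog2": (170.0, 100.0),
    "Kibana and Logstash": (101.0, 63.0),
    "Kibana and Rsyslog": (22.0, 17.6),
    "ELSA": (9.0, 7.0),
}


def logs_per_second(elapsed_seconds: float, batch_size: int = BATCH_SIZE) -> float:
    """Throughput observed by the client for one batch."""
    return batch_size / elapsed_seconds


def relative_increase(single_core: float, multi_core: float) -> float:
    """Percentage gain in throughput when going from 1 to 4 cores.

    The throughput ratio equals the inverse ratio of the elapsed times.
    """
    return (single_core / multi_core - 1.0) * 100.0


for name, (t1, t4) in ELAPSED.items():
    print(f"{name}: {logs_per_second(t1):.1f} -> {logs_per_second(t4):.1f} logs/sec, "
          f"+{relative_increase(t1, t4):.1f}%")
```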
4.2.2.1.1. Graylog2 performance test
Sending 100 000 IETF formatted logs took on average 2 minutes and 50 seconds,
which is about 588,2 logs per second.
This is an average calculated from 20 tests. During the performance test most CPU was
used by the Graylog2 server process, which utilised on average around 58% of CPU. The second most
CPU intensive process was Elasticsearch, which consumed on average close to 38%.
When 4 virtual cores were used, the time needed for handling 100 000 logs went down to an average
of around 1 minute 40 seconds. This is 1000 logs per second.
[Figure 6 chart: relative increase in performance (axis 0% to 160%) for Graylog2, Kibana and Logstash, Kibana and Rsyslog, and ELSA; series: increase in performance, tweaked]
As the number of logs per second was relatively low compared to the other systems, additional tests
for Graylog2 were carried out with tweaked configurations. The tests were done using 4 virtual
CPU cores. The best performance in this setup was achieved by limiting the number of processors
used by Graylog2, which left more CPU for Elasticsearch. During the test the most
CPU was consumed by Elasticsearch, on average around 280%, which translates into 2,8 virtual
cores. Graylog2 consumed on average around 100% CPU, which is one virtual core.
This was achieved by setting processbuffer_processors = 1 and outputbuffer_processors = 1 in the
Graylog2 configuration file (see Appendix 7 for a configuration file sample). This setup is most
likely not suitable for production, as it might cause a buffer overflow. It was used for testing
purposes only, and it gave the best performance results. During this test the Graylog2-server.jar
process was started in the foreground to make sure that the setup caused no buffer overflow or
other error messages.
With the tweaked settings, the best average time needed for handling 100 000 logs was
1 minute 10 seconds. This is about 1428,6 logs per second (see Figure 7).
Figure 7 Graylog2 performance test results logs/sec
[Figure 7 chart: Graylog2 logs/sec (axis 0 to 1600); series: 1 virtual CPU core, 4 virtual CPU cores, 4 virtual CPU cores tweaked]
4.2.2.1.2. Kibana & Logstash performance test
The output in logstash.conf was set to elasticsearch_http. Grok, mutate and syslog_pri were used
for filtering and indexing (see the advanced scenario in chapter 3.2.2).
Sending 100 000 IETF formatted logs took on average 1 minute and 41 seconds,
which is about 990 logs per second.
This is an average from 20 tests with IETF formatted messages. Most CPU was consumed
by the Logstash server process, which took on average around 60% of CPU. The second most CPU
intensive process was Elasticsearch, which consumed on average around 35%. The Kibana.rb process
consumed around 2% of CPU.
When the multi-core setup with 4 virtual cores was used, the time needed for handling 100 000 logs
went down to an average of around 1 minute 3 seconds. This is about 1587 logs per second (see
Figure 8).
During the multi-core test about 170% of CPU, which is 1,7 cores, was used by Logstash.
Elasticsearch used on average around 100%, which is 1 virtual core, sometimes peaking at 150%. At
the same time the Kibana.rb process was consuming 2-3% of a virtual CPU core.
Figure 8 Kibana and Logstash performance test results logs/sec
[Figure 8 chart: Kibana and Logstash logs/sec (axis 0 to 1800); series: 1 virtual CPU core, 4 virtual CPU cores]
4.2.2.1.3. Kibana and Rsyslog performance test
In the single-core test Elasticsearch consumed on average about 85% of the CPU. Rsyslog consumed
about 2-3% of a single core. Kibana stayed around 2% of a single core. Single-core test result:
100 000 IETF log lines in 22 seconds, which is 4545,45 logs per second (see Figure 9).
Figure 9 Kibana and Rsyslog performance test results logs/sec
During the test using 4 virtual cores the Elasticsearch processes averaged around 250% of CPU,
which is 2,5 virtual cores, sometimes peaking at 370%. The two Rsyslog processes each utilised 12%
of CPU on average. Kibana.rb consumed 2-3% of one virtual CPU core. Test result with 4 virtual
cores: 100 000 IETF log lines in 17,6 seconds, which is 5681,82 logs per second.
[Figure 9 chart: Kibana and Rsyslog logs/sec (axis 0 to 6000); series: 1 virtual CPU core, 4 virtual CPU cores]
4.2.2.1.4. ELSA performance test
Since ELSA’s log reception and log storage procedures are separated from each other and log data
is written into storage asynchronously, the event processing speed observed by the client is very
high, because database access incurs no performance penalty. Nevertheless, while
asynchronous log storing provides performance benefits to the client, it also leaves the database
out of sync for a certain time frame (by default, for 1 minute). In order to provide a fair
comparison with the other systems, the log reception and log storing times were measured separately
and added up. While this method is not 100% precise, it provides a good estimate of the log data
processing time from the client's perspective.
According to the results of the single-core test, it takes about 9 seconds from sending 100 000
logs through Syslog-ng, which parses them using PatternDB, until these logs become available for
querying in the ELSA web interface. This is about 11 111 log lines per second.
In the multi-core setup the whole operation, from sending the logs to getting results in the ELSA
web interface, took about 7 seconds. This amounts to about 14285,7 logs per second (see Figure 10).
CPU consumption during the tests showed how efficient ELSA actually is. The most CPU
intensive processes were ELSA, Syslog-ng and Sphinx search. When the single-core test was run,
ELSA and Syslog-ng first consumed almost 50% of CPU each. Then, after the batch was loaded
into the MySQL database, Sphinx briefly peaked at almost 100% CPU. The CPU peaks lasted one or
two seconds. This suggests that ELSA, whose heaviest components (Syslog-ng and Sphinx) are
compiled C and C++ programs, is much more efficient than Graylog2 and Kibana, which run on Java
and Ruby. The multi-core setup showed a slightly different distribution, with more resources
utilised. Syslog-ng and ELSA each used one full virtual core (100%). Sphinx search used 100% of
one core and sometimes utilised more resources. The rest was used by MySQL and other processes.
Figure 10 ELSA performance test results logs/sec
[Figure 10 chart: ELSA logs/sec (axis 0 to 16 000); series: 1 virtual CPU core, 4 virtual CPU cores]
4.2.3. Usability testing
In the author’s opinion all three systems have well-built web interfaces. Kibana is the
most dynamic of the three, and is the most usable according to the author’s experience. Graylog2
is very user-friendly and has many functions at the fingertips, such as user management and
streams. It is a bit less dynamic than Kibana. One of the main reasons for this is that the sidebar
is needed for showing log details, which adds an extra action. The choice depends on the
environment where the system would be used. There are some important visual and functional
differences which would most surely influence the decision. The qualities and features of the log
management systems are discussed in chapter 4.2.3.1, where they are evaluated and ranked from the
usability perspective based on the author’s experience and opinion.
4.2.3.1. Usability testing results
The systems were given points for each test depending on their ranking: 3 points for first place,
1 point for second place and 0 points for third place. (The 3, 1 and 0 point system was chosen to
favour the solution that takes first place more often.) In the author’s opinion, considering pure
usability experience, the programs can be put in the following order, with the most usable on top:
1. Kibana
2. Graylog2
3. ELSA
The table below contains the total usability score and scores for every test of each solution (see
Table 3).
Usability test results
                                   Graylog2   Kibana   ELSA
Visuals and design                     1         3       0
Saved searches                         3         0       1
Alerts                                 1         0       3
Authentication and Authorisation       3         0       1
Search syntax                          1         3       0
Analytics                              0         3       1
Ease of use                            1         3       0
Universality                           1         3       0
Ease of installation                   0         3       1
Total                                 11        18       7
Table 3 Usability test score
Comments for each test are given in parts 4.2.3.1.1 to 4.2.3.1.9.
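The totals in Table 3 can be reproduced from the per-test rankings with the 3, 1 and 0 point system. The following is a small sketch of that calculation; the rankings are those given in the individual usability tests, while the function name total_scores is illustrative.

```python
POINTS = (3, 1, 0)  # points for first, second and third place

# Ranking per test, best system first.
RANKINGS = {
    "Visuals and design": ("Kibana", "Graylog2", "ELSA"),
    "Saved searches": ("Graylog2", "ELSA", "Kibana"),
    "Alerts": ("ELSA", "Graylog2", "Kibana"),
    "Authentication and authorisation": ("Graylog2", "ELSA", "Kibana"),
    "Search syntax": ("Kibana", "Graylog2", "ELSA"),
    "Analytics": ("Kibana", "ELSA", "Graylog2"),
    "Ease of use": ("Kibana", "Graylog2", "ELSA"),
    "Universality": ("Kibana", "Graylog2", "ELSA"),
    "Ease of installation": ("Kibana", "ELSA", "Graylog2"),
}


def total_scores(rankings: dict) -> dict:
    """Sum the points each system collects over all tests."""
    totals = {}
    for ranking in rankings.values():
        for system, points in zip(ranking, POINTS):
            totals[system] = totals.get(system, 0) + points
    return totals


print(total_scores(RANKINGS))
```

Because first place is worth three times as much as second place, a system that wins many tests outright (Kibana wins six of nine) ends up well ahead of one that is consistently second.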
4.2.3.1.1. Visuals and design
Kibana and Graylog2 have more colourful interfaces with higher-contrast schemes than
ELSA. For the search field Kibana uses a bold black frame at the very top of the web page, which
seems very comfortable as most browsers have the navigation bar at the top of the page (used for
URL input). Graylog2 uses quite a big part of the top of the page for the logo and the tabs.
The search field is located right under the tabs.
Kibana has the most functional, user-friendly and nice looking dashboards and graphs. ELSA would
probably take second place for dashboards, as it uses Google visualizations. The drawback is that
they depend on internet access (specifically access to the Google site).
Kibana has a solid dynamic interface which gives the feeling that everything is at the fingertips.
Graylog2 uses a tab-like structure for menus. In comparison to Kibana it provides much more
modest visualization and offers minimal data analysis. ELSA has a conservative looking
interface with grey dropdowns and sub-menus. The interface gets the job done, but seems a bit
boring and rigid. In the author’s opinion, considering visuals and design, the programs can be put
in the following order, with the system having the best visuals on top:
1. Kibana
2. Graylog2
3. ELSA
4.2.3.1.2. Saved searches
Graylog2 streams are very easy to configure but require using regular expressions. Although
Kibana does not have saved searches, there are workarounds for saving URLs with the query,
and there is a feature request, so the feature is being worked on at the moment [30]. ELSA has
saved searches based on a query and allows scheduling saved searches. In the author’s opinion,
considering saved searches, the programs can be put in the following order, with the best options
for saved searches on top:
1. Graylog2
2. ELSA
3. Kibana
4.2.3.1.3. Alerts
Graylog2 can send email alerts when a pattern is matched in the incoming logs during a set
period. A grace period option was added in the latest release, which allows limiting the number of
notifications. Kibana does not have alert functionality.
ELSA allows scheduling saved queries which search within the new logs. If the query returns
results, a defined action such as an alert or a sub-query is triggered. ELSA supports a
number of actions, e.g. sending an email, creating a ticket, or executing a sub-query that searches
within the results for a more precise match. In the author’s opinion, considering alert options,
the programs can be put in the following order, with the best options for alerts on top:
1. ELSA
2. Graylog2
3. Kibana
4.2.3.1.4. Authentication and authorisation
In the author’s opinion Graylog2 has the best authentication and authorisation options.
It allows basic user accounts to be created easily in the web interface and supports more complex
authentication mechanisms such as LDAP. Graylog2 can be used with basic authentication at first,
and settings for LDAP can be added later to the ldap.yml configuration file.
Kibana’s native authentication and authorisation module “kibana-ruby-auth” is currently under
development [31]. As a workaround it is possible to use LDAP and other authentication methods
through Phusion Passenger, e.g. as an Apache or Nginx module [32].
ELSA has three basic authentication and authorisation modes: none, local and LDAP. The first mode
gives any user that accesses the web page administrative access as a pseudo-user. The second
mode allows access based on credentials and group settings in the local system database. The third
mode uses LDAP/AD accounts and security groups. In the author’s opinion,
considering authentication and authorisation, the programs can be put in the following order, with
the best options on top:
1. Graylog2
2. ELSA
3. Kibana
4.2.3.1.5. Search syntax
In earlier versions of Graylog2 the search was split across multiple fields, some of which
supported Apache Lucene syntax and some regular expressions. Starting from version 0.10 Graylog2
uses a single search field which supports pure Apache Lucene syntax. Saved searches are still
defined with regular expressions; templates for matching positives can be combined, but no
templates defining exclusions can be added. So in general the search is still a combination of
Lucene and regular expressions. There is a quick filter function which allows filtering the search
results by message, timeframe, facility, severity and host.
Kibana has always used Apache Lucene search syntax. As dynamic queries are very easily created
in Kibana, it is very simple to build very specific search patterns from scratch.
ELSA uses a search syntax close to Google style, but the important difference is that no wildcards
can be used in basic queries. Only asynchronous queries can have wildcards, in which case the
results arrive later by email, which is not very convenient in many cases. In the author’s
opinion, considering search syntax, the programs can be put in the following order, with the best
application of search syntax on top:
1. Kibana
2. Graylog2
3. ELSA
4.2.3.1.6. Analytics
Concerning data analysis Graylog2 has very limited functionality. It has some basic graphs which
show the number of logs per given period.
Kibana offers flexible and functional analysis tools with very good dashboards. Kibana 3 allows
creating custom interfaces and dashboards.
ELSA has good dashboards based on Google Visualisations, which are a powerful tool but require
internet access from the server, which is not always a good option and sometimes not possible.
In the author’s opinion, considering analytics, the programs can be put in the following order,
with the best analytics software on top:
1. Kibana
2. ELSA
3. Graylog2
4.2.3.1.7. Ease of use
In the author’s opinion Kibana is the most intuitive and easy to use. All operations
take a minimum of clicks and movements and can be done in more than one way. Graylog2 version
0.11 has improved in terms of ease of use in comparison to 0.9x: a single search field was
introduced which supports Apache Lucene syntax. It takes more operations than in Kibana to see the
details of an event log: a permalink inside the sidebar has to be clicked, which is not
very convenient. ELSA has the least intuitive and easy to use interface of the three solutions.
In the author’s opinion, considering ease of use, the programs can be put in the following order,
with the easiest to use on top:
1. Kibana
2. Graylog2
3. ELSA
4.2.3.1.8. Universality
Central log management can be used in different environments: networking administration,
application development, system administration, web administration, software testing etc.
Although there is no perfect universal solution to fit every environment, Kibana is, in the
author’s opinion, likely to fit more types of environment because of its high usability and
analytics. ELSA is probably the least universal of the three solutions because it is designed
specifically for large-scale network analysis. In the author’s opinion, considering
universality, the programs can be put in the following order, with the most universal on top:
1. Kibana
2. Graylog2
3. ELSA
4.2.3.1.9. Ease of installation
In the author’s experience Kibana was the easiest and most straightforward to install of the
three solutions. Installation of Elasticsearch consists of downloading, extracting and starting it.
Kibana needs two more simple commands, as it uses Ruby. Logstash is available as a single Java file.
Rsyslog can be installed using packages (see Appendix 3).
Although ELSA has a fully automatic installation script tested on a number of Unix platforms, it
works well on clean OS installations only. The script resolves dependencies, installs the whole
solution within minutes and can be used for updating, but if there is an issue with a specific
component, it might fail. Troubleshooting can then be quite complicated, as the structure is not
trivial, and manual installation is also quite complex.
In the author’s opinion Graylog2 installation was the most complicated because of its web
interface, which added a lot of non-trivial installation steps. Another point is that, because of
MongoDB, Graylog2 initially requires substantially more disk space, so the 8 gigabytes of disk space
assigned by default for CentOS by Oracle VirtualBox had to be increased. In the author’s
opinion, considering ease of installation, the programs can be put in the following order, with the
easiest to install on top:
1. Kibana
2. ELSA
3. Graylog2
5. Implementation
Based on the research and testing it was decided to implement Kibana as the front end of the log
management solution. The main log shipper for Kibana would be Rsyslog. The estimated event
rate is 1000 to 2000 events per second, with peaks of up to 3000. According to the performance test
results, which were higher than 4000 logs per second, Kibana should be a suitable solution for the
environment.
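The capacity argument can be made explicit with a small calculation. This is only a sketch: the measured rates are the Kibana and Rsyslog results from the test environment, and it assumes that the production hardware performs at least as well as the test virtual machine.

```python
MEASURED_SINGLE_CORE = 4545.45  # Kibana and Rsyslog, 1 virtual core (logs/sec)
MEASURED_FOUR_CORES = 5681.82   # Kibana and Rsyslog, 4 virtual cores (logs/sec)
EXPECTED_PEAK = 3000            # estimated peak event rate (events/sec)


def headroom(measured_rate: float, peak_rate: float) -> float:
    """Return how many times the measured throughput exceeds the expected peak."""
    return measured_rate / peak_rate


print(f"1 core:  {headroom(MEASURED_SINGLE_CORE, EXPECTED_PEAK):.2f}x the expected peak")
print(f"4 cores: {headroom(MEASURED_FOUR_CORES, EXPECTED_PEAK):.2f}x the expected peak")
```

A headroom factor above 1 means that the measured throughput covers the expected peak. The benchmark used one short, uniform message, so real traffic with longer or more varied messages may leave less margin.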
5.1. Production environment
The environment for the implementation of the central log management system with Kibana as the
front end is a small office in Tallinn. This is a central reservations office for a Norwegian
company. There are around 150 client nodes. These are Intel hardware based workstations with
Microsoft Windows 7 Pro, managed through Active Directory. The main business critical services are
kept in a datacentre. There are some local servers, e.g. DNS, Active Directory, Microsoft SharePoint,
antivirus management, Cacti network monitoring and a Samba file server. Most of the servers are
installed as virtual machines on a Hyper-V server. There are two separate network lines: one
dedicated line for internal business critical traffic and a local ISP line for internet access. The
dedicated line connects to the datacentre and other offices. The main internal traffic is VoIP
(Microsoft Lync), Citrix and some web applications.
5.2. Implementation of Kibana in production
The production environment of the target company currently has 10 switches, 5 routers, 10
servers and more than 150 workstations in the local network. There are 10 critical servers held in
the cloud, which is a datacentre connected to the local office with a dedicated line. Keeping a
backup of the logs from the main servers in the cloud on the local log server would be a good
solution. Additionally, datacentre storage space is much more expensive than local storage, so more
logs can be kept locally.
In the first step of the implementation the VHD with the Kibana log management solution was
imported from the VirtualBox test environment to the Hyper-V server. As the operating system is
CentOS, the drivers for Hyper-V are included and the migration was done with no issues [33].
Here are the specifications of the central local server on Hyper-V:
HP Proliant ML350G6 E5620 P410i/512+BBWC 3x2GB 3x146GB
30 GB HP REG PC3-10600
4x146GB 6G SAS 10K rpm SFF (2.5-inch) Dual Port Hard Drives
In the current setup Kibana receives syslog messages from all Unix servers, the internal gateway,
which is a Cisco 800 series router, and 10 HP ProCurve switches. The desired setup is to
send log messages from all syslog capable devices to Kibana. As Windows does not support
syslog, a software client capable of converting the event log to syslog should be installed on all
Windows based workstations and servers (see Figure 11).
Figure 11 Scheme of Kibana implementation
While authentication and saved search features are being developed in Kibana, it is planned to use
Apache with the Passenger module for authentication and to save searches manually, as queries in
Kibana generate URLs in Base64 format [30]. Rsyslog would have two parallel outputs: one into
Elasticsearch and another one into text files. The text files would be kept to a minimum reasonable
size and would be rotated using Rsyslog log rotation to avoid duplicate logging on the same
machine. Alerting could be configured using Simple Event Correlator (SEC), which would
watch the rotated log files created by Rsyslog and send emails if a pattern is matched.
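Saving searches manually relies on the query being carried in the URL in Base64 form. The sketch below illustrates the idea of encoding a query description so that it can be stored as a link and decoded again. It assumes that the encoded payload is JSON, and the keys in the example, the function names and the host name kibana.example.local are hypothetical, not Kibana's documented schema.

```python
import base64
import json


def encode_query(query: dict) -> str:
    """Serialise a query description to JSON and Base64-encode it for a URL fragment."""
    raw = json.dumps(query, sort_keys=True).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_query(fragment: str) -> dict:
    """Reverse encode_query: Base64-decode the fragment and parse the JSON."""
    return json.loads(base64.b64decode(fragment).decode("utf-8"))


# Hypothetical saved search; the keys are illustrative only.
saved = {"search": "program:sshd AND failed", "timeframe": "900"}
fragment = encode_query(saved)
print("http://kibana.example.local/#" + fragment)
```

A list of such links, kept in a wiki page or a bookmarks file, would serve as a simple substitute for saved searches until the feature is available.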
6. Future research
The log management solutions described in this thesis were tested for small business. These
solutions could be used in bigger companies as well. Such companies could benefit from using
these open-source solutions, as the costs of log management can be very high in big companies
when commercial solutions are used.
As the scope of the thesis is log management solutions for small business, the tests were carried
out on modest hardware. Performance on more powerful hardware could be tested. This would
show how the systems would suit a larger scale environment. The tests were done on single nodes.
Scalability and performance of the solutions could be tested by installing clusters of different
sizes. The usability test results provided in this thesis were based on the author’s opinion.
Similar testing for a larger environment could be carried out with a target group, as bigger
companies have more personnel available.
7. Summary
In the author’s opinion all three systems have well-built web interfaces that serve their
intended purpose. The choice depends on the environment where the system would be used.
Graylog2 is a great tool for environments that need to give access to specific logs only. An
example would be a company that is providing IT services and has different teams: developers,
system administrators, network administrators, supervisors etc.
Performance testing showed that ELSA is the fastest and can handle about 14285,7 logs per
second with the modest hardware resources used for testing. As the solution is meant for small
business, performance is not a crucial factor so Graylog2 and Kibana could very well compete
with ELSA in the given conditions.
According to usability test results Kibana is the most usable system.
Kibana would be the best choice for environments that benefit from a combination of great
usability, analytics and good performance.
ELSA should be suitable for high volume, large-scale log management. It is specifically
designed for network incident response and fighting APT. It seems like a great tool for large
network monitoring; for example, ISPs or CERTs could benefit from using ELSA.
Kibana and Rsyslog were chosen for installation in the production environment because of their
usability, ease of installation and suitable performance.
Choosing an Open-Source Log Management System for Small Business
Master’s thesis, code ITI70LT
(30 ECTS)
Student: Artjom Tšurilin
Student code: 113832IVCMM
Advisor: Risto Vaarandi, Ph.D
Abstract
This thesis focuses on the comparison of three popular open-source log management systems. The aim
of the thesis is to give an overview of three popular log management systems and to offer guidance
for choosing the one that best suits a small business.
The choice is based on comparative analysis and on performance and usability testing.
Enterprise Log Search and Archive (ELSA) is a highly efficient open-source log management system
that can outperform high-quality commercial enterprise solutions. It is designed for effective
incident response and for fighting advanced persistent threats (APT).
Kibana is a log analysis front end for Logstash and Elasticsearch. It can also be used with
other backend systems that support formatted output into Elasticsearch, such as
Rsyslog with its Elasticsearch output module.
Graylog2 is an alternative log management tool with its own web graphical user interface (GUI).
A distinctive feature of Graylog2 is that logs can easily be divided into different streams, giving
different users access to different types of logs.
Performance testing showed that ELSA is the fastest and can handle about 14285,7 logs per second
with the modest hardware resources used for testing. As the solution is intended for
small business, performance is not a decisive factor, and Graylog2 and Kibana can compete very
well with ELSA in the given conditions.
According to the usability test results Kibana is the most usable system.
Kibana together with Rsyslog was chosen as the most suitable solution for a small business. It has
certain shortcomings in authentication and saved searches, but its usability,
ease of installation and universality make it an outstanding solution for small business.
The missing features are under development, and in the meantime external
mechanisms and workarounds can be used to compensate for them.
List of References
[1] K. Mani Chandy, Event-Driven Applications: Costs, Benefits and Design Approaches,
California Institute of Technology,
www.infospheres.caltech.edu/sites/default/files/Event-Driven%20Applications%20-%20Costs,%20Benefits%20and%20Design%20Approaches.pdf
(accessed 29.03.2013)
[2] http://www.webopedia.com/TERM/E/event.html (accessed 05.03.2013)
[3] R. Vaarandi, Cyber Defense Monitoring Solutions, 1-event-logs-and-syslog
[4] http://www.ietf.org/rfc/rfc3164.txt (accessed 07.04.2013)
[5] http://www.rsyslog.com/doc/history.html (accessed 06.04.2013)
[6] http://www.balabit.com/network-security/Syslog-ng/opensource-logging-system/features/comparison
(accessed 08.03.2013)
[7] R. Vaarandi, Cyber Defense Monitoring Solutions, 5-Syslog-ng-framework
[8] http://www.rsyslog.com/doc/licensing.html (accessed 06.04.2013)
[9] R. Gerhards, “Should I use rsyslog's new or old config style?” http://blog.gerhards.net/
(accessed 06.04.2013)
[10] http://www.graylog2.org/about (accessed 06.03.2013)
[11] https://docs.google.com/file/d/0By1KXg1ivlIeUjVoSVVjTVcxbzg/edit?pli=1 (accessed
18.03.2013)
[12] https://code.google.com/p/enterprise-log-search-and-archive/wiki/Documentation
(accessed 18.03.2013)
[13] http://elasticsearch.com/products/elasticsearch/ (accessed 12.03.2013)
[14] http://www.sinatrarb.com/ (accessed 25.03.2013)
[15] http://enterprise-log-search-and-archive.googlecode.com/svn-history/r112/wiki/Documentation.wiki
(accessed 18.03.2013)
[16] http://www.jboss.org/drools/drools-expert (accessed 10.03.2013)
[17] http://graphite.wikidot.com/ (accessed 25.03.2013)
[18] http://support.torch.sh/help/kb/graylog2-server/using-librato-metrics-with-graylog2
(accessed 06.03.2013)
[19] http://www.logstash.net/docs/1.1.10/ (accessed 08.04.2013)
[20] http://linuxdrops.com/log-management-using-logstash-and-kibana-on-centos-rhel-fedora/#
(accessed 10.03.2013)
[21] https://github.com/rashidkpc/Kibana/pull/261 (accessed 04.04.2013)
[22] https://gist.github.com/ (accessed 07.04.2013)
[23] https://code.google.com/p/enterprise-log-search-and-archive/ (accessed 18.03.2013)
[24] https://github.com (accessed 15.03.2013)
[25] http://technet.microsoft.com/en-us/library/cc794868%28v=ws.10%29.aspx (accessed
06.04.2013)
[26] http://linux.about.com/library/cmd/blcmdl1_time.htm (accessed 15.03.2013)
[27] http://netcat.sourceforge.net/ (accessed 15.03.2013)
[28] http://htop.sourceforge.net/ (accessed 10.03.2013)
[29] https://code.google.com/p/enterprise-log-management-appliance/wiki/omelasticsearch
[30] https://github.com/rashidkpc/Kibana/issues/326 (accessed 07.04.2013)
[31] https://github.com/rashidkpc/Kibana/issues/310 (accessed 06.04.2013)
[32] https://www.phusionpassenger.com/ (accessed 12.04.2013)
[33] http://wiki.centos.org/Manuals/ReleaseNotes/CentOS6.4 (accessed 15.03.2013)
[35] http://semicomplete.com/presentations/logstash-puppetconf-2012/#/ (accessed
12.03.2013)
[36] http://kibana.org/infrastructure.html (accessed 25.03.2013)
[37] http://graylog2.com/about (accessed 06.03.2013)
[38] http://support.torch.sh/help/kb/graylog2-web-interface/message-search-syntax (accessed
06.03.2013)
Appendices
Appendix 1 - Basic Event Log Cycle
[35]
Appendix 2 - Logstash Inputs, Filters and Outputs
Inputs         Filters        Outputs
amqp           alter          amqp
drupal_dblog   anonymize      boundary
elasticsearch  checksum       circonus
eventlog       clone          cloudwatch
exec           csv            datadog
file           date           elasticsearch
ganglia        dns            elasticsearch_http
gelf           environment    elasticsearch_river
gemfire        gelfify        email
generator      geoip          exec
graphite       grep           file
heroku         grok           ganglia
imap           grokdiscovery  gelf
irc            json           gemfire
log4j          kv             graphite
lumberjack     metrics        graphtastic
lumberjack2    multiline      hipchat
pipe           mutate         http
rabbitmq       noop           internal
redis          ruby           irc
relp           sleep          juggernaut
snmptrap       split          librato
sqs            syslog_pri     loggly
stdin          translate      lumberjack
stomp          urldecode      metriccatcher
syslog         useragent      mongodb
tcp            xml            nagios
twitter        zeromq         nagios_nsca
udp            -              null
varnishlog     -              opentsdb
websocket      -              pagerduty
xmpp           -              pipe
zenoss         -              rabbitmq
zeromq         -              redis
-              -              riak
-              -              riemann
-              -              sns
-              -              sqs
-              -              statsd
-              -              stdout
-              -              stomp
-              -              syslog
-              -              tcp
-              -              websocket
-              -              xmpp
-              -              zabbix
-              -              zeromq
[19]
Appendix 3 - Rsyslog main components installation
RPMs for installing Rsyslog v7
#!/bin/sh
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/libee-devel-0.4.1-1.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/libee-0.4.1-1.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/json-c-0.9-4.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/json-c-devel-0.9-4.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/libestr-0.1.5-1.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/libestr-devel-0.1.5-1.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/liblognorm-devel-0.3.4-5.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/liblognorm-0.3.4-5.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/rsyslog-7.2.6-3.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/rsyslog-elasticsearch-7.2.6-3.el6.x86_64.rpm
Installing from RPM
rpm -ivh libee-devel-0.4.1-1.el6.x86_64.rpm libee-0.4.1-1.el6.x86_64.rpm \
    json-c-0.9-4.el6.x86_64.rpm json-c-devel-0.9-4.el6.x86_64.rpm \
    libestr-0.1.5-1.el6.x86_64.rpm libestr-devel-0.1.5-1.el6.x86_64.rpm \
    liblognorm-devel-0.3.4-5.el6.x86_64.rpm liblognorm-0.3.4-5.el6.x86_64.rpm \
    rsyslog-7.2.6-3.el6.x86_64.rpm rsyslog-elasticsearch-7.2.6-3.el6.x86_64.rpm
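The download and installation steps in this appendix repeat the same ten package names. The following is a minimal sketch, not part of the original setup, of how a single list could drive both steps. The repository URL and package versions are copied from the commands in this appendix; the variable names and the helper rsyslog_rpm_urls are hypothetical and introduced only for illustration.

```shell
#!/bin/sh
# Sketch only: derive every download URL from one package list.
# BASE_URL and the package versions come from this appendix;
# the names PACKAGES and rsyslog_rpm_urls are illustrative assumptions.
BASE_URL="http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS"
PACKAGES="libee-0.4.1-1 libee-devel-0.4.1-1
json-c-0.9-4 json-c-devel-0.9-4
libestr-0.1.5-1 libestr-devel-0.1.5-1
liblognorm-0.3.4-5 liblognorm-devel-0.3.4-5
rsyslog-7.2.6-3 rsyslog-elasticsearch-7.2.6-3"

# Print one URL per package; nothing is downloaded here.
rsyslog_rpm_urls() {
    for pkg in $PACKAGES; do
        echo "$BASE_URL/$pkg.el6.x86_64.rpm"
    done
}

# Dry run: list the URLs that would be fetched.
rsyslog_rpm_urls
```

Piping the output into "wget -i -" would fetch the files, since wget reads URLs from standard input when the input file is given as "-". The rpm -ivh command can then be run on the downloaded files as shown above. The versions listed are those used in 2013 and may no longer be available from the repository.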
Appendix 4 - Kibana setup example scheme
[36]
Choosing an open source log management system for small business

  • 1. TALLINN UNIVERSITY OF TECHNOLOGY Faculty of Information Technology Department of Computer Science Chair of Network Software CHOOSING AN OPEN-SOURCE LOG MANAGEMENT SYSTEM FOR SMALL BUSINESS Master’s Thesis ITI70LT Student: Artyom Churilin Student code: 113832IVCMM Advisor: Risto Vaarandi, Ph.D Tallinn, 2013
  • 2. 2 Declaration I hereby declare that I am the sole author of this thesis. The work is original and has not been submitted for any degree or diploma at any other University. I further declare that the material obtained from other sources has been duly acknowledged in the thesis. ……………………………………. ……………………………… (date) (signature)
  • 3. 3 List of Acronyms and Abbreviations AMQP Advanced Message Queuing Protocol APT Advanced Persistent Threat CERT Computer Emergency Response Team CIRT Critical Incident Response Team CPU Central Processing Unit DNS Domain Name System. Often used to refer to a DNS server ELSA Enterprise Log Search and Archive. Open-source log management system created by Martin Holste – former Security Incident Response Team Lead specializing in network security monitoring and open-source tools FIFO First In First Out, in this paper used as named or unnamed pipe. A pipe is a mechanism for inter-process communication; data written to the pipe by one process can be read by another process GELF Graylog Extended Log Format GNU A recursive acronym for GNU's Not Unix GNU GPL GNU General Public License, widely used license for free software GUI Graphic User Interface LDAP Lightweight Directory Access Protocol AD Microsoft Active Directory PCAP Packet Capture. Application programming interface for capturing network traffic. PRI Priority field in Syslog message RFC Request for Comments RPM A package management system for many Linux distributions.
  • 4. 4 sendmail mail server application used on Unix platforms TCP Transmission Control Protocol. URL Uniform Resource Locator. Sometimes referred to as a “web link” UDP User Datagram Protocol. VHD File format supported by many virtual platforms. Virtual Hard Disk
  • 5. 5 Abstract This thesis focuses on comparison of three popular open-source log management systems. The purpose of this thesis is to give overview of three popular log management systems and provide guidelines for choosing the best suiting one for a small company. The choice was based on the comparative analysis as well as performance and usability testing. ELSA is a high performance open-source log management system that can challenge enterprise grade commercial solutions. It was designed for effective incident response and fighting against APT (Advanced Persistent Threat). Kibana is log analysis front end for Logstash and Elasticsearch. It can also be used with other back ends that support formatted output into Elasticsearch, such as Rsyslog with oemelasticsearch module. Graylog2 is an alternative log management tool with its own web GUI. Speciality of Graylog2 is that logs can be easily divided into different streams to give access to specific type of logs to different users. Performance testing showed that ELSA is the fastest and can handle in average 14285,7 logs per second with the modest hardware resources used for testing. As the solution is meant for small business, performance is not a crucial factor so Graylog2 and Kibana could very well compete with ELSA in the given conditions. According to usability test results Kibana is the most usable system. Kibana with Rsyslog was chosen as the best fitting solution for a small company. It has some shortcomings with authentication and saved searches, but the usability, ease of installation and universality makes it such an outstanding solution for small business. The lacking functions are under development, meanwhile there is possibility to use external mechanisms and workarounds.
  • 6. 6 Table of Contents List of Figures .................................................................................................................................. 9 List of Tables.................................................................................................................................. 10 1. Introduction ................................................................................................................................ 11 1.1. Event logs ............................................................................................................................ 11 1.2. Central log management...................................................................................................... 11 1.3. Purpose of the thesis............................................................................................................ 12 1.4. Outline of the thesis............................................................................................................. 12 2. Log collection............................................................................................................................. 13 2.1. Logging protocols................................................................................................................ 13 2.1.1. BSD Syslog protocol .................................................................................................... 13 2.1.2. IETF Syslog protocol.................................................................................................... 14 2.2. Non GUI logging solutions.................................................................................................. 14 2.2.1. Unix Syslogd software suite ......................................................................................... 15 2.2.2. Syslog-ng framework.................................................................................................... 15 2.2.3. 
Rsyslog software suite
2.3. Graphical log management solutions
2.3.1. Graylog2
2.3.2. Kibana
2.3.3. ELSA
3. Comparative analysis
3.1. Structure
3.1.1. Graylog2 structure
3.1.2. Kibana structure
3.1.3. ELSA structure
3.2. Input and output
3.2.1. Graylog2 input and output
3.2.2. Kibana input and output
3.2.3. ELSA input and output
3.3. Interface
3.3.1. Graylog2 interface
3.3.2. Kibana interface
3.3.3. ELSA interface
3.4. Features
3.4.1. Graylog2 features
3.4.2. Kibana features
3.4.3. ELSA features
3.5. Search
3.5.1. Graylog2 search
3.5.2. Kibana search
3.5.3. ELSA search
3.6. Conclusion based on comparative analysis
4. Choosing a log management solution
4.1. Logging requirements for small business
4.2. Testing
4.2.1. Testing environment
4.2.2. Performance testing
4.2.3. Usability testing
5. Implementation
5.1. Production environment
5.2. Implementation of Kibana in production
6. Future research
7. Summary
Resüme
List of References
Appendices
Appendix 1 - Basic Event Log Cycle
Appendix 2 - Logstash Inputs, Filters and Outputs
Appendix 3 - Rsyslog main components installation
Appendix 4 - Kibana setup example scheme
Appendix 5 - TCP and UDP output options in Logstash
Appendix 6 - Graylog2 setup example scheme
Appendix 7 - Graylog2 tweaked settings
Appendix 8 - Graylog2, Kibana and ELSA component details
Appendix 9 - Lucene search
Appendix 10 - Kibana search examples
Appendix 11 - ELSA search examples
Appendix 12 - ELSA performance test details
List of Figures
Figure 1 Log management solution model
Figure 2 Graylog2 software components
Figure 3 Kibana main components
Figure 4 ELSA main components
Figure 5 Performance test results statistics compared
Figure 6 Relative increase in performance with 4 cores
Figure 7 Graylog2 of performance test results logs/sec
Figure 8 Kibana and Logstash performance test results logs/sec
Figure 9 Kibana and Rsyslog performance test results logs/sec
Figure 10 ELSA performance test results logs/sec
Figure 11 Scheme of Kibana implementation
List of Tables
Table 1 Basic overview of the log management solution
Table 2 Advantages and disadvantages of log management solutions
Table 3 Usability test score
1. Introduction

Today’s computer networks are very complex. Operating systems have millions of lines of code, and the amounts of data and data transfer rates are continuously growing to meet the demands of the market. Even relatively small networks can have millions of events per second. These events vary in importance and are often interconnected. What are these events, how can they be managed and how can useful information be extracted from them? What do current popular solutions offer and how should one be chosen for a small company?

There are various commercial log management solutions available on the market. These solutions are quite expensive and are hardly affordable for small companies. Fortunately, there are open-source log management tools which are free of charge, so that the only cost is the hardware or the hardware resources on a virtual platform. As small companies normally have only a few technicians, it is important that the solution is easy to install, maintain and use. Performance requirements for small business are normally moderate, but they depend on the specific environment.

1.1. Event logs

An event can be defined as “a relevant change in state” of a system [1] or, alternatively, as an “action or occurrence detected by a program”. Examples include: a network packet arrived at a switch or a firewall, a user ran an executable, a network link went down, or a user browsing a website received error code 404 because of a broken URL. IT systems handling those events generate event messages and by default usually store them locally or, if configured specifically, send them to a remote location. When these messages are recorded they are referred to as event logs or simply logs. There are several standards and formats of log messages, but in general all logs consist of two main parts. The first is the timestamp, stating the date and time the event happened. The second part is the data, containing information about the event itself.
Logs can have further distinct parts such as facility (type of software that generated the event), source IP and severity (e.g. error, info, debug). A typical event log cycle is presented in a diagram in Appendix 1 at the end of the thesis.

1.2. Central log management

Many server and client operating systems, network switches, routers, firewalls, printers and even VoIP phones are capable of producing logs and sending them through the network. Depending on the size and complexity of the IT infrastructure there could be tens, thousands or possibly millions of events per second. These events vary in importance and urgency, but all of them are required to get the full picture of what is going on in the network and inside the nodes’ operating systems.

By default logs are stored locally. This setup has many drawbacks. Firstly, it is not efficient, as each device has to be managed separately. Secondly, logs stored locally can be deleted or changed. If an attacker or malware managed to infiltrate a network device or a server, logs including the records about the security breach could be changed or deleted. In this case the attack would not even be noticed. Thirdly, if a device’s memory is corrupted or its hardware fails, the local logs might not be accessible at all. In this case it might not be possible to find out the reason for the malfunction.

A central log management and event alert system can help solve these issues. It is of crucial importance for the IT department of any organisation to be able to efficiently track any event in the network within the needed timeframe. One logical solution to this issue is to send all logs to a central log server. Modern log protocols support encryption and authentication to secure the log collection. Software development, website administration, network administration and incident response are some example activities that require efficient log management.

1.3. Purpose of the thesis

The purpose of this thesis is to give guidelines for choosing a log management solution for small business and to choose the best-suited open-source log management system for a small target company. The choice is based on a comparative analysis as well as performance and usability testing.

1.4. Outline of the thesis

Chapter one of the thesis states the problem of the research. The main standards and protocol suites are described in chapter two. A comparative analysis based on the features, performance and usability testing is presented in chapter three. Chapter four describes the performance and usability testing and presents the results. The implementation plan of the chosen log management system in a small company is described in chapter five. Chapter six offers some ideas for future research. Chapter seven summarizes the thesis.
2. Log collection

Event logs can be generated by most applications, operating systems and network devices. Logs can be used for incident investigation, historical reporting, debugging etc. Because event logs are produced in real time, they can also be used for real-time monitoring systems. Such monitoring solutions often include a frontend with an analytical module and dashboards that show the current status as well as historical data. Usually such solutions are capable of sending notifications for specific events (e.g. in the form of email alerts).

2.1. Logging protocols

There are several main standards and protocol suites that are currently used in applications, operating systems and network devices. New standards were introduced to address the shortcomings of their predecessors.

2.1.1. BSD Syslog protocol

BSD Syslog was developed in the 1980s by Eric Allman for the sendmail application as an alternative to appending messages to flat files from programs. According to RFC 3164, the sender sends a syslog message with a maximum size of 1 KB to the receiver over the UDP protocol; destination port 514 is used and source port 514 is recommended. The syslog message is sent in a UDP packet which has the following payload:

<PRI>Timestamp Hostname Content

The formula for calculating PRI:

PRI = 8*Facility + Severity

Facility defines the software component that generated the event. Here are the facility values used for calculating PRI: kern (0), user (1), mail (2), daemon (3), local0..7 (16..23).

Severity defines the level of relative event importance. Here are the severity values used for calculating PRI: emerg (0), alert (1), crit (2), error (3), warning (4), notice (5), info (6), debug (7).

The timestamp has the following syntax: “MMM DD hh:mm:ss”. The hostname part contains the sender hostname or IP address. Up to the first 32 alphanumeric characters in the content field are regarded as the tag field (name of the logging program), and the rest is regarded as the message field [3].
One of the drawbacks of the BSD Syslog protocol is that it uses UDP only. This means there is no delivery control, as receipt is not acknowledged [4]. Another limitation is that BSD syslog does not support encryption, so messages are sent in clear text. It also does not support authentication. Timestamps carry no time zone information and have a resolution of one second. UTF encoded characters are not supported. These shortcomings were addressed by the IETF syslog protocol (Chapter 2.1.2).
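The PRI arithmetic described in 2.1.1 can be illustrated with a short Python sketch. The helper names encode_pri and decode_pri are hypothetical and introduced here only for illustration; the facility and severity codes are the ones listed above.

```python
# Facility and severity codes as listed in section 2.1.1.
FACILITY = {"kern": 0, "user": 1, "mail": 2, "daemon": 3}
FACILITY.update({f"local{i}": 16 + i for i in range(8)})  # local0..7 -> 16..23
SEVERITY = {"emerg": 0, "alert": 1, "crit": 2, "error": 3,
            "warning": 4, "notice": 5, "info": 6, "debug": 7}


def encode_pri(facility, severity):
    """Hypothetical helper: PRI = 8 * Facility + Severity."""
    return 8 * FACILITY[facility] + SEVERITY[severity]


def decode_pri(pri):
    """Hypothetical helper: split PRI into (facility code, severity code)."""
    return divmod(pri, 8)


if __name__ == "__main__":
    print(encode_pri("mail", "error"))  # 8*2 + 3 = 19
    print(decode_pri(134))              # local0.info -> (16, 6)
```

Because severity occupies the three lowest bits, integer division and remainder by 8 are enough to recover both values from the number between the angle brackets.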
2.1.2. IETF Syslog protocol

The IETF Syslog protocol is defined by RFCs 5424-5426. It supports TLS, and the default port for message reception is 6514/tcp. Both the message sender and receiver must support certificate-based authentication. However, the administrator chooses the authentication options. Messages are sent as TLS application data which consists of one or more syslog frames. RFC 5426 sets requirements for message transmission over UDP: the default port for message reception is 514/udp, and a message is sent as a single UDP packet. IETF syslog messages are more structured than those of BSD syslog. Here is the structure of IETF messages:

<PRI>Version Timestamp Hostname Application PID MsgID StructData Message

To sum up: the IETF syslog protocol is more structured, more reliable in transmission and more secure than BSD syslog [3].

2.2. Non GUI logging solutions

Since the 1980s, when the BSD syslog protocol was created, there have been some important developments in syslog-based solutions. Here are some important events that have formed today’s non-GUI open-source syslog market:
• 1998 Balabit releases Syslog-ng
• 2004 Rsyslog is released
• 2007 Syslog-ng announces Syslog-ng PE (premium edition)

At the same time that Syslog-ng went partially commercial in 2007 by introducing the PE version, Rsyslog reached the same level with its features. On February 28th Rsyslog 3.12.0 was released. According to Rainer Gerhards, from that date on Rsyslog supported all major Syslog-ng features and had a number of major features exclusive to it. Rainer Gerhards considered Rsyslog 3.12.0 fully superior to Syslog-ng at that time, with the exception of platform support [5]. Syslog-ng PE has some additional advanced features like encrypted log storage, Microsoft Windows support and client-side failover [6]. Judging by popularity, community support and online discussions, Syslog-ng OSE (open-source edition) and Rsyslog are the most widely used open-source non-GUI syslog solutions.
2.2.1. Unix Syslogd software suite

UNIX syslogd (syslog daemon) can receive messages from a local file system socket and UDP port 514, and send output to local files or to a remote syslogd instance. The syslogd configuration is usually stored in /etc/syslog.conf, which contains single-line rules. Each rule consists of a selector and an action, where the selector is a list of “facility.severity” pairs and the action specifies a destination for the message. Facility can be set to one of the standard syslog facility classifiers or to “*”, which means any facility. Severity can be set to one of the standard syslog severity classifiers or to “none”. Flat files, FIFOs, terminals and remote log servers are usually supported as destinations. This suite is still used for simple solutions, but generally it has been replaced by more functional software suites like Syslog-ng and Rsyslog.

2.2.2. Syslog-ng framework

Syslog-ng is one of the most prominent syslog frameworks, with a very large user base. It supports logging over both UDP and TCP. In addition to the BSD syslog protocol, it also supports the IETF syslog protocol, including encryption and authentication. Syslog-ng employs regular expressions for matching and filtering messages by tag, message text, etc. It supports custom message templates and allows the user to change the log message format and the set of message fields that are logged [7].

2.2.3. Rsyslog software suite

Rsyslog is an advanced open-source logging solution. The letter R in the name stands for reliable, which mainly emphasises the use of TCP as transport and is not meant to point to the unreliability of its predecessors. Rsyslog can be used under the terms of the GPLv3 license, but it can also be used in a project that is not GPLv3 compatible in some special cases described in the license agreement [8]. Rsyslog was developed in 2004 based on the sysklogd (syslog and klogd; the latter handles kernel messages) standard package.
The goal of the Rsyslog project is to provide a more feature-rich and reliable syslog daemon while retaining drop-in replacement capabilities for stock syslogd [5]. It adds a lot of features to Unix syslogd, including support for the IETF syslog protocol, advanced message filtering and custom message formatting. The Rsyslog configuration is usually stored in /etc/rsyslog.conf. It supports the traditional selector-action rules of Unix syslogd in order to ease migration from syslogd. Rsyslog has become the default logging solution for many Linux distributions. According to Rainer Gerhards, the main author of Rsyslog, the main competitor of Rsyslog is Syslog-ng. Rsyslog’s advantage is that it is free of charge including all features, whereas the full-featured Syslog-ng PE (premium edition) has a paid license.
Rsyslog maintains backward compatibility with syslogd because the “basic syslog.conf format is extremely well known, covered in a lot of text books, taught in numerous courses and used in a myriad of Internet tutorials. So if we would abandon it, we would thrash a lot of people's knowledge and help resources” [9].
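The selector-action rules that syslogd and Rsyslog share (see 2.2.1) can be illustrated with a simplified Python sketch. It is not how either daemon is implemented. It assumes the common syslog.conf conventions that a selector such as mail.warning covers the named severity and everything more severe and that selectors in a list are separated by ";". It handles only “*” and “none” and ignores extensions such as “=” or “!”. The function names, file paths and the host name loghost are hypothetical.

```python
SEVERITIES = ["emerg", "alert", "crit", "error",
              "warning", "notice", "info", "debug"]  # index = numeric severity


def selector_matches(selector, facility, severity):
    """Return True if one 'facility.severity' selector covers the message."""
    sel_facility, sel_severity = selector.split(".", 1)
    if sel_facility != "*" and sel_facility != facility:
        return False
    if sel_severity == "none":
        return False
    if sel_severity == "*":
        return True
    # Lower numbers are more severe, so "warning" also covers "error", "crit", ...
    return SEVERITIES.index(severity) <= SEVERITIES.index(sel_severity)


def route(rules, facility, severity):
    """Return the actions of all rules whose selector list admits the message.

    Selectors are evaluated left to right; a later 'facility.none'
    withdraws messages that an earlier selector admitted.
    """
    destinations = []
    for selectors, action in rules:
        matched = False
        for sel in selectors.split(";"):
            sel_facility, sel_severity = sel.split(".", 1)
            if sel_severity == "none":
                if sel_facility in ("*", facility):
                    matched = False
            elif selector_matches(sel, facility, severity):
                matched = True
        if matched:
            destinations.append(action)
    return destinations


if __name__ == "__main__":
    rules = [  # illustrative rules, not taken from the thesis
        ("*.info;mail.none", "/var/log/messages"),
        ("mail.*", "/var/log/maillog"),
        ("kern.crit", "@loghost"),
    ]
    print(route(rules, "mail", "info"))
    print(route(rules, "kern", "emerg"))
```

In this sketch a mail message of severity info ends up only in the mail log, because mail.none removes it from the first rule, while a kernel emergency is written locally and forwarded to the remote host.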
2.3. Graphical log management solutions

Whatever solution is used for the backend of the log collection, it is important to have the logs presented in a comprehensive and useful manner. The aim of a Graphical User Interface (GUI) is to give quick and easy access to an IT system. User management, system configuration, graphs with historical data and dashboards with real-time statistics are some of the main useful features that are available in a good GUI frontend. The productivity and user experience of an operator of such a GUI depend on how flexible, customisable and usable these options are. There is no perfect GUI for all cases; it is more a question of what best suits the given environment.

Open-source graphical log management solutions are quite flexible and can be used in combination with other non-GUI solutions. The following main components of a graphical log management solution can be outlined (see Figure 1):
• log shipper
• log parser
• log storage, indexing and search
• web interface

Figure 1 Log management solution model (diagram components: log shipper, log parser, storage, search index, web interface)

Most of the components, depending on the solution, can be replaced by alternative ones.
Log shipper can normally be any log collection software like a syslog daemon. It serves as the entry point for event logs from local services or the network and applies some action to the logs. In a log management system it normally sends the logs on for further parsing and filtering. (Syslogd, Rsyslog, Syslog-ng, Logstash, Graylog2 etc.)

Log parser is a separate process or module which is responsible for parsing fields out of raw log messages and creating structured messages which are suitable for writing into log storage. (Grok, Json, Ruby, Syslog4j etc.)

Log storage, indexing and search are performed using databases and indexing software. (MySQL, MongoDB, Tokyo Cabinet, Elasticsearch, Sphinx search)

Web interface works as a frontend to all of the components and provides means to manage the log data. (Log analyser, Kibana, Graylog2 web interface, ELSA web interface etc.)

The distribution of the functions among components depends on the architecture of the solution. Multiple functions can be executed by a single part of the log management solution, e.g. the Graylog2 server is both log shipper and parser. Logstash is a log shipper and indexer and has its own integrated web interface. Kibana is a frontend web interface and indexer. In many cases the parsing and storing functionality is implemented inside the log shipper. A single function can also be divided among multiple components, e.g. Graylog2 storage is provided by Elasticsearch (messages) and MongoDB (statistics, user accounts) [10]. Log management solutions might include other components like various plugins, filters and middleware. This will be described in more detail in chapter 3.1 of the thesis.

2.3.1. Graylog2

Graylog2 is an open-source GPLv3 licensed log management system that stores logs in Elasticsearch. It was designed by Lennart Koopmann, a developer at XING AG, and was released in 2010. It consists of a server written in Java that accepts syslog messages via TCP or UDP and stores them in Elasticsearch indexes.
The second part is a Ruby on Rails web interface. The Graylog2 web interface allows searching through the logs, applying filters, blacklisting strings, quickly viewing logs for each monitored host and flexibly managing access to the logs by authorising users to see specific log “streams”. The main configuration file is graylog2.conf. The configuration file of the embedded Elasticsearch is graylog2-elasticsearch.yml; that of a standalone Elasticsearch is elasticsearch.yml.

2.3.2. Kibana

Kibana is a browser-based frontend for Logstash and Elasticsearch written in JavaScript and Ruby. It was designed and developed in 2012 by Rashid Khan, a developer at the Elasticsearch project. Its default log shipper, Logstash, is flexible open-source log management software supporting a
long list of inputs, filters and outputs. As an alternative to Logstash, Kibana can be configured to work with other log management software which supports output to Elasticsearch. The setup examples described further in the thesis are Kibana with Logstash and Kibana with Rsyslog. The main configuration file for Kibana is KibanaConfig.rb, for Elasticsearch elasticsearch.yml, for Logstash logstash.conf and for Rsyslog rsyslog.conf.

2.3.3. ELSA

ELSA stands for Enterprise Log Search and Archive. It is an open-source log management solution written in C. ELSA was created by Martin Holste, a former Security Incident Response Team leader currently employed at Mandiant (a company offering information security services). Its author describes the program in short as a GPLv2 framework around Syslog-ng, MySQL and Sphinx search [11]. Perl is used as a pipe between the components, e.g. logs are taken from the Syslog-ng output and prepared for batch loading into MySQL. ELSA was designed to support efficient network incident response. It is oriented towards high performance and is advertised to handle more than a million logs per minute and to return a billion results for a query in half a second on modest hardware [12].

ELSA has two main installations: node and web. ELSA nodes that only gather, store and forward the logs need only the node component installed. Nodes that are used as a gateway for queries need the ELSA web component installed. In small setup scenarios, like the one used for testing, both components are installed on the same node. The main configuration files are elsa_node.conf and elsa_web.conf.
3. Comparative analysis

The comparative analysis is based on the primary data generated during the tests and on secondary data from web resources. A basic overview of the solutions is presented in Table 1.

Name | Graylog2 | Kibana | ELSA
Language | Java, JavaScript, Ruby | Java, JavaScript, Ruby | C, Perl
Protocols | BSD & IETF syslog, GELF, GELF via HTTP, AMQP | BSD & IETF syslog, AMQP, XMPP… | BSD & IETF syslog
Transport | TCP, UDP | TCP, UDP | TCP, UDP
Log shipper | Graylog2 | Logstash, Rsyslog | Syslog-ng
Log parser | syslog4j | grok, json, syslog4j… 28 filters | perl, PatternDB
Storage | Elasticsearch, MongoDB | Elasticsearch | MySQL
Indexing | Elasticsearch | Elasticsearch | Sphinx search
License | GNU GPLv3 | Apache 2.0 | GNU GPL v2
Documentation | Good: platform independent instructions, official examples for Debian, unofficial for RHEL | Good: platform independent instructions, official examples for Debian, unofficial for RHEL | Excellent
Installation scripts | Script available for Debian based | Script available for Debian based | Multiplatform fully auto
Demo | http://public-graylog2.taulia.com/session | http://demo.kibana.org/#/dashboard | no live demo
Authentication | Local, LDAP | Needs external authentication e.g. with passenger module in Apache or Nginx | none, local or LDAP
Authorisation | Local, LDAP | Under development, passenger can be used | Account or group based, local or LDAP
Performance on modest hardware suitable for | Small and medium sized business | Medium sized business and enterprise | Enterprise
Log lines/second announced | thousands per second | thousands per second | tens of thousands per second
Log lines/second tested | 1428.6 | 5681.82 | 14285.7
Saved searches | Streams | No | Yes
Search syntax | Lucene + regular expressions | Apache Lucene search | Google syntax
Event triggering and alerts | Regular expressions templates + email alerts | No native alerts or event triggering, done on log shipper side | Scheduled searches, actions + alerts

Table 1 Basic overview of the log management solution
3.1. Structure

3.1.1. Graylog2 structure

Graylog2 consists of two main components: a server written in Java and a web interface written in Ruby using the Ruby on Rails web framework. The Graylog2 server listens for log messages, receives and parses them, does the indexing and stores the messages in Elasticsearch and the statistical data, graphs and user accounts in MongoDB. For an overview of how Graylog2 can be implemented please see the scheme in Appendix 6.

Elasticsearch is a highly scalable, resilient, schema-free, document-oriented NoSQL open-source database. It is an Apache 2 licensed open-source distributed search engine built on top of Apache Lucene [13]. Elasticsearch is used by Graylog2 and Kibana.

3.1.2. Kibana structure

Kibana is a web interface written in JavaScript and Ruby using the Sinatra web framework [14]. A typical minimal deployment of Kibana consists of Logstash and Kibana. Logstash is used for receiving log messages from various sources, optionally filtering the logs and sending them through one of the supported outputs. A simple example would be Logstash listening for IETF syslog formatted messages on TCP and UDP ports 514 and forwarding the logs to Elasticsearch without applying additional filters. Logstash inputs, filters and outputs will be described in more detail in chapter 3.2.2 of the thesis.

As an alternative setup for Kibana, Logstash can be replaced with Rsyslog for sending specifically parsed logs to Kibana via Elasticsearch. Rsyslog sends formatted logs to Elasticsearch using the omelasticsearch module. These logs are formatted as JSON and indexed in a format suitable for Kibana.
Here are the lines in the rsyslog.conf file that are required for this setup:

    module(load="omelasticsearch")
    $template Syslog2Kibana,"{\"@timestamp\":\"%timereported:::date-rfc3339%\",\"@message\":\"%rawmsg:::json%\",\"@type\":\"syslog\",\"@tags\":[],\"@fields\":{\"receptiontime\":\"%timegenerated:::date-rfc3339%\",\"host\":\"%HOSTNAME:::json%\",\"tag\":\"%syslogtag:::json%\",\"facility\":\"%syslogfacility-text%\",\"severity\":\"%syslogseverity-text%\",\"msgtext\":\"%msg:::json%\"}}"
    $template SyslogIndex,"rsyslog-%timereported:1:10:date-rfc3339%"
    *.* action(type="omelasticsearch" template="Syslog2Kibana" dynSearchIndex="on" searchIndex="SyslogIndex" server="localhost" serverport="9200" bulkmode="on")

The first line enables omelasticsearch, the output module to Elasticsearch. The next line defines the pattern for structuring the message and timestamp, which was given the name “Syslog2Kibana”. The second template is for the search index, which was given the name “SyslogIndex”; it takes the first ten characters of the RFC 3339 timestamp, so a separate index named rsyslog-YYYY-MM-DD is used for each day.

There is a special setting in the KibanaConfig.rb file that needs to be set for Kibana to index the logs coming from Rsyslog. The settings are presented below:

    Smart_index = true
    #Smart_index_pattern = 'logstash-%Y.%m.%d'
    Smart_index_pattern = 'rsyslog-%Y-%m-%d'

These lines in KibanaConfig.rb enable the smart index feature and replace the Logstash pattern with the Rsyslog pattern to allow Kibana to index Rsyslog data from Elasticsearch. For an overview of how Kibana can be implemented please see the scheme in Appendix 4.

3.1.3. ELSA structure

ELSA uses Syslog-ng for receiving logs and its PatternDB for parsing, which is claimed by the designer to be more efficient than using computationally intensive regular expressions. An alternative input is via HTTP, which is used for communication between nodes in a cluster. Parsed logs are written into a raw file and are then batch loaded into the MySQL database and indexed by Sphinx search. By default a batch is loaded by a script every minute. This setting can be
changed in the elsa_node.conf file by setting a value in seconds for “index_interval”. After each batch is loaded, Sphinx indexes the newly inserted rows in temporary indexes, and then again in larger batches every few hours in permanent indexes [12]. The ELSA flow diagram is presented below:

    Network → Syslog-ng (PatternDB) → Raw text file
    or
    HTTP upload → Raw text file
    Batch load (by default every minute): Raw text file → MySQL → Sphinx

Additional functionality can be added to ELSA using plugins. New plugins can be added by sub-classing the "Info" Perl class and editing the elsa_web.conf file to include them. The plugins that are included in ELSA by default are presented below:
• Windows logs from Eventlog-to-Syslog
• Snort/Suricata logs
• Bro logs
• URL logs from httpry_logger

These plugins allow applying specific actions using the log data. For example, if the URL plugin is configured, any log that has an IP address in it will have a "getPcap" option which auto-fills pcap request parameters for one-click access to the traffic related to the log being viewed. This option is available if a pcap server like OpenFPC or StreamDB is installed and configured in elsa_web.conf.

3.2. Input and output

3.2.1. Graylog2 input and output

Graylog2 accepts Syslog messages via TCP and UDP. Additionally it accepts messages in its own Graylog Extended Log Format (GELF) via TCP, UDP and HTTP. GELF logs are basically messages formatted in JSON and compressed with Unix Gzip. Graylog2 also supports AMQP (Advanced Message Queuing Protocol) input via message queuing middleware like RabbitMQ, Apache Qpid, OpenAMQ, SwiftMQ etc. Message queuing software is used to make sure that the messages are delivered from point A to point B. It stores messages in memory (or writes them to disk), waits for the buffer to clear after a peak of log traffic and then offers these messages to the logging system. The default port is 514 for syslog, 12201 for GELF and 5672 for AMQP.
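The description of GELF in 3.2.1 as Gzip-compressed JSON can be made concrete with a minimal Python sketch that builds such a payload. The field names version, host, short_message, timestamp and level follow the published GELF format rather than the thesis. The version string "1.0", the host name web01 and the helper names are assumptions made for illustration.

```python
import gzip
import json
import time


def build_gelf_payload(host, short_message, level=6, **extra):
    """Hypothetical helper: return a gzip-compressed JSON document in GELF style."""
    message = {
        "version": "1.0",            # assumed GELF version string
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,              # syslog severity code, 6 = info
    }
    # GELF reserves a leading underscore for user-defined fields.
    message.update({f"_{key}": value for key, value in extra.items()})
    return gzip.compress(json.dumps(message).encode("utf-8"))


def read_gelf_payload(payload):
    """Inverse operation, useful for checking what would be sent."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


if __name__ == "__main__":
    payload = build_gelf_payload("web01", "disk almost full", level=4, mount="/var")
    print(read_gelf_payload(payload))
```

Such a payload would typically be sent as a single UDP datagram to the GELF port mentioned above (12201). Sending is left out here so that the sketch runs without a Graylog2 server.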
Graylog2 uses Drools Expert [16] to check incoming log messages against a user-defined rule file. Jabber/XMPP is used for sending alerts. Internal metrics and stream counts can be stored in Graphite [17] and Librato [18] to turn these statistics into visualizations.

3.2.2. Kibana input and output

Kibana imports logs from Elasticsearch. Originally Kibana was designed as a frontend for Logstash. Logstash supports a wide range of inputs including IETF syslog, GELF, Elasticsearch, snmptrap, eventlog, Twitter etc. The Logstash homepage currently lists 37 inputs, 28 filters and 47 outputs (for a complete list see Appendix 2).

A simple scenario (see the configuration below) is receiving logs on TCP and UDP ports 514 and sending all logs to Elasticsearch. For the TCP and UDP inputs, “port” and “type” are required fields (for all options see Appendix 5).

    input {
      tcp {
        port => 514
        type => syslog
      }
      udp {
        port => 514
        type => syslog
      }
    }
    output {
      elasticsearch { }
    }

In the scenario where Rsyslog is used instead of Logstash, all the Rsyslog functionality applies, including inputs and outputs. Rsyslog receives local messages, including those from the kernel; remote messages can be received in BSD or IETF syslog format. Received messages can be written to log files, sent to remote syslog servers, etc. Additional modules can be used in Rsyslog for output to specific destinations, e.g. omelasticsearch (Rsyslog to Elasticsearch).
A more advanced Logstash configuration is presented below.

    input {
      tcp {
        port => 514
        type => rsyslog
      }
      udp {
        port => 514
        type => rsyslog
      }
    }
    filter {
      grok {
        type => "rsyslog"
        pattern => [ "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{PROG:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" ]
        add_field => [ "received_at", "%{@timestamp}" ]
        add_field => [ "received_from", "%{@source_host}" ]
      }
      syslog_pri {
        type => "rsyslog"
      }
      date {
        type => "rsyslog"
        syslog_timestamp => [ "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      }
      mutate {
        type => "rsyslog"
        exclude_tags => "_grokparsefailure"
        replace => [ "@source_host", "%{syslog_hostname}" ]
        replace => [ "@message", "%{syslog_message}" ]
      }
      mutate {
        type => "rsyslog"
        remove => [ "syslog_hostname", "syslog_message", "syslog_timestamp" ]
      }
    }
    output {
      elasticsearch_http { }
    }

In this scenario Rsyslog ships logs to Logstash, where the “grok”, “syslog_pri”, “date” and “mutate” filters are used to parse the logs, which are then sent to Elasticsearch via HTTP [19].
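What the grok pattern above extracts can be approximated with an ordinary regular expression. The Python sketch below is an illustration only and not Logstash code: the names SYSLOG_LINE and parse_syslog_line are hypothetical, the sub-patterns are looser than the grok definitions of SYSLOGTIMESTAMP, SYSLOGHOST and PROG, and the facility and severity codes are derived with the PRI formula from 2.1.1, roughly as the syslog_pri filter does.

```python
import re

# Named groups reuse the field names from the grok pattern.
SYSLOG_LINE = re.compile(
    r"<(?P<syslog_pri>\d+)>"
    r"(?P<syslog_timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<syslog_hostname>\S+) "
    r"(?P<syslog_program>[^\s\[:]+)"
    r"(?:\[(?P<syslog_pid>\d+)\])?: "
    r"(?P<syslog_message>.*)"
)


def parse_syslog_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None  # Logstash would tag such an event with _grokparsefailure
    fields = match.groupdict()
    pri = int(fields["syslog_pri"])
    # PRI = 8 * Facility + Severity, so divmod recovers both codes.
    fields["syslog_facility_code"], fields["syslog_severity_code"] = divmod(pri, 8)
    return fields


if __name__ == "__main__":
    sample = "<34>Oct 11 22:14:15 mymachine su[1234]: 'su root' failed on /dev/pts/8"
    for key, value in parse_syslog_line(sample).items():
        print(key, "=", value)
```

The process ID group is optional, as in the grok pattern, so lines such as "myapp: started" without a bracketed PID are still parsed and syslog_pid is simply None.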
  • 26. 3.2.3. ELSA input and output

ELSA uses Syslog-ng as the log receiver. All the inputs of Syslog-ng, which are called source drivers, are valid for ELSA. In syslog-ng.conf they are configured using this syntax [7]:

    source id { driver1(opt1); driver2(opt2); ...; };

Some source drivers:
- file(fname [options]) - read messages from the file fname (usually employed for reading messages from a special file of kernel messages, for example /proc/kmsg)
- internal() - read Syslog-ng internal messages
- unix-stream(fname [options]), unix-dgram(fname [options]) - read messages from a UNIX file system socket fname in stream or datagram mode
- tcp([options]), udp([options]) - receive BSD syslog messages from remote hosts over TCP or UDP
- syslog([options]) - receive IETF syslog messages

An ELSA node can forward log messages to other nodes using SCP and HTTP/HTTPS. Although ELSA has a certain predefined log flow, it should be possible to send output using Syslog-ng in parallel to the batch loading into MySQL, e.g. syslog messages to an IP address and a TCP or UDP port.

3.3. Interface

3.3.1. Graylog2 interface

The Graylog2 interface is arranged into tabs. Menu buttons at the top of the page are used to switch between the tabs: messages, streams, hosts, blacklists, settings and users. By default the messages tab is displayed first to the administrator after logon. A regular user can see only the streams tab.

The messages tab consists of the following main parts: search field, menu, overview table and sidebar. The search field spans the whole page and contains sample search instructions in transparent text, which disappear once the field is clicked. Next to the search field there is a dropdown with relative time options ranging from 5 minutes through 1 day and 1 month to always. The default time is 5 minutes, so recent data is displayed to the administrator right after logon. The overview table contains a list of log message lines.
Messages can be clicked to display more information in the sidebar: permalink, breakdown of the message, full message and the stream name if the message belongs to a stream. The overview table also shows the total number of logs and has links to toggle between the view of recent and all log messages. Additionally, there is an asterisk-shaped button which can be used for highlighting today’s messages.
  • 27. By default the sidebar shows a graph of recent incoming logs and a welcome message. Mini graphs of favourite streams are also shown there. In general the sidebar shows details of the object the user has clicked, e.g. a log message in the overview table. When scrolling down the list of logs, a “Back to top” button appears on the screen, which makes it convenient to get back to the top of the page where all the menus and the search field are located.

When the sidebar is displaying a graph it has a “server health” button. It leads to a page with a dashboard of near-real-time throughput statistics, which also shows the recent highest value. (The current throughput in logs per second is also shown on the main page in the top right corner.) Apart from the dashboard, the “server health” page contains status information on the Elasticsearch server and shows status log messages produced by Graylog2 itself (e.g. Graylog2 server start-up and shutdown).

The streams tab contains controls to create saved searches and arrange them into categories. The hosts tab contains a list of hosts, which are added automatically once logs start coming from a source. The blacklist tab has an option to create a blacklist with a set of regular expression rules; messages matching these rules are discarded. The settings tab has subsections which allow defining the length of a message shown to the user, adding a column to the log list, configuring AMQP settings, adding comments to messages using regular expressions, defining templates for filtering out sensitive data, enabling or disabling plugins and checking whether the latest version of Graylog2 is installed. The users tab allows creating user accounts of two types: admin and reader. A reader sees only the streams tab, containing the streams assigned by the admin user.

3.3.2. Kibana interface

Kibana has a very well designed interface. It has exactly what is needed for easily searching through formatted data and analysing it with a single click.
This does not mean that searching through unformatted logs is impossible; it just requires writing queries manually. The home page of the Kibana web interface has the following main sections: search field, field panel containing message fields (also referred to as the “Show fields” section), graph, and a table panel, which is mainly a list of logs.

The search area is a big black rectangular frame positioned across the top of the page. Its left part contains a small white Kibana logo, serving as a “home” link, and a time dropdown. A white search field is in the middle; when it is blank, it shows the word “Search” in a thin font. To the right of the search field there are a blue “Search” button and a red “Reset” button. The far right part of the search area has a mini dashboard displaying the current number of search hits. The time dropdown next to the search field has relative time options ranging from 5 minutes through 12 hours and 7 days to “All Time”.
  • 28. The default time is 15 minutes and there is also an option of a custom time frame.

The interface overall is very dynamic and interactive. All the log lines are clickable and expandable into more detailed fields. Each field can be used for dynamic query building. When a field in the field panel is clicked, a menu with quick stats appears. Buttons such as “score”, “trend”, “terms” and “stats” inside this menu can be used for various analytical operations, such as changes in share, average values, distribution represented in pie charts, stock market type tables etc. With Kibana 3 it is possible to design a custom interface interactively without any coding. It is possible to create custom panels and dashboards and save these interfaces.

3.3.3. ELSA interface

The ELSA interface design is very minimalistic and conservative. In the administrator account there are five dropdowns which are somewhat reminiscent of Windows 95 menus. In the top left corner, above the search field, are the Elsa and Admin menus (“Elsa”, not “ELSA”, is used when referring to the menu, as this is how it is written in the ELSA interface; the same applies to other menu names).

The Elsa menu consists of Query Log, Saved Results, Alerts, Active Queries, Dashboards, Saved Searches and Preferences. Query Log contains the list of recent queries with statistics on the time used for running each query. The Saved Results section has the list of saved results and allows creating an alert or schedule. Additionally, it allows rerunning the search and presents a permalink to the query results. It is possible to schedule a rerun of a certain query and apply an action if new events matching the search criteria are recorded. Some of the actions available: save report, send email, send to CIRT, send to malware analysis sandbox. The Elsa dropdown can also be used to view alerts, saved searches and active queries.
Dashboards can be created and managed through the Dashboards option in the Elsa dropdown. The Admin dropdown menu allows managing permissions, viewing stats on a general dashboard, cancelling livetails and viewing alerts. Livetails are live streams of logs. This function is currently deprecated in ELSA because of stability issues [12].

Search results are presented in a tab below the search field; by default a new tab is created for each search. It is possible to reuse the same tab when updating the search by ticking “reuse current tab” to the right of the search field. A second tick box in the same area changes the layout inside the tab to “grid”.
  • 29. It is possible to apply an action (e.g. export results, alert or schedule, add to dashboard, save search etc.) to the search results using the “Results Options” dropdown inside the tab. The search area consists of a field called “Query” and a “Submit Query” button. There are two separate fields (From and To) for the start and end time of the query, which can be filled in manually or using a calendar popup. The “Add term” and “Report on” dropdowns allow using predefined templates for building specific queries, such as Bro, Snort and Windows messages. There is a separate dropdown for setting the type of index to search in: Index, Archive, livetail etc.

3.4. Features

3.4.1. Graylog2 features

Streams in Graylog2 are saved searches that allow quick access to an overview of a certain predefined situation. Streams are defined by rules, which can be regular expressions, facility, severity, host or a custom additional field with a certain predefined value. Streams can be sorted into custom categories. Here is an example of a stream in Graylog2.

    Category: security
    Stream name: SSH authentication failure
    Regular expressions rule: sshd\[\d+\]: Failed password for (invalid user )?(\S+)? from ([\d.]+) port (\d+)

It is possible to create blacklists containing a set of regular expressions to filter out certain messages. Messages that match the predefined regular expression patterns are dropped by the server.

Once a message is received and accepted by Graylog2, the originating host is automatically added to the hosts list. The entire logging stream for any monitored host can be quickly accessed in the hosts section. A host can be easily deleted from this list if it is no longer used. There is a “quick jump to host” search field that might be very useful if the list of monitored hosts is long.
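The regular expression rule of the “SSH authentication failure” stream can be tried out with a short Python sketch. This is only an illustration: Graylog2 evaluates stream rules itself, the two sample messages are invented, and the dictionary keys are names chosen for this example.

```python
import re

# Stream rule from the example above (backslashes restored).
SSH_FAILURE = re.compile(
    r"sshd\[\d+\]: Failed password for (invalid user )?(\S+)? from ([\d.]+) port (\d+)"
)


def match_ssh_failure(message):
    """Return the captured parts of a failed SSH login, or None if no match."""
    found = SSH_FAILURE.search(message)
    if found is None:
        return None
    return {
        "invalid_user": found.group(1) is not None,
        "user": found.group(2),
        "source_ip": found.group(3),
        "port": int(found.group(4)),
    }


# Hypothetical messages used only for illustration.
known = match_ssh_failure(
    "sshd[2211]: Failed password for root from 192.0.2.10 port 51234 ssh2"
)
unknown = match_ssh_failure(
    "sshd[2211]: Failed password for invalid user admin from 192.0.2.10 port 51234 ssh2"
)
```

Both messages would be routed to the stream; the optional first group only records whether the account exists on the target host.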
A segment of a graph can be highlighted, and clicking the “Show messages in range” button then shows all the logs that fall within this part of the graph.

An alert can be assigned for all users or per stream, so that the users assigned to a stream receive an email. This is useful when an event needs urgent attention from a specific person or group.

Log rotation can be achieved by setting elasticsearch_max_number_of_indices in graylog2.conf. The value of elasticsearch_max_number_of_indices multiplied by elasticsearch_max_docs_per_index equals the total number of messages held within the setup.
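The retention rule above is a simple multiplication, sketched here in Python. The two setting values and the event rate are illustrative assumptions, not values taken from the tested setup, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86400


def total_message_capacity(max_number_of_indices, max_docs_per_index):
    """Messages kept before the oldest Elasticsearch index is rotated out."""
    return max_number_of_indices * max_docs_per_index


def retention_days(capacity, events_per_second):
    """Approximate number of days of logs held at a constant event rate."""
    return capacity / (events_per_second * SECONDS_PER_DAY)


# Illustrative values only: 20 indices of 20 million documents each,
# filled at an assumed constant rate of 200 events per second.
capacity = total_message_capacity(20, 20_000_000)
days = retention_days(capacity, 200)
```

With these assumed numbers the setup holds 400 million messages, which corresponds to roughly 23 days of logs.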
  • 30. 3.4.2. Kibana features

Kibana has a very dynamic interface which allows flexible on-demand data analysis and visual representation. Each log line can be expanded in place with one click to give access to details. There are action buttons that can be used for dynamically creating very specific queries. Each line within the “fields panel” can be clicked to get a multi-purpose menu with quick stats. Using this menu it is possible to see the distribution of the most frequent values, add specific columns to the logs in the “table panel” (same as clicking the plus sign next to any field in the field panel), include and exclude certain fields from the query with a single click (same as the actions within log details in the “table panel”), use analytical tools on this data and mark all these occurrences in the table panel in red font.

Functions such as Score, Terms, Trend and Stats can be used for data analysis, either by pressing the corresponding buttons inside the menu of a field or by manually piping in the search field.

    @fields.host:log NOT @fields.facility:"user" | terms severity

This query can be produced dynamically with 6 clicks in the “fields panel”. The first two clicks: one on @fields.host to open a popup menu, the second on the “include” icon (which looks like a magnifying glass) next to the hostname “log”. Next, to exclude all messages with the user facility, click on @fields.facility in the “fields panel” and then on the “exclude” icon (which looks like a “no parking” sign, a slashed circle). The last two clicks: the first on @fields.severity in the “fields panel” and the second on the “terms” button inside the popup menu.

Statistics are based on the last 2000 logs received, but this amount can be changed by editing the value of "Analyze_limit" in KibanaConfig.rb.

Kibana does not have its own user management by default, but authentication modules can be configured in KibanaConfig.rb.
It is possible to hold user accounts in Elasticsearch and use LDAP for authentication [21]. Alternatively, user authentication can be done with the help of e.g. Apache or Nginx.

Log rotation can be done by scheduling a script that deletes old Elasticsearch indices, as they are stored separately by date.

Kibana 3, a new version released in 2013, has an extended dashboard and analytics module. Kibana 3 allows creating custom dashboards, comparing ranges of events by combining them into one graph etc. It is possible to save interfaces and queries into Elasticsearch, or to export them to a file or to a “gist” on the Github website [22].
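What the piped query “@fields.host:log NOT @fields.facility:"user" | terms severity” computes can be sketched in a few lines of Python. This is a conceptual illustration only, not how Kibana or Elasticsearch implement it: the events are invented, the field names are shortened, and the helper function is hypothetical.

```python
from collections import Counter


def terms(events, field, host, excluded_facility):
    """Count the values of `field` among events from `host`,
    skipping events of the excluded facility (most frequent first)."""
    selected = [
        event for event in events
        if event["host"] == host and event["facility"] != excluded_facility
    ]
    return Counter(event[field] for event in selected).most_common()


# Hypothetical events standing in for documents stored in Elasticsearch.
events = [
    {"host": "log", "facility": "daemon", "severity": "info"},
    {"host": "log", "facility": "user", "severity": "error"},
    {"host": "log", "facility": "auth", "severity": "info"},
    {"host": "web", "facility": "daemon", "severity": "warning"},
    {"host": "log", "facility": "kern", "severity": "warning"},
]
distribution = terms(events, "severity", host="log", excluded_facility="user")
```

The filter part corresponds to the Lucene query before the pipe, and the counting part to “terms severity”.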
  • 31. 3.4.3. ELSA features

ELSA is oriented more towards performance than dashboards and was designed for incident response and fighting APT. It has a Google-style search and allows sorting search results by any field and producing custom reports. Results can be exported as a permalink or in Excel, PDF, CSV and HTML formats. ELSA supports full Active Directory/LDAP integration for authentication, authorization and email settings. It supports archiving of logs with a compression ratio better than 10:1. ELSA supports email alerts and other actions that can be triggered if defined queries get hits on new log messages. The architecture is fully distributed and can handle n nodes with all queries executing in parallel. ELSA ships with normalization for some Cisco logs, Snort/Suricata, Bro, and Windows via Eventlog-to-Syslog or Snare [23]. Log rotation can be configured by size in bytes or by retention period in the elsa_node.conf file.

3.5. Search

3.5.1. Graylog2 search

In earlier versions of the Graylog2 web interface the search was divided into separate fields such as message, timeframe, facility, severity etc. Some of the fields supported Lucene syntax, some required the use of regular expressions. Starting from version 0.10 Graylog2 applies a more user-friendly search method: there is now one search field and Lucene syntax can be used in it. There is a quick filter option to filter the search results by message, timeframe, facility, severity and host. The content of the Graylog2 message field is split into terms: each part of the query delimited by a space is searched for separately. Apache Lucene syntax allows using wildcards and doing fuzzy and proximity searches (see Appendix 9 for more details).

3.5.2. Kibana search

Kibana (as well as Graylog2 and Elasticsearch) uses Lucene query syntax for search. It is possible to do a simple full text query across all the log messages, or to use Lucene syntax to be very specific, targeting certain fields and adding conditions (see Appendix 10 for more details).
Kibana's dynamic interface makes creating new queries very easy.

3.5.3. ELSA search

ELSA syntax is basically the same as Google syntax. It is possible to do sub-searches by piping one search into another. There is an important difference between the way queries are done in ELSA and in the other two solutions: in ELSA it is not possible to use wildcards in basic queries. Only special asynchronous queries can contain wildcards. Results for such queries are sent later by email (see Appendix 11 for more details).
  • 32. 3.6. Conclusion based on comparative analysis

Each of these log management solutions has its strong and weak sides. The choice of a system strongly depends on the environment it will be used in and the goals that are pursued. There is no perfect solution for every purpose and environment. The table below lists the main advantages and disadvantages of the log management solutions in the author’s opinion (see Table 2).

Graylog2
  Advantages:
    1. Easy basic user management with possibility of advanced authentication (e.g. LDAP)
    2. Saved searches (called streams) can be easily assigned per user
    3. Creating blacklists to drop logs that match a pattern from the web interface menu
    4. Nice and simple interface
  Disadvantages:
    1. Insufficient analytical functionality
    2. Too many operations needed to see the log details

Kibana
  Advantages:
    1. Easy point and click analysis
    2. Choice between easy integration with Logstash or Rsyslog
    3. Really usable and efficient interface
    4. Kibana 3 offers easy interface customisation
    5. Great dashboards
  Disadvantages:
    1. No alerts
    2. No native user management (in development)
    3. No saved searches (in development)

ELSA
  Advantages:
    1. High-volume receiving/indexing (a single node can receive > 30k logs/sec, sustained)
    2. Settings can be changed without restarting services as scheduled script reads the configuration
    3. Customisable action of Info field in the logs depending on the log type (plugins needed)
    4. Allows scheduling searches and various alerts and actions triggered: email, ticket creation
    5. Gathers statistics for queries by user and log size and count
  Disadvantages:
    1. Not too flexible, designed specifically for incident response and high scale
    2. Web interface very conservative
    3. Livetail not available currently

Table 2 Advantages and disadvantages of log management solutions
  • 33. In the author’s opinion, Graylog2 is a great tool for environments that need to give users access to specific logs only. An example would be a company that provides IT services and has different teams (developers, system administrators, network administrators, supervisors), each of which should only have access to a specific part of the logs. Kibana would be the best choice for environments that benefit from a combination of great usability, analytics and good performance. ELSA should be suitable for high-volume and high-scale log management. It is specifically designed for network incident response and fighting APT. It is a great tool for monitoring large networks; for example, ISPs or a CERT could benefit from using ELSA.
  • 34. 4. Choosing a log management solution

In order to choose the most suitable log management solution, primary and secondary data was collected for a detailed comparison. Secondary data was collected from the official websites of the log management solutions, configuration files and related web resources, e.g. forums and discussions, including Github, a website for managing development projects [24]. Primary data was generated by setting up the latest versions of all three log management systems in a virtual environment and performing a series of tests. The testing process and results are described in chapter 4.2 and its subsections.

4.1. Logging requirements for small business

Small companies usually have a wide variety of systems and devices in their infrastructure, often a mixture of different vendors and operating systems. This sets the requirement that the log management system must be suitable for mixed types of logs. As event rates and log message volumes are normally modest, performance is not the key factor in the choice of a log management system. The usual rate for a small company might be 100 – 200 events per second. The number can of course differ depending on the size of the network, the specific environment, the logging level and the tasks solved by log management. This allows solutions with lower performance, like Graylog2 and Kibana, to compete with high-performance ones like ELSA in the context of a small company.

For the target company, where the chosen log management solution will be implemented, the event rate is estimated at around 1000 - 2000 logs per second, with hypothetical peaks of 3000 per second in case debugging is turned on for the main systems.
This relatively high event rate for a small company is expected because all the syslog-capable devices in the local network would be sending logs to the central log management solution, and additionally the logs from critical servers in the cloud might be sent as well. For big companies the event rate could be much higher: 50 000 – 100 000 logs per second.
  • 35. 4.2. Testing

Performance and usability testing was carried out to gather the primary data needed for comparison. Usability testing results are based on the author’s experience and opinion.

4.2.1. Testing environment

Modest hardware specifications were chosen for the performance testing, as small companies, including the one where the tests were carried out, normally have limited resources. Additionally, performance on hardware with low specifications might show how efficiently the system utilizes limited resources. CentOS was chosen as the operating system (rather than e.g. Debian) because it is officially supported by Microsoft Hyper-V, which would be the production environment for the log management solution [25]. Testing was done on virtual machines using Oracle VirtualBox version 4.2.10 r84104. The basic specifications of the host used for testing are listed below.

    Hardware used: Acer TimelineX 5830
    OS: Microsoft Windows 7 Professional 64-bit SP1
    CPU: Intel Core i5 2430M @ 2.40GHz Sandy Bridge 32nm
    RAM: 6,00GB Dual-Channel DDR3 @ 665MHz (9-9-9-24)
    Motherboard: Acer JM50_HR (CPU1)
    Hard Drive: 238GB V4-CT256V4SSD2 (SSD)
    NIC: Atheros AR8151 PCI-E Gigabit Ethernet Controller

The hardware resources and the exact version of the CentOS guest operating system were as follows.

    CentOS 6.4
    Kernel 2.6.32-358.2.1.el6.x86_64

Assigned hardware resources per log management server:

    1 virtual CPU core (for single-core test)
    4 virtual CPU cores (for multi-core test)
    2048 Mbytes of RAM
    Dynamic VHD disk space
  • 36. 4.2.1.1. Graylog2 software components

The latest version of the Graylog2 log management solution at the time of the performance testing was 0.11.0. This version of Graylog2 requires at least Java 1.6 and Ruby 1.9. The main components of the Graylog2 solution and the corresponding logos are shown in Figure 2.

Figure 2 Graylog2 software components

See Appendix 8 for more details.
  • 37. 4.2.1.2. Kibana software components

Kibana was designed as a frontend for Logstash, but it can be used with other backend systems which can send specially structured logs into Elasticsearch (e.g. Rsyslog with the omelasticsearch module). The main components of the Kibana solution and the corresponding logos are shown in Figure 3.

Figure 3 Kibana main components

See Appendix 8 for more details on components.
  • 38. 4.2.1.3. ELSA software components

ELSA can be installed with a fully automated script, install.sh, which installs the program and all the dependencies from scratch. The main components of the ELSA solution and the corresponding logos are shown in Figure 4.

Figure 4 ELSA main components

See Appendix 8 for more details on components.
  • 39. 4.2.2. Performance testing

A performance test was done to compare the log management systems. The benchmark used for stress-testing each system consisted of sending a large batch of 100 000 IETF syslog messages to the tested system. In order to ensure reliable delivery of all messages, they were sent over TCP, without any delays between individual messages. The performance of the system was measured as overall test execution time. In other words, the execution time reflects the event processing speed of the system as observed by the client, and how much log data the client can realistically transmit to the system in a given time frame. The script used for sending the IETF formatted logs is presented here:

    #!/bin/bash
    printf '<6>1 2013-04-25T22:00:00Z myhost kernel - - - this message is a test\n%.0s' {1..100000} | nc -w 1 -t 127.0.0.1 514

In addition to measuring the event processing speed, CPU consumption of the individual parts of each log management system was investigated in order to identify potential bottlenecks. Tools used for performance testing: time [26], nc [27] (netcat), htop [28]. A simple test script, logtest.sh, was used. The Unix time utility was used to measure the time it takes to run the script. The shell command used for running the test was:

    /usr/bin/time -f'%E' ./logtest.sh

(-f'%E' shows only the elapsed time, without user or system time.)

The Unix “printf” command is used to generate standard output. The escape sequence \n marks the end of the line. The format specifier %.0s consumes one argument and prints nothing for it; since the shell expands the range in curly brackets into 100 000 arguments and printf repeats the format string for each of them, the corresponding number of lines is generated. These lines of formatted text are then forwarded through the pipe to netcat and sent using TCP or UDP to the needed IP address and port. “-w 1” defines a 1 second timeout, meaning that the connection is closed if no more input is detected for 1 second. “-t” was specified to select TCP, as the logs had to reach the destination for the measured time to be meaningful. “127.0.0.1 514” are the target IP address and port.
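The same benchmark can be sketched in Python for readers who prefer it to the printf and netcat pipeline. This is an illustrative alternative under stated assumptions, not the script used for the measurements: the function names are hypothetical, and like the original test it only measures the time the client needs to hand the batch over to the server.

```python
import socket
import time

# Same IETF syslog test message as in logtest.sh, one message per line.
MESSAGE = "<6>1 2013-04-25T22:00:00Z myhost kernel - - - this message is a test\n"


def build_payload(count):
    """Return `count` copies of the test message as bytes."""
    return (MESSAGE * count).encode("ascii")


def send_payload(host, port, payload):
    """Send the payload over a single TCP connection; return elapsed seconds."""
    start = time.monotonic()
    with socket.create_connection((host, port)) as connection:
        connection.sendall(payload)
    return time.monotonic() - start


def logs_per_second(count, elapsed_seconds):
    """Client-side throughput for a batch of `count` messages."""
    return count / elapsed_seconds
```

A run against a listener on the local machine could then look like `elapsed = send_payload("127.0.0.1", 514, build_payload(100000))`, followed by `logs_per_second(100000, elapsed)`.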
  • 40. 4.2.2.1. Performance testing results

Performance testing showed that in the given configuration these solutions can be ranked in the following order, from highest performance to lowest:

1. ELSA
2. Kibana and Rsyslog
3. Kibana and Logstash
4. Graylog2

Figure 5 shows the comparison of performance test results in logs per second.

Figure 5 Performance test results statistics compared, logs/sec (bar chart for Graylog2, Kibana and Logstash, Kibana and Rsyslog, and ELSA; series: 1 virtual CPU core, 4 virtual CPU cores, 4 virtual CPU cores tweaked)
*Results of the tweaked setup (described at the end of 4.2.2.1.1) are presented in green.

CPU percentages stated in the test results are based on the indicators in htop, which counts each virtual CPU core (a thread inside a physical CPU core) as 100% of CPU. Both single-core and multi-core setups were used for each series of performance tests.
  • 41. Use of 4 cores increased the performance of the log management systems: Graylog2 by about 70%, Kibana and Logstash by 60%, Kibana and Rsyslog by 25% and ELSA by 28,6% (see Figure 6).

Figure 6 Relative increase in performance with 4 cores (bar chart for Graylog2, Kibana and Logstash, Kibana and Rsyslog, and ELSA; series: increase in performance, tweaked)

Graylog2 and Kibana with Logstash showed a very good increase in the multi-core setup, as both programs are multi-threaded and CPU intensive. As Elasticsearch, which is also CPU intensive, was run on the same machine in this test setup, adding more CPU power increased performance considerably.

Kibana with Rsyslog and ELSA had a smaller increase in performance when more CPU cores were added. For Rsyslog this can be explained by the performance limits of the omelasticsearch module, which can send up to 10 000 logs per second via TCP [29]. Achieving more than 50% of this maximum is a good result for such modest hardware. ELSA is already so efficient that the change in performance was not as big. Additionally, the difference was hard to measure accurately using an external stopwatch.

4.2.2.1.1. Graylog2 performance test

Sending 100 000 IETF formatted logs took on average 2 minutes and 50 seconds, which is about 588,2 logs per second. This is an average calculated from 20 tests. During the performance test most CPU was used by the Graylog2 server process, which utilised on average around 58% of CPU. The second most CPU intensive process was Elasticsearch, which consumed on average close to 38%. When 4 virtual cores were used, the time needed for handling 100 000 logs went down to an average of around 1 minute 40 seconds. This is 1000 logs per second.
  • 42. As the number of logs per second was relatively low compared to the other systems, additional tests for Graylog2 were carried out with tweaked configurations. The tests were done using 4 virtual CPU cores. The best performance in this setup was achieved by limiting the number of processors used by Graylog2, which allowed more CPU to be used by Elasticsearch. During the test most CPU was consumed by Elasticsearch, on average around 280%, which translates into 2.8 virtual cores. Graylog2 consumed on average around 100% CPU, which is one virtual core. This was achieved by setting processbuffer_processors = 1 and outputbuffer_processors = 1 in the graylog2.conf file (see Appendix 7 for a configuration file sample). This setup is most likely not suitable for production, as it might cause the buffers to overflow. It was used for testing purposes only, and it eventually gave the best performance results. During this test the Graylog2-server.jar process was started in the foreground to make sure that the setup caused no buffer overflow or other error messages. As a result of tweaking the settings, the best time needed for handling 100 000 logs was on average 1 minute 10 seconds. This is about 1428,6 logs per second (see Figure 7).

Figure 7 Graylog2 performance test results, logs/sec (bar chart; series: 1 virtual CPU core, 4 virtual CPU cores, 4 virtual CPU cores tweaked)
  • 43. 4.2.2.1.2. Kibana & Logstash performance test

The output in logstash.conf was set to elasticsearch_http. Grok, mutate and syslog_pri were used for filtering and indexing (see the advanced scenario in chapter 3.2.2).

Sending 100 000 IETF formatted logs took on average 1 minute and 41 seconds, which is about 990 logs per second. This is an average from 20 tests with IETF formatted messages. Most CPU was consumed by the Logstash server process, which took on average around 60% of CPU. The second most CPU intensive process was Elasticsearch, which consumed on average around 35%. The Kibana.rb process consumed around 2% of CPU.

When the multi-core setup of 4 virtual cores was used, the time needed for handling 100 000 logs went down to an average of around 1 minute 3 seconds. This is about 1587 logs per second (see Figure 8). During the multi-core test about 170% of CPU, which is 1,7 cores, was used by Logstash. Around 100%, which is 1 virtual core, was used on average by Elasticsearch, sometimes peaking at 150%. At the same time the Kibana.rb process was consuming 2-3% of a virtual CPU core.

Figure 8 Kibana and Logstash performance test results, logs/sec (bar chart; series: 1 virtual CPU core, 4 virtual CPU cores)
  • 44. 4.2.2.1.3. Kibana and Rsyslog performance test

In the single-core test Elasticsearch consumed on average about 85% of the CPU. Rsyslog consumed about 2-3% of a single core. Kibana stayed around the 2% mark of a single core. Single-core test result: 100 000 IETF log lines in 22 seconds, or 4545,45 logs per second (see Figure 9).

Figure 9 Kibana and Rsyslog performance test results, logs/sec (bar chart; series: 1 virtual CPU core, 4 virtual CPU cores)

During the test using 4 virtual cores the Elasticsearch processes averaged around 250% of CPU, which is 2,5 virtual cores, and sometimes peaked at 370%. The 2 Rsyslog processes utilised on average 12% CPU each. Kibana.rb consumed 2-3% of 1 virtual CPU core. Test result with 4 virtual cores: 100 000 IETF log lines in 17,6 seconds, or 5681,82 logs per second.
  • 45. 4.2.2.1.4. ELSA performance test

Since ELSA log reception and log storage procedures are separated from each other and log data is written into storage asynchronously, the event processing speed observed by the client is very high, as there is no performance penalty that database access would incur. Nevertheless, while asynchronous log storing provides performance benefits to the client, it also leaves the database out of sync for a certain time frame (by default, for 1 minute). In order to provide a fair comparison with the other systems, the log reception and log storing times were measured separately and added up. While this method is not 100% precise, it provides a good estimate of the log data processing time from the client's perspective.

According to the results of the single-core test, it takes about 9 seconds from sending 100 000 logs through Syslog-ng, using PatternDB for parsing them, until these logs become available for querying in the ELSA web interface. This is about 11 111 log lines per second. In the multi-core setup the operation, from sending the logs to getting results in the ELSA web interface, took about 7 seconds. This amounts to about 14285,7 logs per second (see Figure 10).

CPU consumption during the tests showed how efficient ELSA actually is. The most CPU intensive processes were ELSA, Syslog-ng and Sphinx Search. When the single-core test was run, ELSA and Syslog-ng first consumed almost 50% of CPU each. Then, after the batch was loaded into the MySQL database, Sphinx briefly peaked at almost 100% CPU. The CPU peaks lasted one or two seconds. This shows how much more efficient the ELSA stack (whose Syslog-ng and Sphinx components are written in C and C++) is compared to Graylog2 and Kibana with Logstash (which run on Java and Ruby). The multi-core setup showed a slightly different distribution, with more resources utilised. Syslog-ng and ELSA each used one full virtual core (100%). Sphinx Search used 100% of one core and sometimes utilised more resources. The rest was used by MySQL and other processes.
Figure 10 ELSA performance test results, logs/sec (1 virtual CPU core vs. 4 virtual CPU cores)
  • 46. 46 4.2.3. Usability testing
In the author's opinion all three systems have well-built web interfaces. Kibana is the most dynamic of the three and, in the author's experience, the most usable. Graylog2 is very user-friendly and has many functions at the fingertips, such as user management and streams. It is a bit less dynamic than Kibana. One of the main reasons is that the sidebar is needed for showing log details, which adds an extra action.
The choice depends on the environment where the system would be used. There are some important visual and functional differences which would most surely influence the decision. The qualities and features of the log management systems are discussed in chapter 4.2.3.1, where they are evaluated and ranked from the usability perspective based on the author's experience and opinion.
4.2.3.1. Usability testing results
The systems were given points for each test depending on their ranking: first place gave 3 points, second place 1 point and third place 0 points. The 3-1-0 scale was chosen to favour the solution that takes first place most often.
In the author's opinion, considering pure usability, the systems rank as follows, most usable first:
1. Kibana
2. Graylog2
3. ELSA
The table below contains the score of each solution for every test and the total usability score (see Table 3).
Usability test                      Graylog2   Kibana   ELSA
Visuals and design                      1         3       0
Saved searches                          3         0       1
Alerts                                  1         0       3
Authentication and authorisation        3         0       1
Search syntax                           1         3       0
Analytics                               0         3       1
Ease of use                             1         3       0
Universality                            1         3       0
Ease of installation                    0         3       1
Total                                  11        18       7
Table 3 Usability test score
Comments for each test are given in parts 4.2.3.1.1 – 4.2.3.1.9.
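The totals in Table 3 can be recomputed from the per-test rankings and the 3-1-0 point scale. The following Python sketch is only an illustration of that bookkeeping; the dictionary layout and the function name are assumptions made for the example, while the rankings themselves are taken from the table:

```python
# Points awarded for first, second and third place.
POINTS = {1: 3, 2: 1, 3: 0}

# Ranking per test, best first, as given in Table 3.
RANKINGS = {
    "Visuals and design": ["Kibana", "Graylog2", "ELSA"],
    "Saved searches": ["Graylog2", "ELSA", "Kibana"],
    "Alerts": ["ELSA", "Graylog2", "Kibana"],
    "Authentication and authorisation": ["Graylog2", "ELSA", "Kibana"],
    "Search syntax": ["Kibana", "Graylog2", "ELSA"],
    "Analytics": ["Kibana", "ELSA", "Graylog2"],
    "Ease of use": ["Kibana", "Graylog2", "ELSA"],
    "Universality": ["Kibana", "Graylog2", "ELSA"],
    "Ease of installation": ["Kibana", "ELSA", "Graylog2"],
}


def total_scores(rankings: dict) -> dict:
    """Sum the points of every system over all tests."""
    totals = {}
    for order in rankings.values():
        for place, system in enumerate(order, start=1):
            totals[system] = totals.get(system, 0) + POINTS[place]
    return totals


totals = total_scores(RANKINGS)
for system, score in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{system}: {score}")
```

Because a first place is worth three second places, a system that wins many individual tests ends up clearly ahead of one that is consistently second, which is the intended effect of the scale.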
  • 47. 47 4.2.3.1.1. Visuals and design
Compared to ELSA, Kibana and Graylog2 have more colourful interfaces with high-contrast schemes. Kibana places the search field in a bold black frame at the very top of the web page, which feels natural because most browsers also have the navigation bar (used for URL input) at the top. In Graylog2 a fairly large area at the top of the page is taken up by the logo and the tabs, and the search field is located right under the tabs.
Kibana has the most functional, user-friendly and best-looking dashboards and graphs. ELSA would probably take second place in this respect, as it uses Google Visualizations; the drawback is that they depend on internet access (specifically, access to Google's site).
Kibana has a solid, dynamic interface which gives the feeling that everything is at the fingertips. Graylog2 uses a tab-like structure for menus; in comparison to Kibana it provides much more modest visualisation and offers minimal data analysis. ELSA has a conservative-looking interface with grey drop-downs and sub-menus. The interface gets the job done, but seems a bit boring and rigid.
In the author's opinion, the systems rank as follows on visuals and design, best first:
1. Kibana
2. Graylog2
3. ELSA
4.2.3.1.2. Saved searches
Graylog2 streams are very easy to configure but require the use of regular expressions. Kibana does not have saved searches, but there are workarounds for saving URLs that contain the query, and there is an open feature request, so the feature is being worked on at the moment [30]. ELSA has saved searches based on a query and allows scheduling them.
In the author's opinion, the systems rank as follows on saved searches, best first:
1. Graylog2
2. ELSA
3. Kibana
  • 48. 48 4.2.3.1.3. Alerts
Graylog2 can send email alerts when a pattern is matched in the incoming logs during a set period. A grace period option, which limits the number of notifications, was added in the latest release. Kibana does not have alert functionality. ELSA allows scheduling saved queries which search within the new logs. If a query returns results, a defined action such as an alert or a sub-query is triggered. ELSA supports a number of actions, e.g. sending email, creating a ticket, or executing a sub-query that searches within the results for a more precise match.
In the author's opinion, the systems rank as follows on alert options, best first:
1. ELSA
2. Graylog2
3. Kibana
4.2.3.1.4. Authentication and authorisation
In the author's opinion Graylog2 has the best authentication and authorisation options. It allows basic user accounts to be created easily in the web interface and supports more complex authentication mechanisms such as LDAP. Graylog2 can be started with basic authentication, and LDAP settings can be added to the ldap.yml configuration file later.
Kibana's native authentication and authorisation module "kibana-ruby-auth" is currently under development [31]. As a workaround it is possible to use LDAP and other authentication methods through Phusion Passenger, e.g. as an Apache or Nginx module [32].
ELSA has three basic authentication and authorisation modes: none, local and LDAP. The first mode gives any user who accesses the web page administrative access as a pseudo-user. The second mode grants access based on credentials and group settings in the local system database. The third mode uses LDAP/AD accounts and security groups.
In the author's opinion, the systems rank as follows on authentication and authorisation, best first:
1. Graylog2
2. ELSA
3. Kibana
  • 49. 49 4.2.3.1.5. Search syntax
In earlier versions of Graylog2 the search was spread over multiple fields, some of which supported Apache Lucene syntax and some regular expressions. Starting from version 0.10 Graylog2 uses a single search field which supports pure Apache Lucene syntax. Saved searches (streams) are still defined with regular expressions; rules can only be combined to match positives, and no rules that define exclusions can be added. So in general it is still a combination of Lucene and regular expressions. There is a quick filter function which allows filtering the search results by message, timeframe, facility, severity and host.
Kibana has always used Apache Lucene search syntax. As dynamic queries are very easily created in Kibana, it is simple to build very specific search patterns from scratch.
ELSA uses a search syntax close to Google's, with the important difference that no wildcards can be used in basic queries. Only asynchronous queries can contain wildcards, in which case the results arrive later by email, which is inconvenient in many cases.
In the author's opinion, the systems rank as follows on search syntax, best first:
1. Kibana
2. Graylog2
3. ELSA
4.2.3.1.6. Analytics
Graylog2 has very limited data analysis functionality. It has some basic graphs which show the number of logs per given period. Kibana offers flexible and functional analysis tools with very good dashboards. Kibana 3 allows creating custom interfaces and dashboards. ELSA has good dashboards based on Google Visualizations, which are a powerful tool but require internet access from the server; this is not always a good option and is sometimes not possible.
In the author's opinion, the systems rank as follows on analytics, best first:
1. Kibana
2. ELSA
3. Graylog2
  • 50. 50 4.2.3.1.7. Ease of use
In the author's opinion Kibana is the most intuitive and easy to use. All operations take a minimum of clicks and movements and can be done in more than one way. Graylog2 version 0.11 has improved in terms of ease of use in comparison to 0.9.x: a single search field which supports Apache Lucene syntax was introduced. Seeing the details of an event log still takes more operations than in Kibana, because a permalink inside the sidebar has to be clicked, which is not very convenient. ELSA has the least intuitive and easy-to-use interface of the three solutions.
In the author's opinion, the systems rank as follows on ease of use, best first:
1. Kibana
2. Graylog2
3. ELSA
4.2.3.1.8. Universality
Central log management can be used in different environments: network administration, application development, system administration, web administration, software testing etc. Although there is no perfect universal solution that fits every environment, Kibana is, in the author's opinion, likely to fit more types of environment because of its high usability and its analytics. ELSA is probably the least universal of the three solutions because it is designed specifically for large-scale network analysis.
In the author's opinion, the systems rank as follows on universality, best first:
1. Kibana
2. Graylog2
3. ELSA
4.2.3.1.9. Ease of installation
In the author's experience Kibana was the easiest and most straightforward of the three solutions to install. Installation of Elasticsearch consists of downloading, extracting and starting it. Kibana needs two more simple commands because it uses Ruby. Logstash is available as a single Java archive (JAR) file. Rsyslog can be installed from packages (see Appendix 3).
Although ELSA has a fully automatic installation script tested on a number of Unix platforms, it works well on clean OS installations only.
The script resolves dependencies, installs the whole solution within minutes and can also be used for updating, but if there is an issue with a specific component, it might fail. Troubleshooting can then be quite complicated, as the structure of the solution is not trivial, and manual installation is also quite complex.
  • 51. 51 In the author's opinion Graylog2 was the most complicated to install because of its web interface, which added a lot of non-trivial installation steps. In addition, because of MongoDB, Graylog2 initially requires substantially more disk space, so the 8 gigabytes of disk space assigned by default to CentOS by Oracle VirtualBox had to be increased.
In the author's opinion, the systems rank as follows on ease of installation, easiest first:
1. Kibana
2. ELSA
3. Graylog2
  • 52. 52 5. Implementation
Based on the research and testing it was decided to implement Kibana as the front end of the log management solution. The main log shipper for Kibana would be Rsyslog. The estimated event rate is 1000 – 2000 events per second with peaks of up to 3000 events per second. As the performance tests gave more than 4000 logs per second, Kibana should be a suitable solution for the environment.
5.1. Production environment
The environment for the implementation of the central log management system with Kibana as the front end is a small office in Tallinn. This is the central reservations office of a Norwegian company. There are around 150 client nodes. These are Intel-based workstations running Microsoft Windows 7 Pro and managed through Active Directory. The main business-critical services are kept in a datacentre. There are some local servers, e.g. DNS, Active Directory, Microsoft SharePoint, antivirus management, Cacti network monitoring, a Samba file server etc. Most of the servers are installed as virtual machines on a Hyper-V server.
There are two separate network lines: a dedicated line for internal business-critical traffic and a local ISP connection for internet access. The dedicated line connects to the datacentre and the other offices. The main internal traffic is VoIP (Microsoft Lync), Citrix and some web applications.
5.2. Implementation of Kibana in production
The production environment of the target company currently has 10 switches, 5 routers, 10 servers and more than 150 workstations in the local network. There are 10 critical servers held in the cloud, which is a datacentre connected to the local office with a dedicated line. A local copy of the logs kept on these main servers would serve as a backup. Additionally, datacentre storage space is much more expensive than local storage, so more logs can be kept locally for the same cost.
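The sizing argument at the start of chapter 5 can be checked with a few lines of arithmetic. The Python sketch below compares the expected peak of 3000 events per second with the throughput measured in 4.2.2.1.3; the function name and the choice to express the result as a utilisation ratio are illustrative and not part of the thesis method:

```python
def peak_utilisation(peak_rate: float, capacity: float) -> float:
    """Share of the measured capacity that the expected peak would use."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return peak_rate / capacity


# Expected peak load in the production environment (events per second).
EXPECTED_PEAK = 3000.0

# Measured Kibana and Rsyslog throughput: 100 000 lines in 22 s and 17,6 s.
MEASURED = {
    "1 virtual core": 100_000 / 22.0,
    "4 virtual cores": 100_000 / 17.6,
}

report = {
    label: peak_utilisation(EXPECTED_PEAK, capacity)
    for label, capacity in MEASURED.items()
}

for label, share in report.items():
    print(f"{label}: peak load uses {share:.1%} of measured capacity")
```

Under these figures even the single-core configuration stays below its measured capacity at peak load, which supports the choice, although the margin is noticeably larger with four virtual cores.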
In the first step of the implementation the VHD with the Kibana log management solution was imported from the VirtualBox test environment to the Hyper-V server. As the operating system is CentOS, the drivers for Hyper-V are already included and the migration was done with no issues [33].
The specifications of the central local server running Hyper-V are:
HP ProLiant ML350 G6 E5620 P410i/512+BBWC 3x2GB 3x146GB
30 GB HP REG PC3-10600
4x146GB 6G SAS 10K rpm SFF (2.5-inch) Dual Port Hard Drives
  • 53. 53 In the current setup Kibana receives syslog messages from all Unix servers, from the internal gateway, which is a Cisco 800 series router, and from 10 HP ProCurve switches. The desired setup is to send log messages to Kibana from all syslog-capable devices. As Windows does not support Syslog, a software client capable of converting the event log to syslog should be installed on all Windows-based workstations and servers (see Figure 11).
Figure 11 Scheme of Kibana implementation
While authentication and saved search features are being developed in Kibana, it is planned to use Apache with the Passenger module for authentication and to save searches manually. Queries in Kibana generate URLs in Base64 format [30].
Rsyslog would have two parallel outputs: one into Elasticsearch and another one into text files. The text files would be kept to a minimum reasonable size and would be rotated using Rsyslog log rotation to avoid duplicate logging on the same machine. Alerting could be configured using Simple Event Correlator (SEC), which would watch the rotated log files created by Rsyslog and send emails if a pattern is matched.
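Saving searches manually relies on the fact that the query is carried inside the Base64 part of the Kibana URL. The Python sketch below illustrates the idea under explicit assumptions: the payload is assumed to be a JSON object placed after the "#" in the URL, and the host name and the field names "search" and "timeframe" are hypothetical examples, not a documented Kibana interface.

```python
import base64
import json


def encode_query(params: dict) -> str:
    """Serialise query parameters to JSON and wrap them in Base64."""
    raw = json.dumps(params, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_query(fragment: str) -> dict:
    """Recover the query parameters from a Base64 URL fragment."""
    return json.loads(base64.b64decode(fragment).decode("utf-8"))


# Hypothetical saved search: field names and values are assumptions.
saved = {"search": "severity:error AND host:gw01", "timeframe": "900"}

# Hypothetical bookmark URL that an administrator would store manually.
bookmark = "http://kibana.example.local/#" + encode_query(saved)

# Reading the bookmark back shows which query it stands for.
restored = decode_query(bookmark.split("#", 1)[1])
print(bookmark)
print(restored)
```

Keeping the decoded form next to each stored URL makes a manually maintained list of saved searches readable, since the Base64 fragment on its own does not reveal the query.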
  • 54. 54 6. Future research
The log management solutions described in this thesis were tested for small business use. These solutions could be used in bigger companies as well. Such companies could benefit from open-source solutions, as the cost of log management with commercial solutions can be very high in big companies.
As the scope of the thesis is log management solutions for small business, the tests were carried out on modest hardware. Performance on more powerful hardware could be tested to show how well the systems suit a larger-scale environment. The tests were done on single nodes. Scalability and performance of the solutions could be tested by installing clusters of different sizes.
The usability test results provided in this thesis are based on the author's opinion. Similar testing for a larger environment could be carried out with a target group, as bigger companies have more personnel available.
  • 55. 55 7. Summary
In the author's opinion all three systems have well-built web interfaces that serve their intended purpose. The choice depends on the environment where the system would be used.
Graylog2 is a great tool for environments that need to give access to specific logs only. An example would be a company that provides IT services and has different teams: developers, system administrators, network administrators, supervisors etc.
Performance testing showed that ELSA is the fastest and can handle about 14285,7 logs per second with the modest hardware resources used for testing. As the solution is meant for small business, performance is not a crucial factor, so Graylog2 and Kibana can compete very well with ELSA in the given conditions.
According to the usability test results Kibana is the most usable system. Kibana would be the best choice for environments that benefit from a combination of great usability, analytics and good performance.
ELSA should be suitable for high-volume, large-scale log management. It is specifically designed for network incident response and for fighting APT. It seems like a great tool for monitoring large networks; for example, ISPs or a CERT could benefit from using ELSA.
Kibana and Rsyslog were chosen for installation in the production environment because of their usability, ease of installation and suitable performance.
  • 56. 56 Choosing an Open-Source Log Management System for Small Business
Master's thesis, code ITI70LT (30 ECTS)
Student: Artjom Tšurilin
Student code: 113832IVCMM
Advisor: Risto Vaarandi, Ph.D
Summary (translated from the Estonian resümee)
This thesis focuses on the comparison of three popular open-source log management systems. The aim of the thesis is to give an overview of three popular log management systems and to offer guidance for choosing the one that suits a small business in the best possible way. The choice is based on comparative analysis and on performance and usability testing.
Enterprise Log Search and Archive (ELSA) is a highly efficient open-source log management system which can outdo enterprise-grade commercial solutions. It is designed for effective incident response and for fighting advanced persistent threats (APT).
Kibana is a log analysis front end for Logstash and Elasticsearch. It can also be used with other back ends that support formatted output to Elasticsearch, such as Rsyslog with the Elasticsearch output module.
Graylog2 is an alternative log management tool with its own web graphical user interface (GUI). A distinctive feature of Graylog2 is that logs can easily be divided into different streams, giving different users access to different types of logs.
Performance testing showed that ELSA is the fastest and can handle about 14285,7 logs per second with the modest hardware resources used for testing. As the solution is intended for small business, performance is not a decisive factor, and Graylog2 and Kibana can compete very well with ELSA in the given conditions. According to the usability test results Kibana is the most usable system.
Kibana together with Rsyslog was chosen as the most suitable solution for a small business. It has certain shortcomings in authentication and saved searches, but its usability, ease of installation and universality make it an outstanding solution for small business.
The missing functions are under development, and in the meantime external mechanisms and workarounds can be used to compensate for them.
  • 57. 57 List of References
[1] K. Mani Chandy, Event-Driven Applications: Costs, Benefits and Design Approaches, California Institute of Technology, www.infospheres.caltech.edu/sites/default/files/Event-Driven%20Applications%20-%20Costs,%20Benefits%20and%20Design%20Approaches.pdf (accessed 29.03.2013)
[2] http://www.webopedia.com/TERM/E/event.html (accessed 05.03.2013)
[3] R. Vaarandi, Cyber Defense Monitoring Solutions, 1-event-logs-and-syslog
[4] http://www.ietf.org/rfc/rfc3164.txt (accessed 07.04.2013)
[5] http://www.rsyslog.com/doc/history.html (accessed 06.04.2013)
[6] http://www.balabit.com/network-security/Syslog-ng/opensource-logging-system/features/comparison (accessed 08.03.2013)
[7] R. Vaarandi, Cyber Defense Monitoring Solutions, 5-Syslog-ng-framework
[8] http://www.rsyslog.com/doc/licensing.html (accessed 06.04.2013)
[9] R. Gerhards, “Should I use rsyslog's new or old config style?” http://blog.gerhards.net/ (accessed 06.04.2013)
[10] http://www.graylog2.org/about (accessed 06.03.2013)
[11] https://docs.google.com/file/d/0By1KXg1ivlIeUjVoSVVjTVcxbzg/edit?pli=1 (accessed 18.03.2013)
[12] https://code.google.com/p/enterprise-log-search-and-archive/wiki/Documentation (accessed 18.03.2013)
[13] http://elasticsearch.com/products/elasticsearch/ (accessed 12.03.2013)
[14] http://www.sinatrarb.com/ (accessed 25.03.2013)
[15] http://enterprise-log-search-and-archive.googlecode.com/svn-history/r112/wiki/Documentation.wiki (accessed 18.03.2013)
[16] http://www.jboss.org/drools/drools-expert (accessed 10.03.2013)
[17] http://graphite.wikidot.com/ (accessed 25.03.2013)
[18] http://support.torch.sh/help/kb/graylog2-server/using-librato-metrics-with-graylog2 (accessed 06.03.2013)
[19] http://www.logstash.net/docs/1.1.10/ (accessed 08.04.2013)
[20] http://linuxdrops.com/log-management-using-logstash-and-kibana-on-centos-rhel-fedora/# (accessed 10.03.2013)
  • 58. 58 [21] https://github.com/rashidkpc/Kibana/pull/261 (accessed 04.04.2013)
[22] https://gist.github.com/ (accessed 07.04.2013)
[23] https://code.google.com/p/enterprise-log-search-and-archive/ (accessed 18.03.2013)
[24] https://github.com (accessed 15.03.2013)
[25] http://technet.microsoft.com/en-us/library/cc794868%28v=ws.10%29.aspx (accessed 06.04.2013)
[26] http://linux.about.com/library/cmd/blcmdl1_time.htm (accessed 15.03.2013)
[27] http://netcat.sourceforge.net/ (accessed 15.03.2013)
[28] http://htop.sourceforge.net/ (accessed 10.03.2013)
[29] https://code.google.com/p/enterprise-log-management-appliance/wiki/omelasticsearch
[30] https://github.com/rashidkpc/Kibana/issues/326 (accessed 07.04.2013)
[31] https://github.com/rashidkpc/Kibana/issues/310 (accessed 06.04.2013)
[32] https://www.phusionpassenger.com/ (accessed 12.04.2013)
[33] http://wiki.centos.org/Manuals/ReleaseNotes/CentOS6.4 (accessed 15.03.2013)
[35] http://semicomplete.com/presentations/logstash-puppetconf-2012/#/ (accessed 12.03.2013)
[36] http://kibana.org/infrastructure.html (accessed 25.03.2013)
[37] http://graylog2.com/about (accessed 06.03.2013)
[38] http://support.torch.sh/help/kb/graylog2-web-interface/message-search-syntax (accessed 06.03.2013)
  • 59. 59 Appendices
Appendix 1 - Basic Event Log Cycle [35]
  • 60. 60 Appendix 2 - Logstash Inputs, Filters and Outputs [19]
Inputs: amqp, drupal_dblog, elasticsearch, eventlog, exec, file, ganglia, gelf, gemfire, generator, graphite, heroku, imap, irc, log4j, lumberjack, lumberjack2, pipe, rabbitmq, redis, relp, snmptrap, sqs, stdin, stomp, syslog, tcp, twitter, udp, varnishlog, websocket, xmpp, zenoss, zeromq
Filters: alter, anonymize, checksum, clone, csv, date, dns, environment, gelfify, geoip, grep, grok, grokdiscovery, json, kv, metrics, multiline, mutate, noop, ruby, sleep, split, syslog_pri, translate, urldecode, useragent, xml, zeromq
Outputs: amqp, boundary, circonus, cloudwatch, datadog, elasticsearch, elasticsearch_http, elasticsearch_river, email, exec, file, ganglia, gelf, gemfire, graphite, graphtastic, hipchat, http, internal, irc, juggernaut, librato, loggly, lumberjack, metriccatcher, mongodb, nagios, nagios_nsca, null, opentsdb, pagerduty, pipe, rabbitmq, redis, riak, riemann, sns, sqs, statsd, stdout, stomp, syslog, tcp, websocket, xmpp, zabbix, zeromq
  • 61. 61 Appendix 3 - Rsyslog main components installation
RPMs for installing Rsyslog v7
#!/bin/sh
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/libee-devel-0.4.1-1.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/libee-0.4.1-1.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/json-c-0.9-4.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/json-c-devel-0.9-4.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/libestr-0.1.5-1.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/libestr-devel-0.1.5-1.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/liblognorm-devel-0.3.4-5.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/liblognorm-0.3.4-5.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/rsyslog-7.2.6-3.el6.x86_64.rpm
wget http://rpms.adiscon.com/v7-stable/epel-6/x86_64/RPMS/rsyslog-elasticsearch-7.2.6-3.el6.x86_64.rpm
Installing from RPM
rpm -ivh libee-devel-0.4.1-1.el6.x86_64.rpm libee-0.4.1-1.el6.x86_64.rpm \
    json-c-0.9-4.el6.x86_64.rpm json-c-devel-0.9-4.el6.x86_64.rpm \
    libestr-0.1.5-1.el6.x86_64.rpm libestr-devel-0.1.5-1.el6.x86_64.rpm \
    liblognorm-devel-0.3.4-5.el6.x86_64.rpm liblognorm-0.3.4-5.el6.x86_64.rpm \
    rsyslog-7.2.6-3.el6.x86_64.rpm rsyslog-elasticsearch-7.2.6-3.el6.x86_64.rpm
  • 62. 62 Appendix 4 - Kibana setup example scheme [36]