This document discusses ways to optimize logging by centralizing and proactively using log data. It recommends using Monolog to log from application code in a standardized format. Rsyslog can then collect logs centrally from applications and systems. Logstash can further process logs with filters and output them to destinations like Elasticsearch. Graylog2 provides a web interface for powerful log searching, analytics, and alerting. Centralizing, standardizing, and proactively analyzing logs with these open source tools allows for improved monitoring and troubleshooting.
2. Who?
● Ex-pat Englishman, now living in Southern Ontario.
● Web developer for 5 years, mostly PHP.
● (Almost) senior software engineer at TribeHR.
● Co-organiser of Guelph PHP User Group.
● Ex-professional musician.
11. What's wrong with error_log?
● Nothing at all but...
● It's limited:
– Have to format the message yourself.
– Limited number of destinations.
– Doesn't support logging levels defined in RFC 5424.
15. Channels
● A channel is a name or category for a logger.
● Each logger instance is given a channel when instantiated.
● Allows for multiple loggers, each with a different channel.
16. Handlers
● Handlers write log messages to a storage medium.
● Multiple handlers can be attached to each logger.
● For each handler, set the lowest level it logs at and whether it 'bubbles'.
● Many handlers available or you can write your own.
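The handler behaviour described here (a minimum level per handler, optional bubbling, and the most recently added handler being called first) can be modelled in a few lines. This is a toy model written in Python, not Monolog's API: the `Handler` and `Logger` classes, the numeric level values and the handler names are all assumptions made for illustration.

```python
# Toy model of a handler stack; NOT Monolog's real API.
from dataclasses import dataclass, field

# Higher number means more severe (values assumed for this sketch).
LEVELS = {"DEBUG": 100, "INFO": 200, "WARNING": 300, "ERROR": 400}


@dataclass
class Handler:
    name: str
    min_level: str = "DEBUG"   # lowest level this handler accepts
    bubble: bool = True        # if False, remaining handlers are skipped
    written: list = field(default_factory=list)

    def handles(self, level: str) -> bool:
        return LEVELS[level] >= LEVELS[self.min_level]


class Logger:
    def __init__(self, channel: str):
        self.channel = channel
        self.handlers = []

    def push_handler(self, handler: Handler) -> None:
        # Handlers added last are consulted first.
        self.handlers.insert(0, handler)

    def log(self, level: str, message: str) -> None:
        for handler in self.handlers:
            if not handler.handles(level):
                continue
            handler.written.append(f"{self.channel}.{level}: {message}")
            if not handler.bubble:
                break  # the record does not bubble to handlers added earlier


file_handler = Handler("file", min_level="DEBUG")
mail_handler = Handler("mail", min_level="ERROR", bubble=False)

logger = Logger("payments")
logger.push_handler(file_handler)   # general handler first
logger.push_handler(mail_handler)   # more specific handler later, so it runs first

logger.log("INFO", "invoice created")
logger.log("ERROR", "card declined")
```

In this sketch the error reaches only the mail handler, because that handler does not bubble. If the file should also contain errors, the mail handler would need `bubble=True`.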
18. Formatters
● Processes a log message into the appropriate format for a handler.
● Each handler has a default formatter to use but this can be overridden.
22. Processors
● Used to amend or add to the log message.
● PHP callable, called when a message is logged.
● Built in processors available:
– IntrospectionProcessor
– WebProcessor
– MemoryUsageProcessor
– MemoryPeakUsageProcessor
– ProcessIdProcessor
– UidProcessor
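A processor can be thought of as a function that receives the log record and returns it with extra data attached. The Python sketch below only illustrates that idea. The record layout (`message`, `context`, `extra`), the function names and the length of the identifier are assumptions for the example, not Monolog's implementation.

```python
import os
import uuid


def process_id_processor(record: dict) -> dict:
    # Adds the current process id, in the spirit of ProcessIdProcessor.
    record["extra"]["process_id"] = os.getpid()
    return record


def uid_processor(record: dict) -> dict:
    # Adds a short unique identifier, in the spirit of UidProcessor.
    record["extra"]["uid"] = uuid.uuid4().hex[:7]
    return record


def apply_processors(record: dict, processors: list) -> dict:
    # Each processor is a plain callable, invoked when a message is logged.
    for processor in processors:
        record = processor(record)
    return record


record = {"message": "user logged in", "context": {"user_id": 42}, "extra": {}}
record = apply_processors(record, [process_id_processor, uid_processor])
```

Because the extra data is attached in one place, individual log calls do not need to repeat it in their context.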
27. Why Syslog?
● Loggable events don't only happen in code!
● To get a full picture of what's going on we need to monitor what's going on in other services too.
28. Syslog basics
● OS daemon to process log messages.
● Messages are assigned a facility, such as auth, authpriv, daemon or cron or a custom one.
● Messages are also assigned a severity, defined in RFC 5424.
● Messages can be sent to files, console or a remote location.
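On the wire, facility and severity are combined into a single priority number: RFC 5424 defines it as the facility code multiplied by 8, plus the severity code. The Python sketch below shows that arithmetic for a few of the standard codes. The function names are made up for the example, and only a subset of facilities is listed.

```python
# Numeric codes as listed in RFC 5424 (subset of facilities only).
FACILITIES = {
    "kern": 0, "user": 1, "mail": 2, "daemon": 3,
    "auth": 4, "cron": 9, "authpriv": 10, "local0": 16,
}
SEVERITIES = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}


def priority(facility: str, severity: str) -> int:
    # PRI = facility * 8 + severity
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


def split_priority(pri: int) -> tuple:
    # Reverse the calculation: returns (facility code, severity code).
    return divmod(pri, 8)


auth_crit = priority("auth", "crit")       # 4 * 8 + 2 = 34
local0_info = priority("local0", "info")   # 16 * 8 + 6 = 134
```

A message header starting with `<34>` therefore means facility auth, severity critical.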
29. Which Syslog daemon to use?
● In part will depend on your OS.
● Things to consider:
– Syslog is the oldest with not as many features.
– Syslog-ng is produced under a dual license.
– Rsyslog is fully featured and open source.
30. Introduction to Rsyslog
● Fork of syslog by Rainer Gerhards.
● Drop in replacement for syslog.
● Many, many features including plugin system for extending.
● Default syslogger in Debian, can be installed on other distros too.
31. Remote logging with Rsyslog
● Rsyslog can be configured to work in a client-server setup.
– One or more machines are set up as clients to forward log messages.
– One machine is set up to receive and store them.
● Probably want to filter senders on the receiving machine...
34. Leveling up with Rsyslog
● Apache can send all error logs to syslog directly.
● Rsyslog can also monitor other log files using the Text File Input module.
– Example of monitoring Apache access log at https://gist.github.com/joseph12631/2580615
37. What is Logstash?
● Tool to collect, filter and output log messages.
● Built in web interface or richer web interface project called Kibana available.
● Full information at http://logstash.net/ and Kibana demo at http://demo.logstash.net/
38. Installing Logstash
● Current release is 1.3.3 and can be downloaded from here.
● Run from cli, use supervisord or an init.d/upstart script (cookbook entry on how to do this at http://cookbook.logstash.net/).
40. Logstash config
● When starting specify the path to a config file for Logstash to use.
● Three main sections: input, filter and output.
● Each section may have multiple instances of each type.
44. What is Graylog2?
● Log storage and search application.
● Can accept thousands of messages per second and store terabytes of data.
● Web interface for searching and analytics.
● Built in alerting and metrics.
46. Getting log messages into Graylog2
● Can accept log messages in 3 ways:
– Graylog Extended Log Format (GELF) via UDP.
– Syslog via UDP or TCP.
– AMQP.
● Multiple Graylog2 server instances can be run in parallel.
47. Graylog2 web interface
● Main view shows recent log messages and graphs of recent message numbers.
● Single message can be clicked on to view all details for it.
● Dashboard views.
● Full search functionality.
● Analytics dashboard and metrics.
51. Searches and streams
● Web interface allows fine grained searching by different fields.
● Frequently used searches can be saved as streams.
● Streams can be marked as favourites by users and can be viewed as dashboards.
52. Stream alarms
● Alarms can be sent for a stream with user defined sensitivity.
● Plugins for sending alarms include:
– Email
– PagerDuty
– HipChat
– Twilio SMS
– Jabber/XMPP
● You can also write your own.
Of all the things you would come to a conference like this to hear about...
Crisscott.com seems to be Scott Mattocks.
Logging
Unit Testing
Configuration
Isolates features
Documented
You can't optimise what you can't measure...
How many people monitor log files regularly?
How many only look at them during a major crisis?
Many log files generated by many applications/pieces of software.
The last time you want to be digging through these is in a crisis.
Mention that I can't tell you how to do this.
This talk will introduce some tools that can get you to this point.
Combination of tools will get you to a pro-active log monitoring solution.
Also mention that for each tool I'm talking about there are many alternatives...
Mention closed source alternatives.
Mention that this is being used in production at MRX.
Of course this will be different for everyone!
Also mention that it's specifically for logging errors, not informational or debug messages.
Difficult to format messages.
Destinations: file or email.
Define the log levels in RFC 5424.
Mention that there are many logging libraries but Monolog seems to have gained the most traction.
Describe what PSR-3 is.
PPI takes pieces of Zend 2, Sf2 and Doctrine2 and mashes them!
Silex allows you to register a Monolog provider.
Channel equates to facility in Syslog.
Makes it easy to use different loggers for different parts/functionality in an app.
A handler's constructor accepts the minimum log level that the handler should accept. The default differs depending on the handler.
Handlers can be shared between multiple loggers.
Needs care when not bubbling! Add more specific handlers later.
Rotating File Handler: Creates one file per day but meant as a quick + dirty solution.
Mail handlers include native mail and Swift Mailer handlers.
Pushover handler sends mobile notifications through the Pushover API.
HipChatHandler sends notifications to a HipChat chat room (Rafael Dohms wrote it).
FirePHP and ChromePHP write to the Firebug or Chrome consoles. DEV ONLY!!
Use Handler::setFormatter() method to set the formatter for a handler.
Mention that logging a message accepts up to two arguments:
The message (string) and an array of context.
Mention that handlers added last are called first.
Mention that this takes away some of the repetition of adding context to each log message.
IntrospectionProcessor: Adds the line/file/class/method from which the log call originated.
WebProcessor: Adds the current request URI, request method and client IP to a log record.
MemoryUsageProcessor: Adds the current memory usage to a log record.
MemoryPeakUsageProcessor: Adds the peak memory usage to a log record.
ProcessIdProcessor: Adds the process id to a log record.
UidProcessor: Adds a unique identifier to a log record.
Problems often caused by the intersection of different pieces of software.
Mention that you can often replace the default syslog daemon in an OS.
Mention that not going into all features of Rsyslog, just focusing on remote logging.
Suggest 'man rsyslogd' or 'man rsyslog.conf'.
Also mention that you can use something like Rsyslog or iptables to filter remote loggers.
Note this should be added to main rsyslog config file or a file that's included in it.
This is for UDP forwarding. TCP would use @@.
Mention that normally you would need just one of these.
Also that the corresponding port needs to be opened in the server config.
This would only load the handler for the remote logs. Still needs to be processed with other directives.
Note that if all you want is to centralise all of your logs this could be the solution...
Mention that Logstash is written in Java.
Has 34 inputs, 28 filters and 47 different outputs.
Varnishlog – input from Varnish's shared memory log.
Anonymize – anonymise fields using a consistent hash.
Grok – regex library for parsing log messages and processing matches.
Geoip – add geo data to ip addresses in log messages.
Mutate – General mutations (rename, remove, replace, modify) to fields.
Of course this will be different for everyone!
Discuss advantages and disadvantages to using Graylog or Logstash.
Mention that graylog server and elasticsearch are written in Java, web interface is a Rails app.
Mention login details for the demo – username admin or user, password graylog2.
Benefits of UDP – 'Fire and forget'.
Drawbacks of UDP – Lack of acknowledgement of receiving messages.
TCP can mitigate packet loss but is slower.
AMQP guarantees delivery, but more complex to setup and run.
GELF is basically JSON. Ideal for sending messages from app code. Libraries in many languages, including a Monolog handler.
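To make the "GELF is basically JSON" point concrete, here is a minimal Python sketch that builds a GELF message and sends it over UDP. It assumes the GELF 1.1 field names (`version`, `host`, `short_message`, `timestamp`, `level`, and underscore-prefixed additional fields) and the commonly used UDP port 12201. The host name, server address and extra field are hypothetical. In PHP the Monolog GELF handler mentioned above would normally do this for you.

```python
import json
import socket
import time
import zlib


def build_gelf(host: str, short_message: str, level: int = 6, **extra) -> dict:
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity number; 6 = informational
    }
    # Additional fields are prefixed with an underscore.
    for key, value in extra.items():
        message[f"_{key}"] = value
    return message


def send_gelf_udp(message: dict, server: str = "127.0.0.1", port: int = 12201) -> None:
    # Fire and forget: no acknowledgement that the message arrived.
    payload = zlib.compress(json.dumps(message).encode("utf-8"))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))


# Hypothetical example values.
message = build_gelf("web01", "Payment failed", level=3, order_id=1234)
```

Calling `send_gelf_udp(message)` would hand the compressed JSON to the server. As with any UDP transport, the sender cannot tell whether it was received.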