Logstash::Intro
           @ARGV
Why use Logstash?

• We already have splunk, syslog-ng, chukwa,
  graylog2, scribe, flume and so on.
• But we want a free, lightweight, high-integrity
  framework for our logs:
•   not free --> splunk
•   heavy Java --> scribe, flume
•   loses data --> syslog
•   not flexible --> nxlog
How does Logstash work?

• Like the others, Logstash has
  input/filter/output plugins.
• Attention: Logstash processes events, not (only)
  log lines!
• "Inputs generate events, filters modify them,
  outputs ship them elsewhere." -- [the life of an
  event in logstash]
• "events are passed from each phase using
  internal queues......Logstash sets each queue
  size to 20." -- [the life of an event in logstash]
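
The event flow quoted above can be sketched in a few lines of Python. This is only an illustration of the idea (three stages joined by bounded queues), not Logstash's actual implementation; the names run_pipeline and add_length are made up for this sketch, and only the queue size of 20 comes from the quote.

```python
import queue
import threading

QUEUE_SIZE = 20   # the queue size quoted above
_STOP = object()  # sentinel that tells the next stage to shut down


def run_pipeline(lines, filter_fn, queue_size=QUEUE_SIZE):
    """Push lines through input -> filter -> output stages joined by bounded queues."""
    to_filter = queue.Queue(maxsize=queue_size)
    to_output = queue.Queue(maxsize=queue_size)
    shipped = []

    def input_stage():
        for line in lines:
            # An event is a structured record, not just the raw line.
            to_filter.put({"@message": line, "@fields": {}})
        to_filter.put(_STOP)

    def filter_stage():
        while True:
            event = to_filter.get()
            if event is _STOP:
                to_output.put(_STOP)
                return
            to_output.put(filter_fn(event))

    def output_stage():
        while True:
            event = to_output.get()
            if event is _STOP:
                return
            shipped.append(event)

    stages = (input_stage, filter_stage, output_stage)
    threads = [threading.Thread(target=stage) for stage in stages]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shipped


def add_length(event):
    """Hypothetical filter: record the message length as a field."""
    event["@fields"]["length"] = len(event["@message"])
    return event


events = run_pipeline(["GET /a", "GET /bb"], add_length)
```

Because put() blocks when a queue is full, a slow output slows the input down instead of letting events pile up in memory.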
Existing plugins
Most popular plugins(inputs)

•   amqp
•   eventlog
•   file
•   redis
•   stdin
•   syslog
•   ganglia
Most popular plugins(filters)

•   date
•   grep
•   grok
•   multiline
Most popular plugins(outputs)

•   amqp
•   elasticsearch
•   email
•   file
•   ganglia
•   graphite
•   mongodb
•   nagios
•   redis
•   stdout
•   zabbix
•   websocket
Usage in cluster - agent install

• The only download on http://logstash.net/ is an
  'all in one' jar.
• The full source, both Ruby and JRuby, is at
  http://github.com/logstash/
• But we want a lightweight agent on each cluster node.
Usage in cluster - agent install

• Edit the Gemfile like this:
   –   source "http://ruby.taobao.org/"
   –   gem "cabin", "0.4.1"
   –   gem "bunny"
   –   gem "uuidtools"
   –   gem "filewatch", "0.3.3"
• Clone logstash/[bin|lib] and switch to the pure-ruby branch:
   – git clone https://github.com/chenryn/logstash.git
   – (cd logstash && git checkout pure-ruby)
• Install the gems:
   – gem install bundler
   – bundle
• Run:
   – ruby logstash/bin/logstash -f logstash/etc/logstash-agent.conf
Usage in cluster - agent configuration

  –   input {
  –     file {
  –       type => "nginx"
  –       path => ["/data/nginx/logs/access.log" ]
  –    }
  –   }
  –   output {
  –     redis {
  –       type => "nginx"
  –       host => "5.5.5.5"
  –       key => "nginx"
  –       data_type => "channel"
  –     }
  –   }
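
To make "events, not log lines" concrete, here is a rough Python sketch of what the agent does with each line of access.log before it publishes to the "nginx" Redis channel. The field layout is an assumption based on the @-prefixed names used elsewhere in this deck (@message, @fields, @source_host, @timestamp); build_event is a made-up helper, and no Redis client is used here.

```python
import json
from datetime import datetime, timezone


def build_event(line, path, event_type, host, now=None):
    """Wrap one raw log line in an event record (assumed layout)."""
    now = now or datetime.now(timezone.utc)
    return {
        "@source": "file://%s%s" % (host, path),
        "@source_host": host,
        "@source_path": path,
        "@type": event_type,
        "@tags": [],
        "@fields": {},
        # Millisecond-precision UTC timestamp.
        "@timestamp": now.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z",
        "@message": line.rstrip("\n"),
    }


event = build_event(
    "1.2.3.4 GET /index.html 200\n",
    path="/data/nginx/logs/access.log",
    event_type="nginx",
    host="web01",  # hypothetical host name
    now=datetime(2012, 9, 18, 10, 0, 0, tzinfo=timezone.utc),
)
payload = json.dumps(event)  # the string that would be published to the channel
```

With data_type => "channel" the payload goes out through Redis publish/subscribe, so an event published while no server is subscribed is not queued for later.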
Usage in cluster - server install

• The server is just another agent that runs filters
  and writes to storage.
• Message queue (RabbitMQ is too heavy, Redis is
  just enough):
  – yum install redis-server
  – service redis-server start
• Storage: mongo/elasticsearch/Riak
• Visualization: kibana/statsd/riemann/opentsdb
• Run:
  – java -jar logstash-1.1.0-monolithic.jar agent -f logstash/etc/server.conf
Usage in cluster - server configuration

  –   input {
  –     redis {
  –       type => "nginx"
  –       host => "5.5.5.5"
  –       data_type => "channel"
  –       key => "nginx"
  –     }
  –   }
  –   filter {
  –     grok {
  –       type => "nginx"
  –       pattern => "%{NGINXACCESS}"
  –       patterns_dir => ["/usr/local/logstash/etc/patterns"]
  –     }
  –   }
  –   output {
  –     elasticsearch {
  –       cluster => 'logstash'
  –       host => '10.5.16.109'
  –       port => 9300
  –     }
  –   }
Usage in cluster - grok

• jls-grok is a pattern tool written in JRuby.
• Lots of examples can be found at:
  https://github.com/logstash/logstash/tree/master/patterns

• Here are my "nginx" patterns (each pattern is a single line
  in the patterns file; NGINXACCESS is wrapped here to fit):
   – NGINXURI %{URIPATH}(?:%{URIPARAM})*
   – NGINXACCESS \[%{HTTPDATE}\] %{NUMBER:code:int} %{IP:client}
     %{HOSTNAME} %{WORD:method} %{NGINXURI:req}
     %{URIPROTO}/%{NUMBER:version} %{IP:upstream}(:%{POSINT:port})?
     %{NUMBER:upstime:float} %{NUMBER:reqtime:float} %{NUMBER:size:int}
     "(%{URIPROTO}://%{HOST:referer}%{NGINXURI:referer}|-)"
     %{QS:useragent} "(%{IP:x_forwarder_for}|-)"
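
How a %{NAME:field:type} reference turns into a regular expression can be sketched in Python. This is a simplified illustration of the idea, not jls-grok itself: the four patterns in PATTERNS are cut-down stand-ins for the real pattern files, and compile_grok / grok are names invented for this sketch.

```python
import re

# A tiny, simplified pattern library; the real grok patterns are more thorough.
PATTERNS = {
    "NUMBER": r"-?\d+(?:\.\d+)?",
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "URIPATH": r"/[^\s?]*",
}

# Matches %{NAME}, %{NAME:field} and %{NAME:field:type}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?(?::(\w+))?\}")
CASTS = {"int": int, "float": float}


def compile_grok(expression):
    """Expand grok references into a regex with named groups."""
    casts = {}

    def expand(match):
        name, field, cast = match.groups()
        # Patterns may themselves reference other patterns, so expand recursively.
        body = GROK_REF.sub(expand, PATTERNS[name])
        if field is None:
            return "(?:%s)" % body
        if cast:
            casts[field] = CASTS[cast]
        return "(?P<%s>%s)" % (field, body)

    return re.compile(GROK_REF.sub(expand, expression)), casts


def grok(expression, line):
    """Return the captured fields as a dict, or None if the line does not match."""
    regex, casts = compile_grok(expression)
    match = regex.search(line)
    if match is None:
        return None
    return {key: casts.get(key, str)(value) for key, value in match.groupdict().items()}


fields = grok(
    "%{NUMBER:code:int} %{IP:client} %{WORD:method} %{URIPATH:req}",
    "200 10.0.0.1 GET /index.html",
)
```

Here fields comes back as a dict with code converted to an integer and the other three captures kept as strings.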
Usage in cluster - elasticsearch

• ElasticSearch is a product built on Lucene
  for cloud computing.
• More information at:
  – http://www.elasticsearch.cn/

• Logstash already ships with an embedded
  ElasticSearch!
• Attention: if you build your own distributed
  elasticsearch cluster, make sure the server
  version matches the client version used by
  logstash!
Usage in cluster - elasticsearch

•   elasticsearch/config/elasticsearch.yml:
     –   cluster.name: logstash
     –   node.name: "ES109"
     –   node.master: true
     –   node.data: false
     –   index.number_of_replicas: 0
     –   index.number_of_shards: 1
     –   path.data: /data1/ES/data
     –   path.logs: /data1/ES/logs
     –   network.host: 10.5.16.109
     –   transport.tcp.port: 9300
     –   transport.tcp.compress: true
     –   gateway.type: local
     –   discovery.zen.minimum_master_nodes: 1
Usage in cluster - elasticsearch

• The embedded web front end for ES is too simple,
  sometimes naïve. Try Kibana and EShead.
•   https://github.com/rashidkpc/Kibana
•   https://github.com/mobz/elasticsearch-head.git

• Attention: there is an ES bug. Bring your external
  network interface down (ifdown) before starting ES
  and bring it up (ifup) afterwards. Otherwise your
  Ruby client cannot connect to the ES server!
Try it please!

• Don't want to install, install, install and install?
• Here is a killer application:
   –   sudo zypper install virtualbox rubygems
   –   gem install vagrant
   –   git clone https://github.com/mediatemple/log_wrangler.git
   –   cd log_wrangler
   –   PROVISION=1 vagrant up
Other output example

• For monitoring (example):
  –   filter {
  –     grep {
  –       type => "linux-syslog"
  –       match => [ "@message","(error|ERROR|CRITICAL)" ]
  –       add_tag => [ "nagios-update" ]
  –       add_field => [ "nagios_host", "%{@source_host}",
  –                      "nagios_service", "the name of your nagios service check" ]
  –     }
  –   }
  –   output {
  –     nagios {
  –       commandfile => "/usr/local/nagios/var/rw/nagios.cmd"
  –       tags => "nagios-update"
  –       type => "linux-syslog"
  –     }
  –   }
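
What ends up in the Nagios command file is a passive service check result. A minimal Python sketch of that line follows, using the standard Nagios external command syntax; passive_check_line is a made-up helper, and the status code 2 (CRITICAL) is an assumption about what you would want to report for an error line.

```python
import time


def passive_check_line(host, service, status, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command."""
    timestamp = int(time.time() if now is None else now)
    # Nagios reads one command per line, so flatten multi-line messages.
    message = output.replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, status, message)


line = passive_check_line(
    host="web01",                # would come from the nagios_host field
    service="nginx errors",      # would come from the nagios_service field
    status=2,                    # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
    output="upstream timed out",
    now=1347962400,
)
```

Writing the result is then just appending line to the file named in commandfile.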
Other output example

• For metrics:
  –   output {
  –     statsd {
  –       increment => "apache.response.%{response}"
  –       count => [ "apache.bytes", "%{bytes}" ]
  –     }
  –   }
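
For one event, the two settings above boil down to two StatsD counter packets. The Python sketch below shows the %{field} substitution and the "name:value|c" wire format; interpolate and statsd_packets are names invented for this sketch, and any namespace prefix the real output may add is left out.

```python
import re


def interpolate(template, fields):
    """Replace each %{name} in the template with the event's field value."""
    return re.sub(r"%\{(\w+)\}", lambda m: str(fields[m.group(1)]), template)


def statsd_packets(fields):
    """Build the counter packets for one event (StatsD format: name:value|c)."""
    return [
        # increment: add 1 to a counter named after the response code
        interpolate("apache.response.%{response}", fields) + ":1|c",
        # count: add the number of bytes sent to a running total
        "apache.bytes:" + interpolate("%{bytes}", fields) + "|c",
    ]


packets = statsd_packets({"response": 200, "bytes": 5120})
```

Each packet would be sent to the StatsD daemon as a single UDP datagram.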
Advanced Questions

• Is Ruby 1.8.7 stable enough?
•   Try the Message::Passing module on CPAN. I love Perl~

• Is ElasticSearch fast enough?
•   Try Sphinx; see the report from the ELSA project:
     –    In designing ELSA, I tried the following components but found them too slow. Here they are ordered from fastest to
          slowest for indexing speeds (non-scientifically tested):
     1.   Tokyo Cabinet
     2.   MongoDB
     3.   TokuDB MySQL plugin
     4.   Elastic Search (Lucene)
     5.   Splunk
     6.   HBase
     7.   CouchDB
     8.   MySQL Fulltext
•   http://code.google.com/p/enterprise-log-search-and-archive/wiki/Documentation#Why_ELSA?
Advanced Testing

• How many events/sec can ElasticSearch handle?
•   - Logstash::Output::Elasticsearch(HTTP) can only index 200+ msg/sec with
    one thread.
•   - I tried the _bulk API myself using the Perl ElasticSearch::Transport::HTTPLite
    module.
•   -- the speed test result is 2500+ msg/sec
•   -- for the test record see:
    http://chenlinux.com/2012/09/16/elasticsearch-bulk-index-speed-testing/
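
The speedup comes from sending many events in one request. Here is a small Python sketch of how a _bulk request body is put together (an action line followed by the document, one JSON object per line); the original test used Perl, and bulk_body is a name invented for this sketch.

```python
import json


def bulk_body(index, doc_type, events):
    """Build the newline-delimited body for a POST to /_bulk."""
    lines = []
    for event in events:
        # Action line: index this document into index/doc_type.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(event))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body(
    "logstash-2012.09.18",
    "nginx",
    [{"@message": "GET /a"}, {"@message": "GET /b"}],
)
```

One HTTP round trip then carries as many events as you put in the list, instead of one request per event.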




                           WHY?!
Maybe…

• Logstash uses an experimental module: the
  Logstash::Output::ElasticsearchHTTP output
  uses ftw as its HTTP client, and it cannot handle
  a bulk size larger than 200!!
• So we suggest using multiple output blocks in
  agent.conf.
Advanced ES Settings(1)--problems

• Kibana searches data using the facets APIs.
  But when you index URLs, they are
  automatically split on '/'~~
• Faceting on IP across 10 million messages takes
  0.1s, but faceting on URLs…ah, timeout!
• When you check your index size, you will find
  that (index size / document count) : message
  length ~~ 10:1 !!
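
Why URL facets behave so badly can be seen with a toy model in Python. analyzed_terms is only a rough stand-in for an ElasticSearch analyzer (lowercase, split on anything that is not a letter or digit), and facet counts documents per term the way a terms facet would; both names are invented for this sketch.

```python
import re
from collections import Counter


def analyzed_terms(value):
    """Rough stand-in for an analyzer: lowercase and split on non-alphanumerics."""
    return [term for term in re.split(r"[^0-9a-z]+", value.lower()) if term]


def facet(values, analyzed):
    """Count documents per term, as a terms facet would."""
    counts = Counter()
    for value in values:
        # Analyzed: one entry per distinct fragment. Not analyzed: the whole value.
        counts.update(set(analyzed_terms(value)) if analyzed else [value])
    return counts


urls = ["/img/logo.png", "/img/banner.png", "/api/users"]
split_facet = facet(urls, analyzed=True)    # counts fragments: img, png, logo, ...
whole_facet = facet(urls, analyzed=False)   # counts each URL as one term
```

With analysis on, the facet is computed over URL fragments, so the result is both larger and not what you wanted to count.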
Advanced ES Settings(2)--solution

• Set a default ElasticSearch _mapping
  template!
• In fact, ES stores the index data and then also
  stores the source data… Yes! Unless you set
  "store" : "no", all the data is stored twice.
• ES also has many analyzers. They automatically
  split text by whitespace, path hierarchy,
  keyword, etc.
• So set "index" : "not_analyzed", and facets over
  100k+ URLs finish within 1s.
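
As a sketch of what such a template might look like, here it is written as a Python dict and serialized to JSON. The exact layout is an assumption for the ElasticSearch versions of this period: the "logstash-*" index pattern, the "_default_" mapping and the choice of the url field under @fields are illustrative, and only "store" : "no" and "index" : "not_analyzed" come from the slide.

```python
import json

# Hypothetical index template; adjust the field list to your own events.
template = {
    "template": "logstash-*",          # apply to every daily logstash index
    "mappings": {
        "_default_": {                 # default mapping for all types
            "properties": {
                "@fields": {
                    "type": "object",
                    "properties": {
                        "url": {
                            "type": "string",
                            "index": "not_analyzed",  # keep the URL as one term
                            "store": "no",            # do not store it a second time
                        },
                    },
                },
            },
        },
    },
}

template_json = json.dumps(template, indent=2)
```

The JSON would typically be sent with a PUT to the _template endpoint under a name of your choice, and it only affects indices created afterwards.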
Advanced ES Settings(2)--solution

• Optimize:
• Calling the _optimize API every day may reduce the
  index size~

• You can find these solutions at:
•   https://github.com/logstash/logstash/wiki/Elasticsearch-Storage-Optimization
•   https://github.com/logstash/logstash/wiki/Elasticsearch----Using-index-templates-&-dynamic-
Advanced Input -- question

• Now we know how to disable the _all field, but there
  are still duplicated fields: @fields and
  @message!
• Logstash searches ES in the @message field by
  default, while Logstash::Filter::Grok by default
  captures variables from @message into @fields!
• How do we solve this?
Advanced Input -- solution

• Some other systems, like Message::Passing,
  have encode/decode stages in addition to
  input/filter/output.
• In fact Logstash has them too, but calls them
  'format'.
• So we can define the message format ourselves,
  just by using log_format in nginx.conf.

•   (example follows)
Advanced Input -- nginx.conf

   – log_format json '{"@timestamp":"$time_iso8601",'
     '"@source":"$server_addr",'
     '"@fields":{'
     '"client":"$remote_addr",'
     '"size":$body_bytes_sent,'
     '"responsetime":$request_time,'
     '"upstreamtime":$upstream_response_time,'
     '"oh":"$upstream_addr",'
     '"domain":"$host",'
     '"url":"$uri",'
     '"status":"$status"}}';
   – access_log /data/nginx/logs/access.json json;
• See
  http://cookbook.logstash.net/recipes/apache-json-logs/
Advanced Input -- json_event

• Now define the input block with a format:
     – input {
     –    stdin {
     –       type => "nginx"
     –       format => "json_event"
     –    }
     – }

• And start it from the command line:
     – tail -F /data/nginx/logs/access.json \
         | sed 's/upstreamtime":-/upstreamtime":0/' \
         | /usr/local/logstash/bin/logstash -f /usr/local/logstash/etc/agent.conf &
•   Attention: upstreamtime may be "-" if the status is 400.
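
The reason for the sed step is that a bare "-" is not valid JSON, so the event would fail to parse. A small Python sketch of the same fix follows; parse_access_line is a name invented here, and the sample line is made up to match the log format above.

```python
import json


def parse_access_line(line):
    """Apply the same substitution as the sed command, then parse the JSON."""
    fixed = line.replace('"upstreamtime":-', '"upstreamtime":0')
    return json.loads(fixed)


# Made-up sample of a request that never reached an upstream.
sample = (
    '{"@timestamp":"2012-09-18T10:00:00+08:00","@source":"10.0.0.1",'
    '"@fields":{"client":"1.2.3.4","size":0,"responsetime":0.000,'
    '"upstreamtime":-,"oh":"-","domain":"example.com","url":"/","status":"400"}}'
)

event = parse_access_line(sample)
```

Without the substitution, json.loads(sample) raises a ValueError because of the unquoted "-".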
Advanced Web GUI

• Write your own website that searches through the
  ElasticSearch RESTful API, as follows:
  –   curl -XPOST 'http://es.domain.com:9200/logstash-2012.09.18/nginx/_search?pretty=1' -d '
      {
        "query": {
          "range": {
            "@timestamp": {
              "from": "now-1h",
              "to": "now"
            }
          }
        },
        "facets": {
          "curl_test": {
            "date_histogram": {
              "key_field": "@timestamp",
              "value_field": "url",
              "interval": "5m"
            }
          }
        },
        "size": 0
      }
      '
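
If the website is not a shell script, the same request can be assembled in code. The Python sketch below only builds the URL and the JSON body and does not send anything; search_request is a made-up helper, and using @timestamp as the range field is an assumption (a range query needs a field name, and @timestamp is the one the facet keys on).

```python
import json
from datetime import date


def search_request(base_url, day, doc_type="nginx"):
    """Return (url, body) for a last-hour date histogram on one daily index."""
    # Daily indices are named logstash-YYYY.MM.DD, as in the curl example.
    index = "logstash-" + day.strftime("%Y.%m.%d")
    url = "%s/%s/%s/_search?pretty=1" % (base_url, index, doc_type)
    body = {
        "query": {
            # Assumed field name for the range query.
            "range": {"@timestamp": {"from": "now-1h", "to": "now"}}
        },
        "facets": {
            "curl_test": {
                "date_histogram": {
                    "key_field": "@timestamp",
                    "value_field": "url",
                    "interval": "5m",
                }
            }
        },
        "size": 0,  # only the facet is wanted, not the matching documents
    }
    return url, json.dumps(body)


url, body = search_request("http://es.domain.com:9200", date(2012, 9, 18))
```

The pair can then be handed to whatever HTTP client the website already uses.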
Additional Message::Passing demo

• I wrote a demo using Message::Passing,
  Regexp::Log, ElasticSearch and other Perl
  modules that works much like the Logstash
  setup shown here.
• See:
  – http://chenlinux.com/2012/09/16/message-passing-agent/
  – http://chenlinux.com/2012/09/16/regexp-log-demo-for-nginx/
  – http://chenlinux.com/2012/09/16/message-passing-filter-demo/
Reference

•   http://logstash.net/docs/1.1.1/tutorials/metrics-from-logs
•   http://logwrangler.mtcode.com/
•   https://www.virtualbox.org/wiki/Linux_Downloads
•   http://vagrantup.com/v1/docs/getting-started/index.html
•   http://www.elasticsearch.cn
•   http://search.cpan.org/~bobtfish/Message-Passing-0.010/lib/Message/Passing.pm