
Large scale near real-time log indexing with Flume and SolrCloud

9,249 views

Apache Flume’s extensible architecture allows Cisco to stream system and application logs from worldwide production data centers to a central Hadoop cluster and Solr. This architecture enables a new level of scalable indexing so that a larger volume of logs is searchable within seconds. Using Solr 4.0's near-real-time features together with Hadoop, we can execute mission-critical tasks much more quickly, improving our ability to meet tight SLAs. At the same time, using the same infrastructure, we can perform large-scale historical analysis and pattern extraction to help further improve our services. This talk will explore our infrastructure and the decisions we've made to meet key requirements, i.e. high indexing load, high availability and disaster recovery. We will further explore other uses of Flume and SolrCloud within Cisco, including dynamic event routing, parsing and multi-tenancy.

Large scale near real-time log indexing with Flume and SolrCloud

  1. C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved.
  2. (1) Intro (2) Problem to solve? (3) How does Flume/Solr help? (4) Syslog indexing example (5) HA, DR & scalability
  3. Ops Architect at Cisco CCATG (WebEx)
     Ensure operational readiness for complex distributed services
     HA, DR, monitoring, config, deployment
     Previously eBay, Excite@Home, IBM, VISA
     Operations architecture, monitoring, event correlation
  5. Cisco WebEx Meetings
     • Voice, video, desktop sharing
     • Meeting/Event/Support/Training Centers
     • Integration with TelePresence
     Cisco WebEx Social
     • Social networking
     • Content creation
     • Integrated IM
     Cisco WebEx Messenger
     • IM, presence
     • Integrate with voice, video
     • XMPP
  6. • Participants from over 231 countries, 52% market share
     • 2.2 billion meeting minutes per month
     • 40.5 million meeting attendees per month
     • 9.4 million registered hosts worldwide
     • 4 million mobile downloads
  7. [Map legend: Datacenter / PoP; Leased network link]
     • Global scale: 13 datacenters & iPoPs around the globe
     • Dedicated network: dual path 10G circuits between DCs
     • Multi-tenant: 95k sites
     • Real-time collaboration: voice, desktop sharing, video, chat
  8. [Map legend: Datacenter / PoP; Leased network link]
     • People make mistakes
     • Hardware fails
     • Software fails
     • Even failovers sometimes fail
  9. “If a problem has no solution, it may not be a problem, but a fact, not to be solved, but to be coped with over time” (Shimon Peres, “Peres’s Law”)
     People/HW/SW failures are facts, not problems
     The main goal of operations is to maintain high service availability
     • Recovery/repair is how we cope with the above facts
     • Improving recovery/repair improves availability
     Unavailability = MTTR / MTBF
     1/10th the MTTR is just as valuable as 10x the MTBF
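
The relation on slide 9 can be checked numerically. The sketch below is only an illustration: the MTTR and MTBF figures are made-up values, not numbers from the talk, and the formula is the slide's approximation of unavailability.

```python
import math


def unavailability(mttr_hours: float, mtbf_hours: float) -> float:
    """Approximate fraction of time a service is down (slide 9: MTTR / MTBF)."""
    return mttr_hours / mtbf_hours


# Hypothetical baseline: 4 hours to repair, one failure every 1000 hours.
baseline = unavailability(mttr_hours=4.0, mtbf_hours=1000.0)
faster_repair = unavailability(mttr_hours=0.4, mtbf_hours=1000.0)     # 1/10th the MTTR
fewer_failures = unavailability(mttr_hours=4.0, mtbf_hours=10000.0)   # 10x the MTBF

# Both improvements cut unavailability by the same factor of ten.
same_benefit = math.isclose(faster_repair, fewer_failures)
print(baseline, faster_repair, fewer_failures, same_benefit)
```

This is why faster root-cause search matters: shortening repair time improves availability as much as preventing failures does.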
  10. Even better: proactive
      Good: reactive
      Your search "What is the root cause of the outage?" did not match any documents.
  12. [Architecture diagram]
      Unstructured/semi-structured data: Log4j, File, Avro, Syslog, Thrift, AMQP, HTTP/REST, application state & APIs
      Structured data: RDBMS, MySQL, Sqoop
      Flume sinks: HDFS Sink (raw data in HDFS), Solr Sink (Solr index in SolrCloud), other sinks
      Cisco UCS C240 M3 servers, 12 x 3 TB = 36 TB / server
  13. [Diagram: two-tier topology]
      Collector tier: Flume agents in DC 1 … DC N, fed by syslog, log4j and file sources
      Storage tier: DC 1 and DC 2, each with Flume, HDFS and SolrCloud
  14. [Diagram: Flume Collector server, replicating fan-out flow]
      Agents (failover & load balancing) send to the collector's Avro src
      Avro src writes to File Channel 1 and File Channel 2
      DC1 Avro sink and DC2 Avro sink forward to the Flume Storage tier in DC1 and DC2
      All events replicated to both channels
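
The replicating fan-out flow on slide 14 can be modeled in a few lines. This is a toy simulation, not Flume code: `ReplicatingSource` and the channel names are hypothetical, and in-memory deques stand in for the durable File Channels.

```python
from collections import deque


class ReplicatingSource:
    """Toy model of a source that copies every event to all attached channels."""

    def __init__(self, channels):
        self.channels = channels

    def put(self, event):
        for channel in self.channels:
            channel.append(event)


# One channel per destination datacenter, as on the slide.
dc1_channel, dc2_channel = deque(), deque()
source = ReplicatingSource([dc1_channel, dc2_channel])
for line in ["event-a", "event-b"]:
    source.put(line)

# Simulate DC2 being unreachable: only the DC1 sink drains its channel.
delivered_dc1 = [dc1_channel.popleft() for _ in range(len(dc1_channel))]

# DC2's copy of each event stays buffered until its sink can deliver again.
print(delivered_dc1, list(dc2_channel))
```

Because each destination has its own channel, a slow or unavailable datacenter delays only its own copy of the data.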
  15. [Diagram: two-tier topology]
      Collector tier: Flume agents in DC 1 … DC N, fed by syslog, log4j and file sources
      Storage tier: DC 1 and DC 2, each with Flume, HDFS and SolrCloud
  16. [Diagram: Flume Storage tier server, multiplexing fan-out flow]
      Flume Collectors (failover & load balancing) send to the storage server's Avro src
      Avro src writes to File Channel 1 and File Channel 2
      Solr Sink writes to SolrCloud; HDFS sink writes to HDFS
      Routing to Solr by Flume event header
      All events to HDFS
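
The routing rule on slide 16 (everything to HDFS, selected events to Solr by header) can be sketched as follows. This is a simplified model rather than Flume's own selector: the header name `index_in_solr` and its value are assumptions made for illustration.

```python
def route(event, solr_channel, hdfs_channel, header="index_in_solr"):
    """Append every event to the HDFS channel; also to Solr when the header is "true"."""
    hdfs_channel.append(event)
    if event["headers"].get(header) == "true":
        solr_channel.append(event)


solr_channel, hdfs_channel = [], []
events = [
    {"headers": {"index_in_solr": "true"}, "body": "syslog line"},
    {"headers": {}, "body": "debug trace"},  # archived only, never indexed
]
for event in events:
    route(event, solr_channel, hdfs_channel)

print(len(hdfs_channel), len(solr_channel))
```

Keeping the raw copy of every event in HDFS means the Solr index can be limited to what needs to be searchable in near real time.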
  17. Isn’t Big Data “schema on read”?
      • Why does Solr require a schema on write?
      • Dirty little secret: there’s always a schema
      • Performance & functionality vs flexibility
      • Optimize operations and storage based on field type: that's how you get sub-second response times
      There’s always a schema
      • Application code vs. central location
  18. Cloudera Morphlines
      • Framework to simplify event transformation
      • Compatible with existing grok patterns
      • Reusable across multiple index workloads: Flume & M/R
      [Diagram: inside the SolrSink, a Flume event (headers + body) becomes a Record that passes through the commands readLine, grok, tryRules, addValues, …, loadSolr, and reaches Solr as a document matching schema.xml]
  19. Convert syslog message..
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      .. into Solr schema fields
      Severity=[3]
      Facility=[22]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      timestamp=[2013-06-16T04:36:49.000Z]
      syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      severity_label=[error]
      access_token=[54asdf654]
      id=[b2f839c3-dece-404f-a535-e0141ad549bf]
      cisco_product=[ACE]
      cisco_level=[3]
      cisco_id=[251008]
      cisco_code=[%ACE-3-251008]
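
The Severity and Facility values on slide 19 follow from the leading `<179>` priority field, which standard syslog encodes as facility * 8 + severity. A minimal sketch of that decoding is below; the function name and the label list are illustrative choices, not taken from the talk's Morphline configuration.

```python
import re

# Standard syslog severities 0 to 7; the slide labels severity 3 as "error".
SEVERITY_LABELS = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]


def decode_pri(message: str):
    """Return (facility, severity, severity_label) from a leading <PRI> field."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        raise ValueError("message does not start with a <PRI> field")
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, severity, SEVERITY_LABELS[severity]


raw = ("<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : "
       "%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234")
facility, severity, label = decode_pri(raw)
print(facility, severity, label)
```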
  20. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 1: readLine reads in Flume event headers and body
      Headers:
      timestamp=[1371357409000]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      category=[545f5sfsd5sf]
      Severity=[3]
      Facility=[22]
      Body:
      message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
  21. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 2: convertTimestamp converts epoch to ISO 8601 format
      timestamp=[2013-06-16T04:36:49.000Z]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      access_token=[545f5sfsd5sf]
      message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      Severity=[3]
      Facility=[22]
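
The conversion in step 2 takes the epoch-milliseconds header from step 1 and renders it as an ISO 8601 UTC string. The following sketch reproduces that transformation in plain Python; it is an equivalent calculation, not the Morphline command itself, and the function name is made up.

```python
from datetime import datetime, timezone


def epoch_millis_to_iso8601(millis: int) -> str:
    """Format epoch milliseconds as an ISO 8601 UTC timestamp with millisecond precision."""
    seconds, remainder_ms = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{remainder_ms:03d}Z"


converted = epoch_millis_to_iso8601(1371357409000)
print(converted)  # 2013-06-16T04:36:49.000Z
```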
  22. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 3: addValues creates new field access_token
      timestamp=[2013-06-16T04:36:49.000Z]
      category=[545f5sfsd5sf]
      access_token=[545f5sfsd5sf]
      message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      Severity=[3]
      Facility=[22]
  23. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 4: tryRules creates field severity_label for severity
      timestamp=[2013-06-16T04:36:49.000Z]
      severity_label=[error]
      access_token=[545f5sfsd5sf]
      message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      category=[545f5sfsd5sf]
      Severity=[3]
      Facility=[22]
  24. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 5: tryRules creates new fields
      syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      cisco_product=[ACE]
      cisco_level=[3]
      cisco_id=[251008]
      cisco_code=[%ACE-3-251008]
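
The field extraction in step 5 is pattern matching of the kind grok expresses with named patterns. The sketch below uses a plain Python regular expression with named groups to produce the same fields; the pattern is an assumption written for this one message format, not the grok expression used in the talk.

```python
import re

# Hypothetical pattern for Cisco-style codes such as %ACE-3-251008.
CISCO_PATTERN = re.compile(
    r"(?P<cisco_code>%(?P<cisco_product>[A-Z0-9_]+)-(?P<cisco_level>\d)-(?P<cisco_id>\d+)): .*"
)


def extract_cisco_fields(body: str) -> dict:
    """Return the Cisco-specific fields found in a syslog body, or {} if none match."""
    match = CISCO_PATTERN.search(body)
    if match is None:
        return {}
    fields = match.groupdict()
    fields["syslog_message"] = match.group(0)  # code plus the free-text description
    return fields


body = ("<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : "
        "%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234")
fields = extract_cisco_fields(body)
print(fields)
```

Returning an empty dict for non-matching bodies mirrors the idea behind tryRules: a message that fits no rule simply gains no extra fields.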
  25. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 6: sanitizeUnknownSolrFields drops non-schema fields
      timestamp=[2013-06-16T04:36:49.000Z]
      syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      severity_label=[error]
      access_token=[545f5sfsd5sf]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      cisco_product=[ACE]
      cisco_level=[3]
      cisco_id=[251008]
      cisco_code=[%ACE-3-251008]
  26. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 7: generateUUID creates a unique id for the document
      timestamp=[2013-06-16T04:36:49.000Z]
      syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
      severity_label=[error]
      access_token=[545f5sfsd5sf]
      id=[b2f839c3-dece-404f-a535-e0141ad549bf]
      host=[colo01-wxp00-ace01b-connect.webex.com]
      cisco_product=[ACE]
      cisco_level=[3]
      cisco_id=[251008]
      cisco_code=[%ACE-3-251008]
  27. Convert syslog message
      <179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
      Step 8: loadSolr loads a record into a Solr server
  28. [Diagram: inside the SolrSink, a Flume syslog event (headers + body) becomes a Record that passes through the commands readLine, grok, tryRules, addValues, …, loadSolr, and reaches SolrCloud as a document matching schema.xml]
  29. [Diagram: SolrCloud cluster with Shard1 (leader1, replica1), Shard2 (leader2, replica2) and Shard3 (leader3, replica3) forming a Collection; ZooKeeper ensemble zk1, zk2, zk3; pluggable filesystem (local, HDFS). "Add doc to syslog index": "Where can I index data?" is answered with leader3]
      • Collections, shards & replicas
      • Pluggable file system
      • Central config & coordination with ZK
      • Full HA, automatic fail-over
      • NRT indexing
      • Automatic routing
  30. Collection “syslog” with three shards
  31. Special case of search
      • Logs are time series data: timestamp + data
      • High indexing rate, no updates
      • New data is searched more frequently than old data
      Collection aliases
      • Time-partitioned collections (e.g. one collection per day)
      • Reduces the workload to near-real-time data only
      • One-to-many collection mapping: queries go to a logical representation mapped to multiple same-schema collections
      • Simplifies hot-warm-cold migration of data
      Index expiration
      • Old data is aged out by Collection Aliases
      • Remap only the latest collections to an alias
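
The alias scheme on slide 31 can be illustrated with a small model. Everything named here is hypothetical (the `syslog_YYYY-MM-DD` naming, the `syslog_recent` alias, the three-day window); the sketch only shows how remapping an alias ages out old daily collections without touching their data.

```python
from datetime import date, timedelta


def daily_collections(prefix: str, newest: date, days: int) -> list:
    """Names of the most recent daily collections, newest first."""
    return [
        f"{prefix}_{(newest - timedelta(days=offset)).isoformat()}"
        for offset in range(days)
    ]


# Queries go to the alias; the alias maps to the last three daily collections.
aliases = {"syslog_recent": daily_collections("syslog", date(2013, 6, 16), 3)}

# Next day: remap the alias. The oldest collection drops out of the search path.
aliases["syslog_recent"] = daily_collections("syslog", date(2013, 6, 17), 3)
print(aliases["syslog_recent"])
```

Clients keep querying the same alias name throughout, so expiration becomes a metadata change rather than a delete-by-query on a large index.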
  32. Solr
      • No multi-datacenter cluster support
      HDFS
      • No multi-datacenter cluster support
      Options?
      • All our services must survive a DC outage
      • . . so should logging and indexing
  33. [Diagram: two-tier topology with collector tier in DC 1, DC 2 … DC N (syslog, log4j, file) and storage tier in DC 1 and DC 2 (Flume, HDFS, SolrCloud)]
      • Planned or unplanned outage
      • Flume Collector disk channel buffering DC1 events
      • DC1 Hadoop cluster back online after outage
      • Replicate aggregate data
  34. [Diagram: two-tier topology with collector tier in DC 1, DC 2 … DC N (syslog, log4j, file) and storage tier in DC 1 and DC 2, linked by distcp]
      • Data sent only to a single DC
      • distcp
      • Manual CNAME change to DC2
      • Flume buffering events at collector tier
      • DC1 back online, sync data from DC2
      • Flip distcp the other way
      • Create indexes with M/R
      • DNS CNAME change back to DC1
  35. Tiers to scale
      • Flume Collector tier
      • Flume Storage tier
      • SolrCloud
  36. [Diagram: scaling the Flume Collector tier]
      100 to 5000 servers per datacenter, each running an agent
      More agents and data: add Flume Collector servers, each with the replicating fan-out flow (Avro src, File Channel 1, File Channel 2, DC1 Avro sink, DC2 Avro sink)
      FileChannel: 14MB/sec
      NIC: 100MB/sec
      Max per server: 14MB/s, 1.2 TB/day, 70k events/s
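
The per-server figures on slide 36 are consistent with each other, which the arithmetic below checks. Decimal units (1 MB = 10^6 bytes) are assumed, and the `collectors_needed` helper with its 100 MB/s example load is a hypothetical sizing aid, not something stated in the talk.

```python
import math

MB = 10**6  # assumption: decimal megabytes

channel_bytes_per_sec = 14 * MB   # FileChannel limit from the slide
events_per_sec = 70_000           # per-server event rate from the slide

tb_per_day = channel_bytes_per_sec * 86_400 / 10**12
avg_event_bytes = channel_bytes_per_sec / events_per_sec


def collectors_needed(total_mb_per_sec: float, per_server_mb_per_sec: float = 14) -> int:
    """Smallest number of collector servers that can absorb the given load."""
    return math.ceil(total_mb_per_sec / per_server_mb_per_sec)


print(tb_per_day, avg_event_bytes, collectors_needed(100))
```

The numbers imply an average event of about 200 bytes, and that the File Channel, not the 100MB/sec NIC, is the per-server bottleneck.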
  37. [Diagram: scaling the Flume Storage tier]
      DC 1, DC 2 … DC N collectors each have Avro sink 1, Avro sink 2 … Avro sink N
      DC 1 and DC 2 storage tiers run multiple Flume servers, each with the multiplexing fan-out flow (Avro src, File Chan1, File Chan2, HDFS sink, Solr sink)
      Max per server: 14MB/s, 1.2 TB/day, 70k events/s
  38. [Diagram: SolrCloud cluster with Shard1 (leader1, replica1), Shard2 (leader2, replica2) and Shard3 (leader3, replica3); ZooKeeper ensemble zk1, zk2, zk3; pluggable filesystem (local, HDFS); inputs are new logs to index and search queries]
      1000 tx/sec/core
      2x8 cores: 16k tx/sec
      3 shards: 3 x 16k = 48k tx/sec
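
The SolrCloud capacity estimate on slide 38 is a simple product, sketched below. It assumes, as the slide appears to, that throughput scales linearly with cores and shards; the variable names are illustrative.

```python
tx_per_sec_per_core = 1000   # from the slide
cores_per_server = 2 * 8     # two sockets with eight cores each
shards = 3

tx_per_sec_per_shard = tx_per_sec_per_core * cores_per_server
cluster_tx_per_sec = shards * tx_per_sec_per_shard
print(tx_per_sec_per_shard, cluster_tx_per_sec)
```

Under the same linear assumption, adding shards is the lever for raising indexing capacity.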
  39. Central syslog servers
      • Network and OS system messages forwarded to several central syslog servers
      Forward syslog to Solr using Flume Morphline SolrSink
      • Parse messages with Morphline and grok patterns
      SolrCloud
      • Index log lines as documents into a Collection (i.e. index)
      HUE Solr search
      • Simple UI to build a customized search page layout with faceting and sorting
      • Easy drill-down with multiple facets: severity, datacenter, hostname, etc.
  40. Screen shots
  41. Search by time
      Sort by selected field
      Facets by selected fields
  42. Wildcard query by field
      Highlight the query keywords
  43. Data sources: REST/JSON, log4j, syslog, Avro, Thrift
      Parsing: Cloudera Morphlines
      NRT indexing: SolrCloud embedded in CDH
      Batch indexing: MapReduce
      Analytics: use your favorite tool; raw detailed data is stored in HDFS
  44. email: ari.flink@webex.com
      twitter: @raaka
  45. Thank you.
