Dr. Stefan Schadwinkel and Mike Lohmann
Who we are.
Log everything
Mike Lohmann
Architecture
Author (PHPMagazin, IX, heise.de)
Dr. Stefan Schadwinkel
Analytics
Author (heise.de, Cereb.Cortex, EJN, J.Neurophysiol.)
Agenda.
 What we did. What we do.
 Log everything! - Our way from Requirement to Solution
 Infrastructure and technologies: Simple, Scalable, Open Source
 Happy business users.
What we did.
 Creating & operating education communities
 Web applications
 Multi-language
 Different market rules in different countries
 Consolidating the technological basis for multiple (new) products
DECK36 GmbH & Co. KG
 DECK36 is a young spin-off from ICANS
 7 core engineers with longstanding expertise
(operate, scale, automate, analyze)
 Consulting and engineering services for the
etruvian group and external customers
PokerStrategy.com in numbers
 6,000,000 registered users
 Education since 2005
 19 languages
 2,800,000 page impressions per day
 700,000 posts per day
Moving on…
 Build more Education communities like PokerStrategy…
 Assume PokerStrategy KPIs(?)
 Other Business models
 Add mobile and the social web…
 Our requirement: Log everything!
Logging Tools / Technologies
9/22/2013
Producer
 Web/Mobile Apps
 JS Frontend
 Servers
 Databases
Transport
 Now: RabbitMQ + Erlang Consumer, OR Kafka + any other consumer
 Was: Flume
Storage
 Now: S3 Storage + Hadoop with EMR, OR any other storage
 Was: Virtualized in-house Hadoop
Analytics
 MapReduce with Hive/Pig
 Results in any format: Excel, QlikView, RDBMS, ...
 Realtime Datastream Analytics: Storm / Trident
Logging Infrastructure
[Architecture diagram. Columns: Producer, Transport, Storage, Analytics. Components shown: Apps 1-x, databases and servers; RabbitMQ and consumer; S3 and a Hadoop cluster; RDBMS; Excel, QlikView, Tableau, SASS, ...; Graylog and Zabbix; NodeJS; Realtime Datastream Analytics (Storm) with Nimbus (master), three Zookeeper nodes, three Supervisors and three Workers.]
Producer
[Diagram. Elements shown: a request for /Home, Page Controller, PageHit-Event, PageHit Event Listener, Logger::log(), Monolog-Logger (Processor, Handler, Formatter), LogMessage as JSON, local RabbitMQ, Shovel.]
Producer JS (in progress)
[Diagram. Elements shown: JS Client, Local Storage, a "Tracks Event" trigger on /Home, WebSocket, DataCollector (NodeJS), Validator, local RabbitMQ, Shovel.]
Producer
 LoggingComponent: Provides interfaces, filters and handlers
 LoggingBundle: Glues everything together with Symfony2
 Drupal Logging Module: Using the LoggingComponent
 JS Frontend Client: LogClient for Browsers (in progress)
https://github.com/ICANS/IcansLoggingComponent
https://github.com/ICANS/IcansLoggingBundle
https://github.com/ICANS/drupal-logging-module
https://github.com/DECK36/starlog-js-frontend-client
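The producer side boils down to wrapping an application event in a log message and serializing it as JSON before handing it to the transport. The following is only a minimal Python sketch of that idea, not the API of the components listed above; the field names (`type`, `website`, `host`, `created_at`, `payload`) and the function names are assumptions made for illustration.

```python
import json
import socket
from datetime import datetime, timezone


def build_log_message(event_type, payload, website, now=None):
    """Wrap an application event in a JSON-serializable envelope.

    All field names here are hypothetical; the real message schema is
    not shown in the slides.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "type": event_type,            # e.g. "pagehit" (made-up name)
        "website": website,
        "host": socket.gethostname(),
        "created_at": now.isoformat(),
        "payload": payload,
    }


def format_log_message(message):
    """Serialize to one compact JSON line, ready to hand to a transport."""
    return json.dumps(message, sort_keys=True, separators=(",", ":"))


if __name__ == "__main__":
    hit = build_log_message("pagehit", {"path": "/Home"}, website="example")
    print(format_log_message(hit))
```

Keeping one message per line makes the stream easy to split and re-group later, whatever transport or storage sits behind it.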
Transport
 1st Solution: Flume
+ Part of the Hadoop Ecosystem
+ Flexible Central config, Extensible via Plugins
- Not mature software (flume, flume-ng, plugin interfaces, ..)
- Central config has problems with Puppet
 2nd Solution: RabbitMQ
+ Local RabbitMQ  Cluster
+ Decentralized config (producers & consumers simply connect)
- HDFS Sink not pre-packaged
Storage
 1st Solution: Self-hosted Hadoop
- Virtualized Infrastructure makes HDFS redundant
- High costs (cluster always running, admin work)
 2nd Solution: Cloud Storage
+ Amazon S3
+ Elastic MapReduce: Hadoop on demand
+ Cost-effective (pay only for what you use)
Compaction
 RabbitMQ consumer (Erlang) stores data to cloud
 Yet: we have a mixed message stream, but want:
s3://[BUCKET]/icanslog/[WEBSITE]/icans.content/year=2012/month=10/day=01/part-00000.lzo
 MapReduce:
 Streaming (stdin/stdout to any tool)
 Computation (Hive, Pig, Cascalog, etc.)
 Amazon Redshift
 PostgreSQL-compatible Data Warehouse
Hive Partitioning!
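The target layout encodes the partition columns directly in the path (`year=…/month=…/day=…`), which is the directory convention Hive uses for partitioned tables. As a rough sketch, assuming each message carries a type and a creation timestamp, the prefix could be derived like this in Python; the function name and its parameters are hypothetical.

```python
from datetime import datetime, timezone


def partition_prefix(bucket, website, message_type, created_at):
    """Return the Hive-style partition prefix for one log message.

    Mirrors the path pattern shown on the slide; the part-file name
    (part-00000.lzo) is left to the job that writes the data.
    """
    return (
        f"s3://{bucket}/icanslog/{website}/{message_type}/"
        f"year={created_at:%Y}/month={created_at:%m}/day={created_at:%d}/"
    )


if __name__ == "__main__":
    ts = datetime(2012, 10, 1, 13, 37, tzinfo=timezone.utc)
    print(partition_prefix("my-bucket", "example", "icans.content", ts))
```

With this layout a query restricted to one day only has to read the files under that day's directory.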
Analytics
 Cascalog is Clojure, Clojure is Lisp
(?<- (stdout) [?person] (age ?person ?age) … (< ?age 30))
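 ?<- is the query operator
 (stdout) is a Cascading output tap
 [?person] names the columns of the dataset generated by the query
 (age ?person ?age) is a "generator"; (< ?age 30) is a "predicate"
 Use as many generators and predicates as you want
 Both can be any Clojure function
 Clojure can call anything that is available within a JVM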
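For readers who do not know Lisp, the query reads as "bind `?person` and `?age` from the `age` relation, keep the rows where `?age` is below 30, and output `?person`". A loose Python analogue, with sample rows invented purely for illustration, might look like this:

```python
# Made-up sample relation of (person, age) rows; not data from the talk.
AGE = [("alice", 28), ("bob", 33), ("chris", 25)]


def people_younger_than(age_relation, limit):
    """Generator part: iterate the relation; predicate part: age < limit.

    Only the person column is emitted, like [?person] in the query.
    """
    return [person for person, age in age_relation if age < limit]


if __name__ == "__main__":
    print(people_younger_than(AGE, 30))
```

The difference is that Cascalog compiles such a query into a Cascading flow that runs on Hadoop, whereas this list comprehension runs in a single process.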
Analytics
• We use Cascalog to preprocess and organize that incoming flow of log messages:
Analytics
 Let's run the Cascalog processing on Amazon EMR:
./elastic-mapreduce --create --name "Log Message Compaction" \
  --bootstrap-action s3://[BUCKET]/mapreduce/configure-daemons \
  --num-instances $NUM \
  --slave-instance-type m1.large \
  --master-instance-type m1.large \
  --jar s3://[BUCKET]/mapreduce/compaction/icans-cascalog.jar \
  --step-action TERMINATE_JOB_FLOW \
  --step-name "Cascalog" \
  --main-class icans.cascalogjobs.processing.compaction \
  --args "s3://[BUCKET]/incoming/*/*/*/","s3://[BUCKET]/icanslog","s3://[BUCKET]/icanslog-error"
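Judging by their names, the three arguments passed to the job are an input location, an output location, and a location for messages that could not be processed. The Cascalog job itself is not reproduced in this transcript; the Python sketch below only illustrates the general shape of such a compaction step, assuming JSON lines with hypothetical `type` and `created_at` fields.

```python
import json
from collections import defaultdict
from datetime import datetime


def compact(lines):
    """Group raw JSON lines by (type, day); collect bad lines separately.

    Returns (partitions, errors), where partitions maps a
    (message type, ISO date) key to the lines that belong there.
    """
    partitions = defaultdict(list)
    errors = []
    for line in lines:
        try:
            message = json.loads(line)
            day = datetime.fromisoformat(message["created_at"]).date()
            key = (str(message["type"]), day.isoformat())
        except (ValueError, KeyError, TypeError):
            # Unparseable or incomplete messages go to the error output.
            errors.append(line)
            continue
        partitions[key].append(line)
    return dict(partitions), errors


if __name__ == "__main__":
    raw = [
        '{"type": "icans.content", "created_at": "2012-10-01T13:37:00"}',
        '{"type": "icans.content", "created_at": "2012-10-02T08:00:00"}',
        "not json",
    ]
    parts, bad = compact(raw)
    print(sorted(parts), bad)
```

Routing broken messages to their own output keeps one malformed line from failing the whole run and leaves it available for later inspection.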
Analytics
 Now we can access the log data within Hive and store results again to S3:
Analytics
 Now, get the stats by executing a query:
 We can now simply copy the data from S3 and import it into any local analytical tool
 Excel, Redshift, QlikView, R, etc.
Realtime Datastream Analytics
• Storm: Hadoop for realtime analytics
• Rock solid HA concept
• Highly scalable
• Can:
Process streams (and trigger events)
Provide DRPC functionality
Handle enormous data loads
• Fancy names for modules
(spouts/bolts/tuple/topology)
• Easy to use
Small and easy to understand API
DevMode
• Add new topologies at run time
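The "fancy names" map onto a simple dataflow: a spout emits tuples, bolts consume tuples and emit new ones, and a topology wires them together. The sketch below imitates that model with plain Python generators to show the concept only. It does not use the Storm API, and the event shape and function names are assumptions.

```python
from collections import Counter


def page_hit_spout(events):
    """Spout: turn each incoming event into a one-field tuple."""
    for event in events:
        yield (event["path"],)


def count_bolt(tuples):
    """Bolt: keep a running count per path and emit (path, count)."""
    counts = Counter()
    for (path,) in tuples:
        counts[path] += 1
        yield (path, counts[path])


def run_topology(events):
    """Topology: wire the spout into the bolt and drain the stream."""
    return list(count_bolt(page_hit_spout(events)))


if __name__ == "__main__":
    hits = [{"path": "/Home"}, {"path": "/Home"}, {"path": "/News"}]
    for item in run_topology(hits):
        print(item)
```

In Storm the same wiring is distributed across worker processes and runs continuously on an unbounded stream, which this single-process sketch does not attempt to model.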
Happy business users!
 Questions they ask often can be automated (ETL, Reports)
 New questions can be explored (Ad-hoc, Search)
 Insights can be used as feedback into the system (Decisions, Websockets)
 Data-driven applications can be created that can be used by multiple websites or
tailored to individual needs.
Thank you.
Questions
?
Contacts.
Dr. Stefan Schadwinkel
stefan.schadwinkel@deck36.de
ICANS_StScha
Mike Lohmann
mike.lohmann@deck36.de
mikelohmann
DECK36 GmbH & Co. KG
Valentinskamp 18
20354 Hamburg
Germany
Phone: +49 40 22 63 82 9-0
Fax: +49 40 38 67 15 92
Web: www.deck36.de

Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Logging Infrastructure for Web and Mobile Apps

  • 1. Dr. Stefan Schadwinkel and Mike Lohmann
  • 2. Who we are. (Log everything)
      • Mike Lohmann: Architecture. Author (PHPMagazin, IX, heise.de)
      • Dr. Stefan Schadwinkel: Analytics. Author (heise.de, Cereb. Cortex, EJN, J. Neurophysiol.)
  • 3. Agenda.
      • What we did. What we do.
      • Log everything! - Our way from requirement to solution
      • Infrastructure and technologies: simple, scalable, open source
      • Happy business users.
  • 4. What we did.
      • Creating & operating education communities
          • Web applications
          • Multi-language
          • Different market rules in different countries
      • Consolidating the technological basis for multiple (new) products
  • 5. DECK36 GmbH & Co. KG
      • DECK36 is a young spin-off from ICANS
      • 7 core engineers with longstanding expertise (operate, scale, automate, analyze)
      • Consulting and engineering services for the etruvian group and external customers
  • 6. PokerStrategy.com in numbers
      • Education since 2005
      • 6,000,000 registered users
      • 19 languages
      • 2,800,000 PI (page impressions) per day
      • 700,000 posts per day
  • 7. Moving on…
      • Build more education communities like PokerStrategy…
          • Assume PokerStrategy KPIs(?)
          • Other business models
      • Add mobile and the social web…
      • Our requirement: Log everything!
  • 8. Logging Tools / Technologies (9/22/2013)
      • Producer: web/mobile apps, JS frontend, servers, databases
      • Transport
          • Now: RabbitMQ + Erlang consumer, OR Kafka + any other consumer
          • Was: Flume
      • Storage
          • Now: S3 storage + Hadoop with EMR, OR any other storage
          • Was: virtualized in-house Hadoop
      • Analytics: MapReduce with Hive/Pig; results in any format (Excel, QlikView, RDBMS, ...)
      • Realtime Datastream Analytics: Storm / Trident
  • 9. Logging Infrastructure (diagram)
      • Columns: Producer, Transport, Storage, Analytics
      • Components: Apps 1-x, databases and servers, RabbitMQ, consumer, S3, Hadoop cluster, RDBMS, Graylog, Zabbix
      • Analytics outputs: Excel, QlikView, Tableau, SASS, ...
      • Realtime Datastream Analytics (Storm): Nimbus (master), three Zookeeper nodes, three Supervisors, three Workers, NodeJS
  • 11. Producer: JS (in progress) (diagram)
      • Components: JS client, DataCollector (NodeJS), Validator, local storage, local RabbitMQ, Shovel
      • Labels: "Tracks event /Home", "Trigger", "WebSocket"
  • 12. Producer
      • LoggingComponent: provides interfaces, filters and handlers
      • LoggingBundle: glues everything together with Symfony2
      • Drupal Logging Module: uses the LoggingComponent
      • JS Frontend Client: LogClient for browsers (in progress)
      • Sources:
          https://github.com/ICANS/IcansLoggingComponent
          https://github.com/ICANS/IcansLoggingBundle
          https://github.com/ICANS/drupal-logging-module
          https://github.com/DECK36/starlog-js-frontend-client
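The split into filters and handlers named on this slide can be illustrated with a small sketch. It uses Python's standard `logging` module, not the PHP LoggingComponent itself, so none of it reflects that component's actual API. The class names, the logger name `starlog.demo`, and the `event_type` attribute are hypothetical; the in-memory handler only stands in for something that would publish to the transport.

```python
import json
import logging


class EventTypeFilter(logging.Filter):
    """Pass a record only if it carries an allowed event type (hypothetical rule)."""

    def __init__(self, allowed_types):
        super().__init__()
        self.allowed_types = set(allowed_types)

    def filter(self, record):
        return getattr(record, "event_type", None) in self.allowed_types


class JsonListHandler(logging.Handler):
    """Serialize each record to JSON and keep it in memory.

    Assumption: a production handler would publish to the transport
    (for example a local RabbitMQ) instead of appending to a list.
    """

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(json.dumps({
            "type": record.event_type,
            "message": record.getMessage(),
            "created": record.created,
        }))


def build_logger(name, allowed_types):
    """Wire one filter and one handler onto a named logger."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = JsonListHandler()
    handler.addFilter(EventTypeFilter(allowed_types))
    logger.addHandler(handler)
    return logger, handler


logger, handler = build_logger("starlog.demo", ["icans.content"])
logger.info("page viewed", extra={"event_type": "icans.content"})
logger.info("noise", extra={"event_type": "debug.trace"})  # dropped by the filter
```

Keeping the filter separate from the handler means the same rule could be reused in front of a different sink without touching the serialization code.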
  • 13. Transport
      • 1st solution: Flume
          + Part of the Hadoop ecosystem
          + Flexible central config, extensible via plugins
          - Not mature software (flume, flume-ng, plugin interfaces, ...)
          - Central config has problems with Puppet
      • 2nd solution: RabbitMQ
          + Local RabbitMQ → cluster
          + Decentralized config (producers & consumers simply connect)
          - HDFS sink not pre-packaged
  • 14. Storage
      • 1st solution: Self-hosted Hadoop
          - Virtualized infrastructure makes HDFS redundant
          - High costs (cluster always running, admin work)
      • 2nd solution: Cloud storage
          + Amazon S3
          + Elastic MapReduce: Hadoop on demand
          + Cost-effective (pay only for what you use)
  • 15. Compaction
      • RabbitMQ consumer (Erlang) stores data to the cloud
      • So far we have a mixed message stream, but we want Hive partitioning:
          s3://[BUCKET]/icanslog/[WEBSITE]/icans.content/year=2012/month=10/day=01/part-00000.lzo
      • MapReduce:
          • Streaming (stdin/stdout to any tool)
          • Computation (Hive, Pig, Cascalog, etc.)
      • Amazon Redshift
          • PostgreSQL-compatible data warehouse
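The target layout on this slide is a Hive-style partition path (`year=…/month=…/day=…`). As a rough sketch of what compaction has to compute for each message, the Python below derives that prefix from a message's website, type and timestamp, then groups a mixed stream by prefix. The function names and the message fields (`website`, `type`, `timestamp`) are assumptions for illustration, as is the use of UTC; the talk's actual job is written in Cascalog and is not reproduced here.

```python
from datetime import datetime, timezone


def partition_prefix(bucket, website, message_type, timestamp):
    """Build the Hive-style partition prefix for one message.

    `timestamp` is assumed to be seconds since the epoch, interpreted as UTC.
    """
    day = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return (
        f"s3://{bucket}/icanslog/{website}/{message_type}/"
        f"year={day:%Y}/month={day:%m}/day={day:%d}/"
    )


def group_by_partition(bucket, messages):
    """Split a mixed message stream into one list per partition prefix."""
    groups = {}
    for msg in messages:
        key = partition_prefix(bucket, msg["website"], msg["type"], msg["timestamp"])
        groups.setdefault(key, []).append(msg)
    return groups


# Hypothetical sample data: two message types arriving interleaved.
noon = datetime(2012, 10, 1, 12, 0, tzinfo=timezone.utc).timestamp()
stream = [
    {"website": "example.com", "type": "icans.content", "timestamp": noon},
    {"website": "example.com", "type": "icans.search", "timestamp": noon},
    {"website": "example.com", "type": "icans.content", "timestamp": noon + 60},
]
groups = group_by_partition("my-bucket", stream)
```

Because the partition values are encoded in the path, a query restricted to one day only needs to read the files under that day's prefix.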
  • 16. Analytics
      • Cascalog is Clojure, Clojure is Lisp
          (?<- (stdout) [?person] (age ?person ?age) … (< ?age 30))
      • Annotations on the query:
          • ?<- : query operator
          • (stdout) : Cascading output tap
          • [?person] : columns of the dataset generated by the query
          • (age ?person ?age) : "generator"
          • (< ?age 30) : "predicate"
      • Generators and predicates: as many as you want; both can be any Clojure function
      • Clojure can call anything that is available within a JVM
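For readers unfamiliar with Lisp syntax, the query on this slide can be read as "emit every person whose age is below 30". The Python below is only an analogy for that reading, not Cascalog and not how Cascalog executes a query on Hadoop. The `AGE` rows are made-up sample data.

```python
# Hypothetical rows standing in for the `age` generator: (person, age).
AGE = [("alice", 28), ("bob", 33), ("chris", 40), ("david", 25)]


def younger_than(limit):
    """Predicate factory: mirrors (< ?age 30) for an arbitrary limit."""
    return lambda age: age < limit


def query(rows, predicate):
    """Bind ?person and ?age from each row, keep rows passing the predicate,
    and emit only the ?person column."""
    return [person for person, age in rows if predicate(age)]


result = query(AGE, younger_than(30))
print(result)
```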
  • 17. Analytics
      • We use Cascalog to preprocess and organize the incoming flow of log messages.
  • 18. Analytics
      • Let's run the Cascalog processing on Amazon EMR:
          ./elastic-mapreduce --create --name "Log Message Compaction" \
            --bootstrap-action s3://[BUCKET]/mapreduce/configure-daemons \
            --num-instances $NUM \
            --slave-instance-type m1.large \
            --master-instance-type m1.large \
            --jar s3://[BUCKET]/mapreduce/compaction/icans-cascalog.jar \
            --step-action TERMINATE_JOB_FLOW \
            --step-name "Cascalog" \
            --main-class icans.cascalogjobs.processing.compaction \
            --args "s3://[BUCKET]/incoming/*/*/*/","s3://[BUCKET]/icanslog","s3://[BUCKET]/icanslog-error"
  • 19. Analytics
      • Now we can access the log data within Hive and store the results in S3 again.
  • 20. Analytics
      • Now, get the stats by executing a query.
      • We can now simply copy the data from S3 and import it into any local analytical tool
          • Excel, Redshift, QlikView, R, etc.
  • 21. Realtime Datastream Analytics
      • Storm: Hadoop for realtime analytics
      • Rock-solid HA concept
      • Highly scalable
      • Can:
          • Process streams (and trigger events)
          • Provide DRPC functionality
          • Work on enormous data loads
      • Fancy names for modules (spouts/bolts/tuple/topology)
      • Easy to use
          • Small and easy-to-understand API
          • DevMode
      • Add new topologies at run time
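The module names on this slide map onto a simple idea: a spout emits tuples, bolts transform or aggregate them, and a topology wires spouts and bolts together. The single-process Python sketch below shows only that data flow. It does not use Storm's API and has none of Storm's distribution, fault tolerance or scaling; all function names and the event fields are hypothetical.

```python
from collections import Counter


def event_spout(events):
    """Spout: the source of the stream; emits one tuple per event."""
    for event in events:
        yield (event["website"], event["type"])


def filter_bolt(stream, wanted_type):
    """Bolt: keeps only tuples of one event type and emits the website."""
    for website, event_type in stream:
        if event_type == wanted_type:
            yield (website,)


def count_bolt(stream):
    """Bolt: aggregates the stream into a count per website."""
    counts = Counter()
    for (website,) in stream:
        counts[website] += 1
    return counts


def run_topology(events, wanted_type):
    """Topology: the wiring spout -> filter bolt -> count bolt."""
    return count_bolt(filter_bolt(event_spout(events), wanted_type))


# Hypothetical sample events.
events = [
    {"website": "a.example", "type": "icans.content"},
    {"website": "b.example", "type": "icans.content"},
    {"website": "a.example", "type": "icans.search"},
    {"website": "a.example", "type": "icans.content"},
]
counts = run_topology(events, "icans.content")
```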
  • 23. Happy business users!
      • Questions they ask often can be automated (ETL, reports)
      • New questions can be explored (ad hoc, search)
      • Insights can be fed back into the system (decisions, WebSockets)
      • Data-driven applications can be created that serve multiple websites or are tailored to individual needs.
  • 27. DECK36 GmbH & Co. KG, Valentinskamp 18, 20354 Hamburg, Germany. Phone: +49 40 22 63 82 9-0. Fax: +49 40 38 67 15 92. Web: www.deck36.de