ELASTICSEARCH @SYNTHESIO
FRED DE VILLAMIL, DIRECTOR OF
INFRASTRUCTURE
@FDEVILLAMIL
BACKGROUND
• FRED DE VILLAMIL, 38 YEARS OLD, DIRECTOR OF INFRASTRUCTURE
@SYNTHESIO
• LINUX / (FREE)BSD SINCE 1996
• OPEN SOURCE CONTRIBUTOR SINCE 1998
• HAS RUN ELASTICSEARCH IN PRODUCTION SINCE 0.17.6
ABOUT SYNTHESIO
• Synthesio is the leading social intelligence tool for
social media monitoring & social analytics.
• Synthesio crawls the Web for relevant data,
enriches it with sentiment analysis and
demographics to build social analytics dashboards.
ELASTICSEARCH @SYNTHESIO, SEPTEMBER 2016
• 5 clusters, 163 physical servers, 400TB storage,
10.2TB RAM
• 75B indexed documents, 200TB data
• 1.8B indexed documents each month: mix of Web
pages, forums and social media posts
POWERING 13000 DASHBOARDS WITH ELASTICSEARCH
DEC. 2014: THE MYSQL NIGHTMARE
• Cross-cluster queries on 3 massive Galera clusters
• Up to 50M rows fetched from a massive 4B-row
reference table
• Then a cross-cluster join on a 20TB, 35B-record
monolithic MySQL database
• Poor performance, frequent timeouts
JAN. 2015: CLIPPING REVOLUTION
• 1 global index, 512 shards, 5B documents
• 1000 new documents / second
• 47 servers running Elasticsearch 1.3.2 then 1.3.9
• Capacity: 37TB, 24TB data, 2.62TB RAM
CLUSTER TYPOLOGY
• 2 QUERY NODES: VIRTUAL MACHINES, 4 CORE, 8GB RAM
EACH
• 3 MASTER NODES: VIRTUAL MACHINES, 4 CORE, 8GB RAM
• 42 DATA NODES: PHYSICAL SERVERS, 6 CORE XEON
E5-1650 V2, 3*900GB SSD IN RAID 0, 64GB RAM
CLIPPING REVOLUTION DATA MODEL
• ROUTING ON A MONTHLY
BASIS
• EACH CRAWLED DOCUMENT
IS INDEXED WITH NESTED
DASHBOARD IDS.
• QUERIES ON TIME PERIOD
+ DASHBOARD ID
{ "document": {
    "dashboards": [
      { "dashboard_id": 1 },
      { "dashboard_id": 2 }
    ]
  }
}
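A minimal sketch of the "time period + dashboard ID" query against this model, written as a Python dict in the Elasticsearch query DSL. It assumes that `dashboards` is mapped as a `nested` field and that the document date lives in a field named `date`; both the field name and the `build_dashboard_query` helper are illustrative assumptions, not Synthesio's actual code.

```python
def build_dashboard_query(dashboard_id, start, end):
    """Match documents attached to one dashboard within a time period."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Time period restriction (assumed field name: "date").
                    {"range": {"date": {"gte": start, "lte": end}}},
                    # Dashboard restriction on the nested dashboard IDs.
                    {
                        "nested": {
                            "path": "dashboards",
                            "query": {
                                "term": {"dashboards.dashboard_id": dashboard_id}
                            },
                        }
                    },
                ]
            }
        }
    }
```

Because every document of a month shares the same routing, a query like this still has to visit every shard holding that month, which is part of what the next slide complains about.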
PROBLEMS
• TOO MANY SHARDS (WAS MEANT TO BE A WEEKLY ROUTING)
• 500GB TO 900GB SHARDS (!!!) GROWING AFTER THE
MONTH IS OVER. 3 HOURS FOR A REALLOCATION
• A ROLLING RESTART TAKES 3 FULL DAYS (IF WE’RE LUCKY)
• GARBAGE COLLECTOR NIGHTMARE, CONSTANTLY FLAPPING
CLUSTER
MMAPFS VS NIOFS
• MMAPFS: MAPS LUCENE FILES INTO VIRTUAL MEMORY
USING MMAP. NEEDS AS MUCH VIRTUAL ADDRESS SPACE AS THE
FILES BEING MAPPED
• NIOFS: READS LUCENE FILES THROUGH JAVA NIO, LETS SEVERAL
THREADS READ THE SAME FILE CONCURRENTLY AND
RELIES ON THE FILE SYSTEM CACHE
CMS VS G1GC
• CMS: SHARES CPU TIME WITH THE APPLICATION. “STOPS
THE WORLD” WHEN THERE IS TOO MUCH MEMORY TO CLEAN, UNTIL IT
THROWS AN OUTOFMEMORYERROR
• G1GC: SHORTER, MORE FREQUENT PAUSES. WON’T FREEZE A
NODE LONG ENOUGH FOR IT TO LEAVE THE CLUSTER
G1GC OPTIONS
MaxGCPauseMillis=200: ALLOWS LONGER GARBAGE
COLLECTION PAUSES (200 MS TARGET)
GCPauseIntervalMillis=1000: BUT LESS FREQUENT ONES (1000 MS TARGET INTERVAL)
InitiatingHeapOccupancyPercent=35: STARTS
COLLECTING WHEN THE HEAP IS 35% USED
FIELD DATA CACHE EXPIRE
• FORCES ELASTICSEARCH TO PERIODICALLY EMPTY ITS INTERNAL
FIELDDATA CACHE
• OVERLAPS THE GARBAGE COLLECTOR JOB
• PERFORMANCE ISSUES WITH FREQUENTLY ACCESSED DATA
• USE OF FIELDDATA CIRCUIT BREAKERS TO STOP GREEDY QUERIES
• ELASTIC SAYS NEVER DO THIS!!! BUT IT FIXED OUR BIGGEST
PROBLEM
MORE PROBLEMS
• IMMUTABLE, MONOLITHIC MAPPING: NEW FEATURE BLOCKED
UNTIL WE FIX IT
• IMPOSSIBLE TO DELETE A DASHBOARD WITHOUT
REINDEXING A WHOLE MONTH
• 20% DELETED DOCUMENTS WASTING 3TB
IMMUTABLE MAPPING AND DELETED DATA
• SEGMENTS : IMMUTABLE FILES USED BY LUCENE TO WRITE
ITS DATA. UP TO 2500 / SHARD (!!!)
• NO REAL DELETE: UPDATED AND DELETED DOCUMENTS GET
THE DELETED FLAG
• ELASTICSEARCH _OPTIMIZE: MERGES A SHARD'S SEGMENTS INTO
1 AND PURGES DELETED DOCS
• BUT: REQUIRES 150% OF THE SHARD SIZE ON DISK
MERGE AND DELETE
JAN. 2015: BLINK
• 13200 indexes, 12B documents
• 5500 new documents / second
• 3 clusters, 75 physical servers running
Elasticsearch 1.7.5
• Capacity: 187.5TB, 48TB data, 4.7TB RAM
CLUSTERS TYPOLOGY
4 CORE XEON D-1520, 64GB RAM SERVERS
• 2 QUERY NODES
• 3 MASTER NODES
• 20 DATA NODES: 4*800GB SSD IN RAID 0
NEW PRODUCT DESIGN
• 1 INDEX / DASHBOARD, 1 SHARD / 5 MILLION DOCS
• VERSIONED MAPPING: MAPPING_ID__DASHBOARD_ID
• MULTIPLE MAPPING VERSIONS OF A DASHBOARD IN PARALLEL
• MAPPING UPGRADE AND REINDEX WITHOUT INTERRUPTION
• BALDUR FOR DASHBOARDS ROUTING
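The naming and sizing rules above can be sketched as two small helpers. The function names are hypothetical, and rounding up with a minimum of one shard is an assumption about how "1 shard / 5 million docs" is applied.

```python
import math

DOCS_PER_SHARD = 5_000_000  # "1 shard / 5 million docs" from the slide


def index_name(mapping_id, dashboard_id):
    """Versioned index name: MAPPING_ID__DASHBOARD_ID."""
    return f"{mapping_id}__{dashboard_id}"


def shard_count(expected_docs):
    """Number of shards for a dashboard index (assumed: round up, at least 1)."""
    return max(1, math.ceil(expected_docs / DOCS_PER_SHARD))
```

For example, a dashboard expected to hold 12 million documents under mapping version 3 would get index `3__<dashboard_id>` with 3 shards.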
BALDUR
BALDUR IN A NUTSHELL
1. THE API SERVER SENDS AN ELASTICSEARCH QUERY
2. BALDUR INTERCEPTS THE QUERY AND GETS THE
DASHBOARD CLUSTER ID AND ACTIVE MAPPING VERSION
3. BALDUR ROUTES THE QUERY TO THE CLUSTER HOSTING
THE DASHBOARD DATA
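The three steps above can be sketched as a lookup followed by a rewrite of the target. This is only an illustration: the `Router` class, the in-memory dictionaries standing in for Baldur's database, and the URLs are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    cluster_url: str  # where the query must be sent
    index: str        # MAPPING_ID__DASHBOARD_ID index to query


class Router:
    """Toy stand-in for the routing step (hypothetical data layout)."""

    def __init__(self, clusters, dashboards):
        self.clusters = clusters      # cluster_id -> base URL
        self.dashboards = dashboards  # dashboard_id -> (cluster_id, active mapping_id)

    def route(self, dashboard_id):
        # Step 2: look up the hosting cluster and active mapping version.
        cluster_id, mapping_id = self.dashboards[dashboard_id]
        # Step 3: point the query at the cluster and index hosting the data.
        return Route(self.clusters[cluster_id], f"{mapping_id}__{dashboard_id}")
```

Keeping the lookup outside the API server means a dashboard can move to another cluster or mapping version without the API noticing.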
ADDING A MAPPING VERSION
• THE INDEXER CREATES A NEW NEW_MAPPING_ID__DASHBOARD_ID
INDEX
• THE INDEXER ADDS A LINE IN BALDUR’S DATABASE WITH THE
DASHBOARD AND MAPPING IDS
• THE INDEXER INDEXES INTO BOTH MAPPING_ID__DASHBOARD_ID AND
NEW_MAPPING_ID__DASHBOARD_ID
• WHEN NEW_MAPPING_ID__DASHBOARD_ID HAS CAUGHT UP,
BALDUR SWITCHES THE ACTIVE MAPPING
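A minimal sketch of that dual-write and switch sequence. The `DashboardMappings` class is hypothetical, and "caught up" is assumed here to mean that the new index holds at least as many documents as the old one; the slides do not say how it is measured.

```python
class DashboardMappings:
    """Tracks the active and pending mapping versions of one dashboard."""

    def __init__(self, dashboard_id, active_mapping_id):
        self.dashboard_id = dashboard_id
        self.active = active_mapping_id
        self.pending = None

    def _index(self, mapping_id):
        return f"{mapping_id}__{self.dashboard_id}"

    def add_version(self, new_mapping_id):
        """Register a new mapping version; returns the index to create."""
        self.pending = new_mapping_id
        return self._index(new_mapping_id)

    def write_targets(self):
        """Indexes the indexer must write to (both during a migration)."""
        targets = [self._index(self.active)]
        if self.pending is not None:
            targets.append(self._index(self.pending))
        return targets

    def switch_if_caught_up(self, doc_counts):
        """Promote the pending version once its index has caught up."""
        if self.pending is None:
            return False
        old, new = self._index(self.active), self._index(self.pending)
        if doc_counts.get(new, 0) < doc_counts.get(old, 0):
            return False
        self.active, self.pending = self.pending, None
        return True
```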
TOO MANY LUCENE SEGMENTS
• EACH DATA NODE HOSTS 1000S OF LUCENE SEGMENTS
• 75% OF THE HEAP IS USED FOR SEGMENTS MANAGEMENT
• WE CREATE MORE SEGMENTS THAN WE’RE ABLE TO OPTIMIZE
• CONTINUOUS OPTIMISATION SCRIPTS, INDEXES WITH THE
MOST DELETED DOCS FIRST
• CONTINUOUS OLD INDEXES CLEANUP
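The "most deleted docs first" ordering can be sketched as a sort on the deleted-document ratio. The input shape (index name mapped to live and deleted document counts) is an assumption; in practice those numbers would come from the index statistics.

```python
def optimisation_order(stats):
    """Order indexes for optimisation, highest share of deleted docs first.

    stats: dict mapping index name -> (live_docs, deleted_docs).
    """
    def deleted_ratio(item):
        _, (live, deleted) = item
        total = live + deleted
        return deleted / total if total else 0.0

    ranked = sorted(stats.items(), key=deleted_ratio, reverse=True)
    return [name for name, _ in ranked]
```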
MYSQL CAN’T RESIST
• BULK INDEXING BY 5000 DOCS, BASED ON RANDOM READS, BRINGS
MYSQL TO ITS KNEES
• FETCH THE DOCUMENTS FROM BLACKHOLE IN BATCHES OF 5000
• IF SOME DOCUMENTS ARE MISSING, FETCH THEM FROM MYSQL
• RESULT: 99.9% OF DOCUMENTS EXTRACTED FROM BLACKHOLE,
THROUGHPUT * 5
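A sketch of that read path: fetch each batch from Blackhole, then fall back to MySQL only for the IDs that are missing. `fetch_blackhole` and `fetch_mysql` are hypothetical callables that take a list of IDs and return a dict of the documents they found.

```python
def chunks(ids, size=5000):
    """Yield successive batches of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_documents(ids, fetch_blackhole, fetch_mysql, batch_size=5000):
    """Fetch documents from Blackhole first, MySQL only for the misses."""
    docs = {}
    for batch in chunks(ids, batch_size):
        found = dict(fetch_blackhole(batch))
        missing = [doc_id for doc_id in batch if doc_id not in found]
        if missing:
            # Only the rare misses ever reach MySQL.
            found.update(fetch_mysql(missing))
        docs.update(found)
    return docs
```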
REPLACE MYSQL WITH BLACKHOLE
RACK AWARENESS IN A NUTSHELL
1. DEFINE 2 VIRTUAL RACK IDS
2. ASSIGN EACH DATA NODE TO A RACK
3. ENABLE RACK AWARENESS
4. PRIMARY SHARDS PICK A SIDE, REPLICAS PICK THE
OTHER ONE
FULL CLUSTER RESTART IN 20 MINUTES
• CONFIGURATION TUNING REQUIRES LOTS OF RESTARTS
• RELY ON RACK AWARENESS TO RESTART HALF THE CLUSTER AT
ONCE
• BLOCK SHARD ALLOCATION DURING SERVICE RESTART
• GET GREEN
• REPEAT
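The restart loop above can be sketched as follows. The three callables are hypothetical hooks: `set_allocation` would typically change the `cluster.routing.allocation.enable` cluster setting, `restart_node` would restart the service on one host, and `wait_for_green` would poll cluster health. Re-enabling allocation before waiting for green is an assumption about the order of operations.

```python
def restart_cluster(racks, set_allocation, restart_node, wait_for_green):
    """Restart a rack-aware cluster one virtual rack (half) at a time.

    racks: dict mapping rack id -> list of node names.
    """
    for nodes in racks.values():
        set_allocation("none")   # block shard allocation during the restart
        for node in nodes:
            restart_node(node)   # the other rack keeps a copy of every shard
        set_allocation("all")    # let shards on the restarted nodes recover
        wait_for_green()         # get green before touching the other half
```

With two virtual racks this loop runs twice, which is how a full restart fits in minutes rather than days.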
ADDING NEW DOCUMENTS TO A DASHBOARD
WHERE DO YOU GO, MY LOVELY?
• PROBLEM: HOW DO WE KNOW WHICH DASHBOARD A NEW
DOCUMENT FITS IN?
• STOP RELYING ON MYSQL AND SPHINX
• 50M NEW DOCUMENTS TO PROCESS A DAY
SOLUTION : PERCOLATION
• REVERSE DIRECTORY SYSTEM
• WE STORE QUERIES, NOT DOCUMENTS
• FOR EACH NEW DOCUMENT, WE MATCH THE DOCUMENT
AGAINST OUR QUERIES
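To make the "reverse directory" idea concrete, here is a toy percolator in plain Python. It is not the Elasticsearch percolator API: stored queries are reduced to sets of required terms and matching is a naive whitespace tokenisation, both simplifying assumptions.

```python
class ToyPercolator:
    """Stores queries, then matches each incoming document against them."""

    def __init__(self):
        self.queries = {}  # query id -> set of terms that must all be present

    def register(self, query_id, required_terms):
        self.queries[query_id] = frozenset(t.lower() for t in required_terms)

    def percolate(self, text):
        """Return the ids of every stored query the document matches."""
        tokens = set(text.lower().split())
        return sorted(
            query_id
            for query_id, terms in self.queries.items()
            if terms <= tokens
        )
```

Note that `percolate` walks every stored query for every document, which is exactly the cost the next slide describes.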
PERCOLATION ISSUES
• IT TRIES TO MATCH EVERY STORED QUERY
• SO FAR, WE HAVE 35000 STORED QUERIES
• RAW USE: 1,750,000,000,000 MATCHES A DAY
• CPU GREEDY
SOLUTIONS
• ROUTING WITH THE DASHBOARD AND DOCUMENT
LANGUAGES
• FILTER AGAINST THE QUERY SECOND
• RESULT: UP TO 100,000 QUERIES / SECOND
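The raw figure follows from the numbers in the deck: 50M new documents a day, each tried against 35000 stored queries. The sketch below checks that arithmetic and shows why routing helps: a document is only percolated against the queries stored under its own language. The `queries_by_language` layout is a hypothetical simplification of the routing.

```python
DOCS_PER_DAY = 50_000_000
STORED_QUERIES = 35_000


def raw_matches_per_day(docs=DOCS_PER_DAY, queries=STORED_QUERIES):
    """Every document tried against every stored query."""
    return docs * queries


def candidate_queries(queries_by_language, document_language):
    """With routing, only queries in the document's language are tried."""
    return queries_by_language.get(document_language, [])
```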
GENERATING DASHBOARDS WITH 3 YEARS OF DATA
BLACKHOLE V1
• 36 INDEXES, 40B DOCUMENTS
• 1.5B NEW DOCUMENTS EACH MONTH
• 72 SERVERS RUNNING ELASTICSEARCH 2.3
• CAPACITY: 209TB, 120TB DATA, 4.5TB RAM
• QUERIES ON THE WHOLE DATASET
CLUSTER TYPOLOGY
75 4 CORE XEON D-1520, 64GB RAM SERVERS
• 4 HTTP NODES
• 3 MASTER NODES
• 68 DATA NODES: 4*800GB SSD IN RAID 0
BEFORE ELASTICSEARCH
• RUN QUERIES AGAINST A SPHINX CLUSTER TO GET THE
RIGHT DOCUMENT IDS
• FETCH THE DOCUMENTS FROM A GALERA CLUSTER AND THE
METADATA FROM ANOTHER GALERA CLUSTER
• MERGE AND DISPLAY THE DOCUMENTS
SPHINX NIGHTMARE
• CAN’T SCALE HORIZONTALLY: LIMITED TO 14 MONTHS OF DATA
• ONE COMPLEX QUERY AND THE WHOLE CLUSTER REACHES A LOAD
OF 400
• MYSQL LOAD
INDEXING PROCESS
INDEXING
• A GO PROGRAM MERGES 3 GALERA CLUSTERS AND 1 ES
CLUSTER INTO A KAFKA QUEUE: 30,000 DOCUMENTS /
SECOND
• 8 GO INDEXERS MAP THE INDEX / DATA NODE
DISTRIBUTION AND PUSH THE DATA DIRECTLY TO THE RIGHT
DATA NODE: 60,000 DOCUMENTS / SECOND FOR 3
WEEKS WITH 200,000 / SECOND PEAKS
PROBLEM: WE’RE CPU BOUND
KAFKA IS TOO SLOW
• THE 72TB KAFKA QUEUE IS TOO SLOW: ONLY 10,000
DOCUMENTS / SECOND / PARTITION BECAUSE OF
SPINNING DISKS
MASSIVE QUERIES CRASH HALF THE CLUSTER
• ELASTICSEARCH CACHES THE RESULT OF FILTERED QUERIES:
SET _CACHE TO FALSE.
• UPGRADE TO 1.7.5: THE FILTERED QUERIES CACHE HAS A
MEMORY LEAK IN 1.7.4
BIGGEST QUERIES ARE SLOW AS HELL
• DIVIDE THE GLOBAL QUERIES PER INDEX AND RUN IN
PARALLEL
• PROCESS THE RESULTS POST QUERY ON THE API LEVEL
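A sketch of that split-and-merge approach using a thread pool. `search_one` is a hypothetical callable that runs the query against a single index and returns partial counts; summing counts per key stands in for whatever post-processing the API layer really does.

```python
from concurrent.futures import ThreadPoolExecutor


def search_per_index(indexes, search_one, max_workers=8):
    """Run the same query against each index in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns the partial results in the order of `indexes`.
        return list(pool.map(search_one, indexes))


def merge_counts(partials):
    """Merge per-index results at the API level by summing counts per key."""
    merged = {}
    for partial in partials:
        for key, value in partial.items():
            merged[key] = merged.get(key, 0) + value
    return merged
```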
REINDEXING 40 BILLION DOCS IN 5 DAYS
UPGRADING A MAPPING, ON A LIVE CLUSTER
• CAN’T CHANGE A FIELD TYPE
• CAN’T UPDATE ANALYZERS LIVE
• CAN’T REORGANIZE THE MAPPING WITH EXISTING DATA
CLUSTER DESIGN
• 36 MONTHS DATA, 40 BILLION DOCUMENTS
• 70 DATA NODES, 3 MASTERS
• 1 INDEX PER DAY, 12 SHARDS, 1 REPLICA
REINDEXING
• USE LOGSTASH TO READ, TRANSFORM AND WRITE EXISTING
DATA ON EACH DATA NODE
• EACH DATA NODE WRITES ON A DEFINED FRIEND IN THE
OTHER PART OF THE CLUSTER
• 5000 DOCUMENTS SCROLL, 10 INDEXING WORKERS
• SCROLLS AGAINST A FULL DAY
LOGSTASH CONFIGURATION
input {
  elasticsearch {
    hosts => [ "local elasticsearch node" ]
    index => "index to read from"
    size => 5000
    scroll => "20m" # was 5 minutes initially
    docinfo => true
    query => '{ "query": { "range": { "date": { "gte": "2015-07-23T10:00:00.000+01:00", "lte": "2015-07-23T11:00:00.000+01:00" } } } }'
  }
}
output {
  elasticsearch {
    host => "remote elasticsearch node"
    index => "index to write to"
    protocol => "http"
    index_type => "%{[@metadata][_type]}"
    document_id => "%{[@metadata][_id]}"
    workers => 10
  }
  stdout {
    codec => rubydebug # because removing the timestamp field makes Logstash crash
  }
}
filter {
  mutate {
    rename => { "some field" => "some other field" }
    rename => { "another field" => "somewhere else" }
    remove_field => [ "something", "something else", "another field", "some field", "@timestamp", "@version" ]
  }
}
ES CONFIGURATION CHANGES
indices:
  memory:
    index_buffer_size: 50%        # instead of 10%
  store:
    throttle:
      max_bytes_per_sec: "2gb"
index:
  store:
    throttle:
      type: "none"                # as fast as your SSD can go
  translog:
    disable_flush: true
  refresh_interval: -1            # instead of 1s
PROBLEMS
• MISSING DOCUMENTS AS SCROLL LOST ITS SEARCH CONTEXT
• SOMETIMES, THE INDEXING NODES CRASH
• LOGSTASH DOES NOT LIKE NETWORK ISSUES
• NEED TO REPLAY A FULL DAY TO CATCH UP WITH THE DATA
SOLUTIONS
• PLAY HOURLY QUERIES
• WRITE A SMALL ORCHESTRATOR
• INTRODUCING YOKO AND MOULINETTE
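Splitting a day into hourly queries can be sketched like this. The `date` field name matches the Logstash query shown earlier in the deck; the helper itself is hypothetical, and using a half-open range (`gte` / `lt`) instead of `gte` / `lte` is a deliberate choice here so that a document sitting exactly on the hour is not read twice.

```python
from datetime import date, datetime, timedelta


def hourly_queries(day, field="date"):
    """Yield 24 range queries covering `day`, one per hour."""
    start = datetime(day.year, day.month, day.day)
    for hour in range(24):
        lower = start + timedelta(hours=hour)
        upper = lower + timedelta(hours=1)
        yield {
            "query": {
                "range": {
                    field: {"gte": lower.isoformat(), "lt": upper.isoformat()}
                }
            }
        }
```

If a scroll loses its search context, only one hour has to be replayed instead of a full day.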
YOKO, THE REINDEXING ORCHESTRATOR
• SMALL PYTHON DAEMON TO QUERY A MYSQL DATABASE
• INDEX FROM
• INDEX TO
• LOGSTASH QUERY
• STATUS: TODO, PROCESSING, DONE, COMPLETE, FAILED
YOKO, THE REINDEXING ORCHESTRATOR
• CREATES THE DAILY INDEXES.
• FOR EACH "DONE" INDEX, COMPARES ITS NUMBER OF DOCUMENTS WITH
THE INITIAL INDEX BY RUNNING THE LOGSTASH QUERY ON BOTH.
• MOVES EACH "DONE" LINE TO "COMPLETE" IF THE
COUNT MATCHES, OR TO "FAILED" OTHERWISE.
• DELETES EACH MONTHLY INDEX WHEN EVERY DAY OF A MONTH
IS "COMPLETE".
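The verification step can be sketched as a small state transition. The status names come from the slide; the function names and the idea of passing the two counts in directly are assumptions, and the real daemon reads its lines from MySQL.

```python
def verify(status, source_count, target_count):
    """Decide the next status of a reindexing line."""
    if status != "DONE":
        return status  # only "DONE" lines are verified
    return "COMPLETE" if source_count == target_count else "FAILED"


def month_can_be_deleted(day_statuses):
    """A monthly index may go once every day of the month is COMPLETE."""
    return bool(day_statuses) and all(s == "COMPLETE" for s in day_statuses)
```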
MOULINETTE, THE REINDEXING SCRIPT
• SMALL BASH SCRIPT THAT QUERIES YOKO
• GENERATES THE LOGSTASH.CONF FILE FROM YOKO DATA
• RUNS LOGSTASH
• SWITCHES YOKO LINE TO DONE WHEN DONE
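The original is a Bash script; the config generation step is sketched here in Python for consistency with the other examples. The template mirrors the Logstash configuration shown earlier, while the job keys (`source_host`, `index_from`, `index_to`, `target_host`, `query`) are hypothetical names for the data Yoko provides.

```python
from string import Template

# $placeholders are filled from the job; %{...} is left for Logstash itself.
LOGSTASH_TEMPLATE = Template("""input {
  elasticsearch {
    hosts => [ "$source_host" ]
    index => "$index_from"
    size => 5000
    scroll => "20m"
    docinfo => true
    query => '$query'
  }
}
output {
  elasticsearch {
    host => "$target_host"
    index => "$index_to"
    protocol => "http"
    index_type => "%{[@metadata][_type]}"
    document_id => "%{[@metadata][_id]}"
    workers => 10
  }
}
""")


def render_conf(job):
    """Render a logstash.conf from one reindexing job."""
    return LOGSTASH_TEMPLATE.substitute(job)
```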
PROBLEMS
• TRANSFORMING FIELDS WITH LOGSTASH IS SLOW
• SHOULD RUN INDEXING ON FEWER NODES
• SOMETIMES, LOGSTASH HANGS AND NEEDS TO BE
FORCE-KILLED
• YOKO SHOULD DETECT THIS AND RAISE AN ERROR
UPGRADING FROM 1.7 TO 2.3
BEFORE UPGRADING
• CHECK THAT YOUR PLUGINS CAN RUN ON 2.X
• CHECK YOUR MAPPINGS ARE 2.X COMPLIANT
• CHECK FOR CONFIGURATION DEPRECATION
SEEMS EASY?
1. SHUT DOWN THE CLUSTER
2. UPGRADE ES TO 2.3
3. UPGRADE PLUGINS
4. START THE WHOLE CLUSTER, MASTERS FIRST
UNSUPPORTED PLUGINS AND ANALYZERS
• CAN’T UPDATE AN ANALYZER ON AN OPEN INDEX
• CLOSE ALL INDEXES
• APPLY A TEMPORARY DUMMY ANALYZER
• REOPEN INDEXES
DUMMY KOREAN ANALYZER
"analyzer": {
  "kr_analyzer": {
    "tokenizer": "standard",
    "filter": [
      "cjk_width",
      "lowercase",
      "cjk_bigram",
      "english_stop"
    ]
  }
}
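The close, update, reopen sequence can be sketched as the list of REST calls to issue for one index. The helper only builds the requests (method, path, body) and sends nothing; its name and the way the analysis settings are passed in are assumptions.

```python
def analyzer_swap_requests(index, analysis):
    """REST calls that swap the analysis settings of one index.

    analysis: dict such as {"analyzer": {"kr_analyzer": {...}}}.
    """
    return [
        ("POST", f"/{index}/_close", None),                      # close the index
        ("PUT", f"/{index}/_settings", {"analysis": analysis}),  # apply the dummy analyzer
        ("POST", f"/{index}/_open", None),                       # reopen the index
    ]
```

Analysis settings can only be changed while the index is closed, which is why the order matters.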
QUESTIONS ?

Contenu connexe

Tendances

Seattle Cassandra Meetup - HasOffers
Seattle Cassandra Meetup - HasOffersSeattle Cassandra Meetup - HasOffers
Seattle Cassandra Meetup - HasOffersbtoddb
 
Sphinx - High performance full-text search for MySQL
Sphinx - High performance full-text search for MySQLSphinx - High performance full-text search for MySQL
Sphinx - High performance full-text search for MySQLNguyen Van Vuong
 
ELK: a log management framework
ELK: a log management frameworkELK: a log management framework
ELK: a log management frameworkGiovanni Bechis
 
Using Sphinx for Search in PHP
Using Sphinx for Search in PHPUsing Sphinx for Search in PHP
Using Sphinx for Search in PHPMike Lively
 
Puppet at Spotify (stockholm)
Puppet at Spotify (stockholm)Puppet at Spotify (stockholm)
Puppet at Spotify (stockholm)Puppet
 
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015Johan
 
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Oleksiy Panchenko
 
Log analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and KibanaLog analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and KibanaAvinash Ramineni
 
Puppetcamp Melbourne - puppetdb
Puppetcamp Melbourne - puppetdbPuppetcamp Melbourne - puppetdb
Puppetcamp Melbourne - puppetdbm_richardson
 
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...DataStax Academy
 
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchLet's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchInfluxData
 
Scaling an ELK stack at bol.com
Scaling an ELK stack at bol.comScaling an ELK stack at bol.com
Scaling an ELK stack at bol.comRenzo Tomà
 
How bol.com makes sense of its logs, using the Elastic technology stack.
How bol.com makes sense of its logs, using the Elastic technology stack.How bol.com makes sense of its logs, using the Elastic technology stack.
How bol.com makes sense of its logs, using the Elastic technology stack.Renzo Tomà
 
Log analysis with the elk stack
Log analysis with the elk stackLog analysis with the elk stack
Log analysis with the elk stackVikrant Chauhan
 
Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off
Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade OffDatabases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off
Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade OffTimescale
 
HBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnBHBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnBHBaseCon
 
Blueflood: Open Source Metrics Processing at CassandraEU 2013
Blueflood: Open Source Metrics Processing at CassandraEU 2013Blueflood: Open Source Metrics Processing at CassandraEU 2013
Blueflood: Open Source Metrics Processing at CassandraEU 2013gdusbabek
 
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleMetrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleSematext Group, Inc.
 
An Introduction to Priam
An Introduction to PriamAn Introduction to Priam
An Introduction to PriamJason Brown
 

Tendances (20)

Seattle Cassandra Meetup - HasOffers
Seattle Cassandra Meetup - HasOffersSeattle Cassandra Meetup - HasOffers
Seattle Cassandra Meetup - HasOffers
 
Sphinx - High performance full-text search for MySQL
Sphinx - High performance full-text search for MySQLSphinx - High performance full-text search for MySQL
Sphinx - High performance full-text search for MySQL
 
ELK: a log management framework
ELK: a log management frameworkELK: a log management framework
ELK: a log management framework
 
Using Sphinx for Search in PHP
Using Sphinx for Search in PHPUsing Sphinx for Search in PHP
Using Sphinx for Search in PHP
 
Puppet at Spotify (stockholm)
Puppet at Spotify (stockholm)Puppet at Spotify (stockholm)
Puppet at Spotify (stockholm)
 
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015
 
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
 
Log analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and KibanaLog analysis using Logstash,ElasticSearch and Kibana
Log analysis using Logstash,ElasticSearch and Kibana
 
Puppetcamp Melbourne - puppetdb
Puppetcamp Melbourne - puppetdbPuppetcamp Melbourne - puppetdb
Puppetcamp Melbourne - puppetdb
 
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...
 
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchLet's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
 
Scaling an ELK stack at bol.com
Scaling an ELK stack at bol.comScaling an ELK stack at bol.com
Scaling an ELK stack at bol.com
 
How bol.com makes sense of its logs, using the Elastic technology stack.
How bol.com makes sense of its logs, using the Elastic technology stack.How bol.com makes sense of its logs, using the Elastic technology stack.
How bol.com makes sense of its logs, using the Elastic technology stack.
 
Log analysis with the elk stack
Log analysis with the elk stackLog analysis with the elk stack
Log analysis with the elk stack
 
Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off
Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade OffDatabases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off
Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off
 
HBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnBHBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnB
 
Blueflood: Open Source Metrics Processing at CassandraEU 2013
Blueflood: Open Source Metrics Processing at CassandraEU 2013Blueflood: Open Source Metrics Processing at CassandraEU 2013
Blueflood: Open Source Metrics Processing at CassandraEU 2013
 
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at ScaleMetrics, Logs, Transaction Traces, Anomaly Detection at Scale
Metrics, Logs, Transaction Traces, Anomaly Detection at Scale
 
An Introduction to Priam
An Introduction to PriamAn Introduction to Priam
An Introduction to Priam
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 

En vedette

elasticsearch - advanced features in practice
elasticsearch - advanced features in practiceelasticsearch - advanced features in practice
elasticsearch - advanced features in practiceJano Suchal
 
From Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityFrom Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityStéphane Gamard
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch IntroductionMark Cheeseman
 
Distributed percolator in elasticsearch
Distributed percolator in elasticsearchDistributed percolator in elasticsearch
Distributed percolator in elasticsearchmartijnvg
 
An Elastic Metadata Store for eBay’s Media Platform
An Elastic Metadata Store for eBay’s Media PlatformAn Elastic Metadata Store for eBay’s Media Platform
An Elastic Metadata Store for eBay’s Media PlatformMongoDB
 
Configuring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleConfiguring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleBharvi Dixit
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemAvleen Vig
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsAlaa Elhadba
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerSematext Group, Inc.
 

En vedette (11)

elasticsearch - advanced features in practice
elasticsearch - advanced features in practiceelasticsearch - advanced features in practice
elasticsearch - advanced features in practice
 
From Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityFrom Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalability
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
 
Ebay peter maas
Ebay peter maasEbay peter maas
Ebay peter maas
 
The Commando Devops
The Commando DevopsThe Commando Devops
The Commando Devops
 
Distributed percolator in elasticsearch
Distributed percolator in elasticsearchDistributed percolator in elasticsearch
Distributed percolator in elasticsearch
 
An Elastic Metadata Store for eBay’s Media Platform
An Elastic Metadata Store for eBay’s Media PlatformAn Elastic Metadata Store for eBay’s Media Platform
An Elastic Metadata Store for eBay’s Media Platform
 
Configuring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleConfiguring elasticsearch for performance and scale
Configuring elasticsearch for performance and scale
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log system
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
 

Similaire à Scaling Elasticsearch at Synthesio

Running & Scaling Large Elasticsearch Clusters
Running & Scaling Large Elasticsearch ClustersRunning & Scaling Large Elasticsearch Clusters
Running & Scaling Large Elasticsearch ClustersFred de Villamil
 
Bullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query EngineBullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query EngineDataWorks Summit
 
Using Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comUsing Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comDamien Krotkine
 
Tutorial(release)
Tutorial(release)Tutorial(release)
Tutorial(release)Oshin Hung
 
Getting started with Cassandra 2.1
Getting started with Cassandra 2.1Getting started with Cassandra 2.1
Getting started with Cassandra 2.1Viswanath J
 
Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters MongoDB
 
SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.Julian Hyde
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB ClusterMongoDB
 
Understanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsUnderstanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsAcunu
 
LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste...
LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste...LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste...
LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste...DataStax Academy
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchJoe Alex
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayC4Media
 
Re-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseRe-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseAll Things Open
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevAltinity Ltd
 
Geo Analytics Canada Overview - May 2020
Geo Analytics Canada Overview - May 2020Geo Analytics Canada Overview - May 2020
Geo Analytics Canada Overview - May 2020GEO Analytics Canada
 
Building a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationBuilding a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationJonathan Katz
 
Webinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with FusionWebinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with FusionLucidworks
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloudVarun Thacker
 

Similaire à Scaling Elasticsearch at Synthesio (20)

Running & Scaling Large Elasticsearch Clusters
Running & Scaling Large Elasticsearch ClustersRunning & Scaling Large Elasticsearch Clusters
Running & Scaling Large Elasticsearch Clusters
 
Bullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query EngineBullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query Engine
 
MySQL vs. MonetDB
MySQL vs. MonetDBMySQL vs. MonetDB
MySQL vs. MonetDB
 
Using Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comUsing Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.com
 
Collecting 600M events/day
Collecting 600M events/dayCollecting 600M events/day
Collecting 600M events/day
 
Tutorial(release)
Tutorial(release)Tutorial(release)
Tutorial(release)
 
Getting started with Cassandra 2.1
Getting started with Cassandra 2.1Getting started with Cassandra 2.1
Getting started with Cassandra 2.1
 
Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters
 
SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.SQL Now! How Optiq brings the best of SQL to NoSQL data.
SQL Now! How Optiq brings the best of SQL to NoSQL data.
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB Cluster
 
Understanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsUnderstanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problems
 
LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste...
LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste...LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste...
LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste...
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Re-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseRe-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series Database
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
Geo Analytics Canada Overview - May 2020
Geo Analytics Canada Overview - May 2020Geo Analytics Canada Overview - May 2020
Geo Analytics Canada Overview - May 2020
 
Building a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationBuilding a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management Application
 
Webinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with FusionWebinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with Fusion
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloud
 


Scaling Elasticsearch at Synthesio

  • 1. ELASTICSEARCH @SYNTHESIO FRED DE VILLAMIL, DIRECTOR OF INFRASTRUCTURE @FDEVILLAMIL
  • 2. BACKGROUND • FRED DE VILLAMIL, 38 YEARS OLD, DIRECTOR OF INFRASTRUCTURE @SYNTHESIO • LINUX / (FREE)BSD SINCE 1996 • OPEN SOURCE CONTRIBUTOR SINCE 1998 • HAS RUN ELASTICSEARCH IN PRODUCTION SINCE 0.17.6
  • 3. ABOUT SYNTHESIO • Synthesio is the leading social intelligence tool for social media monitoring & social analytics. • Synthesio crawls the Web for relevant data, enriches it with sentiment analysis and demographics to build social analytics dashboards.
  • 4. ELASTICSEARCH @SYNTHESIO, SEPTEMBER 2016 • 5 clusters, 163 physical servers, 400TB storage, 10.2TB RAM • 75B indexed documents, 200TB data • 1.8B indexed documents each month: mix of Web pages, forums and social media posts
  • 5. POWERING 13000 DASHBOARDS WITH ELASTICSEARCH
  • 6. DEC. 2014: THE MYSQL NIGHTMARE • Cross-cluster queries on 3 massive Galera clusters • Up to 50M rows fetched from a massive 4B-row reference table • Then a cross-cluster join on a 20TB, 35B-record monolithic MySQL database • Poor performance, frequent timeouts
  • 7. JAN. 2015: CLIPPING REVOLUTION • 1 global index, 512 shards, 5B documents • 1000 new documents / second • 47 servers running Elasticsearch 1.3.2 then 1.3.9 • Capacity: 37TB, 24TB data, 2.62TB RAM
  • 8. CLUSTER TOPOLOGY • 2 QUERY NODES: VIRTUAL MACHINES, 4 CORES, 8GB RAM EACH • 3 MASTER NODES: VIRTUAL MACHINES, 4 CORES, 8GB RAM • 42 DATA NODES: PHYSICAL SERVERS, 6-CORE XEON E5-1650 V2, 3*900GB SSD IN RAID 0, 64GB RAM
  • 9. CLIPPING REVOLUTION DATA MODEL • ROUTING ON A MONTHLY BASIS • EACH CRAWLED DOCUMENT IS INDEXED WITH NESTED DASHBOARD IDS • QUERIES ON TIME PERIOD + DASHBOARD ID
      {
        "document": {
          "dashboards": [
            { "dashboard_id": 1 },
            { "dashboard_id": 2 }
          ]
        }
      }
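
A minimal sketch of the "time period + dashboard ID" query from this slide, written as a Python function that builds a search request body. It assumes `dashboards` is mapped as a `nested` field and that documents carry a `date` field. Those field names and the helper `build_dashboard_query` are illustrative and do not come from the slides.

```python
import json


def build_dashboard_query(dashboard_id, start, end):
    """Build a search body matching one dashboard within a time range.

    Assumes a `nested` mapping on `dashboards` and a `date` field
    (hypothetical names, not confirmed by the slides).
    """
    return {
        "query": {
            "bool": {
                "must": [
                    # Restrict to the requested time period.
                    {"range": {"date": {"gte": start, "lte": end}}},
                    # Match documents attached to the dashboard.
                    {
                        "nested": {
                            "path": "dashboards",
                            "query": {
                                "term": {"dashboards.dashboard_id": dashboard_id}
                            },
                        }
                    },
                ]
            }
        }
    }


body = build_dashboard_query(1, "2015-01-01", "2015-01-31")
print(json.dumps(body, indent=2))
```

Because every dashboard shares the same monthly shards, a query like this still touches each shard covering the time period, which is consistent with the shard-size problems listed on the next slide.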
  • 10. PROBLEMS • TOO MANY SHARDS (WAS MEANT TO BE A WEEKLY ROUTING) • 500GB TO 900GB SHARDS (!!!) GROWING AFTER THE MONTH IS OVER. 3 HOURS FOR A REALLOCATION • A ROLLING RESTART TAKES 3 FULL DAYS (IF WE’RE LUCKY) • GARBAGE COLLECTOR NIGHTMARE, CONSTANTLY FLAPPING CLUSTER
  • 11. MMAPFS VS NIOFS • MMAPFS: MAPS LUCENE FILES INTO VIRTUAL MEMORY USING MMAP. NEEDS AS MUCH VIRTUAL MEMORY AS THE SIZE OF THE FILES BEING MAPPED • NIOFS: APPLIES A SHARED LOCK ON LUCENE FILES AND RELIES ON THE FILE SYSTEM CACHE
  • 12. CMS VS G1GC • CMS: SHARES CPU TIME WITH THE APPLICATION. “STOPS THE WORLD” WHEN THERE IS TOO MUCH MEMORY TO CLEAN, UNTIL IT THROWS AN OUTOFMEMORYERROR • G1GC: SHORTER, MORE FREQUENT PAUSES. WON’T STALL A NODE LONG ENOUGH FOR IT TO LEAVE THE CLUSTER
  • 13. G1GC OPTIONS • MaxGCPauseMillis=200: ENSURES LONGER GARBAGE COLLECTIONS • GCPauseIntervalMillis=1000: BUT LESS FREQUENT ONES • InitiatingHeapOccupancyPercent=35: STARTS COLLECTING WHEN THE HEAP IS 35% USED
  • 14. FIELD DATA CACHE EXPIRE • FORCES ELASTICSEARCH TO PERIODICALLY EMPTY ITS INTERNAL FIELDDATA CACHE • OVERLAPS WITH THE GARBAGE COLLECTOR’S JOB • PERFORMANCE ISSUES WITH FREQUENTLY ACCESSED DATA • USE OF FIELDDATA CIRCUIT BREAKERS TO STOP GREEDY QUERIES • ELASTIC SAYS NEVER DO THIS!!! BUT IT FIXED OUR BIGGEST PROBLEM
  • 15. MORE PROBLEMS • IMMUTABLE, MONOLITHIC MAPPING: NEW FEATURE BLOCKED UNTIL WE FIX IT • IMPOSSIBLE TO DELETE A DASHBOARD WITHOUT REINDEXING A WHOLE MONTH • 20% DELETED DOCUMENTS WASTING 3TB
  • 16. IMMUTABLE MAPPING AND DELETED DATA • SEGMENTS: IMMUTABLE FILES USED BY LUCENE TO WRITE ITS DATA. UP TO 2500 / SHARD (!!!) • NO REAL DELETE: UPDATED AND DELETED DOCUMENTS ONLY GET A DELETED FLAG • ELASTICSEARCH _OPTIMIZE: MERGES A SHARD’S SEGMENTS INTO 1 AND PURGES DELETED DOCS • BUT: REQUIRES 150% OF THE SHARD SIZE ON DISK
  • 18. JAN. 2015: BLINK • 13200 indexes, 12B documents • 5500 new documents / second • 3 clusters, 75 physical servers running Elasticsearch 1.7.5 • Capacity: 187.5TB, 48TB data, 4.7TB RAM
  • 19. CLUSTER TOPOLOGY 4-CORE XEON D-1520, 64GB RAM SERVERS • 2 QUERY NODES • 3 MASTER NODES • 20 DATA NODES: 4*800GB SSD IN RAID 0
  • 20. NEW PRODUCT DESIGN • 1 INDEX / DASHBOARD, 1 SHARD / 5 MILLION DOCS • VERSIONED MAPPING: MAPPING_ID__DASHBOARD_ID • MULTIPLE MAPPING VERSIONS OF A DASHBOARD IN PARALLEL • MAPPING UPGRADE AND REINDEX WITHOUT INTERRUPTION • BALDUR FOR DASHBOARDS ROUTING
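
The naming and sizing rules on this slide can be sketched in a few lines of Python. The function names are hypothetical. Rounding up and the one-shard minimum are assumptions about how "1 shard / 5 million docs" is applied.

```python
import math

DOCS_PER_SHARD = 5_000_000  # "1 shard / 5 million docs" from the slide


def index_name(mapping_id, dashboard_id):
    """Versioned index name following the MAPPING_ID__DASHBOARD_ID pattern."""
    return f"{mapping_id}__{dashboard_id}"


def shard_count(expected_docs):
    """At least one shard, then one more per started block of 5 million docs."""
    return max(1, math.ceil(expected_docs / DOCS_PER_SHARD))


print(index_name(3, 42))
print(shard_count(12_000_000))
```

Sizing shards by document count keeps each shard small, in contrast to the 500GB to 900GB shards of the earlier monthly design.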
  • 22. BALDUR IN A NUTSHELL 1. THE API SERVER SENDS AN ELASTICSEARCH QUERY 2. BALDUR INTERCEPTS THE QUERY AND GETS THE DASHBOARD CLUSTER ID AND ACTIVE MAPPING VERSION 3. BALDUR ROUTES THE QUERY TO THE CLUSTER HOSTING THE DASHBOARD DATA
  • 23. ADDING A MAPPING VERSION • THE INDEXER CREATES A NEW NEW_MAPPING_ID__DASHBOARD_ID INDEX • THE INDEXER ADDS A LINE IN BALDUR’S DATABASE WITH THE DASHBOARD AND MAPPING IDS • THE INDEXER WRITES TO BOTH MAPPING_ID__DASHBOARD_ID AND NEW_MAPPING_ID__DASHBOARD_ID • WHEN NEW_MAPPING_ID__DASHBOARD_ID HAS CAUGHT UP, BALDUR SWITCHES THE ACTIVE MAPPING
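
As a rough illustration of the routing and mapping-switch steps above, here is a toy in-memory registry in Python. It is not Baldur's actual code or schema: the class, its methods and the cluster name are hypothetical, and a plain dict stands in for Baldur's database.

```python
class Registry:
    """Toy stand-in for Baldur's database: dashboard -> cluster and mapping versions."""

    def __init__(self):
        # dashboard_id -> {"cluster": str, "active": int, "versions": set}
        self._rows = {}

    def add_dashboard(self, dashboard_id, cluster, mapping_id):
        self._rows[dashboard_id] = {
            "cluster": cluster,
            "active": mapping_id,
            "versions": {mapping_id},
        }

    def add_mapping_version(self, dashboard_id, mapping_id):
        # Registered but not active yet: writes go to both indexes.
        self._rows[dashboard_id]["versions"].add(mapping_id)

    def write_targets(self, dashboard_id):
        row = self._rows[dashboard_id]
        return sorted(f"{m}__{dashboard_id}" for m in row["versions"])

    def switch_active(self, dashboard_id, mapping_id):
        row = self._rows[dashboard_id]
        if mapping_id not in row["versions"]:
            raise ValueError("unknown mapping version")
        row["active"] = mapping_id

    def route(self, dashboard_id):
        # Queries only ever see the active mapping version.
        row = self._rows[dashboard_id]
        return row["cluster"], f"{row['active']}__{dashboard_id}"


registry = Registry()
registry.add_dashboard(42, "cluster-b", 1)
registry.add_mapping_version(42, 2)
before = registry.route(42)
targets = registry.write_targets(42)
registry.switch_active(42, 2)
after = registry.route(42)
print(before, targets, after)
```

Readers stay on the old index until the switch, so a mapping upgrade does not interrupt queries.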
  • 24. TOO MANY LUCENE SEGMENTS • EACH DATA NODE HOSTS 1000S OF LUCENE SEGMENTS • 75% OF THE HEAP IS USED FOR SEGMENT MANAGEMENT • WE CREATE MORE SEGMENTS THAN WE’RE ABLE TO OPTIMIZE • CONTINUOUS OPTIMISATION SCRIPTS, INDEXES WITH THE MOST DELETED DOCS FIRST • CONTINUOUS CLEANUP OF OLD INDEXES
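
One way to read "indexes with the most deleted docs first" is to rank indexes by their share of deleted documents. The sketch below does that on plain Python data. The ranking criterion (ratio rather than absolute count) and the input shape are assumptions.

```python
def optimize_order(index_stats):
    """Return index names sorted by deleted-document ratio, highest first.

    `index_stats` maps index name -> (live_docs, deleted_docs).
    """
    def deleted_ratio(item):
        _, (live, deleted) = item
        total = live + deleted
        return deleted / total if total else 0.0

    ranked = sorted(index_stats.items(), key=deleted_ratio, reverse=True)
    return [name for name, _ in ranked]


stats = {"a": (80, 20), "b": (50, 50), "c": (100, 0)}
order = optimize_order(stats)
print(order)
```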
  • 25. MYSQL CAN’T RESIST • BULK INDEXING BASED ON RANDOM READS OF 5000 DOCS BRINGS MYSQL TO ITS KNEES • FETCH THE DOCUMENTS FROM BLACKHOLE IN BATCHES OF 5000 • IF SOME DOCUMENTS ARE MISSING, FETCH THEM FROM MYSQL • RESULT: 99.9% OF DOCUMENTS EXTRACTED FROM BLACKHOLE, THROUGHPUT * 5
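
The fetch-then-fall-back pattern on this slide can be sketched as follows. The two fetch callables are placeholders for the Blackhole and MySQL lookups, and the function names are hypothetical.

```python
def batches(ids, size=5000):
    """Yield consecutive slices of `ids` of at most `size` elements."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def fetch_documents(ids, fetch_primary, fetch_fallback, size=5000):
    """Fetch docs in batches from the primary store, falling back for misses.

    fetch_primary / fetch_fallback take a list of IDs and return {id: doc}.
    """
    found = {}
    for batch in batches(ids, size):
        docs = fetch_primary(batch)
        missing = [i for i in batch if i not in docs]
        if missing:
            # Only the misses hit the slower fallback store.
            docs = {**docs, **fetch_fallback(missing)}
        found.update(docs)
    return found


blackhole = {1: "a", 2: "b"}
mysql = {3: "c"}
result = fetch_documents(
    [1, 2, 3],
    lambda b: {i: blackhole[i] for i in b if i in blackhole},
    lambda b: {i: mysql[i] for i in b if i in mysql},
    size=2,
)
print(result)
```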
  • 26. REPLACE MYSQL WITH BLACKHOLE
  • 27. RACK AWARENESS IN A NUTSHELL 1. DEFINE 2 VIRTUAL RACK IDS 2. ASSIGN EACH DATA NODE TO A RACK 3. ENABLE RACK AWARENESS 4. PRIMARY SHARDS PICK A SIDE, REPLICAS PICK THE OTHER ONE
  • 28. FULL CLUSTER RESTART IN 20 MINUTES • CONFIGURATION TUNING REQUIRES LOTS OF RESTARTS • RELY ON RACK AWARENESS TO RESTART HALF THE CLUSTER AT ONCE • BLOCK SHARD ALLOCATION DURING SERVICE RESTART • WAIT FOR GREEN • REPEAT
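
The restart loop above can be written down as an ordered plan. This Python sketch only builds the list of steps and does not talk to a cluster. It assumes shard allocation is blocked with the `cluster.routing.allocation.enable` cluster setting, which the slides do not name. The action labels are hypothetical.

```python
def restart_plan(node_racks):
    """Return ordered (action, payload) steps for a rack-by-rack restart.

    node_racks maps node name -> rack ID.
    """
    racks = {}
    for node, rack in sorted(node_racks.items()):
        racks.setdefault(rack, []).append(node)

    steps = []
    for rack in sorted(racks):
        # Block shard allocation so the cluster does not rebalance
        # while half of the nodes are down.
        steps.append(("put_settings",
                      {"transient": {"cluster.routing.allocation.enable": "none"}}))
        steps.append(("restart_nodes", racks[rack]))
        steps.append(("put_settings",
                      {"transient": {"cluster.routing.allocation.enable": "all"}}))
        # Wait for green before touching the other rack.
        steps.append(("wait_for_status", "green"))
    return steps


plan = restart_plan({"n1": "r1", "n2": "r2", "n3": "r1"})
for step in plan:
    print(step)
```

Since each rack holds a full copy of the data, restarting one rack at a time keeps every shard available on the other side.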
  • 29. ADDING NEW DOCUMENTS TO A DASHBOARD
  • 30. WHERE DO YOU GO, MY LOVELY? • PROBLEM: HOW DO WE KNOW WHICH DASHBOARD A NEW DOCUMENT FITS IN? • STOP RELYING ON MYSQL AND SPHINX • 50M NEW DOCUMENTS TO PROCESS A DAY
  • 31. SOLUTION: PERCOLATION • REVERSE DIRECTORY SYSTEM • WE STORE QUERIES, NOT DOCUMENTS • FOR EACH NEW DOCUMENT, WE MATCH THE DOCUMENT AGAINST OUR QUERIES
  • 32. PERCOLATION ISSUES • IT TRIES TO MATCH EVERY STORED QUERY • SO FAR, WE HAVE 35000 STORED QUERIES • RAW USE: 1,750,000,000,000 MATCHES A DAY • CPU GREEDY
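
The "raw use" figure follows from multiplying the stored queries by the daily document volume given on slide 30. A quick check in Python:

```python
stored_queries = 35_000
documents_per_day = 50_000_000  # "50M new documents to process a day"

# Without routing, every document is tested against every stored query.
matches_per_day = stored_queries * documents_per_day
matches_per_second = matches_per_day / 86_400  # daily average

print(f"{matches_per_day:,} matches a day")
print(f"{matches_per_second:,.0f} matches a second on average")
```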
  • 33. SOLUTIONS • ROUTING WITH THE DASHBOARD AND DOCUMENT LANGUAGES • FILTER AGAINST THE QUERY SECOND • RESULT: UP TO 100,000 QUERIES / SECOND
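
To show why routing by language cuts the work, here is a toy reverse search in Python. It is not the Elasticsearch percolator API: plain predicates stand in for stored queries, and the class and method names are hypothetical.

```python
from collections import defaultdict


class RoutedPercolator:
    """Toy reverse search: stored queries are bucketed by language."""

    def __init__(self):
        # language -> [(dashboard_id, predicate)]
        self._by_language = defaultdict(list)

    def register(self, dashboard_id, language, predicate):
        self._by_language[language].append((dashboard_id, predicate))

    def candidates(self, language):
        """Number of stored queries a document in `language` is tested against."""
        return len(self._by_language[language])

    def percolate(self, document):
        # Only queries routed to the document's language are evaluated.
        bucket = self._by_language[document["language"]]
        return [d for d, predicate in bucket if predicate(document["text"])]


percolator = RoutedPercolator()
percolator.register(1, "en", lambda text: "coffee" in text)
percolator.register(2, "fr", lambda text: "café" in text)
percolator.register(3, "en", lambda text: "tea" in text)

matched = percolator.percolate({"language": "en", "text": "I like coffee"})
print(matched, percolator.candidates("en"))
```

With routing, an English document is only tested against the English queries, so the number of evaluations per document drops from all stored queries to one language bucket.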
  • 34. GENERATING DASHBOARDS WITH 3 YEARS OF DATA
  • 35. BLACKHOLE V1 • 36 INDEXES, 40B DOCUMENTS • 1.5B NEW DOCUMENTS EACH MONTH • 72 SERVERS RUNNING ELASTICSEARCH 2.3 • CAPACITY: 209TB, 120TB DATA, 4.5TB RAM • QUERIES ON THE WHOLE DATASET
  • 36. CLUSTER TOPOLOGY • 75 4-CORE XEON D-1520, 64GB RAM SERVERS • 4 HTTP NODES • 3 MASTER NODES • 68 DATA NODES: 4*800GB SSD IN RAID 0
  • 37. BEFORE ELASTICSEARCH • RUN QUERIES AGAINST A SPHINX CLUSTER TO GET THE RIGHT DOCUMENT IDS • FETCH THE DOCUMENTS FROM A GALERA CLUSTER AND THE METADATA FROM ANOTHER GALERA CLUSTER • MERGE AND DISPLAY THE DOCUMENTS
  • 38. SPHINX NIGHTMARE • CAN’T SCALE HORIZONTALLY: LIMITED TO 14 MONTHS OF DATA • ONE COMPLEX QUERY AND THE WHOLE CLUSTER REACHES A LOAD OF 400 • MYSQL LOAD
  • 40. INDEXING • A GO PROGRAM MERGES 3 GALERA CLUSTERS AND 1 ES CLUSTER INTO A KAFKA QUEUE: 30,000 DOCUMENTS / SECOND • 8 GO INDEXERS MAP THE INDEX / DATA NODE DISTRIBUTION AND PUSH THE DATA DIRECTLY TO THE RIGHT DATA NODE: 60,000 DOCUMENTS / SECOND FOR 3 WEEKS, WITH PEAKS OF 200,000 / SECOND
  • 42. KAFKA IS TOO SLOW • THE 72TB KAFKA QUEUE IS TOO SLOW: ONLY 10,000 DOCUMENTS / SECOND / PARTITION BECAUSE OF SPINNING DISKS
  • 43. MASSIVE QUERIES CRASH HALF THE CLUSTER • ELASTICSEARCH CACHES THE RESULT OF FILTERED QUERIES: SET _CACHE TO FALSE • UPGRADE TO 1.7.5: THE FILTERED QUERIES CACHE HAS A MEMORY LEAK IN 1.7.4
  • 44. BIGGEST QUERIES ARE SLOW AS HELL • DIVIDE THE GLOBAL QUERIES PER INDEX AND RUN THEM IN PARALLEL • PROCESS THE RESULTS POST QUERY AT THE API LEVEL
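
A minimal sketch of the "split per index, run in parallel, merge at the API level" approach, using a thread pool. `search_one` is a placeholder for the real per-index search call. The merge step shown here (a global sort by score) is an assumption about what the post-query processing does.

```python
from concurrent.futures import ThreadPoolExecutor


def search_all(indexes, search_one, max_workers=8):
    """Run search_one(index) for every index in parallel and merge the hits.

    search_one is a caller-supplied function returning a list of
    (score, doc_id) tuples for a single index.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_index = list(pool.map(search_one, indexes))
    merged = [hit for hits in per_index for hit in hits]
    # Post-processing at the API level: here, a global sort by score.
    return sorted(merged, reverse=True)


fake_hits = {
    "2016-01": [(0.5, "a")],
    "2016-02": [(0.9, "b"), (0.1, "c")],
}
hits = search_all(list(fake_hits), fake_hits.__getitem__)
print(hits)
```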
  • 45. REINDEXING 40 BILLION DOCS IN 5 DAYS
  • 46. UPGRADING A MAPPING, ON A LIVE CLUSTER • CAN’T CHANGE A FIELD TYPE • CAN’T UPDATE ANALYZERS LIVE • CAN’T REORGANIZE THE MAPPING WITH EXISTING DATA
  • 47. CLUSTER DESIGN • 36 MONTHS OF DATA, 40 BILLION DOCUMENTS • 70 DATA NODES, 3 MASTERS • 1 INDEX PER DAY, 12 SHARDS, 1 REPLICA
  • 48. REINDEXING • USE LOGSTASH TO READ, TRANSFORM AND WRITE EXISTING DATA ON EACH DATA NODE • EACH DATA NODE WRITES TO A DEFINED FRIEND IN THE OTHER PART OF THE CLUSTER • 5000 DOCUMENTS PER SCROLL, 10 INDEXING WORKERS • SCROLLS AGAINST A FULL DAY
  • 49. LOGSTASH CONFIGURATION
      input {
        elasticsearch {
          hosts => [ "local elasticsearch node" ]
          index => "index to read from"
          size => 5000
          scroll => "20m" # 5 minutes initially
          docinfo => true
          query => '{ "query": { "range": { "date": { "gte": "2015-07-23T10:00:00.000+01:00", "lte": "2015-07-23T11:00:00.000+01:00" } } } }'
        }
      }
      output {
        elasticsearch {
          host => "remote elasticsearch node"
          index => "index to write to"
          protocol => "http"
          index_type => "%{[@metadata][_type]}"
          document_id => "%{[@metadata][_id]}"
          workers => 10
        }
        stdout {
          codec => rubydebug # because removing the timestamp field makes logstash crash
        }
      }
      filter {
        mutate {
          rename => { "some field" => "some other field" }
          rename => { "another field" => "somewhere else" }
          remove_field => [ "something", "something else", "another field", "some field", "@timestamp", "@version" ]
        }
      }
  • 50. ES CONFIGURATION CHANGES
      indices.memory.index_buffer_size: 50%            # instead of 10%
      index.store.throttle.type: "none"                # as fast as your SSD can go
      index.translog.disable_flush: true
      index.refresh_interval: -1                       # instead of 1s
      indices.store.throttle.max_bytes_per_sec: "2gb"
  • 51. PROBLEMS • MISSING DOCUMENTS AS SCROLL LOST ITS SEARCH CONTEXT • SOMETIMES, THE INDEXING NODES CRASH • LOGSTASH DOES NOT LIKE NETWORK ISSUES • NEED TO REPLAY A FULL DAY TO CATCH UP WITH THE DATA
  • 52. SOLUTIONS • RUN HOURLY QUERIES • WRITE A SMALL ORCHESTRATOR • INTRODUCING YOKO AND MOULINETTE
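
Splitting a day into hourly scroll queries can be sketched like this in Python. The `date` field name matches the Logstash query on slide 49. The helper name is hypothetical. Using `gte` / `lt` rather than `gte` / `lte` is a choice made here so that adjacent windows do not overlap.

```python
from datetime import datetime, timedelta


def hourly_range_queries(day, field="date"):
    """Return 24 range queries, one per hour of `day` (a datetime at midnight)."""
    queries = []
    for hour in range(24):
        start = day + timedelta(hours=hour)
        end = start + timedelta(hours=1)
        queries.append({
            "query": {
                "range": {
                    field: {"gte": start.isoformat(), "lt": end.isoformat()}
                }
            }
        })
    return queries


queries = hourly_range_queries(datetime(2015, 7, 23))
print(len(queries), queries[10])
```

If a scroll loses its search context, only one hour needs to be replayed instead of a full day.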
  • 53. YOKO, THE REINDEXING ORCHESTRATOR • SMALL PYTHON DAEMON TO QUERY A MYSQL DATABASE • INDEX FROM • INDEX TO • LOGSTASH QUERY • STATUS: TODO, PROCESSING, DONE, COMPLETE, FAILED
  • 54. YOKO, THE REINDEXING ORCHESTRATOR • CREATES THE DAILY INDEXES • FOR EACH "DONE" INDEX, RUNS THE LOGSTASH QUERY AGAINST THE INITIAL INDEX AND COMPARES THE NUMBER OF DOCUMENTS • MOVES EACH "DONE" LINE TO "COMPLETE" IF THE COUNT MATCHES, OR TO "FAILED" OTHERWISE • DELETES EACH MONTHLY INDEX WHEN EVERY DAY OF THE MONTH IS "COMPLETE"
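
The verification step described on this slide amounts to a small state machine. The sketch below is not Yoko's code: the row layout and function names are hypothetical, and document counts are passed in rather than fetched from Elasticsearch.

```python
from enum import Enum


class Status(Enum):
    TODO = "todo"
    PROCESSING = "processing"
    DONE = "done"
    COMPLETE = "complete"
    FAILED = "failed"


def verify(row, source_count, target_count):
    """Promote a DONE row to COMPLETE if the counts match, else to FAILED."""
    if row["status"] is not Status.DONE:
        return row  # Only finished reindexing jobs are checked.
    status = Status.COMPLETE if source_count == target_count else Status.FAILED
    return {**row, "status": status}


def month_is_complete(rows):
    """A monthly index may be deleted once every daily row is COMPLETE."""
    return bool(rows) and all(r["status"] is Status.COMPLETE for r in rows)


ok = verify({"index_to": "2015-07-23", "status": Status.DONE}, 1000, 1000)
bad = verify({"index_to": "2015-07-24", "status": Status.DONE}, 1000, 990)
print(ok["status"], bad["status"], month_is_complete([ok, bad]))
```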
  • 55. MOULINETTE, THE REINDEXING SCRIPT • SMALL BASH SCRIPT THAT QUERIES YOKO • GENERATES THE LOGSTASH.CONF FILE FROM YOKO DATA • RUNS LOGSTASH • SWITCHES YOKO LINE TO DONE WHEN DONE
  • 56.
  • 57. PROBLEMS • TRANSFORMING FIELDS WITH LOGSTASH IS SLOW • SHOULD RUN INDEXING ON FEWER NODES • SOMETIMES, LOGSTASH HANGS AND NEEDS TO BE FORCE-KILLED • YOKO SHOULD DETECT THIS AND RAISE AN ERROR
  • 59. BEFORE UPGRADING • CHECK YOUR PLUGINS CAN RUN ON 2.X • CHECK YOUR MAPPINGS ARE 2.X COMPLIANT • CHECK FOR CONFIGURATION DEPRECATION
  • 60. SEEMS EASY? 1. SHUT DOWN THE CLUSTER 2. UPGRADE ES TO 2.3 3. UPGRADE PLUGINS 4. START THE WHOLE CLUSTER, MASTERS FIRST
  • 61. UNSUPPORTED PLUGINS AND ANALYZERS • CAN’T UPDATE AN ANALYZER ON AN OPEN INDEX • CLOSE ALL INDEXES • APPLY A TEMPORARY DUMMY ANALYZER • REOPEN INDEXES
  • 62. DUMMY KOREAN ANALYZER
      "analyzer": {
        "kr_analyzer": {
          "tokenizer": "standard",
          "filter": [ "cjk_width", "lowercase", "cjk_bigram", "english_stop" ]
        }
      }