ElasticSearch from the trenches

Vinicius Carvalho / @vccarvalho
blog.furiousbob.com
About

emusic.com, one of the leading digital music
retailers, is committed to serving music
enthusiasts with aggressive development of
tools and features designed solely to
enhance the personalized music experience

Searching is a BIG part of the emusic.com
discovery experience
Search challenges

Results MUST be relevant

  How do we define relevancy?

Low response times (< 100 ms)

High availability (users don’t tolerate not
finding what they are looking for)

Static search results are old school; your
engine should “know” your users’ preferences

How to recover from catastrophic crashes
Goals

Improve results relevancy : The Adele 21
problem

Get rid of proprietary software

Have an extensible search engine

Understand what’s happening under the hood

Integrate with other user projects :
recommendation, affinity, activity
SEABRO Project
Replace the current proprietary Endeca search
engine with a modern search engine

Get more relevant results

Flexible API

Facet searching (for browsing)

Ability to scale out

3 months in execution (1st phase)
Why ElasticSearch?


Built from the ground up to scale out

Powerful DSL

Schema-less -> JSON

Distributed Lucene

Very powerful API, allows automation of
the whole cluster using simple curl
commands

It’s bonsai cool
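
The "simple curl commands" point can be illustrated with a minimal sketch that prepares (without sending) the HTTP request that would create an index with explicit shard and replica counts. The host, the index name `catalog`, and the counts are hypothetical; `number_of_shards` and `number_of_replicas` are Elasticsearch index settings.

```python
import json
from urllib.request import Request


def build_create_index_request(base_url, index, shards, replicas):
    """Prepare (but do not send) a PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }
    return Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical host and index name, for illustration only.
req = build_create_index_request("http://localhost:9200", "catalog", 5, 1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to a running cluster; the equivalent curl command is a `PUT` to the same URL with the same JSON body.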
SOLR: The contender



Powered by Lucene

Far too many XML files to get anything
done

Uses XML

No Query DSL
Elasticsearch at a glance

[Diagram: clients send queries (q1) to the nodes and receive responses (r1)]
Cluster with 1 node

[Figure labels: cluster state, node name ("Cool Names"), shards]
Cluster with 2 nodes : Rebalancing

[Figure label: shard being relocated]
Cluster with 2 nodes : stable

[Figure label: recently relocated shard]
Cluster with 3 nodes : stable
Cluster with 3 nodes : adding replicas

Yellow state, not all replicas set
Cluster with 3 nodes : adding replicas

[Figure labels: index aliasing, primary shard]
Cluster with 3 nodes : node crash

Without replication, our cluster is now missing one shard.
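
The cluster states on these slides can be sketched as a toy model: green when every shard copy is on a live node, yellow when every shard is still served but some copy is unassigned, red when a shard has no surviving copy. This is a simplified illustration, not Elasticsearch's allocation logic; the node names and shard layout are hypothetical.

```python
def cluster_health(shard_copies, live_nodes):
    """shard_copies maps shard id -> {"primary": node, "replicas": [nodes]}."""
    status = "green"
    for copies in shard_copies.values():
        primary_alive = copies["primary"] in live_nodes
        live_replicas = [n for n in copies["replicas"] if n in live_nodes]
        if not primary_alive and not live_replicas:
            return "red"  # no copy of this shard survives: data is missing
        if not primary_alive or len(live_replicas) < len(copies["replicas"]):
            status = "yellow"  # shard is still served, but a copy is unassigned
    return status


# Three nodes, three shards, no replicas (hypothetical layout).
no_replicas = {
    0: {"primary": "n1", "replicas": []},
    1: {"primary": "n2", "replicas": []},
    2: {"primary": "n3", "replicas": []},
}
# Same shards with one replica each, placed on a different node.
with_replicas = {
    0: {"primary": "n1", "replicas": ["n2"]},
    1: {"primary": "n2", "replicas": ["n3"]},
    2: {"primary": "n3", "replicas": ["n1"]},
}

after_crash = {"n1", "n2"}  # node n3 is gone
print(cluster_health(no_replicas, after_crash))    # red
print(cluster_health(with_replicas, after_crash))  # yellow
```

With one replica per shard, losing a single node leaves every shard searchable, which is the point of the node crash slide.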
Client Node types


Three types

  Data node : joins the cluster, fetches shard
  data

  Client node : joins the cluster, participates
  in sorting operations, doesn’t fetch data

  Transport node : doesn’t join the cluster,
  only sends requests to it
Extensible architecture : plugins
Site : GUI interfaces

River : integration hook to fetch data from
other sources (DB, MQ, FS ... )

Transport : Allow different transports to be
plugged into ES core

Scripting : Allow adding new languages to
the scripting system

Analysis : Custom analyzers for indexing/
searching

Misc : You know, everything else
Site plugins
Our numbers

25+ million documents

Multiple types: Songs, Albums, Artists, Audio
Books, Composers, Labels, Authors,
Conductors ...

5+ million search requests per day

~100 GB index, and it takes only 1 hour to
build it from the ground up (thanks to the Akka
engine)
Indexing flow

[Diagram: Oracle -> akka actors -> ES cluster]
Lessons learned
Lesson #1
      Get professional Help


elasticsearch.org is very
well documented

But when it comes to
prod, ask the experts

Get professional support
from elasticsearch.com
Lesson #2
            Understand shards


Sharding is what makes
distributed search
possible

Understanding what shards
are and how they can
speed up your engine is a
must
Lesson #2
        Understand shards


Increasing the number of shards can improve
your query times

Each shard maps to a Lucene index reader/
writer; the more power your box has, the
more shards you should have

Replication will improve cluster response time
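
How documents end up on shards can be sketched in a few lines. Elasticsearch picks a shard by hashing a routing value (the document id by default) modulo the number of primary shards. In the sketch below `zlib.crc32` is only a stand-in for the real hash function, and the ids are hypothetical.

```python
import zlib
from collections import Counter


def pick_shard(routing_key, number_of_primary_shards):
    """Map a routing key to a shard number in [0, number_of_primary_shards)."""
    # crc32 is an assumption used for illustration, not the hash
    # Elasticsearch actually uses.
    digest = zlib.crc32(routing_key.encode("utf-8"))
    return digest % number_of_primary_shards


# Hypothetical document ids spread over 5 primary shards.
doc_ids = [f"album-{i}" for i in range(1000)]
distribution = Counter(pick_shard(doc_id, 5) for doc_id in doc_ids)
print(sorted(distribution.items()))
```

Because the shard is derived from the shard count, changing the number of primary shards changes where existing documents would be looked up, which is why the count is chosen when the index is created.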
Lesson #3
Design your data flow ahead of your schema


  The way you model your schemas has a
  deep impact on how fast your engine can
  become

  Don’t be afraid to replicate information using
  a different structure
Lesson #4
        Master Query DSL


Just as you know SQL, you should
understand the query DSL very well

Indexed Data won’t find itself

Understand that sometimes you must change
data representation in order to get things to
be found
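
As a small illustration of the query DSL, the sketch below builds a search body that combines a scored full-text match with a non-scoring filter. The field names `title` and `genre` are hypothetical, and the `bool`/`filter` form follows recent Elasticsearch versions; older releases, such as the 0.20 mentioned later in this deck, expressed filters differently.

```python
import json


def album_search(text, genre, size=10):
    """Build a query DSL body: full-text match on title, exact filter on genre."""
    return {
        "size": size,
        "query": {
            "bool": {
                # "must" clauses contribute to the relevance score.
                "must": [{"match": {"title": text}}],
                # "filter" clauses only include or exclude documents.
                "filter": [{"term": {"genre": genre}}],
            }
        },
    }


body = album_search("21", "pop")
print(json.dumps(body, indent=2))
```

The body is plain JSON, so it can be sent with any HTTP client to an index's `_search` endpoint.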
Lesson #5
Learn at least a bit about Lucene internals



 Understanding how Lucene’s scoring works
 helps you design better queries

 Elasticsearch supports custom score using
 scripts

   Could hurt you on performance :(
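
To make the scoring point concrete, here is a simplified sketch of the per-term factors in Lucene's classic TF-IDF similarity: term frequency, inverse document frequency, and field length normalization. It leaves out query normalization, coordination, and boosts, and newer Lucene versions default to BM25, so treat it as an illustration of the idea rather than the score a cluster would return. The numbers in the usage lines are made up.

```python
import math


def tf(freq):
    """Term frequency factor: grows with the square root of occurrences."""
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    """Inverse document frequency: rare terms weigh more."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def length_norm(num_terms):
    """Shorter fields weigh more than longer ones."""
    return 1.0 / math.sqrt(num_terms)


def term_score(freq, doc_freq, num_docs, field_length):
    """Simplified per-term contribution: tf * idf^2 * length norm."""
    return tf(freq) * idf(doc_freq, num_docs) ** 2 * length_norm(field_length)


# A rare term in a short title versus a common term in the same title.
rare = term_score(freq=1, doc_freq=9, num_docs=1000, field_length=4)
common = term_score(freq=1, doc_freq=499, num_docs=1000, field_length=4)
print(rare > common)
```

Seeing that a match in a short field outweighs the same match in a long one explains many "why is this document first?" surprises.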
Lesson #6
Put slow queries to work. Use explain


Explain gives useful information on how
documents are being scored

Slow query log will show you which queries
are actually hurting you

  Sometimes it’s just document cache misses
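
A short sketch of working with explain output: setting `"explain": true` in a search body makes each hit carry an `_explanation` tree of `value`, `description`, and `details`. The helper below flattens such a tree into indented lines. The sample tree and the `title` field are hypothetical and much smaller than real output.

```python
def format_explanation(node, depth=0):
    """Flatten an explanation tree into indented 'value  description' lines."""
    lines = ["  " * depth + f"{node['value']:.3f}  {node['description']}"]
    for child in node.get("details", []):
        lines.extend(format_explanation(child, depth + 1))
    return lines


# Search body asking for score explanations (field name is hypothetical).
search_body = {"explain": True, "query": {"match": {"title": "adele 21"}}}

# Hypothetical, heavily shortened explanation for one hit.
sample = {
    "value": 2.5,
    "description": "sum of:",
    "details": [
        {"value": 2.0, "description": "weight(title:adele)", "details": []},
        {"value": 0.5, "description": "weight(title:21)", "details": []},
    ],
}

lines = format_explanation(sample)
print("\n".join(lines))
```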
Lesson #7
      Take GC by the horns


ES nodes can demand a lot of memory

The JDK still thinks it’s 2003 when it comes to
memory size

Memory fragmentation

Full GC times can bring your cluster to its
knees
Lesson #7
     Take GC by the horns


Maximum 30 GB per node

Beefy machines = more nodes per machine

Changed the full GC threshold to start when
memory usage reaches 60%, giving the JVM plenty
of room before memory runs out
Lesson #8
   Caching can eat up your memory




Caching is a necessary evil but:

  Field cache stores sorted and faceted data

  Filter cache stores filtered data

Cache eviction must be controlled
Lesson #8
Caching can eat up your memory



Your queries and how you facet will have a
huge impact on cache size

The bigger your shard is, the more memory you will
need for caching

Facet caching for multi-valued fields in 0.20
is not optimal; take that into consideration
Lesson #9
       Monitor your cluster

Keep an eye on your cluster

It’s vital to monitor system metrics
(CPU, memory, file system) and to correlate
them with query information

ES has nice plugins like bigdesk and
paramedic, but history is vital, so get
something like Sematext SPM
Lesson #10
 Distributed systems are hard



Needless to say, but don’t expect all
that power to come for free
Lesson #11
Have an A/B testing suite ready



Defining relevancy is hard

People have different views on relevance

Hard to explain to a user why Joe Doe does
not show up in their query results
Lesson #11
Have an A/B testing suite ready

Start with a baseline search that returns “relevant
enough” results

Give points for every record found; the higher it ranks,
the more points it gets

Sum it all, and you have your score

When updating your queries, run the suite and
check if you get better results
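
The scoring scheme above can be sketched as follows: each expected record earns points based on its rank (more points near the top), the points are summed per query, and the totals are summed over the suite. The point scale, the queries, and the record ids are hypothetical, and `run_query` stands for whatever function returns ranked ids from the search engine.

```python
def rank_points(results, expected, depth=10):
    """Points for one query: rank 1 earns `depth`, rank 2 earns depth - 1, ..."""
    top = results[:depth]
    return sum(depth - top.index(doc) for doc in expected if doc in top)


def suite_score(run_query, cases, depth=10):
    """Sum the points over every (query -> expected records) case."""
    return sum(
        rank_points(run_query(query), expected, depth)
        for query, expected in cases.items()
    )


# Hypothetical relevance cases and canned results for two query versions.
cases = {"adele 21": ["album:21"], "beatles": ["artist:beatles"]}
baseline = {
    "adele 21": ["album:19", "album:21"],
    "beatles": ["artist:beatles"],
}
candidate = {
    "adele 21": ["album:21", "album:19"],
    "beatles": ["artist:beatles"],
}

print(suite_score(baseline.get, cases))   # 19
print(suite_score(candidate.get, cases))  # 20
```

A candidate query change is worth keeping when its suite score beats the baseline without dropping individual cases that matter.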
Lesson #12
        Track user interaction


Monitor how many “clicks” your users
make after you change your queries

Again, your definition of relevant may not be
what your users expect

Adapt
Final words
In the end, ES proved to be a very reliable
and affordable solution

Not only did we increase the quality of results,
we also reduced query
response times

Request time dropped over 200%. Cluster
size was reduced by 400%, with an 80%
increase in load

Yes, we did save money and increase
quality at the same time
Next steps
Classify data
Classify data at indexing time instead of
using custom scripts
Search + Recommendation
Click analysis
