October 11-14, 2016 • Boston, MA
SolrCloud: High Availability and Fault Tolerance
Mark Miller
Software Engineer, Cloudera
Who am I?
I’m Mark Miller
I’m a Lucene junkie (2006)
I’m a Lucene committer (2008)
And a Solr committer (2009)
And a member of the ASF (2011)
And a former Lucene PMC Chair (2014-2015)
I’ve done a lot of core Solr work and co-created SolrCloud
This talk is about how SolrCloud tries to protect your data.
And about some things that should change.
[Figure: SolrCloud diagram]
Failure Cases (Shards of index can be treated independently)
• A leader dies (loses its ZK connection).
• A replica dies, or an update from the leader to the replica fails.
• A replica is partitioned (e.g., it can talk to ZK but not to its shard leader).
Replica Recovery
• A replica will recover from the leader on startup.
• A replica will recover if an update from the leader to the replica fails.
• A replica may recover from the leader in the leader election sync-up dance.
Replica Recovery Dance
• Start buffering updates from the leader.
• Publish RECOVERING to ZK.
• Wait for the leader to see the RECOVERING state.
• On the first recovery try, PeerSync.
• Otherwise, full index replication:
  • Commit on the leader.
  • Replicate the index.
• Replay buffered documents.
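The order of the steps above can be sketched in a few lines of Python. This is only an illustration, not Solr's code: the step names and the `peer_sync` callback are made up, and the sketch assumes that a failed PeerSync on the first try falls through to full replication in the same attempt.

```python
from typing import Callable, List


def recover_replica(first_attempt: bool, peer_sync: Callable[[], bool]) -> List[str]:
    """Return the ordered steps of one recovery attempt (illustrative names only)."""
    steps = [
        "buffer_updates",       # start buffering updates from the leader
        "publish_recovering",   # publish the RECOVERING state to ZK
        "wait_for_leader_ack",  # wait until the leader has seen that state
    ]
    synced = False
    if first_attempt:
        # PeerSync is only tried on the first recovery attempt.
        steps.append("peer_sync")
        synced = peer_sync()
    if not synced:
        # Assumption: fall back to full index replication.
        steps += ["commit_on_leader", "replicate_index"]
    # Buffered updates are replayed last, whichever path was taken.
    steps.append("replay_buffered_updates")
    return steps
```

A first attempt with a successful PeerSync skips replication entirely; any later attempt goes straight to replication.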
(Implemented in Solr's RecoveryStrategy class.)
A Replica is Partitioned
• In the early days we half punted on this.
• Now, when a leader cannot reach a replica, it puts that replica into LIR (leader-initiated recovery) in ZK.
• A replica in LIR will realize that it must recover before clearing its LIR status.
• We worked through some bugs, but this is very solid now.
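The LIR rule can be pictured with a toy model. The class and method names below are hypothetical, and an in-memory set stands in for the markers kept in ZK; this is a sketch of the idea, not of Solr's implementation.

```python
class ShardState:
    """Toy stand-in for the LIR markers a shard keeps in ZK (hypothetical API)."""

    def __init__(self) -> None:
        self.lir = set()  # replicas the leader has flagged as needing recovery

    def leader_update_failed(self, replica: str) -> None:
        # The leader could not reach the replica, so it flags it in LIR.
        self.lir.add(replica)

    def replica_may_go_active(self, replica: str, recovered: bool) -> bool:
        # A flagged replica must finish recovery before the flag is cleared.
        if replica in self.lir:
            if not recovered:
                return False
            self.lir.discard(replica)
        return True
```

The point of the flag is that the partitioned replica does not need to notice the failure itself: the leader records it, and the replica finds out when it next checks its own status.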
Leader Recovery
• The ‘best effort’ leader recovery dance:
• If it’s after startup and the last published state is not active, the replica can’t be leader.
• Otherwise, try to peer sync with the shard.
• On success, try to peer sync from the replicas to the leader.
• If any of those syncs fail, ask the replicas to recover from the leader.
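Read as a decision procedure, the bullets above look roughly like the following sketch. The function, its parameters, and the action names are illustrative assumptions; the two callbacks stand in for the real sync calls, and the sketch covers only what the slide states.

```python
from typing import Callable, List


def leader_sync_actions(
    after_startup: bool,
    last_published_active: bool,
    sync_with_shard: Callable[[], bool],
    sync_replicas_to_leader: Callable[[], bool],
) -> List[str]:
    """Return the actions a leader candidate takes (illustrative names only)."""
    if after_startup and not last_published_active:
        # A replica that was not ACTIVE when last seen cannot lead.
        return ["cannot_be_leader"]
    actions = ["sync_with_shard"]
    ok = sync_with_shard()
    if ok:
        actions.append("sync_replicas_to_leader")
        ok = sync_replicas_to_leader()
    if not ok:
        # Best effort: if either sync failed, replicas recover from the leader.
        actions.append("ask_replicas_to_recover")
    return actions
```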
(Implemented in Solr's SyncStrategy and ElectionContext classes.)
Leader Election Forward Progress Stall…
• Each replica decides for itself if it thinks it should be leader.
• Everyone may think they are unfit.
• Only replicas that have last published ACTIVE will attempt to be leader after the first election.
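A small sketch shows how this rule can leave a shard without a leader. The function and the state strings are illustrative assumptions; each replica is modelled as checking only its own last published state.

```python
from typing import Dict, Optional


def elect_leader(last_published: Dict[str, str], first_election: bool) -> Optional[str]:
    """Return the first replica that considers itself fit, or None if none does."""
    for replica, state in last_published.items():
        # After the first election, only a replica that last published
        # ACTIVE will put itself forward.
        if first_election or state == "ACTIVE":
            return replica
    return None  # everyone thinks they are unfit: the election stalls
```

If no replica last published ACTIVE, the function returns `None`, which corresponds to the stall: nobody steps up, and nothing in the rule itself breaks the tie.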
Leader Election Forward Progress Stall…
• While rare, if all replicas in a shard lose their connection to ZK at the same time, no replica will become leader without intervention.
• There is a manual API to intervene, but this should be done automatically.
• In practice, this tends to happen for reasons that can be ‘tuned’ away.
• This still needs to be improved.
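The slide does not name the manual API. Assuming it refers to the Collections API `FORCELEADER` action, a request URL could be built as in the sketch below; the host, collection, and shard names are placeholders, and the helper only builds the URL without sending anything.

```python
from urllib.parse import urlencode


def force_leader_url(base_url: str, collection: str, shard: str) -> str:
    """Build a Collections API FORCELEADER URL (assumed to be the API meant above)."""
    params = {"action": "FORCELEADER", "collection": collection, "shard": shard}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Placeholder host and names, for illustration only.
print(force_leader_url("http://localhost:8983/solr", "films", "shard1"))
```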
User chooses durability requirements
• You can specify how many replicas you want to see success from to consider an update successful, using the min_rf parameter.
• The update won’t fail based on that criterion, though; Solr simply flags it in the response.
• If your replication factor is not achieved, that also does not mean the update is rolled back.
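Because the update is neither failed nor rolled back, it is up to the client to inspect the response. The sketch below assumes the achieved replication factor is reported as `rf` in the response header; treat that response shape as an assumption to check against your Solr version.

```python
from typing import Any, Dict


def achieved_min_rf(response: Dict[str, Any], min_rf: int) -> bool:
    """Return True if the achieved replication factor meets the requested minimum."""
    header = response.get("responseHeader", {})
    achieved = header.get("rf")  # assumed key for the achieved replication factor
    return achieved is not None and int(achieved) >= min_rf
```

When this returns `False`, the document has still been accepted by at least the leader, so the client decides what to do next, for example retrying the update later.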
User chooses durability requirements
• If we improve some of this…
• We can stop trying so hard.
• And put it on the user to specify a replication factor that controls how ‘safe’ updates are.
JIRA
Handling Cluster Shutdown / Startup
• What if an old replica returns?
• How to ensure every replica participates in election?
• What if no replica thinks it should be leader?
• Staggered shutdowns?
• Explicit cluster commands might help.
Thank You!
Mark Miller
@heismark
Software Engineer, Cloudera

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, Cloudera

  • 1. October 11-14, 2016 • Boston, MA
  • 2. SolrCloud: High Availability and Fault Tolerance. Mark Miller, Software Engineer, Cloudera
  • 3. Who am I?
      I’m Mark Miller
      I’m a Lucene junkie (2006)
      I’m a Lucene committer (2008)
      And a Solr committer (2009)
      And a member of the ASF (2011)
      And a former Lucene PMC Chair (2014-2015)
      I’ve done a lot of core Solr work and co-created SolrCloud
  • 4. This talk is about how SolrCloud tries to protect your data. And about some things that should change.
  • 6. Failure Cases (Shards of index can be treated independently)
      • A Leader dies (loses ZK connection)
      • A Replica dies or update from leader to replica fails.
      • A Replica is partitioned (e.g. can talk to ZK, but not a shard leader)
      (Diagram: R, L, ZK)
  • 7. Replica Recovery
      • A replica will recover from the leader on startup.
      • A replica will recover if an update from the leader to the replica fails.
      • A replica may recover from the leader in the leader election sync up dance.
      (Diagram: R, L, ZK)
  • 8. Replica Recovery Dance
      • Start Buffering Updates from Leader
      • Publish Recovering to ZK
      • Wait for leader to see Recovering State
      • On first Recovery try, PeerSync
      • Otherwise full index replication
      • Commit on leader
      • Replicate Index
      • Replay Buffered Documents
      (Diagram: R, L, ZK. Class: RecoveryStrategy)
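The recovery steps above can be sketched as a toy, in-memory model. This is not Solr's `RecoveryStrategy`; the `Leader` and `Replica` classes, the `recover` function, and the `peer_sync_window` threshold are all hypothetical names introduced for illustration, and updates are modelled as plain integers.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    log: list = field(default_factory=list)  # update ids the leader has applied

    def commit(self):
        # Stand-in for "commit on leader": returns a snapshot of the index.
        return list(self.log)


@dataclass
class Replica:
    index: list = field(default_factory=list)
    buffer: list = field(default_factory=list)
    state: str = "down"


def recover(replica, leader, attempt, incoming=(), peer_sync_window=100):
    """Toy version of the recovery dance; returns which path was taken."""
    replica.buffer = list(incoming)   # start buffering updates from the leader
    replica.state = "recovering"      # published to ZK in the real system
    # (Waiting for the leader to see the recovering state is omitted here.)
    missed = [u for u in leader.log if u not in replica.index]
    if attempt == 0 and len(missed) <= peer_sync_window:
        # First try: PeerSync, assumed to work only for a small gap.
        replica.index.extend(missed)
        method = "peersync"
    else:
        # Otherwise: commit on the leader, then replicate the whole index.
        replica.index = leader.commit()
        method = "replication"
    for update in replica.buffer:     # replay buffered documents
        if update not in replica.index:
            replica.index.append(update)
    replica.buffer.clear()
    replica.state = "active"
    return method
```

For example, a replica holding updates 1 to 3 of a leader's 1 to 5 takes the PeerSync path on its first attempt, while any later attempt falls back to full replication.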
  • 9. A Replica is Partitioned
      • In the early days we half punted on this
      • Now, when a leader cannot reach a replica, it will put it in LIR (leader-initiated recovery) in ZK.
      • A replica in LIR will realize that it must recover before clearing its LIR status.
      • We worked through some bugs, but this is very solid now.
      (Diagram: R, L, ZK, with the leader-to-replica link marked X)
  • 10. Leader Recovery
      • The ‘best effort’ leader recovery dance
      • If it’s after startup and the last published state is not active, can’t be leader.
      • Otherwise, try to peer sync with shard.
      • If success, try to peer sync from replicas to leader.
      • If any of those sync fails, ask replicas to recover from leader.
      (Diagram: R, L, ZK. Classes: SyncStrategy / ElectionContext)
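One possible reading of the leader recovery steps above, written as a small decision function. It is a sketch of the slide's bullets, not of the actual `SyncStrategy` or `ElectionContext` code; `leader_recovery`, the dictionary shape of a replica, and the `sync_ok(source, target)` callback standing in for PeerSync are assumptions made for illustration.

```python
def leader_recovery(candidate, replicas, after_startup, sync_ok):
    """Return (may_lead, names_of_replicas_asked_to_recover).

    sync_ok(source, target) -> bool is a hypothetical stand-in for a
    PeerSync that pulls updates from `source` into `target`.
    """
    # After startup, a candidate whose last published state is not
    # active cannot be leader.
    if after_startup and candidate["last_published"] != "active":
        return False, []

    # The candidate tries to peer sync with the rest of the shard.
    if not all(sync_ok(replica, candidate) for replica in replicas):
        # Best effort: the sync failed, so every replica is asked to
        # recover from the new leader.
        return True, [replica["name"] for replica in replicas]

    # Then each replica tries to peer sync from the leader; the ones
    # that fail are asked to recover from it.
    failed = [r["name"] for r in replicas if not sync_ok(candidate, r)]
    return True, failed
```

Returning the list of replicas to recover, rather than triggering recovery inside the function, keeps the sketch free of side effects and easy to check by hand.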
  • 11. Leader Election Forward Progress Stall…
      • Each replica decides for itself if it thinks it should be leader.
      • Everyone may think they are unfit.
      • Only replicas that have last published ACTIVE will attempt to be leader after the first election.
  • 12. Leader Election Forward Progress Stall…
      • While rare, if all replicas in a shard lose their connection to ZK at the same time, no replica will become leader without intervention.
      • There is a manual API to intervene, but this should be done automatically.
      • In practice, this tends to happen for reasons that can be ‘tuned’ away.
      • Still needs to be improved.
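The stall described in the two slides above can be illustrated with a toy eligibility check, followed by a helper that builds a manual-intervention request. The slides do not name the manual API; the helper assumes it is the Collections API `FORCELEADER` action, and the function names, base URL, collection name, and shard name are hypothetical.

```python
from urllib.parse import urlencode


def eligible_leaders(last_published, first_election):
    """Replicas that would attempt to lead, given each one's last published state."""
    if first_election:
        return list(last_published)
    # After the first election, only replicas that last published
    # "active" will attempt to become leader.
    return [name for name, state in last_published.items() if state == "active"]


def force_leader_url(base_url, collection, shard):
    # Assumption: the manual API is the Collections API FORCELEADER action.
    query = urlencode({"action": "FORCELEADER", "collection": collection, "shard": shard})
    return f"{base_url}/admin/collections?{query}"


# Every replica lost its ZK connection, so none last published "active":
states = {"replica1": "down", "replica2": "down"}
stalled = eligible_leaders(states, first_election=False) == []
```

When `stalled` is true, no replica volunteers, which is the point where an operator would currently have to step in by hand.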
  • 13. User chooses durability requirements
      • You can specify how many replicas must report success for an update to be considered successful (the minRf param).
      • The update will not fail on that criterion, though; the response simply flags the shortfall.
      • If your replication factor is not achieved, that also does not mean the update is rolled back.
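Because the update neither fails nor rolls back, the client has to inspect the response itself. A minimal sketch of that check follows, assuming the request parameter is spelled `min_rf` and the achieved factor comes back as `rf` in the JSON `responseHeader`; both spellings and the response shape should be verified against the Solr version in use, and the function names are hypothetical.

```python
def update_params(min_rf):
    # Request parameters to send along with an update (assumed spelling: min_rf).
    return {"min_rf": min_rf, "wt": "json"}


def replication_satisfied(response, min_rf):
    """True if the achieved replication factor meets the requested minimum.

    `response` is the parsed JSON body of an update response. A missing
    `rf` value is treated as "not satisfied" so the caller errs on the
    safe side.
    """
    achieved = response.get("responseHeader", {}).get("rf")
    return achieved is not None and achieved >= min_rf


# Example response in which only one replica acknowledged the update:
example = {"responseHeader": {"status": 0, "rf": 1, "min_rf": 2}}
needs_retry = not replication_satisfied(example, min_rf=2)
```

What to do when the check fails (retry, queue the document, alert) is left to the application, since the update has already been applied wherever it succeeded.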
  • 14. User chooses durability requirements
      • If we improve some of this…
      • We can stop trying so hard.
      • And put it on the user to specify a replication factor that controls how ‘safe’ updates are.
  • 16. Handling Cluster Shutdown / Startup
      • What if an old replica returns?
      • How to ensure every replica participates in election?
      • What if no replica thinks it should be leader?
      • Staggered shutdowns?
      • Explicit cluster commands might help