Comparing open source search engines
Richard Boulton
@rboulton
richard@cnav.co.uk
Search Engine?
Document oriented database
Inverted index
Ranking / weighting algorithm
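To make the three ingredients above concrete, here is a minimal sketch of an inverted index with a simple TF-IDF style ranking. It is an illustration only: the function names, the sample documents, and the weighting formula are assumptions chosen for brevity, not what Lucene, Xapian, or Sphinx actually implement.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Inverted index: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Rank documents by a simple TF-IDF sum over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue
        # Rarer terms get a higher weight (illustrative formula).
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Best score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical sample documents.
docs = {
    1: "xapian is a search engine library",
    2: "lucene is a java search library",
    3: "sphinx search search server",
}
index = build_index(docs)
results = search(index, len(docs), "search server")
```

Document 3 ranks first here because it contains both query terms and repeats "search"; real engines use more refined weighting schemes than this sum.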
Lucene
Java
Apache License
Low-level: Java API
Lucene Family
Solr: “REST-like” XML/JSON API
ElasticSearch: REST API
… and many commercial engines
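As a rough illustration of the HTTP interfaces mentioned above, the sketch below only builds the requests and does not send them. The host names, ports, the core/index name "articles", and the field "title" are assumptions; the helper functions are hypothetical and not part of either project.

```python
import json
from urllib.parse import urlencode


def solr_select_url(base, core, query):
    """Build a Solr select URL asking for a JSON response."""
    params = urlencode({"q": query, "wt": "json"})
    return f"{base}/solr/{core}/select?{params}"


def elasticsearch_search_request(base, index, field, text):
    """Build the URL and JSON body for an ElasticSearch match query."""
    url = f"{base}/{index}/_search"
    body = json.dumps({"query": {"match": {field: text}}})
    return url, body


# Assumed local installations on their usual default ports.
solr_url = solr_select_url("http://localhost:8983", "articles", "title:xapian")
es_url, es_body = elasticsearch_search_request(
    "http://localhost:9200", "articles", "title", "xapian"
)
```

Solr takes the query as URL parameters, while ElasticSearch takes a JSON document in the request body.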
Xapian
C++
GPLv2
Low-level: C++ API
Python/Ruby/PHP/Perl/Java bindings
Partiality
Risk
Xapian Family
Omega: Indexer + CGI interface
Flax: REST API
Xappy: Python wrapper
Sphinx
C++
GPLv2
SQL-like API
Others
Riak Search
Terrier
MySQL Fulltext
PostgreSQL FTS
Redis
Whoosh
Document model
Lucene, Xapian:
List of terms
Solr, Sphinx:
Fields in a predefined fixed schema.
Flax, Xappy:
Fields, with associated modifiable schema.
ElasticSearch:
Fields, document types, free schema.
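The difference between these document models can be sketched in a few lines. This is a toy model under stated assumptions: the schema contents, field names, and helper functions are hypothetical and do not correspond to any engine's real API.

```python
# Lucene/Xapian level: a document is essentially a list of terms.
term_doc = ["comparing", "open", "source", "search", "engines"]

# Solr/Sphinx level: fields must match a schema fixed in advance.
FIXED_SCHEMA = {"title": str, "year": int}


def validate_fixed(doc, schema=FIXED_SCHEMA):
    """Reject documents whose fields or types are not in the schema."""
    unknown = set(doc) - set(schema)
    if unknown:
        raise ValueError(f"fields not in schema: {sorted(unknown)}")
    for name, value in doc.items():
        if not isinstance(value, schema[name]):
            raise TypeError(f"field {name!r} must be {schema[name].__name__}")
    return doc


# ElasticSearch level: no schema up front; field types are inferred
# from the first value seen for each field.
def infer_schema(docs):
    schema = {}
    for doc in docs:
        for name, value in doc.items():
            schema.setdefault(name, type(value))
    return schema
```

A modifiable schema, as in Flax and Xappy, sits between the last two: the schema exists explicitly but new fields can be added to it after indexing has started.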
Updates
Lucene, Xapian + families:
Dynamic updates
Use batches for fastest updates
Sphinx:
No updates to existing indexes
(“Realtime indexing” in beta with SQL API)
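The advice to batch updates can be sketched as a small buffer that commits documents in groups rather than one at a time. The class name, the `commit` callback, and the batch size are assumptions for illustration; each engine has its own way of committing or flushing changes.

```python
class BatchingIndexer:
    """Buffer documents and hand them to `commit` in batches."""

    def __init__(self, commit, batch_size=1000):
        self._commit = commit          # callable that receives a list of docs
        self._batch_size = batch_size
        self._pending = []

    def add(self, doc):
        self._pending.append(doc)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        """Commit whatever is buffered; call once more at the end."""
        if self._pending:
            self._commit(list(self._pending))
            self._pending.clear()


# Usage with a stand-in commit function that just records each batch.
commits = []
indexer = BatchingIndexer(commits.append, batch_size=3)
for i in range(7):
    indexer.add({"id": i})
indexer.flush()
```

Seven documents with a batch size of three produce three commits instead of seven, which is where the speed-up comes from when each commit is expensive.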
Data structures
Lucene:
Hash-based segments
Hierarchical merge
Xapian:
B-tree, transactional
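A hierarchical merge of segments can be pictured as follows. This is a deliberately simplified sketch, not Lucene's actual merge policy: segments are modelled as sorted term lists, and the `merge_factor` of 3 and the function name are assumptions.

```python
def add_segment(levels, segment, merge_factor=3):
    """Add a segment at level 0, merging upwards when a level fills up.

    levels[i] is the list of segments currently at level i; each
    segment is a sorted list of terms.
    """
    level = 0
    while True:
        if len(levels) <= level:
            levels.append([])
        levels[level].append(segment)
        if len(levels[level]) < merge_factor:
            return levels
        # The level is full: merge its segments into one larger
        # segment and push that to the next level up.
        segment = sorted(set().union(*levels[level]))
        levels[level] = []
        level += 1


levels = []
for i in range(9):
    add_segment(levels, [f"t{i}"])
```

After nine single-term segments, everything has been merged twice and sits in one segment at level 2. Small segments are merged often and large ones rarely, which keeps the total merge work manageable.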
Scaling / replication
● All engines allow searches across databases
● Allows sharding
● All engines allow replication
● Allows spreading load and high availability
● Had difficulty with Sphinx
● ElasticSearch does it completely transparently
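Searching across sharded databases amounts to querying each shard and merging the best hits. The sketch below assumes in-memory shards and uses a raw term count as the score so that scores are comparable across shards; the function names and sample data are hypothetical.

```python
import heapq


def search_shard(shard, term):
    """Return (score, doc_id) hits from one shard."""
    hits = []
    for doc_id, text in shard.items():
        count = text.split().count(term)
        if count:
            hits.append((count, doc_id))
    return hits


def search_all(shards, term, k=3):
    """Query every shard and keep the k best hits overall."""
    hits = (hit for shard in shards for hit in search_shard(shard, term))
    return heapq.nlargest(k, hits)


# Two hypothetical shards, each holding part of the collection.
shards = [
    {"a1": "search engine search", "a2": "database"},
    {"b1": "search", "b2": "search search search"},
]
top = search_all(shards, "search", k=2)
```

In a real deployment each shard would be queried over the network, usually in parallel, and scores that depend on collection-wide statistics need extra care to stay comparable between shards.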
Commercial Support
Lucene: Lucid Imagination, Sematext, …
Xapian: Oligarchy Ltd, Flax, me
Sphinx: Sphinx Technologies Inc
Community
● Lucene / Solr community – revolting (they say)
● Xapian – quieter, but steadily growing
● Sphinx – popular amongst relational database users (apparently)
