SlideShare une entreprise Scribd logo
1  sur  38
Elasticsearch
For BEGINNERS
Neil Baker
Software Engineer (Zapelin S.L.)
What is elasticsearch?
ElasticSearch is a free and open source distributed inverted index created by Shay Banon.
Build on top of Apache Lucene
- Lucene is a most popular java-based full text search index implementation.
First public release version v0.4 in February 2010.
Developed in Java, so inherently cross-plateform.
Which companies use elasticsearch?
Easy to scale
Everything is one JSON call away (RESTful API)
Unleashed power of Lucene under the hood
Excellent Query DSL
Multi-tenancy
Support for advanced search features (Full Text)
Configurable and Extensible
Document Oriented
Schema free
Conflict management
Active community
Why Elasticsearch?
Elasticsearch allows you to start small, but will grow
with your business. It is built to scale horizontally out
of the box.
As you need more capacity, just add more nodes, and
let the cluster reorganize itself to take advantage of
the extra hardware.
Easy to Scale
RESTful API
Elasticsearch is API driven. Almost any action can be performed using a simple
RESTful API using JSON over HTTP.  An API already exists in the language of your
choice.
Responses are always in JSON, which is both machine and human readable.
Excellent Query DSL
The REST API exposes a very complex and capable query DSL, that is very easy to use. Every query is just a
JSON object that can practically contain any type of query, or even several of them combined.
Using filtered queries, with some queries expressed as Lucene filters, helps leverage caching and thus speed
up common queries, or complex queries with parts that can be reused.
Faceting, another very common search feature, is just something that upon-request is accompanied to
search results, and then is ready for you to use.
Per-operation Persistence
Elasticsearch puts your data safety first. Document changes are recorded in
transaction logs on multiple nodes in the cluster to minimize the chance of any
data loss.
You can host multiple indexes on one Elasticsearch installation - node
or cluster. Each index can have multiple "types", which are essentially
completely different indexes.
The nice thing is you can query multiple types and multiple indexes
with one simple query. This opens quite a lot of options.
Multi-tenancy
Support for advanced search features (Full Text)
Elasticsearch uses Lucene under the covers to provide the most powerful full text search
capabilities available in any open source product.
Search comes with multi-language support, a powerful query language, support for
geolocation, context aware did-you-mean suggestions, autocomplete and search snippets.
script support in filters and scorers
Many of Elasticsearch configurations can be changed while Elasticsearch is running, but some will require a restart (and in some cases
reindexing). Most configurations can be changed using the REST API too.
Elasticsearch has several extension points - namely site plugins (let you serve static content from ES - like monitoring javascript apps),
rivers (for feeding data into Elasticsearch), and plugins that let you add modules or components within Elasticsearch itself. This allows you
to switch almost every part of Elasticsearch if so you choose, fairly easily.
If you need to create additional REST endpoints to your Elasticsearch cluster, that is easily done as well.
Configurable and Extensible
Document Oriented
Store complex real world entities in Elasticsearch as structured JSON
documents. All fields are indexed by default, and all the indices can be
used in a single query, to return results at breath taking speed.
Elasticsearch allows you to get started easily. Toss it a JSON document
and it will try to detect the data structure, index the data and make it
searchable. Later, apply your domain specific knowledge of your data
to customize how your data is indexed.
Schema free
Conflict management
Optimistic version control can be used where needed to ensure that data is
never lost due to conflicting changes from multiple processes.
Active community
The community, other than creating nice tools and plugins, is very
helpful and supporting. The overall vibe is really great, and this is an
important metric of any OSS project.
There are also some books currently being written by community
members, and many blog posts around the net sharing experiences
and knowledge
BASIC CONCEPTS
Cluster :
A cluster consistsofone or morenodeswhichsharethesamecluster name. Eachclusterhasasinglemasternode whichis chosenautomaticallyby
thecluster andwhichcan bereplacedif the currentmasternode fails.
Node :
A nodeisarunninginstanceofelasticsearchwhichbelongsto acluster. Multiplenodescanbestartedonasingleserverfor testingpurposes, but
usuallyyoushouldhaveone nodeper server.
Atstartup, anodewill useunicast(or multicast, if specified)to discoveranexistingcluster withthesameclusternameandwill tryto jointhatcluster.
Index :
Anindex is like a‘database’inarelationaldatabase. Ithas amappingwhichdefines multipletypes.
Anindex is alogicalnamespacewhichmapsto one or moreprimaryshardsandcanhavezero or morereplicashards.
Type :
A type islikea‘table’inarelationaldatabase. Each typehasalistoffields thatcanbespecifiedfordocuments of that type. The mappingdefines
howeachfieldinthedocumentis analyzed.
Document :
A documentisaJSONdocumentwhichis storedinelasticsearch. Itislike arowinatableinarelational database. Each documentisstoredinan
indexandhas atypeandan id.
A documentisaJSONobject(also knowninotherlanguages asa hash /hashmap/ associative array) whichcontains zeroor more fields, or key-
value pairs. Theoriginal JSONdocumentthatisindexedwillbestoredinthe_sourcefield, whichisreturnedbydefaultwhengettingor searching
for adocument.
Field :
A documentcontains alistoffields, or key-value pairs. Thevaluecanbeasimple(scalar)value(ega string, integer, date), or anestedstructurelike
an arrayoranobject. A fieldis similartoacolumnina table ina relationaldatabase.
The mappingfor eachfieldhas afield‘type’(notto be confusedwithdocumenttype)whichindicatesthetypeof data thatcanbe storedinthatfield,
eginteger, string, object. Themappingalso allows youto define(amongstother things) howthevalueforafieldshouldbe analyzed.
Mapping :
A mappingislikea‘schemadefinition’inarelational database. Eachindexhas amapping, whichdefineseachtype withintheindex, plus anumber
ofindex-widesettings.A mappingcaneither bedefinedexplicitly, or itwill begeneratedautomaticallywhenadocumentis indexed
Shard :
A shardisasingle Luceneinstance. Itis alow-level“worker” unitwhich is managedautomaticallybyelasticsearch. An indexisalogicalnamespace
whichpointstoprimaryandreplicashards.
Elasticsearchdistributes shards amongstallnodes in the cluster, andcanmove shardsautomaticallyfromone nodeto another inthecaseof node
failure, or theadditionof newnodes.
PrimaryShard :
Eachdocumentisstoredin asingleprimaryshard. Whenyouindexadocument, itisindexedfirston theprimaryshard, thenonallreplicasof the
primaryshard. Bydefault,anindex has5 primaryshards.Youcanspecifyfewer or more primaryshards to scalethenumber ofdocumentsthat
your indexcanhandle.
ReplicaShard :
Eachprimaryshardcanhavezero ormore replicas. A replicaisacopyof the primaryshard, andhastwo purposes:
1) increasefailover: areplicashardcanbepromotedto aprimaryshardiftheprimaryfails.
2) increaseperformance:get andsearchrequests canbehandledbyprimaryor replicashards.
ElasticSearch Routing
All of your data lives in a primary shard, somewhere in the cluster. You may have five shards or five hundred, but any particular
document is only located in one of them. Routing is the process of determining which shard that document will reside in.
Elasticsearch has no idea where to look for your document. All the docs were randomly distributed around your cluster. so
Elasticsearch has no choice but to broadcasts the request to all  shards. This is a non-negligible overhead and can easily impact
performance.
Wouldn’t it be nice if we could tell Elasticsearch which shard the document lived in? Then you would only have to search one shard
to find the document(s) that you need.
Routing ensures that all documents with the same routing value will locate to the same shard, eliminating the need to broadcast
searches.
Cluster Architecture
Index Request
Search Request
ELASTIC VS. MySQL
ElasticSearch
Indices
Types
Documents
Keys
MySQL
Database
Tables
Rows
Columns
Performance
Core i7, a 2Ghz, 8GB RAM, 128GB SSD)
Insert of 10 Mio. Datasets:
Elasticsearch: 23 Minutes
MySQL without index: 56 Minutes
MySQL with Index: 228 Minutes
Select name and firstname of 100 Entrys:
Elasticsearch: 5 ms
MySQL: 9 ms
Select of 100 full Entrys:
Elasticsearch: 5 ms
MySQL: 9 ms
Select of the next 100 full Entrys:
Elasticsearch: 4 ms
MySQL: 18 ms
INSTALL ELASTIC
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.1.1.tar.gz
tar -xzf elasticsearch-5.1.1.tar.gz
cd elasticsearch-5.1.1/
./bin/elasticsearch
Is it running?
GET http://localhost:9200/?pretty
Response :
{
"name": "Vivisector",
"cluster_name": "elasticsearch",
"version": {
"number": "2.3.3",
"build_hash": "218bdf10790eef486ff2c41a3df5cfa32dadcfde",
"build_timestamp": "2016-05-17T15:40:04Z",
"build_snapshot": false,
"lucene_version": "5.5.0"
},
"tagline": "You Know, for Search"
}
Let´s PLAY WITH
ELASTICSEARCH
Indexing a document
Request :
$ curl -XPUT "http://localhost:9200/test-data/cities/21" -d '{
"rank": 21,
"city": "Boston",
"state": "Massachusetts",
"population2010": 617594,
"land_area": 48.277,
"location": {
"lat": 42.332,
"lon": 71.0202 },
"abbreviation": "MA"
}‘
Response : {"ok":true,"_index":"test-data","_type":"cities","_id":"21","_version":1}
Getting a document
Request:
$ curl -XGET "http://localhost:9200/test-data/cities/21?pretty"
Response:
{
"_index" : "test-data",
"_type" : "cities",
"_id" : "21",
"_version" : 1,
"exists" : true, "_source" : {
"rank": 21,
"city": "Boston",
"state": "Massachusetts",
"population2010": 617594,
"land_area": 48.277,
"location": {
"lat": 42.332,
"lon": 71.0202 },
"abbreviation": "MA"
}
}
Updating a document
Request :
$ curl -XPUT "http://localhost:9200/test-data/cities/21" -d '{
"rank": 21,
"city": "Boston",
"state": "Massachusetts",
"population2010": 617594,
"population2012": 636479,
"land_area": 48.277,
"location": {
"lat": 42.332,
"lon": 71.0202 },
"abbreviation": "MA"
}‘
Response : {"ok":true,"_index":"test-data","_type":"cities","_id":"21","_version":2}
Searching
Searching and querying takes the format of: http://localhost:9200/[index]/[type]/[operation]
Search across all indexes and all types
http://localhost:9200/_search
Search across all types in the test-data index.
http://localhost:9200/test-data/_search
Search explicitly for documents of type cities within the test-data index.
http://localhost:9200/test-data/cities/_search
Search explicitly for documents of type cities within the test-data index using paging.
http://localhost:9200/test-data/cities/_search?size=5&from=10
There’s3 differenttypesofsearchqueries
 Full Text Search (query string)
 Structured Search (filter)
 Analytics (facets)
Full Text Search (query string)
Inthiscaseyouwillbe searchinginbitsofnaturallanguage for (partially) matchingquerystrings. TheQueryDSL
alternativefor searchingfor“Boston” inall documents, wouldlooklike:
Request:
$ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty=true" -d '{
“query": { “query_string": { “query": “boston" }}}’
Response: {
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0 },
"hits" : {
"total" : 1,
"max_score" : 6.1357985,
"hits" : [ {
"_index" : "test-data",
"_type" : "cities",
"_id" : "21",
"_score" : 6.1357985, "_source" : {"rank":"21","city":"Boston",...}
} ]
}
}...
Structured Search (filter)
Structuredsearchis aboutinterrogatingdata thathasinherentstructure. Dates, timesandnumbersareall structured—theyhave apreciseformatthatyou   
canperformlogicaloperations on. Commonoperations includecomparingrangesofnumbersor dates, or determiningwhichof two values is larger.
Withstructuredsearch, the answerto your questionis always ayes or no; somethingeither belongsinthesetor itdoes not. Structuredsearchdoes not
worryaboutdocumentrelevance or scoring—itsimplyincludes or excludesdocuments.   
Request:
$ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty=true" -d '{
“query": { “filtered": { “filter”: { “term": { “city” : “boston“ }}}}}’
$ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
"query": {
"range": {
"population2012": {
"from": 500000,
"to": 1000000
}}}}‘
$ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
"query": { "bool": { "should": [{ "match": { "state": "Texas"} }, {"match": { "state":
"California"} }],
"must": { "range": { "population2012": { "from": 500000, "to": 1000000 } } },
"minimum_should_match": 1}}}'
Analytics (facets)
Requestsofthistypewillnotreturnalistofmatchingdocuments,butastatisticalbreakdownof thedocuments.
Elasticsearchhasfunctionalitycalledaggregations,whichallowsyoutogeneratesophisticatedanalyticsoveryourdata.ItissimilartoGROUPBYinSQL.
Request:
$ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty=true" -d '{
“aggs": { “all_states": { “terms“: { “field” : “state“ }}}}’
Response:
{ ...
"hits": { ... },
"aggregations": {
"all_states": {
"buckets": [
{"key": "massachusetts ", "doc_count": 2},
{"key": "danbury", "doc_count": 1}
]
}}}
ElasticSearch Monitoring
ElasticSearch-Head - https://github.com/mobz/elasticsearch-head
Marvel - http://www.elasticsearch.org/guide/en/marvel/current/#_marvel_8217_s_dashboards
Paramedic - https://github.com/karmi/elasticsearch-paramedic
Bigdesk - https://github.com/lukas-vlcek/bigdesk/
ElasticSearch Limitations
Security : ElasticSearchdoesnotprovideanybuild-in authenticationor accesscontrolfunctionality.
Transactions : There is no muchmoresupportfor transactions or processingondatamanipulation.
Durability : ESisdistributedandfairlystablebutbackupsanddurabilityarenotashighpriorityas inotherdatastores
Large Computations: Commandsfor searchingdataare notsuitedto"large"scansof data andadvancedcomputationonthe dbside.
Data Availability : ESmakesdataavailable in "near real-time" whichmayrequireadditional considerationsinyour application(ie:
commentspagewhereauser addsnewcomment,refreshingthepage mightnotactuallyshowthenewpostbecausetheindexisstill
updating).
Open-Source Libraries
https://github.com/elasticsearch
http://stackoverflow.com/questions/tagged/elasticsearch
stackoverflow
Get started.
www.elasticsearch.org
About me
Founder and Technical Director of Terions Communication LTD (London/Berlin, 1996-2006)
- Datacenter Operator for Press and Image Agencys and Part of the DPA (German Press Agency)
Software Developer Zapelin S.L (Adeje)
- Development of IPTV Solutions for Hotels e.g.
RHCE - Red Hat Certified Engineer
CCDP - Cisco Certified Design Professional
DBA - Oracle Certified Professional, MySQL 5 Database Administrator
Contact for Questions: neil@cconnect.es
Elasticsearch for beginners

Contenu connexe

Tendances

ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic IntroductionMayur Rathod
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchRuslan Zavacky
 
Introduction à ElasticSearch
Introduction à ElasticSearchIntroduction à ElasticSearch
Introduction à ElasticSearchFadel Chafai
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1Maruf Hassan
 
Elasticsearch From the Bottom Up
Elasticsearch From the Bottom UpElasticsearch From the Bottom Up
Elasticsearch From the Bottom Upfoundsearch
 
Elasticsearch V/s Relational Database
Elasticsearch V/s Relational DatabaseElasticsearch V/s Relational Database
Elasticsearch V/s Relational DatabaseRicha Budhraja
 
Elastic stack Presentation
Elastic stack PresentationElastic stack Presentation
Elastic stack PresentationAmr Alaa Yassen
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack IntroductionVikram Shinde
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search medcl
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchIsmaeel Enjreny
 
Elastic search Walkthrough
Elastic search WalkthroughElastic search Walkthrough
Elastic search WalkthroughSuhel Meman
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearchMinsoo Jun
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackRich Lee
 

Tendances (20)

ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic Introduction
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Introduction à ElasticSearch
Introduction à ElasticSearchIntroduction à ElasticSearch
Introduction à ElasticSearch
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Elasticsearch From the Bottom Up
Elasticsearch From the Bottom UpElasticsearch From the Bottom Up
Elasticsearch From the Bottom Up
 
Elasticsearch V/s Relational Database
Elasticsearch V/s Relational DatabaseElasticsearch V/s Relational Database
Elasticsearch V/s Relational Database
 
Elk
Elk Elk
Elk
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
 
Elastic stack Presentation
Elastic stack PresentationElastic stack Presentation
Elastic stack Presentation
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack Introduction
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elastic search Walkthrough
Elastic search WalkthroughElastic search Walkthrough
Elastic search Walkthrough
 
About elasticsearch
About elasticsearchAbout elasticsearch
About elasticsearch
 
ELK Stack
ELK StackELK Stack
ELK Stack
 
Elk - An introduction
Elk - An introductionElk - An introduction
Elk - An introduction
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
 

En vedette

What's new in Elasticsearch v5
What's new in Elasticsearch v5What's new in Elasticsearch v5
What's new in Elasticsearch v5Idan Tohami
 
Down and dirty with Elasticsearch
Down and dirty with ElasticsearchDown and dirty with Elasticsearch
Down and dirty with Elasticsearchclintongormley
 
Sysmana 2017 monitorización de logs con el stack elk
Sysmana 2017   monitorización de logs con el stack elkSysmana 2017   monitorización de logs con el stack elk
Sysmana 2017 monitorización de logs con el stack elkJosé Ignacio Álvarez Ruiz
 
Workshop: Learning Elasticsearch
Workshop: Learning ElasticsearchWorkshop: Learning Elasticsearch
Workshop: Learning ElasticsearchAnurag Patel
 
Elasticsearch in 15 minutes
Elasticsearch in 15 minutesElasticsearch in 15 minutes
Elasticsearch in 15 minutesDavid Pilato
 
01 ElasticSearch : Getting Started
01 ElasticSearch : Getting Started01 ElasticSearch : Getting Started
01 ElasticSearch : Getting StartedOpenThink Labs
 
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Karel Minarik
 
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민NAVER D2
 
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Oleksiy Panchenko
 
Introduction To Kibana
Introduction To KibanaIntroduction To Kibana
Introduction To KibanaJen Stirrup
 
A la recherche d'ElasticSearch
A la recherche d'ElasticSearchA la recherche d'ElasticSearch
A la recherche d'ElasticSearchNinnir
 
LogStash - Yes, logging can be awesome
LogStash - Yes, logging can be awesomeLogStash - Yes, logging can be awesome
LogStash - Yes, logging can be awesomeJames Turnbull
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data AnalyticsFelipe
 
Elasitcsearch + Logstash + Kibana 日誌監控
Elasitcsearch + Logstash + Kibana 日誌監控Elasitcsearch + Logstash + Kibana 日誌監控
Elasitcsearch + Logstash + Kibana 日誌監控Jui An Huang (黃瑞安)
 
Kibana + timelion: time series with the elastic stack
Kibana + timelion: time series with the elastic stackKibana + timelion: time series with the elastic stack
Kibana + timelion: time series with the elastic stackSylvain Wallez
 
Scaling real-time search and analytics with Elasticsearch
Scaling real-time search and analytics with ElasticsearchScaling real-time search and analytics with Elasticsearch
Scaling real-time search and analytics with Elasticsearchclintongormley
 

En vedette (20)

What's new in Elasticsearch v5
What's new in Elasticsearch v5What's new in Elasticsearch v5
What's new in Elasticsearch v5
 
Down and dirty with Elasticsearch
Down and dirty with ElasticsearchDown and dirty with Elasticsearch
Down and dirty with Elasticsearch
 
Sysmana 2017 monitorización de logs con el stack elk
Sysmana 2017   monitorización de logs con el stack elkSysmana 2017   monitorización de logs con el stack elk
Sysmana 2017 monitorización de logs con el stack elk
 
Workshop: Learning Elasticsearch
Workshop: Learning ElasticsearchWorkshop: Learning Elasticsearch
Workshop: Learning Elasticsearch
 
Elasticsearch in 15 minutes
Elasticsearch in 15 minutesElasticsearch in 15 minutes
Elasticsearch in 15 minutes
 
01 ElasticSearch : Getting Started
01 ElasticSearch : Getting Started01 ElasticSearch : Getting Started
01 ElasticSearch : Getting Started
 
Elatic{on}'16 recap
Elatic{on}'16 recapElatic{on}'16 recap
Elatic{on}'16 recap
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
 
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
 
Elasticsearch 5.0
Elasticsearch 5.0Elasticsearch 5.0
Elasticsearch 5.0
 
Introduction To Kibana
Introduction To KibanaIntroduction To Kibana
Introduction To Kibana
 
A la recherche d'ElasticSearch
A la recherche d'ElasticSearchA la recherche d'ElasticSearch
A la recherche d'ElasticSearch
 
LogStash - Yes, logging can be awesome
LogStash - Yes, logging can be awesomeLogStash - Yes, logging can be awesome
LogStash - Yes, logging can be awesome
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data Analytics
 
Elasitcsearch + Logstash + Kibana 日誌監控
Elasitcsearch + Logstash + Kibana 日誌監控Elasitcsearch + Logstash + Kibana 日誌監控
Elasitcsearch + Logstash + Kibana 日誌監控
 
Kibana + timelion: time series with the elastic stack
Kibana + timelion: time series with the elastic stackKibana + timelion: time series with the elastic stack
Kibana + timelion: time series with the elastic stack
 
Scaling real-time search and analytics with Elasticsearch
Scaling real-time search and analytics with ElasticsearchScaling real-time search and analytics with Elasticsearch
Scaling real-time search and analytics with Elasticsearch
 

Similaire à Elasticsearch for beginners

ElasticSearch Getting Started
ElasticSearch Getting StartedElasticSearch Getting Started
ElasticSearch Getting StartedOnuralp Taner
 
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکیDeep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکیEhsan Asgarian
 
Elastic Search Capability Presentation.pptx
Elastic Search Capability Presentation.pptxElastic Search Capability Presentation.pptx
Elastic Search Capability Presentation.pptxKnoldus Inc.
 
Explore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingExplore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingInexture Solutions
 
Perl and Elasticsearch
Perl and ElasticsearchPerl and Elasticsearch
Perl and ElasticsearchDean Hamstead
 
Using elasticsearch with rails
Using elasticsearch with railsUsing elasticsearch with rails
Using elasticsearch with railsTom Z Zeng
 
Introduction to ElasticSearch
Introduction to ElasticSearchIntroduction to ElasticSearch
Introduction to ElasticSearchManav Shrivastava
 
Elasticsearch: An Overview
Elasticsearch: An OverviewElasticsearch: An Overview
Elasticsearch: An OverviewRuby Shrestha
 
Wanna search? Piece of cake!
Wanna search? Piece of cake!Wanna search? Piece of cake!
Wanna search? Piece of cake!Alex Kursov
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedBeyondTrees
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and SparkAudible, Inc.
 
Intro to elasticsearch
Intro to elasticsearchIntro to elasticsearch
Intro to elasticsearchJoey Wen
 
ElasticSearch Basics
ElasticSearch BasicsElasticSearch Basics
ElasticSearch BasicsAmresh Singh
 
Elasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsElasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsTiziano Fagni
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solrmacrochen
 

Similaire à Elasticsearch for beginners (20)

Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Elastic search
Elastic searchElastic search
Elastic search
 
ElasticSearch Getting Started
ElasticSearch Getting StartedElasticSearch Getting Started
ElasticSearch Getting Started
 
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکیDeep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
Deep dive to ElasticSearch - معرفی ابزار جستجوی الاستیکی
 
Elastic Search Capability Presentation.pptx
Elastic Search Capability Presentation.pptxElastic Search Capability Presentation.pptx
Elastic Search Capability Presentation.pptx
 
Explore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingExplore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth Using
 
Perl and Elasticsearch
Perl and ElasticsearchPerl and Elasticsearch
Perl and Elasticsearch
 
Using elasticsearch with rails
Using elasticsearch with railsUsing elasticsearch with rails
Using elasticsearch with rails
 
Introduction to ElasticSearch
Introduction to ElasticSearchIntroduction to ElasticSearch
Introduction to ElasticSearch
 
The Power of Elasticsearch
The Power of ElasticsearchThe Power of Elasticsearch
The Power of Elasticsearch
 
Elasticsearch: An Overview
Elasticsearch: An OverviewElasticsearch: An Overview
Elasticsearch: An Overview
 
Wanna search? Piece of cake!
Wanna search? Piece of cake!Wanna search? Piece of cake!
Wanna search? Piece of cake!
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and Spark
 
Elastic search
Elastic searchElastic search
Elastic search
 
Intro to elasticsearch
Intro to elasticsearchIntro to elasticsearch
Intro to elasticsearch
 
Elastic search
Elastic searchElastic search
Elastic search
 
ElasticSearch Basics
ElasticSearch BasicsElasticSearch Basics
ElasticSearch Basics
 
Elasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analyticsElasticsearch, a distributed search engine with real-time analytics
Elasticsearch, a distributed search engine with real-time analytics
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solr
 

Dernier

modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
detection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxdetection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxAleenaJamil4
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024Timothy Spann
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Vision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxVision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxellehsormae
 

Dernier (20)

modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
detection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxdetection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptx
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Vision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxVision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptx
 

Elasticsearch for beginners

  • 2. What is elasticsearch? ElasticSearch is a free and open source distributed inverted index created by Shay Banon. Build on top of Apache Lucene - Lucene is a most popular java-based full text search index implementation. First public release version v0.4 in February 2010. Developed in Java, so inherently cross-plateform.
  • 3. Which companies use elasticsearch?
  • 4. Easy to scale Everything is one JSON call away (RESTful API) Unleashed power of Lucene under the hood Excellent Query DSL Multi-tenancy Support for advanced search features (Full Text) Configurable and Extensible Document Oriented Schema free Conflict management Active community Why Elasticsearch?
  • 5. Elasticsearch allows you to start small, but will grow with your business. It is built to scale horizontally out of the box. As you need more capacity, just add more nodes, and let the cluster reorganize itself to take advantage of the extra hardware. Easy to Scale RESTful API Elasticsearch is API driven. Almost any action can be performed using a simple RESTful API using JSON over HTTP.  An API already exists in the language of your choice. Responses are always in JSON, which is both machine and human readable.
  • 6. Excellent Query DSL The REST API exposes a very complex and capable query DSL, that is very easy to use. Every query is just a JSON object that can practically contain any type of query, or even several of them combined. Using filtered queries, with some queries expressed as Lucene filters, helps leverage caching and thus speed up common queries, or complex queries with parts that can be reused. Faceting, another very common search feature, is just something that upon-request is accompanied to search results, and then is ready for you to use. Per-operation Persistence Elasticsearch puts your data safety first. Document changes are recorded in transaction logs on multiple nodes in the cluster to minimize the chance of any data loss.
  • 7. You can host multiple indexes on one Elasticsearch installation - node or cluster. Each index can have multiple "types", which are essentially completely different indexes. The nice thing is you can query multiple types and multiple indexes with one simple query. This opens quite a lot of options. Multi-tenancy Support for advanced search features (Full Text) Elasticsearch uses Lucene under the covers to provide the most powerful full text search capabilities available in any open source product. Search comes with multi-language support, a powerful query language, support for geolocation, context aware did-you-mean suggestions, autocomplete and search snippets. script support in filters and scorers
  • 8. Many of Elasticsearch configurations can be changed while Elasticsearch is running, but some will require a restart (and in some cases reindexing). Most configurations can be changed using the REST API too. Elasticsearch has several extension points - namely site plugins (let you serve static content from ES - like monitoring javascript apps), rivers (for feeding data into Elasticsearch), and plugins that let you add modules or components within Elasticsearch itself. This allows you to switch almost every part of Elasticsearch if so you choose, fairly easily. If you need to create additional REST endpoints to your Elasticsearch cluster, that is easily done as well. Configurable and Extensible Document Oriented Store complex real world entities in Elasticsearch as structured JSON documents. All fields are indexed by default, and all the indices can be used in a single query, to return results at breath taking speed.
  • 9. Elasticsearch allows you to get started easily. Toss it a JSON document and it will try to detect the data structure, index the data and make it searchable. Later, apply your domain specific knowledge of your data to customize how your data is indexed. Schema free Conflict management Optimistic version control can be used where needed to ensure that data is never lost due to conflicting changes from multiple processes. Active community The community, other than creating nice tools and plugins, is very helpful and supporting. The overall vibe is really great, and this is an important metric of any OSS project. There are also some books currently being written by community members, and many blog posts around the net sharing experiences and knowledge
  • 11. Cluster : A cluster consistsofone or morenodeswhichsharethesamecluster name. Eachclusterhasasinglemasternode whichis chosenautomaticallyby thecluster andwhichcan bereplacedif the currentmasternode fails. Node : A nodeisarunninginstanceofelasticsearchwhichbelongsto acluster. Multiplenodescanbestartedonasingleserverfor testingpurposes, but usuallyyoushouldhaveone nodeper server. Atstartup, anodewill useunicast(or multicast, if specified)to discoveranexistingcluster withthesameclusternameandwill tryto jointhatcluster. Index : Anindex is like a‘database’inarelationaldatabase. Ithas amappingwhichdefines multipletypes. Anindex is alogicalnamespacewhichmapsto one or moreprimaryshardsandcanhavezero or morereplicashards. Type : A type islikea‘table’inarelationaldatabase. Each typehasalistoffields thatcanbespecifiedfordocuments of that type. The mappingdefines howeachfieldinthedocumentis analyzed.
  • 12. Document : A documentisaJSONdocumentwhichis storedinelasticsearch. Itislike arowinatableinarelational database. Each documentisstoredinan indexandhas atypeandan id. A documentisaJSONobject(also knowninotherlanguages asa hash /hashmap/ associative array) whichcontains zeroor more fields, or key- value pairs. Theoriginal JSONdocumentthatisindexedwillbestoredinthe_sourcefield, whichisreturnedbydefaultwhengettingor searching for adocument. Field : A documentcontains alistoffields, or key-value pairs. Thevaluecanbeasimple(scalar)value(ega string, integer, date), or anestedstructurelike an arrayoranobject. A fieldis similartoacolumnina table ina relationaldatabase. The mappingfor eachfieldhas afield‘type’(notto be confusedwithdocumenttype)whichindicatesthetypeof data thatcanbe storedinthatfield, eginteger, string, object. Themappingalso allows youto define(amongstother things) howthevalueforafieldshouldbe analyzed. Mapping : A mappingislikea‘schemadefinition’inarelational database. Eachindexhas amapping, whichdefineseachtype withintheindex, plus anumber ofindex-widesettings.A mappingcaneither bedefinedexplicitly, or itwill begeneratedautomaticallywhenadocumentis indexed
  • 13. Shard : A shardisasingle Luceneinstance. Itis alow-level“worker” unitwhich is managedautomaticallybyelasticsearch. An indexisalogicalnamespace whichpointstoprimaryandreplicashards. Elasticsearchdistributes shards amongstallnodes in the cluster, andcanmove shardsautomaticallyfromone nodeto another inthecaseof node failure, or theadditionof newnodes. PrimaryShard : Eachdocumentisstoredin asingleprimaryshard. Whenyouindexadocument, itisindexedfirston theprimaryshard, thenonallreplicasof the primaryshard. Bydefault,anindex has5 primaryshards.Youcanspecifyfewer or more primaryshards to scalethenumber ofdocumentsthat your indexcanhandle. ReplicaShard : Eachprimaryshardcanhavezero ormore replicas. A replicaisacopyof the primaryshard, andhastwo purposes: 1) increasefailover: areplicashardcanbepromotedto aprimaryshardiftheprimaryfails. 2) increaseperformance:get andsearchrequests canbehandledbyprimaryor replicashards.
  • 14. ElasticSearch Routing All of your data lives in a primary shard, somewhere in the cluster. You may have five shards or five hundred, but any particular document is only located in one of them. Routing is the process of determining which shard that document will reside in. Elasticsearch has no idea where to look for your document. All the docs were randomly distributed around your cluster. so Elasticsearch has no choice but to broadcasts the request to all  shards. This is a non-negligible overhead and can easily impact performance. Wouldn’t it be nice if we could tell Elasticsearch which shard the document lived in? Then you would only have to search one shard to find the document(s) that you need. Routing ensures that all documents with the same routing value will locate to the same shard, eliminating the need to broadcast searches.
  • 20. Performance Core i7, a 2Ghz, 8GB RAM, 128GB SSD) Insert of 10 Mio. Datasets: Elasticsearch: 23 Minutes MySQL without index: 56 Minutes MySQL with Index: 228 Minutes Select name and firstname of 100 Entrys: Elasticsearch: 5 ms MySQL: 9 ms Select of 100 full Entrys: Elasticsearch: 5 ms MySQL: 9 ms Select of the next 100 full Entrys: Elasticsearch: 4 ms MySQL: 18 ms
  • 22. wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.1.1.tar.gz tar -xzf elasticsearch-5.1.1.tar.gz cd elasticsearch-5.1.1/ ./bin/elasticsearch
  • 23. Is it running? GET http://localhost:9200/?pretty Response : { "name": "Vivisector", "cluster_name": "elasticsearch", "version": { "number": "2.3.3", "build_hash": "218bdf10790eef486ff2c41a3df5cfa32dadcfde", "build_timestamp": "2016-05-17T15:40:04Z", "build_snapshot": false, "lucene_version": "5.5.0" }, "tagline": "You Know, for Search" }
  • 25. Indexing a document Request : $ curl -XPUT "http://localhost:9200/test-data/cities/21" -d '{ "rank": 21, "city": "Boston", "state": "Massachusetts", "population2010": 617594, "land_area": 48.277, "location": { "lat": 42.332, "lon": 71.0202 }, "abbreviation": "MA" }‘ Response : {"ok":true,"_index":"test-data","_type":"cities","_id":"21","_version":1}
  • 26. Getting a document Request: $ curl -XGET "http://localhost:9200/test-data/cities/21?pretty" Response: { "_index" : "test-data", "_type" : "cities", "_id" : "21", "_version" : 1, "exists" : true, "_source" : { "rank": 21, "city": "Boston", "state": "Massachusetts", "population2010": 617594, "land_area": 48.277, "location": { "lat": 42.332, "lon": 71.0202 }, "abbreviation": "MA" } }
  • 27. Updating a document Request : $ curl -XPUT "http://localhost:9200/test-data/cities/21" -d '{ "rank": 21, "city": "Boston", "state": "Massachusetts", "population2010": 617594, "population2012": 636479, "land_area": 48.277, "location": { "lat": 42.332, "lon": 71.0202 }, "abbreviation": "MA" }‘ Response : {"ok":true,"_index":"test-data","_type":"cities","_id":"21","_version":2}
  • 28. Searching Searching and querying takes the format of: http://localhost:9200/[index]/[type]/[operation] Search across all indexes and all types http://localhost:9200/_search Search across all types in the test-data index. http://localhost:9200/test-data/_search Search explicitly for documents of type cities within the test-data index. http://localhost:9200/test-data/cities/_search Search explicitly for documents of type cities within the test-data index using paging. http://localhost:9200/test-data/cities/_search?size=5&from=10 There’s3 differenttypesofsearchqueries  Full Text Search (query string)  Structured Search (filter)  Analytics (facets)
  • 29. Full Text Search (query string) Inthiscaseyouwillbe searchinginbitsofnaturallanguage for (partially) matchingquerystrings. TheQueryDSL alternativefor searchingfor“Boston” inall documents, wouldlooklike: Request: $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty=true" -d '{ “query": { “query_string": { “query": “boston" }}}’ Response: { "took" : 5, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "failed" : 0 }, "hits" : { "total" : 1, "max_score" : 6.1357985, "hits" : [ { "_index" : "test-data", "_type" : "cities", "_id" : "21", "_score" : 6.1357985, "_source" : {"rank":"21","city":"Boston",...} } ] } }...
  • 30. Structured Search (filter) Structuredsearchis aboutinterrogatingdata thathasinherentstructure. Dates, timesandnumbersareall structured—theyhave apreciseformatthatyou    canperformlogicaloperations on. Commonoperations includecomparingrangesofnumbersor dates, or determiningwhichof two values is larger. Withstructuredsearch, the answerto your questionis always ayes or no; somethingeither belongsinthesetor itdoes not. Structuredsearchdoes not worryaboutdocumentrelevance or scoring—itsimplyincludes or excludesdocuments.    Request: $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty=true" -d '{ “query": { “filtered": { “filter”: { “term": { “city” : “boston“ }}}}}’ $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{ "query": { "range": { "population2012": { "from": 500000, "to": 1000000 }}}}‘ $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{ "query": { "bool": { "should": [{ "match": { "state": "Texas"} }, {"match": { "state": "California"} }], "must": { "range": { "population2012": { "from": 500000, "to": 1000000 } } }, "minimum_should_match": 1}}}'
  • 31. Analytics (facets) Requestsofthistypewillnotreturnalistofmatchingdocuments,butastatisticalbreakdownof thedocuments. Elasticsearchhasfunctionalitycalledaggregations,whichallowsyoutogeneratesophisticatedanalyticsoveryourdata.ItissimilartoGROUPBYinSQL. Request: $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty=true" -d '{ “aggs": { “all_states": { “terms“: { “field” : “state“ }}}}’ Response: { ... "hits": { ... }, "aggregations": { "all_states": { "buckets": [ {"key": "massachusetts ", "doc_count": 2}, {"key": "danbury", "doc_count": 1} ] }}}
  • 32. ElasticSearch Monitoring ElasticSearch-Head - https://github.com/mobz/elasticsearch-head Marvel - http://www.elasticsearch.org/guide/en/marvel/current/#_marvel_8217_s_dashboards Paramedic - https://github.com/karmi/elasticsearch-paramedic Bigdesk - https://github.com/lukas-vlcek/bigdesk/
  • 33. ElasticSearch Limitations Security : ElasticSearchdoesnotprovideanybuild-in authenticationor accesscontrolfunctionality. Transactions : There is no muchmoresupportfor transactions or processingondatamanipulation. Durability : ESisdistributedandfairlystablebutbackupsanddurabilityarenotashighpriorityas inotherdatastores Large Computations: Commandsfor searchingdataare notsuitedto"large"scansof data andadvancedcomputationonthe dbside. Data Availability : ESmakesdataavailable in "near real-time" whichmayrequireadditional considerationsinyour application(ie: commentspagewhereauser addsnewcomment,refreshingthepage mightnotactuallyshowthenewpostbecausetheindexisstill updating).
  • 37. About me Founder and Technical Director of Terions Communication LTD (London/Berlin, 1996-2006) - Datacenter Operator for Press and Image Agencys and Part of the DPA (German Press Agency) Software Developer Zapelin S.L (Adeje) - Development of IPTV Solutions for Hotels e.g. RHCE - Red Hat Certified Engineer CCDP - Cisco Certified Design Professional DBA - Oracle Certified Professional, MySQL 5 Database Administrator Contact for Questions: neil@cconnect.es

Notes de l'éditeur

  1. <number>
  2. <number>
  3. <number>
  4. <number>
  5. <number>
  6. <number>
  7. <number>
  8. <number>
  9. <number>
  10. <number>
  11. <number>
  12. <number>