Elasticsearch


  1. Natixis Open Day 2018
     Ricardo Peres
     @rjperes75
  2. Natixis Open Day 2018
  3. About Me
     • Team Leader at Simplifydigital (Aveiro)
     • Microsoft MVP
     • Blogger
     • Book author
  4. Introduction
     • NoSQL database for indexing JSON content
     • Documents become searchable shortly after they are added (< 1 s)
     • Schema-less (kind of…)
     • Distributed
     • High performance
     • REST semantics (GET, POST, PUT, DELETE)
     • Based on Lucene
     • Part of the ELK stack
     • Open source (commercial license too)
  5. Key Scenarios
     • Distributed logging
     • Document indexing
     • Inexact searches
     • Custom relevance scoring
     • Alarms (via plugin)
  6. Concepts: Cluster
     • A collection of servers (nodes) running Elasticsearch
     • Single master
     • Node discovery: multicast in early versions; an explicit (unicast) host list is the default since 2.0
  7. Concepts: Indexes
     • A collection of types (types are deprecated as of v6)
     • Similar to a database
     • Will store documents directly from v7
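  8. Concepts: Shards
     • Indexes are distributed across shards; the default is 5 primary shards with 1 replica of each, spread across the cluster
     • The number of primary shards is defined at index creation time
     • Transparent to the user
     • Document placement can be influenced with a custom routing value, which is hashed to pick the shard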
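The shard for a document is chosen as `hash(routing) % number_of_primary_shards`, where the routing value defaults to the document id. The sketch below illustrates that rule only; `pick_shard` is a hypothetical helper, and CRC32 is an assumed stand-in for the Murmur3-based hash Elasticsearch actually uses, so the shard numbers will not match a real cluster.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int = 5) -> int:
    """Map a routing value (the document id by default) to a shard number."""
    # Stand-in hash: Elasticsearch uses a Murmur3 variant, not CRC32.
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % number_of_primary_shards


# The same routing value always lands on the same shard.
shards = {doc_id: pick_shard(doc_id) for doc_id in ("1", "2", "3", "user-42")}
print(shards)
```

Because the modulus is part of the rule, changing the number of primary shards would send existing ids to different shards, which is why that number is fixed when the index is created.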
  9. Concepts: Types
     • Collection of documents
     • Has a schema (implicit or explicit)
     • Similar to a table
     • Deprecated as of v6, will be around until v7
  10. Concepts: Documents
      • Self-contained data
      • Exist in a type (or index as of v6)
      • Have an id, explicitly or automatically set
      • Have a version
      • Have a schema
      • Can have expiration (_ttl, only available before 5.0)
  11. Concepts: Fields
      • Documents are structured in fields
      • Special fields: _id, _uid, _index, _type
      • Optional special fields: _timestamp, _all, _source, _ttl, _meta, _parent, _routing (_timestamp and _ttl only exist before 5.0)
      • A field can be analyzed multiple times with different settings
  12. Data Types
      • text (analyzed for full-text search), keyword (not analyzed)
      • long, integer, short, byte, double, float, half_float, scaled_float
      • date
      • boolean
      • binary
      • integer_range, float_range, double_range, date_range
      • geo_point
      • geo_shape
      • ip
      • completion
      • token_count
      • percolator
      • array
      • object
      • nested
  13. Creating a Document
      • Auto id:
          POST blog/_doc
          { "title" : "My Blog", "url" : "http://my/blog", "tags" : [ "development" ] }
      • Explicit id:
          POST blog/_doc/1
          { "title" : "My Blog", "url" : "http://my/blog", "tags" : [ "development" ] }
      • The response contains the new document's metadata (_index, _type, _id, _version, result), not the document itself
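As a rough illustration of the two requests above, the following sketch builds them with Python's standard library without sending anything. `BASE_URL` and the `index_request` helper are assumptions for the example (a local, unsecured node), not part of the deck.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def index_request(index, body, doc_id=None):
    """Build (but do not send) the HTTP request that indexes one document."""
    path = f"/{index}/_doc" if doc_id is None else f"/{index}/_doc/{doc_id}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


doc = {"title": "My Blog", "url": "http://my/blog", "tags": ["development"]}
auto_id = index_request("blog", doc)                # Elasticsearch generates the id
explicit_id = index_request("blog", doc, doc_id=1)  # the caller chooses the id
print(auto_id.get_method(), auto_id.full_url)
print(explicit_id.get_method(), explicit_id.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a running cluster.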
  14. Updating a Document (1 of 3)
      • Partial (merges the given fields into the existing document):
          POST website/blog/1/_update
          { "doc" : { "tags" : [ "testing" ], "views" : 1 } }
      • Full (replaces the whole document by indexing it again under the same id):
          PUT website/blog/1
          { "title" : "My Blog", "url" : "http://my/blog", "tags" : [ "testing" ] }
      • Updating a document increments its version
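A partial update merges the `doc` body into the stored source: nested objects are merged key by key, while scalars and arrays are replaced. The in-memory sketch below mimics that behaviour under simplifying assumptions; `merge` and `partial_update` are hypothetical helpers, and the no-op detection that a real cluster performs (no version bump when nothing changes) is left out.

```python
import copy


def merge(source, partial):
    """Recursively merge `partial` into a copy of `source`."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)  # objects merge key by key
        else:
            merged[key] = value  # scalars and arrays are replaced wholesale
    return merged


def partial_update(stored, partial):
    """Return the stored document after applying a partial update."""
    return {
        "_version": stored["_version"] + 1,
        "_source": merge(stored["_source"], partial),
    }


stored = {
    "_version": 1,
    "_source": {"title": "My Blog", "url": "http://my/blog", "tags": ["development"]},
}
updated = partial_update(stored, {"tags": ["testing"], "views": 1})
print(updated)
```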
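  15. Updating a Document (2 of 3)
      • Scripted (parameters are passed inside the script object):
          POST post/_doc/1/_update
          { "script" : { "source" : "ctx._source.views += params.count", "params" : { "count" : 4 } } }
      • Upsert (the upsert body is indexed if the document does not exist yet; otherwise the script runs):
          POST post/_doc/1/_update
          { "script" : "ctx._source.views += 1", "upsert" : { "views" : 1, "url": "…" } }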
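The upsert semantics can be sketched in memory as follows. This is an illustration only: `update_with_upsert` and `increment_views` are hypothetical names, a plain dict stands in for the index, and a Python function stands in for the Painless script.

```python
def update_with_upsert(store, doc_id, script, upsert):
    """Run `script` on an existing document, or insert `upsert` if it is missing."""
    if doc_id not in store:
        # Missing document: the upsert body is stored and the script does not run.
        store[doc_id] = dict(upsert)
    else:
        script(store[doc_id])
    return store[doc_id]


def increment_views(source):
    # Python stand-in for the Painless script "ctx._source.views += 1".
    source["views"] += 1


store = {}
first = dict(update_with_upsert(store, "1", increment_views, {"views": 1}))
second = dict(update_with_upsert(store, "1", increment_views, {"views": 1}))
print(first, second)
```

The first call creates the document with `views` set to 1; the second call finds it and increments the counter.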
  16. Updating a Document (3 of 3)
      • Update by query:
          POST post/_update_by_query
          { "script": "ctx._source.views++", "query": { "term": { "content": "updated" } } }
  17. Deleting a Document / Index
      • Index:
          DELETE website
      • Document by id:
          DELETE website/blog/1
      • Document by query:
          POST post/_doc/_delete_by_query
          { "query" : { "match" : { "views" : 0 } } }
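  18. Mappings (1 of 2)
      • Created at index or type level, implicitly (dynamic) or explicitly
      • An existing field mapping cannot be modified; new fields can only be added
      • Can enforce the schema or not ("dynamic": "strict" rejects unmapped fields)
      • The _timestamp meta-field is only available before 5.0
          PUT blog
          {
            "mappings": {
              "_doc": {
                "_timestamp": { "enabled" : true },
                "dynamic" : "strict",
                "properties": {
                  "title": { "type": "text", "analyzer": "standard" }
                }
              }
            }
          }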
  19. Mappings (2 of 2)
      • A field can be indexed multiple times
          PUT post
          {
            "mappings": {
              "_doc": {
                "properties": {
                  "title": {
                    "type": "text",
                    "fields": { "keyword": { "type": "keyword" } }
                  }
                }
              }
            }
          }
  20. Mapping Templates
      • Automatically apply mappings to dynamically added fields
      • date_detection and dynamic_date_formats are set on the type, next to dynamic_templates
          PUT website
          {
            "mappings": {
              "post": {
                "date_detection": true,
                "dynamic_date_formats": [ "yyyy-MM-dd HH:mm", "yyyy-MM-dd" ],
                "dynamic_templates": [
                  {
                    "timestamp": {
                      "match": "timestamp",
                      "match_mapping_type": "date",
                      "mapping": { "type": "date", "format" : "yyyy-MM-dd HH:mm" }
                    }
                  }
                ]
              }
            }
          }
  21. Query and Filter Context
      • Query context: scores how well each document matches
      • Filter context: only determines whether a document appears in the results (no scoring)
      • Filter results are cached
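To make the two contexts concrete, here is a small sketch that builds a `bool` query body with a scored `must` clause and unscored `filter` clauses. The `bool_search` helper and the field names are assumptions for illustration.

```python
import json


def bool_search(text, field, filters):
    """Scored full-text clause in `must`, unscored exact clauses in `filter`."""
    return {
        "query": {
            "bool": {
                # Query context: contributes to _score.
                "must": [{"match": {field: text}}],
                # Filter context: yes/no only, no scoring, cacheable.
                "filter": [{"term": {name: value}} for name, value in filters.items()],
            }
        }
    }


body = bool_search("elasticsearch", "title", {"tags": "development"})
print(json.dumps(body, indent=2))
```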
  22. Querying
      • Search API
        • Uses the URL
        • Starting with <index> and <type> is optional
            <index>/<type>/_search?q=something
            <index>/<type1>,<type2>/_search?q=something
            _search?q=something
            _search?q=field:value
            _search?q=+firstname:(john mary) -surname:smith
      • Query DSL
        • Query and filter context
        • simple_query_string, query_string, match, term, terms, range, multi_match, match_phrase, missing, exists, regexp, fuzzy, prefix, ids
        • bool, dis_max
        • more_like_this, script, template
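When a `q=` query such as the last URI example is sent over HTTP, characters like `+`, spaces, and parentheses have to be percent-encoded; an unencoded `+` in a query string is typically decoded as a space. A minimal sketch, with `uri_search_path` as a hypothetical helper:

```python
from urllib.parse import quote, unquote


def uri_search_path(query, index=None):
    """Build the path and query string of a URI search request."""
    prefix = f"/{index}" if index else ""
    # safe="" makes quote() encode every reserved character, including "+" and ":".
    return f"{prefix}/_search?q={quote(query, safe='')}"


query = "+firstname:(john mary) -surname:smith"
path = uri_search_path(query, index="website")
print(path)
print(unquote(path))  # decodes back to the readable form
```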
  23. Pagination, Sorting and Projection
      • Page size and offset: size, from
      • Ordering: sort
      • Projections: _source
          POST website/post/_search
          {
            "size" : 10,
            "from" : 0,
            "sort" : { "timestamp" : { "order" : "desc" } },
            "_source" : [ "title", "_id" ]
          }
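Since `from` is an offset in documents rather than a page number, a thin helper usually converts between the two. The sketch below assumes 1-based page numbers; `page_request` and its default field names are illustrative choices.

```python
def page_request(page, page_size=10, sort_field="timestamp", fields=("title",)):
    """Build a search body for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {
        "size": page_size,
        "from": (page - 1) * page_size,  # offset of the first hit on this page
        "sort": {sort_field: {"order": "desc"}},
        "_source": list(fields),         # only return these fields
    }


first_page = page_request(1)
third_page = page_request(3, page_size=20)
print(first_page["from"], third_page["from"])
```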
  24. Script Fields
          GET post/_search
          {
            "_source": "content",
            "script_fields": {
              "visits_string": { "script": "doc['views'].value + ' views'" }
            }
          }
  25. Percolator (1 of 3)
      • Search in reverse: first store the queries, then match documents against them
      • Percolating a document returns all stored queries that it matches
  26. Percolator (2 of 3)
          PUT percolator
          {
            "mappings": {
              "_doc": {
                "properties": {
                  "message": { "type": "text" },
                  "query": { "type": "percolator" }
                }
              }
            }
          }
          PUT percolator/_doc/1?refresh
          { "query" : { "match" : { "message" : "bonsai tree" } } }
          PUT percolator/_doc/2?refresh
          { "query" : { "match" : { "message" : "orange tree" } } }
  27. Percolator (3 of 3)
          GET percolator/_search
          {
            "query" : {
              "percolate" : {
                "field" : "query",
                "document" : { "message" : "bonsai tree" }
              }
            }
          }
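The reverse-search idea can be sketched without a cluster. The toy `Percolator` class below is hypothetical and assumes a very crude analyzer (lowercase, split on whitespace); it mimics the default OR behaviour of a `match` query, where one shared term is enough for a hit.

```python
def tokens(text):
    """Crude analyzer: lowercase and split on whitespace (an assumption)."""
    return set(text.lower().split())


class Percolator:
    """Toy reverse search: queries are stored, documents are matched against them."""

    def __init__(self):
        self.queries = {}

    def register(self, query_id, match_text):
        self.queries[query_id] = tokens(match_text)

    def percolate(self, message):
        doc_terms = tokens(message)
        # A match query uses OR by default: sharing one term is enough.
        return sorted(qid for qid, terms in self.queries.items() if terms & doc_terms)


percolator = Percolator()
percolator.register("1", "bonsai tree")
percolator.register("2", "orange tree")
hits = percolator.percolate("bonsai tree")
print(hits)
```

Under these assumptions the document "bonsai tree" matches both stored queries, because both contain the term "tree".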
  28. Relevance (1 of 5)
      • Scoring factors: term frequency, inverse document frequency, field-length norm (BM25, the default similarity since 5.0, builds on the same factors)
      • Custom scores
      • A match hit/miss can be explained
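The three factors can be written down directly. The sketch below follows the classic Lucene TF/IDF formulas as described in Elasticsearch: The Definitive Guide; it is not BM25, and `field_weight` is a hypothetical name for the product that the explain output reports per term.

```python
import math


def tf(term_freq):
    """Term frequency: more occurrences in the field, higher weight."""
    return math.sqrt(term_freq)


def idf(num_docs, doc_freq):
    """Inverse document frequency: rarer terms weigh more."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_norm(num_terms):
    """Field-length norm: shorter fields weigh more."""
    return 1.0 / math.sqrt(num_terms)


def field_weight(term_freq, num_docs, doc_freq, num_terms):
    return tf(term_freq) * idf(num_docs, doc_freq) * field_norm(num_terms)


# Hypothetical numbers: the term occurs once in a 4-term field and in 9 of 1000 docs.
print(field_weight(term_freq=1, num_docs=1000, doc_freq=9, num_terms=4))
```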
  29. Relevance (2 of 5)
      • Boosting: documents having this value score higher
          GET sales/products/_search
          {
            "query": {
              "bool": {
                "should": [
                  { "term": { "size.keyword": { "value": "M", "boost": 10.0 } } },
                  { "term": { "size.keyword": "L" } }
                ]
              }
            }
          }
  30. Relevance (3 of 5)
      • Linear, Gauss, Exp: the origin value scores best, and the score decays as values move away from it
          GET sales/products/_search
          {
            "query": {
              "function_score": {
                "linear": { "price": { "origin": 0, "scale": 1 } }
              }
            }
          }
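The three decay shapes differ only in how fast the score falls off with distance from the origin. The sketch below follows the formulas in the function_score reference documentation, with the documented defaults `offset = 0` and `decay = 0.5`; the function names are illustrative.

```python
import math


def linear_decay(value, origin, scale, offset=0.0, decay=0.5):
    distance = max(0.0, abs(value - origin) - offset)
    s = scale / (1.0 - decay)
    return max((s - distance) / s, 0.0)


def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    distance = max(0.0, abs(value - origin) - offset)
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))


def exp_decay(value, origin, scale, offset=0.0, decay=0.5):
    distance = max(0.0, abs(value - origin) - offset)
    lam = math.log(decay) / scale
    return math.exp(lam * distance)


# Same parameters as the slide: origin 0, scale 1, applied to a few prices.
for price in (0, 1, 2):
    print(price, linear_decay(price, 0, 1), gauss_decay(price, 0, 1), exp_decay(price, 0, 1))
```

All three functions return 1 at the origin and the `decay` value at a distance of `scale`; beyond that, the linear form reaches zero while the other two only approach it.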
  31. Relevance (4 of 5)
      • Field Value Factor: use a numeric field of the document to boost its score
          GET website/post/_search
          {
            "query": {
              "function_score": {
                "field_value_factor": { "field": "likes", "factor": 10 }
              }
            }
          }
  32. Relevance (5 of 5)
      • Score Functions: custom functions, each applied to the documents matching its filter
          GET sales/products/_search
          {
            "query": {
              "function_score": {
                "boost": "5",
                "functions": [
                  { "filter": { "match": { "size": "M" } }, "weight": 10 },
                  { "filter": { "match": { "size": "L" } }, … }
                ]
              }
            }
          }
  33. Index Aliases
      • Used to refer to one or more indexes, one or more types, possibly with a filter
      • Useful for "moving indexes" (month, year, country, etc.)
      • One index can be in many aliases
          POST _aliases
          {
            "actions" : [
              {
                "add" : {
                  "indices" : [ "social-2015", "social-2016" ],
                  "alias" : "social-testing",
                  "filter" : { "term" : { "tag" : "testing" } }
                }
              }
            ]
          }
  34. Alias Templates
      • Creates an alias whenever an index matching the template pattern is created
          POST _template/social
          {
            "order": 0,
            "template": "social-*",
            "settings": { "index": { "refresh_interval": "5s" } },
            "mappings": {},
            "aliases": { "social": {} }
          }
  35. Bulk Operations
      • Perform multiple operations (index, update, delete) at once
      • Each action and each source document goes on its own line
          POST bulk/data/_bulk
          { "index" : { "_id" : "1" } }
          { "field1" : "value1" }
          { "index" : { "_id" : "2" } }
          { "field1" : "value1" }
          { "index" : { "_id" : "3" } }
          { "field1" : "value1" }
          { "update" : { "_id" : "2" } }
          { "doc": { "field2": "value2" } }
          { "delete" : { "_id" : "3" } }
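The bulk body is newline-delimited JSON rather than a single JSON document, which is easy to get wrong when building it by hand. A minimal sketch of a serializer, with `bulk_body` as a hypothetical helper:

```python
import json


def bulk_body(operations):
    """Serialize (action, metadata, source) triples into a _bulk request body."""
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if source is not None:  # delete actions carry no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = bulk_body([
    ("index", {"_id": "1"}, {"field1": "value1"}),
    ("update", {"_id": "1"}, {"doc": {"field2": "value2"}}),
    ("delete", {"_id": "1"}, None),
])
print(body, end="")
```

The resulting string would be sent with the `Content-Type: application/x-ndjson` header.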
  36. Analytics
      • Aggregations
      • Can be nested
      • Can use scripts
          GET sales/products/_search
          {
            "aggs": {
              "all_brands": {
                "terms": { "field": "brand.keyword" },
                "aggs": {
                  "average_price": { "avg": { "field": "price" } }
                }
              }
            }
          }
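Conceptually, a bucket aggregation on brand with a nested average groups documents by brand and then averages the price inside each group. The in-memory sketch below illustrates that on made-up data; `average_price_by_brand` and the sample products are assumptions, and the output only loosely mirrors the shape of a real response.

```python
from collections import defaultdict


def average_price_by_brand(products):
    """Group products by brand, then average the price inside each bucket."""
    prices_by_brand = defaultdict(list)
    for product in products:
        prices_by_brand[product["brand"]].append(product["price"])
    buckets = [
        {
            "key": brand,
            "doc_count": len(prices),
            "average_price": {"value": sum(prices) / len(prices)},
        }
        for brand, prices in prices_by_brand.items()
    ]
    # Largest buckets first, ties broken by key.
    buckets.sort(key=lambda bucket: (-bucket["doc_count"], bucket["key"]))
    return buckets


products = [  # hypothetical sample data
    {"brand": "acme", "price": 10.0},
    {"brand": "acme", "price": 30.0},
    {"brand": "globex", "price": 5.0},
]
buckets = average_price_by_brand(products)
print(buckets)
```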
  37. APIs
      • REST (native)
      • .NET
      • JavaScript/Node.js
      • Python
      • Java
      • Groovy
      • PHP
      • Perl
      • Ruby
  38. Kibana
      • Reporting
      • Dashboards
  39. Logstash
      • Collect and transform data
      • Pipeline: inputs, filters, outputs
      • Sources/destinations:
        • Elasticsearch
        • File
        • Syslog
        • Windows Eventlog
        • Redis
        • RabbitMQ
        • GitHub
        • HTTP
        • Beats
        • Twitter
        • WebSocket
        • …
  40. Beats
      • Data shipping
      • Shippers:
        • Filebeat
        • Metricbeat
        • Packetbeat
        • Winlogbeat
        • Auditbeat
        • Heartbeat
  41. References
      • https://www.elastic.co
      • https://www.elastic.co/guide/en/kibana/current/tutorial-load-dataset.html
      • https://www.gitbook.com/book/allen8807/elasticsearch-definitive-guide-en/details
      • https://github.com/elastic/cookbook-elasticsearch
      • https://github.com/elastic/elasticsearch-net
      • https://github.com/elastic/kibana
      • https://github.com/elastic/logstash
      • https://github.com/elastic/elasticsearch
      • https://github.com/elastic/beats
      • http://joelabrahamsson.com/elasticsearch-101
  42. Thank you
      • Thank you for attending!
      • @rjperes75
      • rjperes@hotmail.com
      • http://weblogs.asp.net/ricardoperes

Editor's notes

  • https://www.elastic.co/guide/en/elasticsearch/reference/6.2/mapping-types.html
    https://www.elastic.co/blog/strings-are-dead-long-live-strings
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html
    https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
  • https://www.elastic.co/guide/en/elasticsearch/reference/6.2/dynamic-templates.html
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-source-filtering.html
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-using.html
    https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-percolate-query.html
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#function-decay
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#function-field-value-factor
