Elastic meetup june16

Slides from Elastic Barcelona Meetup June16

  1. Miguel Bosin, Support Engineer (@miguelbosin): Hot/Warm Architecture + Sizing
  2. Intro
     • Miguel Bosin
       – Support engineer
       – Joined in 2015
       – Interested in technology
       – Passionate about support
     • Elastic
       – Founded in 2012
       – Distributed company
       – Elasticsearch: what is it?
       – Open source: Elasticsearch, Logstash, Kibana and Beats
       – Commercial: X-Pack
  4. What is it?
     • Open source
     • Distributed and scalable
     • Highly available
     • Document-oriented (JSON)
     • RESTful
     • Full-text search engine with real-time search and analytics capabilities
  5. Agenda
     1. Elastic overview
     2. Elasticsearch basic architecture
     3. Sizing introduction
     4. Hot/Warm architecture
  6. Overview of Elastic's current products
  7. Agenda: 1. Elastic overview, 2. Elasticsearch basic architecture, 3. Sizing introduction, 4. Hot/Warm architecture
  8. Elasticsearch terminology
     • A node is a single Elasticsearch instance (a single JVM)
     • Multiple nodes can form a cluster
     • A cluster or a node can manage multiple indices
     • An index is a container for data
     • A shard is a single piece of an Elasticsearch index
     • A shard is either a primary or a replica
  9. Elasticsearch terminology II
  10. Elasticsearch terminology III
  11. Elasticsearch Architecture: Node roles
     • Master node:
       – coordinates the cluster
       – is the only node able to apply changes to the cluster state
       – publishes the updated cluster state to all nodes
     • Data node:
       – performs indexing
       – can allocate shards locally
       – knows the cluster state
  12. Elasticsearch Architecture: Node roles II
     • Client node:
       – does NOT perform indexing or allocate shards locally
       – does NOT perform cluster management operations
       – knows the cluster state
       – acts as a smart load balancer (e.g. load balancing Kibana searches)
       – redirects operations to the nodes that hold the relevant data
       – calculates aggregation results
  13. Elasticsearch Architecture: Node roles III
     • Node roles are set in elasticsearch.yml
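The slide does not show the settings themselves. As a rough sketch, the snippet below assumes the `node.master` and `node.data` boolean settings used by Elasticsearch releases of that period, and prints the elasticsearch.yml lines for each role. The helper name `role_settings` and the "dedicated master" / "data-only" combinations are illustrative assumptions, not taken from the deck.

```python
# Assumption: setting keys are the node.master / node.data booleans of
# Elasticsearch 2.x-era releases; the role names follow the slides.
ROLE_SETTINGS = {
    "master": {"node.master": True, "node.data": False},   # dedicated master
    "data":   {"node.master": False, "node.data": True},   # data-only node
    "client": {"node.master": False, "node.data": False},  # client node
}


def role_settings(role):
    """Return the elasticsearch.yml lines for one of the roles above."""
    settings = ROLE_SETTINGS[role]
    return [f"{key}: {str(value).lower()}" for key, value in settings.items()]


if __name__ == "__main__":
    for role in ROLE_SETTINGS:
        print(f"# {role} node")
        print("\n".join(role_settings(role)))
```

Each node gets exactly one of these pairs in its own elasticsearch.yml; a node with both values left at their defaults is master-eligible and holds data.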
  14. Architecture: node roles
  15. Architecture: node roles
  16. Architecture special case: dedicated master nodes
  17. Dedicated master nodes: why, and minimum_master_nodes
     • Indexing and searching data is CPU-, memory-, and I/O-intensive work, which can put pressure on a node's resources
     • Avoiding split brain: two active master nodes in the same cluster lead to DATA LOSS
     • Set discovery.zen.minimum_master_nodes to the quorum: (master_eligible_nodes / 2) + 1
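A minimal sketch of the quorum formula, assuming the division is integer division (so three master-eligible nodes give a quorum of two). The function name is made up for illustration.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum from the slide: (master_eligible_nodes / 2) + 1, integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "master-eligible nodes -> quorum", minimum_master_nodes(n))
```

With only two master-eligible nodes the quorum is also two, so losing either node leaves the cluster without a master; this is one reason the deck recommends three dedicated masters.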
  18. Agenda: 1. Elastic overview, 2. Elasticsearch basic architecture, 3. Sizing introduction, 4. Hot/Warm architecture
  19. Sizing: general factors (server capacity)
     • Disks (SSD vs. HDD)
     • RAM
       – half of total RAM for ES
       – maximum ES heap size: 30.5 GB
     • Number of CPU cores
       – ES thread pools concept: 1 shard -> 1 thread -> 1 Java process -> 1 core
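The RAM rule on this slide can be written as a one-line calculation. This is only a sketch of the rule of thumb as stated (half of total RAM, capped at 30.5 GB); the function name and the sample machine sizes are illustrative.

```python
MAX_HEAP_GB = 30.5  # ceiling quoted on the slide


def recommended_heap_gb(total_ram_gb):
    """Half of total RAM for the ES heap, capped at the slide's maximum."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(total_ram_gb / 2, MAX_HEAP_GB)


if __name__ == "__main__":
    for ram in (16, 64, 128):  # hypothetical machine sizes
        print(f"{ram} GB RAM -> {recommended_heap_gb(ram)} GB heap")
```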
  20. Sizing: Elasticsearch factors (logging case)
     • Size of shards
     • Number of shards on each node
     • Retention period of data
     • Mapping configuration
       – which fields are searchable, _source enabled or not, etc.
     • Average size of the documents
  21. Sizing: Capacity planning test I
     • FIRST: test on a single node with a single index that has one shard and no replicas
     • THEN: insert as many documents as you can and run some typical queries
     • At some point, queries slow down past the threshold at which they no longer meet your requirements
     • The document count at that point is the ideal number of documents a single shard can hold
     • NEXT: find the ideal number of primary shards (divide your dataset size by the ideal shard size)
     • FINALLY: add replicas for HA and to improve read throughput
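The "NEXT" step is a simple division. A minimal sketch, assuming both sizes are expressed in the same unit and that a partial shard is rounded up; the function name and the sample figures are hypothetical.

```python
import math


def primary_shard_count(dataset_size_gb, ideal_shard_size_gb):
    """Dataset size divided by the ideal shard size, rounded up, at least 1."""
    if dataset_size_gb < 0 or ideal_shard_size_gb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(dataset_size_gb / ideal_shard_size_gb))


if __name__ == "__main__":
    # Hypothetical numbers: a 500 GB dataset and a measured ideal shard of 30 GB.
    print(primary_shard_count(500, 30))
```

The ideal shard size here is whatever the single-shard test above measured for your own documents and queries; it is not a universal constant.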
  22. Sizing: Capacity planning test II
     Each experiment tries to accomplish a discrete goal and builds upon the previous one:
     1. Determine various disk utilization
     2. Determine breaking point of a shard
     3. Determine saturation point of a node
     4. Test desired configuration on a two-node cluster
  23. Agenda: 1. Elastic overview, 2. Elasticsearch basic architecture, 3. Sizing introduction, 4. Hot/Warm architecture
  24. Hot / Warm architecture: when to use it?
     • Larger time-based data analytics use cases on Elasticsearch
     • Using time-based indices
     • Able to run an architecture with 3 different types of nodes
  25. Hot / Warm architecture: Types of nodes
     Master, Hot and Warm nodes:
     • Master nodes: 3 dedicated master nodes
     • Hot data nodes: perform all indexing and also hold the most recent daily indices (the data queried most frequently). Powerful machines with SSD storage
     • Warm data nodes: handle a large number of read-only indices that are not queried frequently. Very large attached spinning disks
  26. Hot / Warm architecture: tagging
     Which node is doing what?
     • ES needs to know which servers contain the hot nodes and which servers contain the warm nodes
     • This can be achieved by assigning an arbitrary tag to each server (hot or warm)
     • Tag the node with node.box_type: xxx in elasticsearch.yml
     • OR start a node using ./bin/elasticsearch --node.box_type xxx
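Once nodes carry a `box_type` tag, indices have to be told which tag to prefer. The slide does not show that step; the sketch below assumes Elasticsearch's shard allocation filtering setting `index.routing.allocation.require.<attribute>` and only builds the settings body, which would then be sent with a PUT to an index's `_settings` endpoint. The function name is illustrative.

```python
import json


def box_type_requirement(box_type):
    """Index settings body asking for shards on nodes tagged with box_type."""
    # Assumption: the nodes were tagged with node.box_type as on the slide.
    return {"index.routing.allocation.require.box_type": box_type}


if __name__ == "__main__":
    # Hypothetical flow: new daily indices start on hot nodes,
    # older indices are later switched to warm nodes.
    print(json.dumps(box_type_requirement("hot")))
    print(json.dumps(box_type_requirement("warm")))
```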
  27. Hot / Warm architecture: Force Merge API
     Optimizing your indices on the warm nodes
     • The force merge API lets you force a merge of one or more indices, which optimizes the index for faster search operations
     • The merge relates to the number of segments a Lucene index holds within each shard
     • The force merge operation reduces the number of segments by merging them:
       $ curl -XPOST 'http://localhost:9200/my_index/_forcemerge'
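The same call can be prepared from Python's standard library. This sketch only builds the request and does not send it; the `max_num_segments` query parameter is an optional force merge parameter that the slide does not use, and the function name and default host are assumptions for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def force_merge_request(index, host="http://localhost:9200", max_num_segments=None):
    """Build (but do not send) a POST request to the index's _forcemerge endpoint."""
    url = f"{host}/{index}/_forcemerge"
    if max_num_segments is not None:
        # Optional: target number of segments per shard after the merge.
        url += "?" + urlencode({"max_num_segments": max_num_segments})
    return Request(url, method="POST")


req = force_merge_request("my_index", max_num_segments=1)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a running cluster. Merging is I/O-heavy, which fits the deck's advice to run it on warm nodes holding read-only indices.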
  28. Hot / Warm architecture: Demo time!
