Scale confidently. From laptop to lots of nodes to multi-cluster, multi-use case deployments, Elastic experts are sharing best practices to master and pitfalls to avoid when it comes to scaling Elasticsearch.
4. Learning to Play
Practice Good Security
Enable Security!
It’s free*, as in beer, which we might have later
At a minimum, enable authentication & TLS
*Security features are included in the Basic license from 6.8 / 7.1 onward
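As a rough sketch of that minimum, the snippet below renders the handful of elasticsearch.yml lines that switch on authentication and TLS. The setting names are standard Elasticsearch settings; certificate and keystore configuration is deliberately left out, and the helper name render_yaml is hypothetical.

```python
# Minimal security-related settings for elasticsearch.yml.
# Certificates/keystores are NOT covered here; TLS will not work without them.
MINIMAL_SECURITY = {
    "xpack.security.enabled": True,                # turn on authentication
    "xpack.security.transport.ssl.enabled": True,  # TLS between nodes
    "xpack.security.http.ssl.enabled": True,       # TLS for REST clients
}


def render_yaml(settings: dict) -> str:
    """Render flat key/value settings as elasticsearch.yml lines."""
    lines = []
    for key, value in settings.items():
        # YAML booleans are lowercase, unlike Python's True/False.
        rendered = str(value).lower() if isinstance(value, bool) else str(value)
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines)


print(render_yaml(MINIMAL_SECURITY))
```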
7. Learning to Play
Shards and Replicas
Shards (per index)
• The fewer the better!
• But if you must… a good shard count is < 20 per 1GB of heap
• Good range for shard size is 30-80GB
Replicas (start with N+1)
• Add more to scale out query processing & fault tolerance
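The rules of thumb above reduce to simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the 50 GB target is an assumed midpoint of the 30-80GB range, not a figure from the slide.

```python
import math


def shard_ceiling_for_heap(heap_gb: float, shards_per_gb: int = 20) -> int:
    """Upper bound on shards per node: stay below 20 per 1GB of heap."""
    return int(heap_gb * shards_per_gb)


def primary_shards_for(index_size_gb: float, target_shard_gb: float = 50.0) -> int:
    """Fewest primaries that keep each shard at or under the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))


# A node with 30 GB of heap should hold fewer than 600 shards.
print(shard_ceiling_for_heap(30))
# A 400 GB index at about 50 GB per shard needs 8 primaries.
print(primary_shards_for(400))
```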
10. No Borrowed Instruments
Only some resources can be easily shared …
[Diagram: three nodes, each sharing the Data, Master, Ingest, and Coordinating roles, alongside Machine Learning and APM]
11. Learning the Scales
Index Definitions
• Always define your own mappings
- Fields can have one or more data types; choose appropriately (text, keyword, date, etc.)
- Not every field needs to be indexed
• Use templates to simplify/standardize the index creation process
• Index aliases are your friends
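To make the index-definition advice concrete, here is a sketch of a 7.x-style index template body with explicit mappings, one field that is stored but not indexed, and an alias. The field names, index pattern, and alias are made up for illustration; the body would be sent to the legacy template endpoint (PUT _template/<name>).

```python
import json


def build_template(pattern: str, alias: str) -> dict:
    """Build a legacy (7.x) index template body with explicit mappings."""
    return {
        "index_patterns": [pattern],
        "settings": {"number_of_shards": 1, "number_of_replicas": 1},
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "message": {"type": "text"},     # analyzed, full-text search
                "host": {"type": "keyword"},     # exact match, aggregations
                # Kept in _source but not searchable: "index": false
                "raw_payload": {"type": "keyword", "index": False},
            }
        },
        # Every index created from this template joins the alias.
        "aliases": {alias: {}},
    }


body = build_template("app-logs-*", "app-logs")
print(json.dumps(body, indent=2))
```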
12. Know When to Improvise
Automated Index Field Management
• "dynamic": true
- Newly detected fields are added to the mapping (default)
• "dynamic": false
- Newly detected fields are ignored (not indexed, but still kept in _source)
• "dynamic": "strict"
- If new fields are detected, an exception is thrown and the document is rejected. New fields must be explicitly added to the mapping
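The three settings can be compared with a toy model. This is not how Elasticsearch implements dynamic mapping; it only mimics the observable behavior described above, and every name in it is hypothetical.

```python
class StrictMappingError(Exception):
    """Raised when a document carries an unmapped field under "strict"."""


def index_document(mapped_fields, doc, dynamic=True):
    """Return (mapping fields after indexing, fields that became searchable)."""
    mapped = set(mapped_fields)
    unknown = set(doc) - mapped
    if dynamic == "strict":
        if unknown:
            # The whole document is rejected.
            raise StrictMappingError(f"unmapped fields: {sorted(unknown)}")
        return mapped, set(doc)
    if dynamic is True:
        # New fields are added to the mapping and indexed.
        return mapped | unknown, set(doc)
    if dynamic is False:
        # New fields are ignored: the mapping is unchanged.
        return mapped, set(doc) - unknown
    raise ValueError("dynamic must be True, False, or 'strict'")


doc = {"host": "web-1", "pid": 4242}
print(index_document({"host"}, doc, dynamic=True))
print(index_document({"host"}, doc, dynamic=False))
```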
13. Record The Show
Monitor the monitors
• Cluster
• Nodes
• Indices
• Kibana
• Logstash
• Beats
• APM
15. Now We Need Roadies …
Reduce resource contention for discrete functions
• Data Nodes
• Master Nodes
• Coordinator Nodes
• Ingest Nodes
• Machine Learning Nodes
• Alerting Node
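One way to picture dedicated nodes is as a set of role switches with only one turned on. The sketch below uses the boolean role settings from the 7.x era (node.master, node.data, node.ingest, node.ml); later releases replaced them with a single node.roles list, so treat this as an illustration rather than a current config reference. The function name is hypothetical.

```python
ROLE_SETTINGS = ("node.master", "node.data", "node.ingest", "node.ml")


def dedicated_node(role=None) -> dict:
    """Settings for a node with a single role.

    role is "master", "data", "ingest", or "ml"; None yields a
    coordinating-only node (every role switched off).
    """
    settings = {name: False for name in ROLE_SETTINGS}
    if role is not None:
        key = f"node.{role}"
        if key not in settings:
            raise ValueError(f"unknown role: {role}")
        settings[key] = True
    return settings


print(dedicated_node("data"))
print(dedicated_node())  # coordinating-only
```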
[Diagram: a cluster with dedicated Ingest, Data, ML, Master, and Coordinating nodes]
16. Organizing a Growing Group
Shard routing - built-in traffic cop for directing your data
• Route data to specific nodes/hardware (Hot/Warm/Cold)
• Maintain resilience through distributed replicas
• Create custom-tuned architectures
[Diagram: a cluster with Ingest, Coordinating, ML, and Master nodes plus Hot, Warm, and Cold data nodes]
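Shard routing of this kind is usually driven by a custom node attribute plus an index-level allocation filter. The sketch below assumes the attribute is called box_type (a common convention, not something the slide specifies) and uses a toy matcher to show which nodes an index would be allowed on; the node names are invented.

```python
def node_attribute(tier: str) -> dict:
    """elasticsearch.yml fragment that tags a node with its tier."""
    return {"node.attr.box_type": tier}


def index_routing(tier: str) -> dict:
    """Index setting that requires shards to live on nodes of that tier."""
    return {"index.routing.allocation.require.box_type": tier}


def eligible_nodes(nodes: dict, index_settings: dict) -> list:
    """Toy matcher: names of nodes whose attribute satisfies the filter."""
    required = index_settings["index.routing.allocation.require.box_type"]
    return sorted(
        name for name, attrs in nodes.items()
        if attrs.get("node.attr.box_type") == required
    )


nodes = {
    "node-1": node_attribute("hot"),
    "node-2": node_attribute("warm"),
    "node-3": node_attribute("warm"),
    "node-4": node_attribute("cold"),
}
print(eligible_nodes(nodes, index_routing("warm")))
```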
18. Time to Update the Setlist
Automate data lifecycle management with policies
• Use date or size to move data through phases:
- Hot/Warm/Cold
- Frozen indices
• Index Lifecycle Management
• Snapshot Lifecycle Management
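A lifecycle policy of this kind might look like the sketch below: roll over in the hot phase by size or age, move to warm hardware later, and eventually delete. All thresholds are placeholder values, the box_type attribute is an assumed node attribute name, and the body is what would be sent to PUT _ilm/policy/<name>.

```python
import json


def build_ilm_policy(max_size="50gb", max_age="30d",
                     warm_after="7d", delete_after="90d") -> dict:
    """Build an ILM policy body with hot, warm, and delete phases."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over when either threshold is reached.
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                "warm": {
                    "min_age": warm_after,
                    "actions": {
                        # Relocate shards to nodes tagged as warm.
                        "allocate": {"require": {"box_type": "warm"}}
                    },
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


print(json.dumps(build_ilm_policy(), indent=2))
```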
19. Rollups for Fast Queries on Large Metric Data Sets
Save space and execute faster on time-series data
Granularity      Raw         Minute      Hour       Day
Docs             9,041,000   1,448,285   49,554     8,447
Size             2.23 GB     1.25 GB     48.40 MB   9.10 MB
Docs % change    -           -83.98%     -99.45%    -99.91%
Size % change    -           -43.68%     -97.84%    -99.59%
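The document-count reductions follow directly from the raw and rolled-up counts. A small check, using only the doc counts from the slide (the helper name is hypothetical):

```python
def percent_change(raw: float, rolled: float) -> float:
    """Percentage change from the raw value to the rolled-up value."""
    return round((rolled - raw) / raw * 100, 2)


RAW_DOCS = 9_041_000
ROLLED_DOCS = {"minute": 1_448_285, "hour": 49_554, "day": 8_447}

for interval, docs in ROLLED_DOCS.items():
    print(f"{interval}: {percent_change(RAW_DOCS, docs)}%")
```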
20. Sharp Virtuoso
Advanced Index Operations
• Rollover API
https://www.elastic.co/guide/en/elasticsearch/reference/7.3/indices-rollover-index.html
• Split API
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-split-index.html
• Shrink API
https://www.elastic.co/guide/en/elasticsearch/reference/7.3/indices-shrink-index.html
• Index Sorting
https://www.elastic.co/guide/en/elasticsearch/reference/7.3/index-modules-index-sorting.html
23. Many Individual Needs
Multiple Use Cases | Multiple Clusters
• METRICS
• SECURITY
• OPERATIONAL ANALYTICS
• SEARCH
• LOG ANALYTICS
• CUSTOM APPS
24. Whole Group Visibility
Cross Cluster Search
[Diagram: the Dev, Support, Billing, and Marketing teams each use Kibana with ES-CCS to search across the Elasticsearch clusters for Logging, Security, Search, Metrics, and Apps]
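In a cross cluster search request, each remote index is addressed as cluster_alias:index_pattern. The sketch below builds such a search path; the cluster aliases and index patterns are invented for illustration, and the remote clusters would first need to be registered under those aliases.

```python
def ccs_target(cluster_alias: str, index_pattern: str) -> str:
    """Address an index on a remote cluster as alias:pattern."""
    return f"{cluster_alias}:{index_pattern}"


def ccs_search_path(targets: list) -> str:
    """Build the _search path covering several remote targets."""
    return "/" + ",".join(targets) + "/_search"


path = ccs_search_path([
    ccs_target("logging", "logs-*"),
    ccs_target("metrics", "metricbeat-*"),
])
print(path)
```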
25. Every 1st Chair Needs a 2nd
Cross Cluster Replication
• Disaster Recovery: Pro DC and DR DC (Leader, Follower)
• Data Locality: Central DC with Canada DC and Singapore DC
• Central Reporting: Canada DC and Singapore DC with a Central Reporting DC
26. More Artists, More Challenges
Additional Management Concerns
• Hardware Profiles
• Data Lifecycle
• Upgrade Policies
• Scale up / Scale Down
• Security Integrations
• Fleet management
“Ah’ve got blisters on me fingers!”
27. Managing Trios, Quartets, & More
Orchestrate across multiple clusters
• METRICS
• SECURITY
• OPERATIONAL ANALYTICS
• SEARCH
• LOG ANALYTICS
• CUSTOM APPS
28. Pick the Right Venue
Self-Managed: The best software and support
• Download distributions and install them on your preferred infrastructure.
• Deploy anywhere.
ECE/ECK: Fully Orchestrated
• Orchestration tailor-made for Elastic Stack. Centrally manage multiple stacks and versions.
• Deploy anywhere.
ESS: Elastic Hosted
• The official fully managed Elasticsearch & Kibana solution.
• Available on AWS, GCP, and Azure.
29. Benefits of Cloud
Options compared: Self Managed, ECE/ECK, ESS
Legend: Do it Yourself / Done for You
• Shard Sizing & Mapping
• Hardware Provisioning
• Snapshot Repository Management * (unless you want to)
• Scaling Deployments
• Zero Downtime Upgrades
• Hot/Warm Architecture
• Shard Routing Across AZs
• Secure Nodes Communication