How LogDNA Scaled Elasticsearch on Kubernetes

Our Head of DevOps gave this presentation at Container World 2019. It covers our experience scaling Elasticsearch on Kubernetes, including pro tips on configuration files, index templates, and more.

Published in: Technology

  1. Scaling Elasticsearch on Kubernetes, by Ryan Staatz
  2. Fast multi-cloud logging. Presentation by Ryan Staatz.
     What is Elasticsearch (ES) and why would I use it? In brief:
     ● Elasticsearch is a distributed full-text search engine that is queryable via a JSON API
     ● It’s the ‘E’ in the popular ELK stack and allows easy searching of unstructured data
     ● Native distributed clustering support makes adding Elasticsearch nodes easy
     ● You’ve been watching the Elasticsearch hype train and want to hop aboard
  3. What is Kubernetes (k8s) and why would I run ES on it? In brief:
     ● Kubernetes is an open-source container orchestration platform developed by Google
     ● Scheduling & distributing application workloads onto hardware resources is automatic
     ● Configuration as code & static Docker images enforce consistent pod behaviors
     ● You’ve been watching the Kubernetes hype train ship and want to hop aboard
  4. At LogDNA we run ES on k8s at scale. Both cloud and on-prem!
     ● We needed a consistent way to deploy our software across varying infrastructures
     ● We have made a number of custom modifications to Elasticsearch interfaces
     ● We run in-house versions of the L (Logstash) and K (Kibana) of the ELK stack
     ● Kubernetes enables easier automation for versioning, CI/CD, and maintenance
  5. So managing ES with Kubernetes should be easy, right?
     These are some of the steps involved in running ES on Kubernetes:
     ● Choose the appropriate Elasticsearch version and select the correct settings (there are hundreds of settings)
     ● Learn the expansive query language for Elasticsearch and integrate it into your workflows
     ● Set up a Kubernetes environment with access to appropriately sized hardware
     ● Configure the Elasticsearch k8s workload to request the appropriate resources, including disks
     ● Ensure the correct index templates and cluster settings are applied after launching your ES cluster
     ● Create k8s services such that Elasticsearch pods can find each other
     ● Troubleshoot all remaining issues as they arise and continue to manage and scale the cluster
     Sounds great, let’s get started!
  6. Getting started. Maybe let’s just start with some sane defaults:
     ● ES version 5.5 & Kubernetes cluster v1.11+ (for preemption)
     ● Hardware resources (k8s nodes) with at least 64 GB of RAM and 16 vCPUs (depends on your volume)
     ● StatefulSet and Service YAML configurations (we need identity, disks, and networking)
     ● Basic but important cluster settings & a good starter index template
     ● Deploy an ES cluster management GUI (cerebro) to help with troubleshooting
  7. A tale of too (many) yamls. There’s going to be a lot of these, but configuration as code is good!
     ● Two ConfigMaps:
       ○ The elasticsearch configuration file
       ○ A start script used to configure ulimits, permissions, and JVM heap size
     ● Three ES role types (statefulsets):
       ○ Master - handles lightweight cluster-wide actions (does not require disk)
       ○ Hot - handles incoming writes to active indices (higher cpu to disk ratio)
       ○ Cold - stores and queries older indices (lower cpu to disk ratio)
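The slide says the start script sets the JVM heap size but does not show how. The sketch below is one way to derive it, assuming the widely used rule of thumb of giving the heap half of the available memory and capping it at about 31 GB so that compressed object pointers stay enabled. The function names and the cap value are assumptions for illustration, not taken from the deck; `ES_JAVA_OPTS` is the environment variable Elasticsearch 5.x reads for JVM flags.

```python
def heap_size_mb(container_memory_mb: int, cap_mb: int = 31 * 1024) -> int:
    """Return a heap size in MB: half of the container memory, capped at cap_mb.

    The 31 GB default cap is an assumption based on the common guidance to
    stay below the ~32 GB compressed-oops threshold.
    """
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    return min(container_memory_mb // 2, cap_mb)


def es_java_opts(container_memory_mb: int) -> str:
    """Build an ES_JAVA_OPTS value with equal minimum and maximum heap."""
    heap = heap_size_mb(container_memory_mb)
    return f"-Xms{heap}m -Xmx{heap}m"


# Example: a pod on one of the 64 GB nodes mentioned in the deck.
print(es_java_opts(64 * 1024))
```

Setting `-Xms` and `-Xmx` to the same value avoids heap resizing at runtime, which matters when memory locking is enabled.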
  8. Important ES configuration notes. Pro tip: this slide contains several pro tips
     ● Use the alpine flavor of ES to reduce image size: elasticsearch:5.5.2-alpine
     ● Configure volumeClaimTemplates to dynamically provision disks
     ● Ensure the correct security context settings are specified in each statefulset
     ● Use k8s pre-emption to ensure your ES pods get scheduling priority
     ● Create a startup script to set the correct configuration prior to starting the JVM
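To make these tips concrete, here is a minimal sketch of a StatefulSet manifest for the hot tier, built as a Python dictionary and printed as JSON (which `kubectl apply -f` accepts as well as YAML). Only the image tag comes from the slide. The resource name, replica count, disk size, PriorityClass name, `fsGroup` value, and the `IPC_LOCK` capability are assumptions chosen for illustration, and the PriorityClass itself would have to be created separately.

```python
import json


def hot_statefulset(name="es-hot", replicas=3, disk="500Gi",
                    priority_class="es-high-priority"):
    """Return a StatefulSet manifest as a dict. All defaults are hypothetical."""
    labels = {"app": name}
    data_volume = "data"
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Pre-emption: refers to a PriorityClass defined elsewhere.
                    "priorityClassName": priority_class,
                    # Assumed group ID so the ES process can write to its disk.
                    "securityContext": {"fsGroup": 1000},
                    "containers": [{
                        "name": "elasticsearch",
                        "image": "elasticsearch:5.5.2-alpine",
                        # Assumed capability, needed for memory locking.
                        "securityContext": {
                            "capabilities": {"add": ["IPC_LOCK"]},
                        },
                        "volumeMounts": [{
                            "name": data_volume,
                            "mountPath": "/usr/share/elasticsearch/data",
                        }],
                    }],
                },
            },
            # One PersistentVolumeClaim is provisioned per pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": data_volume},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": disk}},
                },
            }],
        },
    }


print(json.dumps(hot_statefulset(), indent=2))
```

The volume mount name must match the claim template name; that link is what attaches each pod to its own dynamically provisioned disk.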
  9. Service discovery. Leverage Kubernetes’ native services
     ● ES hot and cold have a single load-balanced cluster IP service endpoint for insertions
     ● ES masters have 2 services:
       ○ 1 load-balanced cluster IP for transport (9300) and HTTP API requests (9200)
       ○ 1 headless service (clusterIP: None) used for ES unicast discovery
     ● 2 important settings for the clusterIP: None service:
       ○ Ensure DNS is publishable immediately
       ○ No sessionAffinity ensures up-to-date addresses
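The headless discovery service described above might look like the following sketch, again built as a Python dictionary and printed as JSON. It assumes that "DNS is publishable immediately" corresponds to the Service field `publishNotReadyAddresses`, which publishes pod addresses before readiness checks pass; the deck does not name the field. The service name and label selector are hypothetical.

```python
import json


def master_discovery_service(name="es-master-discovery", app_label="es-master"):
    """Return a headless Service manifest for ES unicast discovery.

    The name and selector label are hypothetical placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # Headless: DNS returns the pod IPs directly instead of one virtual IP.
            "clusterIP": "None",
            # Assumed mapping of the slide's "DNS is publishable immediately".
            "publishNotReadyAddresses": True,
            # No session affinity, as the slide recommends.
            "sessionAffinity": "None",
            "selector": {"app": app_label},
            "ports": [{"name": "transport", "port": 9300}],
        },
    }


print(json.dumps(master_discovery_service(), indent=2))
```

Publishing addresses before readiness matters here because masters need to find each other in order to form the cluster and become ready in the first place.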
  10. ES startup settings. Just the ones we use:
     ● Ensure memory_lock is on
     ● Adjust the min master nodes based on the total number of masters you have
     ● The clusterIP: None service from the last slide is referenced by unicast settings
     ● Set the correct ES role
     ● Specify the number of cores
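As a sketch of how these settings fit together, the snippet below builds the relevant Elasticsearch 5.x configuration keys for a given role. The minimum master count uses the standard quorum formula (a majority of master-eligible nodes), which guards against split brain. The discovery host name and the `node.attr.tier` attribute used to tell hot from cold nodes are hypothetical; the deck does not say how LogDNA tags its tiers.

```python
def minimum_master_nodes(master_count: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_count < 1:
        raise ValueError("master_count must be at least 1")
    return master_count // 2 + 1


def startup_settings(role: str, master_count: int, cores: int,
                     discovery_host: str = "es-master-discovery") -> dict:
    """Return elasticsearch.yml-style settings for one of: master, hot, cold."""
    if role not in ("master", "hot", "cold"):
        raise ValueError(f"unknown role: {role}")
    settings = {
        "bootstrap.memory_lock": True,
        "discovery.zen.minimum_master_nodes": minimum_master_nodes(master_count),
        # Points at the headless service so every master pod IP is resolved.
        "discovery.zen.ping.unicast.hosts": [discovery_host],
        "node.master": role == "master",
        "node.data": role != "master",
        "processors": cores,
    }
    if role != "master":
        # Hypothetical custom attribute for routing indices to hot or cold nodes.
        settings["node.attr.tier"] = role
    return settings


print(startup_settings("hot", master_count=3, cores=16))
```

With three masters the quorum is two, so the cluster keeps a master through the loss of any single master pod.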
  11. Configuring an index template. Index templates can have a huge impact on your cluster performance
     ● Configure index.total_shards_per_node based on your expected load
       ○ Optimizing shards can increase performance and reduce cluster state overhead
     ● Set a refresh_interval that works for you
       ○ Higher refresh intervals offer better throughput performance at the cost of latency
       ○ We typically use 15-30 seconds
     ● Change translog.durability to async (allow asynchronous translog writes)
       ○ We regret not discovering this setting sooner, as it gave us a 5-10x increase in performance
     ● Note: index templates MUST be applied AFTER the ES masters are already running
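A template carrying the three settings above could be built as in this sketch. It targets the Elasticsearch 5.x template format, where the index pattern lives under the `template` key (6.0 and later use `index_patterns`), and it spells the shard limit with its full setting name, `index.routing.allocation.total_shards_per_node`. The index pattern, the shard limit of 2, and the function name are assumptions; the 30 second refresh interval sits inside the range the slide mentions.

```python
import json


def log_index_template(pattern="logs-*", shards_per_node=2,
                       refresh_interval="30s"):
    """Return an ES 5.x index template body. Pattern and limits are hypothetical."""
    return {
        "template": pattern,
        "settings": {
            "index.routing.allocation.total_shards_per_node": shards_per_node,
            "index.refresh_interval": refresh_interval,
            # Acknowledge writes before the translog is fsynced. Faster, but
            # recent writes can be lost if a node crashes.
            "index.translog.durability": "async",
        },
    }


# The body would be sent with: PUT /_template/<template-name>
print(json.dumps(log_index_template(), indent=2))
```

The performance gain from `async` durability comes with a trade-off: operations acknowledged since the last translog sync are not guaranteed to survive a crash.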
  12. Manage ES the GUI way: Cerebro (previously kopf, if you’re using ES v2.X or lower)
     ● Cerebro connects to your ES service endpoint(s)
     ● Contains an ES node/pod list and their health stats
     ● View indices and shards across the available data nodes
     ● Modify index settings, templates, and data
     ● Move shards around (important)
     ● Not all options are accessible via Cerebro
  13. Manage ES the API way. A bit more work to start on, but automation is much easier
     ● We use Insomnia (a REST API GUI to share API calls)
     ● Curl works too!
     ● API calls we commonly use:
       ○ /_cluster/health
       ○ /_cat/pending_tasks?v
       ○ /_flush?force & /_cluster/reroute?retry_failed=true
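For automation, the same calls can be scripted. The sketch below wraps them with Python's standard `urllib`. The paths are copied from the slide; the base URL and the helper names are assumptions. The health and pending-task endpoints are read with GET, while flush and reroute are issued as POST.

```python
import urllib.request

ES_URL = "http://localhost:9200"  # assumed endpoint; use your ES service address

# (method, path) pairs for the calls listed on the slide.
COMMON_CALLS = [
    ("GET", "/_cluster/health"),
    ("GET", "/_cat/pending_tasks?v"),
    ("POST", "/_flush?force"),
    ("POST", "/_cluster/reroute?retry_failed=true"),
]


def build_request(method: str, path: str, base_url: str = ES_URL):
    """Create a urllib Request for an ES endpoint without sending it."""
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


def call(method: str, path: str, base_url: str = ES_URL, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(method, path, base_url),
                                timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# List what would be sent; call(method, path) performs the request.
for method, path in COMMON_CALLS:
    print(method, build_request(method, path).full_url)
```

Keeping request construction separate from sending makes the call list easy to review or log before anything touches the cluster.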
  14. Wrap up. That was a lot of info, but here’s what to walk away with:
     ● ES requires some coaxing to properly run inside a container
       ○ Use the correct security context, ulimit, and vm settings
     ● There are native concepts in Kubernetes that can make running ES easier
       ○ Service discovery, volumeClaimTemplates, pre-emption, and more
       ○ ...or you could just use an operator! (your mileage may vary)
     ● Index templates have a big impact on how well your ES cluster runs
     ● GUIs (cerebro) and ES APIs are extremely useful for tuning performance
  15. Fast Multi-Cloud Logging. Visit Booth #215. ryan@logdna.com