NYSE: DVN
devonenergy.com
Elastic Cloud Enterprise in Azure
A Devon Energy Story
Devon - Internal
Who Are We?
Paul PC
• Alphabet soup: MBA, GSE, GREM, GCIA, GCIH, GSEC, GPEN, GPYC, CISSP
• Security Architect / Team Lead for Devon Energy, an independent E&P headquartered in OKC
• Fluent in Romanian, English, Python
• Loves the cloud almost as much as his Ducati
Prakhar S
• Educational Kaizen: Bachelor of Engineering (CSE), RHCE, SAS® Certified
• Works for Accenture Technology, currently collaborating with Devon Energy
• Close to nine years of experience; aspires to contribute to cutting-edge tech
Agenda / History
2013: ES 1.7 and Moloch
2015: Production-grade Servers
2017: Elasticsearch in Security Analytics Platform
2019: SIEM replacement
2017: Analytics Platform v1.0 – Working Experiments
Axioms:
• Collaboration with the Data Analytics team
• Only cloud components
• Use Azure first-party services as much as possible
• Be able to ingest 500 GB/day
Scale
Approximately:
• 100 billion documents
• 100 TB of indexed data
• 1.5 – 2.0 TB daily
• 1,300 indices
• 10,000 shards
• Ingestion rate of approximately 20-25k EPS
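The scale figures above imply a few useful averages. The sketch below is back-of-the-envelope arithmetic only. It assumes decimal terabytes, that the daily volume and the EPS rate describe the same data stream, and that shards are evenly sized; the slide states none of these. The function names are illustrative.

```python
# Back-of-the-envelope averages from the slide's approximate figures.
TB = 10**12  # assumption: decimal terabytes (the slide does not say TB vs. TiB)
GB = 10**9
SECONDS_PER_DAY = 86_400


def avg_shard_size_gb(indexed_bytes: float, shard_count: int) -> float:
    """Average shard size in GB, assuming evenly sized shards."""
    return indexed_bytes / shard_count / GB


def avg_event_bytes(daily_bytes: float, events_per_second: float) -> float:
    """Average bytes per event for a given daily volume and sustained EPS."""
    return daily_bytes / (events_per_second * SECONDS_PER_DAY)


shard_gb = avg_shard_size_gb(100 * TB, 10_000)
shards_per_index = 10_000 / 1_300
low = avg_event_bytes(1.5 * TB, 20_000)
high = avg_event_bytes(2.0 * TB, 25_000)

print(f"average shard size: {shard_gb:.1f} GB")
print(f"average shards per index: {shards_per_index:.1f}")
print(f"average event size: {low:.0f} to {high:.0f} bytes")
```

These are averages, so individual indices and shards can differ widely from them.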
2019: Analytics Platform v3.0 – Production SIEM++
Enrichment moved to Logstash
DIY retention to BLOB
First shard optimization efforts
Automation is our best L1 HoD
More data, more user stories
• Network
• AKS
• Accounting
• SAP security
Logstash - The Core of Our FrankenSIEM
• Traditional SIEM problems:
• Retention
• Parsing - legacy logs are terrible
• WEF is great, until it isn’t
• Just enough normalization
• Enrichment > Correlation
• Atomic indicators
• Identity information
• Cluster > purpose collectors
• Logstash nodes are cattle, not pets
• Puppet was good, K8s is great
• Throughput vs. node count
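The "Enrichment > Correlation" idea can be illustrated by attaching indicator and identity context to each event at ingest time, so that later searches need no separate correlation step. The deck does this in Logstash. The Python below is only a sketch of the concept: the lookup tables, field names (`source_ip`, `user`), and sample values are all hypothetical.

```python
from typing import Any

# Hypothetical lookup tables; the deck keeps this data in Logstash enrichment.
INDICATORS = {"203.0.113.7": {"type": "ip", "source": "example-feed"}}
IDENTITIES = {"jdoe": {"department": "Accounting", "manager": "asmith"}}


def enrich(event: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the event with indicator and identity context added."""
    enriched = dict(event)  # shallow copy so the caller's event is not mutated

    # Atomic indicator match: exact lookup on the (assumed) source_ip field.
    hit = INDICATORS.get(event.get("source_ip", ""))
    if hit is not None:
        enriched["indicator"] = hit

    # Identity information: exact lookup on the (assumed) user field.
    who = IDENTITIES.get(event.get("user", ""))
    if who is not None:
        enriched["identity"] = who

    return enriched


sample = {"source_ip": "203.0.113.7", "user": "jdoe", "action": "login"}
print(enrich(sample))
```

Because the context travels with the event, an analyst can filter on fields such as `indicator.type` directly instead of joining data sets after the fact.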

Why ECE
Intent: Security should not be managing it
What’s cool about ECE
• It saves time: streamlines scaling, securing, and upgrading the stack
• Granting other teams access to their logs was a great way to increase utility and value
• Centralized management of Logstash pipelines and cluster user settings is cool
• Built-in monitoring gives great insight
It was very different from traditional ES
• Troubleshooting is a bit different
• elasticsearch.yml is not to be trifled with
• Initially, we had to rely a lot more on support
Major Gotchas
Indexing throughput
• Use dedicated master nodes
• Raise the index refresh interval: refresh_interval: 30s (or even -1 for the initial load)
• Tune the indexing buffer size
OOMs
• Give the JVM 50% of available memory
• Prevent JVM heap resizing: minHeap = maxHeap
Cautiously budget your cache
• Limit and monitor the field data cache: it never goes away
• Using circuit breakers can save the day
Disk sizing
• Monitor the number of replicas for your use case
• Experiment with sharding: larger shards mean a better indexing rate, but needs vary; more shards come at a cost
Weigh your priorities and accept trade-offs
• Memory-intensive queries vs. near-real-time data vs. long-term retention
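Two of the tips above translate directly into configuration values. The sketch below builds an index settings body that sets `index.refresh_interval`, and JVM flags that pin the minimum and maximum heap to half of the host memory. The helper names are illustrative. How the settings are applied, and whether the heap can be set by hand at all in a managed deployment, is left open here.

```python
import json


def bulk_load_settings(initial_load: bool = False) -> dict:
    """Index settings body: 30s refresh, or -1 (refresh disabled) for an initial load."""
    interval = "-1" if initial_load else "30s"
    return {"index": {"refresh_interval": interval}}


def heap_flags(total_memory_gb: int) -> list[str]:
    """JVM flags giving the heap half of memory, with min heap equal to max heap."""
    heap_gb = max(1, total_memory_gb // 2)
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


print(json.dumps(bulk_load_settings()))
print(json.dumps(bulk_load_settings(initial_load=True)))
print(" ".join(heap_flags(32)))
```

After an initial load with refresh disabled, the interval needs to be set back to a positive value so that new documents become searchable again.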
What We Learned – Operational Glitches
• It hurt being the first Azure customer for ECE
• Docker ignorance didn’t help
• ECE 1.1.x supported a specific version of Docker with a specific kernel version
• Updates tanked the cluster at least twice – multi-zone helps
• Problems at our scale:
• Migrating shards between containers takes days
• In-place container upgrades: advanced configuration will save your day/week
• Moving to dedicated master nodes – election problems
• Unexpected socket hang-ups, difficult to diagnose
• Monitoring:
• Independent monitoring cluster
• Created our own health check scripts
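A custom health check along the lines mentioned above could start from the JSON returned by the Elasticsearch cluster health API (`_cluster/health`). This is a minimal sketch, not the authors' script. It evaluates an already-parsed response, fetching it is out of scope, and the `expected_nodes` threshold is an assumed parameter.

```python
def evaluate_health(health: dict, expected_nodes: int) -> list[str]:
    """Return human-readable problems found in a cluster health response."""
    problems = []

    # "status" is green, yellow, or red in the cluster health response.
    status = health.get("status", "unknown")
    if status != "green":
        problems.append(f"cluster status is {status}")

    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        problems.append(f"{unassigned} unassigned shards")

    # expected_nodes is an assumption: the node count the operator expects.
    nodes = health.get("number_of_nodes", 0)
    if nodes < expected_nodes:
        problems.append(f"only {nodes} of {expected_nodes} nodes present")

    return problems


sample = {"status": "yellow", "unassigned_shards": 12, "number_of_nodes": 5}
for problem in evaluate_health(sample, expected_nodes=6):
    print(problem)
```

Running a check like this from outside the cluster fits the independent monitoring cluster noted above, since it still works when the monitored cluster is unhealthy.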
What’s Next For Us
Chugging the Docker and Kubernetes Kool-Aid:
• Curator
• Enrichment data collection
• Logstash + autoscaling
• Health checks
Dedicated nodes:
• ML Nodes
• Search Only Nodes
• Ingest nodes
Separate security and non-security into their own spaces
SAML / OpenID Connect with MFA
Thank you.
Come call BS at Happy Hour


Contact info:
@p4ulpc – paul.poputa-clean@dvn.c
Prakhar.sengar@dvn.com
Code:
https://github.com/paulpc/elasticon
