Doing DevOps for Big Data? What You Need to Know About AIOps

AIOps promises to make DevOps teams far more efficient as they struggle with the diversity, complexity, and rate of change across the entire stack.

DevOps teams working with big data face unique challenges due to the complexity and diversity of the components that comprise the big data stack. At the same time, AIOps is maturing to the point of creating true efficiencies among these DevOps teams as they struggle against the diversity, complexity, dynamic behavior and rate of change across the entire stack.

  1. What DevOps Need to Know About AIOps. Bala Venkatrao, VP, Products at Unravel Data. January 17, 2019
  2. Polling questions
     (1) What are your most common big data application challenges? (select all that apply)
         - Performance tuning for slow applications
         - Root cause analysis for application failures
         - Establishing meaningful SLAs
         - Detecting runaway queries
     (2) What are your most common operational challenges for big data clusters? (select all that apply)
         - Chargeback/Showback for multi-tenant clusters
         - Visibility into resource utilization
         - Job and workload management
         - Cluster tuning
         - Capacity planning
     (3) What tools do you use today for troubleshooting and tuning your big data applications? (select all that apply)
         - Cloudera Manager/Apache Ambari
         - YARN/Spark WebUI
         - Dr. Elephant or other open source tools
         - Log management tools: Splunk, ELK, etc.
  3. Challenge: Operationalizing Modern Data Applications. What does it mean for new data-driven applications and analytics to be enterprise grade?
  4. Modern Data Pipelines are Varied and Complex. Examples of big data apps: ETL, analytics, machine learning, IoT, etc.
     [Diagram: Typical Big Data Architecture]
     • Data sources: RDBMS, social, sensor, machine, ERP, mobile
     • Stage labels: Data Collection, Real-time / Batch Process, Result Store, Data Consumer
     • Component labels: ETL, data pipeline, stream processing, raw data, computed data, messaging, result store, query data, report, ML apps, IoT apps, analytics, B.I., alerts, services
  5. Impact of poor performance and failures in the data pipeline: low productivity, sub-optimized resources, lack of reliability
  7. Tackling Complexity in Big Data Applications
  8. DevOps and AIOps. As big data adoption grows, the ability to manually intervene for hundreds of jobs running on thousands of nodes becomes problematic … Need for an AI-Powered Application Performance Management (APM) for Big Data
  9. Essential Elements of an AIOps Solution for Big Data APM
     • Data Collection and Correlation
       • Observe and collect all relevant data
     • Operational Data Model
       • AI-assisted monitoring, troubleshooting, tuning, and managing requires a data model
     • Analytics
       • Statistical analysis – correlate, classify, extrapolate from operational data
       • Predictive/Prescriptive analytics – forecasting and recommendations for capacity
       • Pattern and anomaly detection, root-cause analysis
       • Context, topology and coded expertise
     • Automation
       • Auto-tuning of applications and resources
       • Cluster load balancing and job scheduling
       • Autonomous response to alerts and failures
     [Diagram labels: Modern Data Apps and Stack; Data Collection and Correlation; Data Model; Analytics (Statistical, Predictive/Prescriptive, Anomaly Detection, Context/Topology); Automation (Auto-tuning, Cluster Operations, Resource Management, Autonomous Remediation)]
  10. Without AI, Big Data APM is a manual, logistical challenge
     • Big Data APM without AI: multiple tools, no complete view, no intelligence.
     • AI-powered Big Data APM: one complete correlated view with built-in AI and ML.
  11. Unravel: First AIOps Solution for Big Data APM. Full-stack, Intelligent, Autonomous
  12. AIOps Use Cases for Unravel
     • Automated Cloud Cost Management
       • Optimize cost by right-sizing cloud images
       • Optimize cost by choosing the optimal price plan
     • Automated Workload Management
       • Eliminate CPU, memory, network I/O and disk I/O contention
       • Correctly size VMs and cloud images
       • Place VMs in the best hosts and clusters
     • Automated Event Management
       • De-duplicate events
       • Support a collaborative (DevOps) problem resolution process
     • Automated Performance Optimization and Remediation
       • Automatically learn the performance characteristics of apps and the supporting stack
       • Automatically optimize for a chosen KPI (performance, efficiency)
  13. Unravel Applies Machine Learning (ML) at Various Levels. Application Management: error views & analysis, tuning recommendation, automated tuning
  14. Unravel Applies Machine Learning (ML) at Various Levels. Operations Management: cluster optimization, capacity planning & forecasting
  15. Unravel AIOps Demo
     • A single pane of glass for application & operations management
     • Anomaly detection to rapidly detect & diagnose unpredictable behavior
     • Proactive alerting & remediation of cluster/SLA problems caused by applications
     • Automatic root cause analysis of workflows that missed their SLA
     • Intelligent tuning to make YARN (Spark, Hive) applications faster & more resource efficient
  16. Customer Case Study. Before Unravel: Global 200 Financial Services Company
     • Challenges: complex infrastructure landscape; debugging performance problems is a challenge; out-of-control costs; sub-optimal capacity management; missing insights on data operations; ineffective alerting and automatic actions
     • Scale: 100+ projects, 5,000+ jobs/day, 600+ users globally, 3PB+ of data
     • >$1m spent on unutilized storage
     • >5 different interfaces for job monitoring
     • >10 different logs for debugging a single workflow
     • 1-2 weeks to determine root cause for performance issues
     • 80% of the datasets can be candidates for lower-cost storage
     • 99% of all the current alerting cannot be correlated with performance issues
  17. Customer Case Study. After Unravel: Global 200 Financial Services Company
     • Challenges addressed: complex infrastructure landscape; debugging performance problems is a challenge; out-of-control costs; sub-optimal capacity management; missing insights on data operations; ineffective alerting and automatic actions
     • Scale to unlimited # of users, apps, data, projects
     • 1 interface for job or workflow monitoring
     • Reduce troubleshooting time by 98%
     • Maximize resource utilization
     • Save 60% on resource cost
     • 70% reduction in support tickets
  18. The benefits of Unravel’s AI-powered APM Solution
  20. Live Q&A questions
     1. Does Unravel support big data workloads in the cloud?
     2. I am planning to migrate from an on-premises installation to the cloud. Can Unravel help with that?
     3. Does Unravel do more than monitoring?
  21. Thank You. Free full-feature trial on Amazon EMR and Microsoft Azure: https://unraveldata.com/free-trial/ https://unraveldata.com/
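
Slide 9 lists pattern and anomaly detection among the analytics an AIOps solution needs. As a rough illustration only (the slides do not describe how Unravel implements this, so this is not Unravel's method), the Python sketch below flags job runs whose duration departs sharply from a trailing window of earlier runs, using a simple z-score. The function name `find_anomalies`, the `window` and `threshold` parameters, and the sample run times are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(durations, window=5, threshold=3.0):
    """Return indices of runs whose duration is more than `threshold`
    standard deviations away from the mean of the previous `window` runs.

    Illustrative assumption: a trailing-window z-score is one simple
    way to detect anomalies; real AIOps products may use other models.
    """
    anomalies = []
    for i in range(window, len(durations)):
        history = durations[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: no scale to compare against, so skip.
            continue
        if abs(durations[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical run times (in seconds) for one recurring job.
runs = [120, 118, 123, 121, 119, 122, 310, 120, 121]
print(find_anomalies(runs))
```

In this sample the seventh run (index 6) stands far outside its recent history and is the one reported. A trailing window keeps the baseline local, but note that a large outlier inflates the spread of the windows that follow it, so later deviations can be masked until it ages out; a median-based baseline would be one way to reduce that effect.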