Managing your Hadoop Clusters with Apache Ambari

Deploying, configuring, and managing large Apache Hadoop and HBase clusters can be quite complex. Once the clusters are running, keeping them up and making sure that SLAs are met presents even more challenges for Hadoop operators, and managing upgrades can be a nightmare. Hadoop users face their own share of difficulties, such as slow-running jobs with no obvious cause. Third-party software vendors interested in incorporating Hadoop management and monitoring capabilities have had no obvious, easy solution.

Apache Ambari aims to make the lives of Hadoop operators, users, and integrators simpler by providing a management interface that addresses all of these needs. This session presents uses of Ambari's Web UI for Hadoop operators (deploying, managing, and monitoring) as well as Hadoop users (job analytics). The talk also touches on Ambari's REST API and how it is used in the real world. The session concludes with the future roadmap of Ambari, including queue management, upgrade, disaster recovery, high availability, and more.


  1. Managing Your Hadoop Clusters with Apache Ambari. Hadoop Summit, June 2013. © Hortonworks Inc. 2013
  2. Hello!
     • Yusaku Sako
       – Committer / PPMC member, Apache Ambari
       – Member of Technical Staff @ Hortonworks
       – yusaku@hortonworks.com
     • Jeff Sposetti
       – Contributor, Apache Ambari
       – Director of Product Management @ Hortonworks
       – jeff@hortonworks.com
  3. Today, We’ll Go Over…
     • Intro
     • Open Source Activity
     • Demo
     • Futures
     • Architecture
     • Recent Developments
     • Q & A
  4. Ambari: Enterprise Hadoop Operations
     Ambari is the only 100% open source framework for provisioning, managing and monitoring Apache Hadoop clusters.
     [Diagram labels: Hadoop (storage & process at scale); Ambari (provision, manage, monitor); Ambari Web]
  5. Features Today
     • Provisioning: Simplified deployment across platforms
       – Wizard-driven cluster install experience
       – Deploy 10s, 100s or 1000s of Hadoop servers
       – Cloud, virtual and physical environments
     • Managing: Consistent controls across the Stack
       – Single point for cluster operations
       – Customize w/o dealing with Hadoop complexities
       – Advanced configurations and host controls
     • Monitoring: Visibility into key cluster metrics
       – Single pane of glass for Hadoop & System status
       – Pre-configured metrics & alerts
  6. Apache Ambari – 100% Open Source!
     • Active community
     • 50+ Contributors / 20+ Committers
     • 140+ Ambari User Group Members
     • Steady progress/release cycle

     Release Version | Release Date | JIRAs Resolved
     0.9.0           | Sep 2012     | 402
     1.2.0           | Feb 2013     | 441
     1.2.1           | Mar 2013     | 134
     1.2.2           | Apr 2013     | 106
     1.2.3           | Jun 2013     | 515
     1.2.4           | Jul 2013     | 109+
     1.2.5           | Jul 2013     | 131+

     [Row callouts on the slide: Current Release; Today’s Demo]
  7. Ambari System Architecture
     [Diagram labels: Ambari Web; REST (/clusters); Ambari Server; DB; hosts, each running an Agent and gmond; Ganglia Server (Agent, gmetad, gmond); Nagios Server (Agent)]
  8. Demo
  9. Futures
  10. Host Group Configuration Controls
      • Set custom configuration properties at the host level for one or more hosts
      • Important for handling “heterogeneous” clusters
      • See AMBARI-1509 and AMBARI-1370
      [Diagram: one group of hosts with HEAPSIZE=1024, another with HEAPSIZE=2048]
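The HEAPSIZE example on this slide can be read as cluster-wide defaults plus per-group overrides. Below is a minimal sketch of that idea in Python. It is not Ambari's implementation: the function `effective_config`, the group name `big-memory`, and the host names are all hypothetical and exist only for illustration.

```python
from typing import Dict


def effective_config(cluster_defaults: Dict[str, str],
                     group_overrides: Dict[str, Dict[str, str]],
                     host_to_group: Dict[str, str],
                     host: str) -> Dict[str, str]:
    """Return cluster defaults with the host's group overrides applied on top.

    Hypothetical data model: a host belongs to at most one override group.
    """
    merged = dict(cluster_defaults)  # copy so the defaults are never mutated
    group = host_to_group.get(host)
    if group is not None:
        merged.update(group_overrides.get(group, {}))
    return merged


# Illustrative values taken from the slide's diagram (1024 vs. 2048).
defaults = {"HEAPSIZE": "1024"}
overrides = {"big-memory": {"HEAPSIZE": "2048"}}    # hypothetical group name
membership = {"host2.example.com": "big-memory"}    # hypothetical host name

small = effective_config(defaults, overrides, membership, "host1.example.com")
large = effective_config(defaults, overrides, membership, "host2.example.com")
print(small["HEAPSIZE"], large["HEAPSIZE"])
```

Hosts without a group fall back to the cluster defaults, which is what makes a heterogeneous cluster manageable from a single base configuration.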
  11. Cluster Blueprints
      • Perform “Headless Install”
      • Perform “Cluster Takeover”
      • Export blueprint from cluster
      • Boot & save wizard w/blueprint
      • See AMBARI-1783
      [Diagram labels: BLUEPRINT (<stack>, <host>, <service>, <component>, <config>); HOST MANIFEST (<host>, <meta>); SERVICE CONFIGS (<props>); Ambari Server]
  12. Hadoop 2.0 Support
      • Provision, manage, and monitor the Hadoop 2.0 stack
      • HDFS2, YARN, Tez
      • Rolling Cluster Upgrades
        – Enable cluster upgrade, one host at a time, in such a way that services and resources offered by the cluster are always available throughout the upgrade process
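The "one host at a time" property of a rolling upgrade can be illustrated with a small simulation. This is only a sketch of the general technique, not Ambari's upgrade logic (the feature was still on the roadmap at the time of the talk). The `rolling_upgrade` function and the `upgrade_host` callback are hypothetical names.

```python
from typing import Callable, List


def rolling_upgrade(hosts: List[str],
                    upgrade_host: Callable[[str], None]) -> List[int]:
    """Upgrade hosts sequentially.

    Returns, for each step, how many hosts were still in service while
    one host was being upgraded.
    """
    in_service = set(hosts)
    availability = []
    for host in hosts:
        in_service.remove(host)               # take exactly one host out of rotation
        availability.append(len(in_service))  # the remaining hosts keep serving
        upgrade_host(host)                    # hypothetical per-host upgrade action
        in_service.add(host)                  # put it back before touching the next host
    return availability


upgraded: List[str] = []
steps = rolling_upgrade(["host1", "host2", "host3"], upgraded.append)
print(upgraded, steps)
```

Because at most one host is out of service at any step, a cluster of N hosts always has N-1 hosts available during the upgrade, which is the availability guarantee the slide describes.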
  13. Ambari Architecture
      [Diagram labels, grouped:]
      • Ambari Web (javascript): web client, 100% REST
      • REST API for integration: /clusters, /services, /hosts, /workflows/jobs, /users, …
      • Ambari Server (java): Request Dispatcher, Orchestrator, SPI
      • DB (RDBMS): cluster configurations
      • Auth Provider: user store (RDBMS, AD/LDAP)
      • Pluggable Service Providers: Metrics, Alerts, Data Mgmt (ganglia, jmx, nagios, falcon)
      • Ambari Agents (python, puppet)
  14. REST API – Centralized & Consistent
      [Diagram labels, grouped:]
      • Ambari REST API on :8080; HTTP GET, POST, PUT, DELETE; responses are HTTP status code / JSON
      • Exposed through the API: Cluster, Hosts / Services, Configurations, Metrics, Alerts, Job History
      • Configurations: Config DB and config files (core-site.xml, *-site.xml, …)
      • Metrics: JMX, Ganglia Server (realtime, historical)
      • Alerts: Nagios Server
      • Job History: Job History DB
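As a rough illustration of what a client call against this API might look like, the sketch below builds (but does not send) an authenticated HTTP request using only the Python standard library. Several details are assumptions not stated on the slide: the `/api/v1` prefix (inferred from the documentation path cited on the next slide), HTTP Basic authentication, and the host name and credentials, which are placeholders. `ambari_request` is a hypothetical helper name.

```python
import base64
import urllib.request


def ambari_request(host: str, path: str, user: str, password: str,
                   method: str = "GET") -> urllib.request.Request:
    """Build a request for the Ambari REST API on port 8080.

    Assumptions: the API lives under /api/v1 and accepts HTTP Basic auth.
    """
    url = f"http://{host}:8080/api/v1{path}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method=method)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder host and credentials; nothing is sent over the network here.
req = ambari_request("ambari.example.com", "/clusters", "admin", "admin")
print(req.get_method(), req.full_url)

# Against a live server, the request could be sent like this:
#   with urllib.request.urlopen(req) as resp:
#       status, body = resp.status, resp.read()
```

The same helper covers the other verbs on the slide by passing `method="POST"`, `"PUT"`, or `"DELETE"`. A real server may require additional headers for write operations, so check the API documentation before relying on this.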
  15. REST API Resource Tree
      • Resources
        – Clusters
          · Services (HDFS, MR, HIVE…)
            · Components (NAMENODE, DATANODE…)
          · Hosts
            · Host Components (DATANODE on host1…)
          · Configurations (core-site, mapred-site, …)
        – Workflows (Hive queries, Pig scripts, MR programs)
          · Jobs (spawned MR jobs…)
            · Task Attempts (Map, Shuffle, Reduce…)
        – Stacks (HDP, other distros)
      • https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md
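The nesting of the resource tree maps naturally onto URL paths. The helpers below sketch that mapping for the cluster branch. The `/api/v1` prefix and the exact path segments (`services`, `components`, `hosts`, `host_components`) are assumptions to check against the API documentation linked on the slide. The cluster name `c1` and the function names are hypothetical.

```python
from typing import Optional

API_ROOT = "/api/v1"  # assumed prefix, inferred from the docs/api/v1 path on the slide


def service_path(cluster: str, service: Optional[str] = None,
                 component: Optional[str] = None) -> str:
    """Path to a cluster's services, one service, or one of its components."""
    path = f"{API_ROOT}/clusters/{cluster}/services"
    if service:
        path += f"/{service}"
        if component:
            path += f"/components/{component}"
    return path


def host_path(cluster: str, host: Optional[str] = None,
              component: Optional[str] = None) -> str:
    """Path to a cluster's hosts, one host, or a component running on that host."""
    path = f"{API_ROOT}/clusters/{cluster}/hosts"
    if host:
        path += f"/{host}"
        if component:
            path += f"/host_components/{component}"
    return path


print(service_path("c1", "HDFS", "NAMENODE"))
print(host_path("c1", "host1", "DATANODE"))
```

Each level of the tree adds one collection segment and one identifier, so a client can walk from a cluster down to a single component on a single host without special cases.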
  16. Ambari + Teradata Viewpoint Integration
      • Ambari = Key enabler for integrating Hadoop monitoring capabilities into Viewpoint
      • Viewpoint uses the Ambari REST API and Custom Service Providers to get Hadoop metrics from a cluster that was not deployed by Ambari
  17. Stack Definitions
      • Design Goals
        – Ambari should be able to support choice of Hadoop stacks
        – Ambari should enable adding new components to an existing stack
      • Define which Services are available (services)
      • Define where to get the packages (repos)
      [Diagram: Stack A and Stack B each define repos and services; Stack C extends Stack B, with its own repos and additional services]
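The "Stack C extends Stack B" relationship in the diagram can be modelled as simple inheritance of the service list. The sketch below is a hypothetical in-memory model for illustration only and does not reflect how Ambari stores stack definitions on disk. The `Stack` class, the repo URLs, and the choice of services are all made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Stack:
    """Hypothetical stack definition: where packages come from and which services exist."""
    name: str
    repos: List[str] = field(default_factory=list)
    services: List[str] = field(default_factory=list)
    extends: Optional[str] = None  # name of the parent stack, if any


def resolve_services(stacks: Dict[str, Stack], name: str) -> List[str]:
    """Return the parent's services followed by the services this stack adds."""
    stack = stacks[name]
    inherited = resolve_services(stacks, stack.extends) if stack.extends else []
    return inherited + [s for s in stack.services if s not in inherited]


stacks = {
    "B": Stack("B", repos=["http://repo.example.com/b"],
               services=["HDFS", "MAPREDUCE"]),
    "C": Stack("C", repos=["http://repo.example.com/c"],
               services=["ACCUMULO"], extends="B"),
}
print(resolve_services(stacks, "C"))
```

With this shape, adding a component to an existing stack means defining a small child stack rather than copying the parent, which matches the second design goal on the slide.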
  18. Ambari + Red Hat GlusterFS Integration
      • Using Ambari to deploy / manage a cluster with a distributed file system other than HDFS
        – HCFS: GlusterFS as first implementation
        – Pluggability with other HCFSs
        – See AMBARI-1817
      [Diagram: MapReduce, Hive, HBase, and Pig on top of a distributed file system layer: HDFS, GlusterFS, other HCFS, …]
  19. Ambari + Accumulo Integration
      • Using Ambari to deploy / manage a cluster with Accumulo
        – Google Summer of Code project
        – See AMBARI-1930
      [Diagram labels: MapReduce, Hive, HBase, Pig; Distributed File System]
  20. Ambari + Splunk Integration
      • Head over to Splunk’s Expo booth to learn about Ambari integrated into Splunk’s Management UI
  21. Get Involved!
      • Project Website
        – http://incubator.apache.org/ambari/
      • Check out Ambari
        – Try installing your own cluster! (See project website for instructions)
      • Mailing Lists
        – ambari-user@incubator.apache.org
        – ambari-dev@incubator.apache.org
      • IRC Channel
        – @apacheambari
  22. Thanks!
      • Questions?