SlideShare une entreprise Scribd logo
1  sur  13
Apache Ambari
Hadoop Cluster Manifest/Blueprint
Sumit Mohanty
Member of Technical Staff
Hortonworks
Agenda
• Cluster Manifest
• Scenarios
• Cluster Blueprint
• Using Cluster Manifest
• What’s next?
Hadoop Cluster Manifest
• Declarative representation of a Hadoop Cluster
– Stack Definition
– Configuration
– Host Details
– Component Mapping
• A common spec. across tools/services
• Targets
– Package Author, Hadoop Admins, and System Admins
Cluster Manifest:
Package Definition
• Package metadata
• Repository details
• Constituent services and their components
• Service specific metadata
• Configurable parameters
Cluster Manifest:
Package Definition
{
"schemaVersion:" : "1”,
"version" : "1.3.0”,
"author" : "Hortonworks”,
"created" : "03-31-2013”,
"manifestId" : "GUID",
"stackVersion" : "1.3.0”, "stackName" : "HDP",
"context" : […],
"packages" : {
"type" : "rpm",
"osSpecificPackages" : […]
},
"services" : [
{
"name" : "HDFS",
"components" : [
{
"name" : "NAMENODE",
"category" : "MASTER",
…
},
{
"name" : "DATANODE", …
],
"configurations" : [
{
"type":"core-site.xml",
"properties" : [
{
"propertyName" : "fs.trash.interval",
"defaultValue" : "360",
"propertyDescription" : "..."
},
…
],
"isManageable": "true",
"isRequired": "true",
"packages": […],
"serviceContext" : […]
}
}
Cluster Manifest:
Package Configuration
• Configurable parameters and values
– Non-default
– Organization, environment, instance specific
• Service or component specific values
{
"schemaVersion:" : "1", …
"context" : [
{ "name" : "targetStackVersion", "value" : "1.3.0" },
],
"deployedServices" : ["HDFS”, … ],
"configuration" : [
{
"type":"core-site.xml",
"properties" : [
{ "name" : "fs.trash.interval", "value" : "300" },
...
]
},
…
"configOverrides" : [
/* delta changes on the top level changes */
{
"type" : "SERVICE”, "name" : "HDFS",
"configuration" : [
{
"type":"core-site.xml",
"properties" : [
{ "name" : "fs.trash.interval", "value" : "480" },
...
},
{
"type" : "COMPONENT"
"name" : "JOBTRACKER",
...
}
Cluster Manifest:
Host List
• List of hosts
– Can be fully specified
– Or, can be a set of requirements
– Or, can even be non-existent
{
"schemaVersion:" : "1", …
"context" : […],
"hostGroups" : [
{
"name" : "masterHosts",
"members" : {
"count" : "1",
"hosts" : [
{ "FQDN" : "host1.domain1.com", "ip" : "" }
]
},
"properties" : […]
},
{
"name" : ”slaveHosts",
"members" : {…},
"properties" : […]
},
{
"name" : "clientHosts",
"members" : {…},
"properties" : [
{ "name" : "host_type", "value" : "High-CPU Medium" }
]
},
...
]
}
Cluster Manifest:
Host Component Mapping
• A mapping of components to hosts
– Simple component mapping to named hosts
– A set of constraints that can be used to find best
match (e.g. evaluate against host properties)
• System resources
– users, groups, ports, etc.
• Host specific configuration
– Non-homogeneous cluster
Cluster Manifest:
Host Component Mapping
{
"schemaVersion:" : "1”, …
"context" : […],
"hostResourceMapping" : [
{
"hosts" : [
{
"predicate" : "name=*"
}
],
"systemResources" : {
"hadoopGroup" : "hadoop",
"groups" : [
{
"name" : "hadoop",
...
],
"users" : [
{
"groups" : [
"hadoop"
],
"name" : "hdfs",
"type" : "LOCAL”
],
"ports" : […]
...
],
"hostComponentMapping" : [
{
"hosts" : [
{
"predicate" : "name=masterhosts1"
"configOverrides" : [
{
"type":"core-site.xml",
"properties" : [
{ "name" : "fs.trash.interval", "value" : "480" },
...
],
"components" : [
"NAMENODE",
"JOBTRACKER",
...
]
},
...
}
Scenarios
• Define cluster templates
– and, host specific templates
• On demand cluster creation
– Cluster extension (e.g. add Datanodes)
• Export cluster manifest
• A uniform “language” across cluster managers
and environments
Cluster Blueprint
• Blueprint is manifest with “holes”
– Typically
• Hostnames
• Config parameters that use hostname
– But, any config params that a Hadoop admin
deems necessary to be parameterized
• Blueprint = Manifest + Parameter Values
Using Cluster Manifest
What’s Next?
• Apache Ambari JIRA 1783, is tracking this
project
– https://issues.apache.org/jira/browse/AMBARI-
1783
– Comments and suggestions, welcome
• In next releases, we will enhance Ambari to
add support for manifest and blueprints

Contenu connexe

Plus de Hortonworks

Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsHortonworks
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysHortonworks
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's NewHortonworks
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerHortonworks
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsHortonworks
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeHortonworks
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidHortonworks
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleHortonworks
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATAHortonworks
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Hortonworks
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseHortonworks
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseHortonworks
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationHortonworks
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementHortonworks
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHortonworks
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCHortonworks
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive DataHortonworks
 
5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of DataHortonworks
 
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateExploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateHortonworks
 

Plus de Hortonworks (20)

Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data
 
5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data
 
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateExploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
 

Dernier

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Dernier (20)

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Apache Ambari BOF - Blueprint - Hadoop Summit 2013

  • 1. Apache Ambari Hadoop Cluster Manifest/Blueprint Sumit Mohanty Member of Technical Staff Hortonworks
  • 2. Agenda • Cluster Manifest • Scenarios • Cluster Blueprint • Using Cluster Manifest • What’s next?
  • 3. Hadoop Cluster Manifest • Declarative representation of a Hadoop Cluster – Stack Definition – Configuration – Host Details – Component Mapping • A common spec. across tools/services • Targets – Package Author, Hadoop Admins, and System Admins
  • 4. Cluster Manifest: Package Definition • Package metadata • Repository details • Constituent services and their components • Service specific metadata • Configurable parameters
  • 5. Cluster Manifest: Package Definition { "schemaVersion:" : "1”, "version" : "1.3.0”, "author" : "Hortonworks”, "created" : "03-31-2013”, "manifestId" : "GUID", "stackVersion" : "1.3.0”, "stackName" : "HDP", "context" : […], "packages" : { "type" : "rpm", "osSpecificPackages" : […] }, "services" : [ { "name" : "HDFS", "components" : [ { "name" : "NAMENODE", "category" : "MASTER", … }, { "name" : "DATANODE", … ], "configurations" : [ { "type":"core-site.xml", "properties" : [ { "propertyName" : "fs.trash.interval", "defaultValue" : "360", "propertyDescription" : "..." }, … ], "isManageable": "true", "isRequired": "true", "packages": […], "serviceContext" : […] } }
  • 6. Cluster Manifest: Package Configuration • Configurable parameters and values – Non-default – Organization, environment, instance specific • Service or component specific values { "schemaVersion:" : "1", … "context" : [ { "name" : "targetStackVersion", "value" : "1.3.0" }, ], "deployedServices" : ["HDFS”, … ], "configuration" : [ { "type":"core-site.xml", "properties" : [ { "name" : "fs.trash.interval", "value" : "300" }, ... ] }, … "configOverrides" : [ /* delta changes on the top level changes */ { "type" : "SERVICE”, "name" : "HDFS", "configuration" : [ { "type":"core-site.xml", "properties" : [ { "name" : "fs.trash.interval", "value" : "480" }, ... }, { "type" : "COMPONENT" "name" : "JOBTRACKER", ... }
  • 7. Cluster Manifest: Host List • List of hosts – Can be fully specified – Or, can be a set of requirements – Or, can even be non-existent { "schemaVersion:" : "1", … "context" : […], "hostGroups" : [ { "name" : "masterHosts", "members" : { "count" : "1", "hosts" : [ { "FQDN" : "host1.domain1.com", "ip" : "" } ] }, "properties" : […] }, { "name" : ”slaveHosts", "members" : {…}, "properties" : […] }, { "name" : "clientHosts", "members" : {…}, "properties" : [ { "name" : "host_type", "value" : "High-CPU Medium" } ] }, ... ] }
  • 8. Cluster Manifest: Host Component Mapping • A mapping of components to hosts – Simple component mapping to named hosts – A set of constraints that can be used to find best match (e.g. evaluate against host properties) • System resources – users, groups, ports, etc. • Host specific configuration – Non-homogeneous cluster
  • 9. Cluster Manifest: Host Component Mapping { "schemaVersion:" : "1”, … "context" : […], "hostResourceMapping" : [ { "hosts" : [ { "predicate" : "name=*" } ], "systemResources" : { "hadoopGroup" : "hadoop", "groups" : [ { "name" : "hadoop", ... ], "users" : [ { "groups" : [ "hadoop" ], "name" : "hdfs", "type" : "LOCAL” ], "ports" : […] ... ], "hostComponentMapping" : [ { "hosts" : [ { "predicate" : "name=masterhosts1" "configOverrides" : [ { "type":"core-site.xml", "properties" : [ { "name" : "fs.trash.interval", "value" : "480" }, ... ], "components" : [ "NAMENODE", "JOBTRACKER", ... ] }, ... }
  • 10. Scenarios • Define cluster templates – and, host specific templates • On demand cluster creation – Cluster extension (e.g. add Datanodes) • Export cluster manifest • A uniform “language” across cluster managers and environments
  • 11. Cluster Blueprint • Blueprint is manifest with “holes” – Typically • Hostnames • Config parameters that use hostname – But, any config params that a Hadoop admin deems necessary to be parameterized • Blueprint = Manifest + Parameter Values
  • 13. What’s Next? • Apache Ambari JIRA 1783, is tracking this project – https://issues.apache.org/jira/browse/AMBARI- 1783 – Comments and suggestions, welcome • In next releases, we will enhance Ambari to add support for manifest and blueprints