SlideShare une entreprise Scribd logo
1  sur  29
Janos Matyas / CTO / SequenceIQ Inc.
GOAL / MOTIVATION
TECHNOLOGY STACK
PROBLEM RESOLUTION / HOW IT WORKS
RESULTS / ACHIEVEMENTS
OVERVIEW
GOAL / MOTIVATION
 Ease Hadoop provisioning – everywhere
 Automate and unify the process
 Arbitrary cluster size
 Same process through a cluster lifecycle (Dev, QA, UAT, Prod)
 (Auto) scaling Hadoop
 QoS
OUR APPROACH
 Use Docker
 Build cloud-specific ‘Dockerized’ images
 Provision the cluster
 Use Ambari
DOCKER
 Lightweight, portable
 Build once, run anywhere
 VM – without the overhead of a VM
 Isolated containers
 Automated and scripted
DOCKER – CONTAINERS vs. VMs
 Containers are isolated, but share OS and,
where appropriate, bins/libraries
APACHE AMBARI – ARCHITECTURE
 Easy Hadoop cluster provisioning
 Management and monitoring
 Key features – blueprints
 REST API
APACHE AMBARI – CREATE CLUSTER
 Define a blueprint (POST /api/v1/blueprints)
 Create cluster (POST /api/v1/clusters/mycluster)
HADOOP PROVISIONG ISSUES
 Each cloud provider has a proprietary API
 Create images for each provider
 Network configuration
 Service discovery
 Resize, failover, member join support
OUR APPROACH – DETAILS
 Build your Docker image
 Install or pre-install Hadoop services with Ambari
 Install Serf and dnsmasq
 Build your cloud image
 Use Ansible to create an image
 Provision the cluster
BUILD DOCKER IMAGES
 Create the Dockerfile
 Have Docker.io to build the image
 Optionally pre-install services
 Use Ambari
 Push image to Docker.io
 Licensing questions
BUILD CLOUD IMAGES
 Use a Docker ready base image
 Use Ansible to provision the image template
 Pull the Docker images
 Apply custom infrastructure
 Use cloud provider specific playbooks
 AWS EC2
 Azure
ANSIBLE
 Configuration as data
 Simplest way to automate IT
 Secure and agentless
 Goal oriented
 One playbook – multiple modules
 We use it to “burn” cloud images/templates
PROVISIONING – ISSUES
 FQDN
 /etc/hosts is read-only in Docker
 Everybody needs to know everybody
 DNS
 Single point of failure
 Dynamic cluster – nodes joining, leaving, failing
 Routing
 Cloud – ability to inter-host container routing
 Collision free private IP range for Docker bridge
 We need predefined host names/IP addresses
 /etc/hosts is read-only in Docker
 Use Ansible to provision the image template
 Pull the Docker images
 Start a DNS server
 Use it as a reference docker run -dns <IP_OF_DNS>
 Nodes need to know each other
PROVISIONING – SOLUTION
 FQDN
 Use –h and –dns Docker params
 DNS
 dnsmasq is running on each Docker container
 Serf member-xxx events trigger dnsmasq reconfiguration
 Routing
 Docker bridge configuration – follows a convention
SERF
 Gossip based membership
 Service discovery
 Decentralized
 Lightweight, fault tolerant
 Highly available
 DevOps friendly
 Keep an eye on Consul, Open vSwitch, pipework
SERF – DECENTRALIZED SERVICE DISCOVERY
 Gossip instead of heartbeat
 LAN, WAN profiles
 Provides membership information
 Event handlers: member_join, member_leave, member_failed, member-
update, member-reap, user
 Query
SERF – GOSSIPING
SERF – MEMBERSHIP, EVENT HANDLERS
DNSMASQ
 Network infrastructure for small networks
 Lightweight DNS, DHCP server
 Comes with most Linux distributions
AWS EC2 – HADOOP CLUSTER
 Use EC2 REST API to provision instances (from Dockerized image)
 Start Docker containers
 One Ambari server
 N-1 Ambari agents connecting to server
 Connect ambari-shell to
 Define blueprint
 Provision the cluster
AWS EC2 – NETWORK SECURITY
 Create a VPC
 Configure subnets
 Routing tables
 Security gateway
 Set ACL
 Configure VPN
AWS EC2 - CLOUDFORMATION
 Manually set up VPC is too complicated
 Use CloudFormation
 Manage the stack together
 Template-based
 Environments under version control
 Customizable at runtime
 No extra charge
"VpcId" : {
"Type" : "String",
"Description" : "VpcId of your existing Virtual Private Cloud (VPC)"
},
"SubnetId" : {
"Type" : "String",
"Description" : "SubnetId of an existing subnet (for the primary
network) in your Virtual Private Cloud (VPC)"
},
"SecondaryIPAddressCount" : {
"Type" : "Number",
"Default" : "1",
"MinValue" : "1",
"MaxValue" : "5",
"Description" : "Number of secondary IP addresses to assign to the
network interface (1-5)",
"ConstraintDescription": "must be a number from 1 to 5."
},
"SSHLocation" : {
"Description" : "The IP address range that can be used to SSH to the
EC2 instances",
"Type": "String",
"MinLength": "9",
"MaxLength": "18",
"Default": "0.0.0.0/0",
"AllowedPattern": "(d{1,3}).(d{1,3}).(d{1,3}).(d{1,3})/
(d{1,2})",
"ConstraintDescription": "must be a valid IP CIDR range of the form
x.x.x.x/x."
}
},
CLOUDBREAK
Cloudbreak is a powerful left surf that
breaks over a coral reef, a mile off
southwest the island of Tavarua, Fiji.
Cloudbreak is a cloud-agnostic
Hadoop as a Service API. Abstracts
the provisioning and ease
management and monitoring of on-
demand clusters.
Provisioning Hadoop has never been easier
CLOUDBREAK
 Benefits
 Elastic
 Scalable
 Blueprints
 Flexible
 Main REST resources
 /template – specify a cluster infrastructure
 /stack – creates a cloud infrastructure built from a template
 /blueprint – describes a Hadoop cluster
 /cluster – creates a Hadoop cluster
RESULTS AND ACHIEVEMENTS
 Hadoop as a Service API
 Available for EC2 and Azure cloud
 OpenStack, bare metal is coming soon
 Open source under Apache 2 licence
 Same goals as Apache Ambari Launchpad project
 What's next?
HADOOP SERVICES - AS A SERVICE
 Leverage YARN
 Slider (Hoya) providers
 HBase, Accumulo
 SequenceIQ providers - Flume, Tomcat
 YARN -1964
 QoS for YARN – heuristic scheduler
 Platform as a Service API
BANZAI PIPELINE
Banzai Pipeline is a surf reef break located
in Hawaii, off Ehukai Beach Park in
Pupukea on O'ahu's North Shore.
Banzai Pipeline is a RESTful
application development
platform for building on-
demand data and job pipelines
running on Hadoop YARN.
Banzai Pipeline is a big data API for the REST
THANK YOU
 Get the code: https://github.com/sequenceiq
 Read about: http://blog.sequenceiq.com
 Facebook: http://facebook.com/sequenceiq
 Twitter: http://twitter.com/sequenceiq
 LinkedIn: http://linkedin.com/sequenceiq
 Contact: janos.matyas@sequenceiq.com
FEEL FREE TO CONTRIBUTE

Contenu connexe

Tendances

Introduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David NalleyIntroduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David Nalleybuildacloud
 
CloudStack-Developer-Day
CloudStack-Developer-DayCloudStack-Developer-Day
CloudStack-Developer-DayKimihiko Kitase
 
Hadoop on Docker
Hadoop on DockerHadoop on Docker
Hadoop on DockerRakesh Saha
 
Docker, Mesos, Spark
Docker, Mesos, Spark Docker, Mesos, Spark
Docker, Mesos, Spark Qiang Wang
 
Guaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike TutkowskiGuaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike Tutkowskibuildacloud
 
Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinde...
Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinde...Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinde...
Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinde...Cloud Native Day Tel Aviv
 
CloudStack Architecture Future
CloudStack Architecture FutureCloudStack Architecture Future
CloudStack Architecture FutureKimihiko Kitase
 
Migrate Oracle database to Amazon RDS
Migrate Oracle database to Amazon RDSMigrate Oracle database to Amazon RDS
Migrate Oracle database to Amazon RDSJesus Guzman
 
Openstack: An Open Source Cloud Framework
Openstack: An Open Source Cloud FrameworkOpenstack: An Open Source Cloud Framework
Openstack: An Open Source Cloud FrameworkAndrew Shafer
 
Micro services vs hadoop
Micro services vs hadoopMicro services vs hadoop
Micro services vs hadoopGergely Devenyi
 
EMC & OpenStack: A View From Within
EMC & OpenStack: A View From WithinEMC & OpenStack: A View From Within
EMC & OpenStack: A View From WithinEMC
 
VNG/IRD - Cloud computing & Openstack discussion 3/5/2014
VNG/IRD - Cloud computing & Openstack discussion 3/5/2014VNG/IRD - Cloud computing & Openstack discussion 3/5/2014
VNG/IRD - Cloud computing & Openstack discussion 3/5/2014Tran Nhan
 
OpenStack 101 Presentation
OpenStack 101 PresentationOpenStack 101 Presentation
OpenStack 101 PresentationEVault
 
Containers and workload security an overview
Containers and workload security an overview Containers and workload security an overview
Containers and workload security an overview Krishna-Kumar
 

Tendances (20)

Introduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David NalleyIntroduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David Nalley
 
CloudStack-Developer-Day
CloudStack-Developer-DayCloudStack-Developer-Day
CloudStack-Developer-Day
 
Hadoop on Docker
Hadoop on DockerHadoop on Docker
Hadoop on Docker
 
Docker, Mesos, Spark
Docker, Mesos, Spark Docker, Mesos, Spark
Docker, Mesos, Spark
 
Cloud stack for_beginners
Cloud stack for_beginnersCloud stack for_beginners
Cloud stack for_beginners
 
AWS-compared-to-OpenStack
AWS-compared-to-OpenStackAWS-compared-to-OpenStack
AWS-compared-to-OpenStack
 
Apache CloudStack from API to UI
Apache CloudStack from API to UIApache CloudStack from API to UI
Apache CloudStack from API to UI
 
Guaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike TutkowskiGuaranteeing Storage Performance by Mike Tutkowski
Guaranteeing Storage Performance by Mike Tutkowski
 
Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinde...
Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinde...Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinde...
Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinde...
 
Introduction to CloudStack
Introduction to CloudStack Introduction to CloudStack
Introduction to CloudStack
 
CloudStack Architecture Future
CloudStack Architecture FutureCloudStack Architecture Future
CloudStack Architecture Future
 
Migrate Oracle database to Amazon RDS
Migrate Oracle database to Amazon RDSMigrate Oracle database to Amazon RDS
Migrate Oracle database to Amazon RDS
 
Openstack: An Open Source Cloud Framework
Openstack: An Open Source Cloud FrameworkOpenstack: An Open Source Cloud Framework
Openstack: An Open Source Cloud Framework
 
OpenStack 101 update
OpenStack 101 updateOpenStack 101 update
OpenStack 101 update
 
Micro services vs hadoop
Micro services vs hadoopMicro services vs hadoop
Micro services vs hadoop
 
EMC & OpenStack: A View From Within
EMC & OpenStack: A View From WithinEMC & OpenStack: A View From Within
EMC & OpenStack: A View From Within
 
OpenStack Cinder
OpenStack CinderOpenStack Cinder
OpenStack Cinder
 
VNG/IRD - Cloud computing & Openstack discussion 3/5/2014
VNG/IRD - Cloud computing & Openstack discussion 3/5/2014VNG/IRD - Cloud computing & Openstack discussion 3/5/2014
VNG/IRD - Cloud computing & Openstack discussion 3/5/2014
 
OpenStack 101 Presentation
OpenStack 101 PresentationOpenStack 101 Presentation
OpenStack 101 Presentation
 
Containers and workload security an overview
Containers and workload security an overview Containers and workload security an overview
Containers and workload security an overview
 

En vedette

Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Cosmin Lehene
 
Big data processing using Hadoop with Cloudera Quickstart
Big data processing using Hadoop with Cloudera QuickstartBig data processing using Hadoop with Cloudera Quickstart
Big data processing using Hadoop with Cloudera QuickstartIMC Institute
 
MLlib sparkmeetup_8_6_13_final_reduced
MLlib sparkmeetup_8_6_13_final_reducedMLlib sparkmeetup_8_6_13_final_reduced
MLlib sparkmeetup_8_6_13_final_reducedChao Chen
 
How could I automate log gathering in the distributed system
How could I automate log gathering in the distributed systemHow could I automate log gathering in the distributed system
How could I automate log gathering in the distributed systemJun Hong Kim
 
HBaseConEast2016: HBase on Docker with Clusterdock
HBaseConEast2016: HBase on Docker with ClusterdockHBaseConEast2016: HBase on Docker with Clusterdock
HBaseConEast2016: HBase on Docker with ClusterdockMichael Stack
 
How mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceHow mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceLuciano Resende
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende
 
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirLuciano Resende
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningLuciano Resende
 
8a. How To Setup HBase with Docker
8a. How To Setup HBase with Docker8a. How To Setup HBase with Docker
8a. How To Setup HBase with DockerFabio Fumarola
 
Flickr: Computer vision at scale with Hadoop and Storm (Huy Nguyen)
Flickr: Computer vision at scale with Hadoop and Storm (Huy Nguyen)Flickr: Computer vision at scale with Hadoop and Storm (Huy Nguyen)
Flickr: Computer vision at scale with Hadoop and Storm (Huy Nguyen)Yahoo Developer Network
 
Apache hadoop and cdh(cloudera distribution) introduction 基本介紹
Apache hadoop and cdh(cloudera distribution) introduction 基本介紹Apache hadoop and cdh(cloudera distribution) introduction 基本介紹
Apache hadoop and cdh(cloudera distribution) introduction 基本介紹Anna Yen
 
Big Data/Hadoop Option Analysis
Big Data/Hadoop Option AnalysisBig Data/Hadoop Option Analysis
Big Data/Hadoop Option Analysiszafarali1981
 
Shipping Applications to Production in Containers with Docker
Shipping Applications to Production in Containers with DockerShipping Applications to Production in Containers with Docker
Shipping Applications to Production in Containers with DockerJérôme Petazzoni
 
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector Yahoo Developer Network
 
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...Yahoo Developer Network
 
August 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieAugust 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieYahoo Developer Network
 
Budget Cuts And Their Effects
Budget Cuts And Their EffectsBudget Cuts And Their Effects
Budget Cuts And Their Effectsendre1mr
 
I want to visit Austrialia
I want to visit AustrialiaI want to visit Austrialia
I want to visit Austrialiamliadvisor
 

En vedette (20)

Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015Elastic HBase on Mesos - HBaseCon 2015
Elastic HBase on Mesos - HBaseCon 2015
 
Big data processing using Hadoop with Cloudera Quickstart
Big data processing using Hadoop with Cloudera QuickstartBig data processing using Hadoop with Cloudera Quickstart
Big data processing using Hadoop with Cloudera Quickstart
 
MLlib sparkmeetup_8_6_13_final_reduced
MLlib sparkmeetup_8_6_13_final_reducedMLlib sparkmeetup_8_6_13_final_reduced
MLlib sparkmeetup_8_6_13_final_reduced
 
How could I automate log gathering in the distributed system
How could I automate log gathering in the distributed systemHow could I automate log gathering in the distributed system
How could I automate log gathering in the distributed system
 
HBaseConEast2016: HBase on Docker with Clusterdock
HBaseConEast2016: HBase on Docker with ClusterdockHBaseConEast2016: HBase on Docker with Clusterdock
HBaseConEast2016: HBase on Docker with Clusterdock
 
How mentoring can help you start contributing to open source
How mentoring can help you start contributing to open sourceHow mentoring can help you start contributing to open source
How mentoring can help you start contributing to open source
 
Luciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conferenceLuciano Resende's keynote at Apache big data conference
Luciano Resende's keynote at Apache big data conference
 
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache BahirWriting Apache Spark and Apache Flink Applications Using Apache Bahir
Writing Apache Spark and Apache Flink Applications Using Apache Bahir
 
SystemML - Declarative Machine Learning
SystemML - Declarative Machine LearningSystemML - Declarative Machine Learning
SystemML - Declarative Machine Learning
 
8a. How To Setup HBase with Docker
8a. How To Setup HBase with Docker8a. How To Setup HBase with Docker
8a. How To Setup HBase with Docker
 
Flickr: Computer vision at scale with Hadoop and Storm (Huy Nguyen)
Flickr: Computer vision at scale with Hadoop and Storm (Huy Nguyen)Flickr: Computer vision at scale with Hadoop and Storm (Huy Nguyen)
Flickr: Computer vision at scale with Hadoop and Storm (Huy Nguyen)
 
Apache hadoop and cdh(cloudera distribution) introduction 基本介紹
Apache hadoop and cdh(cloudera distribution) introduction 基本介紹Apache hadoop and cdh(cloudera distribution) introduction 基本介紹
Apache hadoop and cdh(cloudera distribution) introduction 基本介紹
 
Big Data/Hadoop Option Analysis
Big Data/Hadoop Option AnalysisBig Data/Hadoop Option Analysis
Big Data/Hadoop Option Analysis
 
Shipping Applications to Production in Containers with Docker
Shipping Applications to Production in Containers with DockerShipping Applications to Production in Containers with Docker
Shipping Applications to Production in Containers with Docker
 
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
August 2016 HUG: Open Source Big Data Ingest with StreamSets Data Collector
 
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
August 2016 HUG: Better together: Fast Data with Apache Spark™ and Apache Ign...
 
August 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache OozieAugust 2016 HUG: Recent development in Apache Oozie
August 2016 HUG: Recent development in Apache Oozie
 
Cafe Con Leche
Cafe Con LecheCafe Con Leche
Cafe Con Leche
 
Budget Cuts And Their Effects
Budget Cuts And Their EffectsBudget Cuts And Their Effects
Budget Cuts And Their Effects
 
I want to visit Austrialia
I want to visit AustrialiaI want to visit Austrialia
I want to visit Austrialia
 

Similaire à Docker Based Hadoop Provisioning

Automating Your CloudStack Cloud with Puppet
Automating Your CloudStack Cloud with PuppetAutomating Your CloudStack Cloud with Puppet
Automating Your CloudStack Cloud with Puppetbuildacloud
 
Automating CloudStack with Puppet - David Nalley
Automating CloudStack with Puppet - David NalleyAutomating CloudStack with Puppet - David Nalley
Automating CloudStack with Puppet - David NalleyPuppet
 
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...Alexey Bokov
 
Best Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBest Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBlueData, Inc.
 
Docker in practice
Docker in practiceDocker in practice
Docker in practiceGeert Pante
 
Docker Presentation at the OpenStack Austin Meetup | 2013-09-12
Docker Presentation at the OpenStack Austin Meetup | 2013-09-12Docker Presentation at the OpenStack Austin Meetup | 2013-09-12
Docker Presentation at the OpenStack Austin Meetup | 2013-09-12dotCloud
 
Application Deployment on Openstack
Application Deployment on OpenstackApplication Deployment on Openstack
Application Deployment on OpenstackDocker, Inc.
 
Docker - Portable Deployment
Docker - Portable DeploymentDocker - Portable Deployment
Docker - Portable Deploymentjavaonfly
 
Dockerization of Azure Platform
Dockerization of Azure PlatformDockerization of Azure Platform
Dockerization of Azure Platformnirajrules
 
The Future of SDN in CloudStack by Chiradeep Vittal
The Future of SDN in CloudStack by Chiradeep VittalThe Future of SDN in CloudStack by Chiradeep Vittal
The Future of SDN in CloudStack by Chiradeep Vittalbuildacloud
 
Donabe-essex-conference-readout
Donabe-essex-conference-readoutDonabe-essex-conference-readout
Donabe-essex-conference-readoutDebojyoti Dutta
 
Directions for CloudStack Networking
Directions for CloudStack  NetworkingDirections for CloudStack  Networking
Directions for CloudStack NetworkingChiradeep Vittal
 
Cloud level scalability - Nuxeo Tour 2014
Cloud level scalability - Nuxeo Tour 2014Cloud level scalability - Nuxeo Tour 2014
Cloud level scalability - Nuxeo Tour 2014Nuxeo
 
Docker - Demo on PHP Application deployment
Docker - Demo on PHP Application deployment Docker - Demo on PHP Application deployment
Docker - Demo on PHP Application deployment Arun prasath
 
Higher order infrastructure: from Docker basics to cluster management - Nicol...
Higher order infrastructure: from Docker basics to cluster management - Nicol...Higher order infrastructure: from Docker basics to cluster management - Nicol...
Higher order infrastructure: from Docker basics to cluster management - Nicol...Codemotion
 
Virtualizing Apache Spark and Machine Learning with Justin Murray
Virtualizing Apache Spark and Machine Learning with Justin MurrayVirtualizing Apache Spark and Machine Learning with Justin Murray
Virtualizing Apache Spark and Machine Learning with Justin MurrayDatabricks
 
.NET Developer Days - So many Docker platforms, so little time...
.NET Developer Days - So many Docker platforms, so little time....NET Developer Days - So many Docker platforms, so little time...
.NET Developer Days - So many Docker platforms, so little time...Michele Leroux Bustamante
 
Chef and Apache CloudStack (ChefConf 2014)
Chef and Apache CloudStack (ChefConf 2014)Chef and Apache CloudStack (ChefConf 2014)
Chef and Apache CloudStack (ChefConf 2014)Jeff Moody
 
AWS re:Invent re:Cap - 배포를 더욱 손쉽고 빠르게: Amazon EC2 Container Service - 김일호
AWS re:Invent re:Cap - 배포를 더욱 손쉽고 빠르게: Amazon EC2 Container Service - 김일호AWS re:Invent re:Cap - 배포를 더욱 손쉽고 빠르게: Amazon EC2 Container Service - 김일호
AWS re:Invent re:Cap - 배포를 더욱 손쉽고 빠르게: Amazon EC2 Container Service - 김일호Amazon Web Services Korea
 

Similaire à Docker Based Hadoop Provisioning (20)

Automating Your CloudStack Cloud with Puppet
Automating Your CloudStack Cloud with PuppetAutomating Your CloudStack Cloud with Puppet
Automating Your CloudStack Cloud with Puppet
 
Automating CloudStack with Puppet - David Nalley
Automating CloudStack with Puppet - David NalleyAutomating CloudStack with Puppet - David Nalley
Automating CloudStack with Puppet - David Nalley
 
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...
Azure: Docker Container orchestration, PaaS ( Service Farbic ) and High avail...
 
Best Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBest Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker Containers
 
Docker in practice
Docker in practiceDocker in practice
Docker in practice
 
Docker Presentation at the OpenStack Austin Meetup | 2013-09-12
Docker Presentation at the OpenStack Austin Meetup | 2013-09-12Docker Presentation at the OpenStack Austin Meetup | 2013-09-12
Docker Presentation at the OpenStack Austin Meetup | 2013-09-12
 
Application Deployment on Openstack
Application Deployment on OpenstackApplication Deployment on Openstack
Application Deployment on Openstack
 
Docker - Portable Deployment
Docker - Portable DeploymentDocker - Portable Deployment
Docker - Portable Deployment
 
Dockerization of Azure Platform
Dockerization of Azure PlatformDockerization of Azure Platform
Dockerization of Azure Platform
 
The Future of SDN in CloudStack by Chiradeep Vittal
The Future of SDN in CloudStack by Chiradeep VittalThe Future of SDN in CloudStack by Chiradeep Vittal
The Future of SDN in CloudStack by Chiradeep Vittal
 
Donabe-essex-conference-readout
Donabe-essex-conference-readoutDonabe-essex-conference-readout
Donabe-essex-conference-readout
 
Directions for CloudStack Networking
Directions for CloudStack  NetworkingDirections for CloudStack  Networking
Directions for CloudStack Networking
 
Cloud level scalability - Nuxeo Tour 2014
Cloud level scalability - Nuxeo Tour 2014Cloud level scalability - Nuxeo Tour 2014
Cloud level scalability - Nuxeo Tour 2014
 
Docker - Demo on PHP Application deployment
Docker - Demo on PHP Application deployment Docker - Demo on PHP Application deployment
Docker - Demo on PHP Application deployment
 
Higher order infrastructure: from Docker basics to cluster management - Nicol...
Higher order infrastructure: from Docker basics to cluster management - Nicol...Higher order infrastructure: from Docker basics to cluster management - Nicol...
Higher order infrastructure: from Docker basics to cluster management - Nicol...
 
DevOpsCon Cloud Workshop
DevOpsCon Cloud Workshop DevOpsCon Cloud Workshop
DevOpsCon Cloud Workshop
 
Virtualizing Apache Spark and Machine Learning with Justin Murray
Virtualizing Apache Spark and Machine Learning with Justin MurrayVirtualizing Apache Spark and Machine Learning with Justin Murray
Virtualizing Apache Spark and Machine Learning with Justin Murray
 
.NET Developer Days - So many Docker platforms, so little time...
.NET Developer Days - So many Docker platforms, so little time....NET Developer Days - So many Docker platforms, so little time...
.NET Developer Days - So many Docker platforms, so little time...
 
Chef and Apache CloudStack (ChefConf 2014)
Chef and Apache CloudStack (ChefConf 2014)Chef and Apache CloudStack (ChefConf 2014)
Chef and Apache CloudStack (ChefConf 2014)
 
AWS re:Invent re:Cap - 배포를 더욱 손쉽고 빠르게: Amazon EC2 Container Service - 김일호
AWS re:Invent re:Cap - 배포를 더욱 손쉽고 빠르게: Amazon EC2 Container Service - 김일호AWS re:Invent re:Cap - 배포를 더욱 손쉽고 빠르게: Amazon EC2 Container Service - 김일호
AWS re:Invent re:Cap - 배포를 더욱 손쉽고 빠르게: Amazon EC2 Container Service - 김일호
 

Plus de DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Plus de DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Dernier

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Dernier (20)

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

Docker Based Hadoop Provisioning

  • 1. Janos Matyas / CTO / SequenceIQ Inc.
  • 2. GOAL / MOTIVATION TECHNOLOGY STACK PROBLEM RESOLUTION / HOW IT WORKS RESULTS / ACHIEVEMENTS OVERVIEW
  • 3. GOAL / MOTIVATION  Ease Hadoop provisioning – everywhere  Automate and unify the process  Arbitrary cluster size  Same process through a cluster lifecycle (Dev, QA, UAT, Prod)  (Auto) scaling Hadoop  QoS
  • 4. OUR APPROACH  Use Docker  Build cloud-specific ‘Dockerized’ images  Provision the cluster  Use Ambari
  • 5. DOCKER  Lightweight, portable  Build once, run anywhere  VM – without the overhead of a VM  Isolated containers  Automated and scripted
  • 6. DOCKER – CONTAINERS vs. VMs  Containers are isolated, but share OS and, where appropriate, bins/libraries
  • 7. APACHE AMBARI – ARCHITECTURE  Easy Hadoop cluster provisioning  Management and monitoring  Key features – blueprints  REST API
  • 8. APACHE AMBARI – CREATE CLUSTER  Define a blueprint (POST /api/v1/blueprints)  Create cluster (POST /api/v1/clusters/mycluster)
  • 9. HADOOP PROVISIONG ISSUES  Each cloud provider has a proprietary API  Create images for each provider  Network configuration  Service discovery  Resize, failover, member join support
  • 10. OUR APPROACH – DETAILS  Build your Docker image  Install or pre-install Hadoop services with Ambari  Install Serf and dnsmasq  Build your cloud image  Use Ansible to create an image  Provision the cluster
  • 11. BUILD DOCKER IMAGES  Create the Dockerfile  Have Docker.io to build the image  Optionally pre-install services  Use Ambari  Push image to Docker.io  Licensing questions
  • 12. BUILD CLOUD IMAGES  Use a Docker ready base image  Use Ansible to provision the image template  Pull the Docker images  Apply custom infrastructure  Use cloud provider specific playbooks  AWS EC2  Azure
  • 13. ANSIBLE  Configuration as data  Simplest way to automate IT  Secure and agentless  Goal oriented  One playbook – multiple modules  We use it to “burn” cloud images/templates
  • 14. PROVISIONING – ISSUES  FQDN  /etc/hosts is read-only in Docker  Everybody needs to know everybody  DNS  Single point of failure  Dynamic cluster – nodes joining, leaving, failing  Routing  Cloud – ability to inter-host container routing  Collision free private IP range for Docker bridge  We need predefined host names/IP addresses  /etc/hosts is read-only in Docker  Use Ansible to provision the image template  Pull the Docker images  Start a DNS server  Use it as a reference docker run -dns <IP_OF_DNS>  Nodes need to know each other
  • 15. PROVISIONING – SOLUTION  FQDN  Use –h and –dns Docker params  DNS  dnsmasq is running on each Docker container  Serf member-xxx events trigger dnsmasq reconfiguration  Routing  Docker bridge configuration – follows a convention
  • 16. SERF  Gossip based membership  Service discovery  Decentralized  Lightweight, fault tolerant  Highly available  DevOps friendly  Keep an eye on Consul, Open vSwitch, pipework
  • 17. SERF – DECENTRALIZED SERVICE DISCOVERY  Gossip instead of heartbeat  LAN, WAN profiles  Provides membership information  Event handlers: member_join, member_leave, member_failed, member- update, member-reap, user  Query
  • 19. SERF – MEMBERSHIP, EVENT HANDLERS
  • 20. DNSMASQ  Network infrastructure for small networks  Lightweight DNS, DHCP server  Comes with most Linux distributions
  • 21. AWS EC2 – HADOOP CLUSTER  Use EC2 REST API to provision instances (from Dockerized image)  Start Docker containers  One Ambari server  N-1 Ambari agents connecting to server  Connect ambari-shell to  Define blueprint  Provision the cluster
  • 22. AWS EC2 – NETWORK SECURITY  Create a VPC  Configure subnets  Routing tables  Security gateway  Set ACL  Configure VPN
  • 23. AWS EC2 - CLOUDFORMATION  Manually set up VPC is too complicated  Use CloudFormation  Manage the stack together  Template-based  Environments under version control  Customizable at runtime  No extra charge "VpcId" : { "Type" : "String", "Description" : "VpcId of your existing Virtual Private Cloud (VPC)" }, "SubnetId" : { "Type" : "String", "Description" : "SubnetId of an existing subnet (for the primary network) in your Virtual Private Cloud (VPC)" }, "SecondaryIPAddressCount" : { "Type" : "Number", "Default" : "1", "MinValue" : "1", "MaxValue" : "5", "Description" : "Number of secondary IP addresses to assign to the network interface (1-5)", "ConstraintDescription": "must be a number from 1 to 5." }, "SSHLocation" : { "Description" : "The IP address range that can be used to SSH to the EC2 instances", "Type": "String", "MinLength": "9", "MaxLength": "18", "Default": "0.0.0.0/0", "AllowedPattern": "(d{1,3}).(d{1,3}).(d{1,3}).(d{1,3})/ (d{1,2})", "ConstraintDescription": "must be a valid IP CIDR range of the form x.x.x.x/x." } },
  • 24. CLOUDBREAK Cloudbreak is a powerful left surf that breaks over a coral reef, a mile off southwest the island of Tavarua, Fiji. Cloudbreak is a cloud-agnostic Hadoop as a Service API. Abstracts the provisioning and ease management and monitoring of on- demand clusters. Provisioning Hadoop has never been easier
  • 25. CLOUDBREAK  Benefits  Elastic  Scalable  Blueprints  Flexible  Main REST resources  /template – specify a cluster infrastructure  /stack – creates a cloud infrastructure built from a template  /blueprint – describes a Hadoop cluster  /cluster – creates a Hadoop cluster
  • 26. RESULTS AND ACHIEVEMENTS  Hadoop as a Service API  Available for EC2 and Azure cloud  OpenStack, bare metal is coming soon  Open source under Apache 2 licence  Same goals as Apache Ambari Launchpad project  What's next?
  • 27. HADOOP SERVICES - AS A SERVICE  Leverage YARN  Slider (Hoya) providers  HBase, Accumulo  SequenceIQ providers - Flume, Tomcat  YARN -1964  QoS for YARN – heuristic scheduler  Platform as a Service API
  • 28. BANZAI PIPELINE Banzai Pipeline is a surf reef break located in Hawaii, off Ehukai Beach Park in Pupukea on O'ahu's North Shore. Banzai Pipeline is a RESTful application development platform for building on- demand data and job pipelines running on Hadoop YARN. Banzai Pipeline is a big data API for the REST
  • 29. THANK YOU  Get the code: https://github.com/sequenceiq  Read about: http://blog.sequenceiq.com  Facebook: http://facebook.com/sequenceiq  Twitter: http://twitter.com/sequenceiq  LinkedIn: http://linkedin.com/sequenceiq  Contact: janos.matyas@sequenceiq.com FEEL FREE TO CONTRIBUTE

Notes de l'éditeur

  1. YAML
  2. Dev – env : use default Docker bridge (easy)
  3. -h for hostname, --dns to specify the DNS service to use Convention: AMI launch index
  4. Fire and forget Waits for anwer – limited response collection