1
Big Data in Container
Heiko Loewe @loeweh
Meetup Big Data Hadoop & Spark NRW 08/24/2016
2
Why
• Fast Deployment
• Test/Dev Cluster
• Better Utilize Hardware
• Learn to manage Hadoop
• Test new Versions
• An appliance for continuous
integration/API testing
3
Design
Master Container
- Name Node
- Secondary Name Node
- Yarn
Slave Container
- Node Manager
- Data Node
Slave Container
- Node Manager
- Data Node
Slave Container
- Node Manager
- Data Node
Slave Container
- Node Manager
- Data Node
4
More than 1 Host Needs an Overlay Network
Interface docker0 is not routed
Overlay Network
1 host config: (almost) no problem
For 2 hosts and more we need an overlay network
5
Choice of the Overlay Network Implementation
Docker Multi-Host Network
• Backend: VXLAN.
• Fallback: none.
• Control plane: built-in, uses Zookeeper, Consul or Etcd for shared state.
CoreOS Flanneld
• Backend: VXLAN, AWS, GCE.
• Fallback: custom UDP-based tunneling.
• Control plane: built-in, uses Etcd for shared state.
Weave Net
• Backend: VXLAN via OVS.
• Fallback: custom UDP-based tunneling called "sleeve".
• Control plane: built-in.
6
WEAVE NET
The normal mode of operation is called FDP (fast data path). It works via the OVS datapath kernel module (mainline since 3.12). It's just another VXLAN implementation.
A fallback mode called sleeve works in userspace via pcap. Sleeve supports full encryption.
Weaveworks also has Weave DNS, Weave Scope and Weave Flux, which provide introspection, service discovery & routing capabilities on top of Weave Net.
7
Docker Adaptation (Fedora/CentOS/RHEL)
• /etc/sudoers
# at the end:
vuser ALL=(ALL) NOPASSWD: ALL
# secure_path, append /usr/local/bin for weave
Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin
$ sudo groupadd docker
$ sudo gpasswd -a ${USER} docker
$ sudo chgrp docker /var/run/docker.sock
$ alias docker="sudo /usr/bin/docker"
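Whether the secure_path edit covers /usr/local/bin can be checked before installing Weave. A minimal sketch, assuming a POSIX shell; the helper name path_contains is made up for illustration and is not part of sudo or Weave:

```shell
# Hypothetical helper: succeed if directory $2 is an entry of the
# colon-separated path list $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# The secure_path value from the slide.
secure_path="/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin"

if path_contains "$secure_path" /usr/local/bin; then
    echo "weave in /usr/local/bin is reachable via sudo"
else
    echo "append /usr/local/bin to secure_path"
fi
```

On a real host the effective value can be read with sudo sh -c 'echo $PATH' and passed to the helper instead of the literal string.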
8
Weave Problems on Fedora/CentOS/RHEL
WARNING: existing iptables rule

'-A FORWARD -j REJECT --reject-with icmp-host-prohibited'

will block name resolution via weaveDNS - please reconfigure your firewall.
$ sudo systemctl stop firewalld
$ sudo systemctl disable firewalld
$ sudo /sbin/iptables -D FORWARD -j REJECT --reject-with icmp-host-prohibited
$ sudo /sbin/iptables -D INPUT -j REJECT --reject-with icmp-host-prohibited
$ sudo iptables-save
$ sudo reboot
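To confirm that the offending rule is really gone, the rule dump can be scanned for it. A small sketch, assuming input in the format printed by iptables-save; the function name has_forward_reject and the sample rules are illustrative only:

```shell
# Hypothetical helper: read iptables-save style rules on stdin and succeed
# if the FORWARD chain still contains the REJECT rule from the warning.
has_forward_reject() {
    grep -q -e '^-A FORWARD -j REJECT --reject-with icmp-host-prohibited$'
}

# Sample input standing in for real "sudo iptables-save" output.
sample='-A INPUT -i lo -j ACCEPT
-A FORWARD -j REJECT --reject-with icmp-host-prohibited'

if printf '%s\n' "$sample" | has_forward_reject; then
    echo "FORWARD REJECT rule present: weaveDNS lookups will be blocked"
else
    echo "FORWARD REJECT rule not found"
fi
```

On a live host, piping sudo iptables-save into the helper after the reboot shows whether the rule has come back.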
9
[vuser@linux ~]$ ifconfig | grep -v "^ "
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
[vuser@linux ~]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
[vuser@linux ~]$ sudo weave launch
[vuser@linux ~]$ eval $(sudo weave env)
[vuser@linux ~]$ sudo weave --local expose
10.32.0.6
[vuser@linux ~]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0fd6ab928d96 weaveworks/plugin:1.6.1 "/home/weave/plugin" 11 seconds ago Up 8 seconds weaveplugin
4b24e5802fcc weaveworks/weaveexec:1.6.1 "/home/weave/weavepro" 13 seconds ago Up 10 seconds weaveproxy
c4882326398a weaveworks/weave:1.6.1 "/home/weave/weaver -" 18 seconds ago Up 15 seconds weave
[vuser@linux ~]$ ifconfig | grep -v "^ "
datapath: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1410
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
vethwe-bridge: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1410
vethwe-datapath: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1410
vxlan-6784: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65485
weave: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1410
WEAVE Container
WEAVE Interfaces
Weave Run
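The before/after comparison of the interface list can be automated. A sketch under the assumption that both lists are available as newline-separated names; the helper new_interfaces is hypothetical, and the names are copied from the ifconfig output on this slide:

```shell
# Hypothetical helper: print the names in list $2 that are not in list $1.
# Both arguments are newline-separated interface names.
new_interfaces() {
    printf '%s\n' "$2" | grep -v -x -F -e "$1"
}

# Interface names before and after "weave launch", taken from the slide.
before='docker0
enp3s0
lo'
after='datapath
docker0
enp3s0
lo
vethwe-bridge
vethwe-datapath
vxlan-6784
weave'

new_interfaces "$before" "$after"
```

On a Linux host the two lists could come from ls /sys/class/net, captured once before and once after launching Weave.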
10
https://github.com/kiwenlau/hadoop-cluster-docker/blob/master/Dockerfile
Hadoop Container Docker File
FROM ubuntu:14.04
# install openssh-server, openjdk and wget
# install hadoop 2.7.2
# set environment variable
# ssh without key
# set up Hadoop directories
# copy config files from local
# make Hadoop start files executable
# format namenode
# standard run command
CMD [ "sh", "-c", "service ssh start; bash"]
$ docker build -t loewe/hadoop:latest .
11
Start Hadoop Container
Host 1
• Master
$ sudo weave run -itd -p 8088:8088 -p 50070:50070 --name hadoop-master loewe/hadoop:latest
• Slaves 1,2
$ sudo weave run -itd --name hadoop-slave1 loewe/hadoop:latest
$ sudo weave run -itd --name hadoop-slave2 loewe/hadoop:latest
Host 2
• Slaves 3,4
$ sudo weave run -itd --name hadoop-slave3 loewe/hadoop:latest
$ sudo weave run -itd --name hadoop-slave4 loewe/hadoop:latest
root@boot2docker:~# weave status dns
hadoop-master 10.32.0.1 6a4db5f52340 92:64:f5:c5:57:a7
hadoop-slave1 10.32.0.2 34e0a7de1105 92:64:f5:c5:57:a7
hadoop-slave2 10.32.0.3 d879f077cf4e 92:64:f5:c5:57:a7
hadoop-slave3 10.44.0.0 6ca7ddb9daf8 92:56:f4:98:36:b0
hadoop-slave4 10.44.0.1 c1ed48630b1c 92:56:f4:98:36:b0
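The per-host start commands follow a simple pattern and can be generated. The sketch below only prints the commands rather than running them; the helper hadoop_commands is hypothetical, and the image name is assumed to be the one tagged in the build step:

```shell
# Hypothetical helper: print (not run) the weave commands for one host.
# $1: image, $2: first slave number, $3: last slave number,
# $4: "master" to also emit the master container command.
hadoop_commands() {
    image=$1; first=$2; last=$3
    if [ "${4:-}" = "master" ]; then
        echo "sudo weave run -itd -p 8088:8088 -p 50070:50070 --name hadoop-master $image"
    fi
    i=$first
    while [ "$i" -le "$last" ]; do
        echo "sudo weave run -itd --name hadoop-slave$i $image"
        i=$((i + 1))
    done
}

echo "# Host 1"
hadoop_commands loewe/hadoop:latest 1 2 master
echo "# Host 2"
hadoop_commands loewe/hadoop:latest 3 4
```

Giving each host its own slave number range keeps the container names unique across the cluster, which matters because weaveDNS resolves containers by name.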
12
Hadoop Cluster / 2 Host / 5 Nodes
13
Persistent Volumes for HDFS
14
The Problem
• Containers (like Docker) are the foundation for agile software development
• The initial container design was stateless (12-factor app)
• Use cases have grown in the last few months (NoSQL, stateful apps)
• Persistence for containers is not easy
15
DOCKER Volume Manager API
• Enables persistence of Docker volumes
• Enables the implementation of
– Fast Bytes (Performance)
– Data Services (Protection / Snapshots)
– Data Mobility
– Availability
• Operations:
– Create, Remove, Mount, Path, Unmount
– Additional options can be passed to the volume driver
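From the command line, the Create operation and the driver options map onto docker volume create. The sketch below prints the command instead of running it; volume_create_cmd is a hypothetical helper, and the driver name some-driver and the option size=100 are placeholders, not something named in the slides:

```shell
# Hypothetical helper: print the Docker CLI call that creates a named volume
# through a volume driver, with optional driver options (key=value).
# $1: driver, $2: volume name, remaining arguments: driver options.
volume_create_cmd() {
    driver=$1; name=$2; shift 2
    cmd="docker volume create --driver $driver"
    for opt in "$@"; do
        cmd="$cmd --opt $opt"
    done
    echo "$cmd $name"
}

# Built-in local driver, no options.
volume_create_cmd local hdfs-data
# Placeholder third-party driver with one placeholder option.
volume_create_cmd some-driver hdfs-data size=100
```

Recent Docker CLI versions take the volume name as a positional argument, as printed here; releases from the time of this talk expected it as --name instead.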
16
Persistent Volumes for Containers
Container OS
Storage
/mnt/PersistentData
Container Container
-v /mnt/PersistentData:/mnt/ContainerData
Container Container
Docker Host
17
Docker Host
Persistent Volumes for Containers
Container OS
Storage
/mnt/PersistentData
Container Container
-v /mnt/PersistentData:/mnt/ContainerData
Container Container
18
Persistent Volumes for Containers
AWS EC2 (EBS)
OpenStack (Cinder)
EMC Isilon
EMC ScaleIO
EMC VMAX
EMC XtremIO
Google Compute Engine (GCE)
VirtualBox
Ubuntu
Debian
RedHat
CentOS
CoreOS
OSX
TinyLinux (boot2docker)
Docker Volume API
Mesos Isolator
...
19
Hadoop + Persistent Volumes
Host A
Making the
Hadoop Container
ephemeral
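One way to picture an ephemeral Hadoop container is a run command that keeps the data in a per-container directory on the host. This is only a sketch that prints the commands: slave_with_volume_cmd is a hypothetical helper, the paths are the ones from the earlier volume slides, and it assumes the Hadoop configuration inside the image (for example dfs.datanode.data.dir) points at /mnt/ContainerData:

```shell
# Hypothetical helper: print the run command for one Hadoop slave whose data
# lives in a per-container subdirectory of the host's persistent storage.
# $1: container name, $2: image.
slave_with_volume_cmd() {
    name=$1; image=$2
    echo "docker run -itd -v /mnt/PersistentData/$name:/mnt/ContainerData --name $name $image"
}

for n in 1 2; do
    slave_with_volume_cmd "hadoop-slave$n" loewe/hadoop:latest
done
```

Because the host directory is keyed by container name, a container that is removed and started again under the same name finds its previous data.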
20
Overlay Network
Stretch Hadoop w/ Persistent Volumes
Host A
Host B
Easily stretch and shrink a cluster without losing the data
21
Other similar Projects
• Bigtop Provisioner / Apache Software Foundation
https://github.com/apache/bigtop/tree/master/provisioner/docker
• Building Hortonworks HDP on Docker
http://henning.kropponline.de/2015/07/19/building-hdp-on-docker/
https://hub.docker.com/r/hortonworks/ambari-server/
https://hub.docker.com/r/hortonworks/ambari-agent/
• Building Cloudera CDH on Docker
http://blog.cloudera.com/blog/2015/12/docker-is-the-new-quickstart-option-for-apache-hadoop-and-cloudera/
https://hub.docker.com/r/cloudera/quickstart/
Watch out for overlay network topics
22
Apache Myriad
23
Myriad Overview
• Mesos Framework for Apache Yarn
• Mesos manages the data center, Yarn manages Hadoop
• Coarse- and fine-grained resource sharing
24
Situation without Integration
25
Yarn/Mesos Integration
26
How it works (simplified)
Myriad = Control Plane
27
Myriad Container
28
29
30
31
What about the Data
Myriad only cares for the Compute
Master Container
- Name Node
- Secondary Name Node
- Yarn
Slave Container
- Node Manager
- Data Node
Slave Container
- Node Manager
- Data Node
Slave Container
- Node Manager
- Data Node
Slave Container
- Node Manager
- Data Node
Myriad/Mesos cares about: Yarn, Node Manager
Has to be provided outside of Myriad/Mesos: Name Node, Secondary Name Node, Data Node
32
What about the Data
• Myriad only cares about compute / MapReduce
• HDFS has to be provided by other means
Big Data New Realities
Big Data Traditional Assumptions
• Bare-metal
• Data locality
• Data on local disks
Big Data New Realities
• Containers and VMs
• Compute and storage separation
• In-place access on remote data stores
New Benefits and Value
• Big-Data-as-a-Service
• Agility and cost savings
• Faster time-to-insights
33
Options for HDFS Data Layer
• Pure HDFS Cluster (only Data Node running)
– Bare Metal
– Containerized
– Mesos based
• Enterprise HDFS Array
– EMC Isilon
34
Myriad, Mesos, EMC Isilon for HDFS
35
EMC Isilon Advantages over Classic Hadoop HDFS
• Multi-tenancy
• Multiple HDFS environments sharing the same storage
• Quotas possible on HDFS environments
• Snapshots of HDFS environments possible
• Remote replication
• WORM option for HDFS
• Highly available HDFS infrastructure (distributed Name and Data Nodes)
• Storage efficient (usable/raw 0.8 compared to 0.33 with Hadoop)
• Shared access via HDFS / CIFS / NFS / SFTP possible
• Maintenance equals enterprise array standard
• All major distributions supported
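The storage efficiency figures translate directly into raw capacity. A worked example, assuming 100 TB of usable data as an arbitrary illustration; the helper raw_needed is hypothetical, and the ratio 0.33 corresponds to HDFS keeping three replicas of each block:

```shell
# Hypothetical helper: raw capacity needed for a given usable capacity
# and usable/raw ratio. $1: usable capacity, $2: usable/raw ratio.
raw_needed() {
    awk -v usable="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", usable / ratio }'
}

raw_needed 100 0.33   # classic HDFS with 3 replicas: prints 303.0
raw_needed 100 0.8    # ratio quoted for Isilon: prints 125.0
```

For the same 100 TB of data, the quoted ratios mean roughly 303 TB of raw disk with classic HDFS against 125 TB on the array.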
36
Spark on Mesos
37
Common Deployment Patterns
Most Common Spark Deployment Environments (Cluster Managers)
• Standalone mode: 48%
• YARN: 40%
• Mesos: 11%
Source: Spark Survey Report, 2015 (Databricks)
38
Spark Cluster - Standalone Mode
Spark Client
Spark Master
3 × Spark Slave, each running tasks
Hosts: bare metal or virtual machines
Data provided outside
39
Spark Cluster - Hadoop YARN
Spark Client
Spark Master
Resource Manager
3 × Node Manager, each hosting a Spark Executor running tasks
Data provided by the Hadoop cluster
40
Spark Cluster - Mesos
Spark Client
Spark Scheduler
Mesos Master
3 × Mesos Slave, each hosting a Spark Executor running tasks
Data provided outside
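From the Spark client's point of view, the three cluster managers differ mainly in the master URL passed to spark-submit. A sketch that only builds and prints the command line; spark_master_url is a hypothetical helper, the host names and app.py are placeholders, and the ports are the usual defaults for a standalone Spark master (7077) and a Mesos master (5050):

```shell
# Hypothetical helper: print the --master value for a cluster manager.
# $1: standalone | yarn | mesos, $2: master host, $3: optional port.
spark_master_url() {
    case "$1" in
        standalone) echo "spark://$2:${3:-7077}" ;;
        yarn)       echo "yarn" ;;
        mesos)      echo "mesos://$2:${3:-5050}" ;;
        *)          echo "unknown cluster manager: $1" >&2; return 1 ;;
    esac
}

for mode in standalone yarn mesos; do
    echo "spark-submit --master $(spark_master_url "$mode" master-host) app.py"
done
```

With YARN the cluster location is not part of the URL; Spark reads it from the Hadoop configuration on the client.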
41
Spark + Mesos + EMC Isilon
To solve the HDFS Data Layer
42
Follow me on Twitter: @loeweh

Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 

Dernier (20)

CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 

Big Data in Container; Hadoop Spark in Docker and Mesos

  • 1. Big Data in Container. Heiko Loewe, @loeweh. Meetup Big Data Hadoop & Spark NRW, 08/24/2016
  • 2. Why: Fast deployment; test/dev cluster; better utilize hardware; learn to manage Hadoop; test new versions; an appliance for continuous integration/API testing
  • 3. Design: one master container (Name Node, Secondary Name Node, YARN) and four slave containers (Node Manager and Data Node each)
  • 4. More than one host needs an overlay network: the docker0 interface is not routed. A one-host configuration is (almost) no problem; for two or more hosts we need an overlay network.
  • 5. Choice of the Overlay Network Implementation
      Docker Multi-Host Network
        Backend: VXLAN.
        Fallback: none.
        Control plane: built-in, uses Zookeeper, Consul or Etcd for shared state.
      Weave Net
        Backend: VXLAN via OVS.
        Fallback: custom UDP-based tunneling called "sleeve".
        Control plane: built-in.
      CoreOS Flanneld
        Backend: VXLAN, AWS, GCE.
        Fallback: custom UDP-based tunneling.
        Control plane: built-in, uses Etcd for shared state.
  • 6. Weave Net: The normal mode of operation is called FDP (fast data path), which works via OVS's datapath kernel module (mainline since 3.12). It is just another VXLAN implementation. It has a "sleeve" fallback mode that works in userspace via pcap. Sleeve supports full encryption. Weaveworks also has Weave DNS, Weave Scope and Weave Flux, providing introspection, service discovery and routing capabilities on top of Weave Net.
  • 7. Docker Adaptation (Fedora/CentOS/RHEL)
      /etc/sudoers
      # at the end:
      vuser ALL=(ALL) NOPASSWD: ALL
      # secure_path, append /usr/local/bin for weave
      Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin
      sudo groupadd docker
      sudo gpasswd -a ${USER} docker
      sudo chgrp docker /var/run/docker.sock
      alias docker="sudo /usr/bin/docker"
  • 8. Weave Problems on Fedora/CentOS/RHEL
      WARNING: existing iptables rule
        '-A FORWARD -j REJECT --reject-with icmp-host-prohibited'
      will block name resolution via weaveDNS - please reconfigure your firewall.
      sudo systemctl stop firewalld
      sudo systemctl disable firewalld
      /sbin/iptables -D FORWARD -j REJECT --reject-with icmp-host-prohibited
      /sbin/iptables -D INPUT -j REJECT --reject-with icmp-host-prohibited
      iptables-save
      reboot
  • 9. Weave Run (slide callouts: Weave containers, Weave interfaces)
      [vuser@linux ~]$ ifconfig | grep -v "^ "
      docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
      enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
      lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
      [vuser@linux ~]$ docker ps
      CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
      [vuser@linux ~]$ sudo weave launch
      [vuser@linux ~]$ eval $(sudo weave env)
      [vuser@linux ~]$ sudo weave --local expose
      10.32.0.6
      [vuser@linux ~]$ docker ps
      CONTAINER ID  IMAGE                       COMMAND                 CREATED         STATUS         PORTS  NAMES
      0fd6ab928d96  weaveworks/plugin:1.6.1     "/home/weave/plugin"    11 seconds ago  Up 8 seconds          weaveplugin
      4b24e5802fcc  weaveworks/weaveexec:1.6.1  "/home/weave/weavepro"  13 seconds ago  Up 10 seconds         weaveproxy
      c4882326398a  weaveworks/weave:1.6.1      "/home/weave/weaver -"  18 seconds ago  Up 15 seconds         weave
      [vuser@linux ~]$ ifconfig | grep -v "^ "
      datapath: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1410
      docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
      enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
      lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
      vethwe-bridge: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1410
      vethwe-datapath: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1410
      vxlan-6784: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65485
      weave: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1410
  • 10. Hadoop Container Docker File
      https://github.com/kiwenlau/hadoop-cluster-docker/blob/master/Dockerfile
      FROM ubuntu:14.04
      # install openssh-server, openjdk and wget
      # install hadoop 2.7.2
      # set environment variable
      # ssh without key
      # set up Hadoop directories
      # copy config files from local
      # make Hadoop start files executable
      # format namenode
      # standard run command
      CMD [ "sh", "-c", "service ssh start; bash"]
      $ docker build -t loewe/hadoop:latest
  • 11. Start Hadoop Container
      Host 1
        Master
          $ sudo weave run -itd -p 8088:8088 -p 50070:50070 --name hadoop-master
        Slaves 1, 2
          $ sudo weave run -itd --name hadoop-slave1
          $ sudo weave run -itd --name hadoop-slave2
      Host 2
        Slaves 3, 4
          $ sudo weave run -itd --name hadoop-slave3
          $ sudo weave run -itd --name hadoop-slave4
      root@boot2docker:~# weave status dns
      hadoop-master  10.32.0.1  6a4db5f52340  92:64:f5:c5:57:a7
      hadoop-slave1  10.32.0.2  34e0a7de1105  92:64:f5:c5:57:a7
      hadoop-slave2  10.32.0.3  d879f077cf4e  92:64:f5:c5:57:a7
      hadoop-slave3  10.44.0.0  6ca7ddb9daf8  92:56:f4:98:36:b0
      hadoop-slave4  10.44.0.1  c1ed48630b1c  92:56:f4:98:36:b0
  • 12. Hadoop Cluster / 2 Hosts / 5 Nodes
  • 14. The Problem: Containers (like Docker) are the foundation for agile software development; the initial container design was stateless (12-factor app); use cases have grown in the last few months (NoSQL, stateful apps); persistence for containers is not easy
  • 15. Docker Volume Manager API: Enables persistence of Docker volumes. Enables the implementation of fast bytes (performance), data services (protection/snapshots), data mobility and availability. Operations: Create, Remove, Mount, Path, Unmount; additional options can be passed to the volume driver.
  • 16. Persistent Volumes for Containers: a Docker host running several containers; OS storage at /mnt/PersistentData is mapped into a container with -v /mnt/PersistentData:/mnt/ContainerData
  • 17. Persistent Volumes for Containers: a Docker host running several containers; OS storage at /mnt/PersistentData is mapped into a container with -v /mnt/PersistentData:/mnt/ContainerData
  • 18. Persistent Volumes for Containers: AWS EC2 (EBS), OpenStack (Cinder), EMC Isilon, EMC ScaleIO, EMC VMAX, EMC XtremIO, Google Compute Engine (GCE), VirtualBox; Ubuntu, Debian, RedHat, CentOS, CoreOS, OSX, TinyLinux (boot2docker); Docker Volume API, Mesos Isolator, ...
  • 19. Hadoop + Persistent Volumes (Host A): making the Hadoop container ephemeral
  • 20. Stretch Hadoop with Persistent Volumes (overlay network across Host A and Host B): easily stretch and shrink a cluster without losing the data
  • 21. Other Similar Projects
      Bigtop Provisioner / Apache Foundation
        https://github.com/apache/bigtop/tree/master/provisioner/docker
      Building Hortonworks HDP on Docker
        http://henning.kropponline.de/2015/07/19/building-hdp-on-docker/
        https://hub.docker.com/r/hortonworks/ambari-server/
        https://hub.docker.com/r/hortonworks/ambari-agent/
      Building Cloudera CDH on Docker
        http://blog.cloudera.com/blog/2015/12/docker-is-the-new-quickstart-option-for-apache-hadoop-and-cloudera/
        https://hub.docker.com/r/cloudera/quickstart/
      Watch out for overlay network topics
  • 23. Myriad Overview: Mesos framework for Apache YARN; Mesos manages the data center, YARN manages Hadoop; coarse- and fine-grained resource sharing
  • 26. How It Works (simplified): Myriad = control plane
  • 31. What about the Data: Myriad only cares for the compute. Diagram: one master container (Name Node, Secondary Name Node, YARN) and four slave containers (Node Manager and Data Node each). Diagram labels: "Myriad/Mesos cares about"; "Has to be provided outside of Myriad/Mesos" (twice).
  • 32. What about the Data: Myriad only cares for compute / MapReduce; HDFS has to be provided in other ways. Big Data traditional assumptions: bare metal; data locality; data on local disks. Big Data new realities: containers and VMs; compute and storage separation; in-place access on remote data stores. New benefits and value: Big-Data-as-a-Service; agility and cost savings; faster time-to-insights.
  • 33. Options for the HDFS Data Layer: pure HDFS cluster with only Data Nodes running (bare metal, containerized, or Mesos based); enterprise HDFS array (EMC Isilon)
  • 34. Myriad, Mesos, EMC Isilon for HDFS
  • 35. EMC Isilon Advantages over Classic Hadoop HDFS: multi-tenancy; multiple HDFS environments sharing the same storage; quotas possible on HDFS environments; snapshots of an HDFS environment possible; remote replication; WORM option for HDFS; highly available HDFS infrastructure (distributed Name and Data Nodes); storage efficient (usable/raw 0.8 compared to 0.33 with Hadoop); shared access via HDFS / CIFS / NFS / SFTP possible; maintenance equals enterprise array standard; all major distributions supported
  • 37. Common Deployment Patterns: most common Spark deployment environments (cluster managers): 48% standalone mode, 40% YARN, 11% Mesos. Source: Spark Survey Report, 2015 (Databricks)
  • 38. Spark Cluster - Standalone Mode: Spark client, Spark master and three Spark slaves, each running several tasks, on bare metal or virtual machines. Data provided outside.
  • 39. Spark Cluster - Hadoop YARN: Spark client, Spark master, Resource Manager and three Node Managers, each hosting a Spark executor running several tasks. Data provided by the Hadoop cluster.
  • 40. Spark Cluster - Mesos: Spark client, Spark scheduler, Mesos master and three Mesos slaves, each hosting a Spark executor running several tasks. Data provided outside.
  • 41. Spark + Mesos + EMC Isilon: to solve the HDFS data layer
  • 42. Follow me on Twitter: @loeweh
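
The per-host container layout from slide 11 (a master plus two slaves on host 1, two more slaves on host 2) lends itself to scripting. The following is a minimal Python sketch that only assembles the `weave run` command lines as strings and executes nothing. The image name `loewe/hadoop:latest` is borrowed from the build step on slide 10, and passing it to `weave run` is an assumption, since the slide omits the image argument. The helper names `weave_run_command` and `cluster_commands` are hypothetical.

```python
def weave_run_command(name, image="loewe/hadoop:latest", ports=()):
    """Build one `sudo weave run` command line (string only, nothing is run).

    The image default is an assumption; the slide does not show it.
    """
    parts = ["sudo", "weave", "run", "-itd"]
    for port in ports:
        # Publish the same port number on the host and in the container.
        parts += ["-p", f"{port}:{port}"]
    parts += ["--name", name, image]
    return " ".join(parts)


def cluster_commands(slaves_per_host):
    """Return {host number: [commands]}; the master goes on the first host.

    slaves_per_host lists the number of slave containers per host,
    e.g. [2, 2] for the two-host, five-node layout.
    """
    plan = {}
    slave_no = 1
    for host, count in enumerate(slaves_per_host, start=1):
        commands = []
        if host == 1:
            # 8088 and 50070 are the ports published for the master on slide 11.
            commands.append(weave_run_command("hadoop-master", ports=(8088, 50070)))
        for _ in range(count):
            commands.append(weave_run_command(f"hadoop-slave{slave_no}"))
            slave_no += 1
        plan[host] = commands
    return plan


if __name__ == "__main__":
    for host, commands in cluster_commands([2, 2]).items():
        print(f"# Host {host}")
        for command in commands:
            print(command)
```

Numbering the slaves consecutively across hosts keeps the container names unique in weaveDNS, which matches the `weave status dns` listing on slide 11.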

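The storage-efficiency figures on slide 35 (usable/raw of 0.8 for Isilon versus 0.33 for classic HDFS, which corresponds to HDFS's default three-fold replication) translate directly into raw capacity. A small Python sketch of that arithmetic follows; the function name and the 100 TB example size are illustrative assumptions, not figures from the talk.

```python
def raw_capacity_needed(usable_tb, usable_to_raw_ratio):
    """Raw capacity (TB) required to provide `usable_tb` of usable space."""
    if not 0 < usable_to_raw_ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return usable_tb / usable_to_raw_ratio


if __name__ == "__main__":
    usable_tb = 100  # illustrative data set size, not from the slides
    for label, ratio in (("Isilon", 0.8), ("HDFS, 3 replicas", 0.33)):
        raw = raw_capacity_needed(usable_tb, ratio)
        print(f"{label}: {raw:.0f} TB raw for {usable_tb} TB usable")
```

For 100 TB of usable data this works out to about 125 TB of raw capacity at a ratio of 0.8 and about 303 TB at 0.33.
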
Editor's notes

  1. https://docs.docker.com/engine/extend/plugins_volume/ Endpoints: /VolumeDriver.Create, /VolumeDriver.Remove, /VolumeDriver.Mount, /VolumeDriver.Path, /VolumeDriver.Unmount
  2. OK, so there really is a way to do this, but it means tons of work. These dev guys want everything instantly and I am just one person. How should I be able to deliver this?
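
The volume driver endpoints listed in note 1 can be pictured as a small request dispatcher. Below is a toy Python sketch, not a working plugin: a real plugin serves these endpoints as HTTP POST handlers with JSON bodies. The field names (`Name`, `Opts`, `Mountpoint`, `Err`) follow the Docker volume plugin documentation linked in note 1 as I understand it and should be checked against it. The class name, the in-memory bookkeeping and the base directory are hypothetical.

```python
import json


class InMemoryVolumeDriver:
    """Toy dispatcher for the five volume driver endpoints (illustration only)."""

    def __init__(self, base_dir="/mnt/PersistentData"):
        self.base_dir = base_dir  # hypothetical host directory for volumes
        self.volumes = {}         # volume name -> options passed at creation
        self.mounted = set()      # names of currently mounted volumes

    def _mountpoint(self, name):
        return f"{self.base_dir}/{name}"

    def handle(self, endpoint, body):
        """Take an endpoint path and a JSON request body, return a response dict."""
        request = json.loads(body) if body else {}
        name = request.get("Name", "")
        if endpoint == "/VolumeDriver.Create":
            # Additional driver options arrive in "Opts".
            self.volumes[name] = request.get("Opts") or {}
            return {"Err": ""}
        if name not in self.volumes:
            return {"Err": f"volume {name!r} not found"}
        if endpoint == "/VolumeDriver.Remove":
            del self.volumes[name]
            self.mounted.discard(name)
            return {"Err": ""}
        if endpoint == "/VolumeDriver.Mount":
            self.mounted.add(name)
            return {"Mountpoint": self._mountpoint(name), "Err": ""}
        if endpoint == "/VolumeDriver.Path":
            return {"Mountpoint": self._mountpoint(name), "Err": ""}
        if endpoint == "/VolumeDriver.Unmount":
            self.mounted.discard(name)
            return {"Err": ""}
        return {"Err": f"unsupported endpoint {endpoint}"}


if __name__ == "__main__":
    driver = InMemoryVolumeDriver()
    create = json.dumps({"Name": "hdfs-data", "Opts": {}})
    print(driver.handle("/VolumeDriver.Create", create))
    print(driver.handle("/VolumeDriver.Mount", json.dumps({"Name": "hdfs-data"})))
```

A storage-backed driver would replace the dictionary with calls to the storage platform, which is where capabilities such as snapshots or data mobility would come in.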