SlideShare une entreprise Scribd logo
1  sur  64
Joe Stein
Building & Deploying Applications to Apache Mesos
Joe Stein
• Developer, Architect & Technologist
• Founder & Principal Consultant => Big Data Open Source Security LLC - http://stealth.ly
Big Data Open Source Security LLC provides professional services and product solutions for the collection,
storage, transfer, real-time analytics, batch processing and reporting for complex data streams, data sets and
distributed systems. BDOSS is all about the "glue" and helping companies to not only figure out what Big Data
Infrastructure Components to use but also how to change their existing (or build new) systems to work with
them.
• Apache Kafka Committer & PMC member
• Blog & Podcast - http://allthingshadoop.com
• Twitter @allthingshadoop
Overview
● Quick intro to Mesos
● Marathon Framework
● Aurora Framework
● Custom Framework
Origins
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
http://static.usenix.org/event/nsdi11/tech/full_papers/Hindman_new.pdf
Datacenter Management with Apache Mesos
https://www.youtube.com/watch?v=YB1VW0LKzJ4
Omega: flexible, scalable schedulers for large compute clusters
http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf
2011 GAFS Omega John Wilkes
https://www.youtube.com/watch?v=0ZFMlO98Jkc&feature=youtu.be
Life without Apache Mesos
Static Partitioning
Static Partitioning IS Bad
hard to utilize machines
(i.e., X GB RAM and Y CPUs)
Static Partitioning does NOT scale
hard to scale elastically
(to take advantage of statistical multiplexing)
Failures === Downtime
hard to deal with failures
Static Partitioning
Operating System === Datacenter
Mesos => data center “kernel”
Apache Mesos
● Scalability to 10,000s of nodes
● Fault-tolerant replicated master and slaves using ZooKeeper
● Support for Docker containers
● Native isolation between tasks with Linux Containers
● Multi-resource scheduling (memory, CPU, disk, and ports)
● Java, Python and C++ APIs for developing new parallel applications
● Web UI for viewing cluster state
Resources & Attributes
The Mesos system has two basic methods to describe the
slaves that comprise a cluster. One of these is managed by the
Mesos master, the other is simply passed onwards to the
frameworks using the cluster.
Attributes
The attributes are simply key value string pairs that Mesos passes along when it sends offers to frameworks.
attributes : attribute ( ";" attribute )*
attribute : labelString ":" ( labelString | "," )+
Resources
The Mesos system can manage 3 different types of resources: scalars, ranges, and sets. These are used to represent
the different resources that a Mesos slave has to offer. For example, a scalar resource type could be used to represent
the amount of memory on a slave. Each resource is identified by a key string.
resources : resource ( ";" resource )*
resource : key ":" ( scalar | range | set )
key : labelString ( "(" resourceRole ")" )?
scalar : floatValue
range : "[" rangeValue ( "," rangeValue )* "]"
rangeValue : scalar "-" scalar
set : "{" labelString ( "," labelString )* "}"
resourceRole : labelString | "*"
labelString : [a-zA-Z0-9_/.-]
floatValue : ( intValue ( "." intValue )? ) | ...
intValue : [0-9]+
Predefined Uses & Conventions
The Mesos master has a few resources that it pre-defines in how it handles them. At the current time, this list consist of:
● cpu
● mem
● disk
● ports
In particular, a slave without cpu and mem resources will never have its resources advertised to any frameworks. Also, the Master’s
user interface interprets the scalars inmem and disk in terms of MB. IE: the value 15000 is displayed as 14.65GB.
Here are some examples for configuring the Mesos slaves.
--resources='cpu:24;mem:24576;disk:409600;ports:[21000-24000];disks:{1,2,3,4,5,6,7,8}'
--attributes='dc:1;floor:2;aisle:6;rack:aa;server:15;os:trusty;jvm:7u68;docker:1.5'
In this case, we have three different types of resources, scalars, a range, and a set. They are called cpu, mem, disk, and the range
type is ports.
● scalar called cpu, with the value 24
● scalar called mem, with the value 24576
● scalar called disk, with the value 409600
● range called ports, with values 21000 through 24000 (inclusive)
● set called disks, with the values 1, 2, etc which we can concatenate later /mnt/disk/{diskNum} and each task can own a
disk
Examples
Roles
Total consumable resources per slave, in the form 'name(role):value;name(role):value...'. This value can be set to
limit resources per role, or to overstate the number of resources that are available to the slave.
--resources="cpus(*):8; mem(*):15360; disk(*):710534; ports(*):[31000-32000]"
--resources="cpus(prod):8; cpus(stage):2 mem(*):15360; disk(*):710534; ports(*):[31000-32000]"
All * roles will be detected, so you can specify only the resources that are not all roles (*). --
resources="cpus(prod):8; cpus(stage)"
Frameworks bind a specific roles or any. A default roll (instead of *) can also be configured.
Roles can be used to isolate and segregate frameworks.
Marathon
https://github.com/mesosphere/marathon
Cluster-wide init and control system for
services in cgroups or docker based on
Apache Mesos
Constraints
Constraints control where apps run to allow optimizing for fault tolerance or locality. Constraints are made up of three parts: a field
name, an operator, and an optional parameter. The field can be the slave hostname or any Mesos slave attribute.
Fields
Hostname field
hostname field matches the slave hostnames, see UNIQUE operator for usage example.
hostname field supports all operators of Marathon.
Attribute field
If the field name is none of the above, it will be treated as a Mesos slave attribute. Mesos slave attribute is a way to tag a slave
node, see mesos-slave --help to learn how to set the attributes.
Unique
UNIQUE tells Marathon to enforce uniqueness of the attribute across all of an app's tasks. For example the following constraint
ensures that there is only one app task running on each host:
via the Marathon gem:
$ marathon start -i sleep -C 'sleep 60' -n 3 --constraint hostname:UNIQUE
via curl:
$ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{
"id": "sleep-unique",
"cmd": "sleep 60",
"instances": 3,
"constraints": [["hostname", "UNIQUE"]]
}'
CLUSTER allows you to run all of your app's tasks on slaves that share a certain attribute.
$ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{
"id": "sleep-cluster",
"cmd": "sleep 60",
"instances": 3,
"constraints": [["rack_id", "CLUSTER", "rack-1"]]
}'
You can also use this attribute to tie an application to a specific node by using the hostname property:
$ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{
"id": "sleep-cluster",
"cmd": "sleep 60",
"instances": 3,
"constraints": [["hostname", "CLUSTER", "a.specific.node.com"]]
}'
Cluster
Group By
GROUP_BY can be used to distribute tasks evenly across racks or datacenters for high availability.
via the Marathon gem:
$ marathon start -i sleep -C 'sleep 60' -n 3 --constraint rack_id:GROUP_BY
via curl:
$ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{
"id": "sleep-group-by",
"cmd": "sleep 60",
"instances": 3,
"constraints": [["rack_id", "GROUP_BY"]]
}'
Optionally, you can specify a minimum number of groups to try and achieve.
Like
LIKE accepts a regular expression as parameter, and allows you to run your tasks only on the slaves whose field values match the
regular expression.
via the Marathon gem:
$ marathon start -i sleep -C 'sleep 60' -n 3 --constraint rack_id:LIKE:rack-[1-3]
via curl:
$ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{
"id": "sleep-group-by",
"cmd": "sleep 60",
"instances": 3,
"constraints": [["rack_id", "LIKE", "rack-[1-3]"]]
}'
Unlike
Just like LIKE operator, but only run tasks on slaves whose field values don't match the regular expression.
via the Marathon gem:
$ marathon start -i sleep -C 'sleep 60' -n 3 --constraint rack_id:UNLIKE:rack-[7-9]
via curl:
$ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{
"id": "sleep-group-by",
"cmd": "sleep 60",
"instances": 3,
"constraints": [["rack_id", "UNLIKE", "rack-[7-9]"]]
}'
Running in Marathon
TAG=sample
APP=xyz
ID=$TAG-$APP
CMD=”./yourScript "'$HOST'" "'$PORT0'" "'$PORT1'
JSON=$(printf '{ "id": "%s", "cmd": "%s", "cpus": %s, "mem": %s, "instances":
%s, "uris": ["%s"], "ports": [0,0] , “env”:{ “JAVA_OPTS”,”%s”}' "$ID" “$CMD"
"0.1" "256" "1" "http://dns/path/yourScriptAndStuff.tgz" “-Xmx 128”)
curl -i -X POST -H "Content-Type: application/json" -d "$JSON"
http://localhost:8080/v2/apps
yourScript
#think of it as a distributed application launching in the cloud
HOST=$1
PORT0=$2
PORT1=$3
#talk to zookeeper or etcd or mesos dns
#call marathon rest api
#spawn another process, they are all in your cgroup =8^) woot woot
Persisting Data
● Write data outside the sandbox, the other kids will not mind.
○ /var/lib/data/<tag>/<app>
● Careful about starvation, use the constraints and roles to manage this.
● Future improvements https://issues.apache.org/jira/browse/MESOS-2018 This is a
feature to provide better support for running stateful services on Mesos such as
HDFS (Distributed Filesystem), Cassandra (Distributed Database), or MySQL (Local
Database). Current resource reservations (henceforth called "static" reservations)
are statically determined by the slave operator at slave start time, and individual
frameworks have no authority to reserve resources themselves. Dynamic
reservations allow a framework to dynamically/lazily reserve offered resources, such
that those resources will only be re-offered to the same framework (or other
frameworks with the same role). This is especially useful if the framework's task
stored some state on the slave, and needs a guaranteed set of resources reserved
so that it can re-launch a task on the same slave to recover that state.
Apache Aurora
http://aurora.incubator.apache.org/
Apache Aurora is a service scheduler that runs on top of
Mesos, enabling you to run long-running services that
take advantage of Mesos' scalability, fault-tolerance, and
resource isolation. Apache Aurora is currently part of the
Apache Incubator.
Thermos
Job Objects
Job Lifecycle
hello_world.aurora
import os
hello_world_process = Process(name = 'hello_world', cmdline = 'echo hello world')
hello_world_task = Task(
resources = Resources(cpu = 0.1, ram = 16 * MB, disk = 16 * MB),
processes = [hello_world_process])
hello_world_job = Job(
cluster = 'cluster1',
role = os.getenv('USER'),
task = hello_world_task)
jobs = [hello_world_job]
Then issue the following commands to create and kill the job, using your own values for the job key.
aurora job create cluster1/$USER/test/hello_world hello_world.aurora
aurora job kill cluster1/$USER/test/hello_world
import os
import textwrap
class Profile(Struct):
archive = Default(String, 'https://archive.apache.org/dist')
gpg_grp = Default(String, 'https://people.apache.org/keys/group')
svc = Default(String, 'kafka')
svc_ver = Default(String, '0.8.1.1')
svc_ver_scala = Default(String, '2.10')
svc_prop_file = Default(String, 'server.properties')
jvm_heap = Default(String, '-Xmx1G -Xms1G')
common = Process(
name = 'fetch commons',
cmdline = textwrap.dedent("""
hdfs dfs -get /dist/scripts.tgz
tar xf scripts.tgz
""")
)
Kafka on Aurora
Kafka on Aurora
dist = Process(
name = 'fetch {{profile.svc}} v{{profile.svc_ver_scala}}-{{profile.svc_ver}} distribution',
cmdline = textwrap.dedent("""
command -v curl >/dev/null 2>&1 || { echo >&2 "error: 'curl' is not installed. Aborting."; exit 1; }
eval curl -sSfL '-O {{profile.archive}}/kafka/{{profile.svc_ver}}/kafka_{{profile.svc_ver_scala}}-
{{profile.svc_ver}}.tgz'{,.asc,.md5}
if command -v md5sum >/dev/null; then
md5sum -c kafka_{{profile.svc_ver_scala}}-{{profile.svc_ver}}.tgz.md5
else
echo "warn: 'md5sum' is not installed. Check skipped." >&2
fi
if command -v gpg >/dev/null; then
curl -sSfL {{profile.gpg_grp}}/kafka.asc | gpg --import
gpg kafka_{{profile.svc_ver_scala}}-{{profile.svc_ver}}.tgz.asc
else
echo "warn: 'gpg' is not installed. Signature verification skipped." >&2
fi
tar xf kafka_{{profile.svc_ver_scala}}-{{profile.svc_ver}}.tgz
""")
)
register = Process(
name = 'register-service',
cmdline = textwrap.dedent("""
export IP=$(host `hostname` | tr ' ' 'n' | tail -1)
./scripts/common/registry.sh -r {{profile.svc}}-{{environment}} -i {{mesos.instance}} -p "$IP:{{thermos.ports[client]}}"
""")
)
unregister = Process(
name = 'unregister-service',
final = True,
cmdline = textwrap.dedent("""
./scripts/common/registry.sh -u {{profile.svc}}-{{environment}} -i {{mesos.instance}}
""")
)
Kafka on Aurora
config = Process(
name = 'create {{profile.svc_prop_file}}',
cmdline = textwrap.dedent("""
export ZK_CONNECT=$(echo -e `./scripts/common/registry.sh -q zookeeper-{{environment}}-client` | awk '{printf("%s,", $0)}')
export IP=$(host `hostname` | tr ' ' 'n' | tail -1)
echo -e "
broker.id={{mesos.instance}}
port={{thermos.ports[client]}}
host.name=$IP
advertised.host.name=$IP
num.network.threads=2
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=kafka-logs
num.partitions=1
log.retention.hours=168
log.segment.bytes=536870912
log.retention.check.interval.ms=60000
log.cleaner.enable=false
zookeeper.connect=$ZK_CONNECT
zookeeper.connection.timeout.ms=30000
" > {{profile.svc_prop_file}}
echo "Wrote '{{profile.svc_prop_file}}':"
cat {{profile.svc_prop_file}}
""")
)
Kafka on Aurora
run = Process(
name = 'run {{profile.svc}}',
cmdline = textwrap.dedent("""
export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:kafka_{{profile.svc_ver_scala}}-
{{profile.svc_ver}}/config/log4j.properties"
export KAFKA_HEAP_OPTS="{{profile.jvm_heap}}"
export EXTRA_ARGS="-name kafkaServer -loggc"
kafka_{{profile.svc_ver_scala}}-{{profile.svc_ver}}/bin/kafka-run-class.sh $EXTRA_ARGS kafka.Kafka
{{profile.svc_prop_file}}
""")
)
Kafka on Aurora
Kafka on Aurora
base_task = Task(
processes = [register, unregister, common, dist, config, run],
constraints =
order(dist, run) +
order(common, register) +
order(common, config, run)
)
staging_task = base_task(
resources = Resources(cpu = 1.0, ram = 1280*MB, disk = 5*GB)
)
production_task = base_task(
resources = Resources(cpu = 24.0, ram = 23040*MB, disk = 10000*GB)
)
Kafka on Aurora
DEVELOPMENT = Profile()
PRODUCTION = Profile(
jvm_heap = '-Xmx4G -Xms4G'
)
base_job = Service(
name = 'kafka',
role = os.getenv('USER')
)
jobs = [
base_job(
cluster = dc1,
environment = 'devel',
instances = 4000,
contact = ‘root@localhost',
task = staging_task.bind(
profile = DEVELOPMENT
)
),
base_job(
cluster = ‘dc1’,
environment = 'prod',
instances = 150,
production = True,
contact = ‘root@localhost',
task = production_task.bind(
profile = PRODUCTION
)
)
]
The future of Kafka on Mesos
● A Mesos Kafka framework https://github.com/mesos/kafka
Sample Frameworks
C++ - https://github.com/apache/mesos/tree/master/src/examples
Java - https://github.com/apache/mesos/tree/master/src/examples/java
Python - https://github.com/apache/mesos/tree/master/src/examples/python
Scala - https://github.com/mesosphere/scala-sbt-mesos-framework.g8
Go - https://github.com/mesosphere/mesos-go
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto
message FrameworkInfo {
required string user = 1;
required string name = 2;
optional FrameworkID id = 3;
optional double failover_timeout = 4 [default = 0.0];
optional bool checkpoint = 5 [default = false];
optional string role = 6 [default = "*"];
optional string hostname = 7;
optional string principal = 8;
optional string webui_url = 9;
}
FrameworkInfo
TaskInfo
message TaskInfo {
required string name = 1;
required TaskID task_id = 2;
required SlaveID slave_id = 3;
repeated Resource resources = 4;
optional ExecutorInfo executor = 5;
optional CommandInfo command = 7;
optional ContainerInfo container = 9;
optional bytes data = 6;
optional HealthCheck health_check = 8;
optional Labels labels = 10;
optional DiscoveryInfo discovery = 11;
}
TaskState
/**
* Describes possible task states. IMPORTANT: Mesos assumes tasks that
* enter terminal states (see below) imply the task is no longer
* running and thus clean up any thing associated with the task
* (ultimately offering any resources being consumed by that task to
* another task).
*/
enum TaskState {
TASK_STAGING = 6; // Initial state. Framework status updates should not use.
TASK_STARTING = 0;
TASK_RUNNING = 1;
TASK_FINISHED = 2; // TERMINAL. The task finished successfully.
TASK_FAILED = 3; // TERMINAL. The task failed to finish successfully.
TASK_KILLED = 4; // TERMINAL. The task was killed by the executor.
TASK_LOST = 5; // TERMINAL. The task failed but can be rescheduled.
TASK_ERROR = 7; // TERMINAL. The task description contains an error.
}
Scheduler
/**
* Invoked when the scheduler successfully registers with a Mesos
* master. A unique ID (generated by the master) used for
* distinguishing this framework from others and MasterInfo
* with the ip and port of the current master are provided as arguments.
*/
def registered(
driver: SchedulerDriver,
frameworkId: FrameworkID,
masterInfo: MasterInfo): Unit = {
log.info("Scheduler.registered")
log.info("FrameworkID:n%s" format frameworkId)
log.info("MasterInfo:n%s" format masterInfo)
}
Scheduler
/**
* Invoked when the scheduler re-registers with a newly elected Mesos master.
* This is only called when the scheduler has previously been registered.
* MasterInfo containing the updated information about the elected master
* is provided as an argument.
*/
def reregistered(
driver: SchedulerDriver,
masterInfo: MasterInfo): Unit = {
log.info("Scheduler.reregistered")
log.info("MasterInfo:n%s" format masterInfo)
}
Scheduler
/**Invoked when resources have been offered to this framework. A single offer will only contain resources from a
single slave. Resources associated with an offer will not be re-offered to _this_ framework until either (a) this
framework has rejected those resources or (b) those resources have been rescinded.Note that resources may be
concurrently offered to more than one framework at a time (depending on the allocator being used). In that case,
the first framework to launch tasks using those resources will be able to use them while the other frameworks will
have those resources rescinded (or if a framework has already launched tasks with those resources then those
tasks will fail with a TASK_LOST status and a message saying as much).*/
def resourceOffers(
driver: SchedulerDriver, offers: JList[Offer]): Unit = {
log.info("Scheduler.resourceOffers")
// print and decline all received offers
offers foreach { offer =>
log.info(offer.toString)
driver declineOffer offer.getId
}
}
Scheduler
/**
* Invoked when an offer is no longer valid (e.g., the slave was
* lost or another framework used resources in the offer). If for
* whatever reason an offer is never rescinded (e.g., dropped
* message, failing over framework, etc.), a framwork that attempts
* to launch tasks using an invalid offer will receive TASK_LOST
* status updats for those tasks.
*/
def offerRescinded(
driver: SchedulerDriver,
offerId: OfferID): Unit = {
log.info("Scheduler.offerRescinded [%s]" format offerId.getValue)
}
Scheduler
/**
* Invoked when the status of a task has changed (e.g., a slave is
* lost and so the task is lost, a task finishes and an executor
* sends a status update saying so, etc). Note that returning from
* this callback _acknowledges_ receipt of this status update! If
* for whatever reason the scheduler aborts during this callback (or
* the process exits) another status update will be delivered (note,
* however, that this is currently not true if the slave sending the
* status update is lost/fails during that time).
*/
def statusUpdate(
driver: SchedulerDriver, status: TaskStatus): Unit = {
log.info("Scheduler.statusUpdate:n%s" format status)
}
Scheduler
/**
* Invoked when an executor sends a message. These messages are best
* effort; do not expect a framework message to be retransmitted in
* any reliable fashion.
*/
def frameworkMessage(
driver: SchedulerDriver,
executorId: ExecutorID,
slaveId: SlaveID,
data: Array[Byte]): Unit = {
log.info("Scheduler.frameworkMessage")
}
Scheduler
**
* Invoked when the scheduler becomes "disconnected" from the master
* (e.g., the master fails and another is taking over).
*/
def disconnected(driver: SchedulerDriver): Unit = {
log.info("Scheduler.disconnected")
}
/**
* Invoked when a slave has been determined unreachable (e.g.,
* machine failure, network partition). Most frameworks will need to
* reschedule any tasks launched on this slave on a new slave.
*/
def slaveLost(
driver: SchedulerDriver,
slaveId: SlaveID): Unit = {
log.info("Scheduler.slaveLost: [%s]" format slaveId.getValue)
}
Scheduler
/**
* Invoked when an executor has exited/terminated. Note that any
* tasks running will have TASK_LOST status updates automagically
* generated.
*/
def executorLost(
driver: SchedulerDriver,executorId: ExecutorID, slaveId: SlaveID,
status: Int): Unit = {
log.info("Scheduler.executorLost: [%s]" format executorId.getValue)
}
/**
* Invoked when there is an unrecoverable error in the scheduler or
* scheduler driver. The driver will be aborted BEFORE invoking this
* callback.
*/
def error(driver: SchedulerDriver, message: String): Unit = {
log.info("Scheduler.error: [%s]" format message)
}
Executor
/**
* Invoked once the executor driver has been able to successfully
* connect with Mesos. In particular, a scheduler can pass some
* data to it's executors through the ExecutorInfo.data
* field.
*/
def registered(
driver: ExecutorDriver,
executorInfo: ExecutorInfo,
frameworkInfo: FrameworkInfo,
slaveInfo: SlaveInfo): Unit = {
log.info("Executor.registered")
}
Executor
/**
* Invoked when the executor re-registers with a restarted slave.
*/
def reregistered(
driver: ExecutorDriver,
slaveInfo: SlaveInfo): Unit = {
log.info("Executor.reregistered")
}
/** Invoked when the executor becomes "disconnected" from the slave
* (e.g., the slave is being restarted due to an upgrade).
*/
def disconnected(driver: ExecutorDriver): Unit = {
log.info("Executor.disconnected")
}
Executor
/**
* Invoked when a task has been launched on this executor (initiated
* via Scheduler.launchTasks. Note that this task can be
* realized with a thread, a process, or some simple computation,
* however, no other callbacks will be invoked on this executor
* until this callback has returned.
*/
def launchTask(driver: ExecutorDriver, task: TaskInfo): Unit = {
log.info("Executor.launchTask")
}
Executor
/**
* Invoked when a task running within this executor has been killed
* (via SchedulerDriver.killTask). Note that no status
* update will be sent on behalf of the executor, the executor is
* responsible for creating a new TaskStatus (i.e., with
* TASK_KILLED) and invoking ExecutorDriver.sendStatusUpdate.
*/
def killTask(driver: ExecutorDriver, taskId: TaskID): Unit = {
log.info("Executor.killTask")
}
Executor
/**
* Invoked when a framework message has arrived for this
* executor. These messages are best effort; do not expect a
* framework message to be retransmitted in any reliable fashion.
*/
def frameworkMessage(driver: ExecutorDriver, data: Array[Byte]): Unit = {
log.info("Executor.frameworkMessage")
}
Executor
/**
* Invoked when the executor should terminate all of it's currently
* running tasks. Note that after a Mesos has determined that an
* executor has terminated any tasks that the executor did not send
* terminal status updates for (e.g., TASK_KILLED, TASK_FINISHED,
* TASK_FAILED, etc) a TASK_LOST status update will be created.
*/
def shutdown(driver: ExecutorDriver): Unit = {
log.info("Executor.shutdown")
}
Executor
/**
* Invoked when a fatal error has occured with the executor and/or
* executor driver. The driver will be aborted BEFORE invoking this
* callback.
*/
def error(driver: ExecutorDriver, message: String): Unit = {
log.info("Executor.error")
}
Questions?
/*******************************************
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop
********************************************/

Contenu connexe

Tendances

Oracle forms and reports 11g installation on linux
Oracle forms and reports 11g installation on linuxOracle forms and reports 11g installation on linux
Oracle forms and reports 11g installation on linuxVenu Palakolanu
 
Decompressed vmlinux: linux kernel initialization from page table configurati...
Decompressed vmlinux: linux kernel initialization from page table configurati...Decompressed vmlinux: linux kernel initialization from page table configurati...
Decompressed vmlinux: linux kernel initialization from page table configurati...Adrian Huang
 
High availability virtualization with proxmox
High availability virtualization with proxmoxHigh availability virtualization with proxmox
High availability virtualization with proxmoxOriol Izquierdo Vibalda
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...ScaleGrid.io
 
Linux kernel tracing
Linux kernel tracingLinux kernel tracing
Linux kernel tracingViller Hsiao
 
淺談 Live patching technology
淺談 Live patching technology淺談 Live patching technology
淺談 Live patching technologySZ Lin
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityThomas Graf
 
Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsStefano Salsano
 
Understanding docker networking
Understanding docker networkingUnderstanding docker networking
Understanding docker networkingLorenzo Fontana
 
Ifupdown2: Network Interface Manager
Ifupdown2: Network Interface ManagerIfupdown2: Network Interface Manager
Ifupdown2: Network Interface ManagerCumulus Networks
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernelAdrian Huang
 
Understanding oracle rac internals part 2 - slides
Understanding oracle rac internals   part 2 - slidesUnderstanding oracle rac internals   part 2 - slides
Understanding oracle rac internals part 2 - slidesMohamed Farouk
 
Oracle applications r12.2.0 installation on linux
Oracle applications r12.2.0 installation on linuxOracle applications r12.2.0 installation on linux
Oracle applications r12.2.0 installation on linuxRavi Kumar Lanke
 
Kubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep DiveKubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep DiveMichal Rostecki
 
[OpenInfra Days Korea 2018] (Track 1) Kubernetes 환경에서의 Volume 배포와 데이터 관리의 유연성...
[OpenInfra Days Korea 2018] (Track 1) Kubernetes 환경에서의 Volume 배포와 데이터 관리의 유연성...[OpenInfra Days Korea 2018] (Track 1) Kubernetes 환경에서의 Volume 배포와 데이터 관리의 유연성...
[OpenInfra Days Korea 2018] (Track 1) Kubernetes 환경에서의 Volume 배포와 데이터 관리의 유연성...OpenStack Korea Community
 
Multi-signed Kernel Module
Multi-signed Kernel ModuleMulti-signed Kernel Module
Multi-signed Kernel ModuleSUSE Labs Taipei
 

Tendances (20)

Oracle forms and reports 11g installation on linux
Oracle forms and reports 11g installation on linuxOracle forms and reports 11g installation on linux
Oracle forms and reports 11g installation on linux
 
Ebs clone r12.2.4
Ebs clone r12.2.4Ebs clone r12.2.4
Ebs clone r12.2.4
 
Decompressed vmlinux: linux kernel initialization from page table configurati...
Decompressed vmlinux: linux kernel initialization from page table configurati...Decompressed vmlinux: linux kernel initialization from page table configurati...
Decompressed vmlinux: linux kernel initialization from page table configurati...
 
Zenoh: The Genesis
Zenoh: The GenesisZenoh: The Genesis
Zenoh: The Genesis
 
High availability virtualization with proxmox
High availability virtualization with proxmoxHigh availability virtualization with proxmox
High availability virtualization with proxmox
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
 
Linux kernel tracing
Linux kernel tracingLinux kernel tracing
Linux kernel tracing
 
淺談 Live patching technology
淺談 Live patching technology淺談 Live patching technology
淺談 Live patching technology
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
 
Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and tools
 
Freeswitch
FreeswitchFreeswitch
Freeswitch
 
Understanding docker networking
Understanding docker networkingUnderstanding docker networking
Understanding docker networking
 
Ifupdown2: Network Interface Manager
Ifupdown2: Network Interface ManagerIfupdown2: Network Interface Manager
Ifupdown2: Network Interface Manager
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernel
 
Understanding oracle rac internals part 2 - slides
Understanding oracle rac internals   part 2 - slidesUnderstanding oracle rac internals   part 2 - slides
Understanding oracle rac internals part 2 - slides
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
Oracle applications r12.2.0 installation on linux
Oracle applications r12.2.0 installation on linuxOracle applications r12.2.0 installation on linux
Oracle applications r12.2.0 installation on linux
 
Kubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep DiveKubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep Dive
 
[OpenInfra Days Korea 2018] (Track 1) Kubernetes 환경에서의 Volume 배포와 데이터 관리의 유연성...
[OpenInfra Days Korea 2018] (Track 1) Kubernetes 환경에서의 Volume 배포와 데이터 관리의 유연성...[OpenInfra Days Korea 2018] (Track 1) Kubernetes 환경에서의 Volume 배포와 데이터 관리의 유연성...
[OpenInfra Days Korea 2018] (Track 1) Kubernetes 환경에서의 Volume 배포와 데이터 관리의 유연성...
 
Multi-signed Kernel Module
Multi-signed Kernel ModuleMulti-signed Kernel Module
Multi-signed Kernel Module
 

En vedette

Datacenter Computing with Apache Mesos - BigData DC
Datacenter Computing with Apache Mesos - BigData DCDatacenter Computing with Apache Mesos - BigData DC
Datacenter Computing with Apache Mesos - BigData DCPaco Nathan
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache MesosJoe Stein
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesostomasbart
 
Introduction To Apache Mesos
Introduction To Apache MesosIntroduction To Apache Mesos
Introduction To Apache MesosTimothy St. Clair
 
Apache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basicsApache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basicsGladson Manuel
 
Strata SC 2014: Apache Mesos as an SDK for Building Distributed Frameworks
Strata SC 2014: Apache Mesos as an SDK for Building Distributed FrameworksStrata SC 2014: Apache Mesos as an SDK for Building Distributed Frameworks
Strata SC 2014: Apache Mesos as an SDK for Building Distributed FrameworksPaco Nathan
 
Scaling Like Twitter with Apache Mesos
Scaling Like Twitter with Apache MesosScaling Like Twitter with Apache Mesos
Scaling Like Twitter with Apache MesosMesosphere Inc.
 
Mesos vs kubernetes comparison
Mesos vs kubernetes comparisonMesos vs kubernetes comparison
Mesos vs kubernetes comparisonKrishna-Kumar
 
범용 PaaS 플랫폼 mesos(mesosphere)
범용 PaaS 플랫폼 mesos(mesosphere)범용 PaaS 플랫폼 mesos(mesosphere)
범용 PaaS 플랫폼 mesos(mesosphere)상욱 송
 
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...DataStax
 
Orchestration tool roundup kubernetes vs. docker vs. heat vs. terra form vs...
Orchestration tool roundup   kubernetes vs. docker vs. heat vs. terra form vs...Orchestration tool roundup   kubernetes vs. docker vs. heat vs. terra form vs...
Orchestration tool roundup kubernetes vs. docker vs. heat vs. terra form vs...Nati Shalom
 
Mesos: The Operating System for your Datacenter
Mesos: The Operating System for your DatacenterMesos: The Operating System for your Datacenter
Mesos: The Operating System for your DatacenterDavid Greenberg
 
AWS re:Invent 2014 talk: Scheduling using Apache Mesos in the Cloud
AWS re:Invent 2014 talk: Scheduling using Apache Mesos in the CloudAWS re:Invent 2014 talk: Scheduling using Apache Mesos in the Cloud
AWS re:Invent 2014 talk: Scheduling using Apache Mesos in the CloudSharma Podila
 
ContainerDayVietnam2016: Lesson Leanred on Docker 1.12 and Swarm Mode
ContainerDayVietnam2016: Lesson Leanred on Docker 1.12 and Swarm ModeContainerDayVietnam2016: Lesson Leanred on Docker 1.12 and Swarm Mode
ContainerDayVietnam2016: Lesson Leanred on Docker 1.12 and Swarm ModeDocker-Hanoi
 
Continuous delivery with docker
Continuous delivery with dockerContinuous delivery with docker
Continuous delivery with dockerJohan Janssen
 
Xcode の一歩進んだ使い方 分散ビルド
Xcode の一歩進んだ使い方 分散ビルドXcode の一歩進んだ使い方 分散ビルド
Xcode の一歩進んだ使い方 分散ビルドnnkgw
 
BDD: Descubriendo qué requiere realmente tu cliente
BDD: Descubriendo qué requiere realmente tu clienteBDD: Descubriendo qué requiere realmente tu cliente
BDD: Descubriendo qué requiere realmente tu clienteJorge Gamba
 
Conferencia Rails: Integracion Continua Y Rails
Conferencia Rails: Integracion Continua Y RailsConferencia Rails: Integracion Continua Y Rails
Conferencia Rails: Integracion Continua Y RailsDavid Calavera
 

En vedette (20)

Datacenter Computing with Apache Mesos - BigData DC
Datacenter Computing with Apache Mesos - BigData DCDatacenter Computing with Apache Mesos - BigData DC
Datacenter Computing with Apache Mesos - BigData DC
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesos
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesos
 
Introduction To Apache Mesos
Introduction To Apache MesosIntroduction To Apache Mesos
Introduction To Apache Mesos
 
Apache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basicsApache Mesos: a simple explanation of basics
Apache Mesos: a simple explanation of basics
 
Strata SC 2014: Apache Mesos as an SDK for Building Distributed Frameworks
Strata SC 2014: Apache Mesos as an SDK for Building Distributed FrameworksStrata SC 2014: Apache Mesos as an SDK for Building Distributed Frameworks
Strata SC 2014: Apache Mesos as an SDK for Building Distributed Frameworks
 
Scaling Like Twitter with Apache Mesos
Scaling Like Twitter with Apache MesosScaling Like Twitter with Apache Mesos
Scaling Like Twitter with Apache Mesos
 
Mesos vs kubernetes comparison
Mesos vs kubernetes comparisonMesos vs kubernetes comparison
Mesos vs kubernetes comparison
 
범용 PaaS 플랫폼 mesos(mesosphere)
범용 PaaS 플랫폼 mesos(mesosphere)범용 PaaS 플랫폼 mesos(mesosphere)
범용 PaaS 플랫폼 mesos(mesosphere)
 
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* ...
 
Orchestration tool roundup kubernetes vs. docker vs. heat vs. terra form vs...
Orchestration tool roundup   kubernetes vs. docker vs. heat vs. terra form vs...Orchestration tool roundup   kubernetes vs. docker vs. heat vs. terra form vs...
Orchestration tool roundup kubernetes vs. docker vs. heat vs. terra form vs...
 
Mesos: The Operating System for your Datacenter
Mesos: The Operating System for your DatacenterMesos: The Operating System for your Datacenter
Mesos: The Operating System for your Datacenter
 
AWS re:Invent 2014 talk: Scheduling using Apache Mesos in the Cloud
AWS re:Invent 2014 talk: Scheduling using Apache Mesos in the CloudAWS re:Invent 2014 talk: Scheduling using Apache Mesos in the Cloud
AWS re:Invent 2014 talk: Scheduling using Apache Mesos in the Cloud
 
ContainerDayVietnam2016: Lesson Leanred on Docker 1.12 and Swarm Mode
ContainerDayVietnam2016: Lesson Leanred on Docker 1.12 and Swarm ModeContainerDayVietnam2016: Lesson Leanred on Docker 1.12 and Swarm Mode
ContainerDayVietnam2016: Lesson Leanred on Docker 1.12 and Swarm Mode
 
Continuous delivery with docker
Continuous delivery with dockerContinuous delivery with docker
Continuous delivery with docker
 
Xcode の一歩進んだ使い方 分散ビルド
Xcode の一歩進んだ使い方 分散ビルドXcode の一歩進んだ使い方 分散ビルド
Xcode の一歩進んだ使い方 分散ビルド
 
Mesos introduction
Mesos introductionMesos introduction
Mesos introduction
 
GoDocker presentation
GoDocker presentationGoDocker presentation
GoDocker presentation
 
BDD: Descubriendo qué requiere realmente tu cliente
BDD: Descubriendo qué requiere realmente tu clienteBDD: Descubriendo qué requiere realmente tu cliente
BDD: Descubriendo qué requiere realmente tu cliente
 
Conferencia Rails: Integracion Continua Y Rails
Conferencia Rails: Integracion Continua Y RailsConferencia Rails: Integracion Continua Y Rails
Conferencia Rails: Integracion Continua Y Rails
 

Similaire à Building and Deploying Application to Apache Mesos

Apache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on MesosApache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on MesosJoe Stein
 
Introduction To Apache Mesos
Introduction To Apache MesosIntroduction To Apache Mesos
Introduction To Apache MesosJoe Stein
 
DevOps Enabling Your Team
DevOps Enabling Your TeamDevOps Enabling Your Team
DevOps Enabling Your TeamGR8Conf
 
AWS CloudFormation Session
AWS CloudFormation SessionAWS CloudFormation Session
AWS CloudFormation SessionKamal Maiti
 
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...Amazon Web Services
 
Burn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesBurn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesLindsay Holmwood
 
Understanding OpenStack Deployments - PuppetConf 2014
Understanding OpenStack Deployments - PuppetConf 2014Understanding OpenStack Deployments - PuppetConf 2014
Understanding OpenStack Deployments - PuppetConf 2014Puppet
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with PrometheusShiao-An Yuan
 
Scalable and Fault-Tolerant Apps with AWS
Scalable and Fault-Tolerant Apps with AWSScalable and Fault-Tolerant Apps with AWS
Scalable and Fault-Tolerant Apps with AWSFernando Rodriguez
 
Programando sua infraestrutura com o AWS CloudFormation
Programando sua infraestrutura com o AWS CloudFormationProgramando sua infraestrutura com o AWS CloudFormation
Programando sua infraestrutura com o AWS CloudFormationAmazon Web Services LATAM
 
Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Prajal Kulkarni
 
Infrastructure-as-code: bridging the gap between Devs and Ops
Infrastructure-as-code: bridging the gap between Devs and OpsInfrastructure-as-code: bridging the gap between Devs and Ops
Infrastructure-as-code: bridging the gap between Devs and OpsMykyta Protsenko
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariAlejandro Fernandez
 
Hands-on Lab: Comparing Redis with Relational
Hands-on Lab: Comparing Redis with RelationalHands-on Lab: Comparing Redis with Relational
Hands-on Lab: Comparing Redis with RelationalAmazon Web Services
 
Deployment and Management on AWS:
 A Deep Dive on Options and Tools
Deployment and Management on AWS:
 A Deep Dive on Options and ToolsDeployment and Management on AWS:
 A Deep Dive on Options and Tools
Deployment and Management on AWS:
 A Deep Dive on Options and ToolsDanilo Poccia
 
ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loadingalex_araujo
 
Elasticsearch And Apache Lucene For Apache Spark And MLlib
Elasticsearch And Apache Lucene For Apache Spark And MLlibElasticsearch And Apache Lucene For Apache Spark And MLlib
Elasticsearch And Apache Lucene For Apache Spark And MLlibJen Aman
 
Introduction to mesos bay
Introduction to mesos bayIntroduction to mesos bay
Introduction to mesos bayhongbin034
 

Similaire à Building and Deploying Application to Apache Mesos (20)

Apache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on MesosApache Kafka, HDFS, Accumulo and more on Mesos
Apache Kafka, HDFS, Accumulo and more on Mesos
 
Introduction To Apache Mesos
Introduction To Apache MesosIntroduction To Apache Mesos
Introduction To Apache Mesos
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
DevOps Enabling Your Team
DevOps Enabling Your TeamDevOps Enabling Your Team
DevOps Enabling Your Team
 
AWS CloudFormation Session
AWS CloudFormation SessionAWS CloudFormation Session
AWS CloudFormation Session
 
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
(BDT401) Big Data Orchestra - Harmony within Data Analysis Tools | AWS re:Inv...
 
Burn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesBurn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websites
 
Understanding OpenStack Deployments - PuppetConf 2014
Understanding OpenStack Deployments - PuppetConf 2014Understanding OpenStack Deployments - PuppetConf 2014
Understanding OpenStack Deployments - PuppetConf 2014
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with Prometheus
 
Scalable and Fault-Tolerant Apps with AWS
Scalable and Fault-Tolerant Apps with AWSScalable and Fault-Tolerant Apps with AWS
Scalable and Fault-Tolerant Apps with AWS
 
Programando sua infraestrutura com o AWS CloudFormation
Programando sua infraestrutura com o AWS CloudFormationProgramando sua infraestrutura com o AWS CloudFormation
Programando sua infraestrutura com o AWS CloudFormation
 
Amazon ECS Deep Dive
Amazon ECS Deep DiveAmazon ECS Deep Dive
Amazon ECS Deep Dive
 
Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.
 
Infrastructure-as-code: bridging the gap between Devs and Ops
Infrastructure-as-code: bridging the gap between Devs and OpsInfrastructure-as-code: bridging the gap between Devs and Ops
Infrastructure-as-code: bridging the gap between Devs and Ops
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 
Hands-on Lab: Comparing Redis with Relational
Hands-on Lab: Comparing Redis with RelationalHands-on Lab: Comparing Redis with Relational
Hands-on Lab: Comparing Redis with Relational
 
Deployment and Management on AWS:
 A Deep Dive on Options and Tools
Deployment and Management on AWS:
 A Deep Dive on Options and ToolsDeployment and Management on AWS:
 A Deep Dive on Options and Tools
Deployment and Management on AWS:
 A Deep Dive on Options and Tools
 
ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loading
 
Elasticsearch And Apache Lucene For Apache Spark And MLlib
Elasticsearch And Apache Lucene For Apache Spark And MLlibElasticsearch And Apache Lucene For Apache Spark And MLlib
Elasticsearch And Apache Lucene For Apache Spark And MLlib
 
Introduction to mesos bay
Introduction to mesos bayIntroduction to mesos bay
Introduction to mesos bay
 

Plus de Joe Stein

Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogJoe Stein
 
SMACK Stack 1.1
SMACK Stack 1.1SMACK Stack 1.1
SMACK Stack 1.1Joe Stein
 
Get started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosGet started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosJoe Stein
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraReal-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraJoe Stein
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache MesosJoe Stein
 
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Joe Stein
 
Containerized Data Persistence on Mesos
Containerized Data Persistence on MesosContainerized Data Persistence on Mesos
Containerized Data Persistence on MesosJoe Stein
 
Making Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosMaking Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosJoe Stein
 
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloReal-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloJoe Stein
 
Developing with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaDeveloping with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaJoe Stein
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache KafkaJoe Stein
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache KafkaJoe Stein
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaJoe Stein
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaJoe Stein
 
Apache Cassandra 2.0
Apache Cassandra 2.0Apache Cassandra 2.0
Apache Cassandra 2.0Joe Stein
 
Storing Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsStoring Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsJoe Stein
 
Apache Kafka
Apache KafkaApache Kafka
Apache KafkaJoe Stein
 
Hadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonHadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonJoe Stein
 

Plus de Joe Stein (20)

Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit Log
 
SMACK Stack 1.1
SMACK Stack 1.1SMACK Stack 1.1
SMACK Stack 1.1
 
Get started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache MesosGet started with Developing Frameworks in Go on Apache Mesos
Get started with Developing Frameworks in Go on Apache Mesos
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraReal-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Developing Frameworks for Apache Mesos
Developing Frameworks  for Apache MesosDeveloping Frameworks  for Apache Mesos
Developing Frameworks for Apache Mesos
 
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
 
Containerized Data Persistence on Mesos
Containerized Data Persistence on MesosContainerized Data Persistence on Mesos
Containerized Data Persistence on Mesos
 
Making Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache MesosMaking Apache Kafka Elastic with Apache Mesos
Making Apache Kafka Elastic with Apache Mesos
 
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloReal-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
 
Developing with the Go client for Apache Kafka
Developing with the Go client for Apache KafkaDeveloping with the Go client for Apache Kafka
Developing with the Go client for Apache Kafka
 
Current and Future of Apache Kafka
Current and Future of Apache KafkaCurrent and Future of Apache Kafka
Current and Future of Apache Kafka
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 
Developing Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache KafkaDeveloping Realtime Data Pipelines With Apache Kafka
Developing Realtime Data Pipelines With Apache Kafka
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache Kafka
 
Apache Cassandra 2.0
Apache Cassandra 2.0Apache Cassandra 2.0
Apache Cassandra 2.0
 
Storing Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite ColumnsStoring Time Series Metrics With Cassandra and Composite Columns
Storing Time Series Metrics With Cassandra and Composite Columns
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Hadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With PythonHadoop Streaming Tutorial With Python
Hadoop Streaming Tutorial With Python
 

Dernier

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Dernier (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Building and Deploying Application to Apache Mesos

  • 1. Joe Stein Building & Deploying Applications to Apache Mesos
  • 2. Joe Stein • Developer, Architect & Technologist • Founder & Principal Consultant => Big Data Open Source Security LLC - http://stealth.ly Big Data Open Source Security LLC provides professional services and product solutions for the collection, storage, transfer, real-time analytics, batch processing and reporting for complex data streams, data sets and distributed systems. BDOSS is all about the "glue" and helping companies to not only figure out what Big Data Infrastructure Components to use but also how to change their existing (or build new) systems to work with them. • Apache Kafka Committer & PMC member • Blog & Podcast - http://allthingshadoop.com • Twitter @allthingshadoop
  • 3. Overview ● Quick intro to Mesos ● Marathon Framework ● Aurora Framework ● Custom Framework
  • 4.
  • 5. Origins Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center http://static.usenix.org/event/nsdi11/tech/full_papers/Hindman_new.pdf Datacenter Management with Apache Mesos https://www.youtube.com/watch?v=YB1VW0LKzJ4 Omega: flexible, scalable schedulers for large compute clusters http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf 2011 GAFS Omega John Wilkes https://www.youtube.com/watch?v=0ZFMlO98Jkc&feature=youtu.be
  • 8. Static Partitioning IS Bad hard to utilize machines (i.e., X GB RAM and Y CPUs)
  • 9. Static Partitioning does NOT scale hard to scale elastically (to take advantage of statistical multiplexing)
  • 10. Failures === Downtime hard to deal with failures
  • 12. Operating System === Datacenter
  • 13. Mesos => data center “kernel”
  • 14. Apache Mesos ● Scalability to 10,000s of nodes ● Fault-tolerant replicated master and slaves using ZooKeeper ● Support for Docker containers ● Native isolation between tasks with Linux Containers ● Multi-resource scheduling (memory, CPU, disk, and ports) ● Java, Python and C++ APIs for developing new parallel applications ● Web UI for viewing cluster state
  • 15.
  • 16.
  • 17. Resources & Attributes The Mesos system has two basic methods to describe the slaves that comprise a cluster. One of these is managed by the Mesos master, the other is simply passed onwards to the frameworks using the cluster. Attributes The attributes are simply key value string pairs that Mesos passes along when it sends offers to frameworks. attributes : attribute ( ";" attribute )* attribute : labelString ":" ( labelString | "," )+
  • 18. Resources The Mesos system can manage 3 different types of resources: scalars, ranges, and sets. These are used to represent the different resources that a Mesos slave has to offer. For example, a scalar resource type could be used to represent the amount of memory on a slave. Each resource is identified by a key string. resources : resource ( ";" resource )* resource : key ":" ( scalar | range | set ) key : labelString ( "(" resourceRole ")" )? scalar : floatValue range : "[" rangeValue ( "," rangeValue )* "]" rangeValue : scalar "-" scalar set : "{" labelString ( "," labelString )* "}" resourceRole : labelString | "*" labelString : [a-zA-Z0-9_/.-] floatValue : ( intValue ( "." intValue )? ) | ... intValue : [0-9]+
  • 19. Predefined Uses & Conventions The Mesos master has a few resources that it pre-defines in how it handles them. At the current time, this list consist of: ● cpu ● mem ● disk ● ports In particular, a slave without cpu and mem resources will never have its resources advertised to any frameworks. Also, the Master’s user interface interprets the scalars inmem and disk in terms of MB. IE: the value 15000 is displayed as 14.65GB.
  • 20. Here are some examples for configuring the Mesos slaves. --resources='cpu:24;mem:24576;disk:409600;ports:[21000-24000];disks:{1,2,3,4,5,6,7,8}' --attributes='dc:1;floor:2;aisle:6;rack:aa;server:15;os:trusty;jvm:7u68;docker:1.5' In this case, we have three different types of resources, scalars, a range, and a set. They are called cpu, mem, disk, and the range type is ports. ● scalar called cpu, with the value 24 ● scalar called mem, with the value 24576 ● scalar called disk, with the value 409600 ● range called ports, with values 21000 through 24000 (inclusive) ● set called disks, with the values 1, 2, etc which we can concatenate later /mnt/disk/{diskNum} and each task can own a disk Examples
  • 21. Roles Total consumable resources per slave, in the form 'name(role):value;name(role):value...'. This value can be set to limit resources per role, or to overstate the number of resources that are available to the slave. --resources="cpus(*):8; mem(*):15360; disk(*):710534; ports(*):[31000-32000]" --resources="cpus(prod):8; cpus(stage):2 mem(*):15360; disk(*):710534; ports(*):[31000-32000]" All * roles will be detected, so you can specify only the resources that are not all roles (*). -- resources="cpus(prod):8; cpus(stage)" Frameworks bind a specific roles or any. A default roll (instead of *) can also be configured. Roles can be used to isolate and segregate frameworks.
  • 22. Marathon https://github.com/mesosphere/marathon Cluster-wide init and control system for services in cgroups or docker based on Apache Mesos
  • 23. Constraints Constraints control where apps run to allow optimizing for fault tolerance or locality. Constraints are made up of three parts: a field name, an operator, and an optional parameter. The field can be the slave hostname or any Mesos slave attribute. Fields Hostname field hostname field matches the slave hostnames, see UNIQUE operator for usage example. hostname field supports all operators of Marathon. Attribute field If the field name is none of the above, it will be treated as a Mesos slave attribute. Mesos slave attribute is a way to tag a slave node, see mesos-slave --help to learn how to set the attributes.
  • 24. Unique UNIQUE tells Marathon to enforce uniqueness of the attribute across all of an app's tasks. For example the following constraint ensures that there is only one app task running on each host: via the Marathon gem: $ marathon start -i sleep -C 'sleep 60' -n 3 --constraint hostname:UNIQUE via curl: $ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{ "id": "sleep-unique", "cmd": "sleep 60", "instances": 3, "constraints": [["hostname", "UNIQUE"]] }'
  • 25. CLUSTER allows you to run all of your app's tasks on slaves that share a certain attribute. $ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{ "id": "sleep-cluster", "cmd": "sleep 60", "instances": 3, "constraints": [["rack_id", "CLUSTER", "rack-1"]] }' You can also use this attribute to tie an application to a specific node by using the hostname property: $ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{ "id": "sleep-cluster", "cmd": "sleep 60", "instances": 3, "constraints": [["hostname", "CLUSTER", "a.specific.node.com"]] }' Cluster
  • 26. Group By GROUP_BY can be used to distribute tasks evenly across racks or datacenters for high availability. via the Marathon gem: $ marathon start -i sleep -C 'sleep 60' -n 3 --constraint rack_id:GROUP_BY via curl: $ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{ "id": "sleep-group-by", "cmd": "sleep 60", "instances": 3, "constraints": [["rack_id", "GROUP_BY"]] }' Optionally, you can specify a minimum number of groups to try and achieve.
  • 27. Like LIKE accepts a regular expression as parameter, and allows you to run your tasks only on the slaves whose field values match the regular expression. via the Marathon gem: $ marathon start -i sleep -C 'sleep 60' -n 3 --constraint rack_id:LIKE:rack-[1-3] via curl: $ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{ "id": "sleep-group-by", "cmd": "sleep 60", "instances": 3, "constraints": [["rack_id", "LIKE", "rack-[1-3]"]] }'
  • 28. Unlike Just like LIKE operator, but only run tasks on slaves whose field values don't match the regular expression. via the Marathon gem: $ marathon start -i sleep -C 'sleep 60' -n 3 --constraint rack_id:UNLIKE:rack-[7-9] via curl: $ curl -X POST -H "Content-type: application/json" localhost:8080/v1/apps/start -d '{ "id": "sleep-group-by", "cmd": "sleep 60", "instances": 3, "constraints": [["rack_id", "UNLIKE", "rack-[7-9]"]] }'
  • 29. Running in Marathon TAG=sample APP=xyz ID=$TAG-$APP CMD=”./yourScript "'$HOST'" "'$PORT0'" "'$PORT1' JSON=$(printf '{ "id": "%s", "cmd": "%s", "cpus": %s, "mem": %s, "instances": %s, "uris": ["%s"], "ports": [0,0] , “env”:{ “JAVA_OPTS”,”%s”}' "$ID" “$CMD" "0.1" "256" "1" "http://dns/path/yourScriptAndStuff.tgz" “-Xmx 128”) curl -i -X POST -H "Content-Type: application/json" -d "$JSON" http://localhost:8080/v2/apps
  • 30. yourScript #think of it as a distributed application launching in the cloud HOST=$1 PORT0=$2 PORT1=$3 #talk to zookeeper or etcd or mesos dns #call marathon rest api #spawn another process, they are all in your cgroup =8^) woot woot
  • 31. Persisting Data ● Write data outside the sandbox, the other kids will not mind. ○ /var/lib/data/<tag>/<app> ● Careful about starvation, use the constraints and roles to manage this. ● Future improvements https://issues.apache.org/jira/browse/MESOS-2018 This is a feature to provide better support for running stateful services on Mesos such as HDFS (Distributed Filesystem), Cassandra (Distributed Database), or MySQL (Local Database). Current resource reservations (henceforth called "static" reservations) are statically determined by the slave operator at slave start time, and individual frameworks have no authority to reserve resources themselves. Dynamic reservations allow a framework to dynamically/lazily reserve offered resources, such that those resources will only be re-offered to the same framework (or other frameworks with the same role). This is especially useful if the framework's task stored some state on the slave, and needs a guaranteed set of resources reserved so that it can re-launch a task on the same slave to recover that state.
  • 32. Apache Aurora http://aurora.incubator.apache.org/ Apache Aurora is a service scheduler that runs on top of Mesos, enabling you to run long-running services that take advantage of Mesos' scalability, fault-tolerance, and resource isolation. Apache Aurora is currently part of the Apache Incubator.
  • 36. hello_world.aurora import os hello_world_process = Process(name = 'hello_world', cmdline = 'echo hello world') hello_world_task = Task( resources = Resources(cpu = 0.1, ram = 16 * MB, disk = 16 * MB), processes = [hello_world_process]) hello_world_job = Job( cluster = 'cluster1', role = os.getenv('USER'), task = hello_world_task) jobs = [hello_world_job] Then issue the following commands to create and kill the job, using your own values for the job key. aurora job create cluster1/$USER/test/hello_world hello_world.aurora aurora job kill cluster1/$USER/test/hello_world
  • 37. import os import textwrap class Profile(Struct): archive = Default(String, 'https://archive.apache.org/dist') gpg_grp = Default(String, 'https://people.apache.org/keys/group') svc = Default(String, 'kafka') svc_ver = Default(String, '0.8.1.1') svc_ver_scala = Default(String, '2.10') svc_prop_file = Default(String, 'server.properties') jvm_heap = Default(String, '-Xmx1G -Xms1G') common = Process( name = 'fetch commons', cmdline = textwrap.dedent(""" hdfs dfs -get /dist/scripts.tgz tar xf scripts.tgz """) ) Kafka on Aurora
  • 38. Kafka on Aurora dist = Process( name = 'fetch {{profile.svc}} v{{profile.svc_ver_scala}}-{{profile.svc_ver}} distribution', cmdline = textwrap.dedent(""" command -v curl >/dev/null 2>&1 || { echo >&2 "error: 'curl' is not installed. Aborting."; exit 1; } eval curl -sSfL '-O {{profile.archive}}/kafka/{{profile.svc_ver}}/kafka_{{profile.svc_ver_scala}}- {{profile.svc_ver}}.tgz'{,.asc,.md5} if command -v md5sum >/dev/null; then md5sum -c kafka_{{profile.svc_ver_scala}}-{{profile.svc_ver}}.tgz.md5 else echo "warn: 'md5sum' is not installed. Check skipped." >&2 fi if command -v gpg >/dev/null; then curl -sSfL {{profile.gpg_grp}}/kafka.asc | gpg --import gpg kafka_{{profile.svc_ver_scala}}-{{profile.svc_ver}}.tgz.asc else echo "warn: 'gpg' is not installed. Signature verification skipped." >&2 fi tar xf kafka_{{profile.svc_ver_scala}}-{{profile.svc_ver}}.tgz """) )
  • 39. register = Process( name = 'register-service', cmdline = textwrap.dedent(""" export IP=$(host `hostname` | tr ' ' 'n' | tail -1) ./scripts/common/registry.sh -r {{profile.svc}}-{{environment}} -i {{mesos.instance}} -p "$IP:{{thermos.ports[client]}}" """) ) unregister = Process( name = 'unregister-service', final = True, cmdline = textwrap.dedent(""" ./scripts/common/registry.sh -u {{profile.svc}}-{{environment}} -i {{mesos.instance}} """) ) Kafka on Aurora
  • 40. config = Process( name = 'create {{profile.svc_prop_file}}', cmdline = textwrap.dedent(""" export ZK_CONNECT=$(echo -e `./scripts/common/registry.sh -q zookeeper-{{environment}}-client` | awk '{printf("%s,", $0)}') export IP=$(host `hostname` | tr ' ' 'n' | tail -1) echo -e " broker.id={{mesos.instance}} port={{thermos.ports[client]}} host.name=$IP advertised.host.name=$IP num.network.threads=2 num.io.threads=8 socket.send.buffer.bytes=1048576 socket.receive.buffer.bytes=1048576 socket.request.max.bytes=104857600 log.dirs=kafka-logs num.partitions=1 log.retention.hours=168 log.segment.bytes=536870912 log.retention.check.interval.ms=60000 log.cleaner.enable=false zookeeper.connect=$ZK_CONNECT zookeeper.connection.timeout.ms=30000 " > {{profile.svc_prop_file}} echo "Wrote '{{profile.svc_prop_file}}':" cat {{profile.svc_prop_file}} """) ) Kafka on Aurora
  • 41. run = Process( name = 'run {{profile.svc}}', cmdline = textwrap.dedent(""" export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:kafka_{{profile.svc_ver_scala}}- {{profile.svc_ver}}/config/log4j.properties" export KAFKA_HEAP_OPTS="{{profile.jvm_heap}}" export EXTRA_ARGS="-name kafkaServer -loggc" kafka_{{profile.svc_ver_scala}}-{{profile.svc_ver}}/bin/kafka-run-class.sh $EXTRA_ARGS kafka.Kafka {{profile.svc_prop_file}} """) ) Kafka on Aurora
  • 42. Kafka on Aurora base_task = Task( processes = [register, unregister, common, dist, config, run], constraints = order(dist, run) + order(common, register) + order(common, config, run) ) staging_task = base_task( resources = Resources(cpu = 1.0, ram = 1280*MB, disk = 5*GB) ) production_task = base_task( resources = Resources(cpu = 24.0, ram = 23040*MB, disk = 10000*GB) )
  • 43. Kafka on Aurora DEVELOPMENT = Profile() PRODUCTION = Profile( jvm_heap = '-Xmx4G -Xms4G' ) base_job = Service( name = 'kafka', role = os.getenv('USER') ) jobs = [ base_job( cluster = dc1, environment = 'devel', instances = 4000, contact = ‘root@localhost', task = staging_task.bind( profile = DEVELOPMENT ) ), base_job( cluster = ‘dc1’, environment = 'prod', instances = 150, production = True, contact = ‘root@localhost', task = production_task.bind( profile = PRODUCTION ) ) ]
  • 44. The future of Kafka on Mesos ● A Mesos Kafka framework https://github.com/mesos/kafka
  • 45. Sample Frameworks C++ - https://github.com/apache/mesos/tree/master/src/examples Java - https://github.com/apache/mesos/tree/master/src/examples/java Python - https://github.com/apache/mesos/tree/master/src/examples/python Scala - https://github.com/mesosphere/scala-sbt-mesos-framework.g8 Go - https://github.com/mesosphere/mesos-go
  • 46. https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto message FrameworkInfo { required string user = 1; required string name = 2; optional FrameworkID id = 3; optional double failover_timeout = 4 [default = 0.0]; optional bool checkpoint = 5 [default = false]; optional string role = 6 [default = "*"]; optional string hostname = 7; optional string principal = 8; optional string webui_url = 9; } FrameworkInfo
  • 47. TaskInfo message TaskInfo { required string name = 1; required TaskID task_id = 2; required SlaveID slave_id = 3; repeated Resource resources = 4; optional ExecutorInfo executor = 5; optional CommandInfo command = 7; optional ContainerInfo container = 9; optional bytes data = 6; optional HealthCheck health_check = 8; optional Labels labels = 10; optional DiscoveryInfo discovery = 11; }
  • 48. TaskState /** * Describes possible task states. IMPORTANT: Mesos assumes tasks that * enter terminal states (see below) imply the task is no longer * running and thus clean up any thing associated with the task * (ultimately offering any resources being consumed by that task to * another task). */ enum TaskState { TASK_STAGING = 6; // Initial state. Framework status updates should not use. TASK_STARTING = 0; TASK_RUNNING = 1; TASK_FINISHED = 2; // TERMINAL. The task finished successfully. TASK_FAILED = 3; // TERMINAL. The task failed to finish successfully. TASK_KILLED = 4; // TERMINAL. The task was killed by the executor. TASK_LOST = 5; // TERMINAL. The task failed but can be rescheduled. TASK_ERROR = 7; // TERMINAL. The task description contains an error. }
  • 49. Scheduler /** * Invoked when the scheduler successfully registers with a Mesos * master. A unique ID (generated by the master) used for * distinguishing this framework from others and MasterInfo * with the ip and port of the current master are provided as arguments. */ def registered( driver: SchedulerDriver, frameworkId: FrameworkID, masterInfo: MasterInfo): Unit = { log.info("Scheduler.registered") log.info("FrameworkID:n%s" format frameworkId) log.info("MasterInfo:n%s" format masterInfo) }
  • 50. Scheduler /** * Invoked when the scheduler re-registers with a newly elected Mesos master. * This is only called when the scheduler has previously been registered. * MasterInfo containing the updated information about the elected master * is provided as an argument. */ def reregistered( driver: SchedulerDriver, masterInfo: MasterInfo): Unit = { log.info("Scheduler.reregistered") log.info("MasterInfo:n%s" format masterInfo) }
  • 51. Scheduler /**Invoked when resources have been offered to this framework. A single offer will only contain resources from a single slave. Resources associated with an offer will not be re-offered to _this_ framework until either (a) this framework has rejected those resources or (b) those resources have been rescinded.Note that resources may be concurrently offered to more than one framework at a time (depending on the allocator being used). In that case, the first framework to launch tasks using those resources will be able to use them while the other frameworks will have those resources rescinded (or if a framework has already launched tasks with those resources then those tasks will fail with a TASK_LOST status and a message saying as much).*/ def resourceOffers( driver: SchedulerDriver, offers: JList[Offer]): Unit = { log.info("Scheduler.resourceOffers") // print and decline all received offers offers foreach { offer => log.info(offer.toString) driver declineOffer offer.getId } }
  • 52. Scheduler /** * Invoked when an offer is no longer valid (e.g., the slave was * lost or another framework used resources in the offer). If for * whatever reason an offer is never rescinded (e.g., dropped * message, failing over framework, etc.), a framwork that attempts * to launch tasks using an invalid offer will receive TASK_LOST * status updats for those tasks. */ def offerRescinded( driver: SchedulerDriver, offerId: OfferID): Unit = { log.info("Scheduler.offerRescinded [%s]" format offerId.getValue) }
  • 53. Scheduler /** * Invoked when the status of a task has changed (e.g., a slave is * lost and so the task is lost, a task finishes and an executor * sends a status update saying so, etc). Note that returning from * this callback _acknowledges_ receipt of this status update! If * for whatever reason the scheduler aborts during this callback (or * the process exits) another status update will be delivered (note, * however, that this is currently not true if the slave sending the * status update is lost/fails during that time). */ def statusUpdate( driver: SchedulerDriver, status: TaskStatus): Unit = { log.info("Scheduler.statusUpdate:n%s" format status) }
  • 54. Scheduler /** * Invoked when an executor sends a message. These messages are best * effort; do not expect a framework message to be retransmitted in * any reliable fashion. */ def frameworkMessage( driver: SchedulerDriver, executorId: ExecutorID, slaveId: SlaveID, data: Array[Byte]): Unit = { log.info("Scheduler.frameworkMessage") }
  • 55. Scheduler ** * Invoked when the scheduler becomes "disconnected" from the master * (e.g., the master fails and another is taking over). */ def disconnected(driver: SchedulerDriver): Unit = { log.info("Scheduler.disconnected") } /** * Invoked when a slave has been determined unreachable (e.g., * machine failure, network partition). Most frameworks will need to * reschedule any tasks launched on this slave on a new slave. */ def slaveLost( driver: SchedulerDriver, slaveId: SlaveID): Unit = { log.info("Scheduler.slaveLost: [%s]" format slaveId.getValue) }
  • 56. Scheduler /** * Invoked when an executor has exited/terminated. Note that any * tasks running will have TASK_LOST status updates automagically * generated. */ def executorLost( driver: SchedulerDriver,executorId: ExecutorID, slaveId: SlaveID, status: Int): Unit = { log.info("Scheduler.executorLost: [%s]" format executorId.getValue) } /** * Invoked when there is an unrecoverable error in the scheduler or * scheduler driver. The driver will be aborted BEFORE invoking this * callback. */ def error(driver: SchedulerDriver, message: String): Unit = { log.info("Scheduler.error: [%s]" format message) }
  • 57. Executor /** * Invoked once the executor driver has been able to successfully * connect with Mesos. In particular, a scheduler can pass some * data to it's executors through the ExecutorInfo.data * field. */ def registered( driver: ExecutorDriver, executorInfo: ExecutorInfo, frameworkInfo: FrameworkInfo, slaveInfo: SlaveInfo): Unit = { log.info("Executor.registered") }
  • 58. Executor /** * Invoked when the executor re-registers with a restarted slave. */ def reregistered( driver: ExecutorDriver, slaveInfo: SlaveInfo): Unit = { log.info("Executor.reregistered") } /** Invoked when the executor becomes "disconnected" from the slave * (e.g., the slave is being restarted due to an upgrade). */ def disconnected(driver: ExecutorDriver): Unit = { log.info("Executor.disconnected") }
  • 59. Executor /** * Invoked when a task has been launched on this executor (initiated * via Scheduler.launchTasks. Note that this task can be * realized with a thread, a process, or some simple computation, * however, no other callbacks will be invoked on this executor * until this callback has returned. */ def launchTask(driver: ExecutorDriver, task: TaskInfo): Unit = { log.info("Executor.launchTask") }
  • 60. Executor /** * Invoked when a task running within this executor has been killed * (via SchedulerDriver.killTask). Note that no status * update will be sent on behalf of the executor, the executor is * responsible for creating a new TaskStatus (i.e., with * TASK_KILLED) and invoking ExecutorDriver.sendStatusUpdate. */ def killTask(driver: ExecutorDriver, taskId: TaskID): Unit = { log.info("Executor.killTask") }
  • 61. Executor /** * Invoked when a framework message has arrived for this * executor. These messages are best effort; do not expect a * framework message to be retransmitted in any reliable fashion. */ def frameworkMessage(driver: ExecutorDriver, data: Array[Byte]): Unit = { log.info("Executor.frameworkMessage") }
  • 62. Executor /** * Invoked when the executor should terminate all of it's currently * running tasks. Note that after a Mesos has determined that an * executor has terminated any tasks that the executor did not send * terminal status updates for (e.g., TASK_KILLED, TASK_FINISHED, * TASK_FAILED, etc) a TASK_LOST status update will be created. */ def shutdown(driver: ExecutorDriver): Unit = { log.info("Executor.shutdown") }
  • 63. Executor /** * Invoked when a fatal error has occured with the executor and/or * executor driver. The driver will be aborted BEFORE invoking this * callback. */ def error(driver: ExecutorDriver, message: String): Unit = { log.info("Executor.error") }
  • 64. Questions? /******************************************* Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop ********************************************/