© 2019 Ververica
Getting Started with Apache Flink® on Kubernetes
David Anderson | @alpinegizmo | Training Coordinator
About Ververica
Creators of Apache Flink®
Real Time Stream Processing for the Enterprise
Outline
1. Introduction
2. Detailed Example
3. Debugging Tips
4. Future Plans
Why Containers?
• Containers provide isolation at low cost
– Require fewer resources than VMs
– Smaller, boot faster
• Simpler to manage
– Each container does one thing
– Consistent packaging
• Enables flexible and dynamic resource allocation
– Scalable
– Composable
Container Orchestration with Kubernetes
• Declarative configuration:
– You tell K8s the desired state, and a background process makes it happen
• 3 replicas of this container should be kept running
• A load balancer should exist, listening on port 443, backed by the containers with this label
• Core resource types:
– Pod: a group of one or more containers
– Job: keeps pod(s) running until finished
– Deployment: keeps n pods running indefinitely
– Service: a REST object backed by a set of pods
– Persistent Volume Claim: storage whose lifetime is not coupled to any of the pods
Vision: Flink as a Library
• Makes deployments simpler
– Focus is on deploying/running an application
– You build one, complete job-specific Docker image that includes:
• Your application code
• Flink libraries
• Other dependencies
• Configuration files
Flink’s Runtime Building Blocks
ResourceManager
• Cluster framework-specific
• Manages available TaskManagers
• Acquires / releases resources
TaskManager
• Registers with ResourceManager
• Provides “task slots”
• Assigned tasks by one or more JobManagers
JobManager
• One per job
• Schedules job in terms of "task slots"
• Monitors task execution
• Coordinates checkpointing
Dispatcher
• Touch-point for job submissions
• Spawns JobManagers
Runtime Building Blocks (on Yarn)
(1) App/Client → Dispatcher: Submit Job
(2) Dispatcher: Start JobManager
(3) JobManager → ResourceManager: Request slots
(4) ResourceManager: Start TaskManager
(5) TaskManager → ResourceManager: Register
(6) TaskManager → JobManager: Offer slots
(7) JobManager → TaskManager: Deploy Tasks
But we’re not quite there yet with K8s
Flink on K8s: current status
• Still using the legacy standalone resource manager
• Deployment establishes a static execution environment
• You will have a k8s manifest that effectively says
– there should be n taskmanagers that look like this
Flink is not aware of Kubernetes
Flink job cluster on K8s
Master Container: ResourceManager, JobManager, Mini Dispatcher
Worker Containers: each runs a TaskManager
(0) One image is built that can be either a Master or Worker
(1) Container framework starts Master & Worker Containers
(2) Run & Start
(3) Register
(4) Deploy Tasks
2. EXAMPLE
https://github.com/alpinegizmo/flink-containers-example
Very Simple Streaming Job
data generator → keyBy → RichFlatMap (# events per user) → print
Desired Runtime Landscape for K8s
Steps
1. Build the docker image
2. Set up job cluster (k8s job) & task managers (k8s deployment)
3. Set up job cluster service
4. Add minio for checkpoints
1: Build a docker image
Dockerfile
ADD $flink_dist $FLINK_INSTALL_PATH
ADD $job_jar $FLINK_INSTALL_PATH/job.jar
. . .
COPY docker/flink/flink-conf.yaml $FLINK_HOME/conf
COPY docker/flink/log4j-console.properties $FLINK_HOME/conf
COPY docker/flink/docker-entrypoint.sh /
. . .
ENTRYPOINT ["/docker-entrypoint.sh"]
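docker-entrypoint.sh
. . .
JOB_CLUSTER="job-cluster"
TASK_MANAGER="task-manager"
# The first argument selects the role; the remaining arguments are passed through.
CMD="$1"
shift;
if [ "${CMD}" == "${JOB_CLUSTER}" -o "${CMD}" == "${TASK_MANAGER}" ]; then
    if [ "${CMD}" == "${TASK_MANAGER}" ]; then
        # Worker: run a TaskManager in the foreground
        exec $FLINK_HOME/bin/taskmanager.sh start-foreground "$@"
    else
        # Master: run the standalone job cluster in the foreground
        exec $FLINK_HOME/bin/standalone-job.sh start-foreground "$@"
    fi
fi
# Fallback: run the remaining arguments as a command (the first was shifted off)
exec "$@"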
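To see how the container arguments map onto Flink's start scripts, the dispatch can be traced without starting anything. The following is only a sketch: resolve_command is a hypothetical helper that prints the command instead of exec-ing it, and /opt/flink is an assumed install path, not one taken from the example repository.

```shell
#!/bin/sh
# Dry-run sketch of the dispatch in docker-entrypoint.sh: print the command
# instead of exec-ing it. resolve_command is a hypothetical helper.
resolve_command() {
    cmd="$1"
    shift
    case "$cmd" in
        job-cluster)
            echo "$FLINK_HOME/bin/standalone-job.sh start-foreground $*" ;;
        task-manager)
            echo "$FLINK_HOME/bin/taskmanager.sh start-foreground $*" ;;
        *)
            # As in the entrypoint script, the first argument has already
            # been shifted off, so only the remaining arguments are run.
            echo "$*" ;;
    esac
}

FLINK_HOME=/opt/flink   # assumed install path, for illustration only
resolve_command task-manager -Djobmanager.rpc.address=flink-job-cluster
resolve_command job-cluster -Dblob.server.port=6124
```

The first argument plays the same role as the first entry of args in a Kubernetes manifest: it picks master or worker, and everything after it is handed to the Flink script unchanged.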
2: K8s manifests
task-manager-deployment.yaml.template
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: flink-task-manager
spec:
  replicas: ${FLINK_NUM_OF_TASKMANAGERS}
  template:
    metadata:
      labels:
        app: flink
        component: task-manager
    spec:
      containers:
      - name: flink-task-manager
        image: ${FLINK_IMAGE_NAME}
        imagePullPolicy: Never
        args: ["task-manager",
               "-Djobmanager.rpc.address=flink-job-cluster"]
job-cluster-job.yaml.template
apiVersion: batch/v1
kind: Job
metadata:
  name: flink-job-cluster
spec:
  template:
    metadata:
      labels:
        app: flink
        component: job-cluster
    spec:
      restartPolicy: OnFailure
      containers:
      - name: flink-job-cluster
        image: ${FLINK_IMAGE_NAME}
        imagePullPolicy: Never
        args: ["job-cluster",
               "-Djobmanager.rpc.address=flink-job-cluster",
               "-Dblob.server.port=6124",
               "-Dqueryable-state.server.ports=6125"]
        ports:
        - containerPort: 6123
          name: rpc
        - containerPort: 6124
          name: blob
        - containerPort: 6125
          name: query
        - containerPort: 8081
          name: ui
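Both files are templates: ${FLINK_IMAGE_NAME} and ${FLINK_NUM_OF_TASKMANAGERS} have to be filled in before Kubernetes can use them. One possible way to do that is sketched below; render_template is a hypothetical helper and the image name flink-job:latest is made up, so the example repository may well do this differently.

```shell
#!/bin/sh
# Sketch: fill the two placeholders used by the manifest templates.
# render_template is a hypothetical helper, not part of the example repository.
# Usage: render_template IMAGE_NAME NUM_TASKMANAGERS < template > manifest
render_template() {
    # Limitation: the values must not contain '|', '&' or '\'.
    sed -e 's|\${FLINK_IMAGE_NAME}|'"$1"'|g' \
        -e 's|\${FLINK_NUM_OF_TASKMANAGERS}|'"$2"'|g'
}

# Demonstration on a two-line stand-in for a template
printf 'image: ${FLINK_IMAGE_NAME}\nreplicas: ${FLINK_NUM_OF_TASKMANAGERS}\n' |
    render_template flink-job:latest 2
```

With a cluster available, the rendered output could be piped straight into kubectl, for example: render_template flink-job:latest 2 < task-manager-deployment.yaml.template | kubectl apply -f -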
3: Expose job cluster as a service
job-cluster-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: flink-job-cluster
  labels:
    app: flink
    component: job-cluster
spec:
  ports:
  # internal ports
  - name: rpc
    port: 6123
  - name: blob
    port: 6124
  # external ports
  - name: query
    port: 6125
    nodePort: 30025
  - name: ui
    port: 8081
    nodePort: 30081
  type: NodePort
  selector:
    app: flink
    component: job-cluster
4: Set up minio for checkpoints & savepoints
• S3-compatible storage service
• Apache License v2.0
• Lightweight, easy to set up
minio-standalone-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio-pv-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
minio-standalone-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: minio
spec:
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: minio
    spec:
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: minio-pv-claim
      containers:
      - name: minio
        volumeMounts:
        - name: data
          mountPath: "/data"
        image: minio/minio:RELEASE.2019-03-13T21-59-47Z
        args:
        - server
        - /data
        env:
        - name: MINIO_ACCESS_KEY
          value: "minio"
        - name: MINIO_SECRET_KEY
          value: "minio123"
        ports:
        - containerPort: 9000
        livenessProbe:
          httpGet:
            path: /minio/health/live
            port: 9000
          initialDelaySeconds: 120
          periodSeconds: 20
minio-standalone-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: minio-service
spec:
  type: NodePort
  ports:
    - port: 9000
      nodePort: 30090
  selector:
    app: minio
flink-conf.yaml
s3.path-style-access: true
s3.endpoint: http://minio-service:9000
minio setup job
/bin/sh -c "
sleep 10;
/usr/bin/mc config host add myminio http://minio-service:9000 minio minio123;
/usr/bin/mc mb myminio/state;
exit 0;
"
flink-conf.yaml
state.checkpoints.dir: s3://state/checkpoints
state.savepoints.dir: s3://state/savepoints
s3.access-key: minio
s3.secret-key: minio123
A Note on Bucket Addresses
• Two ways to specify buckets:
– virtual-hosted style: state.minio-service:9000
– path-style: minio-service:9000/state
• Path-style addresses are easier to get working; either
– set s3.path-style-access: true (requires Flink 1.8+), or
– specify the endpoint by its IP address rather than its hostname
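The two address styles differ only in where the bucket name goes. A small sketch makes this concrete; bucket_url is a hypothetical helper written for illustration, not part of Flink or of the mc client.

```shell
#!/bin/sh
# Sketch: build a bucket address in either style.
# bucket_url is a hypothetical helper. Usage: bucket_url STYLE ENDPOINT BUCKET
bucket_url() {
    case "$1" in
        path)    printf '%s/%s\n' "$2" "$3" ;;   # bucket is a path segment
        virtual) printf '%s.%s\n' "$3" "$2" ;;   # bucket is part of the hostname
        *)       echo "unknown style: $1" >&2; return 1 ;;
    esac
}

bucket_url path minio-service:9000 state
bucket_url virtual minio-service:9000 state
```

In the virtual-hosted form the bucket becomes part of the hostname, so that hostname must resolve. A likely reason path-style is the easier option here is that Kubernetes DNS resolves the service name minio-service but not a bucket-prefixed variant of it.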
Rescaling
$ kubectl scale deployment -l component=task-manager --replicas=2
deployment.extensions "flink-task-manager" scaled
$ flink modify 00000000000000000000000000000000 -p 8 -m localhost:30081
Modify job 00000000000000000000000000000000.
Rescaled job 00000000000000000000000000000000. Its new parallelism is 8.
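The replica count and the new parallelism are related by simple arithmetic. Assuming each TaskManager offers 4 task slots (the taskmanager.numberOfTaskSlots value that appears in the configuration logged in the debugging section) and that the job needs one slot per unit of parallelism, which is the case with Flink's default slot sharing, the number of TaskManagers is the parallelism divided by the slots per TaskManager, rounded up. A minimal sketch; required_taskmanagers is a hypothetical helper, not a Flink or kubectl command:

```shell
#!/bin/sh
# Sketch: how many TaskManagers are needed for a given parallelism?
# required_taskmanagers is a hypothetical helper.
# $1 = desired parallelism, $2 = task slots per TaskManager
required_taskmanagers() {
    # Integer ceiling division: (a + b - 1) / b
    echo $(( ($1 + $2 - 1) / $2 ))
}

required_taskmanagers 8 4   # prints 2
```

This matches the commands above: two replicas with four slots each provide the eight slots that a parallelism of 8 requires.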
3. DEBUGGING
docker-entrypoint.sh
. . .
JOB_CLUSTER="job-cluster"
TASK_MANAGER="task-manager"
if [ "${CMD}" == "${JOB_CLUSTER}" -o "${CMD}" == "${TASK_MANAGER}" ]; then
    echo "Starting the ${CMD}"
    # Log the effective configuration: every non-empty line that is not a comment
    echo "config file: " && grep '^[^#]' $FLINK_HOME/conf/flink-conf.yaml
    if [ "${CMD}" == "${TASK_MANAGER}" ]; then
        exec $FLINK_HOME/bin/taskmanager.sh start-foreground "$@"
    else
        exec $FLINK_HOME/bin/standalone-job.sh start-foreground "$@"
    fi
fi
exec "$@"
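The idea behind the extra echo lines is to log only the settings that are actually in effect, skipping comments and blank lines. A standalone sketch of such a filter, run on made-up sample input, is shown below; effective_config is a hypothetical helper and is slightly stricter than the one-line grep in the script, since it also drops indented comments and whitespace-only lines.

```shell
#!/bin/sh
# Sketch: keep only the lines of a config file that carry settings.
# effective_config is a hypothetical helper; it reads from stdin.
effective_config() {
    # Drop comment lines and blank lines, keep everything else
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$'
}

# Made-up sample input for illustration
printf '# a comment\n\nparallelism.default: 1\nrest.port: 8081\n' | effective_config
```

Printing the filtered file at container start makes it easy to confirm from kubectl logs which configuration was baked into the image.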
logs
Starting the job-cluster
config file:
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 4
parallelism.default: 1
high-availability: zookeeper
high-availability.jobmanager.port: 6123
high-availability.storageDir: s3://highavailability/storage
high-availability.zookeeper.quorum: zoo1:2181
state.backend: filesystem
state.checkpoints.dir: s3://state/checkpoints
state.savepoints.dir: s3://state/savepoints
rest.port: 8081
zookeeper.sasl.disable: true
s3.access-key: minio
s3.secret-key: minio123
s3.path-style-access: true
s3.endpoint: http://minio-service:9000
4. FUTURE PLANS
Tighter Integration with K8s
• Active mode
– Flink is aware of the cluster manager that it is running on, and interacts with it
– Examples exist, e.g., FLIP-6 YARN
• Reactive mode
– Flink is oblivious to its environment
– Flink may react to resource changes by rescaling the job
Active k8s Integration
Components: Client, K8s deployment controller, ApplicationMaster, K8sResourceManager, JobManager, TaskManagers
(1) Submit AM deployment
(2) Start AM pod
(3) Submit job
(4) Start JM
(5) Request slots
(6) Submit TM deployment
(7) Start TM pod
(8) Register
(9) Request slots
(10) Offer slots
FLINK-9953: Active Kubernetes integration
The ResourceManager can talk to Kubernetes to launch new pods
Reactive Container Mode
• Relies on external system to start/release TaskManagers, e.g.,
– Kubernetes Horizontal Pod Autoscaler
– GCP Autoscaling
– AWS Auto Scaling Group
• Re-scale job as resources are added/removed (take savepoint and resume job with new parallelism automatically)
• By definition works with all cluster managers
[Figure: an ASG monitors metrics of the Flink cluster (JM, TM, TM), e.g., CPU%, and starts a new TM if CPU% > threshold; the new TM registers and offers slots. Inset plot: event rate over time.]
FLINK-10407: Reactive container mode
Re-scale job as resources are added/removed
Summary
• Flink currently supports job and session clusters on K8s
• Example
– https://github.com/alpinegizmo/flink-containers-example
• Active K8s integration is in progress
• Reactive container mode has been designed/planned
• Call to action:
– Umbrella tickets: FLINK-9953, FLINK-10407
– Join discussions on dev@flink.apache.org
Thank you!
Questions?
www.ververica.com | @VervericaData | david@ververica.com

 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 

Deploying Flink on Kubernetes - David Anderson

  • 1. © 2019 Ververica David Anderson | @alpinegizmo | Training Coordinator Getting Started with Apache Flink® on Kubernetes
  • 2. About Ververica
      • Creators of Apache Flink®
      • Real Time Stream Processing for the Enterprise
  • 3. Outline
      1. Introduction
      2. Detailed Example
      3. Debugging Tips
      4. Future Plans
  • 4. Why Containers?
      • Containers provide isolation at low cost
          – Require fewer resources than VMs
          – Smaller, boot faster
      • Simpler to manage
          – Each container does one thing
          – Consistent packaging
      • Enables flexible and dynamic resource allocation
          – Scalable
          – Composable
  • 5. Container Orchestration with Kubernetes
      • Declarative configuration:
          – You tell K8s the desired state, and a background process makes it happen
              • 3 replicas of this container should be kept running
              • A load balancer should exist, listening on port 443, backed by container with this label
      • Core resource types:
          – Pod: a group of one or more containers
          – Job: keeps pod(s) running until finished
          – Deployment: keeps n pods running indefinitely
          – Service: a REST object backed by a set of pods
          – Persistent Volume Claim: storage whose lifetime is not coupled to any of the pods
  • 6. Vision: Flink as a Library
      • Makes deployments simpler
          – Focus is on deploying/running an application
          – You build one, complete job-specific Docker image that includes:
              • Your application code
              • Flink libraries
              • Other dependencies
              • Configuration files
  • 7. Flink’s Runtime Building Blocks
      • ResourceManager
          – Cluster framework-specific
          – Manages available TaskManagers
          – Acquires / releases resources
      • TaskManager
          – Registers with ResourceManager
          – Provides “task slots”
          – Assigned tasks by one or more JobManagers
      • JobManager
          – One per job
          – Schedules job in terms of "task slots"
          – Monitors task execution
          – Coordinates checkpointing
      • Dispatcher
          – Touch-point for job submissions
          – Spawns JobManagers
  • 9. Runtime Building Blocks (on Yarn)
      [Diagram] Components: App/Client, Dispatcher, JobManager, ResourceManager, TaskManager
      (1) Submit Job
      (2) Start JobManager
      (3) Request slots
      (4) Start TaskManager
      (5) Register
      (6) Offer slots
      (7) Deploy Tasks
  • 10. But we’re not quite there yet with K8s
  • 11. Flink on K8s: current status
      • Still using the legacy standalone resource manager
      • Deployment establishes a static execution environment
      • You will have a k8s manifest that effectively says
          – there should be n taskmanagers that look like this
      Flink is not aware of Kubernetes
  • 12. Flink job cluster on K8s
      [Diagram] Master Container (ResourceManager, JobManager, Mini Dispatcher) and three Worker Containers (one TaskManager each)
      (0) One image is built that can be either a Master or Worker
      (1) Container framework starts Master & Worker Containers
      (2) Run & Start
      (3) Register
      (4) Deploy Tasks
  • 13. 2. EXAMPLE
      https://github.com/alpinegizmo/flink-containers-example
  • 14. Very Simple Streaming Job
      https://github.com/alpinegizmo/flink-containers-example
      [Diagram] data generator → keyBy → RichFlatMap (# events per user) → print
  • 16. Desired Runtime Landscape for K8s
  • 17. Steps
      1. Build the docker image
      2. Set up job cluster (k8s job) & task managers (k8s deployment)
      3. Set up job cluster service
      4. Add minio for checkpoints
  • 18. 1: Build a docker image
      Dockerfile:
          ADD $flink_dist $FLINK_INSTALL_PATH
          ADD $job_jar $FLINK_INSTALL_PATH/job.jar
          . . .
          COPY docker/flink/flink-conf.yaml $FLINK_HOME/conf
          COPY docker/flink/log4j-console.properties $FLINK_HOME/conf
          COPY docker/flink/docker-entrypoint.sh /
          . . .
          ENTRYPOINT ["/docker-entrypoint.sh"]
  • 19. docker-entrypoint.sh:
          . . .
          JOB_CLUSTER="job-cluster"
          TASK_MANAGER="task-manager"

          CMD="$1"
          shift;

          if [ "${CMD}" == "${JOB_CLUSTER}" -o "${CMD}" == "${TASK_MANAGER}" ]; then
              if [ "${CMD}" == "${TASK_MANAGER}" ]; then
                  exec $FLINK_HOME/bin/taskmanager.sh start-foreground "$@"
              else
                  exec $FLINK_HOME/bin/standalone-job.sh start-foreground "$@"
              fi
          fi

          exec "$@"
  • 20. 2: K8s manifests
      task-manager-deployment.yaml.template:
          apiVersion: extensions/v1beta1
          kind: Deployment
          metadata:
            name: flink-task-manager
          spec:
            replicas: ${FLINK_NUM_OF_TASKMANAGERS}
            template:
              metadata:
                labels:
                  app: flink
                  component: task-manager
              spec:
                containers:
                - name: flink-task-manager
                  image: ${FLINK_IMAGE_NAME}
                  imagePullPolicy: Never
                  args: ["task-manager", "-Djobmanager.rpc.address=flink-job-cluster"]
      job-cluster-job.yaml.template:
          apiVersion: batch/v1
          kind: Job
          metadata:
            name: flink-job-cluster
          spec:
            template:
              metadata:
                labels:
                  app: flink
                  component: job-cluster
              spec:
                restartPolicy: OnFailure
                containers:
                - name: flink-job-cluster
                  image: ${FLINK_IMAGE_NAME}
                  imagePullPolicy: Never
                  args: ["job-cluster", "-Djobmanager.rpc.address=flink-job-cluster", "-Dblob.server.port=6124", "-Dqueryable-state.server.ports=6125"]
                  ports:
                  - containerPort: 6123
                    name: rpc
                  - containerPort: 6124
                    name: blob
                  - containerPort: 6125
                    name: query
                  - containerPort: 8081
                    name: ui
  • 24. 3: Expose job cluster as a service
      job-cluster-service.yaml:
          apiVersion: v1
          kind: Service
          metadata:
            name: flink-job-cluster
            labels:
              app: flink
              component: job-cluster
          spec:
            ports:
            - name: rpc
              port: 6123
            - name: blob
              port: 6124
            - name: query
              port: 6125
              nodePort: 30025
            - name: ui
              port: 8081
              nodePort: 30081
            type: NodePort
            selector:
              app: flink
              component: job-cluster
      Slide annotations: "internal ports", "external ports"
  • 26. 4: Set up minio for checkpoints & savepoints
      • S3-compatible storage service
      • Apache License v2.0
      • Lightweight, easy to set up
  • 27. minio-standalone-pvc.yaml:
          apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: minio-pv-claim
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
  • 28. minio-standalone-deployment.yaml:
          apiVersion: extensions/v1beta1
          kind: Deployment
          metadata:
            name: minio
          spec:
            strategy:
              type: Recreate
            template:
              metadata:
                labels:
                  app: minio
              spec:
                volumes:
                - name: data
                  persistentVolumeClaim:
                    claimName: minio-pv-claim
                containers:
                - name: minio
                  volumeMounts:
                  - name: data
                    mountPath: "/data"
                  image: minio/minio:RELEASE.2019-03-13T21-59-47Z
                  args:
                  - server
                  - /data
                  env:
                  - name: MINIO_ACCESS_KEY
                    value: "minio"
                  - name: MINIO_SECRET_KEY
                    value: "minio123"
                  ports:
                  - containerPort: 9000
                  livenessProbe:
                    httpGet:
                      path: /minio/health/live
                      port: 9000
                    initialDelaySeconds: 120
                    periodSeconds: 20
  • 29. minio-standalone-service.yaml:
          apiVersion: v1
          kind: Service
          metadata:
            name: minio-service
          spec:
            type: NodePort
            ports:
            - port: 9000
              nodePort: 30090
            selector:
              app: minio
      flink-conf.yaml:
          s3.path-style-access: true
          s3.endpoint: http://minio-service:9000
  • 30. minio setup job:
          /bin/sh -c "
          sleep 10;
          /usr/bin/mc config host add myminio http://minio-service:9000 minio minio123;
          /usr/bin/mc mb myminio/state;
          exit 0;
          "
      flink-conf.yaml:
          state.checkpoints.dir: s3://state/checkpoints
          state.savepoints.dir: s3://state/savepoints
          s3.access-key: minio
          s3.secret-key: minio123
  • 31. A Note on Bucket Addresses
      • Two ways to specify buckets:
          – virtual-hosted style: state.minio-service:9000
          – path-style: minio-service:9000/state
      • It’s easier to get path-style addresses working, by either
          – setting s3.path-style-access: true (requires Flink 1.8+), or
          – specifying the endpoint with its IP address, rather than hostname
  • 34. Rescaling
          $ kubectl scale deployment -l component=task-manager --replicas=2
          deployment.extensions "flink-task-manager" scaled

          $ flink modify 00000000000000000000000000000000 -p 8 -m localhost:30081
          Modify job 00000000000000000000000000000000.
          Rescaled job 00000000000000000000000000000000. Its new parallelism is 8.
  • 35. 3. DEBUGGING
  • 36. docker-entrypoint.sh:
          . . .
          JOB_CLUSTER="job-cluster"
          TASK_MANAGER="task-manager"

          if [ "${CMD}" == "${JOB_CLUSTER}" -o "${CMD}" == "${TASK_MANAGER}" ]; then
              echo "Starting the ${CMD}"
              echo "config file: " && grep '^[^#]' $FLINK_HOME/conf/flink-conf.yaml
              if [ "${CMD}" == "${TASK_MANAGER}" ]; then
                  exec $FLINK_HOME/bin/taskmanager.sh start-foreground "$@"
              else
                  exec $FLINK_HOME/bin/standalone-job.sh start-foreground "$@"
              fi
          fi

          exec "$@"
  • 37. logs:
          Starting the job-cluster
          config file:
          jobmanager.rpc.address: localhost
          jobmanager.rpc.port: 6123
          jobmanager.heap.size: 1024m
          taskmanager.heap.size: 1024m
          taskmanager.numberOfTaskSlots: 4
          parallelism.default: 1
          high-availability: zookeeper
          high-availability.jobmanager.port: 6123
          high-availability.storageDir: s3://highavailability/storage
          high-availability.zookeeper.quorum: zoo1:2181
          state.backend: filesystem
          state.checkpoints.dir: s3://state/checkpoints
          state.savepoints.dir: s3://state/savepoints
          rest.port: 8081
          zookeeper.sasl.disable: true
          s3.access-key: minio
          s3.secret-key: minio123
          s3.path-style-access: true
          s3.endpoint: http://minio-service:9000
  • 39. 4. FUTURE PLANS
  • 40. Tighter Integration with K8s
      • Active mode
          – Flink is aware of the cluster manager that it is running on, and interacts with it
          – Examples exist, e.g., FLIP-6 YARN
      • Reactive mode
          – Flink is oblivious to its environment
          – Flink may react to resource changes by scaling the job
  • 41. Active k8s Integration
      [Diagram] Components: Client, K8s deployment controller, ApplicationMaster (K8sResourceManager, JobManager), TaskManagers
      (1) Submit AM deployment
      (2) Start AM pod
      (3) Submit job
      (4) Start JM
      (5) Request slots
      (6) Submit TM deployment
      (7) Start TM pod
      (8) Register
      (9) Request slots
      (10) Offer slots
  • 42. FLINK-9953: Active Kubernetes integration
      The ResourceManager can talk to Kubernetes to launch new pods
  • 43. Reactive Container Mode
      • Relies on external system to start/release TaskManagers, e.g.,
          – Kubernetes Horizontal Pod Autoscaler
          – GCP Autoscaling
          – AWS Auto Scaling Group
      • Re-scale job as resources are added/removed (take savepoint and resume job with new parallelism automatically)
      • By definition works with all cluster managers
      [Diagram] Flink cluster (JM, TM, TM) and an ASG. Labels: Monitor metrics, e.g., CPU%; Start new TM if CPU% > threshold; Register & offer slots. Plot: event rate over time.
  • 44. FLINK-10407: Reactive container mode
      Re-scale job as resources are added/removed
  • 45. Summary
      • Flink currently supports job and session clusters on K8s
      • Example
          – https://github.com/alpinegizmo/flink-containers-example
      • Active K8s integration is in progress
      • Reactive container mode has been designed/planned
      • Call to action:
          – Umbrella tickets: FLINK-9953, FLINK-10407
          – Join discussions on dev@flink.apache.org
  • 46. Thank you!
  • 47. Questions?
  • 48. www.ververica.com | @VervericaData | david@ververica.com
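
The manifests on slide 20 are templates containing the placeholders `${FLINK_IMAGE_NAME}` and `${FLINK_NUM_OF_TASKMANAGERS}`. The slides do not show how they are rendered, so the following is only one possible approach: a hypothetical `render_template` helper built on `sed` that fills in the two placeholders from shell variables. The image name `flink-job:latest` is an assumed example value.

```shell
#!/bin/sh
# Sketch only: render_template is a hypothetical helper, not part of the
# example repository. It reads a template on stdin and replaces the two
# placeholders used by the manifests on slide 20.
render_template() {
    sed -e 's|\${FLINK_IMAGE_NAME}|'"$FLINK_IMAGE_NAME"'|g' \
        -e 's|\${FLINK_NUM_OF_TASKMANAGERS}|'"$FLINK_NUM_OF_TASKMANAGERS"'|g'
}

# Assumed example values.
FLINK_IMAGE_NAME="flink-job:latest"
FLINK_NUM_OF_TASKMANAGERS=2

# Render a two-line excerpt of the Deployment template.
printf 'replicas: ${FLINK_NUM_OF_TASKMANAGERS}\nimage: ${FLINK_IMAGE_NAME}\n' | render_template
```

Against a real template file, the output would typically be piped on, for example `render_template < task-manager-deployment.yaml.template | kubectl create -f -`. Where it is installed, `envsubst` from GNU gettext is a common alternative to the `sed` expressions.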
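
Slide 31 distinguishes two bucket address styles. As a rough illustration, the two forms can be built from the same endpoint and bucket name. The function names below are hypothetical, and the `http://` scheme is taken from the `s3.endpoint` value shown in the slides.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and just format strings.
# Arguments: ENDPOINT (host:port) and BUCKET.

# Path-style: the bucket is the first path segment after the endpoint.
path_style_url() {
    echo "http://$1/$2"
}

# Virtual-hosted style: the bucket becomes a prefix of the host name.
virtual_hosted_url() {
    echo "http://$2.$1"
}

path_style_url minio-service:9000 state
virtual_hosted_url minio-service:9000 state
```

Virtual-hosted style only works if a host name such as `state.minio-service` resolves, which is presumably why the slides recommend path-style access for this setup.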
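
The rescaling example on slide 34 pairs `--replicas=2` with `-p 8`. With `taskmanager.numberOfTaskSlots: 4` from the configuration logged on slide 37, two TaskManagers offer exactly eight slots. A small sketch of that arithmetic follows, assuming default slot sharing, so that a job needs as many slots as its parallelism. The helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for the slot arithmetic behind slide 34.

# total_slots REPLICAS SLOTS_PER_TASKMANAGER
total_slots() {
    echo $(( $1 * $2 ))
}

# can_run PARALLELISM REPLICAS SLOTS_PER_TASKMANAGER
# Succeeds when the requested parallelism fits into the available slots
# (assumption: default slot sharing, one slot per parallel pipeline).
can_run() {
    [ "$1" -le "$(total_slots "$2" "$3")" ]
}

# Values from the slides: 2 replicas, 4 slots each, parallelism 8.
if can_run 8 2 4; then
    echo "parallelism 8 fits into $(total_slots 2 4) slots"
fi
```

Under the same assumption, asking for a parallelism of 9 with this setup would leave the job short of slots until a third TaskManager is added.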
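
The debugging step on slide 36 prints the active lines of `flink-conf.yaml` when the container starts. The same idea can be sketched as a standalone function that also drops blank lines and indented comments. `show_effective_config` is a hypothetical name, and the sample file reuses keys from the logged configuration on slide 37.

```shell
#!/bin/sh
# Sketch only: show_effective_config is a hypothetical helper.
# It prints every line of the given file that is neither blank nor a comment.
show_effective_config() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Demonstrate on a small sample file.
sample=$(mktemp)
printf '# RPC settings\njobmanager.rpc.port: 6123\n\n  # indented comment\nrest.port: 8081\n' > "$sample"
show_effective_config "$sample"
rm -f "$sample"
```

Printing the effective configuration at startup makes it visible in `kubectl logs`, which helps confirm that the image contains the intended settings.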