14. Data Persistence challenges
Basic provisioning can be done statically or with some automation. But what about:
● Ephemeral nodes
● Capacity-based scheduling
● Reuse of disks
● Unavailability of disks or disk space
○ Auto provisioning of disks, then LocalPV
● Scaling down
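Static provisioning, the baseline the challenges above improve on, means an administrator pre-creates a PersistentVolume per disk by hand. A minimal sketch of such a PV (the name, path, capacity, and node name are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-disk-1                # hypothetical; one PV per physical disk
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-disks      # illustrative class name
  local:
    path: /mnt/disks/ssd1            # disk mount point on the node
  nodeAffinity:                      # pin the PV to the node holding the disk
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["node-1"]
```

Every new disk, node replacement, or scale-down event means editing manifests like this by hand, which is exactly what the list above makes painful.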
23. So MayaData put the storage array inside K8s with OpenEBS, and now devs can do their own storage management.
24. OpenEBS
• Most active community in the Container Attached Storage space
• GitHub: https://github.com/openebs/openebs
• Website: https://openebs.io/
• Slack: https://slack.openebs.io
• Twitter: https://twitter.com/openebs
• Overall 350+ Code contributors (https://devstats.openebs.io/)
• 1400+ Slack Members, 600+ Forks, 6000+ stars
• 1.0 released in June
• Deployed in 100s of clusters every week.
• Kubernetes Native, 100% Userspace
25. OpenEBS NDM
● NDM runs as a daemonset and maintains the block devices as CRs
● The NDM operator links north-bound and south-bound interfaces
[Diagram: an application developer's PVC is served by the OpenEBS LocalPV provisioner, the OpenEBS cStor/Jiva volume provisioner, or the cStor Pool provisioner; all sit on the NDM operator, which tracks block devices as CRs in etcd and connects south-bound provisioners and CSI drivers (OpenStack, VSAN, OpenSDS, legacy NetApp/Pure, EBS/GPD/Azure Disks, any CSI driver) for auto provisioning of disks.]
● Complete disk management, including auto provisioning, to smooth out data ops
● Data mobility becomes easier with auto provisioning on remote clouds
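NDM represents each discovered disk as a BlockDevice custom resource that pool and volume provisioners can then claim. A sketch of what such a CR looks like (the name, node, path, and capacity values are illustrative):

```yaml
# BlockDevice CRs are created by NDM itself; users normally only read them,
# e.g. `kubectl get blockdevices -n openebs`.
apiVersion: openebs.io/v1alpha1
kind: BlockDevice
metadata:
  name: blockdevice-example          # hypothetical; NDM generates a unique name
  namespace: openebs
  labels:
    kubernetes.io/hostname: node-1   # node the disk is attached to
spec:
  path: /dev/sdb                     # device path on the node
  capacity:
    storage: 107374182400            # size in bytes (100 GiB)
status:
  claimState: Unclaimed              # becomes Claimed once a provisioner takes it
  state: Active
```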
26. OpenEBS Architecture
[Architecture diagram: users drive the K8s API via kubectl or a browser/terminal through the Maya API; NDM discovers the underlying block devices; the OpenEBS provisioner sets up disks through OpenEBS CRDs; PVC → PV → Pod chains are backed either by an OpenEBS cStor storage pool (with snapshot/clone support) or by OpenEBS LocalPV, spread across Nodes 1-4, with state kept in etcd.]
27. Summary - OpenEBS LocalPV
Node Disk Manager and a bunch of operators solve the following problems:
● Ephemeral nodes
● Capacity-based scheduling
● Reuse of disks
● Unavailability of disks or disk space
○ Auto provisioning of disks, then LocalPV
● Scaling down
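These problems are addressed through dynamic provisioning: a StorageClass hands PV creation to the OpenEBS LocalPV provisioner instead of an administrator. A sketch of a hostpath-backed LocalPV StorageClass (the base path is illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-hostpath
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: "hostpath"
      - name: BasePath
        value: "/var/openebs/local"  # illustrative base path for volume dirs
provisioner: openebs.io/local
# Bind only once the pod is scheduled, so the PV lands on the right node
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```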
30. Chaos Engineering
● Practice chaos engineering to increase resiliency
Resiliency = a good CI pipeline (functional tests + failure tests) + random chaos in staging/production
32. Cloud-Native environment
● My code is 1%; the rest is not controlled by me.
● Linux is the least dynamic part of the stack.
● The rest is all microservices-based and highly dynamic.
Then how do we achieve resilience? CHAOS ENGINEERING
33. Cloud-Native Chaos Engineering
[Diagram: for development, cloud-native applications are built against cloud-native APIs (Pod, Deployment, PVC, StatefulSet, SVC, CRDs); for chaos testing, which cloud-native APIs do we use?]
34. Cloud-Native Chaos Engineering
[Diagram: the same cloud-native APIs used for development (Pod, Deployment, PVC, StatefulSet, SVC, CRDs) are extended for chaos testing with new CRDs: ChaosEngine, ChaosExperiment, and ChaosResult.]
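These new CRDs are the Litmus approach: the user declares a ChaosEngine that binds an application to one or more ChaosExperiments, and outcomes are written to ChaosResult CRs. A sketch of a ChaosEngine for a generic pod-delete experiment (names, labels, and the service account are illustrative):

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: nginx-chaos                  # hypothetical engine name
  namespace: default
spec:
  appinfo:                           # the application under test
    appns: default
    applabel: "app=nginx"
    appkind: deployment
  chaosServiceAccount: pod-delete-sa # service account with chaos permissions
  experiments:
    - name: pod-delete
      spec:
        components:
          env:
            - name: TOTAL_CHAOS_DURATION
              value: "30"            # run chaos for 30 seconds
```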
41. What can you do?
● Join OpenEBS and Litmus slack to keep in touch
● Try using OpenEBS for Kafka and provide feedback
○ Open Source Karma
● Try using Litmus for Kafka
● Create issues for
○ Your challenges related to data
○ More chaos scenarios
42. Open Source Karma
● Like OpenEBS ?
○ https://bit.ly/staropenebs
● Like Litmus?
○ https://bit.ly/starlitmus
Help yourself with some cool stickers
44. Quick Look at the setup
[Diagram: four Kafka brokers running on Konvoy with OpenEBS LocalPV]
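A setup like this can be described with a StatefulSet whose volumeClaimTemplates request storage from the OpenEBS LocalPV StorageClass, so each broker gets its own node-local PV. A minimal sketch (image, sizes, and names are illustrative, and real broker configuration is omitted):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka-headless        # headless service giving brokers stable DNS names
  replicas: 4
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: broker
          image: confluentinc/cp-kafka   # illustrative; pin a real tag in practice
          volumeMounts:
            - name: data
              mountPath: /var/lib/kafka/data
  volumeClaimTemplates:              # one PVC per broker, e.g. data-kafka-0
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: openebs-hostpath  # assumes the OpenEBS LocalPV class exists
        resources:
          requests:
            storage: 50Gi
```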
45. A Quick Kafka Refresher
● Kafka is the current go-to solution for building maintainable, extendable and scalable data pipelines.
● Applications (producers) send (publish) messages (records) to a Kafka node (broker), and those messages are processed by other applications (consumers).
● Messages are stored in a topic, and consumers subscribe to the topic to receive new messages.
● Topics are broken down into partitions for better performance and scalability. These partitions consist of ordered data units identified by their offsets.
● Partitions are often replicated: one broker, called the leader, owns a partition and handles writes, and the data is replicated to followers.
● A controller broker manages administrative tasks such as topic creation, leader assignment, and failover to in-sync replicas.
● Kafka cluster state is typically maintained in ZooKeeper (a distributed key-value store), with brokers keeping a ZooKeeper watch to be notified of events.
47. Kafka Chaos Experiment: Kill the Kafka Broker Pod
● Objective: Test the deployment sanity of the stateful Kafka cluster
● Approach: Set up a liveness message stream with a test producer/consumer and specify a message timeout. Identify a partition leader and kill its pod. Verify that the liveness stream is uninterrupted.
● Health Checks: Verify broker list via ZooKeeper
● Tools: Litmus Chaos Operator & Kafka Chaos Experiment Chart
● Tested: CP-Kafka, Kudo-Kafka
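This experiment can be declared with the same ChaosEngine CRD pointing at the Kafka StatefulSet; Litmus ships a Kafka broker pod-failure experiment in its chaos chart. A sketch (the experiment name and env variables follow the Litmus Kafka chart, but exact names and labels are assumptions that should be checked against the chart version you install):

```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: kafka-chaos                       # hypothetical engine name
  namespace: default
spec:
  appinfo:
    appns: default
    applabel: "app=cp-kafka"              # label on the Kafka broker pods
    appkind: statefulset
  chaosServiceAccount: kafka-sa           # service account with chaos permissions
  experiments:
    - name: kafka-broker-pod-failure      # experiment from the Litmus Kafka chart
      spec:
        components:
          env:
            - name: KAFKA_LIVENESS_STREAM # run the test producer/consumer stream
              value: "enabled"
            - name: KAFKA_NAMESPACE
              value: "default"
            - name: KAFKA_LABEL
              value: "app=cp-kafka"
```

The health check from the slide (verifying the broker list via ZooKeeper) is performed by the experiment itself, and the outcome is recorded in a ChaosResult CR.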