Camunda BPM 7.2: Performance and Scalability (English)

Hands-on Webinar
Camunda BPM 7.2
Performance and Scalability

Daniel Meyer
• Process Engine Expert
• Technical Project Lead
@meyerdan | daniel.meyer@camunda.com
Bernd Rücker
• 10+ years of experience with workflow and Java
• Co-Founder of Camunda
• Evangelist & Head of Consulting
@berndruecker | bernd.ruecker@camunda.com
Your speakers today

Performance is a difficult topic

• It always depends
−On hardware
−On software environment (OS, Java, App Server, Database, …)
−On Service Tasks in the process
−On network topology (e.g. remote database, web services, …)
−On concurrent requests, database load, …
• There is no simple answer to performance
• But we always succeed – in each and every real-life situation
−Handling millions of process instances / day
−Handling more than 1,000 process instances / second
−Handling thousands of parallel users
Performance is a difficult topic

We are much faster than the competition
see http://camunda.com/landing/whitepaper-camunda-jbpm/
In our tests, Camunda's throughput was 10x to 30x higher than with JBoss jBPM.

1. Understand basic engine architecture
2. Understand influence parameters on performance
3. Discuss performance improvement approaches
4. See example figures / measurements
5. Discuss future scenarios (e.g. sharding, NoSQL, …)
What we do today

Basic Engine Architecture
We use
Optimistic
Locking
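
A minimal sketch of what optimistic locking means in general. It is not Camunda's implementation: the class and field names (`OptimisticLockingSketch`, `VersionedRow`, `revision`) are hypothetical, and an in-memory map stands in for a database table. An update succeeds only if the row still has the revision the caller read, comparable to an SQL UPDATE whose WHERE clause also checks the revision.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a table guarded by revision counters. */
final class OptimisticLockingSketch {

    /** Immutable snapshot of a row: its state plus the revision it was read at. */
    static final class VersionedRow {
        final String state;
        final int revision;

        VersionedRow(String state, int revision) {
            this.state = state;
            this.revision = revision;
        }
    }

    private final Map<String, VersionedRow> table = new ConcurrentHashMap<>();

    void insert(String id, String state) {
        table.put(id, new VersionedRow(state, 1));
    }

    VersionedRow read(String id) {
        return table.get(id);
    }

    /**
     * Writes newState only if the row is still the snapshot the caller read.
     * Returns false when another transaction updated the row first; the
     * caller would then roll back and retry.
     */
    boolean update(String id, VersionedRow readEarlier, String newState) {
        VersionedRow next = new VersionedRow(newState, readEarlier.revision + 1);
        // replace(key, expected, next) is atomic. VersionedRow does not override
        // equals(), so "expected" matches only the identical stored snapshot.
        return table.replace(id, readEarlier, next);
    }
}
```

Two callers that read the same snapshot cannot both win: the first update bumps the revision, the second one fails and has to retry. No row is locked while the callers do their work.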

Learning #1:
The architecture is damn simple,
and the bottleneck is not the
process engine!

Biggest influence on Performance
Database Delegation Code
Call Service

Clustering via shared database

Learning #2:
All state is in the database so
clustering gets really easy.
camunda scales!
More on this later…

„But what can I do if performance IS a problem?“

1. Tasklist
2. (History) Queries
3. Job Execution
Typical Areas of performance issues

• Process/Task Variables
−Show in list
−Use in Search/Filter
• Support for Pagination
• Large number of users accessing the tasklist very often
Implementation challenge
• Provide a generic database schema
• Complex data types are serialized – no SQL-JOIN possible
• Variables are stored in one row per variable – multiple SQL-JOINs might be required
• Some customers use 10-30 variables
Tasklist Requirements

• Add Process Variables optimized (and only used) for Queries
−Extract attributes
−Combine variables to work with LIKE
• Use own queries
−Native – if you want to improve the WHERE
−Custom – if you want to SELECT multiple information at once
• Own TaskInfo or ProcessInstanceInfo entities
−Persisted as MyBatis or JPA entities
−Combine all attributes – allow to query tasks without (or with one) JOIN only
−Synchronisation via Listener – or use ProcessInstanceInfo as single source
Solution Approaches: Tasklist

Example
Customer
- customerId
- company
- …
Your DB camunda
PROCESS_VARIABLES
customerId
...
searchField
4711
...
4711#camunda#Berlin#...
1
2
Native
Query:
3
Custom
Query:
4
Java API –
results are
camunda „Task“
entities
Own MyBatis
mapping – result
can be anything.
Called via
custom code.
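
The combined search variable from the example (`4711#camunda#Berlin#...`) could be built roughly as sketched here. The class and method names are hypothetical, and stripping the separator from attribute values is an assumption made for illustration. The point is that several attributes end up in one string variable, so a single LIKE condition can match any of them.

```java
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper that builds one query-only variable from several attributes. */
final class SearchFieldSketch {

    private static final String SEPARATOR = "#";

    /** Joins the attributes into one string, e.g. 4711#camunda#Berlin. */
    static String buildSearchField(List<String> attributes) {
        StringJoiner joiner = new StringJoiner(SEPARATOR);
        for (String attribute : attributes) {
            // Remove the separator from values so field boundaries stay unambiguous.
            joiner.add(attribute.replace(SEPARATOR, ""));
        }
        return joiner.toString();
    }

    /** Builds a LIKE pattern that matches the term anywhere in the search field. */
    static String likePattern(String term) {
        return "%" + term + "%";
    }
}
```

The resulting string would be stored as an additional process variable (called `searchField` on the slide) and only ever used for filtering, never for process logic.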

Example
TaskInfo
- taskId
- customerId
- companyName
- contractId
- productName
- …
Your DB
camunda
PROCESS_VARIABLES
customerId
contractId
productId
4711
0815
42
5
TaskInfo Entity
(or ProcessInstanceInfo)
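
The TaskInfo approach can be sketched in plain Java. Everything here is hypothetical: the class names, the callback methods, and the in-memory map that stands in for a table in your own database. In a real setup the callbacks would be triggered by a listener and the entity persisted via MyBatis or JPA, as the previous slide suggests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical denormalised task store kept in sync by listener callbacks. */
final class TaskInfoSketch {

    /** Flat row holding everything the tasklist shows, so no JOIN is needed. */
    static final class TaskInfo {
        final String taskId;
        final String customerId;
        final String companyName;

        TaskInfo(String taskId, String customerId, String companyName) {
            this.taskId = taskId;
            this.customerId = customerId;
            this.companyName = companyName;
        }
    }

    // Stand-in for the TaskInfo table in "your DB".
    private final Map<String, TaskInfo> rows = new LinkedHashMap<>();

    /** Would be called from a listener when a task is created. */
    void onTaskCreated(String taskId, Map<String, String> processVariables) {
        rows.put(taskId, new TaskInfo(taskId,
                processVariables.get("customerId"),
                processVariables.get("companyName")));
    }

    /** Would be called from a listener when a task is completed or deleted. */
    void onTaskCompleted(String taskId) {
        rows.remove(taskId);
    }

    /** Single-table lookup, the equivalent of one SELECT without JOINs. */
    List<TaskInfo> findByCustomer(String customerId) {
        List<TaskInfo> result = new ArrayList<>();
        for (TaskInfo row : rows.values()) {
            if (customerId.equals(row.customerId)) {
                result.add(row);
            }
        }
        return result;
    }
}
```

The trade-off is duplicated data that has to be kept in sync, in exchange for tasklist queries that touch a single, purpose-built table.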

• The challenge:
−Indexes cost space and performance in writing data
−We provide a generic database schema without knowing what you exactly do with it
−We constantly work on the right balance between too many and too few indexes
• What you can do:
−Check indexes and slow query log
−Add index where appropriate for your situation (perfectly OK with us, you do not lose support!)
−As Enterprise Customer you can always discuss/validate changes with support
• Example: create index PROC_DEF_ID_END_TIME ON ACT_HI_PROCINST (PROC_DEF_ID_,END_TIME_)
(History) Queries

You can also customize history
Custom History
(e.g.
ElasticSearch)
Different History Levels:
- NONE
- ACTIVITY
- AUDIT
- FULL
- CUSTOM (own Filter written
in Java, e.g. „only variable
X“, „not process Y“, …)
Example for custom log level:
https://github.com/camunda/camunda-bpm-examples/tree/master/process-engine-plugin/custom-history-level
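
The idea behind a CUSTOM history filter ("only variable X", "not process Y") can be shown with plain Java predicates. This is a simplified sketch, not the engine's plugin interface; the class and method names are hypothetical, and the linked example shows how a real custom history level is registered.

```java
import java.util.function.BiPredicate;

/**
 * Hypothetical history filters. Each predicate receives
 * (processDefinitionKey, variableName) and returns true if the
 * corresponding event should be written to history.
 */
final class HistoryFilterSketch {

    /** "only variable X": keep events for exactly one variable name. */
    static BiPredicate<String, String> onlyVariable(String variableName) {
        return (processKey, variable) -> variableName.equals(variable);
    }

    /** "not process Y": drop all events of one process definition. */
    static BiPredicate<String, String> notProcess(String excludedProcessKey) {
        return (processKey, variable) -> !excludedProcessKey.equals(processKey);
    }
}
```

Filters compose with `and` / `or`, for example `onlyVariable("X").and(notProcess("Y"))`, so the amount of history written can be narrowed step by step.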

• Asynchronous Continuations involve Jobs
• Jobs are stored in the database
• The Job Executor can be configured
−Number of Worker Threads
−Number of Jobs fetched with one database query
−Size of in-memory Queue
−Lock Time, Retry Behavior, …
• Job Execution can be distributed over a Cluster
• Optimizing is not a straightforward task; it is hard to give general advice
• If you need to improve: Measure and benchmark configurations in your environment!
Job Execution
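
How the three sizing parameters interact (worker threads, jobs fetched per acquisition, in-memory queue size) can be sketched with the JDK's `ThreadPoolExecutor`. This is an illustration under assumptions, not the engine's job executor: the class and parameter names are hypothetical, pending jobs come from an in-memory queue instead of the database, and a full queue is handled by letting the acquiring thread run the job itself.

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical acquire-and-execute loop built on a bounded thread pool. */
final class JobExecutorSketch {

    /** Runs all pending jobs and returns how many acquisition cycles were needed. */
    static int runAll(Queue<Runnable> pendingJobs, int workerThreads,
                      int queueSize, int maxJobsPerAcquisition) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workerThreads, workerThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                // Assumption: when the queue is full, the acquiring thread
                // executes the job itself, which slows acquisition down.
                new ThreadPoolExecutor.CallerRunsPolicy());
        int acquisitions = 0;
        while (!pendingJobs.isEmpty()) {
            // One cycle stands in for one "fetch jobs" database query.
            acquisitions++;
            for (int i = 0; i < maxJobsPerAcquisition && !pendingJobs.isEmpty(); i++) {
                pool.execute(pendingJobs.poll());
            }
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquisitions;
    }
}
```

Fetching more jobs per cycle means fewer acquisition queries, but a small queue or too few workers then pushes work back onto the acquiring thread. That interplay is why measuring in your own environment matters more than any default.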

The good news: We made big performance improvements in Camunda BPM 7.2!
• Improved First Level Cache (throughput increased by up to 90% if async Service Tasks are executed in a row)
• Improved locking to have fewer Optimistic Lock Exceptions and more Jobs acquired per Acquisition. This makes bigger Clusters possible.
Job Execution in Camunda BPM 7.2

Recap:
• Added log level “CUSTOM” for History
• First Level Cache
• Job Executor Acquisition Locking
Plus:
• Added flush ordering (comparable to Hibernate) to minimize risk of deadlocks
Summary: Performance Improvements in 7.2

Learning #3:
All performance challenges can be
solved.

Recommendation: Measure! No guessing.
camunda engine
Process
Application
External
Load
Generator
e.g. JMeter,
HP Load Runner,
CURL, …
REST
„close to production“
environment

- Measure
- JobExecutor Horizontal Scalability
- Impact of 1st level cache reuse
- Improvements Version 7.1.0 vs. Version 7.2.0
- Environment: Amazon AWS Cloud (EC2 & RDS)
Benchmark

Benchmark Setup
Client
Process Engine
Node 1
Process Engine
Node 2
Process Engine
Node 3
Process Engine
Node 4
Start Process
Instance (Rest API)
Database
(Postgres)
https://github.com/meyerdan/ec2-benchmark
EC2 m3.xlarge
(Intel Xeon E5-2670 v2,
4 core, 15 GiB Memory)
EC2 m3.xlarge
EC2 db.m3.xlarge
Provisioned using Docker

Benchmark Setup - The process
- All service tasks „Async“
- 1st service task creates 5 variables
- Variables are read by subsequent service tasks

• Throughput in terms of transactions / second
• No absolute Numbers
Benchmark Results

Benchmark Results
[Charts: Amazon RDS Metrics, Cache Off vs. Cache On]

What about true
Horizontal Scalability?

What is Horizontal Scalability?
Scale up the number of transactions executed by adding more
processing nodes to the system. [*]
[*] http://en.wikipedia.org/wiki/Scalability#Horizontal_and_vertical_scaling (Adapted)
Horizontal Scalability
[Chart: transactions / sec plotted against number of nodes]

The current Situation
Scale number of Process Engine Nodes (JVMs)
Up to a certain point
Limited possibilities for scaling the shared
relational Database. In a sense this can
only be scaled “up”, not “out”.
Shared
Relational
Database
Process
Engine
Process
Engine
Process
Engine

Which way to go?
Distributed
Datastore
Process
Engine
Process
Engine
Process
Engine
Distributed Datastore.
Use a database which is itself a
distributed system and can be
scaled horizontally.
- Apache Cassandra,
- Apache HBase,
- Distributed Caches
(Hazelcast, …)
- ...
Sharding and partitioning.
Distribute the state over multiple
Datastores.
- Multiple instances of PostgreSQL
- Each “DB” is a MongoDB shard
- No “DB” at all: use a filesystem journal?
- ...
Key Difference: on the right hand side, the process engine itself is “distributed”
in the sense that it is aware of the distribution and sharding.

The problem with Distributed Datastores
(In the context of process engines)
1. Consistency guarantees offered by these databases (eventual consistency, ACID vs. BASE, ...) often do not match the requirements of BPMN process execution. See: conflicting concurrent transactions:
a. Racing incoming signals (E.g.: Two Messages targeting the same event instance arrive at the same time)
b. Joins & Synchronization (E.g.: Gateways, Multi Instance, ...)
c. Cancel Activity instance (E.g.: Interrupting Message Boundary Event)
2. Data Representation and Network Latency / Overhead: Process instance state is composite:
a. Token state / active activity instances
b. Variables
c. Task Information, …
The challenge is to find a data representation which does not lead to distribution of the state of a single process instance across the cluster while still supporting the required access patterns.
3. Significant differences between individual technologies while there are no industry standards in place yet. (This is different with SQL.)

Sharding => Distributed yet Local
Scale horizontally...
Each “shard / node” maintains its state
locally
Partitioning workflow instance state
- Each process instance lives inside a single shard /
partition
=> local data consistency easy to guarantee,
=> easy to access efficiently
=> Support range of different persistence engines
(Relational Database, Non-Relational Databases, …)
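
"Each process instance lives inside a single shard" implies a stable mapping from process instance to shard. A minimal sketch of such a mapping, assuming a fixed shard count and plain hash-modulo routing; the class and method names are hypothetical and the slides do not prescribe a routing scheme.

```java
/** Hypothetical router that pins every process instance to exactly one shard. */
final class ShardRouterSketch {

    /** Returns a shard index in [0, shardCount) that is stable for a given id. */
    static int shardFor(String processInstanceId, int shardCount) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(processInstanceId.hashCode(), shardCount);
    }
}
```

With plain modulo routing, changing the number of shards remaps most instances, so a production design would need a scheme that copes with resizing. The sketch only shows the locality property: all state of one instance is read and written on one shard.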

Process Engine
Flexible Architecture
...
Reality @
zalando 2014
Process Engine
Process Engine
The simplest case
A single process engine node
running on top of a
conventional database.
A medium Scenario
Horizontally scale on top of a
conventional database.
Massive Compute Cluster
500 Nodes ?
All of this should be possible with one
unified architecture!

No more Search!
The catch
“Find Process Instance
for order with ID 43543242”
??
???

Human Workflow (Build Task Lists)
History: Monitoring, Reporting, …
Message Correlation
When is „Search“ required?

Message Correlation
The Problem to solve
Workflow Instance State
for order with ID 435345
Incoming Message:
“customer cancelled Order
with ID 435345”

• Yes, but for non-workflow execution Use Cases
Use Search Index?
(A)sync
Updates
Search Index
(Near Realtime)
Tasklist
Queries,
Monitoring,...

Vision
History
Tasks
Core Process Execution
Signal / Cancel Activity Instance by Id
Correlate Message
Query for List of Tasks
Monitoring, Reports
Real Time, Strongly Consistent
Horizontally scalable through sharding
Multiple persistence technologies possible
Near Real Time, Eventually Consistent
Use best technology for the Job.
Async Event Stream

But still...
History
Tasks
Core Process Execution
Signal / Cancel Activity Instance by Id
Correlate Message
Query for List of Tasks
Monitoring, Reports
In the simplest case!

Learning #4:
You can do true horizontal
clustering with the engine which
exists today!
There is no need for NoSQL
persistence in the core engine.

Learning #5:
Camunda is really damn smart :-)

Camunda BPM Performance is already awesome
However: We are continuously improving
performance
There are strategies to solve specific performance
challenges
There is no limit in scalability
Summary

Start now!
Open Source Edition
• Download:
www.camunda.org
• Docs, Tutorials etc.
• Forum
• Meetings
Enterprise Edition
• Trial:
www.camunda.com
• Additional Features
• Support, Patches etc.
• Consulting, Training
http://camunda.com/bpm/consultation/
info@camunda.com | US +1.415.800.3908 | DE +49 30 664040 900
