3. Motivation
● Organizations need to make data available for analysis as soon as it arrives
● Machine learning results need to be stored where business and data analysts work with them
● Time to insight and time to action are now competitive differentiators for businesses
4. Bulk data adapters
● Applications can use the bulk data adapter SDK to collect and write data - on-demand data loading
● No need to copy CSV files to the UM or PM - simpler
● Bypasses the SQL interface, parser and optimizer - faster writes
● Available for C++, Python and Java
[Architecture diagram: an application embeds the Bulk Data Adapter and writes through the Write API directly to the ColumnStore PMs, alongside the usual MariaDB Server / ColumnStore UM path]
1. For each row:
   a. For each column: bulkInsert->setColumn
   b. bulkInsert->writeRow
2. bulkInsert->commit
* Rows are buffered (100,000 by default) before being flushed
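The loop above can be sketched with the C++ bulk write SDK (mcsapi). This is a minimal sketch, not a complete program: the database name `test`, table `t1`, its two columns and the row count are all assumptions for illustration, and it requires libmcsapi and a reachable ColumnStore installation to build and run.

```cpp
#include <libmcsapi/mcsapi.h>

int main()
{
    // Driver reads the ColumnStore cluster configuration (Columnstore.xml)
    mcsapi::ColumnStoreDriver driver;
    // Assumed table: test.t1 with an integer and a string column
    mcsapi::ColumnStoreBulkInsert* bulkInsert =
        driver.createBulkInsert("test", "t1", 0, 0);
    try {
        for (int row = 0; row < 1000; row++) {
            bulkInsert->setColumn(0, row);       // a. set each column
            bulkInsert->setColumn(1, "example");
            bulkInsert->writeRow();              // b. buffer the row
        }
        bulkInsert->commit();                    // 2. flush and commit
    } catch (mcsapi::ColumnStoreError&) {
        bulkInsert->rollback();                  // discard buffered rows on error
    }
    delete bulkInsert;
    return 0;
}
```

Note that `writeRow` only buffers: nothing reaches the PMs until the buffer fills (100,000 rows by default) or `commit` is called.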
5. Streaming data adapters
– MaxScale CDC
● Stream all writes from MariaDB TX to MariaDB AX automatically and continuously - analytical data stays up to date and never stale, with no need for batch jobs, manual processes or human intervention
[Architecture diagram: MariaDB MaxScale's Binlog-Avro CDC Router streams changes from a MariaDB Server (InnoDB) to the Streaming Data Adapter (MaxScale CDC Client), which writes through the Write API to the ColumnStore PMs, alongside the MariaDB Server / ColumnStore UM path]
6. Inside MaxScale CDC Adapter
● Connects to MaxScale via MaxScale CDC Connector
● Connects to ColumnStore via ColumnStore API
● Set of CDC records → CS API mini-batch
● CDC Record
○ Timestamp
○ GTID
○ Type (write, delete, update)
○ Changed data
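The record-to-mini-batch step can be sketched as follows. This is a hypothetical sketch, not the adapter's implementation: the JSON field names (`timestamp`, `gtid`, `event_type`, `data`) are illustrative, loosely following the record layout listed above.

```python
import json

def batch_cdc_records(lines, batch_size=2):
    """Group CDC records (one JSON object per line) into mini-batches
    of changed row data for a ColumnStore API bulk insert."""
    batch = []
    for line in lines:
        record = json.loads(line)
        # Each record carries a timestamp, a GTID, an event type and
        # the changed column data; here only row-producing events are
        # batched, deletes would be handled separately.
        if record["event_type"] in ("write", "update"):
            batch.append(record["data"])
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial mini-batch
        yield batch
```

Each yielded mini-batch would then be written and committed through the ColumnStore API, so the analytical side advances in small transactional steps rather than one row at a time.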
8. Streaming data adapters
– Apache Kafka
● Stream all messages published to Apache Kafka topics to MariaDB AX automatically and continuously - data from many sources can be streamed and collected for analysis without complex code
[Architecture diagram: the Streaming Data Adapter (Kafka Client) consumes from Apache Kafka topics and writes through the Write API to the ColumnStore PMs, alongside the MariaDB Server / ColumnStore UM path]
9. Inside Apache Kafka Adapter
● Connects to Kafka
● Reads Avro formatted data
○ Confluent KafkaAvroSerializer: https://docs.confluent.io/current/streams/developer-guide/write-streams.html
● Each topic is a stream
● Streams map to tables
○ Stream to multiple tables
○ Multiple streams to single table
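The stream-to-table mapping above can be sketched as a simple routing table. This is a hypothetical illustration of the many-to-many mapping, not the adapter's configuration format; the topic and table names are invented.

```python
def target_tables(topic, routing):
    """Return the ColumnStore tables that a Kafka topic's stream feeds."""
    return routing.get(topic, [])

# Illustrative routing covering both cases from the slide:
ROUTING = {
    "orders": ["analytics.orders", "audit.orders"],  # one stream -> multiple tables
    "logs_app": ["analytics.logs"],                  # multiple streams -> one table
    "logs_web": ["analytics.logs"],
}
```

A consumer loop would deserialize each Avro message from a topic and write the resulting row to every table the routing returns for that topic.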