Data-Oriented Programming
with Clojure and Jackdaw
Charles Reese
Funding Circle
allows investors to
lend to established,
small businesses.
“We’ve made it our mission to help
small businesses across the world
go even further.”
Samir Desai
Co-founder and CEO
Small business
borrowers globally
$4.5 billion
Loans under
management globally
Individual investors and
financial institutions
Kafka as a streaming platform
API Front End Metrics Data Lake
Underwriting Servicing Trades Accounting
Double-entry bookkeeping in Kafka Streams
state stores
Why Clojure?
• Data-oriented
• Function composition
• The REPL
• JVM hosted
It's better to have 100 functions
operate on one data structure than
10 functions on 10 data structures.
Perlis, A. J. (1982). Epigrams on Programming.
SIGPLAN Notices 17(9)
Only a few data structures...
list, vector, map, set
Many functions
distinct filter remove for keep keep-indexed cons concat lazy-cat mapcat cycle
interleave interpose rest next fnext nnext drop drop-while nthnext for take take-nth
take-while butlast drop-last for flatten reverse sort sort-by shuffle split-at split-with
partition partition-all partition-by map pmap mapcat for replace reductions map-indexed
seque first ffirst nfirst second nth when-first last rand-nth zipmap into
reduce set vec into-array to-array-2d frequencies group-by apply not-empty some reduce
seq? every? not-every? not-any? empty? some filter doseq dorun doall realized? seq
vals keys rseq subseq rsubseq lazy-seq repeatedly iterate repeat range line-seq
resultset-seq re-seq tree-seq file-seq xml-seq iterator-seq enumeration-seq
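A small sketch of the "few data structures, many functions" idea: the same sequence functions work unchanged on a list, a vector, a set, and a map. The names `as-list`, `as-vector`, `as-set`, `as-map`, and `largest-doubled` are made up for illustration.

```clojure
;; The same values held in each of the core collection types.
(def as-list   '(3 1 2))
(def as-vector [3 1 2])
(def as-set    #{3 1 2})
(def as-map    {:a 3 :b 1 :c 2})

(defn largest-doubled
  "Doubles every number in coll and returns the largest result."
  [coll]
  (->> coll (map #(* 2 %)) (reduce max)))

(largest-doubled as-list)        ;;=> 6
(largest-doubled as-vector)      ;;=> 6
(largest-doubled as-set)         ;;=> 6
(largest-doubled (vals as-map))  ;;=> 6

;; A map is a sequence of [key value] entries, so filter applies to it as well.
(into {} (filter (fn [[_ v]] (odd? v)) as-map))  ;;=> {:a 3, :b 1}
```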
((comp str +) 1 2 3 4)
;;=> "10"
(filter (comp not zero?) [0 1 0 2 0 3 0 4])
;;=> (1 2 3 4)
Composable parts
responding to input
occurring in small steps
repeated with frequency
JVM hosted
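(import '[org.apache.kafka.streams StreamsBuilder]
        '[org.apache.kafka.streams.kstream Consumed Produced])
;; (serde) is a placeholder; the topic names are assumed from the diagram labels.
(def streams-builder
  (StreamsBuilder.))
(def kstream
  (.stream streams-builder "input"
           (Consumed/with (serde) (serde))))
(.to kstream "output"
     (Produced/with (serde) (serde)))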
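(defn topology-builder
  [{:keys [input output] :as topics}]
  (fn [builder]
    (-> (j/kstream builder input)
        (j/flat-map-values split-lines)
        (j/group-by (fn [[_ v]] v))
        ;; Assumed: the slide text omits the aggregation between group-by and to.
        (j/count)
        (j/to-kstream)
        (j/to output))
    builder))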
topic: input
topic: output
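The topology reads lines from the input topic, splits them into words, groups by word, and writes to the output topic. A rough in-memory analogy is sketched below, assuming a per-word running count is what gets written and ignoring Kafka Streams record caching. `split-lines` is used on the slide but not shown, so the version here is a guess, and `simulate-word-count` is a hypothetical name.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: lower-cases a line and splits it on non-word characters.
(defn split-lines
  [line]
  (str/split (str/lower-case line) #"\W+"))

(defn simulate-word-count
  "Takes a seq of [key line] records and returns one [word running-count]
  update per word, in arrival order."
  [records]
  (let [words (mapcat (fn [[_ line]] (split-lines line)) records)]
    (->> words
         ;; Carry the counts map forward and remember the latest update.
         (reductions (fn [[counts _] word]
                       (let [n (inc (get counts word 0))]
                         [(assoc counts word n) [word n]]))
                     [{} nil])
         rest           ; drop the initial [{} nil] state
         (map second))))

(simulate-word-count [["" "all streams lead to kafka"]
                      ["" "hello kafka streams"]])
;;=> (["all" 1] ["streams" 1] ["lead" 1] ["to" 1] ["kafka" 1]
;;    ["hello" 1] ["kafka" 2] ["streams" 2])
```

Eight words in, eight updates out, one per input word.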
Jackdaw Features
• Create and manage topics
• Produce and consume
• Kafka Streams DSL
• EDN, JSON, and Avro serdes
• End-to-end tests
Testing with Jackdaw
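(def commands
  ;; The command keywords and the :input topic id are reconstructed; the slide text omits them.
  [[:write! :input
    "all streams lead to kafka"
    {:key-fn (constantly "")}]
   [:write! :input
    "hello kafka streams"
    {:key-fn (constantly "")}]
   [:watch
    (fn [journal] (= 8 (count (get-in journal [:topics :output]))))
    {:timeout 2000}]])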
Testing with Jackdaw
(deftest test-xf-word-count
  (jtf/with-fixtures [(jtf/integration-fixture topology-builder test-config)]
    (jackdaw.test/with-test-machine (test-transport test-config)
      (fn [machine]
        (let [{:keys [journal]} (jackdaw.test/run-test machine commands)]
          (is (= 1 (word-count journal "hello")))
          (is (= 2 (word-count journal "kafka"))))))))
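The test calls a `word-count` helper that the slides do not show. One possible shape is sketched below. It assumes the journal keeps each topic's records under `[:topics <topic-id>]`, as the watch predicate suggests, and that each record is a map with `:key` and `:value`, which is an assumption here. `example-journal` is made-up data.

```clojure
;; Hypothetical journal with three output records.
(def example-journal
  {:topics {:output [{:key "hello" :value 1}
                     {:key "kafka" :value 1}
                     {:key "kafka" :value 2}]}})

(defn word-count
  "Returns the most recent count recorded for word, or nil if the word is absent."
  [journal word]
  (->> (get-in journal [:topics :output])
       (filter #(= word (:key %)))
       last
       :value))

(word-count example-journal "hello")   ;;=> 1
(word-count example-journal "kafka")   ;;=> 2
(word-count example-journal "absent")  ;;=> nil
```

Because the journal is plain data, the assertion helper is an ordinary function over a map.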
["inside every large program"
"is a small program"
"struggling to get out"]
Process all the lines, one step at a time.
1. Split each line into words.
2. Count the words for each line.
3. Reduce over all the lines.
Split each line into words.
(["inside" "every" "large" "program"]
["is" "a" "small" "program"]
["struggling" "to" "get" "out"])
Count the words for each line.
({"inside" 1 "every" 1 "large" 1 "program" 1}
{"is" 1 "a" 1 "small" 1 "program" 1}
{"struggling" 1 "to" 1 "get" 1 "out" 1})
Reduce over all the lines.
{"inside" 1 "every" 1 "large" 1 "program" 2
"is" 1 "a" 1 "small" 1
"struggling" 1 "to" 1 "get" 1 "out" 1}
(require '[clojure.string :refer [split]])

(defn f
  [acc x]
  (merge-with + acc x))

(->> coll
     (map (fn [x] (split x #" ")))
     (map frequencies)
     (reduce f))
Process all the lines, one line at a time.
{}
{"inside" 1
"every" 1
"large" 1
"program" 1}
{"inside" 1
"every" 1
"large" 1
"program" 2
"is" 1
"a" 1
"small" 1}
{"inside" 1
"every" 1
"large" 1
"program" 2
"is" 1
"a" 1
"small" 1
"struggling" 1
"to" 1
"get" 1
"out" 1}
(partial map (fn [x] (split x #" ")))
(partial map frequencies))
• Composable transformations
• Take a reducing fn and return another reducing fn
• Decoupled from inputs and outputs
• Reusable
• Fast
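To make "take a reducing fn and return another reducing fn" concrete, here is a minimal sketch that applies a transducer to a reducing function by hand. The names `inc-then-sum` and `xf-odd-inc` are illustrative only.

```clojure
;; (map inc) is a transducer: calling it on + returns a new reducing fn.
(def inc-then-sum ((map inc) +))

;; The new reducing fn increments each input before adding it to the result.
(inc-then-sum 10 1)              ;;=> 12
(reduce inc-then-sum 0 [1 2 3])  ;;=> 9

;; Transducers compose with comp; the leftmost one sees each input first.
(def xf-odd-inc (comp (filter odd?) (map inc)))
(reduce (xf-odd-inc conj) [] [1 2 3 4 5])  ;;=> [2 4 6]
```

Nothing in the transducer mentions where the inputs come from or where the results go, which is what makes it reusable.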
(map inc [1 2 3 4])
;;=> (2 3 4 5)
(map inc)
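Called with no collection, `map` returns a transducer rather than a sequence. A short sketch of the same transducer reused in several contexts (`xf-inc-even` is an illustrative name):

```clojure
(def xf-inc-even (comp (map inc) (filter even?)))

(into [] xf-inc-even [1 2 3 4])        ;;=> [2 4]
(sequence xf-inc-even [1 2 3 4])       ;;=> (2 4)
(transduce xf-inc-even + 0 [1 2 3 4])  ;;=> 6
(into #{} xf-inc-even [1 1 3 3])       ;;=> #{2 4}
```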
(def count-words
  (comp
   (map (fn [x] (split x #" ")))
   (map frequencies)))

;; completing supplies the one-argument arity that transduce calls at the end.
(def f
  (completing
   (fn [acc x]
     (merge-with + acc x))))

(transduce count-words f {} coll)
{"inside" 1
"every" 1
"large" 1
"program" 2
"is" 1
"a" 1
"small" 1
"struggling" 1
"to" 1
"get" 1
"out" 1}
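`transduce` calls the reducing function with a single argument once the input is exhausted, so a plain two-argument function would throw an arity exception at that point. A minimal sketch of how `completing` fills that gap, using the illustrative name `merge-counts`:

```clojure
(defn merge-counts [acc x] (merge-with + acc x))

;; completing adds a one-argument arity that returns the result unchanged.
(def f (completing merge-counts))

(f {"a" 1} {"a" 1 "b" 1})  ;;=> {"a" 2, "b" 1}
(f {"a" 2})                ;;=> {"a" 2}

(transduce (map frequencies) f {} [["a" "b"] ["a"]])
;;=> {"a" 2, "b" 1}

;; An optional second argument post-processes the final result.
(transduce (map frequencies)
           (completing merge-counts #(apply max (vals %)))
           {}
           [["a" "b"] ["a"]])
;;=> 2
```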
[["inside" 1]
["every" 1]
["large" 1]
["program" 1]
["is" 1]
["a" 1]
["small" 1]
["program" 2]
["struggling" 1]
["to" 1]
["get" 1]
["out" 1]]
We want...
[(["inside" 1] ["every" 1] ["large" 1] ["program" 1])
(["is" 1] ["a" 1] ["small" 1] ["program" 2])
(["struggling" 1] ["to" 1] ["get" 1] ["out" 1])]
[{"inside" 1, "every" 1, "large" 1, "program" 1}
{"is" 1, "a" 1, "small" 1, "program" 1}
{"struggling" 1, "to" 1, "get" 1, "out" 1}]
(defn xf-running-total
  []
  (fn [rf]
    ;; One volatile per transduction holds the totals seen so far.
    (let [state (volatile! {})]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (let [next (as-> input %
                      (vswap! state #(merge-with (fnil + 0) %1 %2) %)
                      (select-keys % (keys input))
                      (map vec %))]
           (rf result next)))))))
(into [] (xf-running-total) coll')
(def count-words
  (comp
   (map (fn [x] (split x #" ")))
   (map frequencies)
   (xf-running-total)))

(transduce count-words concat coll)
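A self-contained sketch of the running-total word count in plain Clojure, with no Kafka involved. It restates the transducer in simplified form so the block runs on its own. The input lines are the ones used earlier in the talk, and `lines` is an illustrative name.

```clojure
(require '[clojure.string :as str])

(defn xf-running-total
  "Stateful transducer: merges each input map of counts into a running total
  and emits [word total] pairs for the words in that input."
  []
  (fn [rf]
    (let [state (volatile! {})]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (let [totals (vswap! state #(merge-with + %1 %2) input)]
           (rf result (map vec (select-keys totals (keys input))))))))))

(def count-words
  (comp (map #(str/split % #" "))
        (map frequencies)
        (xf-running-total)))

(def lines
  ["inside every large program"
   "is a small program"
   "struggling to get out"])

;; One group of pairs per input line; the second group contains ["program" 2].
(into [] count-words lines)

;; The same pairs flattened into a single sequence of twelve updates.
(transduce count-words concat lines)
```

Each call to `into` or `transduce` applies the transducer afresh, so each run starts with an empty running total.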
Putting it all together...
(defn transduce-kstream
  [xf kstream]
  (-> kstream
      ;; Assumed: the state store name list is not legible on the slide.
      (j/transform (fn [] (transformer xf)) ["transducer"])
      (j/flat-map (fn [[_ v]] v))))
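(import '[org.apache.kafka.streams KeyValue]
        '[org.apache.kafka.streams.kstream Transformer])

(defn transformer
  [xf]
  (let [ctx (atom nil)]
    (reify Transformer
      (init [_ context]
        (reset! ctx context))
      (transform [_ k v]
        (let [store (.getStateStore @ctx "transducer")
              v (first (into [] (xf store) [[k v]]))]
          (KeyValue/pair k v)))
      (close [_]))))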
(defn xf-running-total
  [state swap-fn]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input]
       (let [[k v] input
             next (as-> v %
                    (swap-fn state #(merge-with (fnil + 0) %1 %2) %)
                    (select-keys % (keys v))
                    (map vec %))]
         (rf result next))))))
(defn count-words
  [state swap-fn]
  (comp
   (map (fn [[k v]] [k (split v #" ")]))
   (map (fn [[k v]] [k (frequencies v)]))
   (xf-running-total state swap-fn)))
Putting it all together...
(transduce (count-words (atom {}) swap!) concat coll)
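Passing the state and its update function as arguments lets the caller choose where the totals live. A runnable sketch with an atom and with a volatile follows; the transducer is restated in simplified form. `records` and `volatile-swap!` are illustrative, and a Kafka state store would need its own swap function, which is not shown here.

```clojure
(require '[clojure.string :as str])

(defn xf-running-total
  [state swap-fn]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result [_k v]]
       (let [totals (swap-fn state #(merge-with + %1 %2) v)]
         (rf result (map vec (select-keys totals (keys v)))))))))

(defn count-words
  [state swap-fn]
  (comp (map (fn [[k v]] [k (str/split v #" ")]))
        (map (fn [[k v]] [k (frequencies v)]))
        (xf-running-total state swap-fn)))

(def records
  [["" "inside every large program"]
   ["" "is a small program"]])

;; The state lives outside the transducer, so the caller can inspect it afterwards.
(def totals (atom {}))
(def updates (transduce (count-words totals swap!) concat records))

(get @totals "program")  ;;=> 2
(count updates)          ;;=> 8

;; vswap! is a macro and cannot be passed as a value, so wrap it in a function.
(defn volatile-swap! [v f & args] (vreset! v (apply f @v args)))
(def vtotals (volatile! {}))
(transduce (count-words vtotals volatile-swap!) concat records)
(get @vtotals "program")  ;;=> 2
```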
Example: Simple Ledger
[["1" {:debit-account "tech"
:credit-account "cash"
:amount 1000}]
["2" {:debit-account "cash"
:credit-account "sales"
:amount 2000}]]
We want...
{:account-name "tech"
:before-balance 0
:after-balance -1000}]
{:account-name "cash"
:before-balance 0
:after-balance 1000}]
{:account-name "cash"
:before-balance 1000
:after-balance -1000}]
{:account-name "sales"
:before-balance 0
:after-balance 2000}])
(->> coll
     (transduce (xf-split-entries nil nil) concat)
     (transduce (xf-running-balances (atom {}) swap!) concat))
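The slides name `xf-split-entries` and `xf-running-balances` without showing them. The sketch below is one hypothetical way to write them in plain Clojure. It infers the sign convention (a debit subtracts, a credit adds) from the expected balances, and it assumes the state is an atom-like reference that supports `deref`.

```clojure
;; Hypothetical: splits each [id entry] into two [account amount] movements.
;; The two arguments only mirror the call on the slide and are unused here.
(defn xf-split-entries
  [_state _swap-fn]
  (map (fn [[_id {:keys [debit-account credit-account amount]}]]
         [[debit-account (- amount)]
          [credit-account amount]])))

;; Hypothetical: tracks a balance per account and emits a one-element
;; collection, so that concat can serve as the reducing function.
(defn xf-running-balances
  [state swap-fn]
  (map (fn [[account amount]]
         (let [before (get @state account 0)
               after  (+ before amount)]
           (swap-fn state assoc account after)
           [{:account-name   account
             :before-balance before
             :after-balance  after}]))))

(def entries
  [["1" {:debit-account "tech" :credit-account "cash" :amount 1000}]
   ["2" {:debit-account "cash" :credit-account "sales" :amount 2000}]])

(def balances
  (->> entries
       (transduce (xf-split-entries nil nil) concat)
       (transduce (xf-running-balances (atom {}) swap!) concat)))

(map :account-name balances)   ;;=> ("tech" "cash" "cash" "sales")
(map :after-balance balances)  ;;=> (-1000 1000 -1000 2000)
```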
topic: entry-pending
topic: transaction-added
topic: transaction-pending
Hickey, R. (2014, September). Transducers. Strange Loop.
Reese, C. (2018, November). Kafka and the REPL: Stream Processing, the Functional Way.
Chambers, A. (2018, April). Testing Event-Driven Systems.
Funding Circle. (2017).
Data-Oriented Programming with Clojure and Jackdaw (Charles Reese, Funding Circle) Kafka Summit SF 2019

Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson

Dernier (20)

E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?

Data-Oriented Programming with Clojure and Jackdaw (Charles Reese, Funding Circle) Kafka Summit SF 2019

  • 1. Data-Oriented Programming with Clojure and Jackdaw Charles Reese
  • 2. Funding Circle allows investors to lend to established, creditworthy small businesses. “We’ve made it our mission to help small businesses across the world go even further.” (Samir Desai, Co-founder and CEO) 72,000 small business borrowers globally; $4.5 billion in loans under management globally; 90,000+ individual investors and financial institutions.
  • 3. Kafka as a streaming platform. [Diagram labels: Kafka; API, Front End, Metrics, Data Lake, Underwriting, Servicing, Trades, Accounting]
  • 4. Double-entry bookkeeping in Kafka Streams. [Topology figure. Counts for sources, sinks, state stores, nodes: 2, 8, 2, 40. Counts for branch, foreach, flatMap, flatMapValues, map, mapValues, transform, transformValues, merge: 7, 2, 1, 1, 3, 18, 1, 5, 2.]
  • 5. Why Clojure? • Data-oriented • Function composition • The REPL • JVM hosted
  • 6. “It's better to have 100 functions operate on one data structure than 10 functions on 10 data structures.” Perlis, A. J. (1982). Epigrams on Programming. SIGPLAN Notices 17(9).
  • 7. Only a few data structures... list, vector, map, set
  • 8. Many functions: distinct filter remove for keep keep-indexed cons concat lazy-cat mapcat cycle interleave interpose rest next fnext nnext drop drop-while nthnext for take take-nth take-while butlast drop-last for flatten reverse sort sort-by shuffle split-at split-with partition partition-all partition-by map pmap mapcat for replace reductions map-indexed seque first ffirst nfirst second nth when-first last rand-nth zipmap into reduce set vec into-array to-array-2d frequencies group-by apply not-empty some reduce seq? every? not-every? not-any? empty? some filter doseq dorun doall realized? seq vals keys rseq subseq rsubseq lazy-seq repeatedly iterate repeat range line-seq resultset-seq re-seq tree-seq file-seq xml-seq iterator-seq enumeration-seq
  • 9. Composable parts
      ((comp str +) 1 2 3 4)
      ;;=> "10"
      (filter (comp not zero?) [0 1 0 2 0 3 0 4])
      ;;=> (1 2 3 4)
  • 10. The REPL: interactive (responding to input), incremental (occurring in small steps), iterative (repeated with frequency).
  • 12. JVM hosted
      (def streams-builder
        (StreamsBuilder.))
      (def kstream
        (.stream streams-builder "input"
                 (Consumed/with (serde) (serde))))
      (.to kstream "output"
           (Produced/with (serde) (serde)))
  • 13. Jackdaw
      (defn topology-builder
        [{:keys [input output] :as topics}]
        (fn [builder]
          (-> (j/kstream builder input)
              (j/flat-map-values split-lines)
              (j/group-by (fn [[_ v]] v))
              (j/count)
              (j/to-kstream)
              (j/to output))
          builder))
      [Diagram labels: topic: input, flatMapValues, groupBy, count, toStream, topic: output]
  • 15. Jackdaw Features • Create and manage topics • Produce and consume • Kafka Streams DSL • EDN, JSON, and Avro serdes • End-to-end tests
  • 16. Testing with Jackdaw
      (def commands
        [[:write! :input "all streams lead to kafka" {:key-fn (constantly "")}]
         [:write! :input "hello kafka streams" {:key-fn (constantly "")}]
         [:watch (fn [journal]
                   (= 8 (count (get-in journal [:topics :output]))))
          {:timeout 2000}]])
  • 18. Testing with Jackdaw
      (deftest test-xf-word-count
        (jtf/with-fixtures [(jtf/integration-fixture topology-builder test-config)]
          (jackdaw.test/with-test-machine (test-transport test-config)
            (fn [machine]
              (let [{:keys [journal]} (jackdaw.test/run-test machine commands)]
                (is (= 1 (word-count journal "hello")))
                (is (= 2 (word-count journal "kafka"))))))))
  • 19. ["inside every large program"
       "is a small program"
       "struggling to get out"]
  • 20. Process all the lines, one step at a time.
  • 21. Steps: 1. Split each line into words. 2. Count the words for each line. 3. Reduce over all the lines.
  • 22. Split each line into words.
      (["inside" "every" "large" "program"]
       ["is" "a" "small" "program"]
       ["struggling" "to" "get" "out"])
  • 23. Count the words for each line.
      ({"inside" 1 "every" 1 "large" 1 "program" 1}
       {"is" 1 "a" 1 "small" 1 "program" 1}
       {"struggling" 1 "to" 1 "get" 1 "out" 1})
  • 24. Reduce over all the lines.
      {"inside" 1 "every" 1 "large" 1 "program" 2
       "is" 1 "a" 1 "small" 1
       "struggling" 1 "to" 1 "get" 1 "out" 1}
  • 25. (defn f [acc x]
        (merge-with + acc x))
      (->> coll
           (map (fn [x] (split x #" ")))
           (map frequencies)
           (reduce f))
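The three steps on slides 21 to 25 can be run at a REPL roughly as follows. This is a minimal sketch: it assumes the slide's `split` is `clojure.string/split`, and the names `lines`, `merge-counts`, and `word-counts` are illustrative rather than taken from the talk.

```clojure
(require '[clojure.string :as str])

;; The input lines from slide 19.
(def lines
  ["inside every large program"
   "is a small program"
   "struggling to get out"])

;; Reducing function: add the counts of map x into the accumulator map.
(defn merge-counts [acc x]
  (merge-with + acc x))

(def word-counts
  (->> lines
       (map #(str/split % #" "))  ; step 1: split each line into words
       (map frequencies)          ; step 2: count the words for each line
       (reduce merge-counts)))    ; step 3: reduce over all the lines

(get word-counts "program")
;;=> 2
```

Each `map` call walks the whole collection before the next step starts, which is the "one step at a time" shape described on slide 20.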
  • 26. Process all the lines, one line at a time.
  • 27. {}
      {"inside" 1 "every" 1 "large" 1 "program" 1}
      {"inside" 1 "every" 1 "large" 1 "program" 2 "is" 1 "a" 1 "small" 1}
      {"inside" 1 "every" 1 "large" 1 "program" 2 "is" 1 "a" 1 "small" 1 "struggling" 1 "to" 1 "get" 1 "out" 1}
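One way to see the intermediate accumulator values from slide 27 at a REPL is `clojure.core/reductions`, which returns every intermediate result of a reduce. The talk does not use `reductions` here; this is only a sketch for inspection, again assuming `clojure.string/split`, with illustrative names.

```clojure
(require '[clojure.string :as str])

(def lines
  ["inside every large program"
   "is a small program"
   "struggling to get out"])

;; Per-line word counts: split a line, then count its words.
(def line-counts
  (map (comp frequencies #(str/split % #" ")) lines))

;; reductions yields the initial value followed by the accumulator
;; after each line has been merged in.
(def running-states
  (reductions (fn [acc x] (merge-with + acc x)) {} line-counts))

(count running-states)
;;=> 4
(get (nth running-states 2) "program")
;;=> 2
```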
  • 28. (comp
        (partial map (fn [x] (split x #" ")))
        (partial map frequencies))
  • 29. Transducers • Composable transformations • Take a reducing fn and return another reducing fn • Decoupled from inputs and outputs • Reusable • Fast
  • 30. (map inc [1 2 3 4])
      ;;=> (2 3 4 5)
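To illustrate the bullets on slide 29: called without a collection, `map` and `filter` return transducers, and one composed transducer can be reused with different inputs and outputs. The sketch below uses only `clojure.core`; the names `xf` and `twice` are illustrative. It also shows that composed transducers process each element left to right, whereas `comp` on ordinary functions applies the rightmost function first.

```clojure
;; A composed transducer: increment, then keep even numbers.
(def xf (comp (map inc) (filter even?)))

(into [] xf [1 2 3 4])        ; eager, builds a vector
;;=> [2 4]
(sequence xf [1 2 3 4])       ; lazy sequence
;;=> (2 4)
(transduce xf + 0 [1 2 3 4])  ; reduce to a single value
;;=> 6

;; Ordering: ordinary comp applies right to left...
(defn twice [x] (* 2 x))
((comp inc twice) 3)
;;=> 7
;; ...while composed transducers transform each element left to right.
(into [] (comp (map inc) (map twice)) [3])
;;=> [8]
```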
  • 32. (def count-words
        (comp
          (map (fn [x] (split x #" ")))
          (map frequencies)))
      (def f
        (completing
          (fn [acc x] (merge-with + acc x))))
  • 34. {"inside" 1 "every" 1 "large" 1 "program" 2
       "is" 1 "a" 1 "small" 1
       "struggling" 1 "to" 1 "get" 1 "out" 1}
      But...
      We want...
      [["inside" 1] ["every" 1] ["large" 1] ["program" 1]
       ["is" 1] ["a" 1] ["small" 1] ["program" 2]
       ["struggling" 1] ["to" 1] ["get" 1] ["out" 1]]
  • 35. [(["inside" 1] ["every" 1] ["large" 1] ["program" 1])
       (["is" 1] ["a" 1] ["small" 1] ["program" 2])
       (["struggling" 1] ["to" 1] ["get" 1] ["out" 1])]
      [{"inside" 1, "every" 1, "large" 1, "program" 1}
       {"is" 1, "a" 1, "small" 1, "program" 1}
       {"struggling" 1, "to" 1, "get" 1, "out" 1}]
  • 36. (defn xf-running-total []
        (fn [rf]
          (let [state (volatile! {})]
            (fn
              ([] (rf))
              ([result] (rf result))
              ([result input]
               (let [next (as-> input %
                            (vswap! state #(merge-with (fnil + 0) %1 %2) %)
                            (select-keys % (keys input))
                            (map vec %))]
                 (rf result next)))))))
  • 40. Putting it all together...
      (def count-words
        (comp
          (map (fn [x] (split x #" ")))
          (map frequencies)
          (xf-running-total)))
      (transduce count-words concat coll)
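For readers who want to evaluate slides 36 and 40 directly, the following sketch assembles them into one runnable unit. The transducer is copied from slide 36; the assumptions are that `split` is `clojure.string/split` and that `coll` holds the three lines from slide 19. The name `result` is illustrative.

```clojure
(require '[clojure.string :as str])

;; Stateful transducer from slide 36: keeps running totals in a volatile
;; and emits, for each input map, the updated totals for that map's keys.
(defn xf-running-total []
  (fn [rf]
    (let [state (volatile! {})]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (let [next (as-> input %
                      (vswap! state #(merge-with (fnil + 0) %1 %2) %)
                      (select-keys % (keys input))
                      (map vec %))]
           (rf result next)))))))

(def count-words
  (comp
    (map (fn [x] (str/split x #" ")))
    (map frequencies)
    (xf-running-total)))

;; Assumed input: the lines from slide 19.
(def coll
  ["inside every large program"
   "is a small program"
   "struggling to get out"])

;; concat is the reducing function, so each step's pairs are appended.
(def result (transduce count-words concat coll))

(count result)
;;=> 12
```

The result contains both `["program" 1]` (from the first line) and `["program" 2]` (from the second), which is the running-total behaviour slide 34 asks for.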
  • 42. ?
  • 44. (defn transduce-kstream [xf kstream]
        (-> kstream
            (j/transform (fn [] (transformer xf)) ["transducer"])
            (j/flat-map (fn [[_ v]] v))))
  • 46. (defn transformer [xf]
        (let [ctx (atom nil)]
          (reify Transformer
            (init [_ context]
              (reset! ctx context))
            (transform [_ k v]
              (let [store (.getStateStore @ctx "transducer")
                    v (first (into [] (xf store) [[k v]]))]
                (KeyValue/pair k v)))
            (close [_]))))
  • 48. (defn xf-running-total [state swap-fn]
        (fn [rf]
          (fn
            ([] (rf))
            ([result] (rf result))
            ([result input]
             (let [[k v] input
                   next (as-> v %
                          (swap-fn state #(merge-with (fnil + 0) %1 %2) %)
                          (select-keys % (keys v))
                          (map vec %))]
               (rf result next))))))
  • 50. Putting it all together...
      (defn count-words [state swap-fn]
        (comp
          (map (fn [[k v]] [k (split v #" ")]))
          (map (fn [[k v]] [k (frequencies v)]))
          (xf-running-total state swap-fn)))
  • 51. (transduce (count-words (atom {}) swap!) concat coll)
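The state-passing version on slides 48 to 51 can be tried without Kafka by supplying an atom and `swap!`, as slide 51 does. The sketch below copies the slide code and adds two assumptions: `split` is `clojure.string/split`, and `coll` is a sequence of `[key value]` pairs whose keys (`"1"`, `"2"`, `"3"`) are made up for illustration. The names `state` and `result` are also illustrative.

```clojure
(require '[clojure.string :as str])

;; From slide 48: the state and the function that updates it are
;; passed in, so the same transducer can run against an atom at the
;; REPL or against some other store elsewhere.
(defn xf-running-total [state swap-fn]
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input]
       (let [[k v] input
             next (as-> v %
                    (swap-fn state #(merge-with (fnil + 0) %1 %2) %)
                    (select-keys % (keys v))
                    (map vec %))]
         (rf result next))))))

;; From slide 50: each step now works on [key value] pairs.
(defn count-words [state swap-fn]
  (comp
    (map (fn [[k v]] [k (str/split v #" ")]))
    (map (fn [[k v]] [k (frequencies v)]))
    (xf-running-total state swap-fn)))

;; Assumed input shape: keyed records, like messages on a topic.
(def coll
  [["1" "inside every large program"]
   ["2" "is a small program"]
   ["3" "struggling to get out"]])

(def state (atom {}))
(def result (transduce (count-words state swap!) concat coll))

(get @state "program")
;;=> 2
```

Because the totals live in the atom rather than inside the transducer, they remain inspectable after the run, which is convenient when developing at the REPL.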
  • 53. Example: Simple Ledger
      [["1" {:debit-account "tech" :credit-account "cash" :amount 1000}]
       ["2" {:debit-account "cash" :credit-account "sales" :amount 2000}]]
  • 54. We want...
      (["tech" {:account-name "tech" :before-balance 0 :after-balance -1000}]
       ["cash" {:account-name "cash" :before-balance 0 :after-balance 1000}]
       ["cash" {:account-name "cash" :before-balance 1000 :after-balance -1000}]
       ["sales" {:account-name "sales" :before-balance 0 :after-balance 2000}])
  • 55. (->> coll
           (transduce (xf-split-entries nil nil) concat)
           (transduce (xf-running-balances (atom {}) swap!) concat))
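The transcript does not include the bodies of `xf-split-entries` and `xf-running-balances`, so the definitions below are hypothetical: one possible implementation that reproduces the output on slide 54 from the input on slide 53. The sign convention (the debit account decreases by the amount, the credit account increases) is inferred from slide 54, and the intermediate `:amount` key is an assumption of this sketch.

```clojure
;; Hypothetical: split each journal entry into two signed account
;; movements. The state arguments are unused, mirroring the
;; (xf-split-entries nil nil) call on slide 55.
(defn xf-split-entries [_state _swap-fn]
  (map (fn [[_ {:keys [debit-account credit-account amount]}]]
         [[debit-account  {:account-name debit-account  :amount (- amount)}]
          [credit-account {:account-name credit-account :amount amount}]])))

;; Hypothetical: keep a balance per account in `state` and report the
;; balance before and after each movement.
(defn xf-running-balances [state swap-fn]
  (map (fn [[k {:keys [account-name amount]}]]
         (let [balances (swap-fn state update account-name (fnil + 0) amount)
               after    (get balances account-name)
               before   (- after amount)]
           ;; Wrapped in a vector so that concat appends one pair.
           [[k {:account-name   account-name
                :before-balance before
                :after-balance  after}]]))))

;; Input from slide 53.
(def coll
  [["1" {:debit-account "tech" :credit-account "cash" :amount 1000}]
   ["2" {:debit-account "cash" :credit-account "sales" :amount 2000}]])

(def result
  (->> coll
       (transduce (xf-split-entries nil nil) concat)
       (transduce (xf-running-balances (atom {}) swap!) concat)))

(second (nth result 2))
;;=> {:account-name "cash", :before-balance 1000, :after-balance -1000}
```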
  • 59. Resources
      "Transducers" by Rich Hickey. Strange Loop. (2014, September).
      Kafka and the REPL: Stream Processing, the Functional Way. Reese, C. (2018, November).
      Testing Event-Driven Systems. Chambers, A. (2018, April).
      Jackdaw. Funding Circle. (2017).