Data Intensive Applications with Apache Flink
1. Milan – July 13, 2016
Simone Robutti
Machine Learning Engineer at Radicalbit
@SimoneRobutti
2. Agenda
1. Brief Introduction to Apache Flink
○ Why
○ What
○ How
2. Machine Learning on Flink
○ Present landscape
○ Future of the Ecosystem
3. Closing notes on Radicalbit (shameless plug ahead)
14. “I have seen people insisting on using Hadoop for datasets that could easily fit on a flash drive and could easily be processed on a laptop.”
- Yann LeCun
ML on Flink
23. Apache Beam
Programming model for data processing pipelines
● Streaming first, batch as a bounded stream
● Layered API: What, Where, When, How
● Platform agnostic: same program, different runners
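The "batch as a bounded stream" idea and the What/Where/When/How questions can be sketched in plain Python. This is an illustration only and does not use the Beam SDK: `Event`, `fixed_window_sums`, and `bounded_source` are hypothetical names, and the "When" and "How" choices (emit once, after the input ends) are the simplest possible assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (event_time_seconds, key, value)
Event = Tuple[int, str, int]


def fixed_window_sums(events: Iterable[Event], window_size: int) -> Dict[Tuple[int, str], int]:
    """Sum values per key and per fixed event-time window.

    What:  a per-key sum.
    Where: fixed (tumbling) windows of `window_size` seconds in event time.
    When:  once, after the input is exhausted (assumed; no early triggers).
    How:   a single final result per window (no refinements).
    """
    sums: Dict[Tuple[int, str], int] = defaultdict(int)
    for event_time, key, value in events:
        # Assign the event to the window that contains its timestamp.
        window_start = event_time - event_time % window_size
        sums[(window_start, key)] += value
    return dict(sums)


def bounded_source() -> Iterator[Event]:
    """A finite stream: the 'batch' case is just a stream that ends."""
    yield from [(1, "a", 2), (4, "b", 1), (12, "a", 5), (14, "a", 1)]


# The same function consumes a lazy stream or a materialized batch.
result = fixed_window_sums(bounded_source(), window_size=10)
print(result)
```

The function never asks whether its input is a list or a generator, which is the point of treating a batch as a stream that happens to end. An unbounded source would need a real "When" policy (watermarks and triggers), which this sketch deliberately leaves out.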
30. Our vision
Flink can become the ideal choice for building real-time, decision-heavy applications with high data throughput
To achieve this:
● Ambitious applications (aim for real-time services)
● Reliable distributed online learning (Proteus?)
● A Pipelining Framework (experiment fast, increase testability and modularity)
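To make the last two points concrete, here is a minimal single-process sketch of a composable preprocessing pipeline feeding a model that learns one event at a time. It is plain Python, not Flink or Proteus code: `OnlineLinearModel`, `scale`, `compose`, and `training_stream` are hypothetical names, and the synthetic data and learning rate are assumptions chosen so the example converges.

```python
from typing import Callable, Iterator, List, Tuple

Features = List[float]
Stage = Callable[[Features], Features]


def scale(factor: float) -> Stage:
    """Build a stage that multiplies every feature by `factor`."""
    def stage(x: Features) -> Features:
        return [xi * factor for xi in x]
    return stage


def compose(*stages: Stage) -> Stage:
    """Chain stages left to right; each one can be tested in isolation."""
    def run(x: Features) -> Features:
        for stage in stages:
            x = stage(x)
        return x
    return run


class OnlineLinearModel:
    """Linear regressor updated by one SGD step per incoming example."""

    def __init__(self, n_features: int, learning_rate: float = 0.5) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x: Features) -> float:
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x: Features, y: float) -> None:
        # Gradient step on the squared error of this single example.
        error = self.predict(x) - y
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * error


def training_stream(epochs: int) -> Iterator[Tuple[Features, float]]:
    """Synthetic, noise-free events with y = 3 * (0.1 * raw) + 1."""
    for _ in range(epochs):
        for raw in range(1, 11):
            yield [float(raw)], 3 * 0.1 * raw + 1


preprocess = compose(scale(0.1))
model = OnlineLinearModel(n_features=1)
for raw_features, label in training_stream(epochs=500):
    model.update(preprocess(raw_features), label)

# Should be close to 2.5, since 3 * 0.5 + 1 = 2.5.
print(model.predict(preprocess([5.0])))
```

Keeping preprocessing as small composable stages is what makes fast experimentation and unit testing practical. A distributed version would also have to handle state partitioning and fault tolerance, which this sketch ignores.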