Data flows all around us: from phones, credit cards, sensor-equipped buildings, vending machines, thermostats, trains, buses, planes, posts to social media, digital pictures and video, and so on.
http://www.datascicon.tech
42. @gamussa @confluentinc @DataSciCon
Time model
Different use cases require different time semantics
The majority of use cases require event-time semantics
Other use cases may require processing-time, or special variants like ingestion-time
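To make the three time semantics concrete, here is a minimal sketch in plain Python. It is not the API of any streaming library; the `Event` class, its field names, and the `timestamp_of` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical record carrying two of the three possible timestamps."""
    user: str
    event_time: float      # when the event happened at the source (seconds)
    ingestion_time: float  # when the messaging system stored the event (seconds)


def timestamp_of(event, processing_time, semantics="event-time"):
    """Return the timestamp a processor would use under the chosen semantics.

    processing_time is the wall-clock time at which the processor handles
    the event, so it is supplied by the caller rather than stored on the event.
    """
    if semantics == "event-time":
        return event.event_time
    if semantics == "ingestion-time":
        return event.ingestion_time
    if semantics == "processing-time":
        return processing_time
    raise ValueError(f"unknown time semantics: {semantics}")
```

The same event can therefore land in different windows depending on which of the three timestamps is used.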
46. @gamussa @confluentinc @DataSciCon
Windowing
[Figure: windowing of input data plotted by processing-time and event-time. Colors represent different users' events (alice, bob, dave); rectangles denote different event-time windows.]
48. @gamussa @confluentinc @DataSciCon
Windowing
Windowing is an operation that groups events
Most commonly needed: time windows, session windows
Examples:
• Real-time monitoring: 5-minute averages
• Reader behavior on a website: user browsing sessions
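The two examples above can be sketched in plain Python. This is an illustrative sketch, not a streaming library API: the function names, the 300-second window size, and the 30-minute inactivity gap are assumptions made for the example.

```python
from collections import defaultdict


def tumbling_averages(readings, size_s=300):
    """Average values per fixed, non-overlapping time window.

    readings: iterable of (event_time_s, value) pairs.
    Returns {window_start_s: average}, ordered by window start.
    """
    sums = defaultdict(lambda: [0.0, 0])  # window start -> [total, count]
    for ts, value in readings:
        start = (ts // size_s) * size_s   # align the timestamp to its window
        acc = sums[start]
        acc[0] += value
        acc[1] += 1
    return {start: total / count for start, (total, count) in sorted(sums.items())}


def sessions(timestamps, gap_s=1800):
    """Split one user's event times into sessions.

    A new session starts whenever the gap since the previous event
    exceeds gap_s seconds of inactivity.
    """
    result = []
    for ts in sorted(timestamps):
        if result and ts - result[-1][-1] <= gap_s:
            result[-1].append(ts)   # still within the current session
        else:
            result.append([ts])     # inactivity gap exceeded: open a new session
    return result
```

Time windows have boundaries fixed by the clock, while session windows have boundaries that depend on the data itself, which is why the two are computed differently.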
53. @gamussa @confluentinc @DataSciCon
Out-of-order and late data
Users with mobile phones board an airplane and lose Internet connectivity
Emails are written during the 10h flight
Once Internet connectivity is restored, the phones send the queued emails
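The airplane scenario can be sketched as follows, assuming hourly windows and three made-up emails. The record layout `(event_time_s, arrival_time_s)` and the function names are hypothetical.

```python
HOUR = 3600


def window_start(ts_s, size_s=HOUR):
    """Align a timestamp (seconds) to the start of its fixed-size window."""
    return (ts_s // size_s) * size_s


def count_per_window(records, by="event"):
    """Count records per hourly window.

    records: iterable of (event_time_s, arrival_time_s) pairs.
    by="event" windows on when the email was written;
    by="arrival" windows on when it reached the processor.
    """
    index = 0 if by == "event" else 1
    counts = {}
    for record in records:
        start = window_start(record[index])
        counts[start] = counts.get(start, 0) + 1
    return counts


# Illustrative data: three emails written during the flight,
# all delivered a few seconds after landing at hour 10.
emails = [
    (1 * HOUR + 300, 10 * HOUR + 1),
    (4 * HOUR, 10 * HOUR + 2),
    (8 * HOUR + 1800, 10 * HOUR + 3),
]
```

With event-time windowing the three emails fall into three different hours; with arrival-based windowing they all pile into the hour in which connectivity was restored.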
57. @gamussa @confluentinc @DataSciCon
Stream Processing: results
• Yes, it’s possible to get computation
results in real time
• Windows – finite view of infinite data
• Based on temporal characteristics of the event
• Late event processing
• You choose how long to wait
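"You choose how long to wait" can be illustrated with a grace period. The sketch below is plain Python under stated assumptions: stream time is taken to be the highest event time seen so far, and a window stops accepting events once stream time passes its end plus the grace period. The class name and its fields are hypothetical, and real stream processors may define these rules differently.

```python
class WindowedCounter:
    """Count events per fixed window, accepting late events within a grace period."""

    def __init__(self, size_s, grace_s):
        self.size_s = size_s    # window length in seconds
        self.grace_s = grace_s  # how long to keep waiting after a window ends
        self.stream_time = 0    # highest event time observed so far (assumption)
        self.counts = {}        # window start -> number of accepted events
        self.dropped = 0        # events that arrived after their window closed

    def add(self, event_time_s):
        """Add one event; return True if counted, False if it was too late."""
        self.stream_time = max(self.stream_time, event_time_s)
        start = (event_time_s // self.size_s) * self.size_s
        window_closes_at = start + self.size_s + self.grace_s
        if self.stream_time >= window_closes_at:
            self.dropped += 1
            return False
        self.counts[start] = self.counts.get(start, 0) + 1
        return True
```

A longer grace period admits more late events at the cost of keeping window state around longer before results can be considered final.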