SlideShare une entreprise Scribd logo
1  sur  43
Télécharger pour lire hors ligne
Simplifying Event
Streaming
Tools for Location Transparency & Data Evolution
Paul Osman - @paulosman - posman@underarmour.com - Staff Software Engineer, Under Armour Connected Fitness - Kafka Summit 2016
2
Introduction
• Paul Osman
• Staff Software Engineer - Under Armour Connected Fitness
• Formerly at PagerDuty, 500px, SoundCloud
• @paulosman
• posman@underarmour.com
Paul Osman - @paulosman - posman@underarmour.com - Staff Software Engineer, Under Armour Connected Fitness - Kafka Summit 2016
3
4
5
6
Under Armour Connected Fitness
• November 2013 - Under Armour acquires MapMyFitness Inc

• February 2015 - Under Armour acquires MyFitnessPal 

• February 2015 - Under Armour acquires Endomondo

• January 2016 - Announce HealthBox , Gemini 2 RE
7
Under Armour Connected Fitness
8
MyFitnessPal and Kafka
• MFP started as a Rails monolith

• Broken into microservices written in Scala and Ruby

• Data integration challenges

• Service dependencies difficult to manage
9
Solution
10
Pushing a data migration…
11
You broke my consumer!
12
My bad!
13
Fix it?
14
Other Challenges…
• Client libraries for non-JVM languages were of varying quality

• Developers needed to know about Kafka

• Wanted to federate Kafka clusters - no one team should have to maintain all
clusters
15
MyFitnessPal joined Under Armour
16
17
Project Golden Gate
18
19
Challenges Recap
• Engineers needed to know a lot about Kafka clusters

• Data migrations broke consumer contracts

• Client libraries for non-JVM languages

• Management of Kafka clusters

• Data retention policies
20
Location Transparency
• A publishing client needn’t be concerned with things like clusters, topics, etc

• Need some kind of source of truth for event locations
21
Topology Service
• Each event has a namespace and event type (globally unique)

• The topology service instructs clients where to publish or consume those
messages

• Introduces concept of “zones” which represent one or more clusters
22
Data Migrations
• Solved problem - use Schemas

• Confluent Schema Registry + Small Service to capture Metadata (event type
and namespace)

• Confluent Schema Registry uses Avro, so we do too
23
Data Migrations
pending available
{

event_type: "ActivityFeedStoryUpdate",

namespace: "mmf",

status: "pending",
confluent_subject: "mmf_activityfeedstoryupdate",
schema_id: "bb68e5381e88d52574b0f50a000fbe9b"

}
24
Registering a Schema
$> schema-registry register schema --event-type FoodEntryCreated --file-name FoodEntryCreated.avsc 
--namespace mfp -p
$> schema-registry activate schema 20afc5a8f9c017c1f4e82757a7a88f5b -p
25
26
Publishing
27
JVM Languages
• Java and Scala client libraries
28
Non-JVM Languages
projects/GoldenGate $ curl -D - -H'Content-Type: application/json' http://localhost:3005/golden-gate-proxy/produce -
d'@schemas/integ-message.json'

HTTP/1.1 202 Accepted
Server: spray-can/1.3.3
Date: Tue, 19 Apr 2016 17:34:21 GMT
Content-Type: application/json; charset=UTF-8
Content-Length: 419
{
"items": [{
"producer_id": "foo",
"schema_id": "5635ce15a15213105c091d5e0945b0c2",
"zone": "mfp",
"payload": {

"context" : null,

"email_address" : "vneo",

"email_source" : "fadipwxfmotvav",

"first_name" : null,

"last_name" : null,
"country" : {"string" : "US"},
}]
}
29
Consuming
30
JVM Languages
31
Non-JVM Languages
projects/GoldenGate $ ./gg-consumer-proxy —subscriptions=mmf/activity_feed_updated=http://localhost:3000/callback
32
Federation of Kafka Clusters
Publisher
Topology
Service
Kafka
Consumer
33
Data Retention
• Archiving is now just a job for a specialized consumer

• Archiving is done “per-zone”. Some data shouldn’t be archived, it only gets
published to zones that are not archived (as per event type)

• In our case, data is stored in S3 and then accessed through a variety of tools
for analysis, batch processing, etc.
Adoption Pains
35
Adoption Pains
Leaky Abstractions
36
Adoption Pains
Publishers Consumers
Schema Designers
37
Help Publishers - Avro Helper Library
• helpful-avro Scala library

• Adds a layer of robustness 

• Tries a few tricks to make a payload validate against a schema
38
{
"name": "Person",
"namespace": "org.example",
"type": "record",
"fields": [
{"name": "first_name", "type": "string"},
{"name": "last_name", "type": "string"},
{"name": "age", "type": ["null", "int"]},
{"name": "username", "type": ["null", "string"]}
]
}
Example Schema
39
{
"name": "Person",
"namespace": "org.example",
"type": "record",
"fields": [
{"name": "first_name", "type": "string"},
{"name": "last_name", "type": "string"},
{"name": "age", "type": ["null", "int"]},
{"name": "username", "type": ["null", "string"]}
]
}
Optional Fields
Nullable / Optional Fields
40
Nullable / Optional Fields
{
"first_name": "Paul",
"last_name": "Osman",
"username": "paulosman"
}
{
"first_name": "Paul",
"last_name": "Osman",
"age": {"null":null},
"username": {"string": "paulosman"}
}
age omitted
not type annotated
• Browser and CLI based tools that allow people to observe activity being
published to a specific zone

• Give people a way to see their event go through the system

• End to end monitoring, monitoring of consumer lag
41
Observability
42
Future Plans
• Make schema authoring and registration easier and more automated
• Extend helpful-avro to work with Case Classes and POJOs

• Further hide implementation details
43CONFIDENTIAL & BUSINESS PROPRIETARY INFORMATION OF UNDER ARMOUR, INC. COPYRIGHT (C)2015
Thank You
http://underarmour.jobs

Contenu connexe

Tendances

Real-Time Dynamic Data Export Using the Kafka Ecosystem
Real-Time Dynamic Data Export Using the Kafka EcosystemReal-Time Dynamic Data Export Using the Kafka Ecosystem
Real-Time Dynamic Data Export Using the Kafka Ecosystem
confluent
 
Maximize the Business Value of Machine Learning and Data Science with Kafka (...
Maximize the Business Value of Machine Learning and Data Science with Kafka (...Maximize the Business Value of Machine Learning and Data Science with Kafka (...
Maximize the Business Value of Machine Learning and Data Science with Kafka (...
confluent
 
Digital Transformation in Healthcare with Kafka—Building a Low Latency Data P...
Digital Transformation in Healthcare with Kafka—Building a Low Latency Data P...Digital Transformation in Healthcare with Kafka—Building a Low Latency Data P...
Digital Transformation in Healthcare with Kafka—Building a Low Latency Data P...
confluent
 

Tendances (20)

Real-Time Dynamic Data Export Using the Kafka Ecosystem
Real-Time Dynamic Data Export Using the Kafka EcosystemReal-Time Dynamic Data Export Using the Kafka Ecosystem
Real-Time Dynamic Data Export Using the Kafka Ecosystem
 
Kafka Summit NYC 2017 - Data Processing at LinkedIn with Apache Kafka
Kafka Summit NYC 2017 - Data Processing at LinkedIn with Apache KafkaKafka Summit NYC 2017 - Data Processing at LinkedIn with Apache Kafka
Kafka Summit NYC 2017 - Data Processing at LinkedIn with Apache Kafka
 
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
 
Maximize the Business Value of Machine Learning and Data Science with Kafka (...
Maximize the Business Value of Machine Learning and Data Science with Kafka (...Maximize the Business Value of Machine Learning and Data Science with Kafka (...
Maximize the Business Value of Machine Learning and Data Science with Kafka (...
 
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
 
Developing custom transformation in the Kafka connect to minimize data redund...
Developing custom transformation in the Kafka connect to minimize data redund...Developing custom transformation in the Kafka connect to minimize data redund...
Developing custom transformation in the Kafka connect to minimize data redund...
 
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...
 
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...
 
Apache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformApache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platform
 
Druid + Kafka: transform your data-in-motion to analytics-in-motion | Gian Me...
Druid + Kafka: transform your data-in-motion to analytics-in-motion | Gian Me...Druid + Kafka: transform your data-in-motion to analytics-in-motion | Gian Me...
Druid + Kafka: transform your data-in-motion to analytics-in-motion | Gian Me...
 
Kafka: Journey from Just Another Software to Being a Critical Part of PayPal ...
Kafka: Journey from Just Another Software to Being a Critical Part of PayPal ...Kafka: Journey from Just Another Software to Being a Critical Part of PayPal ...
Kafka: Journey from Just Another Software to Being a Critical Part of PayPal ...
 
Billions of Messages in Real Time: Why Paypal & LinkedIn Trust an Engagement ...
Billions of Messages in Real Time: Why Paypal & LinkedIn Trust an Engagement ...Billions of Messages in Real Time: Why Paypal & LinkedIn Trust an Engagement ...
Billions of Messages in Real Time: Why Paypal & LinkedIn Trust an Engagement ...
 
Kafka for Real-Time Event Processing in Serverless Environments
Kafka for Real-Time Event Processing in Serverless EnvironmentsKafka for Real-Time Event Processing in Serverless Environments
Kafka for Real-Time Event Processing in Serverless Environments
 
Streaming data in the cloud with Confluent and MongoDB Atlas | Robert Waters,...
Streaming data in the cloud with Confluent and MongoDB Atlas | Robert Waters,...Streaming data in the cloud with Confluent and MongoDB Atlas | Robert Waters,...
Streaming data in the cloud with Confluent and MongoDB Atlas | Robert Waters,...
 
Kafka in the Enterprise—A Two-Year Journey to Build a Data Streaming Platform...
Kafka in the Enterprise—A Two-Year Journey to Build a Data Streaming Platform...Kafka in the Enterprise—A Two-Year Journey to Build a Data Streaming Platform...
Kafka in the Enterprise—A Two-Year Journey to Build a Data Streaming Platform...
 
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
 
Digital Transformation in Healthcare with Kafka—Building a Low Latency Data P...
Digital Transformation in Healthcare with Kafka—Building a Low Latency Data P...Digital Transformation in Healthcare with Kafka—Building a Low Latency Data P...
Digital Transformation in Healthcare with Kafka—Building a Low Latency Data P...
 
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
 

En vedette

Building an Event-oriented Data Platform with Kafka, Eric Sammer
Building an Event-oriented Data Platform with Kafka, Eric Sammer Building an Event-oriented Data Platform with Kafka, Eric Sammer
Building an Event-oriented Data Platform with Kafka, Eric Sammer
confluent
 
Introducing Kafka Streams: Large-scale Stream Processing with Kafka, Neha Nar...
Introducing Kafka Streams: Large-scale Stream Processing with Kafka, Neha Nar...Introducing Kafka Streams: Large-scale Stream Processing with Kafka, Neha Nar...
Introducing Kafka Streams: Large-scale Stream Processing with Kafka, Neha Nar...
confluent
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
confluent
 

En vedette (20)

Kafka, the "DialTone for Data": Building a self-service, scalable, streaming ...
Kafka, the "DialTone for Data": Building a self-service, scalable, streaming ...Kafka, the "DialTone for Data": Building a self-service, scalable, streaming ...
Kafka, the "DialTone for Data": Building a self-service, scalable, streaming ...
 
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
 
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
Deploying Kafka at Dropbox, Mark Smith, Sean FellowsDeploying Kafka at Dropbox, Mark Smith, Sean Fellows
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
 
Building an Event-oriented Data Platform with Kafka, Eric Sammer
Building an Event-oriented Data Platform with Kafka, Eric Sammer Building an Event-oriented Data Platform with Kafka, Eric Sammer
Building an Event-oriented Data Platform with Kafka, Eric Sammer
 
Real-Time Analytics Visualized w/ Kafka + Streamliner + MemSQL + ZoomData, An...
Real-Time Analytics Visualized w/ Kafka + Streamliner + MemSQL + ZoomData, An...Real-Time Analytics Visualized w/ Kafka + Streamliner + MemSQL + ZoomData, An...
Real-Time Analytics Visualized w/ Kafka + Streamliner + MemSQL + ZoomData, An...
 
Stream Processing with Kafka in Uber, Danny Yuan
Stream Processing with Kafka in Uber, Danny Yuan Stream Processing with Kafka in Uber, Danny Yuan
Stream Processing with Kafka in Uber, Danny Yuan
 
Introducing Kafka Streams: Large-scale Stream Processing with Kafka, Neha Nar...
Introducing Kafka Streams: Large-scale Stream Processing with Kafka, Neha Nar...Introducing Kafka Streams: Large-scale Stream Processing with Kafka, Neha Nar...
Introducing Kafka Streams: Large-scale Stream Processing with Kafka, Neha Nar...
 
Securing Kafka
Securing Kafka Securing Kafka
Securing Kafka
 
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron SchildkroutKafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
 
Apache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmapApache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmap
 
Never at Rest - IoT and Data Streaming at British Gas Connected Homes, Paul M...
Never at Rest - IoT and Data Streaming at British Gas Connected Homes, Paul M...Never at Rest - IoT and Data Streaming at British Gas Connected Homes, Paul M...
Never at Rest - IoT and Data Streaming at British Gas Connected Homes, Paul M...
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Ingesting Healthcare Data, Micah Whitacre
Ingesting Healthcare Data, Micah WhitacreIngesting Healthcare Data, Micah Whitacre
Ingesting Healthcare Data, Micah Whitacre
 
Espresso Database Replication with Kafka, Tom Quiggle
Espresso Database Replication with Kafka, Tom QuiggleEspresso Database Replication with Kafka, Tom Quiggle
Espresso Database Replication with Kafka, Tom Quiggle
 
Towards A Stream Centered Enterprise, Gabriel Commeau
Towards A Stream Centered Enterprise, Gabriel CommeauTowards A Stream Centered Enterprise, Gabriel Commeau
Towards A Stream Centered Enterprise, Gabriel Commeau
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
 
A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology
 
The Rise of Real Time
The Rise of Real TimeThe Rise of Real Time
The Rise of Real Time
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
 
Demystifying Stream Processing with Apache Kafka
Demystifying Stream Processing with Apache KafkaDemystifying Stream Processing with Apache Kafka
Demystifying Stream Processing with Apache Kafka
 

Similaire à Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman

End-to-end Data Governance with Apache Avro and Atlas
End-to-end Data Governance with Apache Avro and AtlasEnd-to-end Data Governance with Apache Avro and Atlas
End-to-end Data Governance with Apache Avro and Atlas
DataWorks Summit
 
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Amazon Web Services
 

Similaire à Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman (20)

Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
 
Streaming ETL for All
Streaming ETL for AllStreaming ETL for All
Streaming ETL for All
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
End-to-end Data Governance with Apache Avro and Atlas
End-to-end Data Governance with Apache Avro and AtlasEnd-to-end Data Governance with Apache Avro and Atlas
End-to-end Data Governance with Apache Avro and Atlas
 
M7 and Apache Drill, Micheal Hausenblas
M7 and Apache Drill, Micheal HausenblasM7 and Apache Drill, Micheal Hausenblas
M7 and Apache Drill, Micheal Hausenblas
 
AWS CloudFormation under the Hood (DMG303) | AWS re:Invent 2013
AWS CloudFormation under the Hood (DMG303) | AWS re:Invent 2013AWS CloudFormation under the Hood (DMG303) | AWS re:Invent 2013
AWS CloudFormation under the Hood (DMG303) | AWS re:Invent 2013
 
Data & analytics challenges in a microservice architecture
Data & analytics challenges in a microservice architectureData & analytics challenges in a microservice architecture
Data & analytics challenges in a microservice architecture
 
Using AWS To Build A Scalable Machine Data Analytics Service
Using AWS To Build A Scalable Machine Data Analytics ServiceUsing AWS To Build A Scalable Machine Data Analytics Service
Using AWS To Build A Scalable Machine Data Analytics Service
 
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
AWS re:Invent 2016: Deep Dive: Amazon EMR Best Practices & Design Patterns (B...
 
MLOps journey at Swisscom: AI Use Cases, Architecture and Future Vision
MLOps journey at Swisscom: AI Use Cases, Architecture and Future VisionMLOps journey at Swisscom: AI Use Cases, Architecture and Future Vision
MLOps journey at Swisscom: AI Use Cases, Architecture and Future Vision
 
MacSysAdmin Conference 2019 - Logging
MacSysAdmin Conference 2019 - Logging MacSysAdmin Conference 2019 - Logging
MacSysAdmin Conference 2019 - Logging
 
The Nitty Gritty of Advanced Analytics Using Apache Spark in Python
The Nitty Gritty of Advanced Analytics Using Apache Spark in PythonThe Nitty Gritty of Advanced Analytics Using Apache Spark in Python
The Nitty Gritty of Advanced Analytics Using Apache Spark in Python
 
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data StreamingOracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
 
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...
 
AWS re:Invent 2016: Life Without SSH: Immutable Infrastructure in Production ...
AWS re:Invent 2016: Life Without SSH: Immutable Infrastructure in Production ...AWS re:Invent 2016: Life Without SSH: Immutable Infrastructure in Production ...
AWS re:Invent 2016: Life Without SSH: Immutable Infrastructure in Production ...
 
apidays New York 2022 - Leveraging Event Streaming to Super-Charge your Busin...
apidays New York 2022 - Leveraging Event Streaming to Super-Charge your Busin...apidays New York 2022 - Leveraging Event Streaming to Super-Charge your Busin...
apidays New York 2022 - Leveraging Event Streaming to Super-Charge your Busin...
 
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data StreamingOracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
 
Sumo Logic QuickStart Webinar - Jan 2016
Sumo Logic QuickStart Webinar - Jan 2016Sumo Logic QuickStart Webinar - Jan 2016
Sumo Logic QuickStart Webinar - Jan 2016
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream Processing
 
Public private hybrid - cmdb challenge
Public private hybrid - cmdb challengePublic private hybrid - cmdb challenge
Public private hybrid - cmdb challenge
 

Plus de confluent

Plus de confluent (20)

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 

Dernier

Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Christo Ananth
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
Tonystark477637
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Dr.Costas Sachpazis
 

Dernier (20)

UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICSUNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 

Simplifying Event Streaming: Tools for Location Transparency and Data Evolution, Paul Osman

  • 1. Simplifying Event Streaming Tools for Location Transparency & Data Evolution Paul Osman - @paulosman - posman@underarmour.com - Staff Software Engineer, Under Armour Connected Fitness - Kafka Summit 2016
  • 2. 2 Introduction • Paul Osman • Staff Software Engineer - Under Armour Connected Fitness • Formerly at PagerDuty, 500px, SoundCloud • @paulosman • posman@underarmour.com Paul Osman - @paulosman - posman@underarmour.com - Staff Software Engineer, Under Armour Connected Fitness - Kafka Summit 2016
  • 3. 3
  • 4. 4
  • 5. 5
  • 6. 6 Under Armour Connected Fitness • November 2013 - Under Armour acquires MapMyFitness Inc
 • February 2015 - Under Armour acquires MyFitnessPal 
 • February 2015 - Under Armour acquires Endomondo
 • January 2016 - Announce HealthBox , Gemini 2 RE
  • 8. 8 MyFitnessPal and Kafka • MFP started as a Rails monolith
 • Broken into microservices written in Scala and Ruby
 • Data integration challenges
 • Service dependencies difficult to manage
  • 10. 10 Pushing a data migration…
  • 11. 11 You broke my consumer!
  • 14. 14 Other Challenges… • Client libraries for non-JVM languages were of varying quality
 • Developers needed to know about Kafka
 • Wanted to federate Kafka clusters - no one team should have to maintain all clusters
  • 16. 16
  • 18. 18
  • 19. 19 Challenges Recap • Engineers needed to know a lot about Kafka clusters
 • Data migrations broke consumer contracts
 • Client libraries for non-JVM languages
 • Management of Kafka clusters
 • Data retention policies
  • 20. 20 Location Transparency • A publishing client needn’t be concerned with things like clusters, topics, etc
 • Need some kind of source of truth for event locations
  • 21. 21 Topology Service • Each event has a namespace and event type (globally unique)
 • The topology service instructs clients where to publish or consume those messages
 • Introduces concept of “zones” which represent one or more clusters
  • 22. 22 Data Migrations • Solved problem - use Schemas
 • Confluent Schema Registry + Small Service to capture Metadata (event type and namespace)
 • Confluent Schema Registry uses Avro, so we do too
  • 23. 23 Data Migrations pending available {
 event_type: "ActivityFeedStoryUpdate",
 namespace: "mmf",
 status: "pending", confluent_subject: "mmf_activityfeedstoryupdate", schema_id: "bb68e5381e88d52574b0f50a000fbe9b"
 }
  • 24. 24 Registering a Schema $> schema-registry register schema --event-type FoodEntryCreated --file-name FoodEntryCreated.avsc --namespace mfp -p $> schema-registry activate schema 20afc5a8f9c017c1f4e82757a7a88f5b -p
  • 25. 25
  • 27. 27 JVM Languages • Java and Scala client libraries
  • 28. 28 Non-JVM Languages projects/GoldenGate $ curl -D - -H'Content-Type: application/json' http://localhost:3005/golden-gate-proxy/produce - d'@schemas/integ-message.json'
 HTTP/1.1 202 Accepted Server: spray-can/1.3.3 Date: Tue, 19 Apr 2016 17:34:21 GMT Content-Type: application/json; charset=UTF-8 Content-Length: 419 { "items": [{ "producer_id": "foo", "schema_id": "5635ce15a15213105c091d5e0945b0c2", "zone": "mfp", "payload": {
 "context" : null,
 "email_address" : "vneo",
 "email_source" : "fadipwxfmotvav",
 "first_name" : null,
 "last_name" : null, "country" : {"string" : "US"}, }] }
  • 31. 31 Non-JVM Languages projects/GoldenGate $ ./gg-consumer-proxy —subscriptions=mmf/activity_feed_updated=http://localhost:3000/callback
  • 32. 32 Federation of Kafka Clusters Publisher Topology Service Kafka Consumer
  • 33. 33 Data Retention • Archiving is now just a job for a specialized consumer
 • Archiving is done “per-zone”. Some data shouldn’t be archived, it only gets published to zones that are not archived (as per event type)
 • In our case, data is stored in S3 and then accessed through a variety of tools for analysis, batch processing, etc.
  • 37. 37 Help Publishers - Avro Helper Library • helpful-avro Scala library
 • Adds a layer of robustness 
 • Tries a few tricks to make a payload validate against a schema
  • 38. 38 { "name": "Person", "namespace": "org.example", "type": "record", "fields": [ {"name": "first_name", "type": "string"}, {"name": "last_name", "type": "string"}, {"name": "age", "type": ["null", "int"]}, {"name": "username", "type": ["null", "string"]} ] } Example Schema
  • 39. 39 { "name": "Person", "namespace": "org.example", "type": "record", "fields": [ {"name": "first_name", "type": "string"}, {"name": "last_name", "type": "string"}, {"name": "age", "type": ["null", "int"]}, {"name": "username", "type": ["null", "string"]} ] } Optional Fields Nullable / Optional Fields
  • 40. 40 Nullable / Optional Fields { "first_name": "Paul", "last_name": "Osman", "username": "paulosman" } { "first_name": "Paul", "last_name": "Osman", "age": {"null":null}, "username": {"string": "paulosman"} } age omitted not type annotated
  • 41. • Browser and CLI based tools that allow people to observe activity being published to a specific zone
 • Give people a way to see their event go through the system
 • End to end monitoring, monitoring of consumer lag 41 Observability
  • 42. 42 Future Plans • Make schema authoring and registration easier and more automated • Extend helpful-avro to work with Case Classes and POJOs
 • Further hide implementation details
  • 43. 43CONFIDENTIAL & BUSINESS PROPRIETARY INFORMATION OF UNDER ARMOUR, INC. COPYRIGHT (C)2015 Thank You http://underarmour.jobs