Lessons Learned - Monitoring the Data Pipeline at Hulu

  1. LESSONS LEARNED - MONITORING THE DATA PIPELINE
  2. AGENDA • Who am I? • What’s a Hulu? • Beacons & the Data Pipeline • Monitoring – Take One • Monitoring – Take Two
  3. TRISTAN REID, METRICS & REPORTING TOOLS TEAM LEAD
  4. HULU’S MISSION: Help people find and enjoy the world’s premium content when, where and how they want it.
  5. WHY IS HULU EFFECTIVE? PREMIUM CONTENT • Premium Content • 485+ Content Partners • 6 of 6 Broadcast Networks QUALITY AD EXPERIENCE • Ads can’t be skipped • Less ad load than TV • 100% video completion rate guarantee USER CONTROL • On Demand • Across Devices • Choice Based Ad Formats
  6. • Service Oriented • Small teams, specialized scopes • Build tools for other developers • Right tool for the job
  7. Beacons & The Data Pipeline
  8. External View of Beacons • Fire & Forget • HTTP Format • High Availability • Process • Transform • Collect
  9. Beacons: 80 2013-04-01 00:00:00 /v3/playback/start?bitrate=650&cdn=Akamai&channel=Anime&client=Explorer&computerguid=EA8FA1000232B8F6986C3E0BE55E9333&contentid=5003673… • Which show is the user watching? • Which pages did they visit? • How long did they stay? • Where did they come from? • Did they become Plus members?
  10. The pipeline • Beacon collection service • HDFS • Hive • RDBMS • Log Collector / Flume • MapReduce Jobs • Continuous Aggregation / Selective Publishing • Reporting • Monitoring • Developers • Business Analysts
  11. Data Collection: Avg. 12,000 events per second; Peak: ~35K
  12. Data never stops coming… and we can’t lose any data
  13. HDFS: Files bucketed by beacon type and partitioned by hour • Log Collection machine #1 … Log Collection machine #11 • Load balancer • Devices • CDN
  14. MapReduce - from beacons to basefacts: video_id 289696, content_partner_id 398, distribution_partner_id 602, distro_platform_id 14, is_on_hulu 0, …, hourid 383149, watched 76426
  15. Hulu MapReduce Metrics Jobs • Definitions of beacons and base-facts • Beaconspec compiler • MapReduce code, including metadata lookups • Job Scheduler • BeaconSpec DSL • Scala / Akka • JFlex & CUP • Java (Generated) • Documentation • Automated Validations for Beacon Generators • In Progress…
  16. UserJobs • Mention the MVEL coolness • MVEL: client contains 'Chrome' && fullscreen == true && (os contains 'Windows' || os contains 'Mac')
  17. Aggregation & Publishing • Hourly Facts • Aggregations: Daily/Weekly/Monthly/Quarterly/Annual • Popular Data • MySQL SQL • Publishing
  18. Data API Service • Reporting Flow • Reporting Portal UI (RP2) • Report Controller • Scheduler • HiveRunner • Published DBs • RP2 DB • Available columns • Date range checks • Submit Report • Execute Report • Check Status • Queue • Run • Generate Query
  19. RP2 UI
  20. Monitoring
  21. Some Issues… BIG DATA PIPELINE? I’LL BET THAT’S GOING GREAT FOR YOU • EMAIL EXPLOSIONS • GATEKEEPING • Overhead • Consumption • CHANGE
  22. Lots of Monitoring Tools Available • Ingest • Jobs • Cluster • OpenTSDB & Graphite
  23. WHAT’S GOING ON??!?? • HOW IS OUR CLUSTER? • WILL WE MEET OUR SLAs? • HOW FAST DID A JOB RUN? • HOW DID RUNTIME COMPARE TO HISTORICAL? • HOW IS THIS COMPONENT? • HOW IS OUR SYSTEM?
  24. The Design… Access all your tools in one place… …but avoid multitasking • Service Oriented Architecture • Comprehensive Web UI
  25. Does this solve our problems? • Single Point of Access? • Maintain services separately? TAKE THAT, DATA PIPELINE ISSUES!!
  26. Our Users’ Perspective? • We detect platform issues • We quickly troubleshoot errors • We track relative performance • We know where we are re: SLAs …but is detection of a problem enough? A PROBLEM DETECTION USERS We need to think of things from the report users’ perspectives
  27. The User Perspective [diagram: User Group, User, Report, Report Run, Schedule, Data Pipeline Resources, ETC!]
  28. Contextual Troubleshooting Model • Connect issues to business units • Better impact assessment • Tune performance per user needs We need a graph data structure, populated with the stuff we care about. Something like this
  29. Why a Graph? • …instead of RDBMS? • Indeterminate # of Joins • Query for graph connectedness is trivial and short • Query for connectedness w/ SQL relies on knowing the intermediate resources • …instead of a tree? • Data is sometimes recombinant (e.g. a metric in multiple reports to same user)
  30. Let’s investigate… These failed before getting to a data store. Most of the Hive failures were the same table, but it’s a common table. As we filter, the matched reports show up on the bottom of the page. The log link shows us the details.
  31. Each service implements a log-fetching interface, specific to the resources used for a particular report
  32. SUCCESS!!!
  33. In Summary… • Find the Important Questions => Measure the Right Data • Make troubleshooting easy • Small distinct services are easy to create, maintain, and wire together
  34. Questions? Thanks to… • Muthu…the Platform GrandMaster • All of Metrics Platform, Tools, Reporting for making this stuff • Mohamed, Chris, Charlie, Robert, Phong, AJ, Ratheesh, Adi, Matt, Shashank, Joanne, Siddhartha, Tamir, Jun, James, Dr. Kevin, Hang • All of the Hulu DEV team for general awesomeness • Prasan…thanks for the impetus to do this. I’ll look u up • Kevin…thanks for Hulu. I’ll send u a snap
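
The beacon on slide 9 is a log line: a leading number, a timestamp, then a request path with query parameters. Slide 13 says the collected files are bucketed by beacon type and partitioned by hour. A minimal Python sketch of both steps might look like the following. The `Beacon` class, the field name `prefix` for the leading `80`, and the directory layout in `bucket_path` are assumptions made for illustration; the deck does not describe Hulu's actual schema or HDFS layout.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import parse_qsl, urlsplit


@dataclass
class Beacon:
    prefix: str          # leading field ("80" on the slide); its meaning is not stated
    timestamp: datetime
    beacon_type: str     # request path, e.g. "/v3/playback/start"
    params: dict


def parse_beacon(line: str) -> Beacon:
    """Split a log line of the form '<prefix> <date> <time> <path>?<query>'."""
    prefix, date, time, request = line.split(" ", 3)
    url = urlsplit(request)
    return Beacon(
        prefix=prefix,
        timestamp=datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S"),
        beacon_type=url.path,
        params=dict(parse_qsl(url.query)),
    )


def bucket_path(beacon: Beacon) -> str:
    """Hypothetical layout: one bucket per beacon type, one partition per hour."""
    name = beacon.beacon_type.strip("/").replace("/", "_")
    return f"/beacons/{name}/{beacon.timestamp:%Y-%m-%d/%H}"


# Shortened version of the slide's example; the elided parameters are left out.
line = ("80 2013-04-01 00:00:00 /v3/playback/start?"
        "bitrate=650&cdn=Akamai&channel=Anime&client=Explorer")
beacon = parse_beacon(line)
print(beacon.beacon_type, beacon.params["cdn"], bucket_path(beacon))
```

Keeping the beacon type and the hour in the path means a job for one metric and one hour only has to read one directory, which fits the hourly facts mentioned on slide 17.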
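
The MVEL filter on slide 16 selects beacons by client, fullscreen state, and operating system. For readers who do not know MVEL, the same condition can be written as a plain Python predicate. This is only a rendering of the slide's expression, not how Hulu's UserJobs evaluate it; the sample beacons are invented for the example.

```python
def matches(beacon: dict) -> bool:
    """Python rendering of the slide's MVEL filter; MVEL `contains` becomes `in`."""
    client = beacon.get("client", "")
    os_name = beacon.get("os", "")
    return (
        "Chrome" in client
        and beacon.get("fullscreen") is True
        and ("Windows" in os_name or "Mac" in os_name)
    )


# Invented sample data, one matching beacon and two that fail one clause each.
beacons = [
    {"client": "Chrome 26", "fullscreen": True, "os": "Windows 7"},
    {"client": "Chrome 26", "fullscreen": False, "os": "Mac OS X"},
    {"client": "Explorer", "fullscreen": True, "os": "Windows 7"},
]
selected = [b for b in beacons if matches(b)]
print(len(selected))
```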
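
Slide 29 argues that a connectedness query is short on a graph because it does not need to know the intermediate resources. A breadth-first search illustrates the point: starting from a failed resource, it finds every report, user, and group downstream, however many hops away. The node names and edges below are hypothetical, and the deck does not say which graph store or query language Hulu used.

```python
from collections import deque

# Hypothetical edges: each key points at the things that depend on it.
DEPENDENTS = {
    "hive:playback_facts": ["report:daily_views", "report:partner_revenue"],
    "mysql:published_db": ["report:partner_revenue"],
    "report:daily_views": ["user:alice"],
    "report:partner_revenue": ["user:alice", "user:bob"],
    "user:alice": ["group:content"],
    "user:bob": ["group:finance"],
}


def reachable(graph: dict, start: str) -> set:
    """Breadth-first search: every node downstream of `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:   # a node reached by two paths is visited once
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


impacted = reachable(DEPENDENTS, "hive:playback_facts")
impacted_users = sorted(n for n in impacted if n.startswith("user:"))
print(impacted_users)
```

Here `user:alice` is reachable through two reports, which is the "recombinant" case the slide gives as the reason a tree is not enough; the `seen` set keeps her from being counted twice.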
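
Slide 31 says each service implements a log-fetching interface. One way to picture that is a small abstract class that every service fulfils, so the monitoring UI can ask all of them for the logs of one report run. The class and method names here are assumptions, and the in-memory fetcher stands in for whatever log store a real service would query; only the service names HiveRunner and Scheduler come from the deck (slide 18).

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class LogFetcher(ABC):
    """Hypothetical interface: a service returns its log lines for one report run."""

    name: str

    @abstractmethod
    def fetch_logs(self, report_run_id: str) -> List[str]:
        """Return the log lines relevant to the given report run."""


class InMemoryLogFetcher(LogFetcher):
    """Stand-in implementation backed by a dict instead of a real log store."""

    def __init__(self, name: str, logs_by_run: Dict[str, List[str]]):
        self.name = name
        self._logs_by_run = logs_by_run

    def fetch_logs(self, report_run_id: str) -> List[str]:
        return self._logs_by_run.get(report_run_id, [])


def collect_logs(fetchers: List[LogFetcher], report_run_id: str) -> Dict[str, List[str]]:
    """Ask every service for its logs and key the result by service name."""
    return {f.name: f.fetch_logs(report_run_id) for f in fetchers}


fetchers = [
    InMemoryLogFetcher("HiveRunner", {"run-42": ["query failed: table not found"]}),
    InMemoryLogFetcher("Scheduler", {"run-42": ["queued", "started"]}),
]
logs = collect_logs(fetchers, "run-42")
print(logs)
```

Because the caller only depends on `fetch_logs`, a new service can be added without touching the code that gathers logs, which matches the deck's closing point about small services being easy to wire together.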