1. (Big Data)2
How YARN Timeline Service v.2 Unlocks 360-Degree
Platform Insights at Scale
Sangjin Lee @sjlee (Twitter)
Li Lu (Hortonworks)
Vrushali Channapattan @vrushalivc (Twitter)
2. Outline
• Why v.2?
• Highlights
• Developing for Timeline Service v.2
• Setting up Timeline Service v.2
• Milestones
• Demo
3. Why v.2?
• YARN Timeline Service v 1.x
• Gained good adoption: Tez, Hive, Pig, etc.
• Keeps improving with v 1.5 APIs and storage implementation
• Still facing some fundamental challenges...
4. Why v.2?
• Scalability and reliability challenges
• Single instance of Timeline Server
• Storage (single local LevelDB instance)
• Usability
• Flow
• Metrics and configuration as first-class citizens
• Metrics aggregation up the entity hierarchy
5. Highlights
v.1 | v.2
Single writer/reader Timeline Server | Distributed writer/collector architecture
Single local LevelDB storage* | Scalable storage (HBase)
v.1 entity model | New v.2 entity model
No aggregation | Metrics aggregation
REST API | Richer query REST API
6. Architecture
• Separation of writers (“collectors”) and readers
• Distributed collectors: one collector for each app
• Dedicated RM collector for RM-generated data
• Collector discovery via RM
• Pluggable storage with HBase as default storage
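8. Collector discovery
[Diagram: the RM starts the AM container on a worker node. The NM on that node hosts the timeline collector and reports it in its node heartbeat, so the RM holds an "app id => address" mapping. The RM returns the collector address in the allocate response to the AM, where the timeline client uses it.]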
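The discovery flow on this slide can be sketched as a small in-memory registry. This is only an illustration of the "app id => address" idea, not the ResourceManager code: the class name, method names, and the sample address are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the RM-side mapping; every name here is hypothetical. */
final class CollectorRegistry {
    // app id => collector address, as learned from node heartbeats
    private final Map<String, String> collectors = new ConcurrentHashMap<>();

    /** A node heartbeat reports the collector address for an app hosted on that node. */
    void onNodeHeartbeat(String appId, String collectorAddress) {
        collectors.put(appId, collectorAddress);
    }

    /** The allocate response carries the address back to the AM, if one is known yet. */
    Optional<String> onAllocate(String appId) {
        return Optional.ofNullable(collectors.get(appId));
    }

    public static void main(String[] args) {
        CollectorRegistry rm = new CollectorRegistry();
        String appId = "application_1466097809000_0001"; // made-up app id
        System.out.println(rm.onAllocate(appId).isPresent()); // false: no heartbeat yet
        rm.onNodeHeartbeat(appId, "worker-17:45454");         // made-up address
        System.out.println(rm.onAllocate(appId).get());       // worker-17:45454
    }
}
```

The point of the design is that the timeline client never needs static configuration of the collector address; it learns the address through the channel the AM already has with the RM.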
9. New entity model
• Flows and flow runs as parents of YARN application entities
• First-class configuration (key-value pairs)
• First-class metrics (single-value or time series)
• Designed to handle multi-cluster environment out of the box
10. What is a flow?
• A flow is a group of YARN applications that are launched as parts of a logical app
• Oozie, Scalding, Pig, etc.
• name:
“frequent_visitor_stat”
• run id: 1466097809000
• version: “b9b9068”
11. Configuration and metrics
• Now explicit top-level attributes of entities
• Fine-grained updates and queries made possible
• “update metric A to value x”
• “query entities where config A = B”
container 1_1
metric: A = 10
metric: B = 100
config: "Foo" = "bar"
12. Configuration and metrics
• Now explicit top-level attributes of entities
• Fine-grained updates and queries made possible
• “update metric A to value x”
• “query entities where config A = B”
container 1_1
metric: A = 50
metric: B = 100
config: "Foo" = "bar"
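A minimal sketch of what "first-class" metrics and configuration buy you, using the container example from these two slides. The class below is a hypothetical stand-in for illustration and is not the Timeline Service `TimelineEntity` class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical entity with metrics and configs as explicit top-level attributes. */
final class EntitySketch {
    final String id;
    final Map<String, Long> metrics = new HashMap<>();
    final Map<String, String> configs = new HashMap<>();

    EntitySketch(String id) {
        this.id = id;
    }

    /** "update metric A to value x": only the named metric is touched. */
    void updateMetric(String name, long value) {
        metrics.put(name, value);
    }

    /** "query entities where config A = B": returns the ids of matching entities. */
    static List<String> whereConfig(List<EntitySketch> all, String key, String value) {
        List<String> ids = new ArrayList<>();
        for (EntitySketch e : all) {
            if (value.equals(e.configs.get(key))) {
                ids.add(e.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        EntitySketch c = new EntitySketch("container 1_1");
        c.metrics.put("A", 10L);
        c.metrics.put("B", 100L);
        c.configs.put("Foo", "bar");

        c.updateMetric("A", 50L); // metric B and the config are left alone
        System.out.println(c.metrics.get("A") + " " + c.metrics.get("B")); // 50 100

        List<EntitySketch> all = new ArrayList<>();
        all.add(c);
        System.out.println(whereConfig(all, "Foo", "bar")); // [container 1_1]
    }
}
```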
13. HBase Storage
• Scalable backend
• Row Key structure
• efficient range scans
• KeyPrefixRegionSplitPolicy
• Filter pushdown
• Coprocessors for flow aggregation (“readless” aggregation)
• Cell tags for metadata (application id, aggregation operation)
• Cell timestamps generated during put
• left shifted with app id added to avoid overwrites
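The last bullet can be illustrated as follows. The slide only says that the write timestamp is left shifted and the app id is added; the decimal multiplier of 1,000,000 and the method names below are assumptions made for this sketch, so treat it as one possible scheme rather than the actual storage code.

```java
/** Sketch of a cell timestamp that stays unique per app; names and multiplier are assumed. */
final class CellTimestampSketch {
    // Assumption: shift by six decimal digits, leaving room for an app id suffix.
    static final long MULTIPLIER = 1_000_000L;

    /** Combines the write time with the app's sequence number. */
    static long supplemented(long writeTimeMillis, int appSequenceNumber) {
        return writeTimeMillis * MULTIPLIER + (appSequenceNumber % MULTIPLIER);
    }

    /** Recovers the original write time in milliseconds. */
    static long originalMillis(long supplementedTimestamp) {
        return supplementedTimestamp / MULTIPLIER;
    }

    public static void main(String[] args) {
        long now = 1466097809000L;
        long fromApp1 = supplemented(now, 1);
        long fromApp2 = supplemented(now, 2);
        // Two apps writing the same metric in the same millisecond get distinct cells.
        System.out.println(fromApp1 != fromApp2);            // true
        System.out.println(originalMillis(fromApp1) == now); // true
    }
}
```

Without the suffix, two applications in the same flow run that write the same column in the same millisecond would produce identical cell timestamps, and one value would overwrite the other.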
14. Tables in HBase
• flow run
• application
• entity
• flow activity
• app to flow
15. table: flow run
Row key:
clusterId!userName!
flowName!
inverted(flowRunId)
most recent flow run stored first
coprocessor enabled
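Why inverting the run id puts the most recent run first can be seen in a small sketch. It assumes the inversion is `Long.MAX_VALUE - runId` and uses zero-padded strings so that string order matches numeric order; the real table stores bytes, and the helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch of the flow run row key layout; string encoding is an assumption. */
final class FlowRunRowKeySketch {
    /** Larger (newer) run ids map to smaller values, so they sort earlier. */
    static long invert(long value) {
        return Long.MAX_VALUE - value;
    }

    /** clusterId!userName!flowName!inverted(flowRunId), zero-padded to 19 digits. */
    static String rowKey(String cluster, String user, String flow, long runId) {
        return cluster + "!" + user + "!" + flow + "!"
                + String.format("%019d", invert(runId));
    }

    public static void main(String[] args) {
        long olderRun = 1466097809000L;
        long newerRun = olderRun + 24L * 60 * 60 * 1000; // one day later

        List<String> keys = new ArrayList<>();
        keys.add(rowKey("c1", "alice", "frequent_visitor_stat", olderRun));
        keys.add(rowKey("c1", "alice", "frequent_visitor_stat", newerRun));
        Collections.sort(keys); // mimics HBase's ascending row key order

        String expectedFirst = rowKey("c1", "alice", "frequent_visitor_stat", newerRun);
        System.out.println(keys.get(0).equals(expectedFirst)); // true
    }
}
```

Because all runs of one flow share the same key prefix, a range scan over that prefix returns them newest first without any sorting on the reader side.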
18. table: flow activity
Row key:
clusterId!
inverted(TopOfTheDay)!
userName!flowName
shows the flows that ran on that day
stores information per flow like number of runs, the run ids, versions
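A sketch of how such a key groups activity by day. It assumes "top of the day" means the timestamp truncated to a UTC day boundary and that the inversion is `Long.MAX_VALUE - value`; the string encoding and helper names are hypothetical.

```java
/** Sketch of the flow activity row key; day boundary and encoding are assumptions. */
final class FlowActivityKeySketch {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    /** Truncates a timestamp to the start of its (UTC) day. */
    static long topOfTheDay(long timestampMillis) {
        return timestampMillis - (timestampMillis % MILLIS_PER_DAY);
    }

    /** clusterId!inverted(TopOfTheDay)!userName!flowName */
    static String rowKey(String cluster, long timestampMillis, String user, String flow) {
        long inverted = Long.MAX_VALUE - topOfTheDay(timestampMillis);
        return cluster + "!" + String.format("%019d", inverted) + "!" + user + "!" + flow;
    }

    public static void main(String[] args) {
        long run1 = 1466097809000L;
        long run2 = run1 + 3L * 60 * 60 * 1000; // three hours later, same UTC day

        // Both runs of the flow land in the same row for that day.
        String k1 = rowKey("c1", run1, "alice", "frequent_visitor_stat");
        String k2 = rowKey("c1", run2, "alice", "frequent_visitor_stat");
        System.out.println(k1.equals(k2)); // true
    }
}
```

Putting the inverted day ahead of the user and flow name means a scan from the start of a cluster's key range walks through the most recent days first.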
23. Reader REST API: paths
• URLs under /ws/v2/timeline
• Canonical REST style URLs: /ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_name/runs/run_id
• Path elements may be omitted if they can be inferred
• flow context can be inferred by app id
• default cluster is assumed if cluster is omitted
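The canonical path shape above can be assembled with a small helper. This is a sketch only: the helper is hypothetical, it assumes the path elements are already URL-safe, and it models just one inference rule from the slide (the cluster segment is left out when no cluster is given).

```java
/** Hypothetical builder for the canonical flow run path shown on the slide. */
final class ReaderPathSketch {
    static final String BASE = "/ws/v2/timeline";

    /** Pass null for cluster to rely on the reader's default cluster. */
    static String flowRunPath(String cluster, String user, String flow, long runId) {
        StringBuilder sb = new StringBuilder(BASE);
        if (cluster != null) {
            sb.append("/clusters/").append(cluster);
        }
        sb.append("/users/").append(user)
          .append("/flows/").append(flow)
          .append("/runs/").append(runId);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Fully qualified path
        System.out.println(flowRunPath("c1", "alice", "frequent_visitor_stat", 1466097809000L));
        // Cluster omitted, so the default cluster is assumed by the reader
        System.out.println(flowRunPath(null, "alice", "frequent_visitor_stat", 1466097809000L));
    }
}
```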
24. Reader REST API: query params
• limit, createdTimeStart, createdTimeEnd: constrain the entities
• fields (ALL | EVENTS | INFO | CONFIGS | METRICS | RELATES_TO | IS_RELATED_TO): limit the contents to return
• metricsToRetrieve, confsToRetrieve: further limit the contents to return
• metricsLimit: limits the number of values in a time series
25. Reader REST API: query params
• relatesTo, isRelatedTo: filters by association
• *Filters: filters by info, config, metric, event, …
• Supports complex filters including operators
• metricFilter=(((metric1 eq 50) AND (metric2 gt 40)) OR (metric1 lt 20))
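Filter expressions like the one above contain spaces and parentheses, so they need URL encoding before being sent. The sketch below builds a query string with the JDK's `URLEncoder`; the parameter names are copied from these slides and may be spelled differently in a released version, and the helper class is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that turns query params into an encoded query string. */
final class ReaderQuerySketch {
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported by the JDK
        }
    }

    /** Returns "" for no params, otherwise "?k1=v1&k2=v2" in insertion order. */
    static String queryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> p : params.entrySet()) {
            sb.append(sb.length() == 0 ? "?" : "&");
            sb.append(encode(p.getKey())).append("=").append(encode(p.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("limit", "10");
        params.put("fields", "METRICS");
        params.put("metricFilter",
                "(((metric1 eq 50) AND (metric2 gt 40)) OR (metric1 lt 20))");
        System.out.println(queryString(params));
    }
}
```

`URLEncoder` applies form encoding, so spaces become `+` and parentheses become `%28` and `%29`.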
26. Developing: TimelineClient
In your application master:
// create TimelineClient v.2 style
TimelineClient client = TimelineClient.createTimelineClient(appId);
client.init(conf);
client.start();
// bind it to AM/RM client to receive the collector address
amRMClient.registerTimelineClient(client);
// create and write timeline entities
TimelineEntity entity = new TimelineEntity();
client.putEntities(entity);
// when the app is complete, stop the timeline client
client.stop();
27. Developing: Flow context
In your app submitter:
ApplicationSubmissionContext appContext =
    app.getApplicationSubmissionContext();
// set the flow context as YARN application tags
Set<String> tags = new HashSet<>();
tags.add(TimelineUtils.generateFlowNameTag("distributed grep"));
tags.add(TimelineUtils.generateFlowVersionTag(
    "3df8b0d6100530080d2e0decf9e528e57c42a90a"));
tags.add(TimelineUtils.generateFlowRunIdTag(System.currentTimeMillis()));
appContext.setApplicationTags(tags);
28. Setting up Timeline Service v.2
• Set up the HBase cluster (1.1.x)
• Add the timeline service jar to HBase
• Install the flow run coprocessor
• Create tables via TimelineSchemaCreator utility
• Configure the YARN cluster
• Enable Timeline Service v.2
• Add hbase-site.xml for the timeline collector and readers
• Start the timeline reader daemon
29. Milestone 1 ("Alpha 1")
• Merge discussion (YARN-2928) in progress as we speak!
✓ Complete end-to-end read/write flow
✓ Real-time application and flow aggregation
✓ New entity model
✓ HBase Storage
✓ Rich REST API
✓ Integration with Distributed Shell and MapReduce
✓ YARN generic events and system metrics
30. Milestones - Future
• Milestone 2 (“Alpha 2”)
• Integration with new YARN UI
• Integration with more frameworks
• Beta
• Freeze API and storage schema
• Security
• Collectors as containers
• Storage fault tolerance
• Production-ready
• Migration-ready