The document proposes a "Punch Clock" concept to help debug Apache Storm transactional topologies. A Punch Clock would record when batches of tuples enter and exit spouts and bolts. Each spout/bolt would have a Punch Card ID to track the batch. Punching in would add the ID to a data structure, punching out would remove it. This would help identify batches stuck in specific spouts/bolts on hosts. It could be exposed via JMX to aggregate data across worker JVMs running the spouts/bolts. The goal is to determine batch flow through the topology and find any that are stuck.
6. Punch clock (a.k.a. time clock)
● You have a card per person.
● The person punches IN with the card when he/she enters the office.
● The person punches OUT with the card when he/she leaves the office.
● The punch clock records the time of entry/exit on the card.
15. Debugging Transactional Topologies
1. The Spout emits a batch of data (tuples), which forms a transaction.
2. Every Bolt in the topology processes that batch of tuples.
19. Motivation
To find out …
1. When did the batch enter/exit the Spout/Bolt?
2. Which batch is still in the Spout/Bolt? i.e. are any batches STUCK?
a. On which host are they stuck?
b. In which Spout/Bolt are they stuck?
23. Possible Solution(s):
Add a log statement before and after the critical section.
log.info("Inserting data into database ...");  // ← entering
datasource.insert(table, tuples);              // ← the real work
log.info("Inserted data into database.");      // ← exiting
------------------------------------------------------------------
Cons: the logs are distributed over multiple hosts, so they need to be aggregated, which takes a bit of work (Elasticsearch + Kibana?).
26. My Idea
A batch of tuples punches IN and punches OUT in a bolt/spout.
Punch In - put the punch card ID into a HashMap (or any other suitable data structure).
Punch Out - remove the punch card ID from the HashMap (or any other suitable data structure).
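The put/remove idea above can be sketched as a small singleton backed by a ConcurrentHashMap. This is an illustrative sketch, not an existing Storm API: the class name PunchClock matches the slides, but the method names beyond punchIn/punchOut and the choice to store the punch-in timestamp as the map value are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of the PunchClock from the slides: punchIn stores the
 * entry timestamp under a punch card ID, punchOut removes it. Any card ID
 * still present belongs to a batch that has not finished (possibly stuck).
 */
public final class PunchClock {
    private static final PunchClock INSTANCE = new PunchClock();

    // punch card ID -> punch-in time (epoch millis); thread-safe for
    // concurrent spout/bolt executors in the same worker JVM
    private final Map<String, Long> punchedIn = new ConcurrentHashMap<>();

    private PunchClock() {}

    public static PunchClock getInstance() {
        return INSTANCE;
    }

    public void punchIn(String punchCardId) {
        punchedIn.put(punchCardId, System.currentTimeMillis());
    }

    public void punchOut(String punchCardId) {
        punchedIn.remove(punchCardId);
    }

    /** Snapshot of cards currently punched in, i.e. batches still inside a spout/bolt. */
    public Map<String, Long> openPunchCards() {
        return Map.copyOf(punchedIn);
    }
}
```

Because both operations are single ConcurrentHashMap calls, the overhead per batch is tiny, which is the point of the comparison with logging made later in the deck.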
27. My Idea:
A batch of tuples punches In and punches Out in a spout.
In the emitBatch method of the Transactional Spout:
PunchClock.getInstance().punchIn(punchCardId);  // ← Punch In
collector.emit(tuples);                         // ← Emit tuple(s)
PunchClock.getInstance().punchOut(punchCardId); // ← Punch Out
28. A batch of tuples punches IN and punches OUT in a bolt.
In the prepare method of the Transactional Bolt:
punchCardId = "Bolt__" + Thread.currentThread().getId() + "__" + System.currentTimeMillis(); // ← Create punch card for txn
In the execute method of the Transactional Bolt:
PunchClock.getInstance().punchIn(punchCardId);  // ← Punch In
In the finishBatch method of the Transactional Bolt:
PunchClock.getInstance().punchOut(punchCardId); // ← Punch Out
29. Is it intrusive?
Yes, but it's a simple put/remove call to a HashMap.
Compared to logging, it's cheaper.
34. Punch Clocks
● Spouts/Bolts are housed in a Storm worker JVM.
● One Punch Clock per JVM.
● Since we have multiple JVMs, we have multiple Punch Clocks.
● Batches move across Storm workers, and we have multiple JVMs, so:
○ We need to aggregate the data across Punch Clocks.
○ Expose the Punch Clock via JMX.
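Exposing the per-JVM Punch Clock over JMX could look like the sketch below, using the standard platform MBeanServer. This is an assumption-laden sketch: the interface name PunchClockAgentMBean, the class PunchClockAgent, and the ObjectName "storm.debug:type=PunchClock" are all made up for illustration; only the JMX registration mechanics are standard Java.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Standard MBean convention: the management interface must be named <Class>MBean.
interface PunchClockAgentMBean {
    /** Punch card IDs currently punched in on this worker JVM. */
    String[] getOpenPunchCards();
}

public class PunchClockAgent implements PunchClockAgentMBean {
    private final Set<String> open = ConcurrentHashMap.newKeySet();

    public void punchIn(String punchCardId)  { open.add(punchCardId); }
    public void punchOut(String punchCardId) { open.remove(punchCardId); }

    @Override
    public String[] getOpenPunchCards() {
        return open.toArray(new String[0]);
    }

    /**
     * Register under a fixed ObjectName so a JMX client (jconsole, or a
     * small aggregator polling every worker JVM) can read the open cards
     * and merge them into one topology-wide view of stuck batches.
     */
    public void register() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(this, new ObjectName("storm.debug:type=PunchClock"));
    }
}
```

An aggregator would connect to each worker's JMX port, read getOpenPunchCards, and union the results; any card that stays open across polls points at the host and Spout/Bolt where a batch is stuck.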