The document proposes a "Punch Clock" concept to help debug Apache Storm transactional topologies. A Punch Clock would record when batches of tuples enter and exit spouts and bolts. Each spout/bolt would have a Punch Card ID to track the batch. Punching in would add the ID to a data structure, punching out would remove it. This would help identify batches stuck in specific spouts/bolts on hosts. It could be exposed via JMX to aggregate data across worker JVMs running the spouts/bolts. The goal is to determine batch flow through the topology and find any that are stuck.
6. Punch clock (a.k.a. time clock)
● You have a card per person.
● The person punches IN with the card when he/she enters the office.
● The person punches OUT with the card when he/she leaves the office.
● The punch clock records the time of entry/exit on the card.
15. Debugging Transactional Topologies
1. The Spout emits a batch of data (tuples), which forms a transaction.
2. Every Bolt in the topology processes that batch of tuples.
19. Motivation
To find out …
1. When did the batch enter/exit the Spout/Bolt?
2. Which batch is still in the Spout/Bolt? i.e. are any batches STUCK?
a. On which host are they stuck?
b. In which Spout/Bolt are they stuck?
23. Possible Solution(s):
Add a log statement before and after the critical section.
log.info("Inserting data into database ...");  // ← entering
datasource.insert(table, tuples);              // ← the real work
log.info("Inserted data into database.");      // ← exiting
------------------------------------------------------------------
Cons: the logs are distributed over multiple hosts, so they need to be aggregated, which takes a bit of work (Elasticsearch + Kibana?).
26. My Idea
A batch of tuples punches IN and punches OUT in a bolt/spout.
Punch In - put the punch card ID into a HashMap (or any other suitable data structure).
Punch Out - remove the punch card ID from the HashMap (or any other suitable data structure).
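The put/remove idea above can be sketched as a small singleton backed by a ConcurrentHashMap. This is an illustrative sketch, not an existing Storm API: the class name PunchClock matches the slides, but the method names beyond punchIn/punchOut and the choice to store the punch-in timestamp as the map value are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of the PunchClock from the slides: punchIn stores the
 * entry timestamp under a punch card ID, punchOut removes it. Any card ID
 * still present belongs to a batch that has not finished (possibly stuck).
 */
public final class PunchClock {
    private static final PunchClock INSTANCE = new PunchClock();

    // punch card ID -> punch-in time (epoch millis); thread-safe for
    // concurrent spout/bolt executors in the same worker JVM
    private final Map<String, Long> punchedIn = new ConcurrentHashMap<>();

    private PunchClock() {}

    public static PunchClock getInstance() {
        return INSTANCE;
    }

    public void punchIn(String punchCardId) {
        punchedIn.put(punchCardId, System.currentTimeMillis());
    }

    public void punchOut(String punchCardId) {
        punchedIn.remove(punchCardId);
    }

    /** Snapshot of cards currently punched in, i.e. batches still inside a spout/bolt. */
    public Map<String, Long> openPunchCards() {
        return Map.copyOf(punchedIn);
    }
}
```

Because both operations are single ConcurrentHashMap calls, the overhead per batch is tiny, which is the point of the comparison with logging made later in the deck.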
27. My Idea:
A batch of tuples punches In and punches Out in a spout.
In the emitBatch method of the Transactional Spout:
PunchClock.getInstance().punchIn(punchCardId);  // ← Punch In
collector.emit(tuples);                         // ← Emit tuple(s)
PunchClock.getInstance().punchOut(punchCardId); // ← Punch Out
28. A batch of tuples punches IN and punches OUT in a bolt.
In the prepare method of the Transactional Bolt:
punchCardId = "Bolt__" + Thread.currentThread().getId() + "__" + System.currentTimeMillis(); // ← Create punch card for txn
In the execute method of the Transactional Bolt:
PunchClock.getInstance().punchIn(punchCardId);  // ← Punch In
In the finishBatch method of the Transactional Bolt:
PunchClock.getInstance().punchOut(punchCardId); // ← Punch Out
29. Is it intrusive?
Yes, but it's a simple put/remove call to a HashMap.
Compared to logging, it's cheaper.
34. Punch Clocks
● Spouts/Bolts are housed in a Storm worker JVM.
● One Punch Clock per JVM.
● Since we have multiple JVMs, we have multiple Punch Clocks.
● Batches move across Storm workers, and we have multiple JVMs, so:
○ We need to aggregate the data across Punch Clocks.
○ Expose the Punch Clock via JMX.
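Exposing the per-JVM Punch Clock over JMX could look like the sketch below, using the standard platform MBeanServer. This is an assumption-laden sketch: the interface name PunchClockAgentMBean, the class PunchClockAgent, and the ObjectName "storm.debug:type=PunchClock" are all made up for illustration; only the JMX registration mechanics are standard Java.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Standard MBean convention: the management interface must be named <Class>MBean.
interface PunchClockAgentMBean {
    /** Punch card IDs currently punched in on this worker JVM. */
    String[] getOpenPunchCards();
}

public class PunchClockAgent implements PunchClockAgentMBean {
    private final Set<String> open = ConcurrentHashMap.newKeySet();

    public void punchIn(String punchCardId)  { open.add(punchCardId); }
    public void punchOut(String punchCardId) { open.remove(punchCardId); }

    @Override
    public String[] getOpenPunchCards() {
        return open.toArray(new String[0]);
    }

    /**
     * Register under a fixed ObjectName so a JMX client (jconsole, or a
     * small aggregator polling every worker JVM) can read the open cards
     * and merge them into one topology-wide view of stuck batches.
     */
    public void register() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(this, new ObjectName("storm.debug:type=PunchClock"));
    }
}
```

An aggregator would connect to each worker's JMX port, read getOpenPunchCards, and union the results; any card that stays open across polls points at the host and Spout/Bolt where a batch is stuck.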