The document describes SherLog, a tool that helps debug systems by connecting clues from runtime logs. It infers possible failure-inducing execution paths and constraints along the paths by matching log sequences to control and data flow in source code. It also symbolically executes paths to infer the value flow of variables. The tool was evaluated on real-world bugs and found to infer useful reproduction information.
1. SherLog: Error Diagnosis by Connecting Clues from
Run-time Logs
Ding Yuan¹, Haohui Mai¹, Weiwei Xiong¹, Lin Tan¹, Yuanyuan Zhou², Shankar Pasupathy³
¹ University of Illinois at Urbana-Champaign
² University of California, San Diego
³ NetApp, Inc.
ASPLOS’10, March 13-17, 2010, Pittsburgh, Pennsylvania, USA.
July 18, 2013
Lisong Guo (LIP6/REGAL) July 18, 2013 1 / 13
2. Introduction
Scenario — Postmortem In-Production Debugging with Logs
postmortem vs. prediction (model checking, static analysis, etc.)
in-production vs. in-house (runtime instrumentation, etc.)
logs vs. other sources (bug reports, deployment configuration, etc.)
Subtasks of Debugging
reproduce the bug (a procedure unrelated to the source code)
infer the failure-inducing execution path
infer the conditions along the failure-inducing execution path
Research Question
How can we help a developer debug in the scenario above?
5. Approach
Idea
Manual Inspection → Automatic Inference
A tool that takes the run-time logs and source code as inputs, and then
produces debugging hints for developers (i.e., connecting the dots)
all possible and valid failure-inducing execution paths
the evolution of the values of certain variables along the inferred paths
Usage Scenario
run the tool to get a list of interesting paths
examine the values of certain suspicious variables along some path
repeat the previous step until the root cause of the bug is found
7. Design
Components
Log Parsing: locate the logging statements in the source code
Path Inference: infer the failure execution paths and the constraints
Value Inference: infer the value flow of variables along the given paths
9. Log Parsing
Objectives
Identifying the Logging Points and Variables in the source code.
Simple Logging Statements
Solution: regular-expression matching (e.g. with grep)
e.g. error(0, 0, _("removing directory, %s"), path);
rule: {error(), 3, 4}
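As a rough illustration of the regular-expression idea, the sketch below turns a printf-style format string into a regex and uses it to recover the logged variable from a log line. It assumes the format string and the argument name have already been extracted from the logging call; the helper names format_to_regex and match_log_line are hypothetical and not part of SherLog, and only %s and %d are handled.

```python
import re

# Regex fragments for a small, illustrative subset of printf specifiers.
SPECIFIER_PATTERNS = {"%s": r"(.+?)", "%d": r"([-+]?\d+)"}


def format_to_regex(fmt):
    """Turn a printf-style format string into an anchored regex."""
    # Splitting on a capturing group keeps the specifiers in the result.
    pieces = re.split(r"(%[sd])", fmt)
    body = "".join(
        SPECIFIER_PATTERNS[p] if p in SPECIFIER_PATTERNS else re.escape(p)
        for p in pieces
    )
    return re.compile("^" + body + "$")


def match_log_line(fmt, arg_names, line):
    """Return {variable name: logged text}, or None if the line does not match."""
    m = format_to_regex(fmt).match(line)
    if m is None:
        return None
    return dict(zip(arg_names, m.groups()))


# Usage with the logging statement from the slide (the log line is made up).
print(match_log_line("removing directory, %s", ["path"],
                     "removing directory, dir1/dir2"))
```

A line that matches identifies both the logging point in the source and the run-time value of the logged variable, which is the information the later inference steps start from.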
Complicated Logging Facilities
Hierarchical wrappers around standard printing APIs (alternative: Coccinelle)
e.g. error() -> strerror()
rule 1: ’%s’: %{serrno}
rule 2: [{ "specifier": serrno; "regex": Regex; "val_func": ErrMsgToErrno}]
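A user-defined specifier rule of this shape could be applied roughly as follows. This is only a sketch under assumptions: the dictionary layout mirrors the three fields on the slide, err_msg_to_errno stands in for ErrMsgToErrno, parse_with_rule is a hypothetical helper, and the error text is looked up through Python's os.strerror, so the exact messages depend on the platform.

```python
import errno
import os
import re


def err_msg_to_errno(message):
    """Map an error string back to its errno symbol (stand-in for ErrMsgToErrno)."""
    for code, name in errno.errorcode.items():
        if os.strerror(code) == message:
            return name
    return None


# Hypothetical rule with the same three fields as rule 2 on the slide.
SERRNO_RULE = {
    "specifier": "serrno",
    "regex": r".+",
    "val_func": err_msg_to_errno,
}


def parse_with_rule(fmt, rule, line):
    """Match `line` against `fmt`, which must contain %{specifier} and may contain %s."""
    token = "%{" + rule["specifier"] + "}"
    pattern = re.escape(fmt)
    # Replace the custom specifier first, then the plain %s specifiers.
    pattern = pattern.replace(re.escape(token), "(?P<custom>" + rule["regex"] + ")")
    pattern = pattern.replace(re.escape("%s"), "(.+?)")
    m = re.fullmatch(pattern, line)
    if m is None:
        return None
    raw = m.group("custom")
    return {"text": raw, "value": rule["val_func"](raw)}


# Usage: build a log line from the platform's own message for ENOENT.
line = "dir1: " + os.strerror(errno.ENOENT)
print(parse_with_rule("%s: %{serrno}", SERRNO_RULE, line))
```

The value function converts the human-readable message back into a constraint on errno, which is the kind of fact path inference can then use.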
11. Path Inference
Objectives
infer the failure-inducing execution path
infer the constraints of variables along the path
Constrained Sequence Matching Problem (NP-Complete?)
based on Saturn, a static analysis framework for C programs
match the control & data flow with the sequence of log messages
convert the path searching problem into a set of declarative rules
Glance of Implementation
customized control-flow: main → log@4 → b1@10 → c@16 → log@25
conjunctive constraints: strchr = NULL ∧ verbose = 0 ∧ rmdir()@17 = 0
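The matching of a log sequence against control flow can be pictured with a toy example. The sketch below is not the SAT-based search built on Saturn; it swaps in a brute-force depth-first enumeration over a small hand-written control-flow graph. The graph, the node names, the condition strings and the function infer_paths are all invented for illustration.

```python
# Each node maps to (log message emitted there or None,
#                    list of (branch condition or None, successor)).
CFG = {
    "entry": (None, [("verbose != 0", "log_a"), ("verbose == 0", "work")]),
    "log_a": ("A", [(None, "work")]),
    "work": (None, [("rc == 0", "exit"), ("rc != 0", "log_b")]),
    "log_b": ("B", [(None, "exit")]),
    "exit": (None, []),
}


def infer_paths(cfg, start, observed):
    """Enumerate acyclic paths whose emitted messages equal `observed`.

    Returns a list of (path, branch constraints collected along the path).
    """
    results = []

    def dfs(node, pos, path, constraints):
        msg, edges = cfg[node]
        if msg is not None:
            if pos >= len(observed) or observed[pos] != msg:
                return  # this path would print something the log does not show
            pos += 1
        path = path + [node]
        if not edges:
            if pos == len(observed):
                results.append((path, constraints))
            return
        for cond, succ in edges:
            if succ in path:
                continue  # ignore loops to keep the sketch finite
            dfs(succ, pos, path, constraints + ([cond] if cond else []))

    dfs(start, 0, [], [])
    return results


# Usage: only message "B" was observed in the log.
for path, constraints in infer_paths(CFG, "entry", ["B"]):
    print(" -> ".join(path), "|", " and ".join(constraints))
```

Paths that would have printed a message missing from the log are pruned, and the branch conditions of the surviving path form the conjunctive constraint reported to the developer.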
16. Path Inference
Technique Summary
SAT-based path searching
constraint programming
Limitations
skips the analysis of functions that do not generate log messages
therefore it might return incorrect results
no alias analysis for pointers
although the underlying framework, Saturn, supports alias analysis
special treatment of some external routines/functions
abort, exit, setjmp, longjmp, etc.
18. Value Inference
Objective
infer the value-flow of certain variables, given the execution paths
Algorithm
model the assignment relationship among memory locations as
a guarded points-to graph
predicate(position, variable, value, constraint)
symbolically execute the inferred failure path forwards
refine the constraint according to the scope of variables
incrementally update the graph at each step
generate the sequence of value evolution (value-flow)
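The forward walk can be sketched in a much simplified form. The code below replaces the guarded points-to graph with a flat variable environment and represents a path as a list of hand-written steps, so it only illustrates the shape of the (position, variable, value, constraint) facts; the step encoding and the function value_flow are assumptions made for this example.

```python
def value_flow(path, tracked):
    """Walk `path` forwards and record how the `tracked` variables change.

    Each step is ("assign", target, source) or ("assume", condition).
    A source is either a variable assigned earlier or a literal/symbolic value.
    """
    env = {}      # variable -> current symbolic value
    guards = []   # branch conditions assumed so far on this path
    facts = []    # (position, variable, value, constraint)
    for position, step in enumerate(path):
        if step[0] == "assume":
            guards.append(step[1])
        elif step[0] == "assign":
            _, target, source = step
            # Copy the value of a known variable, otherwise keep the literal.
            env[target] = env.get(source, source)
            if target in tracked:
                constraint = " and ".join(guards) or "true"
                facts.append((position, target, env[target], constraint))
    return facts


# Usage on a made-up path through a directory-removal routine.
PATH = [
    ("assign", "dir", "argv[1]"),
    ("assume", "verbose == 0"),
    ("assign", "path", "dir"),
    ("assume", "rc != 0"),
    ("assign", "path", "NULL"),
]
for fact in value_flow(PATH, {"path"}):
    print(fact)
```

Each recorded fact says where a tracked variable received a value and under which path conditions, which is the value-flow view a developer would inspect for a suspicious variable.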
19. Evaluation
Methodology
manually reproduce and diagnose the real-world bugs
collect path summaries at runtime
compare the result of SherLog with the reproduction information
Metrics
useful: SherLog infers a subset of bug reproduction information
complete: SherLog infers all the bug reproduction information
23. Assumptions/Limitations
Assumptions
sufficient logging messages
reasonable density distribution of logging statements
reasonable amount of logging statements being activated
good match between the bug manifestation path and the log trace
sequential and single-threaded log messages
cannot handle multi-threaded concurrent programs
Technical Limitations
skips functions that are not involved in log production
does not parse complex constructs of the C programming language, such
as pointer arithmetic
24. More Pointers...
Ding Yuan, Soyeon Park, Yuanyuan Zhou: Characterizing logging practices
in open-source software. ICSE 2012
Adam J. Oliner, Archana Ganapathi, Wei Xu: Advances and challenges in
log analysis. Commun. ACM 2012
Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, Stefan Savage:
Improving Software Diagnosability via Log Enhancement. ASPLOS 2011
Wei Xu, Ling Huang, Armando Fox, David A. Patterson, Michael I. Jordan:
Detecting Large-Scale System Problems by Mining Console Logs. ICML 2010
Thomas Reidemeister, Mohammad Ahmad Munawar, Miao Jiang, Paul A. S.
Ward: Diagnosis of recurrent faults using log files. CASCON 2009
Trishul M. Chilimbi, Ben Liblit, Krishna Mehra, Aditya V. Nori, and Kapil
Vaswani: HOLMES: Effective statistical debugging via efficient path
profiling. ICSE 2009