Debug Information And Where They Come From

Min-Yih “Min” Hsu, COSCUP 2022

Debugging
[Diagram: program execution is paused; the debugger presents the paused program to developers in terms of source code.]
• Source location (e.g. line number)
• Variable values
• Function call hierarchy

Languages & Debugging
Scope of this talk: languages that run and are debugged as native binaries.

Debug Symbols / Info
[Diagram: debug info accompanies the program binary and provides the mapping back to the source code. Debuggers and profilers consume it.]

Today’s Topic
[Diagram: the compiler turns source code into a program binary plus debug info.]
Debug info… and where they come from

Example Input
struct Point { int x, y; };

int foo(int k, int c) {
  struct Point point = {.x = 0, .y = 0};
  if (c) {
    point.x = k;
    point.y = c;
  }
  return point.x + point.y;
}
*Host architecture: x86_64
$ cc -g -c demo.c -o demo.o
$ llvm-dva demo.o …

llvm-dva: Visualizing Debug Info
The leading number on each entry is the source line in demo.c.
{CompileUnit} 'demo.c'
....
3 {Function} extern not_inlined 'foo' -> 'int'
3   {Parameter} 'k' -> 'int'
      {Location}
        {Entry} fbreg -36
....
4   {Variable} 'point' -> 'Point'
      {Location}
        {Entry} fbreg -24
3   {Line}
      {Code} 'endbr64'
      {Code} 'pushq %rbp'
....
4   {Line}
      {Code} 'movl $0x0, -0x8(%rbp)'
5   {Line}
      {Code} 'cmpl $0x0, -0x18(%rbp)'
      {Code} 'je 0xc'

Line numbers & instructions
5 {Line}
    {Code} 'cmpl $0x0, -0x18(%rbp)'    (-0x18(%rbp) is where 'c' is stored)
    {Code} 'je 0xc'

PC Address | Line Number | Assembly (NOT stored in debug info)
0x1C       | 5           | cmpl $0x0,-0x18(%rbp)
0x20       | 5           | je 0xc
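
The address-to-line table above is what a debugger consults to turn a paused program counter into a source line. The following is a minimal sketch of that lookup, using only the two rows shown on the slide. The names `LineRow`, `kRows`, and `line_for_pc` are made up for illustration, and the flat sorted array is a simplification: real DWARF line tables are encoded far more compactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of a simplified line table: where the code for a source line starts. */
struct LineRow {
    uint64_t address;
    unsigned line;
};

/* The two rows from the slide, both belonging to `if (c) {` on source line 5. */
static const struct LineRow kRows[] = {
    {0x1C, 5}, /* cmpl $0x0,-0x18(%rbp) */
    {0x20, 5}, /* je 0xc */
};

/* Returns the line of the last row whose address is at or below `pc`,
   or 0 when no row covers `pc`. Rows must be sorted by ascending address. */
static unsigned line_for_pc(const struct LineRow *rows, size_t count, uint64_t pc) {
    unsigned line = 0;
    for (size_t i = 0; i < count; ++i) {
        assert(i == 0 || rows[i - 1].address <= rows[i].address);
        if (rows[i].address > pc)
            break;
        line = rows[i].line;
    }
    return line;
}
```

Calling `line_for_pc(kRows, 2, 0x20)` yields line 5, which is the line a debugger would report when execution is paused at the `je` instruction.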

Variable locations
3 {Parameter} 'k' -> 'int'
    {Location}
      {Entry} fbreg -36
3 {Parameter} 'c' -> 'int'
    {Location}
      {Entry} fbreg -40
4 {Variable} 'point' -> 'Point'
    {Location}
      {Entry} fbreg -24

[Stack frame diagram: the return address and the previous frame pointer sit at the high-address end. The current stack frame below them holds variable 'point' at fbreg -24, variable 'k' at fbreg -36, and variable 'c' at fbreg -40. Each offset is measured from the frame base, fbreg.]
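
The fbreg offsets can be cross-checked against the assembly shown earlier: 'c' is at fbreg -40, yet the compare instruction reads it from -0x18(%rbp). The sketch below assumes that, for demo.o, the frame base is 16 bytes above %rbp (the saved %rbp plus the return address). That assumption is inferred from the numbers on the slides and is not stated in the talk; other compilers may choose a different frame base. The names are hypothetical.

```c
#include <assert.h>

/* Assumption for this example only: the frame base (fbreg) lies 16 bytes above
   %rbp, i.e. past the saved %rbp and the return address. */
enum { FRAME_BASE_FROM_RBP = 16 };

/* Converts an fbreg-relative offset into an %rbp-relative offset. */
static int rbp_offset_from_fbreg(int fbreg_offset) {
    return fbreg_offset + FRAME_BASE_FROM_RBP;
}

/* Checks the conversion against the two operands visible in the llvm-dva output. */
static void check_against_slides(void) {
    assert(rbp_offset_from_fbreg(-40) == -0x18); /* 'c': cmpl $0x0, -0x18(%rbp) */
    assert(rbp_offset_from_fbreg(-24) == -0x8);  /* 'point': movl $0x0, -0x8(%rbp) */
}
```

Under the same assumption, 'k' at fbreg -36 would live at -0x14(%rbp).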

Type Layout
1 {Struct} 'Point'
1   {Member} public 'x' -> 'int'
      {Location}
        {Entry} offset 0
1   {Member} public 'y' -> 'int'
      {Location}
        {Entry} offset 4

[Stack frame diagram: inside variable 'point' (fbreg -24), field 'x' sits at +0 and field 'y' at +4.]
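
The member offsets recorded in the debug info are the same numbers the compiler uses for the struct layout, so they can be checked from C with `offsetof`. This is a small sketch, assuming a 4-byte `int` as on x86_64. The helper name `fbreg_of_member` is hypothetical and only illustrates how a debugger could combine a variable's location with a member offset.

```c
#include <assert.h>
#include <stddef.h>

struct Point { int x, y; };

/* Location of a member = location of the enclosing variable + member offset. */
static int fbreg_of_member(int fbreg_of_variable, size_t member_offset) {
    return fbreg_of_variable + (int)member_offset;
}

/* The offsets reported by llvm-dva; they hold when int is 4 bytes wide. */
static void check_point_layout(void) {
    assert(offsetof(struct Point, x) == 0);
    assert(offsetof(struct Point, y) == 4);
}
```

With 'point' at fbreg -24, this places `point.x` at fbreg -24 and `point.y` at fbreg -20.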

Other Common Debug Info Properties
• Scopes
• Advanced type information
  • Type aliases
  • Type hierarchy

Debug Info Standards
DWARF
• Default format on Linux, Apple platforms, and most Unix systems
• Supported by the GNU & LLVM toolchains
• Container formats: DWO, DWP, dSYM
CodeView
• Default format on Windows (MSVC)
• Supported by the MSVC & LLVM toolchains
• Container format: PDB

[Diagram: llvm-dva is an abstraction over DWARF and CodeView that presents a logical view.]

Compilation Pipeline: A Crash Course
Source Code -> (Parse) -> AST -> Intermediate Representation (IR) -> Another Intermediate Representation -> Native Code (e.g. *.o files)

Debug Info in a Compilation Pipeline: Highlights
Source Code -> (Parse) -> AST -> Intermediate Representation (IR) -> Another Intermediate Representation -> Native Code (e.g. *.o files)
• How to “carry” debug info in IR?
• Correctly translate from source to debug info in IR
• Preserve debug info across transformations (e.g. optimizations)

Case Study: LLVM / Clang
C Source (foo.c)
void foo() {
  int x = 9;
  int y = 4;
}
$ clang -emit-llvm -S foo.c -o foo.ll
LLVM IR (foo.ll)
define void @foo() {
  %1 = alloca i32
  %2 = alloca i32
  store i32 9, i32* %1
  store i32 4, i32* %2
  ret void
}
The two alloca instructions allocate stack space.

Debug Info in LLVM IR
$ clang -g -emit-llvm -S foo.c -o foo.ll
LLVM IR (foo.ll)
define void @foo() !dbg !7 {
  %1 = alloca i32
  %2 = alloca i32
  call void @llvm.dbg.declare(metadata i32* %1, metadata !11, …), !dbg !13
  store i32 9, i32* %1, !dbg !13
  store i32 4, i32* %2, !dbg !15
  ret void, !dbg !16
}

C Source
1 void foo() {
2   int x = 9;
3   int y = 4;
4 }
LLVM IR
define void @foo() !dbg !7 {
  %1 = alloca i32
  %2 = alloca i32
  store i32 9, i32* %1, !dbg !13
  store i32 4, i32* %2, !dbg !15
  ret void, !dbg !16
}
At the bottom of the IR file:
!13 = !DILocation(line: 2, column: 7, ...)

LLVM IR
  call void @llvm.dbg.declare(metadata i32* %1, metadata !11, …), !dbg !13
!11 = !DILocalVariable(name: "x", ..., line: 2, ...)
!14 = !DILocalVariable(name: "y", ..., line: 3, ...)
llvm.dbg.declare associates %1, which represents the variable’s location during runtime, with the source variable described by !11.

Visualizing Debug Info
$ cc -g -c foo.c -o foo.o
$ llvm-dva foo.o …
{CompileUnit} 'foo.c'
1 {Function} extern not_inlined 'foo'…
2   {Variable} 'x' -> 'int'
      {Location}
        {Entry} fbreg -4
3   {Variable} 'y' -> 'int'
      {Location}
        {Entry} fbreg -8
1   {Line}
      {Code} 'pushq %rbp'
      {Code} 'movq %rsp, %rbp'
2   {Line}
3   {Line}
4   {Line}
      {Code} 'popq %rbp'
      {Code} 'retq'
4   {Line}

How the IR maps to this output:
• store i32 9, i32* %1, !dbg !13 ends up under the '2 {Line}' entry
• %1 = alloca i32 together with call void @llvm.dbg.declare(metadata i32* %1, metadata !11, …) ends up as the location of 'x' (fbreg -4)

…And Developers Live a Happy Debugging Life Ever Since

Debug Info in Optimized Binaries

$ cc -g -O2 -c foo.c
Why?

Debugging Optimized Programs
• Games
  • Difficult (if not nearly impossible) to debug low-FPS games
• Embedded systems
  • Size optimization is usually a hard requirement
• Easier debugging on release binaries
  • E.g. using core files directly from customers

Recall: Early Example Code
 1 struct Point { int x, y; };
 2
 3 int foo(int k, int c) {
 4   struct Point point = {.x = 0, .y = 0};
 5   if (c) {
 6     point.x = k;
 7     point.y = c;
 8   }
 9   return point.x + point.y;
10 }
$ clang -g foo.c -emit-llvm -S
define i32 @foo(i32 %0, i32 %1) !dbg !7 {
  %3 = alloca i32
  %4 = alloca i32
  %5 = alloca %struct.Point
  store i32 %0, i32* %3
  call void @llvm.dbg.declare(metadata i32* %3, metadata !12, ...), !dbg !13
  store i32 %1, i32* %4
  call void @llvm.dbg.declare(metadata %struct.Point* %5, metadata !16, ...), !dbg !21
  ...
}
• The alloca instructions allocate stack space for ‘k’, ‘c’, and ‘point’
• The llvm.dbg.declare calls associate those stack slots with the source variables

Optimized Example Code
$ clang -O2 foo.c -emit-llvm -S
define i32 @foo(i32 %0, i32 %1) {
  %3 = icmp eq i32 %1, 0
  %4 = select i1 %3, i32 0, i32 %0
  %5 = add nsw i32 %4, %1
  ret i32 %5
}
Values are not put on the stack anymore!
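
To see what the optimizer did to the source, it can help to write the optimized IR back as C and compare it with the original function. The sketch below does that by hand: `foo_optimized` and `check_equivalence` are names invented here, and the translation is a reading of the three IR instructions, not compiler output.

```c
#include <assert.h>

struct Point { int x, y; };

/* The original function from the example input. */
static int foo(int k, int c) {
    struct Point point = {.x = 0, .y = 0};
    if (c) {
        point.x = k;
        point.y = c;
    }
    return point.x + point.y;
}

/* Hand-written C mirror of the optimized IR. There is no struct and no branch. */
static int foo_optimized(int k, int c) {
    int c_is_zero = (c == 0);  /* %3 = icmp eq i32 %1, 0 */
    int x = c_is_zero ? 0 : k; /* %4 = select i1 %3, i32 0, i32 %0 */
    return x + c;              /* %5 = add nsw i32 %4, %1 */
}

/* Compares both versions over a small range of inputs. */
static void check_equivalence(void) {
    for (int k = -3; k <= 3; ++k)
        for (int c = -3; c <= 3; ++c)
            assert(foo(k, c) == foo_optimized(k, c));
}
```

Note that `point` never exists as a whole object in `foo_optimized`; only the value that would have been `point.x` survives as a temporary, which is exactly what makes the debugger's job hard.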

Prompt: What is the runtime location/value of source variable ‘point’?
(gdb) print point
<your answer>
(gdb)

Which instructions should be annotated with source lines 4, 6, and 7?

Other Common Challenges
• Updating scopes
• Function inlining

Preserving Debug Info in Optimized Code
• Most modern compilers preserve debug information as part of their code transformations (e.g. optimizations)
• Challenges
  • It’s easy for compiler developers to forget to handle debug info
  • It’s not possible to (faithfully) map optimized code back to source locations in every case
• Debug info in optimized binaries is preserved on a best-effort basis

Case Study: How LLVM Mitigates These Issues

Recall: Optimized Code Example
define i32 @foo(i32 %0, i32 %1) {
  %3 = icmp eq i32 %1, 0
  %4 = select i1 %3, i32 0, i32 %0
  %5 = add nsw i32 %4, %1
  ret i32 %5
}
Prompt:
(gdb) print point
<your answer>
(gdb)

Optimized Example Code w/ Debug Info
$ clang -O2 -g foo.c -emit-llvm -S

Preserving Variable Locations / Values in LLVM
define i32 @foo(i32 %0, i32 %1) {
  call void @llvm.dbg.value(metadata i32 %0, metadata !13, metadata !DIExpression())
  call void @llvm.dbg.value(metadata i32 0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32))
  %3 = icmp eq i32 %1, 0
  %4 = select i1 %3, i32 0, i32 %0
  call void @llvm.dbg.value(metadata i32 %1, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 32))
  %5 = add nsw i32 %4, %1
  ret i32 %5
}

The metadata operand of each llvm.dbg.value names the source variable:
!13 = !DILocalVariable(name: "k", arg: 1, ..., line: 3, type: !11)
!14 = !DILocalVariable(name: "c", arg: 2, ..., line: 3, type: !11)

LLVM Intrinsic Name | Description
llvm.dbg.declare    | Specifies the (memory) location of a source variable. Only a single occurrence per source variable is allowed.
llvm.dbg.value      | Designates a (runtime) value for a source variable. Can occur multiple times for a source variable (akin to assigning different values to it over time).

  call void @llvm.dbg.value(metadata i32 0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32))
!15 = !DILocalVariable(name: "point", ..., line: 4, type: !16)
Why is this not “point.x” or “point.y”?
• The intrinsic says that a fragment of source variable “point” has value 0.
• More precisely, a 32-bit fragment at offset 0 of source variable “point” has value 0 (i.e. the “point.x” field).

Reading the optimized function from top to bottom:
• First, “point.x” & “point.y” were initialized to zeros…
• After these two instructions…
    %3 = icmp eq i32 %1, 0
    %4 = select i1 %3, i32 0, i32 %0
• …“point.x” & “point.y” now have values %4 and %1, respectively
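
The fragment mechanism can be modelled as a debugger keeping its own view of ‘point’ and patching a bit range of it each time a value update arrives. The sketch below replays the updates described above in plain C. `FragmentValue`, `apply_fragment`, and `replay` are hypothetical names, and the model only handles byte-aligned 32-bit fragments; it is an illustration of the idea, not how LLVM or a debugger implements it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct Point { int32_t x, y; };

/* Simplified stand-in for llvm.dbg.value with DW_OP_LLVM_fragment: the bits
   [offset_bits, offset_bits + size_bits) of the variable now hold `value`. */
struct FragmentValue {
    unsigned offset_bits;
    unsigned size_bits;
    int32_t value;
};

/* Patches one fragment into the debugger's view of the variable. */
static void apply_fragment(struct Point *view, struct FragmentValue frag) {
    assert(frag.offset_bits % 8 == 0 && frag.size_bits == 32);
    assert(frag.offset_bits / 8 + sizeof frag.value <= sizeof *view);
    memcpy((unsigned char *)view + frag.offset_bits / 8, &frag.value, sizeof frag.value);
}

/* Replays the updates for given runtime values of k (%0) and c (%1). */
static struct Point replay(int32_t k, int32_t c) {
    struct Point view;
    /* First, both fields are initialized to zero. */
    apply_fragment(&view, (struct FragmentValue){0, 32, 0});
    apply_fragment(&view, (struct FragmentValue){32, 32, 0});
    /* After icmp + select, point.x is %4 and point.y is %1. */
    int32_t select_result = (c == 0) ? 0 : k;
    apply_fragment(&view, (struct FragmentValue){0, 32, select_result});
    apply_fragment(&view, (struct FragmentValue){32, 32, c});
    return view;
}
```

Describing ‘point’ by fragments lets the debugger print the whole struct even though no struct object exists in the optimized code.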

Preserving Debug Locations in LLVM
Principles
• Don’t create misleading debug locations that are only correct in some cases
• If you’re not sure, just drop the debug locations
• Otherwise, preserve as many debug locations as possible
Actions
• Keep
• Merge
• Delete

Debug Locations in our Example Code
define i32 @foo(i32 %0, i32 %1) {
  %3 = icmp eq i32 %1, 0
  %4 = select i1 %3, i32 0, i32 %0
  %5 = add nsw i32 %4, %1
  ret i32 %5
}
• Source line 6 (point.x = k;) is only hit conditionally
• “Don’t create misleading debug locations that are only correct in some cases”
• Action: Delete

Guidelines for updating debug locations in code transformations
Prantl and Kumar, US LLVM Dev Meeting 2020
Full write-up: https://tinyurl.com/llvmdebuginfo

Preserving Debug Info in LLVM: Automatically
• Common transformation APIs help you keep / merge debug locations under the hood
  • e.g. Replace All Uses With (RAUW)
• Handy debug info utilities make debug info manipulation easier
  • salvageDebugInfo helps you generate llvm.dbg.value intrinsics
  • Instruction::applyMergedLocation helps you merge debug locations
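
The idea behind merging two debug locations can be sketched in a few lines: if both instructions came from the same source position, keep it; otherwise fall back to an "unknown" location so that the debugger is never told something that is only sometimes true. The code below is a deliberately simplified model written for this purpose. `DebugLoc` and `merge_locations` are hypothetical names, line 0 is used here to stand for "unknown", the column values in the check are made up, and LLVM's actual merging logic also takes scopes and inlining into account.

```c
#include <assert.h>
#include <stdbool.h>

/* A source position; line 0 stands for "no specific location" in this model. */
struct DebugLoc {
    unsigned line;
    unsigned column;
};

static bool same_location(struct DebugLoc a, struct DebugLoc b) {
    return a.line == b.line && a.column == b.column;
}

/* Simplified merge: identical locations are kept, anything else becomes unknown. */
static struct DebugLoc merge_locations(struct DebugLoc a, struct DebugLoc b) {
    if (same_location(a, b))
        return a;
    struct DebugLoc unknown = {0, 0};
    return unknown;
}

/* Two instructions from different source lines must not claim either line. */
static void check_merge(void) {
    struct DebugLoc line6 = {6, 5}; /* columns are made up for illustration */
    struct DebugLoc line7 = {7, 5};
    assert(merge_locations(line6, line6).line == 6);
    assert(merge_locations(line6, line7).line == 0);
}
```

This follows the principle quoted earlier: when a merged instruction cannot be attributed to one line in every execution, dropping the location is preferable to guessing.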

Summary
• We learned how source locations (e.g. line numbers) and variable locations are represented in debug info
• We learned how debug info is stored in LLVM IR
• We learned the challenges of debug info in optimized binaries, and how LLVM mitigates those issues

Contact
Email: minyihh@uci.edu

GitHub: mshockwave

LinkedIn: https://www.linkedin.com/in/bekketmcclane/

llvm-dva
• LLVM DVA is still in the process of being upstreamed to LLVM
• RFC: https://discourse.llvm.org/t/llvm-dev-rfc-llvm-dva-debug-information-visual-analyzer/62570
• You can, however, build it with this patch: https://reviews.llvm.org/D88661
• The llvm-dva command I used in these slides:
  • llvm-dva --attribute=location,format --output-sort=offset --print=symbols,lines,instructions,scopes <object file>

Debug Info & Object Files
foo.c:
#include <greet.h>
void hello() {…}
bar.c:
#include <greet.h>
void bye() {…}

• foo.o carries debug info for foo.c + greet.h
• bar.o carries debug info for bar.c + greet.h

Splitting (DWARF) Debug Info
[Diagram: foo.c and bar.c, both including greet.h, are compiled to foo.o and bar.o. The debug info is split out into separate debug info containers: foo.c.dwo, bar.c.dwo, and greet.h.dwo.]

Splitting Debug Info
• Saving disk space
  • E.g. able to de-duplicate type information
• Attaching debug info files (e.g. *.dwo) to release binaries
  • E.g. when debugging crashes reported by users
  • Apple platforms do this by default (i.e. *.dSYM folders)

llvm-dva output
{Struct} 'Point'
  {Member} public 'x' -> 'int'
    {Location}
      {Entry} offset 0
  {Member} public 'y' -> 'int'
    {Location}
      {Entry} offset 4
{Function} extern not_inlined 'foo' -> 'int'
  {Variable} 'point' -> 'Point'
    {Location}
      {Entry} fbreg -24

DWARF dump
0x0000002d: DW_TAG_structure_type
              DW_AT_name ("Point")
              …
0x0000003a:   DW_TAG_member
                DW_AT_name ("x")
                DW_AT_data_member_location (0x00)
                …
0x00000045:   DW_TAG_member
                DW_AT_name ("y")
                DW_AT_data_member_location (0x04)
                …
0x00000058: DW_TAG_subprogram
              DW_AT_name ("foo")
              DW_AT_decl_line (3)
              DW_AT_decl_column (0x05)
              …
0x00000090:   DW_TAG_variable
                DW_AT_name ("point")
                DW_AT_decl_line (4)
                …
                DW_AT_location (DW_OP_fbreg -24)

Validating Debug Info in Optimized Code

LLVM Debugify
• Adds artificially created (fake) debug info metadata to every instruction
• Useful for testing whether a compiler transformation preserves debug info as expected
• Doesn’t need a compiler frontend to generate debug info

Before:
define i32 @add(i32 %0, i32 %1) {
  %x = add i32 %0, %1
  ret i32 %x
}
After:
define i32 @add(i32 %0, i32 %1) !dbg !7 {
  %x = add i32 %0, %1, !dbg !8
  ret i32 %x, !dbg !9
}
!7 = !DISubprogram(name: "add", line: 1,...)
!8 = !DILocation(line: 2, ...)

DExTer: End-to-End Debug Info Tests
foo.c
void bar(int *test) {}

int main() {
  int test;
  test = 23;
  bar(&test); // DexLabel('before_bar')
  return test; // DexLabel('after_bar')
}

// DexExpectWatchValue('test', '23', on_line=ref('before_bar'))
// DexExpectWatchValue('test', '23', on_line=ref('after_bar'))

The Dex… comments are the expectations / assertions.
[Diagram: foo.c is compiled to foo.o and run under a debugger; DExTer validates what the debugger reports against those expectations.]

Example: Merging Debug Locations
store i32 9, i32* %1

...
store i32 9, i32* %1

...
store i32 9, i32* %1

...
