2. About Me
Writing code for the past ~15 years.
1. Dev at IDF intl
2. Team lead, architect at IAI Space Industries
3. CEO at VisualTao
4. Director - AutoCAD Web & Mobile, GM Autodesk IL
5. CEO at Takipi
3. Overview
1. What is bytecode
2. The 3 biggest differences between source code and bytecode
3. 5 things you should know about bytecode
4. Practical uses
4. What is bytecode?
A set of low-level instructions to be executed by the JVM.
~200 instruction types, each ~1-2 bytes in size.
Some instructions are very similar to Java, some completely
different.
5. Bytecode is very similar to Assembly
(That’s why we avoid it…)
assembly -> exec file -> OS
bytecode -> .class file -> JVM
.cpp -> g++ -> exec file -> OS
.java -> JavaC -> .class file -> JVM
.scala -> ScalaC -> .class file -> JVM
6. The 3 biggest differences between source code and bytecode
7. 1. No variables
Bytecode has no named variables. It uses an Assembly-like set of numbered register
slots, known as the locals stack, to hold local variables and parameters.
Values of fields, function results and operands of binary operations (+, -, * ..)
are held in a second stack known as the operand stack.
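To make the two structures concrete, here is a minimal sketch of a toy interpreter. It is not the JVM and not any real API: the class `ToyStackMachine` and its method names are hypothetical and only borrow the instruction names for readability. It evaluates `intField + 1` by pushing values on an operand stack and storing the result in a numbered local slot.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: numbered local slots plus an operand stack.
class ToyStackMachine {
    private final int[] locals = new int[4];                    // numbered local slots
    private final Deque<Integer> operands = new ArrayDeque<>(); // operand stack

    void iconst(int value) { operands.push(value); }            // push a constant
    void iload(int slot)   { operands.push(locals[slot]); }     // slot -> stack
    void istore(int slot)  { locals[slot] = operands.pop(); }   // stack -> slot
    void iadd()            { operands.push(operands.pop() + operands.pop()); }

    int local(int slot)    { return locals[slot]; }

    // Mirrors the shape: <load intField>, ICONST_1, IADD, ISTORE 1
    static int addOne(int intField) {
        ToyStackMachine m = new ToyStackMachine();
        m.iconst(intField); // stands in for ALOAD 0 + GETFIELD intField
        m.iconst(1);        // ICONST_1
        m.iadd();           // IADD
        m.istore(1);        // ISTORE 1: the result now lives in slot 1, not in a named variable
        return m.local(1);
    }

    public static void main(String[] args) {
        System.out.println(addOne(41)); // prints 42
    }
}
```

The point of the sketch is that the name `a` never appears: only slot number 1 does.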
8. Source
    public class LocalVars
    {
        private int intField;
        private double doubleField;

        private boolean state;

        public double getField()
        {
            if (state)
            {
                int a = intField + 1;
                return a;
            }
            else
            {
                double b = doubleField + 1;
                return b;
            }
        }
    }
Bytecode
    public getField() : double
     L0
      ALOAD 0: this
      GETFIELD NoLocalVars.state : boolean
      IFEQ L1
     L2
      ALOAD 0: this
      GETFIELD NoLocalVars.intField : int
      ICONST_1
      IADD
      ISTORE 1
     L3
      GETSTATIC System.out : PrintStream
      ILOAD 1: a
      INVOKEVIRTUAL PrintStream.println(int) : void
     L4
      ILOAD 1: a
      I2D
      DRETURN
     L1
      ALOAD 0: this
      GETFIELD NoLocalVars.doubleField : double
      DCONST_1
      DADD
      DSTORE 1
     L5
      GETSTATIC System.out : PrintStream
      DLOAD 1: b
      INVOKEVIRTUAL PrintStream.println(double) : void
     L6
      DLOAD 1: b
      DRETURN
     L7
      LOCALVARIABLE this NoLocalVars L0 L7 0
      LOCALVARIABLE a int L3 L1 1
      LOCALVARIABLE b double L5 L7 1
      MAXSTACK = 4
      MAXLOCALS = 3
Notes
Notice how the same register slot (1) is re-used between blocks for different variables.
The variable meta-data table describes the mappings between registers and source code
variables.
9. 2. No binary logical operators
No built-in support for &&, ||, ^
Compilers implement these using jump instructions.
10. Source
    public class NoLogicalOperators
    {
        public void and(boolean a, boolean b) {
            if (a && b) {
                System.out.println("its true");
            }
        }

        public void or(boolean a, boolean b) {
            if (a || b) {
                System.out.println("its true");
            }
        }
    }
Bytecode
    public and(boolean, boolean) : void
     L0
      ILOAD 1: a
      IFEQ L1
      ILOAD 2: b
      IFEQ L1
     L2
      GETSTATIC System.out : PrintStream
      LDC "its true"
      INVOKEVIRTUAL PrintStream.println(String) : void
     L1
      RETURN
     L3

    public or(boolean, boolean) : void
     L0
      ILOAD 1: a
      IFNE L1
      ILOAD 2: b
      IFEQ L2
     L1
      GETSTATIC System.out : PrintStream
      LDC "its true"
      INVOKEVIRTUAL PrintStream.println(String) : void
     L2
      RETURN
     L3
Notes
Notice how both && and || operators are implemented using jump instructions that
evaluate the last value on the operand stack.
11. Source
    public class NoLogicalOperators
    {
        public void orX2(boolean a, boolean b,
                         boolean c, boolean d) {
            if ((a || b) && (c || d))
            {
                System.out.println("its true");
            }
        }
    }
Bytecode
    public orX2(boolean, boolean, boolean, boolean) : void
     L0
      ILOAD 1: a
      IFNE L1
      ILOAD 2: b
      IFEQ L2
     L1
      ILOAD 3: c
      IFNE L3
      ILOAD 4: d
      IFEQ L2
     L3
      GETSTATIC System.out : PrintStream
      LDC "its true"
      INVOKEVIRTUAL PrintStream.println(String) : void
     L2
      RETURN
     L4
Notes
For composite ||, && conditions compilers will
generate multiple jump combinations
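As a rough illustration of that jump chain, the sketch below rewrites `(a || b) && (c || d)` by hand so that every branch tests exactly one operand, the way each IFNE/IFEQ does. The class name `JumpLogic` is hypothetical, and the mapping to labels in the comments is an approximation, since Java itself has no goto.

```java
class JumpLogic {
    // (a || b) && (c || d) with one single-operand test per branch.
    static boolean orX2(boolean a, boolean b, boolean c, boolean d) {
        if (!a) {              // a is false, so b decides the first group
            if (!b) {          // like IFEQ L2: both a and b are false, skip the body
                return false;
            }
        }
        // like L1: (a || b) is known to be true here
        if (!c) {              // c is false, so d decides the second group
            if (!d) {          // like IFEQ L2: both c and d are false, skip the body
                return false;
            }
        }
        // like L3: the body of the if statement would run here
        return true;
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean a : values)
            for (boolean b : values)
                for (boolean c : values)
                    for (boolean d : values)
                        if (orX2(a, b, c, d) != ((a || b) && (c || d)))
                            throw new AssertionError("mismatch");
        System.out.println("all 16 combinations agree");
    }
}
```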
12. 3. No loop constructs
There’s no built-in support for while, for, for-each loops.
Compilers implement these using jump instructions.
13. Source
    public class Loops
    {
        public void forLoop(int n) {
            for (int i = 0; i < n; i++) {
                System.out.println(i);
            }
        }
    }
Bytecode
    public forLoop(int) : void
     L0
      ICONST_0
      ISTORE 2
     L1
      GOTO L2
     L3
      GETSTATIC System.out : PrintStream
      ILOAD 2: i
      INVOKEVIRTUAL PrintStream.println(int) : void
     L4
      IINC 2: i 1
     L2
      ILOAD 2: i
      ILOAD 1: n
      IF_ICMPLT L3
     L5
      RETURN
     L6
      LOCALVARIABLE this Loops L0 L6 0
      LOCALVARIABLE n int L0 L6 1
      LOCALVARIABLE i int L1 L5 2
Notes
A for loop is implemented using a conditional jump
instruction comparing i and n
14. Source
    public class Loops
    {
        public void whileLoop(int n)
        {
            int i = 0;

            while (i < n)
            {
                System.out.println(i);
                i++;
            }
        }
    }
Bytecode
    public whileLoop(int) : void
     L0
      ICONST_0
      ISTORE 2
     L1
      GOTO L2
     L3
      GETSTATIC System.out : PrintStream
      ILOAD 2: i
      INVOKEVIRTUAL PrintStream.println(int) : void
     L4
      IINC 2: i 1
     L2
      ILOAD 2: i
      ILOAD 1: n
      IF_ICMPLT L3
     L5
      RETURN
     L6
      LOCALVARIABLE this Loops L0 L6 0
      LOCALVARIABLE n int L0 L6 1
      LOCALVARIABLE i int L1 L6 2
Notes
This while loop is also implemented using a
conditional jump instruction comparing i and n. The
bytecode is nearly identical to the previous loop.
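The shared shape of both loops (initialise, jump to the test, run the body, increment, test again) can be sketched in plain Java. This is only an approximation written by hand: the class `LoopShapes` and the `again` flag are hypothetical stand-ins for the GOTO and IF_ICMPLT jumps, which Java source cannot express directly.

```java
import java.util.ArrayList;
import java.util.List;

class LoopShapes {
    // The ordinary for loop, collecting values instead of printing them.
    static List<Integer> withFor(int n) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add(i);
        }
        return out;
    }

    // Same control flow spelled out step by step.
    static List<Integer> withJumps(int n) {
        List<Integer> out = new ArrayList<>();
        int i = 0;              // ICONST_0, ISTORE 2
        boolean again = i < n;  // GOTO L2, then the first IF_ICMPLT test
        while (again) {
            out.add(i);         // L3: loop body
            i++;                // IINC 2 1
            again = i < n;      // L2: ILOAD i, ILOAD n, IF_ICMPLT L3
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(withFor(3).equals(withJumps(3))); // prints true
    }
}
```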
15. Source
    public class Loops
    {
        public void forEachLoop(List<String> strings) {
            for (String s : strings) {
                System.out.println(s);
            }
        }
    }
Bytecode
    public forEachLoop(List) : void
     L0
      ALOAD 1: strings
      INVOKEINTERFACE List.iterator() : Iterator
      ASTORE 3
      GOTO L1
     L2
      ALOAD 3
      INVOKEINTERFACE Iterator.next() : Object
      CHECKCAST String
      ASTORE 2
     L3
      GETSTATIC System.out : PrintStream
      ALOAD 2: s
      INVOKEVIRTUAL PrintStream.println(String) : void
     L1
      ALOAD 3
      INVOKEINTERFACE Iterator.hasNext() : boolean
      IFNE L2
     L4
      RETURN
     L5
      LOCALVARIABLE this Loops L0 L5 0
      LOCALVARIABLE strings List L0 L5 1
      // declaration: java.util.List<java.lang.String>
      LOCALVARIABLE s String L3 L1 2
Notes
A for-each loop is generated by the javaC compiler as a conditional jump on the
result of the hasNext() method. The resulting bytecode is unaware of the for-each
construct.
Also notice how register 3 is used to hold the iterator.
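Written back as Java, the listing corresponds roughly to an explicit Iterator loop. The sketch below is a hand-written equivalent for illustration, with a hypothetical class name `ForEachDesugared`; it collects the elements instead of printing them so the two versions can be compared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForEachDesugared {
    // The for-each form from the slide.
    static List<String> withForEach(List<String> strings) {
        List<String> out = new ArrayList<>();
        for (String s : strings) {
            out.add(s);
        }
        return out;
    }

    // Hand-written equivalent of the generated calls.
    static List<String> withIterator(List<String> strings) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = strings.iterator(); // List.iterator(), held in its own slot
        while (it.hasNext()) {                    // Iterator.hasNext(), then a conditional jump
            String s = it.next();                 // Iterator.next(), cast to String after erasure
            out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");
        System.out.println(withForEach(input).equals(withIterator(input))); // prints true
    }
}
```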
16. 5 Things you should know about bytecode
that affect everyday programming
17. 1. No String support
Like in C, there’s no built-in support for strings, only char arrays.
Compilers usually use StringBuilder to compensate.
No penalty for concatenating different data types
18. Source
    public class ImplicitStrings
    {
        public String toString(int a, int b)
        {
            String c = "Hello " + a + "World" + b;
            return c;
        }
    }
Bytecode
    // access flags 0x1
    public toString(int, int) : String
     L0
      NEW StringBuilder
      DUP
      LDC "Hello "
      INVOKESPECIAL StringBuilder.<init>(String) : void
      ILOAD 1: a
      INVOKEVIRTUAL StringBuilder.append(int) : StringBuilder
      LDC "World"
      INVOKEVIRTUAL StringBuilder.append(String) : StringBuilder
      ILOAD 2: b
      INVOKEVIRTUAL StringBuilder.append(int) : StringBuilder
      INVOKEVIRTUAL StringBuilder.toString() : String
      ASTORE 3
     L1
      ALOAD 3: c
      ARETURN
     L2
      LOCALVARIABLE this ImplicitStrings L0 L2 0
      LOCALVARIABLE a int L0 L2 1
      LOCALVARIABLE b int L0 L2 2
      LOCALVARIABLE c String L1 L2 3
Notes
JavaC uses java.lang.StringBuilder to combine
(+)strings. Different overloads of the .append()
method are used to concat different data types.
19. Source
    public class ImplicitStrings
    {
        public String toString1(int a, int b)
        {
            String c;

            c = "Hello" + a;
            c += "World" + b;

            return c;
        }
    }
Bytecode
    public toString1(int, int) : String
     L0
      NEW StringBuilder
      DUP
      LDC "Hello"
      INVOKESPECIAL StringBuilder.<init>(String) : void
      ILOAD 1: a
      INVOKEVIRTUAL StringBuilder.append(int) : StringBuilder
      INVOKEVIRTUAL StringBuilder.toString() : String
      ASTORE 3
     L1
      NEW StringBuilder
      DUP
      ALOAD 3: c
      INVOKESTATIC String.valueOf(Object) : String
      INVOKESPECIAL StringBuilder.<init>(String) : void
      LDC "World"
      INVOKEVIRTUAL StringBuilder.append(String) : StringBuilder
      ILOAD 2: b
      INVOKEVIRTUAL StringBuilder.append(int) : StringBuilder
      INVOKEVIRTUAL StringBuilder.toString() : String
      ASTORE 3: c
     L2
      ALOAD 3: c
      ARETURN
     L3
Notes
While this code is identical to the previous example
in terms of functionality, there’s a performance
penalty to note as 2 StringBuilders are constructed
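For comparison, here is a hand-written sketch of the same two methods using StringBuilder explicitly. The class and method names (`ConcatSketch`, `oneBuilder`, `twoStepsOneBuilder`) are hypothetical, and the listings reflect the compiler used for the slides; other javac releases may choose a different concatenation strategy. The second method shows one way to keep the two-statement form while constructing only a single builder.

```java
class ConcatSketch {
    // What "Hello " + a + "World" + b amounts to in the first listing.
    static String oneBuilder(int a, int b) {
        return new StringBuilder("Hello ")
                .append(a)        // append(int) overload
                .append("World")  // append(String) overload
                .append(b)
                .toString();
    }

    // The two-statement version, rewritten by hand so only one builder is created.
    static String twoStepsOneBuilder(int a, int b) {
        StringBuilder sb = new StringBuilder("Hello");
        sb.append(a);                 // c = "Hello" + a
        sb.append("World").append(b); // c += "World" + b
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(oneBuilder(1, 2));         // prints Hello 1World2
        System.out.println(twoStepsOneBuilder(1, 2)); // prints Hello1World2
    }
}
```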
20. 2. Only 4 primitive types
Bytecode only operates on 4 primitive types (int, float, double, long)
vs. the 8 Java primitives.
It doesn’t operate on char, boolean, byte or short directly (they are treated as ints).
21. Source
    public class BytecodePrimitives
    {
        public void mulByeShort(byte b, short c)
        {
            System.out.println(b * c);
        }

        public void mulInts(int b, int c)
        {
            System.out.println(b * c);
        }
    }
Bytecode
    public mulByeShort(byte, short) : void
     L0
      GETSTATIC System.out : PrintStream
      ILOAD 1: b
      ILOAD 2: c
      IMUL
      INVOKEVIRTUAL PrintStream.println(int) : void
     L1
      RETURN
     L2
      LOCALVARIABLE this B_BytecodePrimitives L0 L2 0
      LOCALVARIABLE b byte L0 L2 1
      LOCALVARIABLE c short L0 L2 2

    public mulInts(int, int) : void
     L0
      GETSTATIC System.out : PrintStream
      ILOAD 1: b
      ILOAD 2: c
      IMUL
      INVOKEVIRTUAL PrintStream.println(int) : void
     L1
      RETURN
     L2
      LOCALVARIABLE this B_BytecodePrimitives L0 L2 0
      LOCALVARIABLE b int L0 L2 1
      LOCALVARIABLE c int L0 L2 2
Notes
Notice how the bytecode for these 2 methods is
identical, regardless of the difference in var types
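The same effect is visible from Java source: the product of a byte and a short is already an int. A small sketch, with a hypothetical class name `PromotionSketch`, that boxes the result to expose its type:

```java
class PromotionSketch {
    // Returning Object forces boxing, which reveals the static type of b * c.
    static Object mulByteShort(byte b, short c) {
        return b * c; // boxed as Integer, because byte * short is computed as int
    }

    public static void main(String[] args) {
        Object result = mulByteShort((byte) 100, (short) 300);
        System.out.println(result.getClass().getSimpleName() + " " + result); // prints Integer 30000
    }
}
```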
22. Source
    public class BytecodePrimitives
    {
        public void printIfTrue(boolean b)
        {
            if (b)
            {
                System.out.println("Hi");
            }
        }

        public void printIfN0(int i)
        {
            if (i != 0)
            {
                System.out.println("Hi");
            }
        }
    }
Bytecode
    public printIfTrue(boolean) : void
     L0
      ILOAD 1: b
      IFEQ L1
     L2
      GETSTATIC System.out : PrintStream
      LDC "Hi"
      INVOKEVIRTUAL PrintStream.println(String) : void
     L1
      RETURN
     L3
      LOCALVARIABLE this B_BytecodePrimitives L0 L3 0
      LOCALVARIABLE b boolean L0 L3 1

    public printIfN0(int) : void
     L0
      ILOAD 1: i
      IFEQ L1
     L2
      GETSTATIC System.out : PrintStream
      LDC "Hi"
      INVOKEVIRTUAL PrintStream.println(String) : void
     L1
      RETURN
     L3
      LOCALVARIABLE this B_BytecodePrimitives L0 L3 0
      LOCALVARIABLE i int L0 L3 1
Notes
The same observation is also true when evaluating
conditions. See how both boolean and int
operations are treated the same.
23. 3. Using nested classes?
Compilers will add a synthetic this$0 field that references the outer instance.
If you’re not making calls to your outer class, don’t forget to add a static
modifier.
24. Source
    public class NestedClasses
    {
        public class NestedClass
        {
        }

        public static class StaticNestedClass
        {
        }
    }
Bytecode
    public class C_NestedClasses
    {
    public class C_NestedClasses$NestedClass
    {
      final C_NestedClasses this$0

      public <init>(C_NestedClasses) : void
       L0
        ALOAD 0: this
        ALOAD 1
        PUTFIELD C_NestedClasses$NestedClass.this$0 : C_NestedClasses
        ALOAD 0: this
        INVOKESPECIAL Object.<init>() : void
        RETURN
       L1
        LOCALVARIABLE this C_NestedClasses$NestedClass L0 L1 0
    }
    public class C_NestedClasses$StaticNestedClass
    {
      public <init>() : void
       L0
        ALOAD 0: this
        INVOKESPECIAL Object.<init>() : void
        RETURN
       L1
    }
Notes
The NestedClass inner class has an implicit this$0 field created for it,
which is assigned in the constructor.
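The hidden field can also be observed at runtime through reflection. The sketch below is an illustration under assumptions: the class names are hypothetical, the exact field name is a compiler convention, and some newer javac releases may omit the field when the inner class never uses its outer instance, so the inner class here deliberately reads an outer field.

```java
import java.lang.reflect.Field;

class OuterSketch {
    int value = 7; // package-private, so no accessor method is needed

    class Inner {
        int read() { return value; } // uses the enclosing instance
    }

    static class StaticNested {
    }

    // Counts compiler-generated (synthetic) fields declared on a class.
    static int syntheticFieldCount(Class<?> type) {
        int count = 0;
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Inner: " + syntheticFieldCount(Inner.class));
        System.out.println("StaticNested: " + syntheticFieldCount(StaticNested.class));
    }
}
```

With a typical javac, the inner class reports a synthetic field holding the outer reference, while the static nested class reports none.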
25. Using nested classes (2)?
Try to avoid implicitly creating bridge methods by accessing private members
of the outer class from a nested class (use protected instead)
26. Source
    public class BridgeMethods
    {
        private int member;

        public class BridgeMethodClass
        {
            public void printBridge()
            {
                System.out.println(member);
            }
        }
    }
Bytecode
    static access$0(D_BridgeMethods) : int
     L0
      ALOAD 0
      GETFIELD D_BridgeMethods.member : int
      IRETURN
      MAXSTACK = 1
      MAXLOCALS = 1

    public printBridge() : void
     L0
      GETSTATIC System.out : PrintStream
      ALOAD 0: this
      GETFIELD D_BridgeMethods$BridgeMethodClass.this$0 : D_BridgeMethods
      INVOKESTATIC D_BridgeMethods.access$0(D_BridgeMethods) : int
      INVOKEVIRTUAL PrintStream.println(int) : void
     L1
      RETURN
     L2
      LOCALVARIABLE this D_BridgeMethods$BridgeMethodClass L0 L2 0
Notes
When a nested class accesses a private field or method of its outer class, javaC
adds synthetic bridge methods (access$0 here) so that the nested class can reach
those private members.
27. 4. Boxing and unboxing
Boxing is added by the Java/Scala compiler.
There’s no such concept in bytecode or in the JVM.
Watch out for NullPointerExceptions
28. Source
    public class Boxing
    {
        public void printSqr(int a)
        {
            int a1 = a;

            Integer a2 = a;

            System.out.println(a1 * a2);
        }

        public void check(Integer i)
        {
            if (i == 0)
            {
                System.out.println("zero");
            }
        }
    }
Bytecode
    public printSqr(int) : void
     L0
      ILOAD 1: a
      ISTORE 2
     L1
      ILOAD 1: a
      INVOKESTATIC Integer.valueOf(int) : Integer
      ASTORE 3
     L2
      GETSTATIC System.out : PrintStream
      ILOAD 2: a1
      ALOAD 3: a2
      INVOKEVIRTUAL Integer.intValue() : int
      IMUL
      INVOKEVIRTUAL PrintStream.println(int) : void
     L3
      RETURN
     L4
      LOCALVARIABLE this E_Boxing L0 L4 0
      LOCALVARIABLE a int L0 L4 1
      LOCALVARIABLE a1 int L1 L4 2
      LOCALVARIABLE a2 Integer L2 L4 3

    public check(Integer) : void
     L0
      ALOAD 1: i
      INVOKEVIRTUAL Integer.intValue() : int
      IFNE L1
     L2
      GETSTATIC System.out : PrintStream
      LDC "zero"
      INVOKEVIRTUAL PrintStream.println(String) : void
     L1
      RETURN
     L3
      LOCALVARIABLE this E_Boxing L0 L3 0
      LOCALVARIABLE i Integer L0 L3 1
Notes
Notice how javaC implicitly invokes the various
java.lang.Integer methods.
Other compilers, such as scalaC, use their own
boxed types
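The check(Integer) listing calls Integer.intValue() on its argument before comparing, which is where the NullPointerException risk comes from. A minimal sketch of the failure, using hypothetical names (`UnboxingSketch`, `isZero`, `throwsOnNull`):

```java
class UnboxingSketch {
    // Same shape as check(Integer): the reference is unboxed before the comparison.
    static boolean isZero(Integer i) {
        return i == 0; // the compiler inserts a call to i.intValue() here
    }

    // Returns true when unboxing a null reference throws.
    static boolean throwsOnNull() {
        try {
            isZero(null);
            return false;
        } catch (NullPointerException expected) {
            return true; // intValue() was invoked on null
        }
    }

    public static void main(String[] args) {
        System.out.println(isZero(0));      // prints true
        System.out.println(throwsOnNull()); // prints true
    }
}
```

Nothing in the source of isZero looks like a method call, which is why this failure is easy to miss in code review.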
29. 5. 3 bytecode myths
1. Bytecode supports multiple inheritance
2. Illegal bytecode can crash the JVM (it’s blocked by the JVM verifier)
3. Low-level bytecode can operate outside the JVM sandbox
30. 3 Main bytecode uses
1. Building a compiler
2. Static analysis
3. JVM bytecode instrumentation
31. Building a “better Java” through Scala and ScalaC.
A new OO/functional language which compiles into
standard JVM bytecode.
Transparent from the JVM’s perspective.
32. Takipi - Overview
Explain the cause of server exceptions, latency and unexpected code behavior
at scale.
Help R&D teams solve errors and downtime in production systems, without
having to re-deploy code or sift through log files.
33. Takipi & bytecode
1. Index bytecode in the cloud.
2. When an exception occurs, query the DB to understand which variables,
fields and conditions are causing it.
3. Instrument new bytecode to log the values causing the exception.
4. Present a “story” of the exception to the developer.
34. Thanks!
Join our private beta - takipi.com/signup
tal.weiss@takipi.com
@takipid (tweeting about Java, Scala, DevOps and Cloud)