There is an increasing interest in understanding and analyzing the use of resources in software and hardware systems. Certifying memory consumption is vital to ensure safety in embedded systems as well as proper administration of their power consumption; understanding the number of messages sent through a network is useful to detect performance bottlenecks or reduce communication costs, etc. Assessing resource usage is indeed a cornerstone in a wide variety of software-intensive systems, ranging from embedded to Cloud computing. It is well known that inferring, and even checking, quantitative bounds is difficult (actually undecidable). Memory consumption is a particularly challenging case of resource-usage analysis due to its non-accumulative nature. Inferring memory consumption requires not only computing bounds for allocations but also taking into account the memory recovered by a garbage collector (GC). In this talk I will present some of the work our group has been performing in order to automatically analyze heap memory requirements. In particular, I will show some basic ideas which are core to our techniques and how they were applied to different problems, ranging from inferring sizes of memory regions in real-time Java to analyzing heap memory requirements in Java/.Net. Then, I will introduce our new compositional approach which is used to analyze (infer/verify) Java and .Net programs. Finally, I will explain some limitations of our approach and discuss some key challenges and directions for future research.
ByteCode 2012 Talk: Quantitative analysis of Java/.Net like programs to understand heap memory requirements
1. Quantitative analysis of Java/.Net like programs to understand heap memory requirements
Diego Garbervetsky
Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA)
ByteCode 2012
3. Static analysis of heap memory in Java-like programs
• Analysis of memory allocations is very hard
– The problem is undecidable in general
– Impossible to find an exact expression for the number of allocated objects
• Predicting actual heap memory requirements is even harder
– With garbage collection, memory required <= memory requested/allocated (live objects <= allocated objects)
– Requires analysis of object lifetimes
4. #Live objects != memory required
• Analyzing actual memory consumption also requires understanding the internals of the underlying VM, the memory manager (and potentially the operating system).
• We still believe that analyzing the number of allocations and live objects is a cornerstone.
5. Stack allocation vs. heap allocation
• Heap allocation: associated with data structures created by the application, controlled by the GC
– requires analysis of object lifetimes
• Stack allocation: frame bounding + stack depth
– requires analysis of recursive method calls
• See Popeea et al. [CNPQ08] (ISMM 2008), Albert et al. [ISMM07], and approaches for functional languages
6. Example

public static D[][] init(int n, int m) {
    D[][] matrix = new D[n][m];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            matrix[i][j] = new D(i);
    return matrix;
}

• n*m objects of type D and an array of n*m references (D[][])
• Ignoring types, we can consider the total allocated as 2*n*m or n*m + 1 objects (depending on how we count arrays)
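As a quick sanity check on that count, one can instrument the example with a run-time allocation counter. This is a minimal sketch, not the analysis itself: it assumes unit object sizes and the per-reference counting of arrays used above, and `Object` stands in for the slide's class D.

```java
// Hypothetical sketch: count allocations in init(n, m) dynamically and
// compare against the closed form 2*n*m (arrays counted per reference).
public class AllocCount {
    static long allocs;

    static Object[][] init(int n, int m) {
        Object[][] matrix = new Object[n][m];
        allocs += (long) n * m;               // the D[][] array: n*m references
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++) {
                matrix[i][j] = new Object();  // stands in for new D(i)
                allocs++;
            }
        return matrix;
    }

    static long count(int n, int m) {
        allocs = 0;
        init(n, m);
        return allocs;
    }

    public static void main(String[] args) {
        System.out.println(count(3, 4)); // 24 = 2*3*4
    }
}
```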
7. Verification

public static D[][] init(int n, int m)
  ensures memoryAlloc <= 2*n*m;
{
    D[][] matrix = new D[n][m];
    for (int i = 0; i < n; i++)
        invariant memoryAlloc <= n*m + m*i;
        for (int j = 0; j < m; j++)
            invariant memoryAlloc <= n*m + m*i + j;
            matrix[i][j] = new D(i);
    return matrix;
}

• Non-linear SMT solver + invariants
8. Verification

public static D[][] init(int n, int m)
  ensures memoryReq <= 2*n*m + 1;
{
    escapes D[][] matrix = new D[n][m];
    for (int i = 0; i < n; i++)
        invariant memoryReq <= n*m + m*i;
        for (int j = 0; j < m; j++)
            invariant memoryReq <= n*m + m*i + j + 1;
            escapes matrix[i][j] = new D(i);
            collectable A a = new A();
    return matrix;
}

• Include lifetime/sharing/shape information
• Include lifetime/sharing/shape information
9. Inference of bounds in imperative languages
Some approaches for imperative languages:
• Abstract Interpretation [e.g., BCJP09]
• Recurrence equations [e.g., AGG07/09]
• Iteration patterns / Ranking functions [e.g., GZ10]
• Counting / Iteration spaces [BGY05/06,BFGY08,…]
10. Abstract interpretation [e.g., BCJP09]
• Use counters to represent memory allocation (and deallocation)
• Compute program invariants (using a lattice and a fixpoint)

public static D[][] init(int n, int m)
{
    D[][] matrix = new D[n][m];
    objects += n*m;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++) {
            matrix[i][j] = new D(i);
            objects += 1;
        }
    return matrix;
}

• Requires inferring non-linear invariants
– Rodríguez-Carbonell, Müller-Olm, Cachera, etc.
11. Recurrence equations [e.g., AGG07/09]
• Compute a set of recurrence equations
– Then try to find a closed-form solution to the recurrence equations

public static D[][] init(int n, int m)
{
    D[][] matrix = new D[n][m];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            matrix[i][j] = new D(i);
    return matrix;
}

init(n,m)    = loop1(n,m,0)
loop1(n,m,i) = loop2(m,0) + loop1(n,m,i')   {i < n, i' = i+1}
loop1(n,m,i) = 0                            {i >= n}
loop2(m,j)   = 4 + c(D) + loop2(m,j')       {j < m, j' = j+1}
loop2(m,j)   = 0                            {j >= m}

Solution: init(n,m) = 2*n*m
– Albert and collaborators / COSTA analyzer
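The recurrence system above can be evaluated directly, which makes the closed-form solution easy to check. This is an illustrative sketch only: it assumes unit sizes, so each inner iteration contributes 2 units (one array slot plus one D object), matching the solution 2*n*m.

```java
// Sketch (assumed unit sizes): evaluate the recurrence equations for init
// numerically and compare with the closed-form solution 2*n*m.
public class Recurrence {
    // loop2(m, j) = 2 + loop2(m, j+1) when j < m; 0 when j >= m
    static long loop2(int m, int j) {
        return j < m ? 2 + loop2(m, j + 1) : 0;
    }
    // loop1(n, m, i) = loop2(m, 0) + loop1(n, m, i+1) when i < n; 0 when i >= n
    static long loop1(int n, int m, int i) {
        return i < n ? loop2(m, 0) + loop1(n, m, i + 1) : 0;
    }
    static long init(int n, int m) { return loop1(n, m, 0); }

    public static void main(String[] args) {
        System.out.println(init(3, 5)); // 30 = 2*3*5
    }
}
```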
12. Counting / iteration spaces [BGY05/06]

for (i = 0; i < n; i++)
    for (j = 0; j < i; j++)
        new C();

{0 <= i < n, 0 <= j < i}: a set of constraints describing an iteration space.
Dynamic memory requested = number of visits to new statements = number of possible variable assignments at their control location = number of integer solutions of a predicate constraining variable assignments at that control location (i.e., an invariant).
For linear invariants, # of integer solutions = # of integer points = an Ehrhart polynomial.
Here: size(C) * (1/2 n^2 - 1/2 n)
[Figure: the triangular iteration space over axes i and j]
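The counting step is easy to validate by brute force for this particular space. A small sketch (not the symbolic counter itself): enumerating the integer points of {0 <= i < n, 0 <= j < i} and comparing with the polynomial n*(n-1)/2, one point per visit to the new statement.

```java
// Sketch: count integer points of the iteration space {0 <= i < n, 0 <= j < i}
// by enumeration and compare with its Ehrhart-style closed form n*(n-1)/2.
public class IterSpace {
    static long enumerate(int n) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++)
                count++;                 // one visit to new C()
        return count;
    }
    static long closedForm(int n) { return (long) n * (n - 1) / 2; }

    public static void main(String[] args) {
        System.out.println(enumerate(10) + " " + closedForm(10)); // 45 45
    }
}
```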
13. Outline
• Inferring parametric upper-bounds of heap
memory usage (or live objects)
• Our new compositional approach
• Verification of .NET programs
• Conclusions
15. Example
• How much memory is required to run m0?
void m0(int mc) {
1: m1(mc);
2: B[] m2Arr=m2(2 * mc);
}
void m1(int k) {
3: for (int i = 1; i <= k; i++){
4: A a = new A();
5: B[] dummyArr= m2(i);
}
}
B[] m2(int n) {
6: B[] arrB = new B[n];
7: for (int j = 1; j <= n; j++) {
8: arrB[j-1] = new B();
9: C c = new C();
10: c.value = arrB[j-1];
}
11: return arrB;
}
16. Example
• How much memory is required to run m0? (same code as on the previous slide)
[Charts: "ideal" memory consumption along the runs of m0(2) (peak around 14) and m0(7) (peak around 80); both the peak value and the point in the run where the peak occurs change with the parameter]
17. Our goal
An expression over-approximating the peak amount of memory consumed by a method:
• Parametric
• Easy to evaluate at run time, e.g. Required(m)(p1,p2) = 2p2 + p1
• Evaluation cost known "a priori"
Given a method m(p1,…,pn), peak(m) is an expression in terms of p1,…,pn bounding the maximum amount of memory consumed by m.
18. In a nutshell
How: perform a good approximation of memory requirements using a region-based memory manager (RTSJ)
1. Infer total allocations by counting visits to new statements
2. Compute memory regions using escape analysis
3. Compute peak consumption using a region-based memory manager
19. Inferring parametric upper-bounds of heap memory usage (or live objects) [BGY05/06]
I) Computing dynamic memory allocations
Víctor A. Braberman, Diego Garbervetsky, Sergio Yovine: A Static Analysis for Synthesizing Parametric Specifications of Dynamic Memory Consumption. Journal of Object Technology 5(5): 31-58 (2006)
20. Memory requested by a method

void m0(int mc) {
1:  m1(mc);
2:  B[] m2Arr = m2(2 * mc);
}
void m1(int k) {
3:  for (int i = 1; i <= k; i++) {
4:    A a = new A();
5:    B[] dummyArr = m2(i);
    }
}
B[] m2(int n) {
6:  B[] arrB = new B[n];
7:  for (int j = 1; j <= n; j++) {
8:    arrB[j-1] = new B();
9:    C c = new C();
10:   c.value = arrB[j-1];
    }
11: return arrB;
}

Key of the technique: manipulate linear expressions, which are easier to handle and less expensive, in order to generate non-linear expressions.

1: Identify allocation sites. A creation site is a new statement together with its call stack, e.g. m0.1.m1.5.m2.8 is the creation site for new B with stack (m0.1.m1.5), while m0.2.m2.8 is the same statement with stack (m0.2). Creation sites reachable from m0:
CSm0 = {m0.1.m1.4, m0.1.m1.5.m2.6, m0.1.m1.5.m2.8, m0.1.m1.5.m2.9, m0.2.m2.6, m0.2.m2.8, m0.2.m2.9}
2: Find invariants for creation sites, e.g.
Im0(m0.1.m1.5.m2.8) = {k = mc, 1 <= i <= k, n = i, 1 <= j <= n}
3: Count the number of solutions (in terms of the MUA's parameters):
#{(k,i,j,n) | k = mc, 1 <= i <= k, n = i, 1 <= j <= n} = 1/2 mc^2 + 1/2 mc
4: Transform the number of visits into memory consumption:
size(B) * (1/2 mc^2 + 1/2 mc)
5: Sum up the resulting expressions:
(size(B[]) + size(B) + size(C)) * (1/2 mc^2 + 5/2 mc) + size(A) * mc
21. Memory requested by a method
How much memory (in terms of m0's parameters) is requested/allocated by m0?
totAlloc(m0)(mc) = sum over cs in CS_m0 of S(m0, cs)
= (size(B[]) + size(B) + size(C)) * (1/2 mc^2 + 5/2 mc) + size(A) * mc
= 3/2 mc^2 + 17/2 mc   [considering size(T) = 1]
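That closed form can be cross-checked by simply running the example with counters. A sketch under the same assumptions as the analysis result (size(T) = 1 for every type, arrays counted as one unit per reference); the class name and counters are illustrative.

```java
// Sketch: simulate m0/m1/m2 with an allocation counter (unit sizes) and
// compare the total against totAlloc(m0)(mc) = 3/2 mc^2 + 17/2 mc.
public class TotalAlloc {
    static long units;

    static void m0(int mc) { m1(mc); m2(2 * mc); }

    static void m1(int k) {
        for (int i = 1; i <= k; i++) {
            units++;        // 4: new A()
            m2(i);          // 5
        }
    }

    static void m2(int n) {
        units += n;         // 6: new B[n], counted as n references
        for (int j = 1; j <= n; j++) {
            units++;        // 8: new B()
            units++;        // 9: new C()
        }
    }

    static long totAlloc(int mc) {
        units = 0;
        m0(mc);
        return units;
    }

    public static void main(String[] args) {
        System.out.println(totAlloc(4)); // 58 = (3*16 + 17*4)/2
    }
}
```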
22. Problem
• Memory is released by a garbage collector
– Very difficult to predict when, where, and how many objects are collected
[Charts: ideal consumption of m0(2) and m0(7)]
• Our approach: approximate the GC using a scope-based region memory manager
24. RTSJ (Real-Time Specification for Java)
• Scoped memory management
– Dynamic memory organized in regions associated with particular scopes
• Methods, threads, etc.
• Advantages
– Better time predictability (compared with non-real-time GCs)
– More controlled object allocation and deallocation
• Useful for memory-consumption predictability
• But… you must respect the scoping restriction
– Potential dangling references
25. Region-based memory management

void m1(int k) {
    SM.enter(Regions.rm1);
    for (int i = 1; i <= k; i++) {
        A a = SM.newInstance(CSs.m1_4, A.class);
        B[] dummyArr = m2(i);
    }
    SM.exit();
}
B[] m2(int n) {
    SM.enter(Regions.rm2);
    B[] arrB = (B[]) SM.newAInstance(CSs.m2_6, B.class, n);
    for (int j = 1; j <= n; j++) {
        arrB[j - 1] = (B) SM.newInstance(CSs.m2_8, B.class);
        C c = (C) SM.newInstance(CSs.m2_9, C.class);
        c.value = arrB[j - 1];
    }
    SM.exit();
    return arrB;
}

• Region-based program: region rm1 holds the objects from m1.4, m2.6 and m2.8; region rm2 holds the objects from m2.9
Regions.rm1 = <"rm1", {CSs.m1_4, CSs.m2_6, CSs.m2_8}>
Regions.rm2 = <"rm2", {CSs.m2_9}>
Diego Garbervetsky, Chaker Nakhli, Sergio Yovine, Hichem Zorgati: Program Instrumentation and Run-Time Analysis of Scoped Memory in Java. Electr. Notes Theor. Comput. Sci. 113: 105-121 (2005)
26. Region-based memory management
• Memory organized using m-regions
void m0(int mc) {
1: m1(mc);
2: B[] m2Arr=m2(2 * mc);
}
void m1(int k) {
3: for (int i = 1; i <= k; i++){
4: A a = new A();
5: B[] dummyArr= m2(i);
}
}
B[] m2(int n) {
6: B[] arrB = new B[n];
7: for (int j = 1; j <= n; j++) {
8: arrB[j-1] = new B();
9: C c = new C();
10: c.value = arrB[j-1];
}
11: return arrB;
}
27. Region-based memory management
• Escape analysis to infer regions (code of m0/m1/m2 as above)
• Escape(m): objects that live beyond m
– Escape(m0) = {}
– Escape(m1) = {}
– Escape(m2) = {m2.6, m2.8}
• Capture(m): objects that do not live beyond m
– Capture(m0) = {m0.2.m2.6, m0.2.m2.8}
– Capture(m1) = {m1.4, m0.1.m1.5.m2.6, m0.1.m1.5.m2.8}
– Capture(m2) = {m2.9}
• Region(m) = Capture(m)
28. Region-based memory management
• Memory organized using m-regions (same code as above)
29. Obtaining region sizes
• Region(m) = Capture(m)
• memCap(m): an expression in terms of p1,…,pn for the amount of memory required for the region associated with m
• memCap(m) is totAlloc(m) applied only to captured allocations
• memCap(m0) = (size(B[]) + size(B)) * 2mc
• memCap(m1) = (size(B[]) + size(B)) * (1/2 k^2 + 1/2 k) + size(A) * k
• memCap(m2) = size(C) * n
30. Inferring parametric upper-bounds of heap memory usage (or live objects)
III) Approximating peak consumption
– Víctor A. Braberman, Federico Javier Fernández, Diego Garbervetsky, Sergio Yovine: Parametric prediction of heap memory requirements. ISMM 2008: 141-150
– Philippe Clauss, Federico Javier Fernández, Diego Garbervetsky, Sven Verdoolaege: Symbolic Polynomial Maximization over Convex Sets and Its Application to Memory Requirement Estimation. IEEE Trans. VLSI Syst. 17(8): 983-996 (2009)
31. Approximating peak consumption
We over-approximate an ideal memory manager using scope-based memory regions
– m-regions: one region per method
• When & where:
– created at the beginning of the method
– destroyed at the end
• How much memory is allocated/deallocated in each region:
– memCap(m) >= actual region size of m, for any call context
• How much memory is allocated in outer regions:
– memEsc(m) >= actual memory allocated in the callers' regions
32. Approximating peak(m)
Regions' stack evolution: some region configurations cannot happen at the same time, e.g. m0.1.m1.m2 and m0.2.m2.
[Figure: evolution of the region stack rm0 / rm1 / rm2 along a run of m0]
peak(m0) = the maximum, over reachable region-stack configurations, of the sum of the region sizes
33. Approximating peak(m)
Region sizes may vary according to the method calling context:
rsize(m2) = n (assume size(C) = 1)
For the context m0.1.m1.5.m2: {k = mc, 1 <= i <= k, n = i}
{k = mc = n} maximizes it, so
maxrsize(m0.1.m1.5.m2, m0) = mc — in terms of m0's parameters!
[Figure: region stacks rm0 / rm1 / rm2 contributing to peak(m0)]
34. Approximating peak(m)
3. Maximizing instantiated regions:
maxrsize(λ.m, m0)(Pm0) = maximize rsize(m) subject to I(Pm0, Pm, W)
• The m-region is expressed in terms of m's parameters
– rsize(m2) = n
• A complex non-linear maximization problem
• Maximum according to calling context and in terms of the MUA's parameters
– maxrsize(m0.1.m1.5.m2, m0)(mc) = mc
– maxrsize(m0.2.m2, m0)(mc) = 2mc
• Too expensive, even when parameters are instantiated (at run time)
• Execution time difficult to predict
35. Solving maxrsize
• Solution: an approach based on Bernstein basis over polyhedral domains (Clauss et al. 2004)
– Enables bounding a polynomial over a parametric domain given as a set of linear constraints
– Obtains a parametric solution
• Bernstein(pol, I):
– Input: a polynomial pol and a set of linear (parametric) constraints I
– Returns a set of candidate polynomials that bound the maximum value of pol in the domain given by I
36. maxrsize
maxrsize(m0, λ.mk) = max{q(Pm0) in C1} if D1(Pm0), …, max{q(Pm0) in Ck} if Dk(Pm0)
where {Ci, Di} = Bernstein(rsize(mk), I(λ.mk, Pm0))
• maxrsize(m0, m0)(mc) = (size(B[]) + size(B)) * 2mc
• maxrsize(m0.1.m1, m0)(mc) = (size(B[]) + size(B)) * (1/2 mc^2 + 1/2 mc) + size(A) * mc
• maxrsize(m0.1.m1.5.m2, m0)(mc) = size(C) * mc
• maxrsize(m0.2.m2, m0)(mc) = size(C) * 2mc
37. Approximating peak(m)
We consider the largest region for each calling context:
peak(m0)(mc) = the maximum, over call chains reachable from m0, of the sum of maxrsize over the chain's prefixes = mem(m0)
[Figure: region stacks maxr_m0 / maxr_m1 / maxr_m2 for the contexts m0, m0.1.m1.5, m0.1.m1.5.m2 and m0.2]
38. Dynamic memory required to run a method
Memreq_m0(mc) = mc^2 + 7mc
[Chart: region sizes for m0, m1, m2 vs. the ideal consumption memRq(4) along a run of m0(4): init, start m0, call m1, repeated calls to and returns from m2, return from m1, final call to m2, end m0]
39. Inferring parametric upper-bounds of heap memory usage (or live objects)
Tool support
Diego Garbervetsky, Sergio Yovine, Víctor A. Braberman, Martín Rouaux, Alejandro Taboada: Quantitative dynamic-memory analysis for Java. Concurrency and Computation: Practice and Experience 23(14): 1665-1678 (2011)
43. Inferring total allocations
Counting:
• Ensure that variables concerning visits to statements are considered in the counting (to ensure soundness)
• Try to compute a minimal set of variables (to filter out irrelevant variables, for better precision)
44. Inferring regions
Our tool:
1. Automatic inference of m-regions
– Using escape analysis
2. Translation to region-based bytecode
– RC (regions library)
– RTSJ
– JikesVM
45. Refining memory regions
• Escape analysis over-approximates object lifetime (to be safe)
– It may impact the precision of memory regions
• JScoper: a tool for region editing and visualization
– Call-graph visualization
– Region editing
– Interfacing with the escape analysis
– Region-based code generation
– Region-based memory-manager simulator
Andrés Ferrari, Diego Garbervetsky, Víctor A. Braberman, Pablo Listingart, Sergio Yovine: JScoper: Eclipse support for research on scoping and instrumentation for real time Java applications. ETX 2005: 50-54
46. Peak memory computation component
• Non-linear maximization problem solved using an approach based on Bernstein basis over polyhedral domains (Clauss et al. 2004-2009)
• Enables bounding a polynomial over a parametric domain given as a set of linear constraints
• Yields a set of candidate polynomials
48. Limitations
• For loop-intensive programs the tool performs very well
– Not well suited for memory-allocating recursion
• Imprecision comes mainly from:
– Escape analysis
– Program invariants
– Inductive-variable analysis
– Approximations of maximum region sizes
49. More limitations
• The global approach makes the analysis and tuning of results a hard task
– Too many variables, parameters and bindings
• Affects scalability and usability
• Sometimes it is necessary to provide bounds manually
– Recursion, non-analyzable methods, code that is easy to understand but has non-linear invariants
51. Why?
• Scalability
– The complexity of symbolic-manipulation algorithms depends heavily on the number of involved variables
• Usability
– Manual inspection and tuning of program invariants are much easier when dealing with local invariants
52. More reasons
• Dealing with non-analyzable methods
– User-provided annotations (applies also to mutually recursive components)
• Enables the use of other counting mechanisms
• Ability to analyze program fragments
• Better support for polymorphism
53. Compositional analysis = method summaries

B[] m2(int n) {
6:  B[] arrB = new B[n];
7:  for (int j = 1; j <= n; j++) {
8:    arrB[j-1] = new B();
9:    C c = new C();
10:   c.value = arrB[j-1];
    }
11: return arrB;
}

MR_m2 = 3n
6: n   (new B[n])
8: n   (new B(), count {j = 1..n} = n)
9: n   (new C(), count {j = 1..n} = n)

void m1(int k) {
3:  for (int i = 1; i <= k; i++) {
4:    A a = new A();
5:    B[] dummyArr = m2(i);
    }
}

MR_m1 = 3/2 k^2 + 5/2 k
4: k   (new A(), count {i = 1..k} = k)
5: call to m2(i), {i = 1..k}: sum{i = 1..k} 3i = 3*k(k+1)/2 = 3/2 (k^2 + k)
   (a symbolic operation on polynomials)

void m0(int mc) {
1:  m1(mc);
2:  B[] m2Arr = m2(2 * mc);
}

MR_m0 = 3/2 mc^2 + 17/2 mc
1: call to m1(mc) = 3/2 mc^2 + 5/2 mc
2: call to m2(2*mc) = 3*(2mc) = 6mc
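The summary composition above can be sketched in code. This is an illustration only (unit sizes, arrays counted per reference); the real analysis manipulates the polynomials symbolically, whereas here the already-summed closed forms are evaluated and cross-checked numerically.

```java
// Sketch: method summaries as closed-form polynomials, composed at call sites.
public class Summaries {
    static long mrM2(long n)  { return 3 * n; }                   // MR_m2 = 3n
    static long mrM1(long k)  { return k + 3 * k * (k + 1) / 2; } // k new A's + sum_{i=1..k} MR_m2(i)
    static long mrM0(long mc) { return mrM1(mc) + mrM2(2 * mc); } // compose at the two call sites

    public static void main(String[] args) {
        // MR_m1 = 3/2 k^2 + 5/2 k ; MR_m0 = 3/2 mc^2 + 17/2 mc
        System.out.println(mrM1(4) + " " + mrM0(4)); // 34 58
    }
}
```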
54. Challenge
Do not lose too much precision!!
Compositional: 3/2 mc^2 + 17/2 mc vs. global: mc^2 + 7mc
• Specification of memory reclaiming in a compositional fashion
• Symbolic manipulation of summaries
– Maximize polynomials over iteration spaces
– Sum polynomials over iteration spaces
56. Modeling object reclaiming

B[] m2(int n) {
6:  B[] arrB = new B[n];
7:  for (int j = 1; j <= n; j++) {
8:    arrB[j-1] = new B();
9:    C c = new C();
10:   c.value = arrB[j-1];
    }
11: return arrB;
}

Objects created at 6 and 8 live longer than m2: reclaiming them is a responsibility of the caller. Objects created at 9 can be safely collected when m2 finishes.
Idea: enrich summaries in order to distinguish escaping objects from captured objects.
6 and 8 are escaping (residual); 9 is auxiliary (temporary).
57. Compositional analysis (simplified)

B[] m2(int n) { … as before … }
MR_m2 = 3n, ME_m2 = 2n
6, 8: n + n = 2n (escape)
9: n (does not escape)
Escaping memory is accumulative!

void m1(int k) { … as before … }
MR_m1 = k^2 + 3k, ME_m1 = 0
4: k (do not escape)
5: call to m2(i), {i = 1..k}:
– escaping accumulates: sum{i = 1..k} ME_m2(i) = 2*k(k+1)/2 = k^2 + k
– transient: max{i = 1..k} (MR_m2(i) - ME_m2(i)) = k
MR_m1 = k + (k^2 + k) + k   (a symbolic operation on polynomials)

void m0(int mc) { … as before … }
MR_m0 = mc^2 + 7mc, ME_m0 = 0
1: call to m1(mc) = mc^2 + 3mc
2: call to m2(2*mc):
– from ME_m2: 2*2mc = 4mc
– from MR_m2 - ME_m2: 2mc
MR_m0 = max(mc^2 + 3mc, 2mc) + 4mc
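The composition rule sketched above — escaping memory accumulates across calls, while the transient part (MR - ME) is only needed for one call at a time — can be written out explicitly. A hedged sketch with illustrative names, unit sizes assumed:

```java
// Sketch of the escape-aware peak rule: MR(caller) = local + sum of callees' ME
// + max over callees of (MR - ME).
public class PeakCompose {
    // Summaries for m2 (from the slide): MR_m2 = 3n, ME_m2 = 2n.
    static long mrM2(long n) { return 3 * n; }
    static long meM2(long n) { return 2 * n; }

    // m1: k captured A's, plus k calls m2(i).
    static long mrM1(long k) {
        long local = k;                        // new A(), i = 1..k
        long escSum = 0, maxTransient = 0;
        for (long i = 1; i <= k; i++) {
            escSum += meM2(i);                 // escaping memory piles up in m1's region
            maxTransient = Math.max(maxTransient, mrM2(i) - meM2(i));
        }
        return local + escSum + maxTransient;  // = k^2 + 3k
    }

    // m0: m2(2mc)'s escaping part accumulates; the rest is transient (ME_m1 = 0).
    static long mrM0(long mc) {
        long escSum = meM2(2 * mc);            // 4mc
        long maxTransient = Math.max(mrM1(mc), mrM2(2 * mc) - meM2(2 * mc));
        return escSum + maxTransient;          // = mc^2 + 7mc
    }

    public static void main(String[] args) {
        System.out.println(mrM1(4) + " " + mrM0(4)); // 28 44
    }
}
```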
58. Precision vs. granularity
• How can we help the caller to identify whether escaping objects can be collected?
• Subheap: a set of objects that have similar lifetimes
– Subheap descriptor: an identifier for a subheap
• Esc can be organized in terms of subheaps

public B two() {
1:  this.g = new A();
2:  b = new B();
3:  b.f = new C();
4:  return b;
}

• Esc(all) = 3
• Esc({This}) = 1
• Esc({Ret}) = 2, or Esc({Ret}) = 1 and Esc({Ret_f}) = 1
59. Subheap descriptors and Esc analysis

public B two() {
1:  this.g = new A();
2:  b = new B();
3:  b.f = new C();
4:  return b;
}

[Figure: Salcianu's points-to graph for two(): nodes 1 (A), 2 (B), 3 (C); this.g points to 1, b points to 2, 2.f points to 3]
Steensgaard-like equivalences: a subheap descriptor = an equivalence class, e.g. {b ~ 2 ~ 3 ~ ret} and {this ~ 1}
• Esc({1}) = 1, Esc({2}) = 1, Esc({3}) = 1
• Esc(ret) = 2, Esc(this) = 1
60. From Code Contracts to Memory Contracts

static public int GCD(int x, int y) {
    Contract.Requires(x > 0 && y > 0);
    Contract.Ensures(Contract.Result<int>() > 0);
    while (true) {
        if (x < y) {
            y %= x;
            if (y == 0) return x;
        } else {
            x %= y;
            if (x == 0) return y;
        }
    }
}

• Code Contracts: pre/postconditions, loop and object invariants
– Runtime checking
– Static checking using an abstract interpreter
• Automatic inference of loop invariants!!
61. From Code Contracts to Memory Contracts
• Memory contracts
– Contract.Memory.X / X = {MemReq, RSD, TMP}
• Annotations for specifying object lifetime
– DestRsd: whether an allocation escapes through a subheap
– AddRsd: determines the destination of a callee subheap in the caller
– BindRsd: associates a subheap name with an expression in the code
62. [Screenshot of annotated code in the IDE]
• DestTmp() indicates that the next object is temporary: logger is a temporary object since it can be collected when the method finishes its execution
• Rsd(sh, n) (or Esc) specifies that the number of objects escaping through sh is at most n: node is a residual object because its lifetime exceeds that of the method that creates it
• DestRsd(sh) indicates that the next object is escaping, tagged with sh
• Squiggles mean errors
63. How are annotations checked?
• Code is instrumented to include counters
– Each subheap has its own counter
– Annotations determine which counter should be incremented
• Lifetime annotations are checked using a points-to and escape analysis
Jonathan Tapicer, Diego Garbervetsky, Martín Rouaux: "Resource Usage Contracts for .NET", TOPI 2011: 1st Workshop on Developing Tools as Plug-ins (2011)
64. How are annotations checked? (simplified)

Annotated:
B[] m2(int n) {
    Contract.Memory.Rsd(ret, 2n)
    Contract.Memory.MR(3n)
    Contract.Memory.DestRsd();
6:  B[] arrB = new B[n];
7:  for (int j = 1; j <= n; j++) {
        Contract.Memory.DestRsd();
8:      arrB[j-1] = new B();
9:      C c = new C();
10:     c.value = arrB[j-1];
    }
11: return arrB;
}

Instrumented:
B[] m2(int n) {
    Contract.Ensures(ret_m2 <= 2n)
    Contract.Ensures(mr <= 3n)
    ret_m2 += n; mr += n;
6:  B[] arrB = new B[n];
7:  for (int j = 1; j <= n; j++) {
        ret_m2++; mr++;
8:      arrB[j-1] = new B();
        mr++;
9:      C c = new C();
10:     c.value = arrB[j-1];
    }
11: return arrB;
}

• We leverage automatic invariant inference from Clousot (the Code Contracts checker) to prove this automatically
65. How are annotations checked? (problem)

B[] m2(int n) {
    Contract.Memory.Rsd(Ret, 2n)
    Contract.Memory.MR(3n)
    …
}

Annotated:
void m1(int k) {
    Contract.Memory.Rsd(this, k*k+k)
    Contract.Memory.MR(k*k+3k)
3:  for (int i = 1; i <= k; i++) {
4:      A a = new A();
        Contract.Memory.AddRsd(This, Ret);
5:      this.f[i] = m2(i);
    }
}

Instrumented:
void m1(int k) {
    Contract.Ensures(this_m1 <= k*k+k)
    Contract.Ensures(mr <= k*k+3k)
3:  for (int i = 1; i <= k; i++) {
        mr++;
4:      A a = new A();
        this_m1 += 2i; mr += 2i;
        mr += max{1, i};
5:      this.f[i] = m2(i);
    }
}

• To prove this_m1 <= k*k + k we need a non-linear invariant!
• Beyond the capabilities of Clousot
66. How are annotations checked?
Good news!
• We know how to count/sum solutions to iteration spaces
• We just need linear invariants

void m1(int k) {
    Contract.Ensures(this_m1 <= k*k+k)    // inferred by Clousot
    Contract.Ensures(mr <= k*k+3k)
3:  for (int i = 1; i <= k; i++) {
        Contract.itSpace(1 <= i <= k)
        mr++;
4:      A a = new A();
        loop = sum(itSpace, 2i);          // computed using the symbolic calculator
        mr += max{1, i};
5:      this.f[i] = m2(i);
    }
    Contract.assume(loop <= k*k+k)        // we force the checker to accept the bound
    this_m1 += loop; mr += loop;
}
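The symbolic step above is just a sum over a linear iteration space. A minimal numeric sketch (illustrative names) showing that sum{i = 1..k} 2i, the quantity the checker is asked to assume, indeed equals k*k + k:

```java
// Sketch: the checker only sees the linear space 1 <= i <= k; the symbolic
// calculator sums the per-iteration cost 2i over it, giving k^2 + k.
public class SymbolicSum {
    static long sumByIteration(long k) {
        long total = 0;
        for (long i = 1; i <= k; i++)
            total += 2 * i;               // Rsd units added per call m2(i)
        return total;
    }
    static long closedForm(long k) { return k * k + k; }

    public static void main(String[] args) {
        System.out.println(sumByIteration(7) + " " + closedForm(7)); // 56 56
    }
}
```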
68. Polymorphism

class A1 extends A {
    public int do(int n) { .. }   // ME_A1.do = 2
}
class A2 extends A {
    public int do(int n) { .. }   // ME_A2.do = n
}

public void test(List l) {
    foreach (e : l) {
        if (cond) a = new A1();
        else a = new A2();
        a.do(e);
    }
}

sum{i = 1..size(l)} (max{ME_A1.do(l[i]), ME_A2.do(l[i])})
• Hard to solve symbolically…
– Need to over-approximate the max operation
69. Polymorphism

public void test(List l) {
    if (cond) a = new A1();
    else a = new A2();
    foreach (e : l) {
        a.do(e);
    }
}

• The object remains the same in all iterations:
loop_A1 = sum{i = 1..size(l)} ME_A1.do(l[i])
loop_A2 = sum{i = 1..size(l)} ME_A2.do(l[i])
max(loop_A1, loop_A2)
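The difference between the two slides is max-of-sums versus sum-of-maxes. A small sketch with illustrative names (costs taken from the slides: ME_A1.do = 2, ME_A2.do = n): when the receiver is fixed before the loop, the exact cost is the max of the two loop sums; when it may change per iteration, summing the per-iteration max is a sound but coarser bound.

```java
// Sketch: max(sum A1, sum A2) vs. sum of per-iteration maxes.
public class PolyBound {
    static long meA1(long n) { return 2; }   // ME_A1.do = 2
    static long meA2(long n) { return n; }   // ME_A2.do = n

    // Receiver fixed before the loop (slide 69): exact = max of loop sums.
    static long maxOfSums(long[] sizes) {
        long a1 = 0, a2 = 0;
        for (long e : sizes) { a1 += meA1(e); a2 += meA2(e); }
        return Math.max(a1, a2);
    }

    // Receiver may change each iteration (slide 68): over-approximate.
    static long sumOfMaxes(long[] sizes) {
        long total = 0;
        for (long e : sizes) total += Math.max(meA1(e), meA2(e));
        return total;
    }

    public static void main(String[] args) {
        long[] l = {1, 3, 1, 4};
        System.out.println(maxOfSums(l) + " <= " + sumOfMaxes(l)); // 9 <= 11
    }
}
```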
70. About flow sensitivity

public void m1_m2() { m1(); m2(); }
public void m1_m2() { m2(); m1(); }

Both orders compose to the same summary M1_M2, even though the intermediate region configurations differ (M1 then M2 vs. M2 then M1).
It is more complicated with loops and conditionals…

public void for_m1_m2(int n) {
    for (int i = 0; i < n; i++) { m1(i); m2(i); }     // sequences: M1,M2, M1,M2, …
}
public void for_m1_m2(int n) {
    for (int i = 0; i < n; i++) { if (i % 2) m1(i); m2(i); }  // sequences: M2,M1, M2,M2, …
}
71. From objects to actual memory consumption
• Consider all objects
– Model VM behavior
– VM memory vs. system memory
• Improve lifetime inference
– Reachable objects, live objects
– (parametric in the GC)
• Consumption patterns
– Lazy initialization, global fields that are constantly overwritten
• Specification language (independent of the escape analysis)
– For non-analyzable code, interfaces
72. Current + future work
• Improving the interprocedural inference analysis
– "Pluggable" object-lifetime analyses
– Support for recursive method invocations
• Better invariant inference
– Tool support for annotations / checking
• A new specification language
73. Current + future work
• Experimenting with static verification of memory consumption
– Code Contracts, SMT solver + the barvinok library
• An intermediate language (a sort of Boogie)
– With translations to Java, C#, C, etc.
• Inference for new memory models
– ISMM 2011: short-term memory for self-collecting mutators
74. Conclusions
• The analysis of memory requirements is feasible (but very difficult…)
• We (and other groups) have made good progress
• We need to improve in order to analyze real programs
– Compute actual consumption
– Seriously improve scalability
• We believe compositional approaches are a promising direction
75. Credits
• Victor Braberman • Sergio Yovine • Philippe Clauss • Samuel Hym • Daniel Gorin • Federico Fernandez • Matias Grumberg • Javier Tapicer • Martin Rouaux • Andres Ferrari • Alejandro Taboada • Guillaume Salagnac
Speaker notes
Nowadays we can see more and more embedded systems and mobile computers surrounding us. Even though they are becoming powerful devices, they have limited resources: limited battery life, limited memory, and they usually communicate with other computers, incurring communication costs. So it is becoming more and more necessary to understand and control the consumption of these resources. Even in other settings, like computer farms or clusters, it is vital for the business to make proper use of them.
In this talk the focus is on understanding memory consumption in Java-like languages, which provide automatic memory management that takes care of the deallocation of unused objects. This kind of feature makes program development easier and less error-prone. However, in this setting, where objects are allocated dynamically, predicting memory utilization is very difficult. In fact, just predicting memory allocations is hard, indeed undecidable: it is a problem similar to the halting problem, assuming that every program statement consumes some amount of memory. Even harder is predicting actual memory requirements, taking into account that memory can be reused once unused objects are collected by the GC.
Even though we know it is not the complete solution, we believe that being able to understand things like the number of object allocations and the maximum number of live objects is one of the foundations for understanding real memory consumption. In some systems (like some region-based memory managers) this information is close to what we need to produce real bounds.
In this work we do not handle recursive programs and do not generate bounds on the size of the stack. There are nice works on stack allocation, such as those by Chin Wei-Ngan, Corneliu Popeea and colleagues from Singapore, and for functional languages.
Consider this example that returns a matrix filled with objects of type D. It allocates an array of n times m references and n times m objects of type D. I will consider arrays not as one object but as a collection of many references to objects. Taking this into account, this method allocates 2 times n times m objects.
Suppose we want to verify this program against a specification that simply bounds the number of objects allocated. We need to generate the verification conditions in order to prove that the program satisfies the spec. Since the program has loops, we need at least to provide loop invariants describing how memory evolves in order to prove the spec.
In general, if we want to compute actual memory requirements, we will need to include several annotations to explain object lifetime and the shape of the data structures.
Related to symbolic complexity computation.
Requires computing non-linear invariants, like polynomial invariants, which requires the definition of a complex lattice; this tends to be expensive in terms of computational cost, or may need widening operators. More recent works (2011) show improvements in this topic.
Thus, look at the following code. It basically allocates several objects in the bodies of two loops. For instance, when m0 calls m1, m1 allocates k objects of type A and then makes several calls to m2, which allocates n objects of types B and C and an array of type B. Later, m0 calls m2 again. The pictures on the right show the memory consumption of two different executions of the same program using a sort of "ideal" GC which releases objects when they are no longer reachable. The amount of memory required by this program depends on the parameter mc, m0's parameter. Notice that not only does the peak memory consumption change when mc changes; the place where the peak is reached also changes according to the calling context.
Our goal is to obtain an expression that over-approximates peak memory consumption. We want the expression to be in terms of the parameters of the method under analysis. We also want the expression to be easy to evaluate; that means the evaluation cost has to be low, or at least known beforehand. In this work we propose a technique to obtain such an expression: given a method m, we obtain a parametric expression over-approximating the maximum amount of memory consumed by any run starting at m.
Then, to obtain the certificates that approximate the requested memory, we apply the following algorithm…
Remember that creation sites are paths that may involve several methods, so the invariants should predicate over the variables of those methods. Let's take a look at our running example. The first case is the creation site representing m0 calling m1, which creates an object of type A inside its loop. There, we have the following invariant representing the binding in the call from m0 to m1 and the iteration space of the loop where the objects are created. The second case is when m0 calls m1 and m1 calls m2.
Recall that to count visits to statements we basically use invariants. However, as invariants only constrain variables, they lose information about the call stack configuration, and the call stack is relevant for counting visits. For example, examining the example's call graph, we immediately observe that, from a static view, method m2 is called at least twice, so its allocation sites will be executed twice. To cope with this issue, we decided to distinguish program locations not only by their local control location, but also by their history, that is, the call chain that leads to the allocation site. We introduce the notion of a creation site: a path from the method under analysis to a new statement reachable from that method. A creation site denotes not only the new statement but also a control projection of the call stack when reaching it. For example, this is the set of creation sites for the program, assuming we are analyzing method m0. Let's take a quick tour of the parts of the algorithm. The first thing we do is identify the points where objects are created. An important aspect is that our technique distinguishes allocation points not only by the place in the method where the new is executed, but by the whole path from the method under analysis to that point. In the example, we can distinguish the objects created along the branch m0, m1, m2 from those of the branch m0, m2. We call these allocation points extended with their calling context creation sites.
Now we are ready to count the number of visits of a creation site. For example, we want an expression in terms of method m0's parameters (that is, the variable mc) that over-approximates the number of visits to this statement for the following stack configuration. To solve that, we take the invariant for that creation site and count the number of possible variable assignments satisfying it, assuming the value of mc (m0's parameter) is fixed. Assuming the invariant can be expressed as a polyhedron, using the technique I mentioned before, we can get a polynomial expressing the number of solutions of this formula. In this case k and n have only one possible value each, mc and n respectively, but i ranges between 1 and k, and j between 1 and i. So the number of possible solutions is the following expression. Note that counting the number of solutions of the invariant for a creation site cs = π.l over-approximates the number of visits of the new statement at l when the program stack is π.
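The counting step can be pictured by brute force. The real technique counts the integer points symbolically over the polyhedron; the sketch below only enumerates the assignments satisfying the invariant k = mc, 1 <= i <= k, 1 <= j <= i and compares the count with the closed form k(k+1)/2.

```java
// Brute-force illustration of the counting step; the actual analysis
// obtains the closed form symbolically from the invariant's polyhedron.
class VisitCount {
    static long countVisits(int mc) {
        int k = mc;                      // the binding invariant fixes k = mc
        long visits = 0;
        for (int i = 1; i <= k; i++)     // iteration space of m1's loop
            for (int j = 1; j <= i; j++) // iteration space of m2's loop
                visits++;                // one visit of the creation site
        return visits;
    }
    public static void main(String[] args) {
        int mc = 10;
        System.out.println(countVisits(mc));    // 55
        System.out.println(mc * (mc + 1) / 2);  // closed form: k(k+1)/2 = 55
    }
}
```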
We know how to approximate the number of visits of a statement that allocates memory. Now we have to take into account the size of the allocated objects. Following the same example, we already know the number of visits. To get the desired expression, we just multiply it by a symbolic expression denoting the runtime size of the allocated object. For array instantiations, we treat them as a set of nested loops creating single-instance objects (of the size of a reference to the type of the objects contained in the array). We do that by adding new constraints to the invariant describing these loops' iteration spaces.
For instance, we can synthesize the following expression approximating the total amount of dynamic memory allocated by m0. However, this expression is too conservative if we want to predict actual memory requirements, since object deallocation is not taken into account.
The problem is that it is difficult to predict when objects are actually released by the GC, and it is even more complicated to know how many. Thus, our approach is to approximate an ideal GC with a more coarse-grained one based on memory regions.
The Real-Time Specification for Java proposes a memory management mechanism based on scoped memory, where objects are organized in regions associated with a particular scope. This approach allows better time predictability compared with GCs, and more control over the way objects are allocated and deallocated.
Let us quickly see how a program works using this API.
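The behavior of a scope-based manager can be simulated in plain Java. This is not the RTSJ API; the class and method names below are ours. Entering a method pushes a region, allocations go into the current region, and exiting frees the whole region at once.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A plain-Java simulation of scoped memory regions (illustrative, not the
// RTSJ javax.realtime API): one region per active method invocation.
class RegionManager {
    private final Deque<long[]> regions = new ArrayDeque<>(); // bytes per region
    long live = 0, peak = 0;

    void enter() { regions.push(new long[]{0}); }   // method entry: new region
    void alloc(long bytes) {                        // allocate in current region
        regions.peek()[0] += bytes;
        live += bytes;
        peak = Math.max(peak, live);
    }
    void exit() { live -= regions.pop()[0]; }       // method exit: region freed whole

    public static void main(String[] args) {
        RegionManager rm = new RegionManager();
        rm.enter(); rm.alloc(10);       // region for the caller
        rm.enter(); rm.alloc(5);        // region for a callee
        rm.exit();                      // callee returns: its 5 bytes are reclaimed
        rm.exit();
        System.out.println(rm.peak + " " + rm.live);   // 15 0
    }
}
```

This deallocate-by-region granularity is what makes the consumption predictable enough to bound statically, at the price of keeping some objects alive longer than an ideal GC would.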
Let's return to our example. The picture shows a graph that approximates the references between the objects this program creates. Each node in the graph represents all the objects that may be created at that point; there are as many nodes as creation sites.
Thus, to synthesize memory regions we use escape analysis as a way to approximate object lifetimes. Basically, the region of each method is composed of the creation sites that do not escape the method but do escape from some of its callees.
In this case we define a memory manager with one region associated with each method, so the lifetime of the objects in a region is directly tied to the lifetime of the associated method. For instance, for our example we can build this region organization. Boxes represent creation sites, which are abstractions of all objects created at a program location. Light green boxes correspond to objects created in method m2 along the call chain that passes through m1, and dark green ones to objects from m2 when it is called directly from m0. To respect the scoping rules, an object has to be allocated in a region whose lifetime is equal to or longer than the object's lifetime. In this case, objects refer to objects in the same region or to objects with a longer lifetime.
A method's associated region can be synthesized using escape analysis. Thus, we can adapt our technique for computing total allocations to consider only the creation sites captured by the method under analysis. Using this approach we are able to obtain parametric expressions that over-approximate region sizes.
OK, so we can model a memory manager using scope-based regions, synthesize the memory regions and compute their sizes. In fact, using similar reasoning, we can also compute the amount of memory escaping a method. We will use all this information to model peak memory consumption in this setting.
Suppose we want to analyze the peak consumption of a method m. When m is executed, a region for m will be created, and also a region for each method that m calls. During the execution of those methods, some objects will be allocated in one of the newly created regions or, if their lifetime exceeds m's lifetime, in a pre-existent region. So, when computing the peak consumption for m, we distinguish the maximum consumption produced in newly created regions from the consumption produced in already existent regions. With that approach, we focus on a technique to produce an over-approximation of the peak amount of memory consumed by newly created regions. The consumption in the pre-existent regions can be approximated using the estimation of the amount of memory escaping the method under analysis: mem(m) >= peak(m) + memEsc(m), where peak(m) approximates the peak memory allocated in newly created regions and memEsc(m) the memory allocated in pre-existent regions.
So what happens with the regions created by the method under analysis? This is how memory regions evolve following the execution of the methods. Observe that the maximum number of active regions is bounded by the length of the longest path in the call graph. Thus, the maximum consumption of newly created regions can be computed by taking the region configuration whose sum of region sizes is the largest. The problem is that the number of configurations can be infinite, and of course every region size depends on the state of the program when the associated method is called.
Notice that we have bounds on regions in terms of their associated method's parameters, and region sizes evolve according to their calling context. For instance, we can analyze the evolution of the regions for m2 when it is called from m1. The size of m2's region is actually bounded by the maximum value the loop in m1 can reach, which depends on the parameter used when m1 is invoked. All this is expressed in the invariant that models the calling context. We can approximate a set of region configurations within a calling context by considering the one with the largest region size; we denote this maximum "maxrsize". In this case the largest region for m2 is produced when n is equal to mc. We want to express these expressions in terms of the root method's parameters, in order to be able to compare them with other region sizes later.
Thus, the most important problem now is to compute the largest region size for a given calling context. In particular, we need an expression representing the largest size a region for a method m can reach when called from the method under analysis (in this case m0) in a particular calling context, given by the call chain leading to m. The calling context is given by a binding invariant, and the region size is a function of its associated method's parameters. Remember that we already have a method to approximate region sizes; we can use it, or these specifications can be provided by other means. Since our technique yields polynomials in terms of the associated method's parameters, we are dealing with a non-linear optimization problem. Non-linear optimization problems are hard to solve and their execution time is difficult to predict, so we cannot solve this problem at runtime; we need to solve it at compile time. This makes the problem even more complicated, because we need a parametric solution that can be instantiated when the parameter values become known at runtime.
Luckily, there is a solution to this problem, due to Clauss, which extends an original technique of Bernstein for bounding polynomials. Clauss's technique receives a polynomial and a domain expressed as a polytope, and produces a set of candidate polynomials for the maximum and minimum bounds of the input polynomial within that domain.
Thus, if we can describe binding invariants using linear constraints and region sizes using polynomials, we can use this technique to compute maxrsize. For our example we can produce the following expressions, which are in terms of method m0's parameters, even though the region sizes are expressed in terms of their own methods' parameters. The whole problem is now reduced to a max comparison between polynomials over the same parameters. This variable renaming can be performed thanks to the Bernstein transformation, and because the invariant binds the parameters of the method under analysis with the parameters of the method for which we want to perform the maximization.
For instance, consider the following polynomial, domain and set of parameters. Bernstein applied to this polynomial returns the following two sets of candidates, determined for a parametric domain. Notice that the result is in terms of the selected parameters. This is a remarkable feature of this symbolic technique: the result is a parametric expression which doesn't have to be in terms of the parameters of the input polynomial. However, there are still some problems: the output is not a single expression but a set of candidate polynomials that need to be evaluated to determine which is the largest. This set can be reduced by applying some symbolic techniques, or it can be done directly at runtime; in any case the evaluation cost is known a priori.
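The core idea behind the Bernstein bound can be shown in one variable over [0, 1]; Clauss's method is the parametric, multivariate generalization over polytopes. The Bernstein coefficients of a polynomial in the Bernstein basis bound its range on the domain, so the largest coefficient is a sound upper bound for the maximum.

```java
// One-variable illustration of the Bernstein bound on [0, 1] (a sketch of
// the underlying idea, not Clauss's parametric technique).
class BernsteinBound {
    static double binom(int n, int k) {
        double r = 1;
        for (int i = 1; i <= k; i++) r = r * (n - k + i) / i;
        return r;
    }
    // a[i] is the coefficient of x^i; returns the largest Bernstein
    // coefficient, an upper bound for p(x) on [0, 1].
    static double upperBound(double[] a) {
        int n = a.length - 1;
        double best = Double.NEGATIVE_INFINITY;
        for (int k = 0; k <= n; k++) {
            double bk = 0;   // b_k = sum_{i<=k} C(k,i)/C(n,i) * a_i
            for (int i = 0; i <= k; i++)
                bk += binom(k, i) / binom(n, i) * a[i];
            best = Math.max(best, bk);
        }
        return best;
    }
    public static void main(String[] args) {
        // p(x) = x - x^2: true maximum on [0, 1] is 0.25 at x = 0.5.
        System.out.println(upperBound(new double[]{0, 1, -1}));  // 0.5
    }
}
```

Note the bound (0.5) safely over-approximates the true maximum (0.25), which matches the talk's point: the candidates are conservative and may need further comparison or refinement.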
Once we approximate the region sizes using maxrsize, for non-recursive programs we obtain a finite set of region stacks. Thus, we just need to compute the maximum among these configurations.
So far we approximate the peak consumption by considering the sum of the largest regions of every region configuration among the possible call chains starting from the method under analysis.
Instead of comparing the region configurations for all paths in the program, we can compute the same maximum recursively by traversing the call graph and generating an evaluation tree whose leaves are non-linear maximization problems and whose internal nodes are sum or max operations: max is used for branches in the call graph, and sums when we go deeper to generate a region configuration.
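The evaluation tree can be sketched as follows. The region-size bounds used as leaves here are made-up polynomials in mc, chosen only to show the sum/max structure over the example's two call chains.

```java
// Sketch of the evaluation tree: leaves are region-size bounds (here,
// hypothetical polynomials in m0's parameter mc), internal nodes sum
// along a call chain and take the max across call-graph branches.
class EvalTree {
    interface Node { long eval(long mc); }
    static Node sum(Node a, Node b) { return mc -> a.eval(mc) + b.eval(mc); }
    static Node max(Node a, Node b) { return mc -> Math.max(a.eval(mc), b.eval(mc)); }

    public static void main(String[] args) {
        Node peak = sum(mc -> mc,                           // region of m0 itself
                        max(sum(mc -> mc, mc -> mc * mc),   // chain m0 -> m1 -> m2
                            mc -> mc));                     // chain m0 -> m2
        System.out.println(peak.eval(3));   // 3 + max(3 + 9, 3) = 15
    }
}
```

Deferring the evaluation until mc is known is exactly the "translate the tree into code" step the next slide describes: the structure is fixed at compile time, only the leaf values depend on the runtime parameters.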
We can try to simplify the tree offline until we get an expression that we cannot, or don't want to, simplify further. This evaluation tree can then be translated into code that is executed when the parameters become available.
For instance, for our example we can compute the following expression for the peak consumption of method m0. If we analyze the peak consumption using a sort of ideal GC, we see that we are over-approximating. Nevertheless, once we assume our region-based memory manager, the obtained expression is quite accurate.
We integrated this technique into a tool that is able to compute total allocations and region sizes, and also to generate region-based code out of conventional Java code by synthesizing the memory regions using escape analysis.
These are the components of the tool performing this task.
In particular, an important part is the one that computes which variables are actually relevant to the number of visits.
The regions are inferred automatically using escape analysis, with a tool that also allows manual editing. Once we have the regions, we automatically generate new code with the same functionality that uses them. In our case the code uses an ad hoc region-based memory manager, but, for example, another student generates code for RTSJ using the same idea.
So, we were able to compute the peak consumption for the application, but maybe we can do better. In particular, the computed expression strongly depends on the strategy used to allocate objects into regions. Since escape analysis is sometimes too coarse when approximating object lifetimes, it can lead to coarser regions. Thus, we provide a mechanism to manually refine the regions. We call this tool JScoper; it allows region refinement and the generation of region-based code out of standard Java code. It can convert a conventional Java program into a region-based one automatically, feeding on region information obtained either by connecting to an escape analysis tool or through manual region editing. The tool is integrated into the Eclipse IDE and also supports debugging region-based applications using a region simulator.
Let me show you a detailed view of the components that implement the technique. The most important ones are those responsible for solving the non-linear maximization problem. For that we use a technique based on the Bernstein transformation which, given a polynomial and a restriction over its parameters, yields a set of candidate polynomials for the maximum that respect the restriction. The technique relies on having a call graph of the application to generate the potential memory region configurations, and on a component providing invariants, which are used as binding invariants when we model the non-linear maximization problem of obtaining the largest region sizes.
It is too conservative for recursive methods, and for recursive data structures whose values affect future memory allocations.
The complexity of the symbolic tools depends on the number of variables, and debugging invariants is hard when looking at global states.
To overcome the aforementioned problems we propose the use of a compositional analysis.
Moreover, it enables the analysis of non-analyzable methods by providing summaries, and even of some fragments, by using summaries as stubs.
We start by analyzing the methods in a bottom-up fashion. We analyze each allocation using only local invariants or some counting mechanism, and generate a summary. When a method invokes another method, we use its summary. Notice that we need a symbolic calculator capable of supporting arithmetic operations on polynomials within an iteration space.
The challenge here is to follow a compositional approach without losing too much precision. This implies we need some means to include certain lifetime information in the summaries. We also need a symbolic calculator to deal with non-linear constraints such as polynomials.
So now we take the example and include not only the amount of live objects required but also how many of them actually escape the method. Then we analyze the invocation of method m2 in the loop of m1: for each iteration we sum only the objects that escape; for the remaining objects we know they can be collected, so we only consider the iteration that consumes the most. A similar idea is applied to m0: the summary of m1 says that no object escapes, so we can assume that after the call to m1 the memory can be recovered and reused for the requirements of m2. So the consumption of m0 is obtained by computing a max between the summaries of m1 and m2. Actually, we add the escaping objects from m2 because we are flow-insensitive.
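The summary algebra described above can be sketched with two numbers per method: a transient peak and the memory that escapes. The field names and the two combinators are ours, a minimal model of the reasoning on the slide, not the tool's actual summary format.

```java
// Sketch of the compositional summary algebra (names are illustrative):
// `peak` is the transient maximum of a call, `esc` the memory escaping it.
class Summary {
    final long peak, esc;
    Summary(long peak, long esc) { this.peak = peak; this.esc = esc; }

    // Sequential composition: b runs after this; escaped memory is still
    // live when b reaches its peak, temporaries were already collected.
    Summary seq(Summary b) {
        return new Summary(Math.max(peak, esc + b.peak), esc + b.esc);
    }
    // A loop of n iterations: escaping memory accumulates across iterations,
    // temporaries are reclaimed between them, so the peak is in the last one.
    Summary loop(long n) {
        return new Summary((n - 1) * esc + peak, n * esc);
    }
    public static void main(String[] args) {
        Summary m2 = new Summary(5, 0);   // m2: transient peak 5, nothing escapes
        Summary m1 = m2.loop(3);          // m1: loop calling m2 three times
        Summary m0 = m1.seq(m2);          // m0: m1, then a direct call to m2
        System.out.println(m0.peak + " " + m0.esc);   // 5 0
    }
}
```

With nothing escaping, m0's peak collapses to a max between the callees' peaks, exactly the observation made for the running example; if m2 escaped memory, the loop term (n - 1) * esc would dominate instead.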
OK, I lied a little bit. In order to be more precise, we need to help the caller understand whether some objects escaping from the callee escape its own scope or not. One way we propose to deal with this is by grouping sets of objects according to their expected lifetime. For instance, we can say that all objects escape, or that one object escapes through this and two through the return value, or even split that subheap to give more detailed information.
In our analysis we treat the escape analysis as an oracle, and a subheap descriptor can actually be any element that can be mapped to it. For example, using Salcianu's points-to graphs, a subheap descriptor can be what they call an inside node, which represents all objects allocated at a program point. Other analyses can use completely different descriptors, like an element of the equivalence classes representing possibly connected objects and references.
Squiggles mean errors: for instance, the method consumes more memory than specified, or an object is declared as non-escaping but we cannot prove it because the analysis says it escapes.
So, how do we check the annotations? Essentially by transforming them into Code Contracts assertions, and by checking the lifetime annotations using a points-to and escape analysis.
The idea is to include a counter for each subheap descriptor that is updated at allocations.
For method invocations we use the information from the summary and add the amount to the subheap indicated in the annotation. The problem is that in order to prove the ensures clause we need non-linear arithmetic, which is beyond what Clousot supports.
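The ghost-counter instrumentation can be pictured as follows. Plain Java asserts stand in for the Code Contracts assertions, and the object sizes, counter names and bound are all made up for illustration.

```java
// Illustration of the ghost-counter instrumentation: each allocation bumps
// the counter of its subheap descriptor, and the declared bound from the
// (hypothetical) annotation is checked as a postcondition.
class GhostCounters {
    static long tmp, esc;   // ghost counters for two subheap descriptors

    static int[] build(int n) {
        int[] result = new int[n];   // escapes via the return value
        esc += 4L * n;               // hypothetical 4 bytes per element
        for (int i = 0; i < n; i++) {
            int[] boxed = new int[]{i};  // temporary, collectable object
            tmp += 16;                   // hypothetical object size
            result[i] = boxed[0];
        }
        // postcondition derived from the declared memory annotations:
        assert esc <= 4L * n && tmp <= 16L * n : "declared bound exceeded";
        return result;
    }
    public static void main(String[] args) {
        build(8);
        System.out.println(tmp + " " + esc);   // 128 32
    }
}
```

Note the postcondition here is linear in n, which is the shape the next slide exploits: bounds that Clousot's linear domains can actually discharge.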
What we do to overcome this is use our inference technique: we only request linear invariants that can be inferred or checked by Clousot.
If we think of Tmp as the complement of escaping, we should say 0, although only one object is escaping. That semantics means Tmp is a lower bound, which is complex to compute using counting. If we instead think of it as an over-approximation, we are saying it is an upper bound.
Regarding polymorphism, in the case of a virtual call we should compute a summary representing all the callees. It can be tricky when the callees have different subheaps, but it is essentially computing a symbolic maximum between polynomials and then applying the symbolic sum to the result. The problem is that the max operation sometimes cannot be resolved completely and cannot be expressed as a polynomial; in those cases it doesn't combine well with the sum operator.
What we do is try to discover whether the receiver actually changes during the loop. If it doesn't change, we can invert the operations: apply the sum for each receiver and then apply the max.
At some point I may have said the analysis is flow-insensitive. Actually, the analysis would be much more precise if we considered the flow of operations, for instance having… The problem is that in some situations it is not easy to determine the order of execution statically.