SlideShare une entreprise Scribd logo
1  sur  76
Télécharger pour lire hors ligne
Abstract Interpretation for Dummies

      Program Analysis without Tears



               Andrew P. Black
      OGI School of Science & Engineering at OHSU
                Portland, Oregon, USA
What is Abstract Interpretation?


•   An interpretation:
        a way of understanding something
    °
•   Abstract:
        omitting some of the details
    °


                                           2
An Example: Portland




                       3
An Example: Portland




                       3
An Example: Portland




                       3
An Example: Portland




                       3
An Example: Portland
•   Each interpretations makes available
    different information
        all have less information than the real city!
    °
        in general, information content is disjoint
    °
        some interpretations dominate others, but
    °
•   Different interpretations have different
    uses
        the map is more useful for driving than the
    °   photograph
        it would be more useful still if it showed
    °   one-way streets!

                                                        4
Examples from Computing


•   “Concrete” artifact is a program with the
    “standard” semantics
        e.g., + is an operation that performs arithmetic
    °
•   “Abstract” artifact is a program with a non-
    standard semantics
        e.g., + is an operation that builds an expression tree, or
    °   computes the type of the answer



                                                                     5
Fraction >> reduced

reduced
   | gcd numer denom |
   numerator = 0 ifTrue: [↑0].
   gcd ← numerator gcd: denominator.
   numer ← numerator // gcd.
   denom ← denominator // gcd.
   denom = 1 ifTrue: [↑numer].
   ↑Fraction numerator: numer denominator: denom


                                              6
Fraction >> reduced
                    1. Interpret inst vars as their types
reduced
   | gcd numer denom |
   numerator = 0 ifTrue: [↑0].
   gcd ← numerator gcd: denominator.
   numer ← numerator // gcd.
   denom ← denominator // gcd.
   denom = 1 ifTrue: [↑numer].
   ↑Fraction numerator: numer denominator: denom


                                                     6
Fraction >> reduced
                    1. Interpret inst vars as their types
reduced
   | gcd numer denom |
   numerator = 0 ifTrue: [↑0].
   <integer>
   gcd ← numerator gcd: denominator.
          <integer>       <integer>
   numer ← numerator // gcd.
             <integer>
   denom ← denominator // gcd.
               <integer>
   denom = 1 ifTrue: [↑numer].
   ↑Fraction numerator: numer denominator: denom


                                                     6
Fraction >> reduced

reduced
   | gcd numer denom |
   numerator = 0 ifTrue: [↑0].
   <integer>
   gcd ← numerator gcd: denominator.
          <integer>       <integer>
   numer ← numerator // gcd.
             <integer>
   denom ← denominator // gcd.
               <integer>
   denom = 1 ifTrue: [↑numer].
   ↑Fraction numerator: numer denominator: denom


                                              6
Fraction >> reduced
                   2. Interpret constants as their types
reduced
   | gcd numer denom |
   numerator = 0 ifTrue: [↑0].
   <integer>
   gcd ← numerator gcd: denominator.
          <integer>       <integer>
   numer ← numerator // gcd.
             <integer>
   denom ← denominator // gcd.
               <integer>
   denom = 1 ifTrue: [↑numer].
   ↑Fraction numerator: numer denominator: denom


                                                   6
Fraction >> reduced
                   2. Interpret constants as their types
reduced
   | gcd numer denom |
   numerator = <integer> [↑0].
   <integer> 0 ifTrue: ifTrue: [↑<integer>]
   gcd ← numerator gcd: denominator.
          <integer>        <integer>
   numer ← numerator // gcd.
             <integer>
   denom ← denominator // gcd.
               <integer>
   denom = 1 ifTrue: [↑numer]. [↑numer]
              <integer> ifTrue:
   ↑Fraction numerator: numer denominator: denom


                                                   6
Fraction >> reduced

reduced
   | gcd numer denom |
   numerator = <integer> [↑0].
   <integer> 0 ifTrue: ifTrue: [↑<integer>]
   gcd ← numerator gcd: denominator.
          <integer>        <integer>
   numer ← numerator // gcd.
             <integer>
   denom ← denominator // gcd.
               <integer>
   denom = 1 ifTrue: [↑numer]. [↑numer]
              <integer> ifTrue:
   ↑Fraction numerator: numer denominator: denom


                                              6
Fraction >> reduced
                       3. Interpret operations …
reduced
   | gcd numer denom |
   numerator = <integer> [↑0].
   <integer> 0 ifTrue: ifTrue: [↑<integer>]
   gcd ← numerator gcd: denominator.
          <integer>        <integer>
   numer ← numerator // gcd.
             <integer>
   denom ← denominator // gcd.
               <integer>
   denom = 1 ifTrue: [↑numer]. [↑numer]
              <integer> ifTrue:
   ↑Fraction numerator: numer denominator: denom


                                                   6
Fraction >> reduced
                          3. Interpret operations …
   reduced
      | gcd numer denom |
      numerator = <integer> [↑0].
      <integer> 0 ifTrue: ifTrue: [↑<integer>]
<integer> ← numerator gcd: denominator.
      gcd    <integer>        <integer>
      numer ← numerator // <integer>
                <integer> gcd.
      denom ← denominator // <integer>
                  <integer>     gcd.
      denom = 1 ifTrue: [↑numer]. [↑numer]
                 <integer> ifTrue:
      ↑Fraction numerator: numer denominator: denom


                                                      6
Fraction >> reduced
                            3. Interpret operations …
   reduced
      | gcd numer denom |
      numerator = <integer> [↑0].
      <integer> 0 ifTrue: ifTrue: [↑<integer>]
<integer> ← numerator gcd: denominator.
      gcd    <integer>        <integer>
  <integer> ← numerator // <integer>
      numer     <integer> gcd.
      denom ← denominator // <integer>
                  <integer>     gcd.
      denom = 1 ifTrue: [↑numer]. [↑numer] ]
                 <integer> ifTrue: <integer>
      ↑Fraction numerator: <integer> denominator: denom
                             numer


                                                        6
Fraction >> reduced
                             3. Interpret operations …
   reduced
      | gcd numer denom |
      numerator = <integer> [↑0].
      <integer> 0 ifTrue: ifTrue: [↑<integer>]
<integer> ← numerator gcd: denominator.
      gcd    <integer>         <integer>
  <integer> ← numerator // <integer>
      numer      <integer> gcd.
   <integer> ← denominator // <integer>
      denom        <integer>     gcd.
   <integer> = 1 ifTrue: [↑numer]. [↑numer] ]
      denom       <integer> ifTrue: <integer>
      ↑Fraction numerator: <integer> denominator: denom
                              numer               <integer>


                                                         6
Fraction >> reduced
                             3. Interpret operations …
   reduced
      | gcd numer denom |
      numerator = <integer> [↑0].
      <integer> 0 ifTrue: ifTrue: [↑<integer>]
<integer> ← numerator gcd: denominator.
      gcd     <integer>        <integer>
  <integer> ← numerator // <integer>
      numer      <integer> gcd.
   <integer> ← denominator // <integer>
      denom        <integer>     gcd.
   <integer> = 1 ifTrue: [↑numer]. [↑numer] ]
      denom       <integer> ifTrue: <integer>
      ↑Fraction numerator: <integer> denominator: denom
        <fraction>            numer               <integer>


                                                         6
Fraction >> reduced

   reduced
      | gcd numer denom |
      numerator = <integer> [↑0].
      <integer> 0 ifTrue: ifTrue: [↑<integer>]
<integer> ← numerator gcd: denominator.
      gcd     <integer>        <integer>
  <integer> ← numerator // <integer>
      numer      <integer> gcd.
   <integer> ← denominator // <integer>
      denom        <integer>     gcd.
   <integer> = 1 ifTrue: [↑numer]. [↑numer] ]
      denom       <integer> ifTrue: <integer>
      ↑Fraction numerator: <integer> denominator: denom
        <fraction>            numer               <integer>


                                                      6
Fraction >> reduced
                                4. Collect answers
   reduced
      | gcd numer denom |
      numerator = <integer> [↑0].
      <integer> 0 ifTrue: ifTrue: [↑<integer>]
<integer> ← numerator gcd: denominator.
      gcd     <integer>        <integer>
  <integer> ← numerator // <integer>
      numer      <integer> gcd.
   <integer> ← denominator // <integer>
      denom        <integer>     gcd.
   <integer> = 1 ifTrue: [↑numer]. [↑numer] ]
      denom       <integer> ifTrue: <integer>
      ↑Fraction numerator: <integer> denominator: denom
        <fraction>            numer               <integer>


                                                      6
Fraction >> reduced
                                4. Collect answers
   reduced
      | gcd numer denom |
      numerator = <integer> [↑0].
      <integer> 0 ifTrue: ifTrue: [↑<integer>]
<integer> ← numerator gcd: denominator.
      gcd     <integer>        <integer>
  <integer> ← numerator // <integer>
      numer      <integer> gcd.
   <integer> ← denominator // <integer>
      denom        <integer>     gcd.
   <integer> = 1 ifTrue: [↑numer]. [↑numer] ]
      denom       <integer> ifTrue: <integer>
      ↑Fraction numerator: <integer> denominator: denom
        <fraction>            numer               <integer>


                                                      6
My application:
Inferring method requirements

•   When a method sends a message m to self, infer
    that its class should have a method on m

•   When a method sends a message n to super, infer
    that its superclass should have a method on n

•   When a method sends a message c to self class,
    infer that its metaclass should have a method on c


                                                      7
Back to ESUG 2003 …




                      8
Back to ESUG 2003 …
• Virtual Categories




                                8
Back to ESUG 2003 …
• Virtual Categories




                                8
Back to ESUG 2003 …
                       categorization of methods by the browser, based
• Virtual Categories   on their characteristics; always up-to-date




                                                             8
Andrew Black

Collecting
information about
self-sends requires
that we parse the
source code of all of
the methods ...


                         9
Andrew Black

Collecting
information about
self-sends requires
that we parse the
source code of all of
the methods ...


                         9
John Brant

• I think you can do
 this by looking at the
 byte code

• We did something
 similar once in the
 refactoring browser


                          10
Squeak Byte Code

•   instructions for a simple stack-based VM

17   <00>   pushRcvr: 0
18   <75>   pushConstant: 0
19   <B6>   send: =
20   <99>   jumpFalse: 23
21   <75>   pushConstant: 0
22   <7C>   returnTop         reduced
23   <00>   pushRcvr: 0
                                 | gcd numer denom |
24   <01>   pushRcvr: 1
25   <E0>   send: gcd:           numerator = 0 ifTrue: [↑0].
26   <68>   popIntoTemp: 0       gcd ← numerator gcd:
…
                                                    denominator.
                                 …
CompiledMethod
•   In Squeak, the class CompiledMethod is a subclass
    of ByteArray.
        It contains both the instructions (bytes) and the literal
    °   table (objects)

        While the on-disk format is standardized, the in-
    °   memory format of the literals depends on the byte
        order of the processor

•   Use CompiledMethod >> literalAt: and
    InstructionStream to look at CompiledMethods

                                                                    12
InstructionStream Hierarchy

InstructionStream
   ContextPart
       BlockContext
       MethodContext
   Decompiler
   InstructionPrinter
   <Your intepreter>




                                  13
InstructionStream Hierarchy
                         interpret the byte-encoded
                         Smalltalk instruction set.
InstructionStream
                        maintain a program counter
   ContextPart
                        (pc) for streaming through
       BlockContext          CompiledMethods.
       MethodContext
   Decompiler
   InstructionPrinter
   <Your intepreter>




                                               13
InstructionStream Hierarchy

InstructionStream
   ContextPart
       BlockContext
       MethodContext
   Decompiler
   InstructionPrinter
   <Your intepreter>




                                  13
InstructionStream Hierarchy

InstructionStream
   ContextPart
                        adds the semantics
       BlockContext
                        for execution, and
       MethodContext      keeps track of
   Decompiler            sending context
   InstructionPrinter
   <Your intepreter>




                                             13
InstructionStream Hierarchy

InstructionStream
   ContextPart
       BlockContext
       MethodContext
   Decompiler
   InstructionPrinter
   <Your intepreter>




                                  13
InstructionStream Hierarchy

InstructionStream
   ContextPart
       BlockContext
       MethodContext      decompile a method into
   Decompiler            Smalltalk text by a three-
                         phase process: postfix byte
   InstructionPrinter
                        code    prefix symbolic code
   <Your intepreter>
                               node tree     text




                                                       13
InstructionStream Hierarchy

InstructionStream
   ContextPart
       BlockContext
       MethodContext
   Decompiler
   InstructionPrinter
   <Your intepreter>




                                  13
InstructionStream Hierarchy

InstructionStream
   ContextPart
       BlockContext
       MethodContext
   Decompiler
   InstructionPrinter
   <Your intepreter>       Print-out a
                            symbolic
                        representation of
                          the byte-codes
                                            13
InstructionStream Hierarchy

InstructionStream
   ContextPart
       BlockContext
       MethodContext
   Decompiler
   InstructionPrinter
   <Your intepreter>




                                  13
InstructionStream Hierarchy

InstructionStream
   ContextPart
       BlockContext
       MethodContext
   Decompiler
   InstructionPrinter
   SendsInfo
   <Your intepreter>




                                  13
InstructionStream Hierarchy

InstructionStream
   ContextPart
                         My concrete subclass of
       BlockContext
                        InstructionStream which
       MethodContext
                           collects information
   Decompiler
                          about self-, super- and
   InstructionPrinter
                                class-sends
   SendsInfo
   <Your intepreter>




                                              13
InstructionStream subclass: #SendsInfo

•   The core class of my abstract interpreter
        or for any similar single method interpreter
    °
•   Example usage:
     si ← SendsInfo on: ArrayedCollection>>#writeOn: .
     si collectSends
         answers a SendsInfo that prints as
        [#(#basicSize)
         #(#writeOn:)
         #(#isWords #isPointers)]

                                                       14
InstructionStream subclass: #SendsInfo

•   The core class of my abstract interpreter
        or for any similar single method interpreter
    °
•   Example usage:
     si ← SendsInfo on: ArrayedCollection>>#writeOn: .
     si collectSends
         answers a SendsInfo that prints as
        [#(#basicSize)
         #(#writeOn:)
         #(#isWords #isPointers)]

                                                       14
SendsInfo >> #collectSends

•   “main loop” of the interpreter

collectSends
    | end |
    end     self method endPC.
    [pc <= end]
         whileTrue:
               [self interpretNextInstructionFor: self]




                                                      15
SendsInfo >> #collectSends
                  instruction interpreter inherited
•   “main loop” of the interpreter
                      from InstructionStream

collectSends
    | end |
    end     self method endPC.
    [pc <= end]
         whileTrue:
               [self interpretNextInstructionFor: self]




                                                      15
SendsInfo >> #collectSends

•   “main loop” of the interpreter

collectSends
    | end |
    end     self method endPC.
    [pc <= end]
         whileTrue:
               [self interpretNextInstructionFor: self]




                                                      15
SendsInfo >> #collectSends

•   “main loop” of the interpreter

collectSends
    | end |
    end     self method endPC.
    [pc <= end]
         whileTrue:
               [self interpretNextInstructionFor: self]


                   this SendsInfo is the argument
                         as well as the target        15
SendsInfo >> #collectSends

•   “main loop” of the interpreter

collectSends
    | end |
    end     self method endPC.
    [pc <= end]
         whileTrue:
               [self interpretNextInstructionFor: self]




                                                      15
InstructionStream >>interpretNextInstructionFor:
interpretNextInstructionFor: client
   "Send to client, a message that specifies the type of the next instruction."

  | byte type offset method |
  method      self method.
  byte    method at: pc.
  type    byte // 16.
  offset     byte  16.
  pc    pc + 1.
  type=0 ifTrue: [↑client pushReceiverVariable: offset].
  type=1 ifTrue: [↑client pushTemporaryVariable: offset].
  type=2 ifTrue: [↑client pushConstant: (method literalAt: offset+1)].
  type=3 ifTrue: [↑client pushConstant: (method literalAt: offset+17)].
  type=4 ifTrue: [↑client pushLiteralVariable: (method literalAt: offset+1)].
  type=5 ifTrue: [↑client pushLiteralVariable: (method literalAt: offset+17)].
  …

                                                                            16
Instruction set Methods

•   To detect self- and class-sends, we have to simulate the
    stack
        But we can abstract over the set of values pushed on stack
    °
        We need distinguish only between self, self class, and other
    °
        stuff

        •   actually, there are also blocks and small integers representing
            the number of arguments to the block


                                                                     17
Sample Instruction Set Methods


pushTemporaryVariable: offset
   self push: #stuff

pushConstant: value
   self push: value

pushReceiver
   self push: #self



                                18
The critical instruction set method
send: selector super: superFlag numArgs: numArgs
   "Simulate the action of bytecodes that send a message with selector…
    The arguments of the message are found in the top numArgs
     locations on the stack and the receiver just below them."
   | stackTop |
   …
   self pop: numArgs.
   stackTop      self pop.
   superFlag
       ifTrue: [superSentSelectors add: selector]
       ifFalse: [stackTop == #self
              ifTrue: [self tallySelfSendsFor: selector].
          stackTop == #class
             ifTrue: [classSentSelectors add: selector]].

                                                                 19
The critical instruction set method
send: selector super: superFlag numArgs: numArgs
   "Simulate the action of bytecodes that send a message with selector…
    The arguments of the message are found in the top numArgs
     locations on the stack and the receiver just below them."
   | stackTop |
   …
   self pop: numArgs.
   stackTop      self pop.
   superFlag
       ifTrue: [superSentSelectors add: selector]
       ifFalse: [stackTop == #self
              ifTrue: [self tallySelfSendsFor: selector].
          stackTop == #class
             ifTrue: [classSentSelectors add: selector]].

                                                                 19
The critical instruction set method
send: selector super: superFlag numArgs: numArgs
   "Simulate the action of bytecodes that send a message with selector…
    The arguments of the message are found in the top numArgs
     locations on the stack and the receiver just below them."
   | stackTop |
   …
   self pop: numArgs.
   stackTop      self pop.
   superFlag
       ifTrue: [superSentSelectors add: selector]
       ifFalse: [stackTop == #self
              ifTrue: [self tallySelfSendsFor: selector].
          stackTop == #class
             ifTrue: [classSentSelectors add: selector]].

                                                                 19
The critical instruction set method
send: selector super: superFlag numArgs: numArgs
   "Simulate the action of bytecodes that send a message with selector…
    The arguments of the message are found in the top numArgs
     locations on the stack and the receiver just below them."
   | stackTop |
   …
   self pop: numArgs.
   stackTop      self pop.
   superFlag
       ifTrue: [superSentSelectors add: selector]
       ifFalse: [stackTop == #self
              ifTrue: [self tallySelfSendsFor: selector].
          stackTop == #class
             ifTrue: [classSentSelectors add: selector]].

                                                                 19
What about loops?

•   Simplifying assumption:
        any send in a loop does not change from a self-send to a
    °   object-send

        •   this is true because we consider temp message to always
            be an object-send, even if temp == self.

        so we never need to go around a loop more than once
    °
        •   we can just ignore backward jumps


                                                                  20
SendsInfo >>jump:

jump: distance
  "Simulate the action of a 'unconditional jump' bytecode
   whose offset is distance."
  distance < 0
     ifTrue: [↑ self].
  distance = 0
     ifTrue: [self error: 'bad compiler!'].
  …



                                                      21
What about forward jumps?
classPool
    "Answer the dictionary of class variables."
    classPool == nil
         ifTrue: [↑Dictionary new]
         ifFalse: [↑classPool]
                                       • instructions like 16 are
                                         “merge points” in the
                                         instruction sequence:

                                          ° execution can reachways.
                                            merge point in two
                                                               a


                                          ° which stack should we use?
                                                                  22
What about forward jumps?
classPool
    "Answer the dictionary of class variables."
    classPool == nil
         ifTrue: [↑Dictionary new]
         ifFalse: [↑classPool]
                                       • instructions like 16 are
 9   <0D>   pushRcvr: 13                 “merge points” in the
10   <73>   pushConstant: nil
11   <C6>   send: ==
                                         instruction sequence:
12   <9A>   jumpFalse: 16
13   <40>   pushLit: Dictionary           ° execution can reachways.
                                            merge point in two
                                                               a
14   <CC>   send: new
15   <7C>   returnTop
16   <0D>   pushRcvr: 13                  ° which stack should we use?
17   <7C>   returnTop
                                                                  22
What about forward jumps?
classPool
    "Answer the dictionary of class variables."
    classPool == nil
         ifTrue: [↑Dictionary new]
         ifFalse: [↑classPool]
                                       • instructions like 16 are
 9   <0D>   pushRcvr: 13                 “merge points” in the
10   <73>   pushConstant: nil
11   <C6>   send: ==
                                         instruction sequence:
12   <9A>   jumpFalse: 16
13   <40>   pushLit: Dictionary           ° execution can reachways.
                                            merge point in two
                                                               a
14   <CC>   send: new
15   <7C>   returnTop
16   <0D>   pushRcvr: 13                  ° which stack should we use?
17   <7C>   returnTop
                                                                  22
What about forward jumps?
classPool
    "Answer the dictionary of class variables."
    classPool == nil
         ifTrue: [↑Dictionary new]
         ifFalse: [↑classPool]
                                       • instructions like 16 are
 9   <0D>   pushRcvr: 13                 “merge points” in the
10   <73>   pushConstant: nil
11   <C6>   send: ==
                                         instruction sequence:
12   <9A>   jumpFalse: 16
13   <40>   pushLit: Dictionary           ° execution can reachways.
                                            merge point in two
                                                               a
14   <CC>   send: new
15   <7C>   returnTop
16   <0D>   pushRcvr: 13                  ° which stack should we use?
17   <7C>   returnTop
                                                                  22
Merging stacks
•   Every conditional forward jump marks its
    destination address as a merge point
        we save the state of the stack at the jump point in a
    °   dictionary keyed by the merge point

                 jump: distance if: aBooleanConstant
                   "Simulate the action of a 'conditional jump' bytecode …"
                   | destination |
                   distance < 0 ifTrue:[↑ self].
                   distance = 0 ifTrue:[self error: 'bad compiler!'].
                   destination ← self pc + distance.
                   self pop.       "remove the condition from the stack."
                   savedStacks at: destination put: stack copy.
                                                                       23
Merging stack (2)
•   We have to check for the possibility of a merge
    point at every instruction!
    °   We do this by overriding InstructionStream >>
        interpretNextInstructionFor:




                                                        24
Merging stack (2)
•   We have to check for the possibility of a merge
    point at every instruction!
    °   We do this by overriding InstructionStream >>
        interpretNextInstructionFor:


             SendsInfo>>interpretNextInstructionFor: client
               self atMergePoint
                  ifTrue: [self mergeStacks].
               super interpretNextInstructionFor: client



                                                              24
Merging stack (2)
•   We have to check for the possibility of a merge
    point at every instruction!
    °   We do this by overriding InstructionStream >>
        interpretNextInstructionFor:




                                                        24
Merging stack (2)
•   We have to check for the possibility of a merge
    point at every instruction!
    °   We do this by overriding InstructionStream >>
        interpretNextInstructionFor:

• Merging stacks is conservative:
        If element i in either stack is #self, then element i in
    °   the merged stack is also #self.




                                                                   24
Dealing with Blocks
•   The code for blocks is compiled in-line, and
    copied onto the stack at execution time
9 <10> pushTemp: 0         removeAll: aCollection
10 <89> pushThisContext:
11 <76> pushConstant: 1      aCollection do: [:each | self remove: each].
12 <C8> send: blockCopy:     ↑ aCollection
13 <A4 05> jumpTo: 20
15 <69> popIntoTemp: 1
16 <70> self
17 <11> pushTemp: 1
18 <E0> send: remove:
19 <7D> blockReturn
20 <CB> send: do:
21 <87> pop
22 <10> pushTemp: 0
23 <7C> returnTop                                                  25
Dealing with Blocks
•   The code for blocks is compiled in-line, and
    copied onto the stack at execution time
9 <10> pushTemp: 0         removeAll: aCollection
10 <89> pushThisContext:
11 <76> pushConstant: 1      aCollection do: [:each | self remove: each].
12 <C8> send: blockCopy:     ↑ aCollection
13 <A4 05> jumpTo: 20         Number of
15 <69> popIntoTemp: 1       arguments of
16 <70> self
17 <11> pushTemp: 1
                               the block
18 <E0> send: remove:
19 <7D> blockReturn
20 <CB> send: do:
21 <87> pop
22 <10> pushTemp: 0
23 <7C> returnTop                                                  25
Dealing with Blocks
•   The code for blocks is compiled in-line, and
    copied onto the stack at execution time
9 <10> pushTemp: 0         removeAll: aCollection
10 <89> pushThisContext:
11 <76> pushConstant: 1      aCollection do: [:each | self remove: each].
12 <C8> send: blockCopy:     ↑ aCollection
13 <A4 05> jumpTo: 20
15 <69> popIntoTemp: 1
16 <70> self
17 <11> pushTemp: 1
18 <E0> send: remove:
19 <7D> blockReturn
20 <CB> send: do:
21 <87> pop
22 <10> pushTemp: 0
23 <7C> returnTop                                                  25
Dealing with Blocks
•   The code for blocks is compiled in-line, and
    copied onto the stack at execution time
9 <10> pushTemp: 0
10 <89> pushThisContext:
11 <76> pushConstant: 1
12 <C8> send: blockCopy:
13 <A4 05> jumpTo: 20
15 <69> popIntoTemp: 1
16 <70> self
17 <11> pushTemp: 1
18 <E0> send: remove:
19 <7D> blockReturn
20 <CB> send: do:
21 <87> pop
22 <10> pushTemp: 0
23 <7C> returnTop                                  25
Dealing with Blocks
•   The code for blocks is compiled in-line, and
    copied onto the stack at execution time
9 <10> pushTemp: 0         1. Sending a block copy
10 <89> pushThisContext:
11 <76> pushConstant: 1       remembers the number of
12 <C8> send: blockCopy:      arguments
13 <A4 05> jumpTo: 20
15 <69> popIntoTemp: 1     2. When we drop into the
16 <70> self
17 <11> pushTemp: 1
                              block, empty stack and push
18 <E0> send: remove:         some dummy arguments
19 <7D> blockReturn
20 <CB> send: do:
                           3. No merge of stacks is
21 <87> pop                   needed at 20 (or after any
22 <10> pushTemp: 0
23 <7C> returnTop
                              unconditional jump
                                                            25
Performance
•   Initially: 27 seconds to analyze every method
    (~45 000) in the image (600 µs per method).
•   Where did the time go?
        Dictionaries: SendsInfo >> #atMergePoint is usually false
    °
        OrderedCollection: stack manipulations are slow
    °
        InstructionStream>>interpretNextInstructionFor:
    °
        uses linear dispatch

•   Fix these things ... 260 µs per method
                                                               26
Conclusions
•   AI was easy to implement
        spare time at last two days of ESUG 2003 and flight home
    °
        Thanks to John Brant and Vassili Bykov
    °
•   AI implementation was more accurate than
    implementation based on parsing source
        Morph>>layoutInBounds: contains the statement:
    °
          (owner ifNil:[self]) cellPositioning
        which AI detects as self-send
•   AI is fast
        Bytescodes are designed for fast interpretation
    °
                                                             27

Contenu connexe

Plus de ESUG

Workshop: Identifying concept inventories in agile programming
Workshop: Identifying concept inventories in agile programmingWorkshop: Identifying concept inventories in agile programming
Workshop: Identifying concept inventories in agile programmingESUG
 
Technical documentation support in Pharo
Technical documentation support in PharoTechnical documentation support in Pharo
Technical documentation support in PharoESUG
 
The Pharo Debugger and Debugging tools: Advances and Roadmap
The Pharo Debugger and Debugging tools: Advances and RoadmapThe Pharo Debugger and Debugging tools: Advances and Roadmap
The Pharo Debugger and Debugging tools: Advances and RoadmapESUG
 
Sequence: Pipeline modelling in Pharo
Sequence: Pipeline modelling in PharoSequence: Pipeline modelling in Pharo
Sequence: Pipeline modelling in PharoESUG
 
Migration process from monolithic to micro frontend architecture in mobile ap...
Migration process from monolithic to micro frontend architecture in mobile ap...Migration process from monolithic to micro frontend architecture in mobile ap...
Migration process from monolithic to micro frontend architecture in mobile ap...ESUG
 
Analyzing Dart Language with Pharo: Report and early results
Analyzing Dart Language with Pharo: Report and early resultsAnalyzing Dart Language with Pharo: Report and early results
Analyzing Dart Language with Pharo: Report and early resultsESUG
 
Transpiling Pharo Classes to JS ECMAScript 5 versus ECMAScript 6
Transpiling Pharo Classes to JS ECMAScript 5 versus ECMAScript 6Transpiling Pharo Classes to JS ECMAScript 5 versus ECMAScript 6
Transpiling Pharo Classes to JS ECMAScript 5 versus ECMAScript 6ESUG
 
A Unit Test Metamodel for Test Generation
A Unit Test Metamodel for Test GenerationA Unit Test Metamodel for Test Generation
A Unit Test Metamodel for Test GenerationESUG
 
Creating Unit Tests Using Genetic Programming
Creating Unit Tests Using Genetic ProgrammingCreating Unit Tests Using Genetic Programming
Creating Unit Tests Using Genetic ProgrammingESUG
 
Threaded-Execution and CPS Provide Smooth Switching Between Execution Modes
Threaded-Execution and CPS Provide Smooth Switching Between Execution ModesThreaded-Execution and CPS Provide Smooth Switching Between Execution Modes
Threaded-Execution and CPS Provide Smooth Switching Between Execution ModesESUG
 
Exploring GitHub Actions through EGAD: An Experience Report
Exploring GitHub Actions through EGAD: An Experience ReportExploring GitHub Actions through EGAD: An Experience Report
Exploring GitHub Actions through EGAD: An Experience ReportESUG
 
Pharo: a reflective language A first systematic analysis of reflective APIs
Pharo: a reflective language A first systematic analysis of reflective APIsPharo: a reflective language A first systematic analysis of reflective APIs
Pharo: a reflective language A first systematic analysis of reflective APIsESUG
 
Garbage Collector Tuning
Garbage Collector TuningGarbage Collector Tuning
Garbage Collector TuningESUG
 
Improving Performance Through Object Lifetime Profiling: the DataFrame Case
Improving Performance Through Object Lifetime Profiling: the DataFrame CaseImproving Performance Through Object Lifetime Profiling: the DataFrame Case
Improving Performance Through Object Lifetime Profiling: the DataFrame CaseESUG
 
Pharo DataFrame: Past, Present, and Future
Pharo DataFrame: Past, Present, and FuturePharo DataFrame: Past, Present, and Future
Pharo DataFrame: Past, Present, and FutureESUG
 
thisContext in the Debugger
thisContext in the DebuggerthisContext in the Debugger
thisContext in the DebuggerESUG
 
Websockets for Fencing Score
Websockets for Fencing ScoreWebsockets for Fencing Score
Websockets for Fencing ScoreESUG
 
ShowUs: PharoJS.org Develop in Pharo, Run on JavaScript
ShowUs: PharoJS.org Develop in Pharo, Run on JavaScriptShowUs: PharoJS.org Develop in Pharo, Run on JavaScript
ShowUs: PharoJS.org Develop in Pharo, Run on JavaScriptESUG
 
Advanced Object- Oriented Design Mooc
Advanced Object- Oriented Design MoocAdvanced Object- Oriented Design Mooc
Advanced Object- Oriented Design MoocESUG
 
A New Architecture Reconciling Refactorings and Transformations
A New Architecture Reconciling Refactorings and TransformationsA New Architecture Reconciling Refactorings and Transformations
A New Architecture Reconciling Refactorings and TransformationsESUG
 

Plus de ESUG (20)

Workshop: Identifying concept inventories in agile programming
Workshop: Identifying concept inventories in agile programmingWorkshop: Identifying concept inventories in agile programming
Workshop: Identifying concept inventories in agile programming
 
Technical documentation support in Pharo
Technical documentation support in PharoTechnical documentation support in Pharo
Technical documentation support in Pharo
 
The Pharo Debugger and Debugging tools: Advances and Roadmap
The Pharo Debugger and Debugging tools: Advances and RoadmapThe Pharo Debugger and Debugging tools: Advances and Roadmap
The Pharo Debugger and Debugging tools: Advances and Roadmap
 
Sequence: Pipeline modelling in Pharo
Sequence: Pipeline modelling in PharoSequence: Pipeline modelling in Pharo
Sequence: Pipeline modelling in Pharo
 
Migration process from monolithic to micro frontend architecture in mobile ap...
Migration process from monolithic to micro frontend architecture in mobile ap...Migration process from monolithic to micro frontend architecture in mobile ap...
Migration process from monolithic to micro frontend architecture in mobile ap...
 
Analyzing Dart Language with Pharo: Report and early results
Analyzing Dart Language with Pharo: Report and early resultsAnalyzing Dart Language with Pharo: Report and early results
Analyzing Dart Language with Pharo: Report and early results
 
Transpiling Pharo Classes to JS ECMAScript 5 versus ECMAScript 6
Transpiling Pharo Classes to JS ECMAScript 5 versus ECMAScript 6Transpiling Pharo Classes to JS ECMAScript 5 versus ECMAScript 6
Transpiling Pharo Classes to JS ECMAScript 5 versus ECMAScript 6
 
A Unit Test Metamodel for Test Generation
A Unit Test Metamodel for Test GenerationA Unit Test Metamodel for Test Generation
A Unit Test Metamodel for Test Generation
 
Creating Unit Tests Using Genetic Programming
Creating Unit Tests Using Genetic ProgrammingCreating Unit Tests Using Genetic Programming
Creating Unit Tests Using Genetic Programming
 
Threaded-Execution and CPS Provide Smooth Switching Between Execution Modes
Threaded-Execution and CPS Provide Smooth Switching Between Execution ModesThreaded-Execution and CPS Provide Smooth Switching Between Execution Modes
Threaded-Execution and CPS Provide Smooth Switching Between Execution Modes
 
Exploring GitHub Actions through EGAD: An Experience Report
Exploring GitHub Actions through EGAD: An Experience ReportExploring GitHub Actions through EGAD: An Experience Report
Exploring GitHub Actions through EGAD: An Experience Report
 
Pharo: a reflective language A first systematic analysis of reflective APIs
Pharo: a reflective language A first systematic analysis of reflective APIsPharo: a reflective language A first systematic analysis of reflective APIs
Pharo: a reflective language A first systematic analysis of reflective APIs
 
Garbage Collector Tuning
Garbage Collector TuningGarbage Collector Tuning
Garbage Collector Tuning
 
Improving Performance Through Object Lifetime Profiling: the DataFrame Case
Improving Performance Through Object Lifetime Profiling: the DataFrame CaseImproving Performance Through Object Lifetime Profiling: the DataFrame Case
Improving Performance Through Object Lifetime Profiling: the DataFrame Case
 
Pharo DataFrame: Past, Present, and Future
Pharo DataFrame: Past, Present, and FuturePharo DataFrame: Past, Present, and Future
Pharo DataFrame: Past, Present, and Future
 
thisContext in the Debugger
thisContext in the DebuggerthisContext in the Debugger
thisContext in the Debugger
 
Websockets for Fencing Score
Websockets for Fencing ScoreWebsockets for Fencing Score
Websockets for Fencing Score
 
ShowUs: PharoJS.org Develop in Pharo, Run on JavaScript
ShowUs: PharoJS.org Develop in Pharo, Run on JavaScriptShowUs: PharoJS.org Develop in Pharo, Run on JavaScript
ShowUs: PharoJS.org Develop in Pharo, Run on JavaScript
 
Advanced Object- Oriented Design Mooc
Advanced Object- Oriented Design MoocAdvanced Object- Oriented Design Mooc
Advanced Object- Oriented Design Mooc
 
A New Architecture Reconciling Refactorings and Transformations
A New Architecture Reconciling Refactorings and TransformationsA New Architecture Reconciling Refactorings and Transformations
A New Architecture Reconciling Refactorings and Transformations
 

Dernier

Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 

Dernier (20)

Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 

Abstract Interpretation for Program Analysis

  • 1. Abstract Interpretation for Dummies Program Analysis without Tears Andrew P. Black OGI School of Science & Engineering at OHSU Portland, Oregon, USA
  • 2. What is Abstract Interpretation? • An interpretation: a way of understanding something ° • Abstract: omitting some of the details ° 2
  • 7. An Example: Portland • Each interpretations makes available different information all have less information than the real city! ° in general, information content is disjoint ° some interpretations dominate others, but ° • Different interpretations have different uses the map is more useful for driving than the ° photograph it would be more useful still if it showed ° one-way streets! 4
  • 8. Examples from Computing • “Concrete” artifact is a program with the “standard” semantics e.g., + is an operation that performs arithmetic ° • “Abstract” artifact is a program with a non- standard semantics e.g., + is an operation that builds an expression tree, or ° computes the type of the answer 5
  • 9. Fraction >> reduced reduced | gcd numer denom | numerator = 0 ifTrue: [↑0]. gcd ← numerator gcd: denominator. numer ← numerator // gcd. denom ← denominator // gcd. denom = 1 ifTrue: [↑numer]. ↑Fraction numerator: numer denominator: denom 6
  • 10. Fraction >> reduced 1. Interpret inst vars as their types reduced | gcd numer denom | numerator = 0 ifTrue: [↑0]. gcd ← numerator gcd: denominator. numer ← numerator // gcd. denom ← denominator // gcd. denom = 1 ifTrue: [↑numer]. ↑Fraction numerator: numer denominator: denom 6
  • 11. Fraction >> reduced 1. Interpret inst vars as their types reduced | gcd numer denom | numerator = 0 ifTrue: [↑0]. <integer> gcd ← numerator gcd: denominator. <integer> <integer> numer ← numerator // gcd. <integer> denom ← denominator // gcd. <integer> denom = 1 ifTrue: [↑numer]. ↑Fraction numerator: numer denominator: denom 6
  • 12. Fraction >> reduced reduced | gcd numer denom | numerator = 0 ifTrue: [↑0]. <integer> gcd ← numerator gcd: denominator. <integer> <integer> numer ← numerator // gcd. <integer> denom ← denominator // gcd. <integer> denom = 1 ifTrue: [↑numer]. ↑Fraction numerator: numer denominator: denom 6
  • 13. Fraction >> reduced 2. Interpret constants as their types reduced | gcd numer denom | numerator = 0 ifTrue: [↑0]. <integer> gcd ← numerator gcd: denominator. <integer> <integer> numer ← numerator // gcd. <integer> denom ← denominator // gcd. <integer> denom = 1 ifTrue: [↑numer]. ↑Fraction numerator: numer denominator: denom 6
  • 14. Fraction >> reduced 2. Interpret constants as their types reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] gcd ← numerator gcd: denominator. <integer> <integer> numer ← numerator // gcd. <integer> denom ← denominator // gcd. <integer> denom = 1 ifTrue: [↑numer]. [↑numer] <integer> ifTrue: ↑Fraction numerator: numer denominator: denom 6
  • 15. Fraction >> reduced reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] gcd ← numerator gcd: denominator. <integer> <integer> numer ← numerator // gcd. <integer> denom ← denominator // gcd. <integer> denom = 1 ifTrue: [↑numer]. [↑numer] <integer> ifTrue: ↑Fraction numerator: numer denominator: denom 6
  • 16. Fraction >> reduced 3. Interpret operations … reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] gcd ← numerator gcd: denominator. <integer> <integer> numer ← numerator // gcd. <integer> denom ← denominator // gcd. <integer> denom = 1 ifTrue: [↑numer]. [↑numer] <integer> ifTrue: ↑Fraction numerator: numer denominator: denom 6
  • 17. Fraction >> reduced 3. Interpret operations … reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] <integer> ← numerator gcd: denominator. gcd <integer> <integer> numer ← numerator // <integer> <integer> gcd. denom ← denominator // <integer> <integer> gcd. denom = 1 ifTrue: [↑numer]. [↑numer] <integer> ifTrue: ↑Fraction numerator: numer denominator: denom 6
  • 18. Fraction >> reduced 3. Interpret operations … reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] <integer> ← numerator gcd: denominator. gcd <integer> <integer> <integer> ← numerator // <integer> numer <integer> gcd. denom ← denominator // <integer> <integer> gcd. denom = 1 ifTrue: [↑numer]. [↑numer] ] <integer> ifTrue: <integer> ↑Fraction numerator: <integer> denominator: denom numer 6
  • 19. Fraction >> reduced 3. Interpret operations … reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] <integer> ← numerator gcd: denominator. gcd <integer> <integer> <integer> ← numerator // <integer> numer <integer> gcd. <integer> ← denominator // <integer> denom <integer> gcd. <integer> = 1 ifTrue: [↑numer]. [↑numer] ] denom <integer> ifTrue: <integer> ↑Fraction numerator: <integer> denominator: denom numer <integer> 6
  • 20. Fraction >> reduced 3. Interpret operations … reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] <integer> ← numerator gcd: denominator. gcd <integer> <integer> <integer> ← numerator // <integer> numer <integer> gcd. <integer> ← denominator // <integer> denom <integer> gcd. <integer> = 1 ifTrue: [↑numer]. [↑numer] ] denom <integer> ifTrue: <integer> ↑Fraction numerator: <integer> denominator: denom <fraction> numer <integer> 6
  • 21. Fraction >> reduced reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] <integer> ← numerator gcd: denominator. gcd <integer> <integer> <integer> ← numerator // <integer> numer <integer> gcd. <integer> ← denominator // <integer> denom <integer> gcd. <integer> = 1 ifTrue: [↑numer]. [↑numer] ] denom <integer> ifTrue: <integer> ↑Fraction numerator: <integer> denominator: denom <fraction> numer <integer> 6
  • 22. Fraction >> reduced 4. Collect answers reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] <integer> ← numerator gcd: denominator. gcd <integer> <integer> <integer> ← numerator // <integer> numer <integer> gcd. <integer> ← denominator // <integer> denom <integer> gcd. <integer> = 1 ifTrue: [↑numer]. [↑numer] ] denom <integer> ifTrue: <integer> ↑Fraction numerator: <integer> denominator: denom <fraction> numer <integer> 6
  • 23. Fraction >> reduced 4. Collect answers reduced | gcd numer denom | numerator = <integer> [↑0]. <integer> 0 ifTrue: ifTrue: [↑<integer>] <integer> ← numerator gcd: denominator. gcd <integer> <integer> <integer> ← numerator // <integer> numer <integer> gcd. <integer> ← denominator // <integer> denom <integer> gcd. <integer> = 1 ifTrue: [↑numer]. [↑numer] ] denom <integer> ifTrue: <integer> ↑Fraction numerator: <integer> denominator: denom <fraction> numer <integer> 6
  • 24. My application: Inferring method requirements • When a method sends a message m to self, infer that its class should have a method on m • When a method sends a message n to super, infer that its superclass should have a method on n • When a method sends a message c to self class, infer that its metaclass should have a method on c 7
  • 25. Back to ESUG 2003 … 8
  • 26. Back to ESUG 2003 … • Virtual Categories 8
  • 27. Back to ESUG 2003 … • Virtual Categories 8
  • 28. Back to ESUG 2003 … categorization of methods by the browser, based • Virtual Categories on their characteristics; always up-to-date 8
  • 29. Andrew Black Collecting information about self-sends requires that we parse the source code of all of the methods ... 9
  • 30. Andrew Black Collecting information about self-sends requires that we parse the source code of all of the methods ... 9
  • 31. John Brant • I think you can do this by looking at the byte code • We did something similar once in the refactoring browser 10
  • 32. Squeak Byte Code • instructions for a simple stack-based VM 17 <00> pushRcvr: 0 18 <75> pushConstant: 0 19 <B6> send: = 20 <99> jumpFalse: 23 21 <75> pushConstant: 0 22 <7C> returnTop reduced 23 <00> pushRcvr: 0 | gcd numer denom | 24 <01> pushRcvr: 1 25 <E0> send: gcd: numerator = 0 ifTrue: [↑0]. 26 <68> popIntoTemp: 0 gcd ← numerator gcd: … denominator. …
  • 33. CompiledMethod • In Squeak, the class CompiledMethod is a subclass of ByteArray. It contains both the instructions (bytes) and the literal ° table (objects) While the on-disk format is standardized, the in- ° memory format of the literals depends on the byte order of the processor • Use CompiledMethod >> literalAt: and InstructionStream to look at CompiledMethods 12
  • 34. InstructionStream Hierarchy InstructionStream ContextPart BlockContext MethodContext Decompiler InstructionPrinter <Your intepreter> 13
  • 35. InstructionStream Hierarchy interpret the byte-encoded Smalltalk instruction set. InstructionStream maintain a program counter ContextPart (pc) for streaming through BlockContext CompiledMethods. MethodContext Decompiler InstructionPrinter <Your intepreter> 13
  • 36. InstructionStream Hierarchy InstructionStream ContextPart BlockContext MethodContext Decompiler InstructionPrinter <Your intepreter> 13
  • 37. InstructionStream Hierarchy InstructionStream ContextPart adds the semantics BlockContext for execution, and MethodContext keeps track of Decompiler sending context InstructionPrinter <Your intepreter> 13
  • 38. InstructionStream Hierarchy InstructionStream ContextPart BlockContext MethodContext Decompiler InstructionPrinter <Your intepreter> 13
  • 39. InstructionStream Hierarchy InstructionStream ContextPart BlockContext MethodContext decompile a method into Decompiler Smalltalk text by a three- phase process: postfix byte InstructionPrinter code prefix symbolic code <Your intepreter> node tree text 13
  • 40. InstructionStream Hierarchy InstructionStream ContextPart BlockContext MethodContext Decompiler InstructionPrinter <Your intepreter> 13
  • 41. InstructionStream Hierarchy InstructionStream ContextPart BlockContext MethodContext Decompiler InstructionPrinter <Your intepreter> Print-out a symbolic representation of the byte-codes 13
  • 42. InstructionStream Hierarchy InstructionStream ContextPart BlockContext MethodContext Decompiler InstructionPrinter <Your intepreter> 13
  • 43. InstructionStream Hierarchy InstructionStream ContextPart BlockContext MethodContext Decompiler InstructionPrinter SendsInfo <Your intepreter> 13
  • 44. InstructionStream Hierarchy InstructionStream ContextPart My concrete subclass of BlockContext InstructionStream which MethodContext collects information Decompiler about self-, super- and InstructionPrinter class-sends SendsInfo <Your intepreter> 13
  • 45. InstructionStream subclass: #SendsInfo • The core class of my abstract interpreter or for any similar single method interpreter ° • Example usage: si ← SendsInfo on: ArrayedCollection>>#writeOn: . si collectSends answers a SendsInfo that prints as [#(#basicSize) #(#writeOn:) #(#isWords #isPointers)] 14
  • 46. InstructionStream subclass: #SendsInfo • The core class of my abstract interpreter or for any similar single method interpreter ° • Example usage: si ← SendsInfo on: ArrayedCollection>>#writeOn: . si collectSends answers a SendsInfo that prints as [#(#basicSize) #(#writeOn:) #(#isWords #isPointers)] 14
  • 47. SendsInfo >> #collectSends • “main loop” of the interpreter collectSends | end | end self method endPC. [pc <= end] whileTrue: [self interpretNextInstructionFor: self] 15
  • 48. SendsInfo >> #collectSends instruction interpreter inherited • “main loop” of the interpreter from InstructionStream collectSends | end | end self method endPC. [pc <= end] whileTrue: [self interpretNextInstructionFor: self] 15
  • 49. SendsInfo >> #collectSends • “main loop” of the interpreter collectSends | end | end self method endPC. [pc <= end] whileTrue: [self interpretNextInstructionFor: self] 15
  • 50. SendsInfo >> #collectSends • “main loop” of the interpreter collectSends | end | end self method endPC. [pc <= end] whileTrue: [self interpretNextInstructionFor: self] this SendsInfo is the argument as well as the target 15
  • 51. SendsInfo >> #collectSends • “main loop” of the interpreter collectSends | end | end self method endPC. [pc <= end] whileTrue: [self interpretNextInstructionFor: self] 15
  • 52. InstructionStream >>interpretNextInstructionFor: interpretNextInstructionFor: client "Send to client, a message that specifies the type of the next instruction." | byte type offset method | method self method. byte method at: pc. type byte // 16. offset byte 16. pc pc + 1. type=0 ifTrue: [↑client pushReceiverVariable: offset]. type=1 ifTrue: [↑client pushTemporaryVariable: offset]. type=2 ifTrue: [↑client pushConstant: (method literalAt: offset+1)]. type=3 ifTrue: [↑client pushConstant: (method literalAt: offset+17)]. type=4 ifTrue: [↑client pushLiteralVariable: (method literalAt: offset+1)]. type=5 ifTrue: [↑client pushLiteralVariable: (method literalAt: offset+17)]. … 16
  • 53. Instruction set Methods • To detect self- and class-sends, we have to simulate the stack But we can abstract over the set of values pushed on stack ° We need distinguish only between self, self class, and other ° stuff • actually, there are also blocks and small integers representing the number of arguments to the block 17
  • 54. Sample Instruction Set Methods pushTemporaryVariable: offset self push: #stuff pushConstant: value self push: value pushReceiver self push: #self 18
  • 55. The critical instruction set method send: selector super: superFlag numArgs: numArgs "Simulate the action of bytecodes that send a message with selector… The arguments of the message are found in the top numArgs locations on the stack and the receiver just below them." | stackTop | … self pop: numArgs. stackTop self pop. superFlag ifTrue: [superSentSelectors add: selector] ifFalse: [stackTop == #self ifTrue: [self tallySelfSendsFor: selector]. stackTop == #class ifTrue: [classSentSelectors add: selector]]. 19
  • 56. The critical instruction set method send: selector super: superFlag numArgs: numArgs "Simulate the action of bytecodes that send a message with selector… The arguments of the message are found in the top numArgs locations on the stack and the receiver just below them." | stackTop | … self pop: numArgs. stackTop self pop. superFlag ifTrue: [superSentSelectors add: selector] ifFalse: [stackTop == #self ifTrue: [self tallySelfSendsFor: selector]. stackTop == #class ifTrue: [classSentSelectors add: selector]]. 19
  • 57. The critical instruction set method send: selector super: superFlag numArgs: numArgs "Simulate the action of bytecodes that send a message with selector… The arguments of the message are found in the top numArgs locations on the stack and the receiver just below them." | stackTop | … self pop: numArgs. stackTop self pop. superFlag ifTrue: [superSentSelectors add: selector] ifFalse: [stackTop == #self ifTrue: [self tallySelfSendsFor: selector]. stackTop == #class ifTrue: [classSentSelectors add: selector]]. 19
  • 58. The critical instruction set method send: selector super: superFlag numArgs: numArgs "Simulate the action of bytecodes that send a message with selector… The arguments of the message are found in the top numArgs locations on the stack and the receiver just below them." | stackTop | … self pop: numArgs. stackTop self pop. superFlag ifTrue: [superSentSelectors add: selector] ifFalse: [stackTop == #self ifTrue: [self tallySelfSendsFor: selector]. stackTop == #class ifTrue: [classSentSelectors add: selector]]. 19
  • 59. What about loops? • Simplifying assumption: any send in a loop does not change from a self-send to a ° object-send • this is true because we consider temp message to always be an object-send, even if temp == self. so we never need to go around a loop more than once ° • we can just ignore backward jumps 20
  • 60. SendsInfo >>jump: jump: distance "Simulate the action of a 'unconditional jump' bytecode whose offset is distance." distance < 0 ifTrue: [↑ self]. distance = 0 ifTrue: [self error: 'bad compiler!']. … 21
  • 61. What about forward jumps? classPool "Answer the dictionary of class variables." classPool == nil ifTrue: [↑Dictionary new] ifFalse: [↑classPool] • instructions like 16 are “merge points” in the instruction sequence: ° execution can reachways. merge point in two a ° which stack should we use? 22
  • 62. What about forward jumps? classPool "Answer the dictionary of class variables." classPool == nil ifTrue: [↑Dictionary new] ifFalse: [↑classPool] • instructions like 16 are 9 <0D> pushRcvr: 13 “merge points” in the 10 <73> pushConstant: nil 11 <C6> send: == instruction sequence: 12 <9A> jumpFalse: 16 13 <40> pushLit: Dictionary ° execution can reachways. merge point in two a 14 <CC> send: new 15 <7C> returnTop 16 <0D> pushRcvr: 13 ° which stack should we use? 17 <7C> returnTop 22
  • 63. What about forward jumps? classPool "Answer the dictionary of class variables." classPool == nil ifTrue: [↑Dictionary new] ifFalse: [↑classPool] • instructions like 16 are 9 <0D> pushRcvr: 13 “merge points” in the 10 <73> pushConstant: nil 11 <C6> send: == instruction sequence: 12 <9A> jumpFalse: 16 13 <40> pushLit: Dictionary ° execution can reachways. merge point in two a 14 <CC> send: new 15 <7C> returnTop 16 <0D> pushRcvr: 13 ° which stack should we use? 17 <7C> returnTop 22
  • 64. What about forward jumps? classPool "Answer the dictionary of class variables." classPool == nil ifTrue: [↑Dictionary new] ifFalse: [↑classPool] • instructions like 16 are 9 <0D> pushRcvr: 13 “merge points” in the 10 <73> pushConstant: nil 11 <C6> send: == instruction sequence: 12 <9A> jumpFalse: 16 13 <40> pushLit: Dictionary ° execution can reachways. merge point in two a 14 <CC> send: new 15 <7C> returnTop 16 <0D> pushRcvr: 13 ° which stack should we use? 17 <7C> returnTop 22
  • 65. Merging stacks • Every conditional forward jump marks its destination address as a merge point we save the state of the stack at the jump point in a ° dictionary keyed by the merge point jump: distance if: aBooleanConstant "Simulate the action of a 'conditional jump' bytecode …" | destination | distance < 0 ifTrue:[↑ self]. distance = 0 ifTrue:[self error: 'bad compiler!']. destination ← self pc + distance. self pop. "remove the condition from the stack." savedStacks at: destination put: stack copy. 23
  • 66. Merging stack (2) • We have to check for the possibility of a merge point at every instruction! ° We do this by overriding InstructionStream >> interpretNextInstructionFor: 24
  • 67. Merging stack (2) • We have to check for the possibility of a merge point at every instruction! ° We do this by overriding InstructionStream >> interpretNextInstructionFor: SendsInfo>>interpretNextInstructionFor: client self atMergePoint ifTrue: [self mergeStacks]. super interpretNextInstructionFor: client 24
  • 68. Merging stack (2) • We have to check for the possibility of a merge point at every instruction! ° We do this by overriding InstructionStream >> interpretNextInstructionFor: 24
  • 69. Merging stack (2) • We have to check for the possibility of a merge point at every instruction! ° We do this by overriding InstructionStream >> interpretNextInstructionFor: • Merging stacks is conservative: If element i in either stack is #self, then element i in ° the merged stack is also #self. 24
  • 70. Dealing with Blocks • The code for blocks is compiled in-line, and copied onto the stack at execution time 9 <10> pushTemp: 0 removeAll: aCollection 10 <89> pushThisContext: 11 <76> pushConstant: 1 aCollection do: [:each | self remove: each]. 12 <C8> send: blockCopy: ↑ aCollection 13 <A4 05> jumpTo: 20 15 <69> popIntoTemp: 1 16 <70> self 17 <11> pushTemp: 1 18 <E0> send: remove: 19 <7D> blockReturn 20 <CB> send: do: 21 <87> pop 22 <10> pushTemp: 0 23 <7C> returnTop 25
  • 71. Dealing with Blocks • The code for blocks is compiled in-line, and copied onto the stack at execution time 9 <10> pushTemp: 0 removeAll: aCollection 10 <89> pushThisContext: 11 <76> pushConstant: 1 aCollection do: [:each | self remove: each]. 12 <C8> send: blockCopy: ↑ aCollection 13 <A4 05> jumpTo: 20 Number of 15 <69> popIntoTemp: 1 arguments of 16 <70> self 17 <11> pushTemp: 1 the block 18 <E0> send: remove: 19 <7D> blockReturn 20 <CB> send: do: 21 <87> pop 22 <10> pushTemp: 0 23 <7C> returnTop 25
  • 72. Dealing with Blocks • The code for blocks is compiled in-line, and copied onto the stack at execution time 9 <10> pushTemp: 0 removeAll: aCollection 10 <89> pushThisContext: 11 <76> pushConstant: 1 aCollection do: [:each | self remove: each]. 12 <C8> send: blockCopy: ↑ aCollection 13 <A4 05> jumpTo: 20 15 <69> popIntoTemp: 1 16 <70> self 17 <11> pushTemp: 1 18 <E0> send: remove: 19 <7D> blockReturn 20 <CB> send: do: 21 <87> pop 22 <10> pushTemp: 0 23 <7C> returnTop 25
  • 73. Dealing with Blocks • The code for blocks is compiled in-line, and copied onto the stack at execution time 9 <10> pushTemp: 0 10 <89> pushThisContext: 11 <76> pushConstant: 1 12 <C8> send: blockCopy: 13 <A4 05> jumpTo: 20 15 <69> popIntoTemp: 1 16 <70> self 17 <11> pushTemp: 1 18 <E0> send: remove: 19 <7D> blockReturn 20 <CB> send: do: 21 <87> pop 22 <10> pushTemp: 0 23 <7C> returnTop 25
  • 74. Dealing with Blocks • The code for blocks is compiled in-line, and copied onto the stack at execution time 9 <10> pushTemp: 0 1. Sending a block copy 10 <89> pushThisContext: 11 <76> pushConstant: 1 remembers the number of 12 <C8> send: blockCopy: arguments 13 <A4 05> jumpTo: 20 15 <69> popIntoTemp: 1 2. When we drop into the 16 <70> self 17 <11> pushTemp: 1 block, empty stack and push 18 <E0> send: remove: some dummy arguments 19 <7D> blockReturn 20 <CB> send: do: 3. No merge of stacks is 21 <87> pop needed at 20 (or after any 22 <10> pushTemp: 0 23 <7C> returnTop unconditional jump 25
  • 75. Performance • Initially: 27 seconds to analyze every method (~45 000) in the image (600 µs per method). • Where did the time go? Dictionaries: SendsInfo >> #atMergePoint is usually false ° OrderedCollection: stack manipulations are slow ° InstructionStream>>interpretNextInstructionFor: ° uses linear dispatch • Fix these things ... 260 µs per method 26
  • 76. Conclusions • AI was easy to implement spare time at last two days of ESUG 2003 and flight home ° Thanks to John Brant and Vassili Bykov ° • AI implementation was more accurate than implementation based on parsing source Morph>>layoutInBounds: contains the statement: ° (owner ifNil:[self]) cellPositioning which AI detects as self-send • AI is fast Bytescodes are designed for fast interpretation ° 27