Syntax-Directed Translation: Syntax-Directed Definitions, Evaluation Orders for SDD's, Applications of Syntax-Directed Translation, Syntax-Directed Translation Schemes, and Implementing L-Attributed SDD's. Intermediate-Code Generation: Variants of Syntax Trees, Three-Address Code, Types and Declarations, Type Checking, Control Flow, Back patching, Switch-Statements
1. COMPILER DESIGN - Syntax-Directed Translation
Dr R Jegadeesan Prof-CSE
Jyothishmathi Institute of Technology and Science,
Karimnagar
2. SYLLABUS
UNIT - 3
UNIT - III Syntax-Directed Translation: Syntax-Directed
Definitions, Evaluation Orders for SDD's, Applications of
Syntax-Directed Translation, Syntax-Directed Translation
Schemes, and Implementing L-Attributed SDD's.
Intermediate-Code Generation: Variants of Syntax Trees,
Three-Address Code, Types and Declarations, Type
Checking, Control Flow, Backpatching, Switch-Statements.
4. UNIT 3 : Syntax-Directed Translation
Introduction
• We can associate information with a language construct by attaching attributes to the grammar symbols.
• A syntax-directed definition specifies the values of attributes by associating semantic rules with the grammar productions.
Production          Semantic Rule
E -> E1 + T         E.code = E1.code || T.code || ‘+’
• Alternatively, we may insert the semantic actions inside the grammar:
E -> E1 + T {print ‘+’}
5. UNIT 3 : Syntax-Directed Translation
Syntax-Directed Definitions
• An SDD is a context-free grammar with attributes and rules.
• Attributes are associated with grammar symbols, and rules with productions.
• Attributes may be of many kinds: numbers, types, table references, strings, etc.
• Synthesized attributes
– A synthesized attribute at node N is defined only in terms of attribute values at the children of N and at N itself.
• Inherited attributes
– An inherited attribute at node N is defined only in terms of attribute values at N’s parent, N itself, and N’s siblings.
6. Example of S-attributed SDD
Production          Semantic Rules
1) L -> E n         L.val = E.val
2) E -> E1 + T      E.val = E1.val + T.val
3) E -> T           E.val = T.val
4) T -> T1 * F      T.val = T1.val * F.val
5) T -> F           T.val = F.val
6) F -> (E)         F.val = E.val
7) F -> digit       F.val = digit.lexval
7. Example of mixed attributes
Production          Semantic Rules
1) T -> F T’        T’.inh = F.val
                    T.val = T’.syn
2) T’ -> * F T’1    T’1.inh = T’.inh * F.val
                    T’.syn = T’1.syn
3) T’ -> ε          T’.syn = T’.inh
4) F -> digit       F.val = digit.lexval
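The mixed-attribute rules above can be sketched as a small recursive-descent evaluator. This is only an illustrative sketch, assuming single-character tokens and hypothetical function names (parse_T, parse_Tprime, parse_F) that are not part of the slides: the inherited attribute T’.inh is passed down as an argument and the synthesized attribute T’.syn comes back as the return value.

```python
def parse_F(tokens, pos):
    """F -> digit: F.val = digit.lexval. Returns (F.val, next position)."""
    tok = tokens[pos]
    if not tok.isdigit():
        raise SyntaxError(f"expected digit, got {tok!r}")
    return int(tok), pos + 1


def parse_Tprime(tokens, pos, inh):
    """T' -> * F T'1 | epsilon. The argument `inh` plays the role of T'.inh."""
    if pos < len(tokens) and tokens[pos] == "*":
        f_val, pos = parse_F(tokens, pos + 1)
        # T'1.inh = T'.inh * F.val, then T'.syn = T'1.syn
        return parse_Tprime(tokens, pos, inh * f_val)
    # T' -> epsilon: T'.syn = T'.inh
    return inh, pos


def parse_T(tokens):
    """T -> F T': T'.inh = F.val, then T.val = T'.syn."""
    f_val, pos = parse_F(tokens, 0)
    syn, pos = parse_Tprime(tokens, pos, f_val)
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return syn


print(parse_T(list("3*5")))    # 15
print(parse_T(list("2*3*4")))  # 24
```

The left operand of each multiplication travels down the tree through inh, which is why this grammar cannot be handled with synthesized attributes alone.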
8. Evaluation orders for SDD’s
• A dependency graph is used to determine the order of computation of attributes.
• Dependency graph
– For each parse-tree node, the dependency graph has a node for each attribute associated with that node.
– If a semantic rule defines the value of synthesized attribute A.b in terms of the value of X.c, then the dependency graph has an edge from X.c to A.b.
– If a semantic rule defines the value of inherited attribute B.c in terms of the value of X.a, then the dependency graph has an edge from X.a to B.c.
9. Ordering the evaluation of attributes
• If the dependency graph has an edge from node M to node N, then the attribute corresponding to M must be evaluated before the attribute of N.
• Thus the only allowable orders of evaluation are those sequences of nodes N1, N2, …, Nk such that if there is an edge from Ni to Nj then i < j.
• Such an ordering is called a topological sort of the graph.
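As a rough illustration of the ordering rule above, the sketch below computes one topological order with Kahn's algorithm. The edge list EDGES is an assumption made for this example: it models the dependency graph of the mixed-attribute SDD for the input 3*5, and the attribute names in it are hypothetical labels.

```python
from collections import deque


def topological_order(edges):
    """Return one topological order of a graph given as (M, N) edge pairs.

    Raises ValueError if the graph has a cycle, in which case no
    evaluation order exists.
    """
    nodes = []
    for m, n in edges:
        for x in (m, n):
            if x not in nodes:
                nodes.append(x)
    indegree = {x: 0 for x in nodes}
    successors = {x: [] for x in nodes}
    for m, n in edges:
        successors[m].append(n)
        indegree[n] += 1

    ready = deque(x for x in nodes if indegree[x] == 0)
    order = []
    while ready:
        m = ready.popleft()
        order.append(m)
        for n in successors[m]:
            indegree[n] -= 1
            if indegree[n] == 0:   # all attributes N depends on are done
                ready.append(n)
    if len(order) != len(nodes):
        raise ValueError("dependency graph has a cycle")
    return order


# Assumed dependency graph for 3*5 (T -> F T', T' -> * F T'1, T'1 -> epsilon).
EDGES = [
    ("digit1.lexval", "F1.val"),
    ("digit2.lexval", "F2.val"),
    ("F1.val", "T'.inh"),
    ("T'.inh", "T'1.inh"),
    ("F2.val", "T'1.inh"),
    ("T'1.inh", "T'1.syn"),
    ("T'1.syn", "T'.syn"),
    ("T'.syn", "T.val"),
]

for attribute in topological_order(EDGES):
    print(attribute)
```

Any order the function returns is a valid evaluation order; a graph usually has more than one.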
10. S-Attributed definitions
• An SDD is S-attributed if every attribute is synthesized.
• We can evaluate the attributes of an S-attributed definition with a postorder traversal of the parse tree:
postorder(N) {
    for (each child C of N, from the left) postorder(C);
    evaluate the attributes associated with node N;
}
• S-attributed definitions can be implemented during bottom-up parsing without the need to explicitly create parse trees.
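The postorder scheme can be sketched in Python for the desk-calculator SDD (productions L -> E n through F -> digit). This is a minimal sketch under stated assumptions: the Node class and the hand-built parse tree for 3*5+4 n are hypothetical and stand in for whatever tree a parser would produce.

```python
class Node:
    """Parse-tree node; `val` holds the synthesized attribute."""

    def __init__(self, symbol, children=(), lexval=None):
        self.symbol = symbol
        self.children = list(children)
        self.lexval = lexval
        self.val = None


def postorder(n):
    """Evaluate the children first (left to right), then node n itself."""
    for c in n.children:
        postorder(c)
    kids = [c.symbol for c in n.children]
    if n.symbol == "digit":
        n.val = n.lexval                                  # supplied by the lexer
    elif kids == ["E", "n"]:
        n.val = n.children[0].val                         # L.val = E.val
    elif kids == ["E", "+", "T"]:
        n.val = n.children[0].val + n.children[2].val     # E.val = E1.val + T.val
    elif kids == ["T", "*", "F"]:
        n.val = n.children[0].val * n.children[2].val     # T.val = T1.val * F.val
    elif kids == ["(", "E", ")"]:
        n.val = n.children[1].val                         # F.val = E.val
    elif len(kids) == 1:
        n.val = n.children[0].val                         # E -> T, T -> F, F -> digit
    return n.val


def F(d):
    return Node("F", [Node("digit", lexval=d)])


# Hand-built parse tree for the input 3 * 5 + 4 n
t35 = Node("T", [Node("T", [F(3)]), Node("*"), F(5)])
e = Node("E", [Node("E", [t35]), Node("+"), Node("T", [F(4)])])
root = Node("L", [e, Node("n")])

print(postorder(root))  # 19
```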
11. L-Attributed definitions
• An SDD is L-attributed if the edges in its dependency graph go from left to right but not from right to left.
• More precisely, each attribute must be either
– Synthesized, or
– Inherited, but if there is a production A -> X1 X2 … Xn and there is an inherited attribute Xi.a computed by a rule associated with this production, then the rule may only use:
  – Inherited attributes associated with the head A
  – Either inherited or synthesized attributes associated with the occurrences of symbols X1, X2, …, Xi-1 located to the left of Xi
  – Inherited or synthesized attributes associated with this occurrence of Xi itself, but only in such a way that there is no cycle in the dependency graph
12. Application of Syntax-Directed Translation
• Type checking and intermediate code generation (chapter 6)
• Construction of syntax trees
– Leaf nodes: Leaf(op, val)
– Interior nodes: Node(op, c1, c2, …, ck)
Example:
Production          Semantic Rules
1) E -> E1 + T      E.node = new Node(‘+’, E1.node, T.node)
2) E -> E1 - T      E.node = new Node(‘-’, E1.node, T.node)
3) E -> T           E.node = T.node
4) T -> (E)         T.node = E.node
5) T -> id          T.node = new Leaf(id, id.entry)
6) T -> num         T.node = new Leaf(num, num.val)
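To make the Leaf and Node constructors concrete, the following sketch builds the syntax tree for a - 4 + c in the order in which a bottom-up parser would apply the rules. The class layout, the show helper, and the strings "entry-a" and "entry-c" (standing in for symbol-table entries) are assumptions made for this example.

```python
class Leaf:
    """Leaf(op, val): op is the token class, val is its lexical value."""

    def __init__(self, op, val):
        self.op = op
        self.val = val


class Node:
    """Node(op, c1, ..., ck): interior node with operator op."""

    def __init__(self, op, *children):
        self.op = op
        self.children = children


def show(n):
    """Render a tree as a fully parenthesized expression."""
    if isinstance(n, Leaf):
        return str(n.val)
    return "(" + f" {n.op} ".join(show(c) for c in n.children) + ")"


# Steps for a - 4 + c, following the semantic rules:
p1 = Leaf("id", "entry-a")   # T -> id
p2 = Leaf("num", 4)          # T -> num
p3 = Node("-", p1, p2)       # E -> E1 - T
p4 = Leaf("id", "entry-c")   # T -> id
p5 = Node("+", p3, p4)       # E -> E1 + T

print(show(p5))  # ((entry-a - 4) + entry-c)
```

The left-recursive productions make the tree lean to the left, which matches the left associativity of + and -.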
13. Syntax tree for L-attributed definition
Production          Semantic Rules
1) E -> T E’        E.node = E’.syn
                    E’.inh = T.node
2) E’ -> + T E1’    E1’.inh = new Node(‘+’, E’.inh, T.node)
                    E’.syn = E1’.syn
3) E’ -> - T E1’    E1’.inh = new Node(‘-’, E’.inh, T.node)
                    E’.syn = E1’.syn
4) E’ -> ε          E’.syn = E’.inh
5) T -> (E)         T.node = E.node
6) T -> id          T.node = new Leaf(id, id.entry)
7) T -> num         T.node = new Leaf(num, num.val)
14. Syntax-directed translation schemes
• An SDT is a context-free grammar with program fragments embedded within production bodies.
• Those program fragments are called semantic actions.
• They can appear at any position within a production body.
• Any SDT can be implemented by first building a parse tree and then performing the actions in a left-to-right depth-first order.
• Typically, SDT’s are implemented during parsing, without building a parse tree.
15. Postfix translation schemes
• The simplest case is when the grammar can be parsed bottom-up and the SDD is S-attributed.
• In such cases we can construct an SDT in which each action is placed at the end of its production and is executed along with the reduction of the body to the head of that production.
• SDT’s with all actions at the right ends of the production bodies are called postfix SDT’s.
16. Example of postfix SDT
1) L -> E n         {print(E.val);}
2) E -> E1 + T      {E.val = E1.val + T.val;}
3) E -> T           {E.val = T.val;}
4) T -> T1 * F      {T.val = T1.val * F.val;}
5) T -> F           {T.val = F.val;}
6) F -> (E)         {F.val = E.val;}
7) F -> digit       {F.val = digit.lexval;}
17. Parse-stack implementation of postfix SDT’s
• In a shift-reduce parser we can easily implement semantic actions using the parser stack.
• With each grammar symbol (or state) on the stack we can associate a record holding its attributes.
• In a reduction step we execute the semantic action at the end of the production to evaluate the attribute(s) of the nonterminal on the left side of the production.
• The result is put on the stack in place of the right side of the production.
18. Example
L -> E n      {print(stack[top-1].val);
               top = top-1;}
E -> E1 + T   {stack[top-2].val = stack[top-2].val + stack[top].val;
               top = top-2;}
E -> T
T -> T1 * F   {stack[top-2].val = stack[top-2].val * stack[top].val;
               top = top-2;}
T -> F
F -> (E)      {stack[top-2].val = stack[top-1].val;
               top = top-2;}
F -> digit
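The stack actions can be tried out with the small simulation below. It is a sketch under a strong assumption: the sequence of shift and reduce moves is written out by hand (the MOVES list is hypothetical), whereas a real LR parser would derive the moves from its parsing table. Only the .val field of each stack record is modelled.

```python
def run(moves):
    """Apply the parse-stack actions to a scripted list of parser moves.

    Each move is ("shift", value) or ("reduce", production).
    Returns the list of values printed by the action for L -> E n.
    """
    stack = []      # stack[i] is the .val field of the i-th stack record
    printed = []
    for kind, arg in moves:
        if kind == "shift":
            stack.append(arg)       # lexval for a digit, None for other tokens
            continue
        top = len(stack) - 1
        if arg == "L -> E n":
            printed.append(stack[top - 1])
            top = top - 1
        elif arg == "E -> E + T":
            stack[top - 2] = stack[top - 2] + stack[top]
            top = top - 2
        elif arg == "T -> T * F":
            stack[top - 2] = stack[top - 2] * stack[top]
            top = top - 2
        elif arg == "F -> ( E )":
            stack[top - 2] = stack[top - 1]
            top = top - 2
        # E -> T, T -> F, F -> digit: the value is already in place
        del stack[top + 1:]         # pop the records above the new top
    return printed


# Hand-written moves for the input 3 * 5 + 4 n
MOVES = [
    ("shift", 3), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("shift", None),                                   # *
    ("shift", 5), ("reduce", "F -> digit"), ("reduce", "T -> T * F"),
    ("reduce", "E -> T"),
    ("shift", None),                                   # +
    ("shift", 4), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("reduce", "E -> E + T"),
    ("shift", None),                                   # n
    ("reduce", "L -> E n"),
]

print(run(MOVES))  # [19]
```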
19. SDT’s with actions inside productions
• For a production B -> X {a} Y:
– If the parser is bottom-up, we perform action “a” as soon as this occurrence of X appears on top of the parser stack.
– If the parser is top-down, we perform “a” just before we expand Y.
• Sometimes we can’t do things as easily as explained above.
• One example is when we are parsing this SDT with a bottom-up parser:
1) L -> E n
2) E -> {print(‘+’);} E1 + T
3) E -> T
4) T -> {print(‘*’);} T1 * F
5) T -> F
6) F -> (E)
7) F -> digit {print(digit.lexval);}
20. SDT’s with actions inside productions (cont.)
• Any SDT can be implemented as follows:
1. Ignore the actions and produce a parse tree.
2. Examine each interior node N and add the actions as new children at the correct positions.
3. Perform a left-to-right depth-first traversal and execute each action when its node is visited.
[Figure: parse tree for 3*5+4 with the actions {print(‘+’);}, {print(‘*’);}, {print(3);}, {print(5);} and {print(4);} added as children]
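The three steps can be illustrated with a hand-built tree. In this sketch the tuple representation, the helper names act, tree and run_actions, and the tree itself (for the input 3*5+4) are assumptions made for the example; the actions are stored as the text they would print.

```python
def act(text):
    """An action node; `text` is what the action prints."""
    return ("action", text)


def tree(label, *children):
    """A grammar-symbol node with its children in left-to-right order."""
    return (label, list(children))


def run_actions(node, out):
    """Depth-first, left-to-right traversal that executes action nodes."""
    label, payload = node
    if label == "action":
        out.append(payload)
        return
    for child in payload:
        run_actions(child, out)


def F(d):
    # F -> digit {print(digit.lexval);}
    return tree("F", tree("digit"), act(str(d)))


# T -> {print('*');} T1 * F      for 3 * 5
t_mul = tree("T", act("*"), tree("T", F(3)), tree("*"), F(5))
# E -> {print('+');} E1 + T      for 3 * 5 + 4
e = tree("E", act("+"), tree("E", t_mul), tree("+"), tree("T", F(4)))
root = tree("L", e, tree("n"))

out = []
run_actions(root, out)
print(" ".join(out))  # + * 3 5 4
```

The output is the prefix form of the expression, which is what this SDT is meant to produce.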
21. SDT’s for L-Attributed definitions
• We can convert an L-attributed SDD into an SDT using the following two rules:
– Embed the action that computes the inherited attributes for a nonterminal A immediately before that occurrence of A. If several inherited attributes of A depend on one another in an acyclic fashion, order them so that those needed first are computed first.
– Place the action that computes a synthesized attribute for the head of a production at the end of the body of the production.
22. Example
SDD:
S -> while (C) S1    L1 = new();
                     L2 = new();
                     S1.next = L1;
                     C.false = S.next;
                     C.true = L2;
                     S.code = label || L1 || C.code || label || L2 || S1.code
SDT:
S -> while (   {L1 = new(); L2 = new(); C.false = S.next; C.true = L2;}
     C )       {S1.next = L1;}
     S1        {S.code = label || L1 || C.code || label || L2 || S1.code;}
23. UNIT 3 : Intermediate-Code Generation
Introduction
• Intermediate code is the interface between the front end and the back end of a compiler.
• Ideally, the details of the source language are confined to the front end and the details of the target machine to the back end, so that m front ends and n back ends can be combined into m*n compilers.
• In this chapter we study intermediate representations, static type checking and intermediate code generation.
[Figure: Parser -> Static Checker -> Intermediate Code Generator (front end) -> Code Generator (back end)]
24. UNIT 3 : Intermediate-Code Generation
Variants of syntax trees
• It is sometimes beneficial to create a DAG instead of a tree for expressions.
• This way we can easily identify the common subexpressions and then use that knowledge during code generation.
• Example: a+a*(b-c)+(b-c)*d
[Figure: DAG for a+a*(b-c)+(b-c)*d, in which the leaf a and the subexpression b-c are shared]
25. UNIT 3 : Intermediate-Code Generation
SDD for creating DAG’s
Production          Semantic Rules
1) E -> E1 + T      E.node = new Node(‘+’, E1.node, T.node)
2) E -> E1 - T      E.node = new Node(‘-’, E1.node, T.node)
3) E -> T           E.node = T.node
4) T -> (E)         T.node = E.node
5) T -> id          T.node = new Leaf(id, id.entry)
6) T -> num         T.node = new Leaf(num, num.val)
• To obtain a DAG, Node and Leaf first check whether an identical node already exists and, if so, return it instead of creating a new one.
Example: steps for a+a*(b-c)+(b-c)*d
1) p1 = Leaf(id, entry-a)
2) p2 = Leaf(id, entry-a) = p1
3) p3 = Leaf(id, entry-b)
4) p4 = Leaf(id, entry-c)
5) p5 = Node(‘-’, p3, p4)
6) p6 = Node(‘*’, p1, p5)
7) p7 = Node(‘+’, p1, p6)
8) p8 = Leaf(id, entry-b) = p3
9) p9 = Leaf(id, entry-c) = p4
10) p10 = Node(‘-’, p3, p4) = p5
11) p11 = Leaf(id, entry-d)
12) p12 = Node(‘*’, p5, p11)
13) p13 = Node(‘+’, p7, p12)
26. UNIT 3 : Intermediate-Code Generation
Value-number method for constructing DAG’s
• Nodes are stored in an array of records; the index of a node in the array is its value number.
• Algorithm
– Search the array for a node M with label op, left child l and right child r.
– If there is such a node, return the value number of M.
– If not, create in the array a new node N with label op, left child l and right child r, and return its value number.
• We may use a hash table to speed up the search.
Example: nodes of the DAG for i = i + 10 allocated in an array
1   id    to entry for i
2   num   10
3   +     1   2
4   =     1   3
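The value-number method can be sketched with a Python dict playing the role of the hash table. The Dag class and its node method are hypothetical names chosen for this example; value numbers start at 1 to match the array shown for i = i + 10.

```python
class Dag:
    """DAG builder using value numbers (1-based indices into `nodes`)."""

    def __init__(self):
        self.nodes = []    # nodes[k - 1] is the record with value number k
        self.index = {}    # hash table: (op, left, right) -> value number

    def node(self, op, left=None, right=None):
        """Return the value number of <op, left, right>, creating it if needed."""
        key = (op, left, right)
        if key in self.index:          # an identical node already exists
            return self.index[key]
        self.nodes.append(key)
        self.index[key] = len(self.nodes)
        return self.index[key]


# i = i + 10
dag = Dag()
i = dag.node("id", "i")          # 1
ten = dag.node("num", 10)        # 2
plus = dag.node("+", i, ten)     # 3
assign = dag.node("=", i, plus)  # 4
print(i, ten, plus, assign)      # 1 2 3 4

# a + a * (b - c) + (b - c) * d: the repeated b - c is created only once
g = Dag()
a, b, c, d = (g.node("id", name) for name in "abcd")
left = g.node("+", a, g.node("*", a, g.node("-", b, c)))
root = g.node("+", left, g.node("*", g.node("-", b, c), d))
print(len(g.nodes))              # 9
```

A tree for the second expression would need 13 nodes, the 13 steps listed for it, whereas the DAG needs 9.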
27. UNIT 3 : Intermediate-Code Generation
Three address code
• In three-address code there is at most one operator on the right side of an instruction.
• Example:
[Figure: DAG for a+a*(b-c)+(b-c)*d]
28. UNIT 3 : Intermediate-Code Generation
Forms of three address instructions
• x = y op z
• x = op y
• x = y
• goto L
• if x goto L and ifFalse x goto L
• if x relop y goto L
• Procedure calls using:
– param x
– call p,n
– y = call p,n
• x = y[i] and x[i] = y
• x = &y and x = *y and *x =y
29. UNIT 3 : Intermediate-Code Generation
Example
• do i = i+1; while (a[i] < v);
Symbolic labels:
L:   t1 = i + 1
     i = t1
     t2 = i * 8
     t3 = a[t2]
     if t3 < v goto L
Position numbers:
100: t1 = i + 1
101: i = t1
102: t2 = i * 8
103: t3 = a[t2]
104: if t3 < v goto 100
30. UNIT 3 : Intermediate-Code Generation
Data structures for three address codes
• Quadruples
– Each instruction has four fields: op, arg1, arg2 and result
• Triples
– Temporaries are not used; instead, an operand refers to the position of the instruction that computes it
• Indirect triples
– In addition to the triples, we keep a list of pointers to triples
31. UNIT 3 : Intermediate-Code Generation
Example
• a = b * minus c + b * minus c
Three address code:
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
Quadruples:
     op     arg1  arg2  result
0    minus  c           t1
1    *      b     t1    t2
2    minus  c           t3
3    *      b     t3    t4
4    +      t2    t4    t5
5    =      t5          a
Triples:
     op     arg1  arg2
0    minus  c
1    *      b     (0)
2    minus  c
3    *      b     (2)
4    +      (1)   (3)
5    =      a     (4)
Indirect triples:
instruction          op     arg1  arg2
35   (0)        0    minus  c
36   (1)        1    *      b     (0)
37   (2)        2    minus  c
38   (3)        3    *      b     (2)
39   (4)        4    +      (1)   (3)
40   (5)        5    =      a     (4)
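The relationship between quadruples and triples can be sketched as a small conversion routine. It rests on assumptions made for this example: quadruples are plain tuples (op, arg1, arg2, result), every result other than the target of "=" is a compiler temporary, and a reference to triple k is written as the string "(k)". The function name to_triples is hypothetical.

```python
QUADS = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def to_triples(quads):
    """Replace each temporary by the position of the triple that computes it."""
    position = {}   # temporary name -> index of the defining triple
    triples = []

    def ref(arg):
        return f"({position[arg]})" if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == "=":
            # copy: the target goes in arg1, the source in arg2
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1
    return triples


triples = to_triples(QUADS)
for k, t in enumerate(triples):
    print(k, t)

# Indirect triples: a separate list of pointers gives the execution order,
# so instructions can be reordered without renumbering the triples.
instruction_list = list(range(len(triples)))
```

With quadruples an optimizer can move an instruction freely because operands are named; with plain triples moving an instruction changes its position and therefore every reference to it, which is the problem the pointer list of indirect triples avoids.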
32. UNIT 3 : Intermediate-Code Generation
Type Expressions
Example: the type int[2][3] is written as array(2, array(3, integer))
• A basic type is a type expression.
• A type name is a type expression.
• A type expression can be formed by applying the array type constructor to a number and a type expression.
• A record is a data structure with named fields; a type expression can be formed by applying the record type constructor to the field names and their types.
• A type expression can be formed by using the type constructor → for function types.
• If s and t are type expressions, then their Cartesian product s × t is a type expression.
• Type expressions may contain variables whose values are type expressions.
33. UNIT 3 : Intermediate-Code Generation
Type Equivalence
Two type expressions are structurally equivalent if and only if one of the following conditions holds:
• They are the same basic type.
• They are formed by applying the same constructor to structurally equivalent types.
• One is a type name that denotes the other.
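The three conditions can be turned into a short recursive check. This is a sketch under assumptions that are not in the slides: basic types and type names are strings, constructed types are tuples such as ("array", 2, t) or ("->", s, t), and type names are looked up in a dict that is assumed to be acyclic.

```python
def equivalent(s, t, names=None):
    """Structural equivalence of two type expressions."""
    names = names or {}
    # One is a type name that denotes the other: resolve names first.
    while isinstance(s, str) and s in names:
        s = names[s]
    while isinstance(t, str) and t in names:
        t = names[t]
    # Same basic type.
    if isinstance(s, str) or isinstance(t, str):
        return s == t
    # Same constructor applied to structurally equivalent components.
    if s[0] != t[0] or len(s) != len(t):
        return False
    return all(
        equivalent(a, b, names) if isinstance(a, (str, tuple)) else a == b
        for a, b in zip(s[1:], t[1:])
    )


int_2_3 = ("array", 2, ("array", 3, "integer"))   # int[2][3]

print(equivalent(int_2_3, ("array", 2, ("array", 3, "integer"))))  # True
print(equivalent(int_2_3, ("array", 3, ("array", 2, "integer"))))  # False
print(equivalent("row", ("array", 3, "integer"),
                 {"row": ("array", 3, "integer")}))                # True
```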
34. UNIT 3 : Intermediate-Code Generation
Declarations
35. UNIT 3 : Intermediate-Code Generation
Storage Layout for Local Names
• Computing types and their widths
36. UNIT 3 : Intermediate-Code Generation
Storage Layout for Local Names
Syntax-directed translation of array types
37. UNIT 3 : Intermediate-Code Generation
Sequences of Declarations
Actions at the end:
38. UNIT 3 : Intermediate-Code Generation
Fields in Records and Classes
39. UNIT 3 : Intermediate-Code Generation
Translation of Expressions and Statements
• We have discussed how to find the types and offsets of variables.
• We therefore have the necessary preparation to discuss translation to intermediate code.
• We also discuss type checking.
40. UNIT 3 : Intermediate-Code Generation
Three-address code for expressions
41. UNIT 3 : Intermediate-Code Generation
Incremental Translation
42. UNIT 3 : Intermediate-Code Generation
Addressing Array Elements
• Layouts for a two-dimensional array:
43. UNIT 3 : Intermediate-Code Generation
Semantic actions for array reference
44. UNIT 3 : Intermediate-Code Generation
Translation of Array References
Nonterminal L has three synthesized attributes:
• L.addr
• L.array
• L.type
45. UNIT 3 : Intermediate-Code Generation
Conversions between primitive types in Java
46. UNIT 3 : Intermediate-Code Generation
Introducing type conversions into expression evaluation
47. UNIT 3 : Intermediate-Code Generation
Abstract syntax tree for the function definition
fun length(x) =
    if null(x) then 0 else length(tl(x)) + 1
This is a polymorphic function in the ML language.
48. UNIT 3 : Intermediate-Code Generation
Algorithm for Unification
49. UNIT 3 : Intermediate-Code Generation
Unification algorithm
boolean unify(Node m, Node n) {
    s = find(m); t = find(n);    // representatives of the two equivalence classes
    if (s == t) return true;
    else if (nodes s and t represent the same basic type) return true;
    else if (s is an op-node with children s1 and s2 and
             t is an op-node with children t1 and t2) {
        union(s, t);
        return unify(s1, t1) and unify(s2, t2);
    }
    else if (s or t represents a variable) {
        union(s, t);
        return true;
    }
    else return false;
}
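A runnable version of the pseudocode might look as follows. It is a sketch under assumptions: the TypeNode class, its kind field and the simple parent-pointer union-find are hypothetical, and the check that both op-nodes carry the same operator is made explicit here. union keeps a non-variable node as the representative whenever it can, so a variable's class ends up represented by the type it was unified with.

```python
class TypeNode:
    """Node of a type graph: kind is "basic", "var" or "op"."""

    def __init__(self, kind, label, children=()):
        self.kind = kind
        self.label = label
        self.children = list(children)
        self.parent = self            # union-find parent pointer


def find(n):
    """Return the representative of n's equivalence class."""
    while n.parent is not n:
        n.parent = n.parent.parent    # path halving
        n = n.parent
    return n


def union(s, t):
    """Merge two classes, preferring a non-variable representative."""
    if s.kind == "var":
        s.parent = t
    else:
        t.parent = s


def unify(m, n):
    s, t = find(m), find(n)
    if s is t:
        return True
    if s.kind == "basic" and t.kind == "basic":
        return s.label == t.label
    if (s.kind == "op" and t.kind == "op" and s.label == t.label
            and len(s.children) == len(t.children)):
        union(s, t)
        return all(unify(a, b) for a, b in zip(s.children, t.children))
    if s.kind == "var" or t.kind == "var":
        union(s, t)
        return True
    return False


# Unify  alpha -> alpha  with  integer -> beta
integer = TypeNode("basic", "integer")
alpha = TypeNode("var", "alpha")
beta = TypeNode("var", "beta")
t1 = TypeNode("op", "->", [alpha, alpha])
t2 = TypeNode("op", "->", [integer, beta])

print(unify(t1, t2))        # True
print(find(beta).label)     # integer
```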
50. UNIT 3 : Intermediate-Code Generation
Control Flow
Boolean expressions are often used to:
• Alter the flow of control.
• Compute logical values.
52. UNIT 3 : Intermediate-Code Generation
Flow-of-Control Statements
53. UNIT 3 : Intermediate-Code Generation
Syntax-directed definition
54. UNIT 3 : Intermediate-Code Generation
Generating three-address code for Boolean expressions
55. UNIT 3 : Intermediate-Code Generation
Translation of a simple if-statement
56. UNIT 3 : Intermediate-Code Generation
Backpatching
• The earlier translation of Boolean expressions inserts symbolic labels as jump targets.
• It therefore needs a separate pass to bind the labels to addresses.
• A technique called backpatching avoids this extra pass.
• We assume that instructions are generated into an array and that labels are indices into this array.
• For a nonterminal B we use two attributes, B.truelist and B.falselist, together with the following functions:
– makelist(i): creates a new list containing only i, an index into the array of instructions
– merge(p1, p2): concatenates the lists pointed to by p1 and p2 and returns a pointer to the concatenated list
– backpatch(p, i): inserts i as the target label for each of the instructions on the list pointed to by p
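The three functions can be sketched over a simple instruction array. The Code class, the relop helper and the starting address 100 are assumptions made for this example; the usage part walks by hand through x < 100 || x > 200 && x != y, applying the usual backpatching actions for && and ||, where a marker records the index of the next instruction before the second operand.

```python
class Code:
    """Instruction array; jump targets stay None until backpatched."""

    def __init__(self, start=100):
        self.start = start
        self.instrs = []              # each entry is [text, target]

    def nextinstr(self):
        return self.start + len(self.instrs)

    def emit(self, text):
        index = self.nextinstr()
        self.instrs.append([text, None])
        return index

    def backpatch(self, plist, target):
        """Insert `target` as the jump target of every instruction on plist."""
        for index in plist:
            self.instrs[index - self.start][1] = target


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def relop(code, test):
    """B -> E1 rel E2: emit a conditional jump and an unconditional jump."""
    truelist = makelist(code.emit(f"if {test} goto"))
    falselist = makelist(code.emit("goto"))
    return truelist, falselist


code = Code()
b1_true, b1_false = relop(code, "x < 100")
m1 = code.nextinstr()                 # marker before the right operand of ||
b2_true, b2_false = relop(code, "x > 200")
m2 = code.nextinstr()                 # marker before the right operand of &&
b3_true, b3_false = relop(code, "x != y")

# B -> B2 && M B3
code.backpatch(b2_true, m2)
and_true, and_false = b3_true, merge(b2_false, b3_false)

# B -> B1 || M B
code.backpatch(b1_false, m1)
or_true, or_false = merge(b1_true, and_true), and_false

for k, (text, target) in enumerate(code.instrs, start=code.start):
    print(f"{k}: {text} {target if target is not None else '_'}")
print(or_true, or_false)              # [100, 104] [103, 105]
```

The jumps left on the final truelist and falselist remain open; they would be filled in once the enclosing statement knows where control should go.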
57. UNIT 3 : Intermediate-Code Generation
Backpatching for Boolean Expressions
58. UNIT 3 : Intermediate-Code Generation
Backpatching for Boolean Expressions
• Annotated parse tree for x < 100 || x > 200 && x != y
59. UNIT 3 : Intermediate-Code Generation
Flow-of-Control Statements
60. UNIT 3 : Intermediate-Code Generation
Translation of a switch-statement