1. Debremarkos Institute of Technology
School of Computing
Software Engineering Academic Program
By Lamesginew A. (lame2002@gmail.com)
Chapter – 4
Syntax Directed Translation
2. Outline
Introduction to semantic analysis
Introduction to SDT
Syntax Directed Definitions (SDD)
Construction of Syntax trees
Bottom-up evaluation of S-attributed definition
L-attributed definitions
4. Introduction to semantic analysis
•The semantic analysis phase is the third phase of the compiler, after
lexical analysis and syntax analysis.
•Semantic analysis checks the source program for semantic errors.
•It uses the hierarchical structure determined by the syntax
analysis phase to identify the operators and operands of
expressions and statements.
•Semantic analysis performs type checking, i.e., it checks
whether each operator has operands that are permitted
by the source language specification.
Example: If a real number is used to index an array, e.g.
a[1.5], the compiler reports an error. This error is
detected during semantic analysis.
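The array-index check described above can be sketched as a small type-checking routine. This is only a minimal sketch under stated assumptions: the rule that indices must be integers, the type names, and the names `SemanticError`, `type_of_literal`, and `check_index` are all hypothetical and chosen for illustration.

```python
class SemanticError(Exception):
    """Raised when a construct is syntactically valid but breaks a type rule."""


def type_of_literal(text):
    """Return 'real' for a numeric literal containing a dot, else 'int'."""
    return "real" if "." in text else "int"


def check_index(array_name, index_literal):
    """Enforce the assumed language rule that array indices must be integers."""
    index_type = type_of_literal(index_literal)
    if index_type != "int":
        raise SemanticError(
            f"index of {array_name} must be int, found {index_type}: "
            f"{array_name}[{index_literal}]"
        )
    return index_type


check_index("a", "2")            # accepted: integer index
try:
    check_index("a", "1.5")      # rejected: real index, as in a[1.5]
except SemanticError as err:
    print("semantic error:", err)
```

Note that a[1.5] is syntactically well formed; the parser accepts it, and only the type rule applied afterwards rejects it.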
5. Semantic Errors
The primary sources of semantic errors are
undeclared names and type incompatibilities.
Semantic errors can be detected both at compile
time and at run time.
Many compile-time semantic errors concern the declaration or scope of variables.
Example: Undeclared or multiply-declared identifiers.
Example: Type incompatibilities between operators
and operands, and between formal and actual
parameters, are another common source of semantic
errors that can be detected at compile time.
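Undeclared and multiply-declared identifiers are typically caught with a symbol table. The following is a minimal sketch, assuming block-structured scopes kept on a stack; the class `SymbolTable` and its method names are hypothetical, not taken from the slides.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a name to its declared type."""

    def __init__(self):
        self.scopes = [{}]      # start with the global scope
        self.errors = []        # collected semantic error messages

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        """Record a declaration; report a duplicate within the same scope."""
        current = self.scopes[-1]
        if name in current:
            self.errors.append(f"'{name}' is declared more than once")
        else:
            current[name] = type_name

    def lookup(self, name):
        """Search from the innermost scope outward; report undeclared names."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        self.errors.append(f"'{name}' is not declared")
        return None


table = SymbolTable()
table.declare("x", "int")
table.declare("x", "real")      # multiply-declared identifier
table.lookup("y")               # undeclared identifier
print(table.errors)
```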
6. Syntax-Directed Translation
Syntax-directed translation is a notational framework for translating the valid
syntactic constructs received from the syntax analyzer into a form suitable for
intermediate code generation. It attaches actions, rules, or subroutines to the
productions of the grammar.
It uses a CFG to specify the syntactic structure of the language.
It associates a set of attributes with the terminals and nonterminals of the grammar.
It associates with each production a set of semantic rules to compute the values of
attributes.
There are two notations for associating semantic rules with productions:
1) Syntax-directed definition
It specifies the values of attributes by associating semantic rules with the
grammar productions.
It hides many implementation details.
Production          Semantic Rule
E → E1 + T          E.code = E1.code || T.code || '+'
2) Translation schemes
It embeds semantic actions within production bodies.
It indicates the order in which semantic rules are to be evaluated.
It exposes some implementation details.
E → E1 + T { print '+'; }
7. The subscript in E1 distinguishes the occurrence of E in the
production body from the occurrence of E as the head.
Both E and T have a string-valued attribute code. The semantic rule
specifies that the string E.code is formed by concatenating E1.code,
T.code, and the character ‘+’.
Semantic actions are enclosed within curly braces.
If curly braces occur as grammar symbols, we enclose them within
single quotes, as in ‘{‘ and ‘}’.
Semantic actions may occur at any position in a production body.
Syntax-directed definitions can be more readable, and hence more
useful for specifications.
However, translation schemes can be more efficient, and hence more
useful for implementations.
8. Syntax-Directed Definition (SDD)
• An SDD is a context-free grammar with attributes and rules.
• Attributes are associated with grammar symbols, and rules with
productions.
• E.g. If X is a grammar symbol and a is one of its attributes, then we write
X.a to denote the value of a at a particular parse-tree node labeled X.
• Attributes may be of many kinds: numbers, types, table references, strings, etc.
• There are two kinds of attributes for nonterminals:
1) Synthesized attributes
A synthesized attribute at node N is defined only in terms of attribute
values at the children of N and at N itself.
The attributes of the parent depend on the attributes of the children.
For the production S → ABC, if an attribute of S takes its value from the
child nodes (A, B, C), then it is a synthesized attribute.
For the production E → E + T, the parent node E gets its value from its
child nodes.
Synthesized attributes never take values from parent nodes or
sibling nodes.
9. 2) Inherited attributes
An inherited attribute at node N is defined only in terms of attribute
values at N’s parent, N itself and N’s siblings.
The attributes of the children depend on the attributes of the parent.
They can take values from parent and/or siblings.
E.g. in the production S → ABC, A can get values from S, B, and C. B
can take values from S, A, and C. Likewise, C can take values from S, A,
and B.
[Figure: (a) attribute synthesized at node n; (b) attribute inherited at node n]
Terminals can have synthesized attributes, but not inherited attributes.
Attributes for terminals have lexical values that are supplied by the lexical
analyzer;
there are no semantic rules in the SDD itself for computing the value of an
attribute for a terminal.
10. Example
Production          Semantic rules
1) L → E n          L.val = E.val
2) E → E1 + T       E.val = E1.val + T.val
3) E → T            E.val = T.val
4) T → T1 * F       T.val = T1.val * F.val
5) T → F            T.val = F.val
6) F → (E)          F.val = E.val
7) F → digit        F.val = digit.lexval
SDD for a grammar of arithmetic expressions with operators + and *.
It evaluates expressions terminated by an end marker n.
Each of the nonterminals has a single synthesized attribute, called val.
An SDD that involves only synthesized attributes is called S-attributed.
The above SDD is S-attributed.
In an S-attributed SDD, each rule computes an attribute for the
nonterminal at the head of a production from attributes taken from the
body of the production.
11. Evaluating an SDD at the Nodes of a Parse Tree
To visualize the translation specified by an SDD, it helps to work with
parse trees, even though a translator need not actually build a parse tree.
A parse tree, showing the value(s) of its attribute(s) is called an
annotated parse tree or decorated parse tree.
How do we construct an annotated parse tree?
Before we can evaluate an attribute at a node of a parse tree, we must
evaluate all the attributes upon which its value depends.
For example, if all attributes are synthesized, then we must evaluate the
val attributes at all of the children of a node before we can evaluate the
val attribute at the node itself.
12. Annotated parse tree for the input string 3 * 5 + 4 n, constructed
using the previous grammar and rules.
The values of lexval are presumed supplied by the lexical analyzer.
Each of the nodes for the nonterminals has attribute val computed in a bottom-up
order, and we see the resulting values associated with each node.
For instance, at the node with a child labeled *, after computing T.val = 3 and
F.val = 5 at its first and third children, we apply the rule T.val = T1.val * F.val,
which gives T.val = 15.
13. Consider the following SDD used for computing terms like 3 * 5 and 3 * 5 * 7.
Each of the nonterminals T and F has a synthesized attribute val;
The terminal digit has a synthesized attribute lexval.
The nonterminal T' has two attributes: an inherited attribute inh and a synthesized attribute syn.
The head T' of the production T' → * F T1' inherits the left operand of * in the
production body.
Given a term x * y * z, the root of the subtree for * y * z inherits x. Then, the root of
the subtree for * z inherits the value of x * y, and so on.
14. The annotated parse tree for the input 3*5, based on the previous grammar, is evaluated as follows.
The leftmost leaf in the parse tree, labeled digit, has attribute value lexval = 3, where
the 3 is supplied by the lexical analyzer.
Its parent is for production 4, F → digit. The only semantic rule associated with this
production defines F.val = digit.lexval, which equals 3.
At the second child of the root, the inherited attribute T'.inh is defined by the
semantic rule T'.inh = F.val associated with production 1.
Thus, the left operand, 3, for the * operator is passed from left to right across the
children of the root.
15. The production at the node for T' is T' → * F T1'.
The inherited attribute T1'.inh is defined by the semantic
rule T1'.inh = T'.inh × F.val associated with production 2.
With T'.inh = 3 and F.val = 5, we get T1'.inh = 15.
At the lower node for T1', the production is T' → ε.
The semantic rule T'.syn = T'.inh defines T1'.syn = 15.
The syn attributes at the nodes for T' pass the value 15 up
the tree to the node for T, where T.val = 15.
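The flow of inh and syn traced above can be sketched as a recursive-descent evaluator in which an inherited attribute becomes a function parameter and a synthesized attribute becomes a return value. This is a sketch under assumptions: the rules T.val = T'.syn and T'.syn = T1'.syn are not quoted on the slides and are assumed here as the rules that pass the result back up the tree, tokens are single characters, and the function names are hypothetical.

```python
def parse_T(tokens, pos=0):
    """T -> F T'     rules: T'.inh = F.val ; T.val = T'.syn (assumed)."""
    f_val, pos = parse_F(tokens, pos)
    return parse_T_prime(tokens, pos, inh=f_val)


def parse_T_prime(tokens, pos, inh):
    """T' -> * F T1' rules: T1'.inh = T'.inh * F.val ; T'.syn = T1'.syn (assumed).
    T' -> epsilon    rule:  T'.syn = T'.inh."""
    if pos < len(tokens) and tokens[pos] == "*":
        f_val, pos = parse_F(tokens, pos + 1)
        return parse_T_prime(tokens, pos, inh=inh * f_val)
    return inh, pos                     # epsilon case: syn is just inh


def parse_F(tokens, pos):
    """F -> digit    rule:  F.val = digit.lexval."""
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError(f"digit expected at position {pos}")
    return int(tokens[pos]), pos + 1


def evaluate_term(text):
    """Evaluate a term of single digits such as '3*5'; returns T.val."""
    tokens = list(text.replace(" ", ""))
    val, pos = parse_T(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return val


print(evaluate_term("3*5"))      # 15
print(evaluate_term("3*5*7"))    # 105
```

For 3*5 the call sequence mirrors the annotated tree: F.val = 3 is passed down as inh, the inner call receives inh = 15, and the epsilon case returns it as syn.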
16. Dependency graph
• The interdependencies among the inherited and synthesized attributes at
the nodes in a parse tree can be depicted by a directed graph called a
dependency graph.
• It is a useful tool for determining an evaluation order for the attribute
instances in a given parse tree.
• An annotated parse tree shows the values of attributes; a dependency graph
shows how those values can be computed.
• Edges express constraints implied by the semantic rules.
In more detail:
For each parse tree node, the dependency graph has a node for each
attribute associated with that node.
If a semantic rule defines the value of synthesized attribute A.b in terms of
the value of X.c then the dependency graph has an edge from X.c to A.b
If a semantic rule defines the value of inherited attribute B.c in terms of
the value of X.a then the dependency graph has an edge from X.a to B.c
17. Example:
Consider the following production and semantic rule:
Production          Semantic Rule
E → E1 + T          E.val = E1.val + T.val
A portion of the dependency graph for every parse tree in which this
production is used looks like the following.
The parse tree edges are shown as dotted lines, while the edges of the
dependency graph are solid.
18. An example of a complete dependency graph is the following.
The nodes of the dependency graph, represented by the numbers 1 through
9, correspond to the attributes in the annotated parse tree of input 3*5
Nodes 1 and 2 represent the attribute lexval associated with the two leaves
labeled digit.
Nodes 3 and 4 represent the attribute val associated with the two nodes
labeled F.
Nodes 5 and 6 represent the inherited attribute T’.inh associated with each
of the occurrences of nonterminal T’.
The edge represents dependence, not equality
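Because a dependency graph fixes only which attributes must be computed before which, any topological order of its nodes is a valid evaluation order. The sketch below encodes the nine-node graph for input 3*5 and sorts it with the standard-library `graphlib` module (Python 3.9 or later). The meaning of nodes 1 through 6 follows the description above; the meaning of nodes 7, 8, and 9 (the two T'.syn attributes and T.val) and the exact edge set are assumptions, since the figure itself is not reproduced here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# What each numbered node stands for (7 to 9 are assumed, see the text above).
attribute = {
    1: "digit.lexval (3)", 2: "digit.lexval (5)",
    3: "F.val (left F)",   4: "F.val (right F)",
    5: "T'.inh (upper T')", 6: "T'.inh (lower T')",
    7: "T'.syn (lower T')", 8: "T'.syn (upper T')",
    9: "T.val",
}

# node -> set of nodes it depends on; a graph edge runs from each
# dependency to the node that uses it.
depends_on = {
    3: {1},        # F.val = digit.lexval
    4: {2},
    5: {3},        # T'.inh = F.val
    6: {5, 4},     # T1'.inh = T'.inh * F.val
    7: {6},        # T'.syn = T'.inh   (epsilon production)
    8: {7},        # T'.syn = T1'.syn  (assumed rule)
    9: {8},        # T.val = T'.syn    (assumed rule)
}

order = list(TopologicalSorter(depends_on).static_order())
for node in order:
    print(node, attribute[node])
```

Several orders are valid (for example, nodes 1 and 2 may be swapped); the sorter returns one of them. A cycle in the graph would raise `graphlib.CycleError`, which corresponds to an SDD that has no evaluation order for that parse tree.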
Example 2:
19. Example 3:
A dependency graph for the input 5+3*4 constructed using the previous grammar and
rules looks like the following.
20. Construction of Syntax trees
An (abstract) syntax tree is a condensed form of parse tree useful for
representing language constructs.
In a syntax tree, operators and keywords do not appear as leaves, but
rather are associated with the interior node that would be the parent of
those leaves in the parse tree.
The parse tree of input 3*5+4 becomes the following syntax tree.
21. Constructing Syntax Trees for Expressions
We construct subtrees for the subexpressions by creating a node for
each operator and operand.
Each node in a syntax tree can be implemented as a record with
several fields.
In the node for an operator, one field identifies the operator and the
remaining fields contain pointers to the nodes for the operands.
We use the following functions to create the nodes of syntax trees for
expressions with binary operators. Each function returns a pointer to a
newly created node.
mknode(op, left, right) creates an operator node with label op and two
fields containing pointers to left and right.
mkleaf(id, entry) creates an identifier node with label id and a field
containing entry, a pointer to the symbol-table entry for the identifier.
mkleaf(num, val) creates a number node with label num and a field
containing val, the value of the number.
22. Example:
• The following sequence of functions calls creates the syntax tree for
the expression a – 4 + c.
• p1, p2. . . . , p5 are pointers to nodes, and
• entry-a and entry-c are pointers to the symbol-table entries for
identifiers a and c, respectively.
1) p1 = mkleaf(id, entry-a);
2) p2 = mkleaf(num, 4);
3) p3 = mknode('-', p1, p2);
4) p4 = mkleaf(id, entry-c);
5) p5 = mknode('+', p3, p4);
• The tree is constructed bottom up.
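The five calls above can be tried out with a small Python sketch of mknode and mkleaf. The record layout (a `Node` dataclass with `label`, `left`, `right`, and `value` fields), the helper `to_text`, and the use of the strings "entry-a" and "entry-c" in place of real symbol-table pointers are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    """One syntax-tree record: a label plus either children or a leaf value."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Union[str, int, None] = None   # symbol-table entry or number


def mknode(op, left, right):
    """Create an operator node with label op and two child pointers."""
    return Node(label=op, left=left, right=right)


def mkleaf(kind, value):
    """Create a leaf: kind is 'id' or 'num', value is the entry or the number."""
    return Node(label=kind, value=value)


def to_text(node):
    """Render the tree fully parenthesized, to make its shape visible."""
    if node.left is None and node.right is None:
        return str(node.value)
    return f"({to_text(node.left)} {node.label} {to_text(node.right)})"


p1 = mkleaf("id", "entry-a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "entry-c")
p5 = mknode("+", p3, p4)
print(to_text(p5))   # ((entry-a - 4) + entry-c)
```

The printed form shows that the subtraction node p3 becomes the left operand of the addition, so the tree reflects the left associativity of a - 4 + c.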
23. A Syntax-Directed definition for Constructing Syntax Trees
S-attributed definition for constructing a syntax tree for an expression
containing the operators + and -.
Production          Semantic rule
1) E → E1 + T       E.node = mknode('+', E1.node, T.node)
2) E → E1 - T       E.node = mknode('-', E1.node, T.node)
3) E → T            E.node = T.node
4) T → (E)          T.node = E.node
5) T → id           T.node = mkleaf(id, id.entry)
6) T → num          T.node = mkleaf(num, num.val)
Every time the first production E → E1 + T is used, its rule creates a
node with '+' for op and two children, E1.node and T.node, for the
subexpressions. The second production works the same way, with '-' for op.
Productions 3 and 4 do not create a node; they pass the existing node up the tree.
Each of the last two T-productions has a single terminal on the right, so its
rule creates a leaf node.
24. Example
The following figure shows the construction of a syntax tree for the
input a – 4 + c.
The nodes of the syntax tree are shown as records, with the op field
first. Syntax-tree edges are now shown as solid lines.
The underlying parse tree is shown with dotted edges.
The dashed lines represent the values of E.node and T.node; each line
points to the appropriate syntax-tree node.
25. Bottom-Up Evaluation of S-attributed definitions
An SDD is S-attributed if every attribute is synthesized.
When an SDD is S-attributed, we can evaluate its attributes in
any bottom-up order of the nodes of the parse tree.
It is simple to evaluate the attributes by performing a postorder
traversal of the parse tree.
That is, we apply the function postorder, defined below, to the
root of the parse tree.
postorder(N)
{
    for (each child C of N, from the left)
        postorder(C);
    evaluate the attributes associated with node N;
}
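As a concrete instance of the postorder function, the sketch below evaluates the val attribute of the arithmetic-expression SDD (productions L → E n through F → digit) on a hand-built parse tree for 3 * 5 + 4 n. The `ParseNode` class, the helpers `leaf` and `F`, and the way a production is recognized from the symbols of a node's children are assumptions made to keep the example short.

```python
class ParseNode:
    """A parse-tree node; val is the synthesized attribute filled in later."""

    def __init__(self, symbol, children=(), lexval=None):
        self.symbol = symbol
        self.children = list(children)
        self.lexval = lexval      # set only on digit leaves, by the "lexer"
        self.val = None


def evaluate(node):
    """Apply the semantic rule of the production used at this node."""
    kids = node.children
    shape = [c.symbol for c in kids]
    if shape == ["E", "n"]:                 # L -> E n
        node.val = kids[0].val
    elif shape == ["E", "+", "T"]:          # E -> E1 + T
        node.val = kids[0].val + kids[2].val
    elif shape == ["T", "*", "F"]:          # T -> T1 * F
        node.val = kids[0].val * kids[2].val
    elif shape == ["(", "E", ")"]:          # F -> (E)
        node.val = kids[1].val
    elif shape == ["digit"]:                # F -> digit
        node.val = kids[0].lexval
    elif len(kids) == 1:                    # E -> T and T -> F
        node.val = kids[0].val
    # terminals have no rule; their lexval is already set


def postorder(node):
    for child in node.children:             # children first, from the left
        postorder(child)
    evaluate(node)                          # then the node itself


def leaf(symbol, lexval=None):
    return ParseNode(symbol, lexval=lexval)


def F(digit):
    return ParseNode("F", [leaf("digit", digit)])


# Parse tree for 3 * 5 + 4 n
t = ParseNode("T", [ParseNode("T", [F(3)]), leaf("*"), F(5)])
e = ParseNode("E", [ParseNode("E", [t]), leaf("+"), ParseNode("T", [F(4)])])
root = ParseNode("L", [e, leaf("n")])

postorder(root)
print(root.val)   # 19
```

Because every attribute here is synthesized, each node's children are guaranteed to have their val computed by the time `evaluate` runs on the node.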
26. L-attributed definitions
•An SDD is L-attributed if the dependency-graph edges between
attributes associated with a production body go from left to right,
but not from right to left.
•More precisely, each attribute must be either
1) Synthesized, or
2) Inherited, but if there is a production A → X1 X2 … Xn and
there is an inherited attribute Xi.a computed by a rule
associated with this production, then the rule may only
use:
a) Inherited attributes associated with the head A
b) Either inherited or synthesized attributes associated
with the occurrences of symbols X1, X2, …, Xi-1 located
to the left of Xi
c) Inherited or synthesized attributes associated with this
occurrence of Xi itself, but only in such a way that there
are no cycles in the dependency graph formed by the
attributes of this Xi
27. Example:
• For example, the SDD seen earlier for terms such as 3 * 5 is
L-attributed. To see why, consider the semantic rules for inherited
attributes, which are repeated here for convenience:
Production          Semantic Rule
T → F T'            T'.inh = F.val
T' → * F T1'        T1'.inh = T'.inh × F.val
The first rule defines the inherited attribute T'.inh using only
F.val, and F appears to the left of T' in the production body.
The second rule defines T1'.inh using the inherited attribute
T'.inh associated with the head, and F.val, where F appears to
the left of T1' in the production body.
In each of these cases, the rules use information "from above
or from the left", as required by the class. The remaining
attributes are synthesized. Hence, the SDD is L-attributed.
28. Example:
• Any SDD containing the following production and rules
cannot be L-attributed:
Production          Semantic Rules
A → B C             A.s = B.b;
                    B.i = f(C.c, A.s);
The first rule, A.s = B.b, is a legitimate rule in either an
S-attributed or L-attributed SDD. It defines a synthesized
attribute A.s in terms of an attribute at a child.
The second rule defines an inherited attribute B.i, so the
entire SDD cannot be S-attributed. Further, although the
rule is legal, the SDD cannot be L-attributed, because the
attribute C.c is used to help define B.i, and C is to the right
of B in the production body.
29. Example:
• Consider the following SDD:
Production          Semantic Rule
A → L M             L.i = l(A.i)
                    M.i = m(L.s)
                    A.s = f(M.s)
A → Q R             R.i = r(A.i)
                    Q.i = q(R.s)
                    A.s = f(Q.s)
•Is the SDD S-attributed? No, because it has inherited attributes.
•Is the SDD L-attributed? No, because Q.i depends on R.s, and R is to the right of Q.
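The L-attributed conditions can be checked mechanically, one semantic rule at a time. The sketch below is a simplified illustration, not a complete checker: the encoding of a rule as a target attribute plus a list of source attributes, each written as (position, name, kind) with position 0 for the head and 1..n for the body symbols, is an assumption of this example, and the no-cycle condition for attributes of Xi itself is not checked.

```python
def rule_is_l_attributed(target, sources):
    """Check one semantic rule against the L-attributed conditions."""
    pos, _name, kind = target
    if kind == "syn":
        return True                          # synthesized attributes are always allowed
    for src_pos, _src_name, src_kind in sources:
        from_head = src_pos == 0 and src_kind == "inh"   # inherited attribute of A
        from_left = 0 < src_pos < pos                    # symbol to the left of Xi
        from_self = src_pos == pos                       # Xi itself (cycles not checked)
        if not (from_head or from_left or from_self):
            return False
    return True


def is_l_attributed(rules):
    """rules is a list of (target, sources) pairs for one production."""
    return all(rule_is_l_attributed(t, s) for t, s in rules)


# A -> L M : L.i = l(A.i); M.i = m(L.s); A.s = f(M.s)
rules_LM = [
    ((1, "i", "inh"), [(0, "i", "inh")]),
    ((2, "i", "inh"), [(1, "s", "syn")]),
    ((0, "s", "syn"), [(2, "s", "syn")]),
]
# A -> Q R : R.i = r(A.i); Q.i = q(R.s); A.s = f(Q.s)
rules_QR = [
    ((2, "i", "inh"), [(0, "i", "inh")]),
    ((1, "i", "inh"), [(2, "s", "syn")]),    # Q.i uses R.s, and R is to the right
    ((0, "s", "syn"), [(1, "s", "syn")]),
]
# A -> B C : A.s = B.b; B.i = f(C.c, A.s)
rules_BC = [
    ((0, "s", "syn"), [(1, "b", "syn")]),
    ((1, "i", "inh"), [(2, "c", "syn"), (0, "s", "syn")]),
]

print(is_l_attributed(rules_LM))   # True
print(is_l_attributed(rules_QR))   # False
print(is_l_attributed(rules_BC))   # False
```

An SDD is L-attributed only if every one of its productions passes, so a single failing production such as A → Q R is enough to rule the whole SDD out.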
30. Syntax directed translation schemes
•An SDT is a context-free grammar with program fragments
embedded within production bodies.
•Those program fragments are called semantic actions.
•They can appear at any position within a production body.
•Any SDT can be implemented by first building a parse tree
and then performing the actions in left-to-right, depth-first
order.
•Typically, SDTs are implemented during parsing, without
building a parse tree.
Example:
•Infix to postfix conversion (application)
E → E + T { print('+'); }
  | T     { ; }
T → T * F { print('*'); }
  | F     { ; }
F → num   { print(num.lexval); }
Input: 2+3*4
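The translation scheme above can be run during parsing, without building a parse tree. The sketch below is one possible implementation, under assumptions: the left-recursive productions are handled with loops in a hand-written recursive-descent parser (a common rewrite, not something the slides prescribe), output is collected in a list instead of being printed by each action, and the name `to_postfix` is hypothetical.

```python
import re


def to_postfix(text):
    """Translate an infix expression over numbers, + and * into postfix."""
    tokens = re.findall(r"\d+|[+*]", text)   # other characters are ignored
    out = []
    pos = 0

    def parse_F():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        out.append(tokens[pos])              # action: print(num.lexval)
        pos += 1

    def parse_T():
        nonlocal pos
        parse_F()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            parse_F()
            out.append("*")                  # action at the end of T -> T * F

    def parse_E():
        nonlocal pos
        parse_T()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            parse_T()
            out.append("+")                  # action at the end of E -> E + T

    parse_E()
    if pos != len(tokens):
        raise SyntaxError("unexpected input")
    return " ".join(out)


print(to_postfix("2+3*4"))   # 2 3 4 * +
```

Each action fires only after both operands of its operator have been translated, which is why the operator appears after them in the output.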