3. Syntax Analyzer.pptx

Syntax Analysis (Parsing)
- Syntax analysis (or parsing) is concerned with the
conversion of the stream of tokens to a parse tree,
using the production rules of the grammar.
• Second phase of the compiler.
• The parser uses the first components of the tokens
produced by the lexical analyzer to create a tree-like
intermediate representation(parse-tree) that depicts
the grammatical structure of the token stream.

Syntax Analysis
• Recognize sentences in a language with the help
of given grammar rules.
• Discover the structure of a document/program.
• Construct (implicitly or explicitly) a tree (called
a parse tree) to represent the structure.
• The parse tree is used to guide translation.

Example: position = initial + rate * 60;
Lexical Analyzer (Scanner), with token patterns such as:
[a-zA-Z_][a-zA-Z_0-9]*
[0-9]+
….
Token stream passed to the parser:
<id,1> <=> <id,2> <+> <id,3> <*> <num,60> <;>
Syntax Analyzer (Parser), with grammar:
S → id = E;
E → E+E | E*E | id | num

Parsing Example:
A grammar for some statements in C
S→ expr ;
| if ( expr) S
| for( OE;OE;OE) S
| other
OE → expr
| ɛ
Input: for ( ; expr ; expr ) other

Parser Types
• There are three general types of parsers for grammars:
– Universal,
– Top-down
– Bottom-up.
• Universal parsing methods such as the Cocke-Younger-
Kasami algorithm and Earley's algorithm can parse any
grammar.
• These general methods are, however, too inefficient to
use in practical compilers.

• So the methods commonly used in compilers can
be classified as being either top-down or bottom-
up.
• As implied by their names, top-down methods
build parse trees from the top (root) to the
bottom (leaves), while bottom-up methods start
from the leaves and work their way up to the
root.
• In either case, the input to the parser is scanned
from left to right, one symbol at a time.

Context-Free Grammars
• Used to systematically describe the syntax of programming
language constructs like expressions and statements.
• Grammar for simple arithmetic expressions
• expression -> expression + term
• expression -> expression - term
• expression -> term
• term -> term * factor
• term -> term / factor
• term -> factor
• factor -> ( expression )
• factor -> id

Grammars for Arithmetic expression
E →E+E | E*E | (E) | id
E→E+T | T
T →T*F | F
F →(E) | id
E →TE’
E’ →+TE’ | ɛ
T →FT’
T’ →*FT’ | ɛ
F →(E) | id

• The nonterminal symbols are expression, term and factor,
and expression is the start symbol.
• Derivation:
- Given a grammar, a derivation is the process of obtaining a string (of
terminals) by beginning with the start symbol and
repeatedly replacing a nonterminal by the body of a
production for that nonterminal.
• Sentence: A sequence of terminal symbols ω such that
S =>* ω (where S is the start symbol and =>* means "derives in zero or more steps")
• Sentential Form: A sequence of terminal/nonterminal
symbols α such that
S =>* α

Parse Trees and Derivations
• A parse tree(derivation tree) is a graphical representation
of a derivation that filters out the order in which
productions are applied to replace non-terminals.
• Each interior node of a parse tree represents the
application of a production.
• The interior node is labeled with the nonterminal A in the
head of the production.
• The children of the node are labeled, from left to right, by
the symbols in the body of the production by which this A
was replaced during the derivation.

• Grammar:
E -> E + E
E -> id
• Leftmost derivation: the leftmost nonterminal is replaced at each step.
E ⇒ E + E
⇒ id + E
⇒ id + id
• Rightmost derivation: the rightmost nonterminal is replaced at each step.
E ⇒ E + E
⇒ E + id
⇒ id + id

Left Recursion
• A grammar is left recursive if it has a nonterminal A such that there
is a derivation
A ⇒+ Aα for some string α (⇒+ means one or more derivation steps)
• Top-down parsing techniques cannot handle left-recursive
grammars.
• So, we have to convert our left-recursive grammar into an
equivalent grammar which is not left-recursive.
• The left-recursion may appear in a single step of the derivation
(immediate left-recursion), or may appear in more than one step of
the derivation.

Immediate Left-Recursion
A → A α | β    where β does not start with A
⇓ eliminate immediate left recursion
A → β A’
A’ → α A’ | ε    (an equivalent grammar)
In general,
A → A α1 | ... | A αm | β1 | ... | βn    where β1 ... βn do not start with A
⇓
A → β1 A’ | ... | βn A’
A’ → α1 A’ | ... | αm A’ | ε    (an equivalent grammar)
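The transformation above can be sketched in Python. This is a minimal illustration, assuming productions are stored as lists of symbols with the empty list standing for ε; the function name and the data layout are hypothetical choices, not part of the slides.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without left recursion.

    `bodies` is a list of production bodies, each a list of symbols ([] is ε).
    Returns a dict mapping each nonterminal to its new list of bodies.
    """
    new_head = head + "'"                                    # fresh nonterminal A'
    alphas = [b[1:] for b in bodies if b and b[0] == head]   # A αi  gives αi
    betas = [b for b in bodies if not b or b[0] != head]     # the βj alternatives
    if not alphas:
        return {head: bodies}                                # no left recursion
    return {
        head: [beta + [new_head] for beta in betas],                  # A  -> βj A'
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],    # A' -> αi A' | ε
    }


# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | ε
result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
print(result)
```

This handles only immediate left recursion for a single nonterminal; indirect left recursion needs the full algorithm that orders the nonterminals and substitutes earlier ones first.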

Left-Recursion -- Problem
• A grammar may not be immediately left-recursive and still be
left-recursive.
• So eliminating only the immediate left recursion does not guarantee a
grammar that is free of left recursion.
• Example:
S → Aa | b
A → Sc | d
This grammar is not immediately left-recursive, but it is still left-recursive, because
S ⇒ Aa ⇒ Sca or
A ⇒ Sc ⇒ Aac
each derive a string that starts with the nonterminal being expanded.
• So, we have to eliminate all left recursion from our grammar.

Eliminate Left-Recursion -- Algorithm

Eliminate Left-Recursion -- Example

Eliminate Left-Recursion – Example2

Left-Factoring
• A predictive parser (a top-down parser without
backtracking) insists that the grammar must be
left-factored.
stmt → if expr then stmt else stmt |
if expr then stmt
• when we see if, we cannot know which
production rule to choose to substitute stmt in
the derivation.

Top-Down Parsing
• Top-down parsing can be viewed as the problem of constructing a
parse tree for the input string, starting from the root node and
creating other nodes in preorder.
• Equivalently, top-down parsing can be viewed as finding a leftmost
derivation for an input string.
• Top-Down Parsing of id+id*id according to grammar:
E -> TE’
E’ -> +TE’ | ε
T -> FT’
T’ -> *FT’ | ε
F -> (E) | id

• Show sequence of tree building here…….
• At each step of a top-down parse, the key
problem is to determine the production to be
applied for a non-terminal.
• Once a production is chosen, the rest of
parsing process consists of matching the
terminals in the production body with the
input string.

Top-Down: Recursive-Descent Parsing
• Recursive-descent parsing is a top-down method of syntax analysis in which a set of
recursive procedures is used to process the input. One procedure is associated with
each non-terminal of a grammar.
• A production for a non-terminal is selected by trial and error.
Ex:- for grammar : A -> X1X2…. Xk | Y1Y2……Yl
• proc_A(){
// Try the A-production A -> X1X2…. Xk;
// if it fails, restore the input position and try another.
for ( i = 1 to k ) {
if (Xi is a nonterminal)
call procedure Xi ();
else if (Xi equals the current input symbol )
advance the input to the next symbol;
else /* An error has occurred: abandon this production */;
}
// Try the A-production A -> Y1Y2……Yl
for(i=1 to l) { …..
……
}
}

• Execution begins with the procedure for the start symbol, which
halts and announces success if its procedure body scans the entire
input string.
• This pseudo code is non-deterministic, since it begins by choosing
the A-production to apply in a manner that is not specified and so
may require backtracking.
• A left-recursive grammar can cause a recursive-descent parser to go
into an infinite loop.
• Backtracking parsers are not seen frequently, because backtracking is
rarely needed to parse programming-language constructs.

Predictive Parsing
• To avoid backtracking, a parser can look ahead one
or more symbols to predict which production rule to
apply; this is called predictive
parsing.
• Does not require backtracking.
• Chooses the correct production by looking ahead at the
input a fixed number of symbols (usually one symbol).
• The class of grammars for which we can construct
predictive parsers looking k symbols ahead in the input is
sometimes called the LL(k) class.

• Predictive parsing requires the computation of a set
of symbols for each non-terminal in the language,
known as the FIRST set.
• FIRST (or starter) symbols: any symbol which may appear
at the start of a string generated by the non-terminal.
• For predictive parsing to work, for each rule of the form
A -> α | β
FIRST(α) and FIRST(β) must be disjoint.

• We can often create appropriate grammar for
predictive parsing by :
– removing ambiguity.
– removing left-recursion.
– left factoring the resulting grammar.

Recursive Predictive Parsing
• Each non-terminal corresponds to a procedure.
Ex: A → aBb (This is only the production rule for A)
proc_A {
- match the current token with a, and move to
the next token;
- call proc_B;
- match the current token with b, and move to
the next token;
}

• A → aBb | bAB
proc_A {
case of the current token {
‘a’: - match the current token with a, and move to the next token;
- call proc_B;
- match the current token with b, and move to the next token;
‘b’: - match the current token with b, and move to the next token;
- call proc_A;
- call proc_B;
}
}

• When to apply ε-productions.
A → aA | bB | ε
• If all other productions fail, we should apply an ε-
production. For example, if the current token is not a
or b, we may apply the ε-production.
• Most correct choice: We should apply an ε-production
for a nonterminal A when the current token is in the
follow set of A (which terminals can follow A in the
sentential forms).
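As an illustration of one procedure per nonterminal and of applying an ε-production only when the current token is in the FOLLOW set, here is a small Python sketch for the expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id. The class and method names are hypothetical, and the FOLLOW sets of E’ and T’ are hard-coded rather than computed.

```python
class ParseError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # $ marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def E(self):                              # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):                        # E' -> + T E' | ε
        if self.lookahead == "+":
            self.match("+")
            self.T()
            self.E_prime()
        elif self.lookahead not in (")", "$"):    # ε only if token in FOLLOW(E')
            raise ParseError(f"unexpected {self.lookahead!r}")

    def T(self):                              # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):                        # T' -> * F T' | ε
        if self.lookahead == "*":
            self.match("*")
            self.F()
            self.T_prime()
        elif self.lookahead not in ("+", ")", "$"):   # FOLLOW(T')
            raise ParseError(f"unexpected {self.lookahead!r}")

    def F(self):                              # F -> ( E ) | id
        if self.lookahead == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")


def accepts(tokens):
    """Return True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.match("$")
        return True
    except ParseError:
        return False


print(accepts(["id", "+", "id", "*", "id"]))
print(accepts(["id", "+"]))
```

Because every choice is made from a single lookahead token, no procedure ever has to undo input it has already consumed.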

FIRST and FOLLOW
FIRST(α) is the set of all terminals that begin any string derived from α.
Computing FIRST:
– If X is a terminal, FIRST(X) = {X}
– If X -> ε is a production, add ε to FIRST(X)
– If X is a nonterminal and X -> Y1Y2…Yn is a production:
• For all terminals a, add a to FIRST(X) if a is a member of any
FIRST(Yi) and ε is a member of FIRST(Y1), FIRST(Y2), …
FIRST(Yi-1)
• If ε is a member of FIRST(Y1), FIRST(Y2), … FIRST(Yn), add ε to
FIRST(X)
Note: We can now compute FIRST for any string X1X2…Xn as follows.
FIRST(X1X2…Xn) contains the non-ε symbols of FIRST(X1). If FIRST(X1) contains ε, the non-ε symbols of FIRST(X2) are
also added, and so on. Finally, add ε if ε is in FIRST(Xi) for all i.
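The rules for FIRST can be applied repeatedly until nothing changes. The following Python sketch does this for the expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id; the dictionary layout and the function names are assumptions made for the example.

```python
EPS = "ε"

GRAMMAR = {                      # [] stands for an ε-production
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a string X1 X2 ... Xn, given the FIRST sets of nonterminals."""
    result = set()
    for X in symbols:
        fx = first[X] if X in first else {X}   # terminal: FIRST(X) = {X}
        result |= fx - {EPS}
        if EPS not in fx:
            return result                      # Xi cannot vanish: stop here
    result.add(EPS)                            # every Xi can derive ε
    return result


def compute_first(grammar):
    """Iterate the FIRST rules until no set grows any further."""
    first = {A: set() for A in grammar}
    changed = True
    while changed:
        changed = False
        for A, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[A]:
                    first[A] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
for A in GRAMMAR:
    print(A, sorted(FIRST[A]))
```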

• FOLLOW(A), for any nonterminal A, is the set
of terminals a that can appear immediately to
the right of A in some sentential form.
• More formally, a is in FOLLOW(A) if and only if
there exists a derivation of the form S =>* αAaβ.
• $ is in FOLLOW(A) if and only if there exists a
derivation of the form S =>* αA.

Computing FOLLOW
• Place $ in FOLLOW(S), where S is the start symbol.
• If there is a production A -> αBβ, then everything
in FIRST(β) (except for ε) is in FOLLOW(B).
• If there is a production A -> αB, or a production
A -> αBβ where FIRST(β) contains ε, then
everything in FOLLOW(A) is also in FOLLOW(B).

• Grammar: E -> TE’
E’ -> +TE’| ε
T -> FT’
T’ -> *FT’| ε
F -> (E) | id
• FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
• FIRST(E’) = {+, ε}
• FIRST(T’) = {*, ε}
• FOLLOW(E) = FOLLOW(E’) = {), $}
• FOLLOW(T) = FOLLOW(T’) = {+, ), $}
• FOLLOW(F) = {+, *, ), $}
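The FOLLOW sets listed above can be reproduced mechanically. The sketch below takes the FIRST sets from the slide as given and applies the three FOLLOW rules until the sets stop growing; the names `GRAMMAR`, `FIRST` and `compute_follow` are hypothetical.

```python
EPS, END = "ε", "$"

GRAMMAR = {                      # [] stands for an ε-production
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

FIRST = {                        # taken from the FIRST sets listed above
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}


def first_of_string(symbols):
    result = set()
    for X in symbols:
        fx = FIRST.get(X, {X})               # terminal: FIRST(X) = {X}
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)                          # empty or fully nullable string
    return result


def compute_follow(grammar, start):
    follow = {A: set() for A in grammar}
    follow[start].add(END)                           # rule 1
    changed = True
    while changed:
        changed = False
        for A, bodies in grammar.items():
            for body in bodies:
                for i, B in enumerate(body):
                    if B not in grammar:
                        continue                     # FOLLOW is for nonterminals only
                    beta_first = first_of_string(body[i + 1:])
                    new = beta_first - {EPS}         # rule 2
                    if EPS in beta_first:            # rule 3: β is empty or nullable
                        new |= follow[A]
                    if not new <= follow[B]:
                        follow[B] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
for A in GRAMMAR:
    print(A, sorted(FOLLOW[A]))
```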

Creating a Predictive Parsing Table
• Table rows: non-terminals; columns: terminals, including the special end
marker symbol $.
• Entries in Table:

Non-recursive Predictive Parsing

• Initially, the parser is in a configuration with w$ in the
input buffer and the start symbol S of G on top of the
stack, above $ (a special symbol not present in G).
• The symbol at the top of the stack (say X) and the
current symbol in the input string (say a) determine
the parser action.
• There are four possible parser actions.
1. If X and a are $ :- parser halts (successful
completion).
2. If X and a are the same terminal symbol (different
from $) :- parser pops X from the stack and advances to the
next symbol in the input buffer.

3. If X is a non-terminal :-
parser looks at the parsing table entry M[X,a]. If
M[X,a] holds a production rule X→Y1Y2...Yk, it pops X
from the stack and pushes Yk,Yk-1,...,Y1 into the
stack. The parser also outputs the production rule
X→Y1Y2...Yk to represent a step of the derivation.
4. none of the above :- error
– all empty entries in the parsing table are errors.
– If X is a terminal symbol different from a, this is
also an error case.
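The four parser actions can be sketched as a short table-driven loop. The table below is an assumption: it was written by hand for the grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id, using the usual rule that A → α goes into M[A, a] for every a in FIRST(α), and into M[A, b] for every b in FOLLOW(A) when α can derive ε.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

TABLE = {                                   # M[X, a] -> production body ([] is ε)
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}


def predictive_parse(tokens, start="E"):
    """Return the productions used, in leftmost-derivation order."""
    buffer = list(tokens) + ["$"]
    stack = ["$", start]                    # top of the stack is the end of the list
    ip = 0
    output = []
    while True:
        X, a = stack[-1], buffer[ip]
        if X == "$" and a == "$":           # action 1: success
            return output
        if X not in NONTERMINALS:           # X is a terminal (or $)
            if X != a:                      # action 4: mismatch
                raise SyntaxError(f"expected {X!r}, found {a!r}")
            stack.pop()                     # action 2: match and advance
            ip += 1
        else:
            body = TABLE.get((X, a))
            if body is None:                # action 4: empty table entry
                raise SyntaxError(f"no entry for M[{X}, {a}]")
            stack.pop()                     # action 3: replace X by Y1...Yk,
            stack.extend(reversed(body))    # pushed so that Y1 ends up on top
            output.append((X, body))


steps = predictive_parse(["id", "+", "id", "*", "id"])
for head, body in steps:
    print(head, "->", " ".join(body) or "ε")
```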

Using a Predictive Parsing Table

• For following grammar and string, create parsing table for
predictive parser and show the parsing steps :
i) Grammar: S -> +SS | *SS | a
and input: +*aaa
ii) Grammar: S -> S+S | SS | (S) | S* | a
and input : (a+a)*a

Bottom-Up Parsing
• A bottom-up parse corresponds to the
construction of a parse tree for an input string
beginning at the leaves (the bottom) and working
up towards the root (the top).
• A general style of bottom-up parsing is known as
shift-reduce parsing.
• The largest class of grammars for which shift-
reduce parsers can be built is the LR grammars.
Reductions
• We can think of bottom-up parsing as the process
of "reducing" a string w to the start symbol of the
grammar.
• At each reduction step, a specific substring
matching the body of a production is replaced by
the nonterminal at the head of that production.
• The key decisions during bottom-up parsing are
about when to reduce and about what
production to apply, as the parse proceeds.

• Handle: A "handle" is a substring that matches
the body of a production, and whose reduction
represents one step along the reverse of a
rightmost derivation.
• Ex- Given Grammar: E -> E + T | T
T -> T * F | F
F -> (E) | id
Parse string id*id

Shift-Reduce Parsing
• Shift-reduce parsing is a form of bottom-up
parsing in which a stack holds grammar symbols
and an input buffer holds the rest of the string to
be parsed.
• An LR parser makes its shift-reduce decisions by
maintaining states (sets of items) that record where it is in the parse.
• The handle always appears at the top of the
stack.
• Use $ to mark the bottom of the stack and also
the right end of the input.

• Initially, the stack is empty, and the string w is on the
input, as follows:
Initial Configuration:
Stack Input
$ w$
Acceptance Configuration:
Stack Input
$S $

LR-Parsers
• LR parsers cover a wide range of grammars.
• SLR – simple LR parser
• LR – canonical LR parser, the most general
• LALR – lookahead LR parser, intermediate in power
between SLR and canonical LR
• SLR, LR and LALR parsers work the same way (they use the
same algorithm); only their parsing tables are
different.

• We divide all the sets of items of interest into
two classes:
1. Kernel items: the initial item, S' -> .S, and all
items whose dots are not at the left end.
2. Nonkernel items: all items with their dots at
the left end, except for S' -> .S

The Closure Operation
• If I is a set of LR(0) items for a grammar G, then
closure(I) is the set of LR(0) items constructed
from I by the two rules:
1. Initially, every LR(0) item in I is added to
closure(I).
2. If A → α.Bβ is in closure(I) and B→ γ is a
production rule of G; then B→.γ will be in the
closure(I).
We will apply this rule until no more new LR(0)
items can be added to closure(I).
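A minimal Python sketch of the closure operation follows, assuming an item is stored as a tuple (head, body, dot position) and using the augmented expression grammar E’ → E, E → E+T | T, T → T*F | F, F → (E) | id. The representation is a choice made for this example only.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """Closure of a set of LR(0) items, each item being (head, body, dot)."""
    result = set(items)                      # rule 1: every item of I is in closure(I)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:   # dot stands before nonterminal B
            B = body[dot]
            for gamma in GRAMMAR[B]:         # rule 2: add B -> .γ
                item = (B, gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return result


I0 = closure({("E'", ("E",), 0)})
for head, body, dot in sorted(I0):
    print(head, "->", " ".join(body[:dot]) + " . " + " ".join(body[dot:]))
```

The worklist plays the role of "apply this rule until no more new LR(0) items can be added": an item is examined once, when it is first added.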

The Closure Operation -- Example

Construction of The Canonical LR(0)
Collection

The Canonical LR(0) Collection --
Example

Transition Diagram (DFA)
( LR(0)Automaton )

Constructing SLR Parsing Table
(of an augmented grammar G’)

• Productions of the given grammar are numbered:
1. E -> E +T
2. E -> T
3. T -> T * F
4. T -> F
5. F -> ( E )
6. F -> id
• Each set of items Ii of the canonical collection C is
used as a state.
These codes are used for the actions in table :
1. si means shift and stack state i,
2. rj means reduce by the production numbered j,
3. acc means accept,
4. blank means error.

Parsing Tables of Expression Grammar

LR-parsing algorithm.
• INPUT: An input string w and an LR-parsing table with functions ACTION and GOTO for a grammar G.
• OUTPUT: If w is in L(G), the reduction steps of a bottom-up parse for w; otherwise, an error indication.
• METHOD: Initially, the parser has s0 on its stack, where s0 is the initial state, and w$ in the input buffer.
let a be the first symbol of w$;
while(1) { /* repeat forever */
let s be the state on top of the stack;
if ( ACTION[s, a] = shift t ) {
push t onto the stack;
let a be the next input symbol;
} else if ( ACTION[s, a] = reduce A -> β ) {
pop |β| symbols off the stack;
let state t now be on top of the stack;
push GOTO[t, A] onto the stack;
output the production A -> β;
} else if ( ACTION[s, a] = accept ) break; /* parsing is done */
else call error-recovery routine;
}
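The algorithm translates almost line by line into Python. In the sketch below the ACTION and GOTO tables are an assumption: they were transcribed by hand as the usual SLR table for the six numbered productions of the expression grammar (si = shift to state i, rj = reduce by production j), and should be checked against the table built from the LR(0) automaton.

```python
# production number -> (head, length of body)
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}


def reduce_row(j):
    """A row that reduces by production j on +, *, ) and $."""
    return {"+": f"r{j}", "*": f"r{j}", ")": f"r{j}", "$": f"r{j}"}


ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: reduce_row(4),
    4: {"id": "s5", "(": "s4"},
    5: reduce_row(6),
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: reduce_row(3),
    11: reduce_row(5),
}

GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def lr_parse(tokens):
    """Return the numbers of the productions used, in order of reduction."""
    buffer = list(tokens) + ["$"]
    stack = [0]                              # stack of states, s0 at the bottom
    ip = 0
    output = []
    while True:
        s, a = stack[-1], buffer[ip]
        act = ACTION[s].get(a)
        if act is None:                      # blank entry: error
            raise SyntaxError(f"unexpected {a!r} in state {s}")
        if act == "acc":
            return output
        if act[0] == "s":                    # shift t
            stack.append(int(act[1:]))
            ip += 1
        else:                                # reduce A -> β
            j = int(act[1:])
            head, length = PRODUCTIONS[j]
            del stack[-length:]              # pop |β| states (no ε-bodies here)
            stack.append(GOTO[(stack[-1], head)])
            output.append(j)


reductions = lr_parse(["id", "*", "id", "+", "id"])
print(reductions)
```

Read in reverse, the output is a rightmost derivation of the input.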

Moves of parser on id * id + id

SLR(1) Grammar
• An LR parser using SLR(1) parsing tables for a
grammar G is called the SLR(1) parser for G.
• If an SLR(1) parsing table can be created for a grammar G,
G is called an SLR(1) grammar (or SLR
grammar for short).
• Every SLR grammar is unambiguous, but not every
unambiguous grammar is an SLR grammar.

shift/reduce and reduce/reduce
conflicts
• If a state does not know whether it will make a shift
operation or reduction for a terminal, we say that there
is a shift/reduce conflict.
• If a state does not know whether it will make a
reduction operation using the production rule i or j for
a terminal, we say that there is a reduce/reduce
conflict.
• If the SLR parsing table of a grammar G has a conflict,
we say that G is not an SLR grammar.

Conflict Example (Shift-Reduce)

Conflict Example2 (Reduce-Reduce)

Handles and Viable prefixes
• Handle: A substring (of a sentential form) that
matches the body of a production, and whose
reduction represents one step along the reverse
of a rightmost derivation.
• A substring (of a sentential form) that matches
the body of some production need not be a handle.
• The handle may not be unique if the grammar is
ambiguous.

For input id1*id2:
Right sentential form    Handle    Reducing production
id1*id2                  id1       F → id
F*id2                    F         T → F
T*id2                    id2       F → id
T*F                      T*F       T → T*F
T                        T         E → T
E
Here T is not a handle in the sentential form T*id2, although it matches the
body of E → T.

viable prefix
• Definition: A viable prefix is a prefix of a right
sentential form that does not continue past the right
end of the rightmost handle of that sentential form.
• The prefixes of right sentential forms that can appear
on the stack of a shift-reduce parser are called viable
prefixes.
• Ex: E =>* F*id => (E)*id
Viable prefixes (can appear on stack) : (, (E, (E) but not (E)*

• Importance of Viable Prefixes:
-The entire SLR parsing algorithm is based on
the idea that the LR(0) automaton can
recognize viable prefixes and reduce them
appropriately.
- Equivalently, this means that the set of viable
prefixes for a given SLR (1) grammar is a
REGULAR language!

Canonical Collection of Sets of LR(1)
Items

Construction of The Canonical LR(1)
Collection

A Short Notation for The Sets of LR(1) Items

Construction of LR(1) Parsing Tables

Canonical LR(1) Collection – Example

Creation of LALR Parsing Tables

Efficient Creation of LALR Parsing
Tables
• First, we can represent any set of LR(0) or LR(1) items I
by its kernel.
• We can construct the LALR(1)-item kernels from the
LR(0)-item kernels by a process of propagation and
spontaneous generation of lookaheads.(described
later)
• If we have the LALR(1) kernels, we can generate the
LALR(1) parsing table by closing each kernel, using the
function CLOSURE, and then computing table entries,
as if the LALR(1) sets of items were canonical LR(1) sets
of items.

Algorithm : Determining lookaheads.
INPUT: The kernel K of a set of LR(0) items I and a grammar symbol X.
OUTPUT: The lookaheads spontaneously generated by items in I for kernel items in GOTO(I,X) and
the items in I from which lookaheads are propagated to kernel items in GOTO(I,X).
for (each kernel item A → α.β in K) {
    J := CLOSURE({[A → α.β, #]});
    if ( [B → γ.Xδ, a] is in J and a is not # )
        conclude that lookahead a is generated spontaneously
        for item B → γX.δ in GOTO(I,X) [keep this information in a table:
        Computation of lookaheads];
    if ( [B → γ.Xδ, #] is in J )
        conclude that lookaheads propagate from A → α.β in I
        to B → γX.δ in GOTO(I,X) [keep this information in a table: Propagation of
        lookaheads];
}

Algorithm : Efficient computation of the kernels
of the LALR(1) collection of sets of items.
INPUT: An augmented grammar G'.
OUTPUT: The kernels of the LALR(1) collection of sets of items
for G'.
METHOD:
1. Construct the kernels of the sets of LR(0) items for
G.
If space is not at a premium, the simplest way is to construct the
LR(0) sets of items, and then remove the nonkernel items. If space
is severely constrained, we may wish instead to store only the
kernel items for each set, and compute GOTO for a set of items I by
first computing the closure of I.

2. Determining lookaheads
Apply previous Algorithm to the kernel of each set of LR(0) items
and grammar symbol X to determine which lookaheads are
spontaneously generated for kernel items in GOTO(I, X), and from
which items in I lookaheads are propagated to kernel items in
GOTO(I,X).
3. Initialize a table that gives, for each kernel item in
each set of items, the associated lookaheads.
Initially, each item has associated with it only those lookaheads that
we determined in step (2) were generated spontaneously.

4. Make repeated passes over the kernel items in all
sets.
When we visit an item i, we look up the kernel items to which i
propagates its lookaheads, using information tabulated in
step (2). The current set of lookaheads for i is added to those
already associated with each of the items to which i
propagates its lookaheads. We continue making passes over
the kernel items until no more new lookaheads are
propagated.

Practice
Given augmented G’ :
S’ → S
S → L=R | R
L → *R | id
R → L
- Create the LALR parsing table using the efficient
method, i.e. without constructing the canonical LR(1) collection.

LR(0) and SLR(1) parser
• SLR(1) is an easy extension of LR(0).
• The SLR parsing table is created in the same way as the LR(0) parsing
table, except for the reduce entries:
– Add reduction by a production A -> α only in the
columns of symbols in FOLLOW(A) instead of all
columns.
• SLR parsing table eliminates some conflicts of
LR(0) parsing table by using 1 lookahead symbol
to decide the reduction i.e. apply reduction by A -> α
only when next symbol(lookahead) is in FOLLOW(A).

Classification of Grammars (used in parsing)

A grammar is
• LL(1) if its LL(1) parsing table has no conflicts.
• LR(0) if its LR(0) parsing table has no conflicts.
• SLR if its SLR parsing table has no conflicts.
• LALR(1) if its LALR(1) parsing table has no conflicts.
• LR(1) if its LR(1) parsing table has no conflicts.

Operator Precedence Parsing
• Operator precedence parsing is a bottom-up
parsing technique that can be applied to an operator-
precedence grammar (or simply operator grammar).
• Operator grammar:
an operator precedence grammar is a context-free
grammar that has the property that no production
has either an empty right-hand side or two adjacent
nonterminals in its right-hand side.

• Consider:
E → EAE | - E | ( E ) | id
A → - | + | * | / | ^
is not an operator grammar (the body EAE has adjacent nonterminals),
• but:
E → E - E | E + E | E * E | E / E | E ^ E | - E
| ( E ) | id
is an operator grammar.

Precedence Relations
• Operator precedence grammars rely on the
following three precedence relations between
the terminals:
Relation Meaning
a <· b a yields precedence to b
a =· b a has the same precedence as b
a ·> b a takes precedence over b

• These operator precedence relations make it possible to delimit the handles in
the right sentential forms:
– <• marks the left end,
– =• appears in the interior of the handle, and
– •> marks the right end.
• Contrary to other shift-reduce parsers, all nonterminals are
considered equal for the purpose of identifying handles.
• The relations do not have the same properties as their un-dotted
counterparts; e. g.
a =• b does not generally imply b =• a, and
b •> a does not follow from a <• b.
Furthermore, a =• a does not generally hold, and a •> a is possible.

• Example
The following operator precedence relations can be introduced for simple
expressions (row symbol on the left, column symbol on the right):
      id    +     *     $
id          ·>    ·>    ·>
+     <·    ·>    <·    ·>
*     <·    ·>    ·>    ·>
$     <·    <·    <·    ·>
They follow from the following facts:
– + has lower precedence than * (hence + <• * and * •> +).
– Both + and * are left-associative (hence + •> + and * •> *).

Using Operator-Precedence Relations
• GOAL: delimit the handle of a right sentential form
- <. will mark the beginning, .> will mark the end and .=.
will be in between.
• Since no two adjacent non-terminals appear in the RHS of
any production, the same is true for any sentential form.
• So given β0 a1 β1 a2 β2 … an βn
where each βi is either a nonterminal or the empty string.
• We drop all non-terminals and we write the corresponding relation
between each consecutive pair of terminals.
• Example for $id+id*id$ using standard precedence:
$<.id.>+<.id.>*<.id.>$
• Example for $E+E*id$ … $<.+<.*<.id.>$

Using Operator-Precedence
• … Then
1. Scan the string to discover the first .>
2. Scan backwards skipping .=. (if any) until a <. is
found.
3. The handle is the substring delimited by the two
steps above (including any in-between or surrounding
non-terminals).
E.g.
Consider the sentential form E+E*E
we obtain $+*$ and from this the string
$<. + <. * .> $
• The handle is E*E

Operator Precedence Table Construction
• Basic techniques for operators:
– If operator θ1 has higher precedence than θ2,
then set θ1 .> θ2 and θ2 <. θ1
– If the operators are of equal precedence (or the same
operator)
set θ1 .> θ2 and θ2 .> θ1 if the operators associate to the
left
set θ1 <. θ2 and θ2 <. θ1 if the operators associate to the
right
– For every operator θ, make θ <. ( and ( <. θ and ) .> θ and θ .> )
– id has higher precedence than any other symbol
– $ has lowest precedence.

Operator-Precedence Relation Table
      +    -    *    /    ^    id   (    )    $
+     .>   .>   <.   <.   <.   <.   <.   .>   .>
-     .>   .>   <.   <.   <.   <.   <.   .>   .>
*     .>   .>   .>   .>   <.   <.   <.   .>   .>
/     .>   .>   .>   .>   <.   <.   <.   .>   .>
^     .>   .>   .>   .>   <.   <.   <.   .>   .>
id    .>   .>   .>   .>   .>             .>   .>
(     <.   <.   <.   <.   <.   <.   <.   =·
)     .>   .>   .>   .>   .>             .>   .>
$     <.   <.   <.   <.   <.   <.   <.

Constructing precedence functions
Method:
1. Create symbols ft and gt for each t that is a terminal or $.
2. Partition the created symbols into as many groups as possible, in
such a way that if a =. b, then fa and gb are in the same group.
3. Create a directed graph whose nodes are the groups found in (2). For
any a and b, if a <. b, place an edge from the group of gb to the
group of fa. If a .> b, place an edge from the group of fa to that of gb.
4. If the graph constructed in (3) has a cycle, then no precedence
functions exist. If there are no cycles, let f(a) be the length of the
longest path beginning at the group of fa; let g(a) be the length of the
longest path beginning at the group of ga.
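The four steps can be sketched in Python as follows. The relation table used here is the small id, +, *, $ table for simple expressions, written with "<", "=" and ">" for the three relations; the $/$ cell is left out, which is an assumption of this example (the parser tests for acceptance before consulting the table). The function name and data layout are hypothetical.

```python
REL = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}


def precedence_functions(terminals, rel):
    """Return (f, g) as dicts, or None when the graph has a cycle."""
    # Steps 1-2: one node per f_a and g_a; merge f_a with g_b whenever a =. b.
    parent = {(kind, t): (kind, t) for kind in "fg" for t in terminals}

    def find(node):
        while parent[node] != node:
            node = parent[node]
        return node

    for (a, b), r in rel.items():
        if r == "=":
            parent[find(("f", a))] = find(("g", b))

    # Step 3: a <. b gives an edge g_b -> f_a; a .> b gives an edge f_a -> g_b.
    edges = {find(node): set() for node in parent}
    for (a, b), r in rel.items():
        fa, gb = find(("f", a)), find(("g", b))
        if r == "<":
            edges[gb].add(fa)
        elif r == ">":
            edges[fa].add(gb)

    # Step 4: longest path from each group; meeting an "open" node means a cycle.
    length = {}

    def longest(node):
        if length.get(node) == "open":
            raise ValueError("cycle")
        if node not in length:
            length[node] = "open"
            length[node] = max((1 + longest(m) for m in edges[node]), default=0)
        return length[node]

    try:
        f = {t: longest(find(("f", t))) for t in terminals}
        g = {t: longest(find(("g", t))) for t in terminals}
    except ValueError:
        return None
    return f, g


f, g = precedence_functions(["id", "+", "*", "$"], REL)
print("f:", f)
print("g:", g)
```

With these functions, a <. b can be tested as f(a) < g(b) and a .> b as f(a) > g(b), so two small arrays replace the full relation table.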

Operator Precedence Parsing Algorithm
Initialize: push $ onto the stack; set ip to point to the first symbol of w$
Repeat:
    if $ is on the top of the stack and ip points to $ then return (Success)
    else
        let a be the top terminal on the stack, and b the symbol pointed to by ip
        if a <• b or a =• b then
            push b onto the stack
            advance ip to the next input symbol
        else if a •> b then
            repeat
                pop the stack
            until the top stack terminal is related by <• to the terminal
            most recently popped
        else error()
end
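A direct Python rendering of this loop is sketched below, using the small id, +, *, $ relation table for simple expressions (with "<", "=" and ">" standing for the three relations). As in the pseudocode, only terminals are kept on the stack, so the reported handles list terminals only. The table encoding and the function name are assumptions of this example, and the $/$ cell is omitted because acceptance is tested first.

```python
REL = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}


def op_precedence_parse(tokens):
    """Return the handles (terminals only) in the order they are reduced."""
    buffer = list(tokens) + ["$"]
    stack = ["$"]
    ip = 0
    handles = []
    while True:
        a, b = stack[-1], buffer[ip]
        if a == "$" and b == "$":
            return handles                       # success
        relation = REL.get((a, b))
        if relation in ("<", "="):               # shift
            stack.append(b)
            ip += 1
        elif relation == ">":                    # reduce: pop back to the nearest <.
            handle = []
            while True:
                popped = stack.pop()
                handle.insert(0, popped)
                if REL.get((stack[-1], popped)) == "<":
                    break
            handles.append(handle)
        else:                                    # no relation between a and b
            raise SyntaxError(f"no precedence relation between {a!r} and {b!r}")


handles = op_precedence_parse(["id", "+", "id", "*", "id"])
print(handles)
```

Since nonterminals are ignored, this sketch does not check a handle against the production bodies, so some ill-formed inputs (a lone + for instance) are not rejected; a full parser would add that check at each reduction.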

Operator Precedence Parsing Algorithm
Initialize: push $ to stack, Set ip to point to the first symbol of w$
Repeat:
    - If $ is on the top of the stack and ip points to $ then return (Success)
    - Obtain the OP relation between the top terminal symbol on the stack and the next input
    symbol
    If the OP relation is <. or =.
        - Stack the input symbol.
    Else if the OP relation is .>
        - Pop the top of the stack into the handle, including a non-terminal symbol if
        appropriate
        - Obtain the relation between the top terminal symbol on the stack and
        the leftmost terminal symbol in the handle
        - While the OP relation between these terminal symbols is =. do
            - Pop the top terminal symbol and the associated non-terminal
            symbol on the stack into the handle
            - Obtain the OP relation between the top terminal symbol on
            the stack and the leftmost terminal symbol in the handle
        - Match the handle against the RHS of all productions
        - Push generic nonterminal N onto the stack
    Else return Error

Disadvantages of Operator Precedence
Parsing
• Disadvantages:
– It cannot handle the unary minus (the lexical analyzer
should handle the unary minus).
– Small class of grammars.
– Difficult to decide which language is recognized by the
grammar.
• Advantages:
– simple
– powerful enough for expressions in programming
languages

Precedence Relationship
1. For each nonterminal, construct a Firstop list containing the first
terminal in each production for that nonterminal. Where a
nonterminal is the first symbol on the right side, include both it
and the first terminal following e.g.
For X -> a…. | Bc….
include a,c and B in X’s Firstop list.
2. Similarly, construct a Lastop list for each nonterminal, e.g.
for Y -> …..u | ….vW
include u,v and W in Y’s Lastop list.
3. Compute the Firstop+ and Lastop+ lists as follows:
a. Take each nonterminal in turn, in any order and look for it in all
the Firstop lists. Add its own first symbol list to any other which it
contains. Similarly process the Lastop lists.
b. The nonterminals may now be deleted from the lists.

4. Construct the precedence matrix by the following rules:
a. whenever terminal a immediately precedes
nonterminal B in any production, put a <. c, where c is
any terminal in the Firstop+ list for B.
b. whenever terminal b immediately follows nonterminal
C in any production, put d .> b, where d is any terminal in
the Lastop+ list for C.
c. Whenever a sequence aBc or ac occurs in any
production, put a .= c
5. Add the relations $ <. a and a .> $ for all terminals in the
Firstop+ and Lastop+ lists respectively, for S.

• Consider grammar:
S -> A
A -> T | A+T | A-T
T -> F | T*F | T/F
F -> P | P^F
P -> i | n | (A)
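Steps 1 to 3 of the procedure can be tried on this grammar with a short Python sketch. It builds the Firstop and Lastop lists, repeats step 3a until nothing changes, and then drops the nonterminals. The function name `op_lists` and the dictionary layout are hypothetical.

```python
GRAMMAR = {
    "S": [["A"]],
    "A": [["T"], ["A", "+", "T"], ["A", "-", "T"]],
    "T": [["F"], ["T", "*", "F"], ["T", "/", "F"]],
    "F": [["P"], ["P", "^", "F"]],
    "P": [["i"], ["n"], ["(", "A", ")"]],
}


def op_lists(grammar, last=False):
    """Firstop+ lists (Lastop+ lists when last=True), terminals only."""
    lists = {}
    for head, bodies in grammar.items():            # steps 1 and 2
        entry = set()
        for body in bodies:
            symbols = body[::-1] if last else body  # read from the right for Lastop
            entry.add(symbols[0])
            if symbols[0] in grammar and len(symbols) > 1:
                entry.add(symbols[1])               # terminal next to the nonterminal
        lists[head] = entry
    changed = True
    while changed:                                  # step 3a, repeated to a fixed point
        changed = False
        for head in grammar:
            for nt in [x for x in lists[head] if x in grammar]:
                if not lists[nt] <= lists[head]:
                    lists[head] |= lists[nt]
                    changed = True
    # step 3b: delete the nonterminals from the lists
    return {h: {x for x in s if x not in grammar} for h, s in lists.items()}


FIRSTOP = op_lists(GRAMMAR)
LASTOP = op_lists(GRAMMAR, last=True)
for head in GRAMMAR:
    print(head, sorted(FIRSTOP[head]), sorted(LASTOP[head]))
```

The precedence matrix of step 4 can then be filled in by scanning each production body for the patterns aB, Cb, aBc and ac.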
