3. GRADING AND POLICY
Grades will be based on 100 possible points, using the following distribution schedule:
Assignments: 5%
Mid-term Exam: 10%
Group Project: 10%
Practical: 10%
Attendance: 5%
Final Exam: 60%
Dr. Hussien M. Sharaf
4. COURSE OUTLINE AND REFERENCES
Chapter one - Introduction
Chapter three - Lexical analysis definition using Regular Expressions
Chapter three - Lexical analysis using DFA
Chapter three - Lexical analysis using NFA and converting an NFA to a DFA.
5. COURSE OUTLINE AND REFERENCES
Chapter four - Syntax analysis using CFG.
Chapter four - Syntax analysis, Parse trees and Ambiguity.
Chapter four - Removing Left Recursion and Left Factoring.
Chapter four - Syntax analysis (CFG) using Top-down parsing.
6. COURSE OUTLINE AND REFERENCES
Chapter four - FIRST and FOLLOW sets.
Chapter four - Syntax analysis (CFG) using Bottom-up (LR) parsing.
Chapter four - Construction of LR parsing tables and the LL(1) parsing table.
Semantic analysis.
Intermediate code and code generation.
7. LECTURE 1 OUTLINE
Introduction is split into two lectures:
Lec1: Overview
What are compilers?
Phases (architecture) of a compiler.
Some data structures that are required for a compiler’s work:
  Token
  Symbol table
  Literal table
  Parse tree
8. LECTURE 2 OUTLINE
Lec2: Overview
Phases of a Compiler:
Scanning.
Parsing.
Semantic analysis.
Intermediate code generation.
Code generation.
9. WHAT ARE COMPILERS?
A program that translates one language to another.
Responsibility:
1. Accepts a source program typically written in a high-level language.
2. Produces an equivalent target program typically in assembly or machine language.
3. Reports error messages as part of the translation process.
[Diagram: Source program (usually high-level language) -> compiler -> Target program (usually machine language); the compiler also outputs error messages.]
10. COUSINS OF COMPILERS
1. Interpreter: a program that ultimately performs the same function as a compiler, but in a different manner. It works by scanning through the source program instruction by instruction. As each instruction is encountered, the interpreter translates it into machine code and executes it directly.
2. Assembler: a program that automatically translates a source program written in assembly language and produces as output object code written in binary machine code.
3. Linker: a program that takes one or more object files generated by compilers and combines them into a single executable program.
4. Loader: a routine that loads an object program into memory and prepares it for execution.
11. DIFFERENT ARCHITECTURAL VIEWS
1. Functional view: 6 phases.
2. Logical view: the 6 phases are grouped into two main categories:
   A. Analysis vs. synthesis.
   B. Front end vs. back end.
3. Operations view: one or more phases are executed in a single pass. Each pass builds on or updates the output of the previous pass.
   A. Scanning & parsing.
   B. Semantic analysis.
   C. Code generation & optimization.
12. ARCHITECTURE/PHASES OF A COMPILER
Stream of characters
  -> Scanner / lexical analyzer -> Stream of tokens
  -> Parser / syntax analyzer -> Parse/syntax tree
  -> Semantic analyzer -> Annotated tree
  -> Source code optimization -> Intermediate code
  -> Code generator -> Target code
  -> Target code optimization -> Target code
Supporting structures shown alongside the phases: Literal Table, Symbol Table.
13. SOME DATA STRUCTURES
1. Token
2. Symbol table
3. Literal table
4. Parse tree
5. Semantic parse tree
6. Intermediate code
14. 1. TOKEN
Single symbol ahead: in most languages the scanner needs to produce only one token at a time, on demand from the parser.
In this case you don’t need a collection/array of tokens; a single global variable holding the current token is enough.
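The single-token idea can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the course's scanner: the names `Token`, `TokenKind`, `current_token`, `scanner_init` and `next_token` are hypothetical, and the toy language has only integers, identifiers and one-character symbols.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for a toy language (not from the slides). */
typedef enum { TOK_NUM, TOK_ID, TOK_SYM, TOK_EOF } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];              /* lexeme, truncated to 31 characters */
} Token;

/* The one global token the parser inspects; no token array is kept. */
Token current_token;

static const char *input_pos;   /* next unread character of the source */

void scanner_init(const char *source) { input_pos = source; }

/* Append c to the lexeme if there is still room for it. */
static size_t append_char(size_t n, char c) {
    if (n < sizeof current_token.text - 1) current_token.text[n++] = c;
    return n;
}

/* Overwrite current_token with the next token from the input. */
void next_token(void) {
    size_t n = 0;

    while (isspace((unsigned char)*input_pos)) input_pos++;

    if (*input_pos == '\0') {
        current_token.kind = TOK_EOF;
    } else if (isdigit((unsigned char)*input_pos)) {
        current_token.kind = TOK_NUM;
        while (isdigit((unsigned char)*input_pos))
            n = append_char(n, *input_pos++);
    } else if (isalpha((unsigned char)*input_pos) || *input_pos == '_') {
        current_token.kind = TOK_ID;
        while (isalnum((unsigned char)*input_pos) || *input_pos == '_')
            n = append_char(n, *input_pos++);
    } else {
        current_token.kind = TOK_SYM;
        n = append_char(n, *input_pos++);
    }
    current_token.text[n] = '\0';
}
```

A parser would call `scanner_init` once, then call `next_token` each time it has consumed `current_token`. Because each call overwrites the previous token, memory use stays constant regardless of the size of the source program.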
15. 2. SYMBOL TABLE
Stores information associated with identifiers.
1. Information associated with variables, like [name, type, address, size (for array), etc.]
2. Information associated with functions, like [name, type of return value, parameters, address, etc.]
Sample code:
  int x, y;
  char c[10];
  x = 5;
Resulting symbol table:
  name   type   address   size (for array)
  x      int    0xA300    n/a
  y      int    0xA304    n/a
  c      char   0xA308    10
16. 2. SYMBOL TABLE (CONT’D)
3. Information associated with user-defined data types like structs, enums and classes.
The symbol table is modified by the scanner, parser, and semantic analyzer.
The information in the symbol table is used by the intermediate code generator phase and the machine code generator phase.
Mostly implemented as a hash table for efficiency, because average access time is O(k), where k is the length of the identifier being hashed, independent of the number of entries, and space consumption is not a concern.
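A hash-based symbol table along these lines might look like the sketch below. It is an illustration under assumptions rather than the course's implementation: the `Symbol` fields follow the [name, type, address, size] attributes listed for variables, while `SYMTAB_BUCKETS`, `hash_name`, `symtab_lookup` and `symtab_insert` are hypothetical names, and identifiers are assumed to be shorter than 32 characters.

```c
#include <stdlib.h>
#include <string.h>

#define SYMTAB_BUCKETS 64       /* assumed bucket count, fine for a sketch */

typedef struct Symbol {
    char name[32];
    char type[16];
    unsigned address;
    int size;                   /* element count for arrays, 0 for "n/a" */
    struct Symbol *next;        /* chain of symbols that share a bucket */
} Symbol;

static Symbol *buckets[SYMTAB_BUCKETS];

/* Hashing walks the name once, so its cost grows with the name length. */
static unsigned hash_name(const char *name) {
    unsigned h = 5381;
    while (*name) h = h * 33 + (unsigned char)*name++;
    return h % SYMTAB_BUCKETS;
}

/* Return the entry for name, or NULL if it has not been declared. */
Symbol *symtab_lookup(const char *name) {
    for (Symbol *s = buckets[hash_name(name)]; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0) return s;
    return NULL;
}

/* Insert a new symbol, or return the existing entry if already present.
   Returns NULL only when memory allocation fails. */
Symbol *symtab_insert(const char *name, const char *type,
                      unsigned address, int size) {
    Symbol *s = symtab_lookup(name);
    if (s != NULL) return s;

    s = calloc(1, sizeof *s);   /* zero-filled, so strings stay terminated */
    if (s == NULL) return NULL;
    strncpy(s->name, name, sizeof s->name - 1);
    strncpy(s->type, type, sizeof s->type - 1);
    s->address = address;
    s->size = size;

    unsigned b = hash_name(name);
    s->next = buckets[b];       /* push onto the front of the bucket chain */
    buckets[b] = s;
    return s;
}
```

With this sketch, the declarations `int x, y; char c[10];` would be recorded by calls such as `symtab_insert("c", "char", 0xA308, 10)`, and later phases would retrieve the entry with `symtab_lookup("c")`.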
17. 3. LITERAL TABLE
Stores constants and strings used in the program.
Reduces memory size by reusing constants and strings.
Can be combined with the symbol table in some implementations.
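The reuse of constants and strings can be illustrated with a small interning routine. This is a sketch under assumptions: `literal_intern`, `literals`, `literal_count` and the fixed capacity `MAX_LITERALS` are hypothetical, and a linear search stands in for whatever lookup structure a real compiler would use.

```c
#include <string.h>

#define MAX_LITERALS 128        /* assumed capacity for the sketch */

static const char *literals[MAX_LITERALS];
static int literal_count = 0;

/* Return the index of text in the literal table, adding it only if it is
   not already there. Returns -1 when the table is full. The table keeps
   the pointer it is given, so text must outlive the table. */
int literal_intern(const char *text) {
    for (int i = 0; i < literal_count; i++)
        if (strcmp(literals[i], text) == 0)
            return i;           /* reuse the existing entry */
    if (literal_count == MAX_LITERALS)
        return -1;
    literals[literal_count] = text;
    return literal_count++;
}
```

If the same string literal appears twice in a program, both occurrences receive the same index, so only one copy needs to be stored in the generated code.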
19. 5. SEMANTIC PARSE TREE
Usually the same parse tree is used and
annotations are added for each node.
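One way to picture annotations on an existing tree is a node structure with an extra field that semantic analysis fills in. The sketch below is an assumption-laden illustration: the `TreeNode` layout, the `DataType` values and the rule used by `annotate` (a binary node takes its children's type when they agree) are all hypothetical and are not taken from the course material.

```c
#include <stddef.h>

typedef enum { TYPE_UNKNOWN, TYPE_INT, TYPE_CHAR } DataType;

typedef struct TreeNode {
    const char *label;              /* e.g. "+", "x", "5" */
    struct TreeNode *left, *right;  /* children built by the parser */
    DataType type;                  /* annotation added by semantic analysis */
} TreeNode;

/* Annotate the tree bottom-up. A leaf keeps the type it already has
   (for example from the symbol table); a binary node takes its children's
   type when both agree and is marked TYPE_UNKNOWN otherwise. */
DataType annotate(TreeNode *node) {
    if (node == NULL) return TYPE_UNKNOWN;
    if (node->left == NULL && node->right == NULL) return node->type;

    DataType l = annotate(node->left);
    DataType r = annotate(node->right);
    node->type = (l == r) ? l : TYPE_UNKNOWN;
    return node->type;
}
```

The tree shape produced by the parser is left unchanged; only the `type` field of each node is written, which matches the idea of reusing the same parse tree.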
20. 6. INTERMEDIATE CODE
The structure of the code is kept as simple as possible, usually three-address code.
Each instruction allows only three addresses (variables).
Each instruction is added as an entry into a linked list that allows dynamic growth.
[Diagram: linked list of instruction entries, each holding Var1, Var2, op, Var3, with the last entry pointing to NULL.]
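The linked list of three-address instructions could be represented roughly as below. This is a minimal sketch with hypothetical names (`Instr`, `InstrList`, `emit`); reading each entry as `var1 = var2 op var3` is an assumption, since the slide only lists the fields Var1, Var2, op and Var3.

```c
#include <stdlib.h>

/* One three-address instruction, read here as: var1 = var2 op var3. */
typedef struct Instr {
    const char *var1, *var2, *var3;
    char op;
    struct Instr *next;         /* NULL marks the end of the list */
} Instr;

typedef struct {
    Instr *head, *tail;
    int length;
} InstrList;

/* Append one instruction to the end of the list.
   Returns 0 on success, -1 if memory allocation fails. */
int emit(InstrList *list, const char *var1,
         const char *var2, char op, const char *var3) {
    Instr *in = malloc(sizeof *in);
    if (in == NULL) return -1;

    in->var1 = var1;
    in->var2 = var2;
    in->op = op;
    in->var3 = var3;
    in->next = NULL;

    if (list->tail != NULL) list->tail->next = in;
    else list->head = in;       /* first instruction in the list */
    list->tail = in;
    list->length++;
    return 0;
}
```

For example, `a = b * c + d` could be emitted as two entries, `t1 = b * c` followed by `a = t1 + d`, where `t1` is a compiler-generated temporary. Keeping a tail pointer makes each append constant-time, so the list can grow as code is generated.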