1. The document discusses the different phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, code optimization, and code generation.
2. It provides examples and explanations of each phase, describing how the compiler processes source code through each step to produce executable machine code.
3. The phases involve breaking source code into tokens, checking syntax and semantics, generating intermediate code, optimizing for efficiency, and finally mapping to target machine language instructions.
1. Compilers (CPL5316)
Software Engineering
Koya University
2017-2018
e-mail: rebaz.najeeb@koyauniversity.org
LECTURE 1
Programming language processors
Compilers (CPL5316) # 1 Lectured by : Rebaz Najeeb
2. WHY COMPILERS?
⌐ Increase productivity
⌐ Low-level programming languages are harder to write, less portable, more error-prone, and harder to maintain.
⌐ Hardware synthesis:
Verilog, VHDL (high-level hardware description languages) → Register Transfer Language (RTL) → gates → transistors → physical layout.
⌐ Reverse engineering, e.g. executable applications (.exe):
runtime (0100101100) → assembly code → high-level language.
3. UNDERSTANDING BINARY LANGUAGE IS HARD
⌐ 01000101 01110110 01100101 01110010 01111001 01101111 01101110 01100101
00100000 01100011 01100001 01101110 00100000 01110000 01100001 01110011
01110011 00101100 00100000 01110101 01101110 01101100 01100101 01110011
01110011 00100000 01010011 00101111 01101000 01100101 00100000 01100100
01101111 01100101 01110011 01101110 00100111 01110100 00100000 01110111
01100001 01101110 01110100 00100000 01110100 01101111 00101110
What does that mean ?
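Assuming each 8-bit group above is one ASCII character, a few lines of Python are enough to turn the bits back into text. This is only a sketch: the helper name `decode_bits` is made up for illustration, and it is applied to the first eight groups only.

```python
def decode_bits(bits: str) -> str:
    """Decode space-separated 8-bit groups as ASCII characters."""
    return "".join(chr(int(group, 2)) for group in bits.split())


# The first eight groups of the bit string on this slide.
first_word = "01000101 01110110 01100101 01110010 01111001 01101111 01101110 01100101"
print(decode_bits(first_word))  # prints "Everyone"
```

A person can do the same by hand with an ASCII table, but only slowly, which is the point of the slide.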
4. PROGRAMMING LANGUAGE CLASSIFICATION (LEVELS)
⌐ Low-level language
Assembly language
Machine language
⌐ High-level language
C, C++, Java, Pascal, Prolog, Scheme
⌐ Natural language
English, Kurdish.
There is also a classification by programming language generations.
5. TYPES OF COMPILATION
⌐ Cross-compiler: a compiler that produces machine code or low-level code for a computer other than the one on which it runs.
⌐ Bootstrap compiler: written in the language that it compiles.
⌐ Decompiler: converts a low-level language to a high-level language.
⌐ Source-to-source compiler (transpiler): translates between high-level languages.
⌐ Language rewriter: translates the form of expressions without a change of language.
⌐ Compiler: HLL → LLL.
6. WHO IS THIS GIRL?
7. HISTORY OF COMPILERS
⌐ Grace Hopper, 1952: A-0 System.
⌐ Alick Glennie, 1952: Autocode.
⌐ John W. Backus, 1953: Speedcoding.
- More productive, but 10-20 times slower in execution.
- The interpreter took about 300 words of memory (about 30% of the available memory).
⌐ Fortran, the first complete compiler, 1954-1957 (18 person-years of effort); 50% of code was written in Fortran.
⌐ 1960: COBOL, Lisp
⌐ 1970: Pascal, C
⌐ 1980: OOP: Ada, Smalltalk, C++
⌐ 1990: Java, script, Perl
⌐ 2000: language specifications
8. COMPILERS
⌐ A compiler translates code written in one language (HLL) into another language (LLL) without changing the meaning of the program.
⌐ A compiler is also expected to make the target code efficient, optimized in terms of time and space.
⌐ Compiler design covers the basic translation mechanism as well as error detection and recovery.
⌐ Examples: Fortran, Ada, C, C++, C#, COBOL.
Source program → Compiler → Executable program (the compiler also reports errors/warnings)
Input (data) → Executable program → Output (result)
9. INTERPRETERS
⌐ An interpreter is a type of language processor that directly executes the operations specified in the source program on inputs supplied by the user.
⌐ Examples: Python, Perl, BASIC, Ruby, AWK.
Source program + Input (data) → Interpreter → Output (result) (the interpreter also reports errors/warnings)
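To make "directly executes the operations" concrete, here is a minimal sketch of an interpreter for arithmetic expressions. The tree encoding (a leaf is a string, an inner node is a tuple of operator, left, right) and the function name `interpret` are assumptions made for this example, not part of the lecture.

```python
from typing import Dict, Tuple, Union

# A leaf is a variable name or a number literal; an inner node is (operator, left, right).
Node = Union[str, Tuple[str, "Node", "Node"]]


def interpret(node: Node, inputs: Dict[str, float]) -> float:
    """Evaluate an expression tree directly, without producing any target code."""
    if isinstance(node, str):
        # A leaf is either an integer literal or a variable supplied as input.
        return float(node) if node.isdigit() else inputs[node]
    operator, left, right = node
    left_value = interpret(left, inputs)
    right_value = interpret(right, inputs)
    if operator == "+":
        return left_value + right_value
    if operator == "*":
        return left_value * right_value
    raise ValueError(f"unknown operator {operator!r}")


program = ("+", "id2", ("*", "id3", "60"))
print(interpret(program, {"id2": 10.0, "id3": 2.0}))  # 130.0
```

The source program and the input data are consumed together and only a result comes out; a compiler would instead emit a translated program to be run later.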
10. COMPILERS VS INTERPRETERS
No | Compiler | Interpreter
1 | Takes the entire program as input | Takes a single instruction as input
2 | Intermediate object code is generated | No intermediate object code is generated
3 | Conditional control statements execute faster | Conditional control statements execute slower
4 | Memory requirement is higher (since object code is generated) | Memory requirement is lower
5 | The program need not be compiled every time | The high-level program is converted into a lower-level program every time it runs
6 | Errors are displayed after the entire program is checked | Errors are displayed for every instruction interpreted (if any)
7 | Example: C compiler | Example: BASIC
http://www.c4learn.com/c-programming/compiler-vs-interpreter/
11. JAVA VIRTUAL MACHINE
⌐ Why is Java platform independent ?
⌐Tools to view and edit bytecodes
- ASM (http://asm.ow2.org)
- Jasmin (http://jasmin.sourceforge.net)
⌐ Demo from the command line (CMD) using the javap disassembler.
13. LANGUAGE PROCESSING PHASES?
Source program
→ Preprocessor → modified source program
→ Compiler → target assembly language
→ Assembler → relocatable machine code
→ Linker/Loader (together with library files and relocatable object files) → target machine code
→ Memory
14. LANGUAGE PROCESSING PHASES
Source program
→ Preprocessor → modified source program
→ Compiler (analysis, then synthesis) → target assembly language
→ Assembler → relocatable machine code
→ Linker/Loader (together with library files and relocatable object files) → target machine code
→ Memory
15. ANALYSIS VS SYNTHESIS
Analysis part
Breaks up the source program into its constituent pieces and creates an intermediate representation of the source program. It is often called the front end of the compiler.
The analysis part can be divided into the following phases:
Lexical analysis, syntax analysis, semantic analysis, and (optionally) intermediate code generation
Synthesis part
Constructs the desired target program from the intermediate representation and the information in the symbol table. It is often called the back end.
The synthesis part can be divided into the following phases:
Intermediate code generation, code optimization, code generation
18. LEXICAL ANALYSIS (SCANNING)
⌐ How do we understand English?
⌐ Break up the sentence and recognize the words.
Iamasmartstudent. → I am a smart student.
Figure labels: Separator, Noun
19. LEXICAL ANALYSIS (SCANNING)
⌐ Lexical analysis is the process of converting a sequence of characters (such as a computer program or web page) into a sequence of tokens (strings with an identified "meaning").
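The conversion from characters to tokens can be sketched with regular expressions. This is a minimal scanner for a tiny expression language, not a full lexer; the token classes (NUMBER, ID, OP) and the names `Token` and `tokenize` are assumptions chosen for this example.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str   # token class, e.g. "ID"
    text: str   # the characters that were matched


# Hypothetical token classes, tried in this order at each position.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
MASTER_RE = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from left to right, skipping whitespace."""
    for match in MASTER_RE.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {match.group()!r}")
        yield Token(kind, match.group())


tokens = list(tokenize("id1 = id2 + id3 * 60"))
for token in tokens:
    print(token.kind, token.text)
```

Whitespace plays the role of the separator in the English example: it is recognized and then dropped, and only the meaningful units are passed on to the parser.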
20. SYNTAX ANALYSIS
⌐ How do we understand English?
The Smart students never ever give up.
Figure labels: S, Adv, V, grouped under Sentence
21. SYNTAX ANALYSIS (PARSING)
⌐ A parser reads a stream of tokens from the scanner and determines whether the syntax of the program is correct according to the context-free grammar (CFG) of the source language.
⌐ Tokens are then grouped into grammatical phrases represented by a parse tree or an abstract syntax tree, which gives a hierarchical structure to the source program.
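As an illustration of how tokens are grouped into a tree, here is a small recursive-descent parser. It is a sketch under stated assumptions: the three-rule grammar in the docstring, the tuple encoding of tree nodes, and the class name `Parser` are all invented for this example, and the tokens are given as plain strings.

```python
from typing import List, Tuple, Union

# AST: a leaf is a string; an inner node is (operator, left, right).
Node = Union[str, Tuple[str, "Node", "Node"]]


class Parser:
    """Recursive-descent parser for the hypothetical grammar
        expr   -> term   ('+' term)*
        term   -> factor ('*' factor)*
        factor -> identifier | number
    """

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> str:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "<eof>"

    def advance(self) -> str:
        token = self.peek()
        self.pos += 1
        return token

    def parse(self) -> Node:
        tree = self.expr()
        if self.peek() != "<eof>":
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self) -> Node:
        node = self.term()
        while self.peek() == "+":
            self.advance()
            node = ("+", node, self.term())
        return node

    def term(self) -> Node:
        node = self.factor()
        while self.peek() == "*":
            self.advance()
            node = ("*", node, self.factor())
        return node

    def factor(self) -> Node:
        token = self.advance()
        if not token.isalnum():
            raise SyntaxError(f"expected identifier or number, got {token!r}")
        return token


tree = Parser(["id2", "+", "id3", "*", "60"]).parse()
print(tree)  # ('+', 'id2', ('*', 'id3', '60'))
```

Because `term` sits below `expr` in the grammar, the multiplication ends up deeper in the tree than the addition, which is how the hierarchy encodes operator precedence. An input such as `["id2", "+"]` is rejected with a `SyntaxError`.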
24. SEMANTIC ANALYSIS
⌐ The semantic analysis phase checks the meaning of the source program for semantic errors (type checking) and gathers type information for the subsequent phases.
⌐ Semantic analysis is the heart of the compiler, and type checking is an important part of this phase.
⌐ It checks language requirements such as proper declarations.
⌐ Semantic analysis catches inconsistencies, for instance mismatched data types.
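The two checks named above, proper declarations and matching data types, can be sketched as a walk over an expression tree. The typing rules here (only `int` and `float`, ints widened to float in arithmetic, no float-to-int assignment) are simplifying assumptions for illustration, as is the function name `check_types`.

```python
from typing import Dict, Tuple, Union

# A leaf is a variable name or integer literal; an inner node is (operator, left, right).
Node = Union[str, Tuple[str, "Node", "Node"]]


def check_types(node: Node, declared: Dict[str, str]) -> str:
    """Return the type of an expression tree, or raise on a semantic error.

    `declared` maps each declared variable name to "int" or "float".
    """
    if isinstance(node, str):
        if node.isdigit():
            return "int"  # integer literal
        if node not in declared:
            raise NameError(f"variable {node!r} is not declared")
        return declared[node]
    operator, left, right = node
    left_type = check_types(left, declared)
    right_type = check_types(right, declared)
    if operator == "=":
        if left_type == "int" and right_type == "float":
            raise TypeError("cannot assign a float value to an int variable")
        return left_type
    # Arithmetic: the result is float as soon as one operand is float.
    return "float" if "float" in (left_type, right_type) else "int"


declared = {"id1": "float", "id2": "float", "id3": "float"}
tree = ("=", "id1", ("+", "id2", ("*", "id3", "60")))
print(check_types(tree, declared))  # float
```

The type computed for `id3 * 60` is what tells a later phase that the integer literal 60 needs a conversion to float.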
25. INTERMEDIATE CODE GENERATION
⌐ After syntax and semantic analysis of the source program, many compilers generate an explicit low-level or machine-like intermediate code.
⌐ In some compilers, the source program is first translated into an intermediate code and then into the target language. In other compilers, it is translated directly into the target language.
⌐ One popular intermediate code is three-address code.
Example:
temp1 = int_to_float(60)
temp2 = id3 ∗ temp1
temp3 = id2 + temp2
id1 = temp3
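The four lines above appear to correspond to the statement id1 = id2 + id3 * 60 with all variables of type float. Under that assumption, the following sketch produces the same sequence from an expression tree. The tree encoding and the name `generate_tac` are invented for this example.

```python
from typing import List, Tuple, Union

# A leaf is a variable name or integer literal; an inner node is (operator, left, right).
Node = Union[str, Tuple[str, "Node", "Node"]]


def generate_tac(target: str, expr: Node) -> List[str]:
    """Emit three-address code that assigns `expr` to `target`.

    Simplifying assumption: every variable is a float, so each integer
    literal is converted with int_to_float.
    """
    code: List[str] = []
    counter = 0

    def emit(right_hand_side: str) -> str:
        """Store one operation in a fresh temporary and return its name."""
        nonlocal counter
        counter += 1
        name = f"temp{counter}"
        code.append(f"{name} = {right_hand_side}")
        return name

    def visit(node: Node) -> str:
        if isinstance(node, str):
            return emit(f"int_to_float({node})") if node.isdigit() else node
        operator, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        return emit(f"{left_name} {operator} {right_name}")

    result = visit(expr)
    code.append(f"{target} = {result}")
    return code


for line in generate_tac("id1", ("+", "id2", ("*", "id3", "60"))):
    print(line)
```

Each emitted line has at most one operator on the right-hand side, which is what makes the code "three-address": two operands and one result.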
26. OPTIMIZATION
⌐ This phase attempts to improve the intermediate code so that faster-running machine code can be produced, improving both time and space.
⌐ The optimized code MUST be CORRECT.
⌐ Run faster (time).
⌐ Minimize power consumption (mobile devices).
⌐ Use less memory.
⌐ Fewer lines of code are better.
⌐ Consider network and database access.
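As a small illustration of improving intermediate code without changing its meaning, the sketch below applies two simple rewrites to three-address code: converting an integer constant at compile time, and removing a final copy out of a temporary. The tuple format for instructions and the name `optimize` are assumptions for this example, and the copy removal is only safe because each temporary is assumed to be read exactly once.

```python
from typing import Dict, List, Optional, Tuple

# One instruction: (destination, operator, first operand, second operand).
Instr = Tuple[str, str, str, Optional[str]]


def optimize(code: List[Instr]) -> List[Instr]:
    constants: Dict[str, str] = {}  # temporaries whose value is known at compile time
    result: List[Instr] = []
    for dest, op, a, b in code:
        # Replace operands that are known constants by their value.
        a = constants.get(a, a)
        if b is not None:
            b = constants.get(b, b)
        if op == "int_to_float" and a.isdigit():
            constants[dest] = f"{a}.0"  # do the conversion at compile time
            continue
        if op == "copy" and result and result[-1][0] == a and a.startswith("temp"):
            # Let the previous instruction write directly into the copy's target.
            _, prev_op, prev_a, prev_b = result.pop()
            result.append((dest, prev_op, prev_a, prev_b))
            continue
        result.append((dest, op, a, b))
    return result


code = [
    ("temp1", "int_to_float", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for dest, op, a, b in optimize(code):
    print(dest, "=", a, op, b)
```

For these four instructions the result is two: `temp2 = id3 * 60.0` followed by `id1 = id2 + temp2`. The program computes the same value with fewer operations and fewer temporaries.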
28. CODE GENERATION
⌐ The code generator takes as input an intermediate code representation of the source program and maps it into the target language.
⌐ If the target language is machine code, registers or memory locations are selected for each of the variables used by the program.
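The mapping from intermediate code to target instructions, including the choice of registers, can be sketched for an imaginary register machine. The mnemonics LDF, MULF, ADDF, and STF, the register names R1, R2, and the function name `generate_assembly` are assumptions for illustration and do not describe a real instruction set; temporaries are assumed to be named `temp...` and to be read once.

```python
from typing import Dict, List, Tuple

Instr = Tuple[str, str, str, str]        # (destination, operator, left, right)
MNEMONIC = {"+": "ADDF", "*": "MULF"}    # hypothetical instruction names


def is_number(text: str) -> bool:
    return text.replace(".", "", 1).isdigit()


def generate_assembly(code: List[Instr]) -> List[str]:
    asm: List[str] = []
    register_of: Dict[str, str] = {}     # temporary name -> register holding it
    next_register = 1

    def new_register() -> str:
        nonlocal next_register
        register = f"R{next_register}"
        next_register += 1
        return register

    def operand(name: str) -> str:
        """Return a register or an immediate that holds `name`."""
        if is_number(name):
            return f"#{name}"            # immediate constant
        if name in register_of:
            return register_of[name]     # temporary already in a register
        register = new_register()
        asm.append(f"LDF {register}, {name}")  # load variable from memory
        return register

    for dest, op, left, right in code:
        left_reg = operand(left)
        right_reg = operand(right)
        # Reuse an operand register for the result when one is available.
        if left_reg.startswith("R"):
            target = left_reg
        elif right_reg.startswith("R"):
            target = right_reg
        else:
            target = new_register()
        asm.append(f"{MNEMONIC[op]} {target}, {left_reg}, {right_reg}")
        if dest.startswith("temp"):
            register_of[dest] = target   # keep temporaries in registers
        else:
            asm.append(f"STF {dest}, {target}")  # store variables to memory
    return asm


tac = [("temp2", "*", "id3", "60.0"), ("id1", "+", "id2", "temp2")]
for line in generate_assembly(tac):
    print(line)
```

Variables live in memory and are loaded on use, temporaries live only in registers, and constants become immediates; a real code generator makes these decisions with far more care about how few registers the machine has.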
29. SYMBOL TABLE
⌐ A symbol table is a data structure containing a record for each variable name, with fields for the attributes of the name.
⌐ The symbol table should make it possible to find the record for each name quickly and to store and retrieve data from that record quickly.
⌐ Attributes may provide information about storage allocation, type, scope, and, for procedures, the number and types of arguments, the method of passing each argument, and the type returned.
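One common way to get fast lookup is a hash table per scope. The sketch below keeps a stack of dictionaries, one per nested scope; the class names `Symbol` and `SymbolTable`, and the choice to record only a type and a scope level as attributes, are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Symbol:
    name: str
    type: str
    scope_level: int   # 0 is the global scope


class SymbolTable:
    """Hash-based symbol table with nested scopes (innermost scope last)."""

    def __init__(self) -> None:
        self.scopes: List[Dict[str, Symbol]] = [{}]  # start with the global scope

    def enter_scope(self) -> None:
        self.scopes.append({})

    def exit_scope(self) -> None:
        self.scopes.pop()

    def declare(self, name: str, type_name: str) -> Symbol:
        current = self.scopes[-1]
        if name in current:
            raise NameError(f"{name!r} is already declared in this scope")
        symbol = Symbol(name, type_name, len(self.scopes) - 1)
        current[name] = symbol
        return symbol

    def lookup(self, name: str) -> Optional[Symbol]:
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("rate", "float")
table.enter_scope()
table.declare("rate", "int")        # shadows the outer declaration
print(table.lookup("rate").type)    # int
table.exit_scope()
print(table.lookup("rate").type)    # float
```

Dictionary access gives average constant-time insertion and lookup per scope, and more attributes (storage location, argument types, return type) can be added as extra fields on `Symbol`.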
30. ALL PHASES IN ONE EXAMPLE