This document discusses compiler design and how compilers work. It begins with prerequisites, a definition of compilers, and their origins. It then describes the architecture of a compiler, including lexical analysis, parsing, semantic analysis, code optimization, and code generation, and explains how compilers translate high-level code into machine-executable code. In conclusion, it summarizes that compilers translate code without changing its meaning and aim to make the generated code efficient. References for further reading on compiler design principles are also provided.
2. Friday, July 6, 2018
2 Content Overview
• Pre-requisites before we start
• Introduction
• Compiler and its origin
• Compiler Architecture
• How compiler works?
• Advantages and Application of Compiler over interpreter
• Conclusions
• References and Details links
3 Pre-requisites before we start
• High Level and Low Level languages
• Platforms
• Interpreters
• Bytecodes
• Loaders and Linkers
• Cross-platform applications
• Definitions
• Compiler
• De-compiler
4 Introduction
• A compiler is computer software that transforms code written in a high-level language into a
machine-understandable language.
• A compiler is a type of translator that supports digital devices, primarily computers.
Fig: Source Program → Compiler → Target Program (the compiler also emits Error Messages)
5 Introduction Contd..
Fig: Schematics of High Level Application Code Execution
6 Compiler and its origin
• Primitive binary languages evolved because digital devices only understand ones and
zeros and the circuit patterns in the underlying machine architecture.
• Limited memory capacity of early computers led to substantial technical challenges when
the first compilers were designed. Therefore, the compilation process needed to be divided
into several small programs.
• A compiler is likely to perform many or all of the following operations:
• Preprocessing
• Lexical analysis
• Parsing
• Semantic analysis (syntax-directed translation)
• Conversion of input programs to an intermediate representation
• Code optimization
• Code generation.
9 Compiler Architecture Contd..
• The compiler accepts a program written in a high-level language as input and converts it to target
executable code by passing it through the following stages.
• Source Program
• The program is written in a human-readable high-level language such as C, C++, or Java.
• Such languages are often English-like and therefore easy to read. They have well-defined rules and
a limited set of reserved words called keywords, alongside user-defined names such as variables.
• Lexical Analyzer
• Code written in a high-level language is made up of lexemes such as keywords, identifiers,
constants, operators, and special characters.
• The lexical analyzer divides the code into tokens and records the identifiers it finds in the symbol table.
• The symbol table is typically a hash table that holds one unique entry per identifier.
• Eg: int a = 10; - in this example 'int', 'a', '=', '10', and ';' are the tokens
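The tokenizing step described above can be sketched in a few lines. This is a minimal illustration, not how any particular compiler does it: the token categories in TOKEN_SPEC and the function name tokenize are assumptions made for this example, and only a tiny C-like subset is covered.

```python
import re

# Hypothetical token categories for a tiny C-like subset (an assumption for illustration).
TOKEN_SPEC = [
    ("KEYWORD",    r"\b(?:int|float|char|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("NUMBER",     r"\d+"),
    ("OPERATOR",   r"[=+\-*/]"),
    ("SEPARATOR",  r"[;,(){}]"),
    ("SKIP",       r"\s+"),
]
# One combined pattern; each alternative is a named group so we can tell which one matched.
MASTER_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split source text into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for kind, lexeme in tokenize("int a = 10;"):
        print(kind, lexeme)
```

For the statement int a = 10; this yields five tokens: a keyword, an identifier, an operator, a number, and a separator. Keywords are listed before identifiers in TOKEN_SPEC so that int is not classified as an ordinary name.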
10 Compiler Architecture Contd..
• Syntax Analyzer
• Every piece of code written in a high-level language must comply with the rules the language defines.
These are called the structural (syntax) rules of the language.
• Eg: printf("Hi, this is a print statement");
int a=10, b=20, c;
c = a + b;
• Each of these lines has a pre-defined syntax. The first line, for example, consists of a function name
followed by an opening parenthesis, a double quote, the printable content, a closing double quote,
a closing parenthesis, and a semicolon.
• Semantic Analyzer
• Semantic analysis checks whether the parse tree that was constructed follows the semantic rules of the language.
• For example, it checks that values are assigned between compatible data types and reports invalid
operations such as adding a string to an integer.
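As a rough sketch of how a syntax analyzer enforces structural rules, the recursive-descent parser below accepts assignment statements such as c = a + b; and rejects malformed ones. The grammar (statement, expr, term, factor), the Parser class, and the tuple-based tree are assumptions chosen for this illustration; real parsers cover far more of the language.

```python
import re


def tokenize(source):
    """Tiny tokenizer: identifiers, integers, single-character symbols.
    Characters outside this set are silently skipped in this sketch."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+\-*/();]", source)


class Parser:
    """Assumed grammar:
    statement := IDENT '=' expr ';'
    expr      := term (('+' | '-') term)*
    term      := factor (('*' | '/') factor)*
    factor    := IDENT | NUMBER | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, found {token!r}")
        self.pos += 1
        return token

    def statement(self):
        target = self.expect()
        if not re.fullmatch(r"[A-Za-z_]\w*", target):
            raise SyntaxError(f"expected an identifier, found {target!r}")
        self.expect("=")
        value = self.expr()
        self.expect(";")
        return ("=", target, value)

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.expect()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.expect()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.expect()
        if token == "(":
            node = self.expr()
            self.expect(")")
            return node
        if re.fullmatch(r"[A-Za-z_]\w*|\d+", token):
            return token
        raise SyntaxError(f"unexpected token {token!r}")


def parse(source):
    """Parse one assignment statement and return its tree as nested tuples."""
    parser = Parser(tokenize(source))
    tree = parser.statement()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

Calling parse("c = a + b;") returns the nested tuple ("=", "c", ("+", "a", "b")), while parse("c = a + ;") raises SyntaxError because an operand is missing. Splitting expr and term into separate rules is what gives multiplication a higher precedence than addition.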
11 Compiler Architecture Contd..
Fig: Semantic Analysis - A typical tree diagram for the addition of two numbers (c = a + b)
      =
     / \
    c   +
       / \
      a   b
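A semantic check over the tree in the figure can be sketched as a recursive walk that computes a type for each node. The type names, the SYMBOL_TYPES table, and the rule that both operands must have the same type are simplifying assumptions for this example; real languages define richer rules such as implicit conversions.

```python
# Hypothetical declared types for the variables (an assumption for illustration).
SYMBOL_TYPES = {"a": "int", "b": "int", "c": "int", "name": "string"}


def type_of(node, symbols):
    """Return the type of a tree node, or raise on a semantic error.

    Leaves are either int constants or variable names (str);
    inner nodes are (operator, left, right) tuples, including "=".
    """
    if isinstance(node, int):
        return "int"
    if isinstance(node, str):
        if node not in symbols:
            raise NameError(f"undeclared variable {node!r}")
        return symbols[node]
    op, left, right = node
    left_type = type_of(left, symbols)
    right_type = type_of(right, symbols)
    if left_type != right_type:
        raise TypeError(f"cannot apply {op!r} to {left_type} and {right_type}")
    return left_type
```

With these assumed declarations, type_of(("=", "c", ("+", "a", "b")), SYMBOL_TYPES) succeeds because every operand is an int, whereas ("+", "a", "name") raises TypeError, which corresponds to the string-plus-integer case.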
12 Compiler Architecture Contd..
• Symbol Table Manager
• The lexical analyzer generates tokens as part of its analysis of the source code. The identifiers
among these tokens need to be stored and managed. A hash-based data structure called the symbol
table (SymTab) is used for this purpose.
• Error Handler
• Common errors include mistyped keywords, missing semicolons, improper function definitions, and
missing return types. All such errors must be reported and resolved before the code can be
executed.
• The error handler produces error messages and notifies the user about them.
• Intermediate Code Generator
• An Intermediate representation (IR) is the data structure or code used internally by a
compiler or virtual machine to represent source code.
• An IR may take one of several forms: an in-memory data structure, or a special tuple- or
stack-based code readable by the program. In the latter case it is also called an
intermediate language.
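The symbol table manager and the error handler often work together: failed lookups and duplicate declarations are turned into messages for the user. The sketch below assumes a single flat scope, and the class name SymbolTable, its methods, and the message wording are all hypothetical.

```python
class SymbolTable:
    """Hash-based symbol table holding one entry per declared identifier."""

    def __init__(self):
        self.entries = {}   # name -> attributes (a Python dict is a hash table)
        self.errors = []    # error messages collected for the user

    def declare(self, name, type_name, line):
        """Add a new identifier; record an error if it already exists."""
        if name in self.entries:
            first = self.entries[name]["line"]
            self.errors.append(f"line {line}: '{name}' already declared on line {first}")
            return False
        self.entries[name] = {"type": type_name, "line": line}
        return True

    def lookup(self, name, line):
        """Return the entry for name, or record an error and return None."""
        entry = self.entries.get(name)
        if entry is None:
            self.errors.append(f"line {line}: '{name}' used before declaration")
        return entry


if __name__ == "__main__":
    table = SymbolTable()
    table.declare("a", "int", 1)
    table.declare("a", "float", 2)   # duplicate declaration
    table.lookup("c", 3)             # never declared
    for message in table.errors:
        print(message)
```

Collecting messages in a list instead of stopping at the first problem lets a compiler report several errors in one run.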
13 Compiler Architecture Contd..
• Intermediate code example
High-level expression => a = b + c * d;
Intermediate code     => r1 = c * d;
                         r2 = b + r1;
                         a = r2;
• Code Optimizer
• Optimization is a program transformation technique that tries to improve the code by making it
consume fewer resources (e.g. CPU, memory) and run faster.
• The optimized code must not, in any way, change the meaning of the program.
• Optimization should increase the speed of the program and, if possible, reduce the resources
it demands.
• Optimization should itself be fast and should not delay the overall compilation process.
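The intermediate code example on this slide (a = b + c * d) can be reproduced with a short tree walk. This is one possible sketch: the nested-tuple tree format and the function name to_three_address are assumptions, and temporaries are named r1, r2, ... to match the slide.

```python
import itertools


def to_three_address(tree):
    """Flatten an assignment tree ("=", target, expr) into three-address code."""
    counter = itertools.count(1)
    code = []

    def emit(node):
        # Leaves (variable names or constants) need no instruction of their own.
        if not isinstance(node, tuple):
            return str(node)
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"r{next(counter)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expr = tree
    result = emit(expr)
    code.append(f"{target} = {result}")
    return code


if __name__ == "__main__":
    tree = ("=", "a", ("+", "b", ("*", "c", "d")))
    for line in to_three_address(tree):
        print(line)
```

For the tree of a = b + c * d this produces r1 = c * d, then r2 = b + r1, then a = r2. Each generated instruction has at most one operator, which is what makes the form convenient for later optimization passes.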
14 Compiler Architecture Contd..
• Example of code optimization
Before optimization:
do
{
    item = 10;
    value = value + item;
} while(value<100);
The variable item is assigned in every loop iteration although its value never changes,
so the same work is repeated on each pass.
After optimization:
item = 10;
do
{
    value = value + item;
} while(value<100);
The assignment is moved out of the loop and executed only once.
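The transformation in the loop example above is commonly called loop-invariant code motion. A deliberately conservative sketch follows; the statement format (destination, operands) and the function name hoist_invariants are assumptions for this illustration. Because the example is a do-while loop, the body runs at least once, so moving an invariant assignment in front of the loop does not change the result.

```python
def hoist_invariants(body):
    """Split a do-while loop body into (hoisted, remaining) statements.

    Each statement is (dest, operands): dest is assigned from an expression
    that reads the listed operands; int operands stand for constants.
    A statement is hoisted only if it reads nothing assigned in the loop,
    its destination is assigned exactly once, and no earlier statement in
    the body reads that destination.
    """
    assigned = [dest for dest, _ in body]
    hoisted, remaining = [], []
    for index, (dest, operands) in enumerate(body):
        reads_only_invariants = all(
            isinstance(op, int) or op not in assigned for op in operands
        )
        assigned_once = assigned.count(dest) == 1
        read_earlier = any(dest in ops for _, ops in body[:index])
        if reads_only_invariants and assigned_once and not read_earlier:
            hoisted.append((dest, operands))
        else:
            remaining.append((dest, operands))
    return hoisted, remaining


if __name__ == "__main__":
    loop_body = [
        ("item", (10,)),               # item = 10;
        ("value", ("value", "item")),  # value = value + item;
    ]
    hoisted, remaining = hoist_invariants(loop_body)
    print("before loop:", hoisted)
    print("inside loop:", remaining)
```

Here item = 10 is hoisted, while value = value + item stays inside the loop because it reads value, which changes on every iteration. A production optimizer would use data-flow analysis to find more candidates; this sketch simply refuses anything it cannot prove safe with these three checks.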
15 Compiler Architecture Contd..
• Code Generator
• In computing, code generation is the process by which a compiler's code generator
converts some intermediate representation of source code into a form (e.g., machine code)
that can be readily executed by a machine.
• The input to the code generator typically consists of a parse tree or an abstract syntax tree.
• The tree is converted into a linear sequence of instructions, usually in an intermediate
language such as three-address code.
• Major tasks in code generation
• Instruction selection: which instructions to use.
• Instruction scheduling: in which order to put those instructions. Scheduling is a speed
optimization that can have a critical effect on pipelined machines.
• Register allocation: the allocation of variables to processor registers
• Debug data generation if required so the code can be debugged.
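To make instruction selection concrete, the sketch below translates three-address code into instructions for a hypothetical one-register (accumulator) machine. The opcodes LOAD, ADD, SUB, MUL, DIV, and STORE, as well as the functions generate and run, are invented for this example and do not correspond to a real instruction set. The small interpreter is included only to check that the generated code computes the expected value.

```python
def generate(tac):
    """Naive instruction selection: each three-address statement
    (dest, op, left, right) becomes LOAD / operation / STORE."""
    opcodes = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
    program = []
    for dest, op, left, right in tac:
        program.append(("LOAD", left))             # ACC <- left
        if op != "=":
            program.append((opcodes[op], right))   # ACC <- ACC op right
        program.append(("STORE", dest))            # dest <- ACC
    return program


def run(program, memory):
    """Interpret the hypothetical instructions against a dict used as memory."""
    acc = 0
    for opcode, operand in program:
        if opcode == "STORE":
            memory[operand] = acc
            continue
        value = operand if isinstance(operand, int) else memory[operand]
        if opcode == "LOAD":
            acc = value
        elif opcode == "ADD":
            acc += value
        elif opcode == "SUB":
            acc -= value
        elif opcode == "MUL":
            acc *= value
        elif opcode == "DIV":
            acc //= value   # integer division, an assumption of this sketch
    return memory


# Three-address code for a = b + c * d
TAC = [("r1", "*", "c", "d"), ("r2", "+", "b", "r1"), ("a", "=", "r2", None)]

if __name__ == "__main__":
    program = generate(TAC)
    for instruction in program:
        print(*instruction)
    print(run(program, {"b": 2, "c": 3, "d": 4})["a"])
```

With b = 2, c = 3, and d = 4 the generated program stores 14 in a. The output is correct but wasteful: it stores r2 and immediately reloads it, which is the kind of redundancy that register allocation and instruction scheduling are meant to remove.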
18 Advantages and Application of Compiler over interpreter
• The entire program is checked before it runs, so syntax and semantic errors are reported at compile time;
• The executable file is optimized by the compiler, so it executes faster;
• Users do not have to run the program on the same machine on which it was built.
• Typical steps a compiler performs:
• Parse the program
• Check for syntax errors and check data types
• Create an internal structure in memory
• Verify the program semantics and optimize the structure
• Translate the program into another language and generate files on disk
• Link the files into an executable
19 Conclusions
• A compiler translates the code written in one language to some other language without
changing the meaning of the program.
• It is also expected that a compiler should make the target code efficient and optimized in
terms of time and space.
• Compiler design principles provide an in-depth view of the translation and optimization
process. Compiler design covers the basic translation mechanism as well as error detection and
recovery.
• It includes lexical, syntax, and semantic analysis as the front end, and code optimization and
code generation as the back end.
20 References and Details links
• Alfred V. Aho and Jeffrey D. Ullman, Principles of Compiler Design (Addison-Wesley series in
computer science and information processing), Aug 1977
• Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, Compilers: Principles,
Techniques, and Tools (2nd Edition), Aug 31, 2006
• Alfred V. Aho and Jeffrey D. Ullman, Principles of Compiler Design, 1989
• Alfred V. Aho and Jeffrey D. Ullman, Principles of Compiler Design, Second Printing, 1984
• Y.N. Srikant and Priti Shankar, The Compiler Design Handbook: Optimizations & Machine Code
Generation, Sep 25, 2002
• https://www.tutorialspoint.com/compiler_design/
• https://en.wikipedia.org/wiki/Compiler
• https://en.wikipedia.org/wiki/List_of_compilers
• https://en.wikipedia.org/wiki/Java_compiler
• https://en.wikibooks.org/wiki/Compiler_Construction