2. Natural Languages
• What are Natural Languages?
• How do you understand the language?
• If you know multiple languages then how
can you recognize each of them?
• How do you know which sentence is correct
and which one is incorrect?
3. Programming Languages
• What are programming languages?
• How do you understand the programming
language?
• If you know multiple programming
languages then how can you recognize each
of them?
• How do you know which syntax is correct
and which one is incorrect?
4. Compilers and Interpreters
• “Compilation”
– Translation of a program written in a source
language into a semantically equivalent
program written in a target language
– It also reports to its users the presence of errors
in the source program
– C++ is typically implemented with a compiler
[Figure: Source Program → Compiler → Target Program, with the Compiler also emitting error messages; Input → Target Program → Output]
5. Compilers and Interpreters
[Figure: Source Program and Input → Interpreter → Output, with the Interpreter also emitting error messages]
• “Interpretation”
– An interpreter is a program that reads an executable
program and produces the results of running that
program. Alternatively:
– Instead of producing a target program as a translation,
an interpreter performs the operations implied by the
source program.
– GW-BASIC is an example of an interpreter
6. Why study compilers?
• Application of a wide range of theoretical
techniques
– Data Structures
– Theory of Computation
– Algorithms
– Computer Architecture
• Good software engineering experience
• Better understanding of programming
languages
7. Features of compilers
• Correctness
– preserve the meaning of the code
• Speed of target code
• Recognize legal and illegal programs
• Speed of compilation
• Good error reporting/handling
• Cooperation with the debugger
• Manage storage of all variables and code
• Support for separate compilation
10. Single Pass Compiler
• Source code is transformed directly into
machine code.
– For example, Pascal
[Figure: source code → Compiler → target code]
11. Two Pass Compiler
• Use intermediate representation
– Why?
[Figure: source code → Front End → IR → Back End → target code]
12. Two pass compiler
• intermediate representation (IR)
• front end maps legal code into IR
• back end maps IR onto target machine
• simplify retargeting
• allows multiple front ends
• multiple passes ⇒ better code
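The split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the IR format (tuples of operator, destination, and two operands), the function names, and both target machines are hypothetical and not taken from any real compiler.

```python
# IR assumed for this sketch: a list of (op, dest, left, right) tuples.
OPCODES = {"+": "ADD", "*": "MUL"}


def front_end(source):
    """Map statements of the form 'dest = left op right' into IR tuples."""
    ir = []
    for line in source.strip().splitlines():
        dest, _, left, op, right = line.split()
        ir.append((op, dest, left, right))
    return ir


def back_end_stack(ir):
    """Map IR onto a hypothetical stack machine."""
    code = []
    for op, dest, left, right in ir:
        code += [f"PUSH {left}", f"PUSH {right}", OPCODES[op], f"POP {dest}"]
    return code


def back_end_register(ir):
    """Map the same IR onto a hypothetical one-register machine."""
    code = []
    for op, dest, left, right in ir:
        code += [f"LOAD R1, {left}", f"{OPCODES[op]} R1, {right}", f"STORE {dest}, R1"]
    return code


ir = front_end("t = rate * 60\nposition = initial + t")
print("\n".join(back_end_stack(ir)))
print("\n".join(back_end_register(ir)))
```

Retargeting here means writing a new back end only; the front end and the IR stay unchanged. Likewise, a second front end for another source language could reuse both back ends.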
14. Comparison
• One-pass compilers are generally faster than
multipass compilers
• With multiple passes, each pass is a small
program whose correctness is easier to ensure
than that of one large program, and the result
is usually higher-quality code
16. Front end
• recognize legal code
• report errors
• produce IR
• preliminary storage map
• shape code for the back end
17. Scanner
• Breaks the source code text into small
pieces called tokens.
• It is also known as a lexical analyzer.
18. Scanner / Lexical Analyzer
• maps characters to tokens
• the character string that makes up a token is its lexeme
• eliminates white space
• Example: x = x + y becomes <id,x> = <id,x> + <id,y>
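A minimal sketch of such a scanner in Python, assuming a toy language with only identifiers, integer literals, and the operators =, + and *. The names TOKEN_SPEC, MASTER, and scan are illustrative choices, not part of any particular compiler.

```python
import re

# Hypothetical token classes for a toy language (an assumption for illustration).
TOKEN_SPEC = [
    ("num", r"\d+"),          # integer literals
    ("id", r"[A-Za-z_]\w*"),  # identifiers
    ("op", r"[=+*]"),         # single-character operators
    ("ws", r"\s+"),           # white space, discarded by the scanner
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Map a character string to a list of (token_class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at position {pos}")
        if match.lastgroup != "ws":  # eliminate white space
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(scan("x = x + y"))
```

Each pair holds the token class and the lexeme, mirroring the <id,x> notation on the slide. The operator tokens are tagged "op" here, which is a choice made for this sketch.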
20. Front end – Analysis – Machine Independent
• The front end consists of those phases that
depend primarily on the source language
and are largely independent of the target
machine.
22. BACK END
• Synthesis process
• Machine dependent
• The back end includes those portions of the
compiler that depend on the target machine;
generally, these portions do not depend
on the source language.
23. Back end
• translate IR into target machine code
• choose instructions for each IR operation
• decide what to keep in registers at each point
• ensure conformance with system interfaces
24. Compiler Structure
• Front end
– Maps legal code into IR
– Recognize legal/illegal programs
• report/handle errors
– Generate IR
– The process can be automated
• Back end
– Translate IR into target code
• instruction selection
• register allocation
• instruction scheduling
26. The Analysis-Synthesis Model of Compilation
• There are two parts to compilation:
– Analysis determines the operations implied by
the source program which are recorded in a tree
structure
– Synthesis takes the tree structure and translates
the operations therein into the target program
27. ANALYSIS PROCEDURE
• During analysis, the operations implied by
the source program are determined and
recorded in a hierarchical structure called a
tree.
• Often a special kind of tree called a syntax
tree is used, in which each node represents an
operation and the children of a node
represent the arguments of the operation.
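As an illustration, a syntax tree can be represented with a single node type. The sketch below is one possible representation in Python; the class name Node, its fields, and the helper show are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One syntax-tree node: an operation (or operand) and its arguments."""
    label: str
    children: list = field(default_factory=list)


def show(node, depth=0):
    """Return the tree as indented text, one node per line."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.append(show(child, depth + 1))
    return "\n".join(lines)


# Tree for: x = x + y  (leaves are identifiers and have no children)
tree = Node("=", [Node("x"), Node("+", [Node("x"), Node("y")])])
print(show(tree))
```

The interior nodes (= and +) are operations, and their children are the arguments those operations apply to.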
29. REMEMBER
The front end is responsible for the
analysis process, while the back
end is responsible for synthesis.
30. Other Tools that Use the Analysis-Synthesis Model
• Editors (syntax highlighting)
• Pretty printers (e.g. Doxygen)
• Static checkers (e.g. Lint and Splint)
• Interpreters
• Text formatters (e.g. TeX and LaTeX)
• Silicon compilers (e.g. VHDL)
• Query interpreters/compilers (Databases)
31. Structure Editors
• A structure editor takes as input a sequence of
commands to build a source program.
• The structure editor not only performs the text
creation and modification functions of an ordinary
text editor but it also analyzes the program text,
putting an appropriate hierarchical structure on the
source program.
• Thus the structure editor can perform additional
tasks that are useful in the preparation of
programs.
32. Structure Editors (cont.)
• For example, it can check that the input is
correctly formed and can supply keywords
automatically (e.g. when the user types
while, the editor supplies the matching do
and reminds the user that a conditional must
come between them).
33. Pretty printers
• A pretty printer analyzes a program and
prints it in such a way that the structure of
the program becomes clearly visible.
• For example, comments may appear in a
special font, and the statements may appear
with an amount of indentation proportional
to the depth of their nesting in the
hierarchical organization of the statement.
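The indentation rule can be sketched as follows. This is a simplified illustration that assumes the program has already been analyzed into a nested structure; the representation (strings for simple statements, pairs of header and body for compound ones) and the function name pretty are hypothetical.

```python
def pretty(block, depth=0, indent="    "):
    """Return lines indented in proportion to their nesting depth.

    A block is a list whose items are either statement strings or
    (header, nested_block) pairs for compound statements.
    """
    lines = []
    for item in block:
        if isinstance(item, str):
            lines.append(indent * depth + item)
        else:
            header, body = item
            lines.append(indent * depth + header)
            lines.extend(pretty(body, depth + 1, indent))
    return lines


# Sample program in an invented Pascal-like notation.
program = [
    "i := 0",
    ("while i < 10 do", [
        ("if odd(i) then", ["total := total + i"]),
        "i := i + 1",
    ]),
]
print("\n".join(pretty(program)))
```

A real pretty printer would first parse the program text to obtain this structure; the sketch covers only the printing step.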
34. Static Checkers
• A static checker reads a program, analyzes it, and
attempts to discover potential bugs without
running the program.
• A static checker may detect that parts of the source
program can never be executed, or that a certain
variable might be used before being defined.
• In addition, by employing type-checking
techniques, it can catch logical errors such as
trying to use a real variable as a pointer.
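A minimal sketch of one such check, limited to straight-line code with no branches or loops. The statement format (the assigned variable plus the variables read on the right-hand side) and the function name are assumptions for illustration; real static checkers analyze the full control flow of the program.

```python
def used_before_defined(statements, predefined=()):
    """Report variables that are read before any assignment to them.

    Each statement is (target, operands): the variable assigned and the
    variables its right-hand side reads. Only straight-line code is handled.
    """
    defined = set(predefined)
    problems = []
    for line_no, (target, operands) in enumerate(statements, start=1):
        for name in operands:
            if name not in defined:
                problems.append((line_no, name))
        defined.add(target)
    return problems


program = [
    ("a", []),          # a := 1
    ("b", ["a", "c"]),  # b := a + c   (c has no value yet)
    ("c", ["b"]),       # c := b
]
print(used_before_defined(program))
```

The program is never run: the checker only inspects which names each statement reads and writes.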
35. Interpreters
• Instead of producing a target program as a
translation, an interpreter performs the
operations implied by the source program.
• For example, for an assignment statement
an interpreter might build a tree and then
carry out the operations at the nodes as it
“walks” the tree.
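position := initial + rate * 60
[Figure: syntax tree for the statement above. The root := has children <id,1> and +; the + node has children <id,2> and *; the * node has children <id,3> and 60. Here <id,1>, <id,2> and <id,3> stand for position, initial and rate.]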
36. Interpreters (cont.)
• At the root it would discover it had an assignment to
perform, so it would call a routine to evaluate the
expression on the right, and then store the resulting value
in the location associated with the identifier position.
• At the right child of the root, the routine would discover it
had to compute the sum of two expressions.
• It would call itself recursively to compute the value of the
expression rate * 60.
• It would then add that value to the value of the variable
initial.
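The walk described above can be sketched in Python. This is an illustrative sketch only: the tuple encoding of the tree, the function names evaluate and execute, and the sample values for initial and rate are all assumptions.

```python
def evaluate(node, env):
    """Recursively compute the value of an expression tree."""
    if isinstance(node, (int, float)):  # numeric literal such as 60
        return node
    if isinstance(node, str):           # identifier: look up its current value
        return env[node]
    op, left, right = node              # interior node: (operator, left, right)
    if op == "+":
        return evaluate(left, env) + evaluate(right, env)
    if op == "*":
        return evaluate(left, env) * evaluate(right, env)
    raise ValueError(f"unknown operator {op!r}")


def execute(stmt, env):
    """Carry out an assignment: evaluate the right side, then store the result."""
    op, target, expr = stmt
    if op != ":=":
        raise ValueError(f"unknown statement {op!r}")
    env[target] = evaluate(expr, env)


# position := initial + rate * 60
tree = (":=", "position", ("+", "initial", ("*", "rate", 60)))
env = {"initial": 10.0, "rate": 2.0}  # sample values, assumed for illustration
execute(tree, env)
print(env["position"])  # 10.0 + 2.0 * 60 = 130.0
```

No target program is produced at any point: the operations are performed directly as the tree is walked.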
37. Text Formatters
• A text formatter takes input that is a stream
of characters, most of which is text to be
typeset, but some of which includes
commands to indicate paragraphs, figures or
mathematical structures like subscripts and
superscripts.
38. Silicon compilers
• A silicon compiler has a source language
that is similar or identical to a conventional
programming language.
• However, the variables of the language
represent not locations in memory but
logical signals (0 or 1) or groups of signals
in a switching circuit.
39. Query interpreters
• A query interpreter translates a predicate
containing relational and Boolean operators
into commands to search a database for
records satisfying that predicate.
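A small sketch of the idea, assuming the predicate has already been parsed into a tree of tuples. The operator subset, the record layout, and the names RELATIONAL, matches, and search are hypothetical; a real database would translate the predicate into index lookups or scan commands rather than filter an in-memory list.

```python
import operator

# Relational operators supported by this sketch (an assumed subset).
RELATIONAL = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def matches(pred, record):
    """Evaluate a predicate tree against one record (a dict)."""
    op = pred[0]
    if op == "and":
        return matches(pred[1], record) and matches(pred[2], record)
    if op == "or":
        return matches(pred[1], record) or matches(pred[2], record)
    if op == "not":
        return not matches(pred[1], record)
    _, field_name, value = pred  # relational leaf: (operator, field, constant)
    return RELATIONAL[op](record[field_name], value)


def search(records, pred):
    """Return the records satisfying the predicate."""
    return [r for r in records if matches(pred, r)]


# Hypothetical table and query: dept = "CS" and (age > 30 or age < 22)
employees = [
    {"name": "Ann", "dept": "CS", "age": 35},
    {"name": "Bob", "dept": "CS", "age": 25},
    {"name": "Cy", "dept": "EE", "age": 40},
]
query = ("and", ("=", "dept", "CS"), ("or", (">", "age", 30), ("<", "age", 22)))
print([r["name"] for r in search(employees, query)])
```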