Lex is officially known as a "Lexical Analyser". Its main job is to break up an input stream into more usable elements.
Yacc is officially known as a "parser". In the course of its normal work, the parser also verifies that the input is syntactically sound.
Systems Programming & Operating Systems - Overview of LEX-and-YACC
1. Systems Programming & Operating Systems
Unit – III
Case Study: Overview of LEX and YACC
Prof. Deptii Chaudhari
Assistant Professor
Department of Computer Engineering
International Institute of Information Technology, Pune
2. LEX & YACC
• What is Lex?
• Lex is officially known as a "Lexical Analyser".
• Its main job is to break up an input stream into more usable elements.
• Or, in other words, to identify the "interesting bits" in a text file.
• What is Yacc?
• Yacc is officially known as a "parser".
• In the course of its normal work, the parser also verifies that the input is
syntactically sound.
• YACC stands for "Yet Another Compiler Compiler". This is because this
kind of analysis of text files is normally associated with writing compilers.
Deptii Chaudhari, Dept of Computer Engineering, Hope Foundation’s International Institute of Information Technology, I²IT, P-14, Rajiv Gandhi Infotech Park,
MIDC Phase 1, Hinjawadi, Pune – 411057 Tel - +91 20 22933441/2/3 | www.isquareit.edu.in | info@isquareit.edu.in
4. LEX Program Structure
Definitions section
%{
C global variables, prototypes, comments
%}
%%
Production rules section
%%
User subroutine section (optional)
5. • In the rules section, each rule is made up of two parts: a pattern and an action,
separated by whitespace.
• The lexer that lex generates executes the action when it recognizes the
pattern.
• The user subroutine section consists of any legal C code.
• Lex copies it to the C file after the end of the lex-generated code.
• Lex translates the lex specification into a C source file called lex.yy.c, which
we compile and link with the lex library (-ll).
• Then we can execute the resulting program to check that it works as
expected.
6. Example
%{
#include <stdio.h>
%}
%%
[0123456789]+ printf("NUMBER\n");
[a-zA-Z][a-zA-Z0-9]* printf("WORD\n");
%%
• Running the Program
$ lex example_lex.l
$ gcc lex.yy.c -ll
$ ./a.out
7. Pattern Matching Primitives
Metacharacter   Matches
.               any character except newline
\n              newline
*               zero or more copies of the preceding expression
+               one or more copies of the preceding expression
?               zero or one copy of the preceding expression
^               beginning of line
$               end of line
a|b             a or b
(ab)+           one or more copies of ab (grouping)
"a+b"           literal "a+b" (C escapes still work)
[]              character class
8. Pattern Matching Examples
Expression      Matches
abc             abc
abc*            ab, abc, abcc, abccc, ...
abc+            abc, abcc, abccc, abcccc, ...
a(bc)+          abc, abcbc, abcbcbc, ...
a(bc)?          a, abc
[abc]           one of: a, b, c
[a-z]           any letter, a through z
[a\-z]          one of: a, -, z
[-az]           one of: -, a, z
[A-Za-z0-9]+    one or more alphanumeric characters
[ \t\n]+        whitespace
[^ab]           anything except: a, b
[a^b]           one of: a, ^, b
[a|b]           one of: a, |, b
a|b             one of: a, b
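Several rows of the table above can be checked outside lex with the POSIX regex library, whose extended syntax shares the operators *, +, ?, |, () and [] with lex patterns. The following is a minimal sketch under that assumption; it is not how lex itself matches, and lex-only features such as quoted strings ("a+b") are not covered. The helper name matches_whole is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` as a whole matches the POSIX
 * extended regular expression `pattern`, 0 otherwise. */
int matches_whole(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    int ok = 0;

    assert(pattern != NULL && text != NULL);

    /* REG_EXTENDED selects the syntax where +, ?, | and () are operators. */
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;   /* the pattern did not compile */

    /* Accept only when the match starts at 0 and ends at the terminator. */
    if (regexec(&re, text, 1, &m, 0) == 0)
        ok = (m.rm_so == 0 && text[m.rm_eo] == '\0');

    regfree(&re);
    return ok;
}
```

For instance, matches_whole("a(bc)+", "abcbc") returns 1, while matches_whole("abc+", "ab") returns 0 because + requires at least one c.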
9. Operation of yylex()
• When lex processes the input specification, it generates the
C file lex.yy.c, which contains the routine int yylex(void).
• This routine reads the input and tries to match it against
the token patterns specified in the rules section.
• On a match, the associated action is executed.
• Calling the yylex() function starts the process of
pattern matching.
• Lex stores the matched string at the address pointed to by
yytext.
• The length of the matched string is kept in yyleng, while an action
can pass the value of the token to the parser in the variable yylval.
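To make the roles of yytext and yyleng concrete, here is a hand-written C sketch that mimics one call of a scanner for the NUMBER and WORD patterns from the earlier example. It is an illustration only, not the table-driven code that lex generates. The names next_token, tok_text, tok_leng and the TOK_ constants are hypothetical stand-ins, and skipping whitespace is a choice made here for brevity.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum { TOK_END = 0, TOK_NUMBER, TOK_WORD, TOK_OTHER };

/* Hypothetical stand-ins for yytext and yyleng. */
char tok_text[64];
int  tok_leng;

/* Scans one token starting at *src and advances *src past it. */
int next_token(const char **src)
{
    const char *p, *start;
    int kind;

    assert(src != NULL && *src != NULL);
    p = *src;

    /* Skip blanks, tabs and newlines (an assumption of this sketch). */
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;

    start = p;
    if (*p == '\0') {
        kind = TOK_END;
    } else if (isdigit((unsigned char)*p)) {          /* [0-9]+ */
        while (isdigit((unsigned char)*p))
            p++;
        kind = TOK_NUMBER;
    } else if (isalpha((unsigned char)*p)) {          /* [a-zA-Z][a-zA-Z0-9]* */
        while (isalnum((unsigned char)*p))
            p++;
        kind = TOK_WORD;
    } else {                                          /* any other single character */
        p++;
        kind = TOK_OTHER;
    }

    /* Record the matched text and its length; overly long text is truncated. */
    tok_leng = (int)(p - start);
    if (tok_leng > (int)sizeof tok_text - 1)
        tok_leng = (int)sizeof tok_text - 1;
    memcpy(tok_text, start, (size_t)tok_leng);
    tok_text[tok_leng] = '\0';

    *src = p;
    return kind;
}
```

A caller would loop on next_token(&cursor) until it returns TOK_END, reading tok_text and tok_leng after each call, much as an action reads yytext and yyleng.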
10. Example
%{
#include <stdio.h>
int com = 0;   /* number of comments found */
%}
%%
"/*"[^\n]+"*/" { com++; fprintf(yyout, " "); }
%%
int main()
{
    printf("Write a C program\n");
    yyout = fopen("output", "w");
    yylex();
    printf("Comment=%d\n", com);
    return 0;
}
$ cc lex.yy.c -ll
$ ./a.out
Write a C program
#include<stdio.h>
int main()
{
int a, b;
/*float c;*/
printf("Hi");
/*printf("Hello");*/
}
Comment=2
$ cat output
#include<stdio.h>
int main()
{
int a, b;
printf("Hi");
}
11. Lex Predefined Variables
12. YACC
• YACC is a parser generator that takes an input file with
an attribute-enriched BNF (Backus-Naur Form) grammar
specification.
• It generates the output C file y.tab.c containing the
function int yyparse(void) that implements the parser.
• This function automatically invokes yylex() every time it
needs a token to continue parsing.
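The way a parser pulls tokens from the lexer on demand can be sketched in plain C. The example below is a hand-written recursive-descent evaluator for + and * over integers, used here as a stand-in: yacc generates a table-driven parser instead, so only the calling pattern is comparable. All names (lex, parse, lookahead, token_value, T_NUMBER) are hypothetical. Like yyparse(), parse() returns 0 on success and nonzero on a syntax error.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { T_END = 0, T_NUMBER = 256 };  /* above the char range, so operators can be their own codes */

static const char *input;   /* text being parsed */
static int lookahead;       /* current token code */
static int token_value;     /* stand-in for yylval */

/* Stand-in for yylex(): returns the next token code. */
static int lex(void)
{
    while (*input == ' ' || *input == '\t')
        input++;
    if (*input == '\0')
        return T_END;
    if (isdigit((unsigned char)*input)) {
        token_value = 0;
        while (isdigit((unsigned char)*input))
            token_value = token_value * 10 + (*input++ - '0');
        return T_NUMBER;
    }
    return (unsigned char)*input++;   /* single character token */
}

/* number : NUMBER */
static int parse_number(int *ok)
{
    int v = token_value;              /* save before fetching the next token */
    if (lookahead != T_NUMBER) { *ok = 0; return 0; }
    lookahead = lex();
    return v;
}

/* term : number { '*' number } */
static int parse_term(int *ok)
{
    int v = parse_number(ok);
    while (*ok && lookahead == '*') {
        lookahead = lex();
        v *= parse_number(ok);
    }
    return v;
}

/* expr : term { '+' term } */
static int parse_expr(int *ok)
{
    int v = parse_term(ok);
    while (*ok && lookahead == '+') {
        lookahead = lex();
        v += parse_term(ok);
    }
    return v;
}

/* Stand-in for yyparse(): 0 on success, 1 on syntax error. */
int parse(const char *text, int *result)
{
    int ok = 1;
    assert(text != NULL && result != NULL);
    input = text;
    lookahead = lex();                /* fetch the first token */
    *result = parse_expr(&ok);
    return (ok && lookahead == T_END) ? 0 : 1;
}
```

Calling parse("2+3*4", &r) returns 0 and stores 14 in r. The lexer is never run ahead of the parser: each lex() call happens only when the parser needs the next token.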
14. Structure of YACC Program
Definitions section
%{
C global variables, prototypes, comments
%}
%%
Context-free grammar and an action for each production
%%
Subroutines/Functions
15. Arithmatic.l
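%{
#include <stdio.h>
#include <stdlib.h>   /* atoi */
#include "y.tab.h"    /* token codes generated by yacc -d */
extern int yylval;
%}
%%
[0-9]+  {
            yylval = atoi(yytext);   /* pass the number's value to the parser */
            return NUMBER;
        }
[\t]    ;                   /* ignore tabs */
[\n]    return 0;           /* a newline ends the input */
.       return yytext[0];   /* any other character is its own token */
%%
int yywrap()
{
    return 1;
}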
How To Run:
$ yacc -d arithmatic.y
$ lex arithmatic.l
$ gcc lex.yy.c y.tab.c
$ ./a.out