4 lexical
1.
Chapter 4
Lexical analysis
CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc.
2.
Scanner
• Main task: identify tokens
  – Basic building blocks of programs
  – E.g. keywords, identifiers, numbers, punctuation marks
• Desk calculator language example:
    read A
    sum := A + 3.45e-3
    write sum
    write sum / 2
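To make the scanner's task concrete, here is a minimal hand-written sketch in C for the desk calculator example. The token kinds, the function name next_token, and the exact number syntax are assumptions made for illustration; the slides do not specify them.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for the desk calculator language. */
enum Kind { T_READ, T_WRITE, T_ID, T_NUM, T_ASSIGN, T_OP, T_ERROR, T_END };

/* Scan one token starting at *p, copy its text into buf, and advance *p. */
static enum Kind next_token(const char **p, char *buf, size_t cap)
{
    const char *s = *p;
    while (isspace((unsigned char)*s)) s++;            /* skip whitespace */
    const char *start = s;
    enum Kind kind;

    if (*s == '\0') {
        kind = T_END;
    } else if (isalpha((unsigned char)*s)) {           /* identifier or keyword */
        while (isalnum((unsigned char)*s)) s++;
        kind = T_ID;
    } else if (isdigit((unsigned char)*s)) {           /* digits [. digits] [e [+|-] digits] */
        while (isdigit((unsigned char)*s)) s++;
        if (s[0] == '.' && isdigit((unsigned char)s[1])) {
            s++;
            while (isdigit((unsigned char)*s)) s++;
        }
        if (*s == 'e') {
            const char *t = s + 1;
            if (*t == '+' || *t == '-') t++;
            if (isdigit((unsigned char)*t)) {          /* only take a complete exponent */
                while (isdigit((unsigned char)*t)) t++;
                s = t;
            }
        }
        kind = T_NUM;
    } else if (s[0] == ':' && s[1] == '=') {
        s += 2;
        kind = T_ASSIGN;
    } else if (strchr("+-*/", *s) != NULL) {
        s++;
        kind = T_OP;
    } else {
        s++;                                           /* unknown character */
        kind = T_ERROR;
    }

    size_t len = (size_t)(s - start);
    if (len >= cap) len = cap - 1;                     /* truncate to fit buf */
    memcpy(buf, start, len);
    buf[len] = '\0';

    /* Keywords are recognized as identifiers first, then reclassified. */
    if (kind == T_ID && strcmp(buf, "read") == 0) kind = T_READ;
    if (kind == T_ID && strcmp(buf, "write") == 0) kind = T_WRITE;

    *p = s;
    return kind;
}
```

Calling next_token repeatedly on a line such as "sum := A + 3.45e-3" yields one token per call until T_END is returned.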
3.
Formal definition of tokens
• A set of tokens is a set of strings over an alphabet
  – {read, write, +, -, *, /, :=, 1, 2, …, 10, …, 3.45e-3, …}
• A set of tokens is a regular set that can be defined by comprehension using a regular expression
• For every regular set, there is a deterministic finite automaton (DFA) that can recognize it
  – (a.k.a. deterministic finite state machine (FSM))
  – i.e. determine whether a string belongs to the set or not
  – Scanners extract tokens from source code in the same way DFAs determine membership
4.
Regular Expressions
A regular expression (RE) is:
• A single character
• The empty string, ε
• The concatenation of two regular expressions
  – Notation: RE1 RE2 (i.e. RE1 followed by RE2)
• The union of two regular expressions
  – Notation: RE1 | RE2
• The closure of a regular expression
  – Notation: RE*
  – * is known as the Kleene star
  – * represents the concatenation of 0 or more strings
• Caution: notations for regular expressions vary
  – Learn the basic concepts and the rest is just syntactic sugar
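The five forms above map directly onto a small recursive data type. The following C sketch is one possible way to test whether a string belongs to the set an RE describes; it tracks sets of string positions in a 64-bit mask instead of building an automaton. The names Re, re_step and re_match are hypothetical, and the approach is limited to strings of at most 63 characters.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One node kind per RE form listed on the slide. */
enum ReKind { RE_CHAR, RE_EMPTY, RE_CONCAT, RE_UNION, RE_STAR };

struct Re {
    enum ReKind kind;
    char ch;                   /* used by RE_CHAR */
    const struct Re *left;     /* used by RE_CONCAT, RE_UNION, RE_STAR */
    const struct Re *right;    /* used by RE_CONCAT, RE_UNION */
};

/* Bit i of a position set means "offset i of s is reachable".
   Given the offsets where r may start, return the offsets where it may end. */
static uint64_t re_step(const struct Re *r, const char *s, size_t n, uint64_t starts)
{
    uint64_t out = 0;
    switch (r->kind) {
    case RE_CHAR:
        for (size_t i = 0; i < n; i++)
            if (((starts >> i) & 1) && s[i] == r->ch)
                out |= (uint64_t)1 << (i + 1);
        return out;
    case RE_EMPTY:
        return starts;                       /* consumes nothing */
    case RE_CONCAT:
        return re_step(r->right, s, n, re_step(r->left, s, n, starts));
    case RE_UNION:
        return re_step(r->left, s, n, starts) | re_step(r->right, s, n, starts);
    case RE_STAR:
        out = starts;                        /* zero repetitions */
        for (;;) {                           /* add repetitions until nothing new appears */
            uint64_t next = out | re_step(r->left, s, n, out);
            if (next == out) return out;
            out = next;
        }
    }
    return 0;
}

/* Return 1 if the whole string s is in the language of r, else 0. */
static int re_match(const struct Re *r, const char *s)
{
    size_t n = strlen(s);
    if (n > 63) return 0;                    /* limitation of the 64-bit position set */
    return (int)((re_step(r, s, n, 1) >> n) & 1);
}
```

An expression such as (a|b)*aa(a|b)* is built by nesting struct Re values: two RE_CHAR nodes, an RE_UNION over them, an RE_STAR over the union, and RE_CONCAT nodes to chain the pieces.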
5.
Token Definition Example
• Numeric literals in Pascal, e.g. 1, 123, 3.1415, 10e-3, 3.14e4
• Definition of token unsignedNum
    DIG → 0|1|2|3|4|5|6|7|8|9
    unsignedInt → DIG DIG*
    unsignedNum → unsignedInt (( . unsignedInt) | ε) ((e ( + | - | ε) unsignedInt) | ε)
[Figure: finite automaton for unsignedNum, with edges labeled DIG, ".", e, + and -]
• Notes:
  – Recursion is not allowed!
  – Parentheses used to avoid ambiguity
  – It's always possible to rewrite removing epsilons
• FAs with epsilons are nondeterministic.
• NFAs are much harder to implement (use backtracking)
• Every NFA can be rewritten as a DFA (gets larger, though)
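The unsignedNum definition can be turned by hand into a deterministic recognizer with no epsilon moves. The C sketch below is one such encoding; the state names and the function is_unsigned_num are illustrative choices, not taken from the slides.

```c
#include <ctype.h>

/* States of a hand-built DFA for unsignedNum (names are illustrative). */
enum NumState {
    N_START,  /* nothing read yet */
    N_INT,    /* inside the integer part (accepting) */
    N_DOT,    /* just read '.', a digit must follow */
    N_FRAC,   /* inside the fraction (accepting) */
    N_E,      /* just read 'e', a sign or digit must follow */
    N_SIGN,   /* just read the exponent sign, a digit must follow */
    N_EXP,    /* inside the exponent (accepting) */
    N_DEAD    /* no match possible any more */
};

/* Return 1 if s is exactly one unsignedNum, else 0. */
static int is_unsigned_num(const char *s)
{
    enum NumState st = N_START;
    for (; *s != '\0' && st != N_DEAD; s++) {
        int d = isdigit((unsigned char)*s);
        switch (st) {
        case N_START: st = d ? N_INT : N_DEAD; break;
        case N_INT:   st = d ? N_INT : *s == '.' ? N_DOT : *s == 'e' ? N_E : N_DEAD; break;
        case N_DOT:   st = d ? N_FRAC : N_DEAD; break;
        case N_FRAC:  st = d ? N_FRAC : *s == 'e' ? N_E : N_DEAD; break;
        case N_E:     st = d ? N_EXP : (*s == '+' || *s == '-') ? N_SIGN : N_DEAD; break;
        case N_SIGN:  st = d ? N_EXP : N_DEAD; break;
        case N_EXP:   st = d ? N_EXP : N_DEAD; break;
        case N_DEAD:  break;
        }
    }
    return st == N_INT || st == N_FRAC || st == N_EXP;
}
```

Each optional part of the definition ("| ε") shows up here as an accepting state that also has outgoing transitions, which is how the epsilons disappear in the deterministic version.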
6.
Simple Problem
• Write a C program which reads in a character string, consisting of a's and b's, one character at a time. If the string contains a double aa, then print "string accepted", else print "string rejected".
• An abstract solution to this can be expressed as a DFA
[Figure: DFA with start state 1, intermediate state 2 and accepting state 3; edges labeled a and b]
• The state transitions of a DFA can be encoded as a table which specifies the new state for a given current state and input

                     input
    current state    a    b
          1          2    1
          2          3    1
          3          3    3
7.
An approach in C

    #include <stdio.h>

    int main(void)
    {
        enum State {S1, S2, S3};
        enum State currentState = S1;
        int c = getchar();
        while (c != EOF) {
            switch (currentState) {
            case S1:
                if (c == 'a') currentState = S2;
                if (c == 'b') currentState = S1;
                break;
            case S2:
                if (c == 'a') currentState = S3;
                if (c == 'b') currentState = S1;
                break;
            case S3:               /* accepting state: stay here */
                break;
            }
            c = getchar();
        }
        if (currentState == S3) printf("string accepted\n");
        else printf("string rejected\n");
        return 0;
    }
8.
Using a table simplifies the program

    #include <stdio.h>

    int main(void)
    {
        enum State {S1, S2, S3};
        enum Label {A, B};
        enum State currentState = S1;
        /* table[state][label] gives the next state */
        enum State table[3][2] = {{S2, S1}, {S3, S1}, {S3, S3}};
        enum Label label;
        int c = getchar();
        while (c != EOF) {
            /* ignore anything that is not a or b (e.g. the trailing
               newline), so label is never used uninitialized */
            if (c == 'a' || c == 'b') {
                label = (c == 'a') ? A : B;
                currentState = table[currentState][label];
            }
            c = getchar();
        }
        if (currentState == S3) printf("string accepted\n");
        else printf("string rejected\n");
        return 0;
    }
9.
Lex
• Lexical analyzer generator
  – It writes a lexical analyzer
• Assumption
  – Each token matches a regular expression
• Needs
  – A set of regular expressions
  – For each expression, an action
• Produces
  – A C program
• Automatically handles many tricky problems
• flex is the GNU version of the venerable Unix tool lex.
  – Produces highly optimized code
10.
Scanner Generators
• E.g. lex, flex
• These programs take a table as their input and return a program (i.e. a scanner) that can extract tokens from a stream of characters
• A very useful programming utility, especially when coupled with a parser generator (e.g., yacc)
• Standard in Unix
11.
Lex example
[Figure: foo.l → lex → foolex.c → cc → foolex; input → foolex → tokens]

    > flex -ofoolex.c foo.l
    > cc -ofoolex foolex.c -lfl
    > more input
    begin
    if size>10
    then size * -3.1415
    end

    > foolex < input
    Keyword: begin
    Keyword: if
    Identifier: size
    Operator: >
    Integer: 10 (10)
    Keyword: then
    Identifier: size
    Operator: *
    Operator: -
    Float: 3.1415 (3.1415)
    Keyword: end
12.
A Lex Program
General layout:

    … definitions …
    %%
    … rules …
    %%
    … subroutines …

Example:

    DIG [0-9]
    ID  [a-z][a-z0-9]*
    %%
    {DIG}+              printf("Integer\n");
    {DIG}+"."{DIG}*     printf("Float\n");
    {ID}                printf("Identifier\n");
    [ \t\n]+            /* skip whitespace */
    .                   printf("Huh?\n");
    %%
    int main(void) { yylex(); return 0; }
13.
Simplest Example

    %%
    .|\n    ECHO;
    %%
    int main(void) { yylex(); return 0; }
14.
Strings containing aa

    %%
    (a|b)*aa(a|b)*    {printf("Accept %s\n", yytext);}
    [ab]+             {printf("Reject %s\n", yytext);}
    .|\n              ECHO;
    %%
    int main(void) { yylex(); return 0; }
15.
Rules
• Each rule has a pattern and an action.
• Patterns are regular expressions.
• Only one action is performed
  – The action corresponding to the pattern matched is performed.
  – If several patterns match the input, the one corresponding to the longest sequence is chosen.
  – Among the rules whose patterns match the same number of characters, the rule given first is preferred.
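The two tie-breaking rules (longest match first, then earliest rule) can be sketched in plain C. This is only an illustration of the selection policy, not how lex or flex is implemented internally; the matcher functions and pick_rule are hypothetical names, and each matcher simply reports how many leading characters it accepts.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A rule reports how many leading characters of s it matches (0 = no match). */
typedef size_t (*Matcher)(const char *s);

/* Keyword rule: the literal "if". */
static size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Identifier rule: [a-z][a-z0-9]* */
static size_t match_id(const char *s)
{
    size_t n = 0;
    if (!islower((unsigned char)s[0])) return 0;
    while (islower((unsigned char)s[n]) || isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Integer rule: [0-9]+ */
static size_t match_int(const char *s)
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Rules in priority order, as they would be listed in the rules section. */
static const Matcher rules[] = { match_if, match_id, match_int };
enum { NRULES = sizeof rules / sizeof rules[0] };

/* Return the index of the winning rule, or -1 if none matches;
   the length of the match is stored in *len. */
static int pick_rule(const char *s, size_t *len)
{
    int best = -1;
    *len = 0;
    for (int i = 0; i < NRULES; i++) {
        size_t n = rules[i](s);
        if (n > *len) {        /* strictly longer wins; a tie keeps the earlier rule */
            *len = n;
            best = i;
        }
    }
    return best;
}
```

With this ordering, the input "if" is claimed by the keyword rule because it ties with the identifier rule and comes first, while "iffy" goes to the identifier rule because that match is longer.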
16.
Flex's RE syntax

    x             character 'x'
    .             any character except newline
    [xyz]         character class; in this case, matches either an 'x', a 'y', or a 'z'
    [abj-oZ]      character class with a range in it; matches 'a', 'b', any letter from 'j' through 'o', or 'Z'
    [^A-Z]        negated character class, i.e., any character but those in the class, e.g. any character except an uppercase letter
    [^A-Z\n]      any character EXCEPT an uppercase letter or a newline
    r*            zero or more r's, where r is any regular expression
    r+            one or more r's
    r?            zero or one r's (i.e., an optional r)
    {name}        expansion of the "name" definition (see above)
    "[xy]\"foo"   the literal string: '[xy]"foo' (note the escaped ")
    \x            if x is an 'a', 'b', 'f', 'n', 'r', 't', or 'v', then the ANSI-C interpretation of \x; otherwise, a literal 'x' (i.e., an escape)
    rs            RE r followed by RE s (i.e., concatenation)
    r|s           either an r or an s
    <<EOF>>       end-of-file
17.
Scanner for a toy Pascal-like language

    /* scanner for a toy Pascal-like language */
    %{
    #include <stdlib.h>   /* needed for calls to atoi() and atof() */
    %}
    DIG [0-9]
    ID  [a-z][a-z0-9]*
    %%
    {DIG}+              printf("Integer: %s (%d)\n", yytext, atoi(yytext));
    {DIG}+"."{DIG}*     printf("Float: %s (%g)\n", yytext, atof(yytext));
    if|then|begin|end   printf("Keyword: %s\n", yytext);
    {ID}                printf("Identifier: %s\n", yytext);
    "+"|"-"|"*"|"/"     printf("Operator: %s\n", yytext);
    "{"[^}\n]*"}"       /* skip one-line comments */
    [ \t\n]+            /* skip whitespace */
Editor's notes
Makes parsing much easier (the parser looks for relations between tokens). Other tasks: remove comments.
Automatic generation of scanners