Antlr4 get the right tool for the job
- 2. © Zühlke 2015
What is Antlr?
ANTLR = ANother Tool for Language Recognition
Antlr4 | Alexander Pacha 24. July 2015 Slide 2
- 3. © Zühlke 2015
Basic Concepts: Languages and Parsers
• Every language has a syntax and semantics
• A parser is a syntax analyzer
• Two steps:
  • Lexical analysis: grouping characters into tokens
  • Actual parsing: recognizing the sentence structure and building a parse tree
int a = 42 + 3
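The lexical analysis step can be sketched by hand for the example input. This is only an illustration of what a lexer does, not code generated by Antlr; the class name LexerSketch and the token names INT, ID and OP are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical hand-written lexer, for illustration only.
final class LexerSketch {
    // One named group per token type; whitespace is matched but never emitted.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<WS>\\s+)|(?<INT>[0-9]+)|(?<ID>[a-zA-Z]+)|(?<OP>[=+\\-*/()])");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Restrict matching to the unread part; lookingAt() anchors at its start.
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            if (m.group("INT") != null) {
                tokens.add("INT(" + m.group() + ")");
            } else if (m.group("ID") != null) {
                tokens.add("ID(" + m.group() + ")");
            } else if (m.group("OP") != null) {
                tokens.add("OP(" + m.group() + ")");
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [ID(int), ID(a), OP(=), INT(42), OP(+), INT(3)]
        System.out.println(tokenize("int a = 42 + 3"));
    }
}
```

Note that this sketch has no keyword tokens, so int is reported as an ordinary ID. The parsing step would then work on this token list rather than on single characters.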
- 4. © Zühlke 2015
Antlr Features
• Generates a parser from a grammar specified in the Antlr meta-language
• Emits the generated parser in a selected target language (e.g. Java, C#, Python)
• High performance
• High flexibility (e.g. island grammars, rewriting the input stream)
• Cool features (e.g. error handling, visitors, listeners)
Example input with a syntax error (missing closing parenthesis):
a = (42 + 3
- 5. © Zühlke 2015
Building an application with Antlr
Grammar in Extended Backus-Naur Form (EBNF)
grammar MyGrammar;
rule1 : «stuff»;
rule2 : «more stuff»;
Convenience operators: Optional (?), Zero-or-more (*), One-or-more (+)
Lexer rules (names start with an uppercase letter)
APPLE: 'apple';
INT: [0-9]+;
Parser rules (names start with a lowercase letter)
• Sequence: decimal: INT '.' INT;
• Token dependence: vector: '[' INT+ ']';
• Choice: fruit: APPLE | ORANGE;
• Nested phrase: breakfast: fruit JOGHURT;
- 6. © Zühlke 2015
Grammar Sample
grammar LabeledExpr;

prog: stat+ ;

stat: expr NEWLINE            # printExpr
    | ID '=' expr NEWLINE     # assign
    | CLEAR NEWLINE           # clearCmd
    | NEWLINE                 # blank
    ;

// Earlier alternatives bind tighter: MulDiv takes precedence over AddSub
expr: expr op=('*'|'/') expr  # MulDiv
    | expr op=('+'|'-') expr  # AddSub
    | INT                     # int
    | ID                      # id
    | '(' expr ')'            # parens
    ;

MUL     : '*' ;
DIV     : '/' ;
ADD     : '+' ;
SUB     : '-' ;
// Keywords are listed before ID so that they win when both match the same text
PRINT   : 'print' ;
CLEAR   : 'clear' ;
ID      : [a-zA-Z]+ ;
INT     : [0-9]+ ;
NEWLINE : '\r'? '\n' ;
WS      : [ \t]+ -> skip ;
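In the expr rule, the multiplication alternative is listed before the addition alternative, and Antlr 4 gives earlier alternatives higher precedence. The effect can be sketched with a small hand-written recursive-descent evaluator that uses one method per precedence level. This is not what Antlr generates; the class ExprSketch and its method names are assumptions made for this sketch, and identifiers (ID) are left out to keep it short.

```java
// Hypothetical hand-written evaluator, for illustration only.
final class ExprSketch {
    private final String src;
    private int pos = 0;

    private ExprSketch(String src) {
        // Spaces are dropped up front, similar to the WS -> skip rule.
        this.src = src.replace(" ", "");
    }

    static int eval(String input) {
        ExprSketch p = new ExprSketch(input);
        int value = p.addSub();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at index " + p.pos);
        }
        return value;
    }

    // Lowest precedence: '+' and '-' (the AddSub alternative).
    private int addSub() {
        int left = mulDiv();
        while (pos < src.length() && (peek() == '+' || peek() == '-')) {
            char op = src.charAt(pos++);
            int right = mulDiv();
            left = (op == '+') ? left + right : left - right;
        }
        return left;
    }

    // Higher precedence: '*' and '/' (the MulDiv alternative).
    private int mulDiv() {
        int left = atom();
        while (pos < src.length() && (peek() == '*' || peek() == '/')) {
            char op = src.charAt(pos++);
            int right = atom();
            left = (op == '*') ? left * right : left / right;
        }
        return left;
    }

    // INT or a parenthesized expression (the int and parens alternatives).
    private int atom() {
        if (pos < src.length() && peek() == '(') {
            pos++;
            int value = addSub();
            expect(')');
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(peek())) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected INT or '(' at index " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    private char peek() {
        return src.charAt(pos);
    }

    private void expect(char c) {
        if (pos >= src.length() || src.charAt(pos) != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at index " + pos);
        }
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(eval("5 + 4 * 2"));    // 13, because '*' binds tighter
        System.out.println(eval("(10 - 5) * 2")); // 10
    }
}
```

An input with a missing closing parenthesis, such as (42 + 3, makes eval throw an IllegalArgumentException. Error indices refer to the input with spaces removed.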
- 7. © Zühlke 2015
Generated Tree
Sample program:
a = 42 + 3
b = (a - 5) * 2
5 + 4
clear
b
[Figure: parse tree generated for the sample program]
- 8. © Zühlke 2015
Listener Sample
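package Sample1;

public class SimpleListener extends LabeledExprBaseListener {
    // Called when the walker enters an expr alternative labeled "# int"
    @Override
    public void enterInt(LabeledExprParser.IntContext ctx) {
        System.out.println(ctx.getText());
    }

    // Called only for expr alternatives labeled "# id"; the ID on the
    // left-hand side of an assignment is a plain token and is not reported
    @Override
    public void enterId(LabeledExprParser.IdContext ctx) {
        System.out.println("ID: " + ctx.getText());
    }
}
Output:
42
3
ID: a
5
2
5
4
ID: b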
- 9. © Zühlke 2015
Visitor Sample
package Sample1;

public class SimpleVisitor extends LabeledExprBaseVisitor<Object> {
    @Override
    public Object visitAssign(LabeledExprParser.AssignContext ctx) {
        System.out.println(ctx.getText());
        // Returning here without visiting the children stops the traversal;
        // return super.visitAssign(ctx) instead to descend into the subtree.
        return null;
    }

    @Override
    public Object visitAddSub(LabeledExprParser.AddSubContext ctx) {
        System.out.println(ctx.getText());
        return null;
    }
}
Output:
a=42+3
b=(a-5)*2
5+4
- 10. © Zühlke 2015
Quiz
Goal: Create a grammar to parse CSV files
Sample data:
2,34,13
13,33,14
9,66,94
Bonus 1: Allow , or ; to be used as separator
Bonus 2: Allow integer and decimal values (e.g. 33.15)
- 11. © Zühlke 2015
Example Solution
grammar CommaSeparatedValues;
file: row+;
row: field (',' field)* NEWLINE;
field: INT;
INT: [0-9]+;
NEWLINE: '\r'? '\n';
// Bonus 1 (replaces the row rule above):
row: field ((','|';') field)* NEWLINE;
// Bonus 2 (replaces the field rule above and adds a lexer rule):
field: INT | DECIMAL;
DECIMAL: INT '.' INT;
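To make concrete what the solution grammar accepts once both bonus rules are in place, here is a hand-written sketch that mirrors the file, row and field rules without using Antlr. The class name CsvSketch and the choice to return every field as a Double are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written equivalent of the grammar, for illustration only.
final class CsvSketch {
    // Mirrors:  file: row+;
    //           row: field ((','|';') field)* NEWLINE;
    //           field: INT | DECIMAL;
    static List<List<Double>> parse(String text) {
        List<List<Double>> rows = new ArrayList<>();
        // NEWLINE: '\r'? '\n'
        for (String line : text.split("\\r?\\n")) {
            List<Double> row = new ArrayList<>();
            // Separator is ',' or ';'; limit -1 keeps empty fields so they get rejected
            for (String field : line.split("[,;]", -1)) {
                // INT: [0-9]+   DECIMAL: INT '.' INT
                if (!field.matches("[0-9]+(\\.[0-9]+)?")) {
                    throw new IllegalArgumentException("Not an INT or DECIMAL: '" + field + "'");
                }
                row.add(Double.parseDouble(field));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Prints [[2.0, 34.0, 13.0], [13.0, 33.0, 14.0], [9.0, 66.0, 94.0]]
        System.out.println(parse("2,34,13\n13,33,14\n9,66,94\n"));
    }
}
```

Like the grammar, which has no whitespace rule, the sketch rejects spaces around fields as well as empty fields and blank lines.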
Editor's notes
- BNF: consists of alternatives, token references, and rule references