SlideShare une entreprise Scribd logo
1  sur  36
Télécharger pour lire hors ligne
SMACC: BEHIND THE REFACTORINGS
Jason Lecerf & Thierry Goubier
May 17, 2017
TABLE OF CONTENTS
1 Introduction
2 Overview of SmaCC
General workflow
From a user point of view
3 The rewrite engine
The engine
Anatomy of the rewrites
And RB
4 Final words
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 2/36
SMACC
The Smalltalk Compiler Compiler is a parser generator for
Smalltalk
Originally developed by John Brant & Don Roberts
Used in Moose, Synectique, CEA, RefactoryWorkers
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 3/36
CONTEXT
Architecture to generate the front-end of compilers
for DSL parsing
for program analysis
for program migration
for refactoring and transformation of source code
for compiler front-end implementation
SmaCC is written in Smalltalk and generate parsers in
Smalltalk
It has the infrastructure for generating parsers in other
programming languages
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 4/36
EXAMPLE OF USE
What you want to do:
Find pattern in code written in a (niche) language
Refactor these patterns into something new
What you will need:
The grammar for your language (if it does not already exists
in SmaCC)
Your patterns and related transformations
Your program
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 5/36
TABLE OF CONTENTS
1 Introduction
2 Overview of SmaCC
General workflow
From a user point of view
3 The rewrite engine
The engine
Anatomy of the rewrites
And RB
4 Final words
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 6/36
CAPABILITIES
Automated generation of LR parsers
Input: the specification of the grammar
Output: parser for the grammar
can create arbitrary code run at parse time
can create an AST1
1
Abstract Syntax Tree SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 7/36
SMACC GENERATION PIPELINE
Figure: SmaCC overall pipeline
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 8/36
SMACC INPUT: THE GRAMMAR
In a slightly modified BNF2 form
CondExp
: ArithExp
| BitwiseExp
| RelationalExp
| BoolExp
| TernaryExp
| <lpar > CondExp <rpar >
| "defined (" Id <rpar >
| Id
| Number
;
#if NB_BITS < 16
#elif NB_BITS < 32
#else
#endif
2
Bachus-Naur Form: a meta-grammarSmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 9/36
SMACC PARSER GENERATION
The generation:
Produces a DFA lexer (the scanner)
Produces either LR(1) or LALR(1) parsers
Can be augmented to support GLR parsing
Generate methods for parse table transitions
not exactly the parse tables themselves (optimizations were
done)
no tables, just methods for the lexer
LR Standard parser for context-free grammars
LALR Merge states resulting in a smaller memory footprint
GLR Try all the possible transitions for a state
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 10/36
SMACC PARSER GENERATION OUTPUT
The generated parser is a Smalltalk package containing:
a Scanner class
a Parser class
the AST node classes
a generic AST visitor
Running the parser on an input pogram:
produce AST nodes instances
execute arbitrary code given to the grammar
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 11/36
GRAMMAR WITH ARBITRARY CODE EXECUTION
Expression
: Expression ’left ’ "+" Expression ’right ’
{left + right}
| Expression ’left ’ "-" Expression ’right ’
{left - right}
| Expression ’left ’ "*" Expression ’right ’
{left * right}
| Expression ’left ’ "/" Expression ’right ’
{left / right}
| Expression ’left ’ "^" Expression ’right ’
{left raisedTo: right}
| "(" Expression ’expression ’ ")" {expression}
| Number ’number ’ {number}
;
Number
: <number > ’numberToken ’
{numberToken value asNumber}
;
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 12/36
GRAMMAR WITH AST PRODUCTION
Expression
: Expression ’left ’ "+" ’op ’ Expression ’right ’
{{ Expression }}
| Expression ’left ’ "-" ’op ’ Expression ’right ’
{{ Expression }}
| Expression ’left ’ "*" ’op ’ Expression ’right ’
{{ Expression }}
| Expression ’left ’ "/" ’op ’ Expression ’right ’
{{ Expression }}
| Expression ’left ’ "^" ’op ’ Expression ’right ’
{{ Expression }}
| "(" Expression ’expression ’ ")" {{}}
| Number
;
Number
: <number > {{ Number }}
;
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 13/36
TABLE OF CONTENTS
1 Introduction
2 Overview of SmaCC
General workflow
From a user point of view
3 The rewrite engine
The engine
Anatomy of the rewrites
And RB
4 Final words
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 14/36
EXTENDED SMACC PIPELINE
Figure: Extended SmaCC pipeline
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 15/36
PRECONDITIONS
Parser for the language
Enable GLR parsing in the grammar
Declare a pattern token in the grammar
usually in between backquotes since they are used in barely
any language
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 16/36
INPUT TO THE REWRITE ENGINE
Pattern
Metavariables
Transformation
Parser: MyExpressionParser
>>>‘a‘ + ‘b‘<<<
->
>>>‘a‘ ‘b‘ +<<<
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 17/36
REWRITTEN OUTPUT EXAMPLE
Input program:
(3 + 4) + (4 + 3)
Rewritten program using the rewrite engine:
3 4 + 4 3 + +
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 18/36
ANATOMY OF THE REWRITE ENGINE
Figure: SmaCC rewriting process
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 19/36
PARSING OF THE PATTERN STRING
Metavariables can match any nodes (unless specified
otherwise)
can be modified to match list of nodes or specific types of
nodes
Use the GLR parser to parse the pattern
Try all the possible starting symbols (entry points) of the
grammar
ex: Methods, expressions, method call
not only the top entry point (often ”Program”)
Get all the possible ASTs for the pattern
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 20/36
PRELIMINARY OPERATIONS
1 Parse program using the GLR parser
Produces the program AST
2 Parse pattern using the GLR parser
If there are conflicts: we get a forest of trees
If there are pattern nodes (metavariables): we get a forest of
trees if valid for the grammar
Otherwise: we get a single tree
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 21/36
SIMPLIFIED SPLITFORPATTERNTOKEN
if currentToken = patternToken then
for all symbol in {tokens OR non-terminal nodes} do
actionsToProcess ←all possible LR actions for symbol
for all LR action in actionsToProcess do
Check if action was not already performed
if symbol = Token OR
(symbol = Node AND action = reduction) then
Add a token interpretation to the current token
Try to perform current LR action
else if symbol = Node AND action = shift then
stateStack add new ambiguous state
end if
end for
end for
Remove current pattern token state
end if SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 22/36
MATCHING OF THE PATTERN
Based on ambiguity handling by GLR
Reuses the parser for your grammar
Based on the parse tables of said parser
When parsing a pattern token:
Try all the valid action-token combinations (transitions) for the
current state
When conflicts arise (i.e. more than one transition is possible),
fork the parser
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 23/36
ANATOMY OF THE REWRITE ENGINE
Figure: SmaCC rewriting process
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 24/36
UNIFICATION ALGORITHM
Require: patternForest, programTree
for all programNode in programTree do
Depth first traversal
for all patternTree in patternForest do
for all patternNode in patternTree do
if patternNodeclass = programNodeclass then
Tries to match patternNode subnodes with
programNode subnodes
else
continue
end if
end for
end for
end for
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 25/36
UNIFICATION ALGORITHM
Figure: SmaCC rewriting process
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 26/36
EQUALITY
ArgumentListPar
: "(" (Expression ’argument ’
("," Expression ’argument ’)* )? ")"
Node equality Class are identical and every subnodes, subtokens
match
Token equality Both values are identical (same string)
Here: the left and right parenthesis tokens
Node collection equality Every individual node matches
Here: the list of n − 1 commas
Token collection equality Every individual token matches
Here: the list of n Expression arguments
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 27/36
CONTEXT & METAVARIABLE BINDING
Figure: SmaCC binding & rewriting process
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 28/36
ANATOMY OF THE REWRITE ENGINE
Figure: SmaCC rewriting process
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 29/36
REWRITING
When nodes match, they are stored in the context
Even if it seems intuitive, SmaCC does not perform AST
rewriting
Is rewriting a part of a tree really that simple ?
And what if I rewrite in another programming language ?
When rewriting, only transform the source of the nodes to the
source of the transform
i.e.: the source of the transform is not parsed
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 30/36
WHAT ABOUT RB ?
RB and SmaCC share the same creators
But Smalltalk is a very simple language (to parse)
It is simple to specify a pattern tree directly
Use a bit the parser to complete the pattern tree
Rule: a pattern is valid Smalltalk code
But may match a slightly different tree (message)
Subtree matching algorithm is exactly the same as in SmaCC
For example, see
RBPatternMethodNode>>#match:inContext:
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 31/36
TABLE OF CONTENTS
1 Introduction
2 Overview of SmaCC
General workflow
From a user point of view
3 The rewrite engine
The engine
Anatomy of the rewrites
And RB
4 Final words
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 32/36
SMACC APPROACH
Put some (most) of the complexity in the parser
Use reflexivity on the grammar
The grammar specify all correct phrases (all valid sequences of
tokens)
The AST directives specify all possibles nodes and trees of
nodes (complete type specification)
The parser contains all that information in the state tables
Query it!
Match and rewrite (on a large scale... over a million lines of
code)
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 33/36
CONCLUSION
SmaCC is a parser generator extended with pattern matching
and rewriting capabilities
Uses the parser as a way to reflect on the grammar and build
pattern trees
Tree traversal for matching is depth first (ASTs are not very
deep)
Generalization of the Refactoring Browser in the case of an
”arbitrary” grammar
Used extensively by John Brant, Thierry Goubier and others...
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 34/36
WIP: SMACC BOOKLET
Tutorial and documentation book
for SmaCC
SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 35/36
Thank you!
Commissariat `a l’´energie atomique et aux ´energies alternatives
Campus Saclay Nano-Innov F-91191 Gif-sur-Yvette Cedex
www-list.cea.fr
´Etablissement public `a caract`ere industriel et commercial RCS Paris B 775 685 019

Contenu connexe

En vedette (7)

A living programming environment for blockchain
A living programming environment for blockchainA living programming environment for blockchain
A living programming environment for blockchain
 
Pharo VM Performance
Pharo VM PerformancePharo VM Performance
Pharo VM Performance
 
Raspberry and Pharo
Raspberry and PharoRaspberry and Pharo
Raspberry and Pharo
 
Welcome: PharoDays 2017
Welcome: PharoDays 2017Welcome: PharoDays 2017
Welcome: PharoDays 2017
 
Robotic Exploration and Mapping with Pharo
Robotic Exploration and Mapping with PharoRobotic Exploration and Mapping with Pharo
Robotic Exploration and Mapping with Pharo
 
Pharo 6
Pharo 6Pharo 6
Pharo 6
 
Pharo 64bits
Pharo 64bitsPharo 64bits
Pharo 64bits
 

Similaire à Smack: Behind the Refactorings

Regular%20 expression%20processing%20in%20abap
Regular%20 expression%20processing%20in%20abapRegular%20 expression%20processing%20in%20abap
Regular%20 expression%20processing%20in%20abap
abaperscnjavasdn
 

Similaire à Smack: Behind the Refactorings (20)

Inexpensive Datamasking for MySQL with ProxySQL - data anonymization for deve...
Inexpensive Datamasking for MySQL with ProxySQL - data anonymization for deve...Inexpensive Datamasking for MySQL with ProxySQL - data anonymization for deve...
Inexpensive Datamasking for MySQL with ProxySQL - data anonymization for deve...
 
Ch4.mapreduce algorithm design
Ch4.mapreduce algorithm designCh4.mapreduce algorithm design
Ch4.mapreduce algorithm design
 
Kernel Recipes 2017 - EBPF and XDP - Eric Leblond
Kernel Recipes 2017 - EBPF and XDP - Eric LeblondKernel Recipes 2017 - EBPF and XDP - Eric Leblond
Kernel Recipes 2017 - EBPF and XDP - Eric Leblond
 
Query Optimization - Brandon Latronica
Query Optimization - Brandon LatronicaQuery Optimization - Brandon Latronica
Query Optimization - Brandon Latronica
 
New features in abap
New features in abapNew features in abap
New features in abap
 
LINEデリマでのElasticsearchの運用と監視の話
LINEデリマでのElasticsearchの運用と監視の話LINEデリマでのElasticsearchの運用と監視の話
LINEデリマでのElasticsearchの運用と監視の話
 
Advanced REXX Programming Techniques
Advanced REXX Programming TechniquesAdvanced REXX Programming Techniques
Advanced REXX Programming Techniques
 
Use Cases of Row Pattern Matching in Oracle 12c
Use Cases of Row Pattern Matching in Oracle 12cUse Cases of Row Pattern Matching in Oracle 12c
Use Cases of Row Pattern Matching in Oracle 12c
 
Hipster FP code harder to maintain because it actively removes domain knowledge
Hipster FP code harder to maintain because it actively removes domain knowledge Hipster FP code harder to maintain because it actively removes domain knowledge
Hipster FP code harder to maintain because it actively removes domain knowledge
 
Big Data Processing using a AWS Dataset
Big Data Processing using a AWS DatasetBig Data Processing using a AWS Dataset
Big Data Processing using a AWS Dataset
 
Streaming Random Forest Learning in Spark and StreamDM with Heitor Murilogome...
Streaming Random Forest Learning in Spark and StreamDM with Heitor Murilogome...Streaming Random Forest Learning in Spark and StreamDM with Heitor Murilogome...
Streaming Random Forest Learning in Spark and StreamDM with Heitor Murilogome...
 
Mdb dn 2017_18_query_hackathon
Mdb dn 2017_18_query_hackathonMdb dn 2017_18_query_hackathon
Mdb dn 2017_18_query_hackathon
 
MonadReader: Less is More
MonadReader: Less is MoreMonadReader: Less is More
MonadReader: Less is More
 
Natural Language Generation in the Wild
Natural Language Generation in the WildNatural Language Generation in the Wild
Natural Language Generation in the Wild
 
Introduction to Graph Databases wth neo4J
Introduction to Graph Databases wth neo4JIntroduction to Graph Databases wth neo4J
Introduction to Graph Databases wth neo4J
 
Incremental View Maintenance for openCypher Queries
Incremental View Maintenance for openCypher QueriesIncremental View Maintenance for openCypher Queries
Incremental View Maintenance for openCypher Queries
 
Incremental View Maintenance for openCypher Queries
Incremental View Maintenance for openCypher QueriesIncremental View Maintenance for openCypher Queries
Incremental View Maintenance for openCypher Queries
 
effectivegraphsmro1
effectivegraphsmro1effectivegraphsmro1
effectivegraphsmro1
 
Matlab-Data types and operators
Matlab-Data types and operatorsMatlab-Data types and operators
Matlab-Data types and operators
 
Regular%20 expression%20processing%20in%20abap
Regular%20 expression%20processing%20in%20abapRegular%20 expression%20processing%20in%20abap
Regular%20 expression%20processing%20in%20abap
 

Plus de Pharo

Plus de Pharo (20)

Yesplan: 10 Years later
Yesplan: 10 Years laterYesplan: 10 Years later
Yesplan: 10 Years later
 
Object-Centric Debugging: a preview
Object-Centric Debugging: a previewObject-Centric Debugging: a preview
Object-Centric Debugging: a preview
 
The future of testing in Pharo
The future of testing in PharoThe future of testing in Pharo
The future of testing in Pharo
 
Spec 2.0: The next step on desktop UI
Spec 2.0: The next step on desktop UI Spec 2.0: The next step on desktop UI
Spec 2.0: The next step on desktop UI
 
UI Testing with Spec
 UI Testing with Spec UI Testing with Spec
UI Testing with Spec
 
Pharo 7.0 and 8.0 alpha
Pharo 7.0 and 8.0 alphaPharo 7.0 and 8.0 alpha
Pharo 7.0 and 8.0 alpha
 
PHARO IoT: Installation Improvements and Continuous Integration
PHARO IoT: Installation Improvements and Continuous IntegrationPHARO IoT: Installation Improvements and Continuous Integration
PHARO IoT: Installation Improvements and Continuous Integration
 
Easy REST with OpenAPI
Easy REST with OpenAPIEasy REST with OpenAPI
Easy REST with OpenAPI
 
Comment soup with a pinch of types, served in a leaky bowl
Comment soup with a pinch of types, served in a leaky bowlComment soup with a pinch of types, served in a leaky bowl
Comment soup with a pinch of types, served in a leaky bowl
 
apart Framework: Porting from VisualWorks
apart Framework: Porting from VisualWorksapart Framework: Porting from VisualWorks
apart Framework: Porting from VisualWorks
 
Git with Style
Git with StyleGit with Style
Git with Style
 
Pharo JS
Pharo JSPharo JS
Pharo JS
 
Seaside & ReactJS
Seaside & ReactJSSeaside & ReactJS
Seaside & ReactJS
 
Material Design and Seaside
Material Design and SeasideMaterial Design and Seaside
Material Design and Seaside
 
Calypso Browser
Calypso BrowserCalypso Browser
Calypso Browser
 
Pharo 7: The key challenges
Pharo 7: The key challengesPharo 7: The key challenges
Pharo 7: The key challenges
 
An introduction to the CARGO package manager
An introduction to the CARGO package managerAn introduction to the CARGO package manager
An introduction to the CARGO package manager
 
gt4gemstone
gt4gemstonegt4gemstone
gt4gemstone
 
Pharo Hand-Ons: 06 finding information
Pharo Hand-Ons: 06 finding informationPharo Hand-Ons: 06 finding information
Pharo Hand-Ons: 06 finding information
 
Pharo Hands-On: 01 welcome
Pharo Hands-On: 01 welcomePharo Hands-On: 01 welcome
Pharo Hands-On: 01 welcome
 

Dernier

%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 

Dernier (20)

AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
SHRMPro HRMS Software Solutions Presentation
SHRMPro HRMS Software Solutions PresentationSHRMPro HRMS Software Solutions Presentation
SHRMPro HRMS Software Solutions Presentation
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 

Smack: Behind the Refactorings

  • 1. SMACC: BEHIND THE REFACTORINGS Jason Lecerf & Thierry Goubier May 17, 2017
  • 2. TABLE OF CONTENTS 1 Introduction 2 Overview of SmaCC General workflow From a user point of view 3 The rewrite engine The engine Anatomy of the rewrites And RB 4 Final words SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 2/36
  • 3. SMACC The Smalltalk Compiler Compiler is a parser generator for Smalltalk Originally developed by John Brant & Don Roberts Used in Moose, Synectique, CEA, RefactoryWorkers SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 3/36
  • 4. CONTEXT Architecture to generate the front-end of compilers for DSL parsing for program analysis for program migration for refactoring and transformation of source code for compiler front-end implementation SmaCC is written in Smalltalk and generate parsers in Smalltalk It has the infrastructure for generating parsers in other programming languages SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 4/36
  • 5. EXAMPLE OF USE What you want to do: Find pattern in code written in a (niche) language Refactor these patterns into something new What you will need: The grammar for your language (if it does not already exists in SmaCC) Your patterns and related transformations Your program SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 5/36
  • 6. TABLE OF CONTENTS 1 Introduction 2 Overview of SmaCC General workflow From a user point of view 3 The rewrite engine The engine Anatomy of the rewrites And RB 4 Final words SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 6/36
  • 7. CAPABILITIES Automated generation of LR parsers Input: the specification of the grammar Output: parser for the grammar can create arbitrary code run at parse time can create an AST1 1 Abstract Syntax Tree SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 7/36
  • 8. SMACC GENERATION PIPELINE Figure: SmaCC overall pipeline SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 8/36
  • 9. SMACC INPUT: THE GRAMMAR In a slightly modified BNF2 form CondExp : ArithExp | BitwiseExp | RelationalExp | BoolExp | TernaryExp | <lpar > CondExp <rpar > | "defined (" Id <rpar > | Id | Number ; #if NB_BITS < 16 #elif NB_BITS < 32 #else #endif 2 Bachus-Naur Form: a meta-grammarSmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 9/36
  • 10. SMACC PARSER GENERATION The generation: Produces a DFA lexer (the scanner) Produces either LR(1) or LALR(1) parsers Can be augmented to support GLR parsing Generate methods for parse table transitions not exactly the parse tables themselves (optimizations were done) no tables, just methods for the lexer LR Standard parser for context-free grammars LALR Merge states resulting in a smaller memory footprint GLR Try all the possible transitions for a state SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 10/36
  • 11. SMACC PARSER GENERATION OUTPUT The generated parser is a Smalltalk package containing: a Scanner class a Parser class the AST node classes a generic AST visitor Running the parser on an input pogram: produce AST nodes instances execute arbitrary code given to the grammar SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 11/36
  • 12. GRAMMAR WITH ARBITRARY CODE EXECUTION Expression : Expression ’left ’ "+" Expression ’right ’ {left + right} | Expression ’left ’ "-" Expression ’right ’ {left - right} | Expression ’left ’ "*" Expression ’right ’ {left * right} | Expression ’left ’ "/" Expression ’right ’ {left / right} | Expression ’left ’ "^" Expression ’right ’ {left raisedTo: right} | "(" Expression ’expression ’ ")" {expression} | Number ’number ’ {number} ; Number : <number > ’numberToken ’ {numberToken value asNumber} ; SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 12/36
  • 13. GRAMMAR WITH AST PRODUCTION Expression : Expression ’left ’ "+" ’op ’ Expression ’right ’ {{ Expression }} | Expression ’left ’ "-" ’op ’ Expression ’right ’ {{ Expression }} | Expression ’left ’ "*" ’op ’ Expression ’right ’ {{ Expression }} | Expression ’left ’ "/" ’op ’ Expression ’right ’ {{ Expression }} | Expression ’left ’ "^" ’op ’ Expression ’right ’ {{ Expression }} | "(" Expression ’expression ’ ")" {{}} | Number ; Number : <number > {{ Number }} ; SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 13/36
  • 14. TABLE OF CONTENTS 1 Introduction 2 Overview of SmaCC General workflow From a user point of view 3 The rewrite engine The engine Anatomy of the rewrites And RB 4 Final words SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 14/36
  • 15. EXTENDED SMACC PIPELINE Figure: Extended SmaCC pipeline SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 15/36
  • 16. PRECONDITIONS Parser for the language Enable GLR parsing in the grammar Declare a pattern token in the grammar usually in between backquotes since they are used in barely any language SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 16/36
  • 17. INPUT TO THE REWRITE ENGINE Pattern Metavariables Transformation Parser: MyExpressionParser >>>‘a‘ + ‘b‘<<< -> >>>‘a‘ ‘b‘ +<<< SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 17/36
  • 18. REWRITTEN OUTPUT EXAMPLE Input program: (3 + 4) + (4 + 3) Rewritten program using the rewrite engine: 3 4 + 4 3 + + SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 18/36
  • 19. ANATOMY OF THE REWRITE ENGINE Figure: SmaCC rewriting process SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 19/36
  • 20. PARSING OF THE PATTERN STRING Metavariables can match any nodes (unless specified otherwise) can be modified to match list of nodes or specific types of nodes Use the GLR parser to parse the pattern Try all the possible starting symbols (entry points) of the grammar ex: Methods, expressions, method call not only the top entry point (often ”Program”) Get all the possible ASTs for the pattern SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 20/36
  • 21. PRELIMINARY OPERATIONS 1 Parse program using the GLR parser Produces the program AST 2 Parse pattern using the GLR parser If there are conflicts: we get a forest of trees If there are pattern nodes (metavariables): we get a forest of trees if valid for the grammar Otherwise: we get a single tree SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 21/36
  • 22. SIMPLIFIED SPLITFORPATTERNTOKEN if currentToken = patternToken then for all symbol in {tokens OR non-terminal nodes} do actionsToProcess ←all possible LR actions for symbol for all LR action in actionsToProcess do Check if action was not already performed if symbol = Token OR (symbol = Node AND action = reduction) then Add a token interpretation to the current token Try to perform current LR action else if symbol = Node AND action = shift then stateStack add new ambiguous state end if end for end for Remove current pattern token state end if SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 22/36
  • 23. MATCHING OF THE PATTERN Based on ambiguity handling by GLR Reuses the parser for your grammar Based on the parse tables of said parser When parsing a pattern token: Try all the valid action-token combinations (transitions) for the current state When conflicts arise (i.e. more than one transition is possible), fork the parser SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 23/36
  • 24. ANATOMY OF THE REWRITE ENGINE Figure: SmaCC rewriting process SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 24/36
  • 25. UNIFICATION ALGORITHM Require: patternForest, programTree for all programNode in programTree do Depth first traversal for all patternTree in patternForest do for all patternNode in patternTree do if patternNodeclass = programNodeclass then Tries to match patternNode subnodes with programNode subnodes else continue end if end for end for end for SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 25/36
  • 26. UNIFICATION ALGORITHM Figure: SmaCC rewriting process SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 26/36
  • 27. EQUALITY ArgumentListPar : "(" (Expression ’argument ’ ("," Expression ’argument ’)* )? ")" Node equality Class are identical and every subnodes, subtokens match Token equality Both values are identical (same string) Here: the left and right parenthesis tokens Node collection equality Every individual node matches Here: the list of n − 1 commas Token collection equality Every individual token matches Here: the list of n Expression arguments SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 27/36
  • 28. CONTEXT & METAVARIABLE BINDING Figure: SmaCC binding & rewriting process SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 28/36
  • 29. ANATOMY OF THE REWRITE ENGINE Figure: SmaCC rewriting process SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 29/36
  • 30. REWRITING When nodes match, they are stored in the context Even if it seems intuitive, SmaCC does not perform AST rewriting Is rewriting a part of a tree really that simple ? And what if I rewrite in another programming language ? When rewriting, only transform the source of the nodes to the source of the transform i.e.: the source of the transform is not parsed SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 30/36
  • 31. WHAT ABOUT RB ? RB and SmaCC share the same creators But Smalltalk is a very simple language (to parse) It is simple to specify a pattern tree directly Use a bit the parser to complete the pattern tree Rule: a pattern is valid Smalltalk code But may match a slightly different tree (message) Subtree matching algorithm is exactly the same as in SmaCC For example, see RBPatternMethodNode>>#match:inContext: SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 31/36
  • 32. TABLE OF CONTENTS 1 Introduction 2 Overview of SmaCC General workflow From a user point of view 3 The rewrite engine The engine Anatomy of the rewrites And RB 4 Final words SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 32/36
  • 33. SMACC APPROACH Put some (most) of the complexity in the parser Use reflexivity on the grammar The grammar specify all correct phrases (all valid sequences of tokens) The AST directives specify all possibles nodes and trees of nodes (complete type specification) The parser contains all that information in the state tables Query it! Match and rewrite (on a large scale... over a million lines of code) SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 33/36
  • 34. CONCLUSION SmaCC is a parser generator extended with pattern matching and rewriting capabilities Uses the parser as a way to reflect on the grammar and build pattern trees Tree traversal for matching is depth first (ASTs are not very deep) Generalization of the Refactoring Browser in the case of an ”arbitrary” grammar Used extensively by John Brant, Thierry Goubier and others... SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 34/36
  • 35. WIP: SMACC BOOKLET Tutorial and documentation book for SmaCC SmaCC & rewriting Jason Lecerf & Thierry Goubier May 17, 2017 35/36
  • 36. Thank you! Commissariat `a l’´energie atomique et aux ´energies alternatives Campus Saclay Nano-Innov F-91191 Gif-sur-Yvette Cedex www-list.cea.fr ´Etablissement public `a caract`ere industriel et commercial RCS Paris B 775 685 019