Regular Expressions for Beginners
Srikanth Modegunta
Introduction

Also referred to as Regex or RegExp

Used to match patterns in text
− Ex: both “maven” and “maeven” match the regex “mae?ven”

Regular expressions are processed by a piece of software called a “regular expression engine”

Most programming languages support regex
− Ex: Perl, Java, C#, etc.
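
As a small illustration of the “mae?ven” example, the sketch below uses Python's standard `re` module. The choice of Python and the extra test word “mvn” are assumptions made for illustration; the slides do not prescribe a language.

```python
import re

# 'e?' makes the second 'e' optional, so both spellings match.
pattern = re.compile(r"mae?ven")

# fullmatch() requires the whole word to match the pattern.
matches = [word for word in ["maven", "maeven", "mvn"] if pattern.fullmatch(word)]
print(matches)  # ['maven', 'maeven']
```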
Introduction (Contd..)

Used wherever text processing is required.

Matching XML or HTML tags can be done with regex, since it is a pattern-matching task (a real parser is still needed for arbitrary documents).
− We will see how to match an XML or HTML tag.

Automation of tasks
− Ex: if a mail subject contains “<operation> <some task name> <command>” then start processing the task.

Text editors adding comments to functions automatically (replacing a pattern with some text)
− Ex: replace “sub subroutine(parameters){<statements>}” by
/* this is a sample subroutine*/
sub subroutine(parameters){<statements>}
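
The comment-insertion case can be sketched with a pattern replacement. This is only one possible approach, written with Python's `re.sub`; the declaration pattern `^sub \w+\(` is an assumption about what a subroutine header looks like.

```python
import re

source = "sub subroutine(parameters){<statements>}"

# Match the start of a 'sub name(' declaration (assumed header shape).
declaration = re.compile(r"^sub \w+\(", flags=re.MULTILINE)

# \g<0> re-inserts the whole matched text after the new comment line.
commented = declaration.sub(r"/* this is a sample subroutine*/\n\g<0>", source)
print(commented)
```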
Meta Characters
The following are the meta characters
\ | ( ) [ { ^ $ * + ? .
Meta Characters (Contd..)
Character   Meaning
*           0 or more
+           1 or more
?           0 or 1 (optional)
.           Any character except newline
^           Start of line. Inside a set it negates: [^abc] means any character other than 'a', 'b' or 'c'
$           End of line
\A          Start of string
\Z          End of string
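
The difference between the line anchor ^ and the string anchor \A only shows up on multi-line input. A minimal sketch in Python's `re` module (an assumed choice of engine; the sample text is made up):

```python
import re

text = "first line\nsecond line"

# With MULTILINE, ^ matches at the start of every line...
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
# ...while \A still matches only at the very start of the string.
string_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)

# '.' does not match a newline unless dot-all mode is enabled.
dot_crosses_newline = re.search(r"first line.second", text) is not None

print(line_starts, string_start, dot_crosses_newline)
```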
Meta Characters (Contd..)
Character   Meaning
{ }         Repeats the preceding item a known number of times
            Ex: a{2,5} matches 'a' repeated at least 2 and at most 5 times
|           Alternation ('or') between patterns
            Ex: cat|dog|mouse
( )         Groups a subpattern and captures the text it matches
[ ]         Matches exactly one character from the set
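
A short sketch of alternation and counted repetition, again assuming Python's `re` module; the sample sentence and test strings are invented for illustration.

```python
import re

# Alternation: any one of the three words.
animals = re.findall(r"cat|dog|mouse", "a cat chased a mouse")

# Counted repetition: between 2 and 5 'a' characters, nothing else.
repeat = re.compile(r"a{2,5}")
accepted = [s for s in ["a", "aa", "aaaaa", "aaaaaa"] if repeat.fullmatch(s)]

print(animals, accepted)
```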
Quantifiers

Quantifiers specify how many times an item may repeat
− Ex: ear, eaaaar: 'a' occurs 1 time and 4 times in these two cases.

If part of a pattern repeats, a quantifier is needed to match the repetition.

To match both cases above, use the regex
− ea+r, which means 'a' occurs 1 or more times
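
The ea+r example can be checked directly. A minimal sketch assuming Python's `re` module; the word “er” is added only to show a non-match.

```python
import re

pattern = re.compile(r"ea+r")

# '+' needs at least one 'a', so "er" is rejected.
results = {word: bool(pattern.fullmatch(word)) for word in ["ear", "eaaaar", "er"]}
print(results)
```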
Quantifiers (Contd..)
*           0 or more times (greedy, called “hungry” matching here)
            Ex: ca* matches c, ca, caa, caaa etc.
            Matches even if no 'a' is present, and otherwise takes as many consecutive 'a's as it can
+           1 or more times (greedy)
            Ex: ca+ matches ca, caa, caaa etc.
{n}         Exactly n times
            Ex: ca{4}r matches caaaar
{m,}        At least m times, with no upper limit (greedy)
            Ex: ca{2,}r matches only if 'a' repeats 2 or more times
{m,n}       At least m and at most n times (greedy)
            Ex: ca{2,3}r matches caar and caaar
Hungry (greedy) matching means the quantifier matches as much text as possible.
Ex: for ca{0,4} and the text “caaaa”, the whole text matches, i.e. all 4 'a's are consumed.
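
Greedy behaviour is easiest to see by looking at what `search` actually returns. A sketch assuming Python's `re` module; the inputs “cb” and “caaaaa” are made-up test strings.

```python
import re

# Greedy quantifiers take as many 'a's as they are allowed to.
all_as = re.search(r"ca{0,4}", "caaaa").group()
star = re.search(r"ca*", "caaaaaa").group()

# '*' also succeeds when there is no 'a' at all.
no_a = re.search(r"ca*", "cb").group()

# An upper bound caps how much is consumed.
bounded = re.search(r"ca{2,3}", "caaaaa").group()

print(all_as, star, no_a, bounded)
```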
Quantifiers (Contd..)
*?          Lazy: 0 or more times, but as few as possible
            Ex: for the text “caaaaaa”, “ca*?” matches only 'c'
+?          Lazy: 1 or more times, but as few as possible
            Ex: for the text “caaaaaa”, “ca+?” matches only 'ca'
??          Lazy: 0 or 1 times, preferring 0
            Ex: for the text “ca”, “ca??” matches only 'c'
{min,}?     Lazy forms of the counted quantifiers
{n}?        ({n}? behaves like {n}, since the count is fixed)
{min,max}?
Lazy matching means the quantifier matches as little text as possible while still letting the overall pattern match.
Ex: for ca{0,4}? and the text “caaaa”, only “c” is matched.
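
The lazy examples can be reproduced as follows. This sketch assumes Python's `re` module; the last case (“caaar”) is an added illustration that a lazy quantifier still expands when the rest of the pattern requires it.

```python
import re

lazy_star = re.search(r"ca*?", "caaaaaa").group()
lazy_plus = re.search(r"ca+?", "caaaaaa").group()
lazy_optional = re.search(r"ca??", "ca").group()
lazy_range = re.search(r"ca{0,4}?", "caaaa").group()

# The trailing 'r' forces the lazy 'a*?' to consume every 'a' before it.
forced = re.search(r"ca*?r", "caaar").group()

print(lazy_star, lazy_plus, lazy_optional, lazy_range, forced)
```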
Character Sets

A character set matches one character from the set

[abcd] is the same as [a-d]

[a-di-l] is the same as [abcdijkl]

[^abcd] matches any character other than a, b, c, d

Quantifiers can be applied to character sets
− [a-z]+ matches the string 'hello' in 'hello1234E'
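
The set rules above can be tried out in a few lines. A sketch assuming Python's `re` module; the alphabet strings are invented test input.

```python
import re

# A quantified set: the first run of lowercase letters.
first_run = re.search(r"[a-z]+", "hello1234E").group()

# Two ranges in one set: a-d and i-l.
two_ranges = re.findall(r"[a-di-l]", "abcdefghijklm")

# A negated set: everything except a, b, c, d.
negated = re.findall(r"[^abcd]", "abcdef")

print(first_run, two_ranges, negated)
```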
Characters for Matching
Common character class shorthands
\w   [a-zA-Z0-9_]
\d   [0-9]
\s   [ \t\n\r]
\W   [^a-zA-Z0-9_]
\D   [^0-9]
\S   [^ \t\n\r]
\b   Word boundary
\B   Anything other than a word boundary
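
A quick sketch of the shorthands in use, assuming Python's `re` module. For plain ASCII input such as the made-up strings here, the shorthands behave like the sets listed above; some engines extend them to other Unicode characters.

```python
import re

numbers = re.findall(r"\d+", "order 66, aisle 7")
words = re.findall(r"\w+", "snake_case and CamelCase!")
fields = re.split(r"\s+", "a  b\tc\nd")

# \b keeps 'cat' from matching inside 'concat'.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

print(numbers, words, fields, whole_words)
```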
Simple Matching

Example address: modegunta.srikanth@gmail.com
− Mail id should not start with a number or a special symbol
− Mail id can start with _
− Mail id can have '.' in the middle
− Should end with @domain.com (the pattern below also accepts .co.in)

Pattern:
− [a-zA-Z_][a-zA-Z_.]+@\w+\.(com|co\.in)
− Meta characters (here '.') must be escaped in the pattern to match them as normal characters; inside a set such as [a-zA-Z_.] the '.' is already literal
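
The mail pattern can be exercised against a few candidates. This is a sketch assuming Python's `re` module; every address except the first is a hypothetical test value, and the pattern is the simplified one from the slide, not a complete email validator.

```python
import re

email = re.compile(r"[a-zA-Z_][a-zA-Z_.]+@\w+\.(com|co\.in)")

candidates = [
    "modegunta.srikanth@gmail.com",  # from the slide
    "_user@example.co.in",           # hypothetical: leading underscore, .co.in
    "1user@gmail.com",               # hypothetical: starts with a digit
    "user@gmailXcom",                # hypothetical: no literal dot before 'com'
]

# fullmatch() rejects strings with anything before or after the pattern.
valid = [c for c in candidates if email.fullmatch(c)]
print(valid)
```

Without the backslash before the dot, the last candidate would be accepted, because an unescaped '.' matches any character.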
Modifiers
Modifier   Meaning
i          Case insensitive
g          Global matching: find all matches, not just the first (Perl)
m          Multiline matching: ^ and $ match at each line
s          Dot-all: '.' matches \n also
x          Extended pattern: whitespace and comments inside the pattern are ignored, for readable formatting (Perl)
e          Used when replacing: evaluates the replacement as an expression (Perl)
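
The modifiers above are written Perl-style. As a rough sketch of the same ideas in another engine, Python's `re` module exposes i, m, s and x as flags; it has no g or e flag, so `findall` and a replacement function are used here as approximate stand-ins. The sample text is invented.

```python
import re

text = "Name: Ann\nname: Bob"

# i -> re.IGNORECASE, m -> re.MULTILINE: find 'name' at the start of every line.
starts = re.findall(r"^name", text, flags=re.IGNORECASE | re.MULTILINE)

# s -> re.DOTALL: '.' now also matches the newline between the two lines.
spans_lines = re.search(r"Ann.name", text, flags=re.DOTALL) is not None

# x -> re.VERBOSE: whitespace and comments inside the pattern are ignored.
key_value = re.compile(r"""
    (\w+)   # key
    :\s*    # separator
    (\w+)   # value
""", flags=re.VERBOSE)
pairs = key_value.findall(text)  # findall plays the role of Perl's g

# Perl's e has no flag here; a function as the replacement does a similar job.
shouted = re.sub(r"\b[a-z]\w*", lambda m: m.group().upper(), "ann and Bob")

print(starts, spans_lines, pairs, shouted)
```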
Grouping

Groups are captured using parentheses
− (<pattern>)
− The text matched by the group is saved and can be reused through a back reference (covered later)

Groups capture part of the text matched by the pattern
− Ex: take a simple XML element <root>test</root>
− <(\w+)>.*?</\1>
− Here \1 is a back reference to group 1

Java exposes captured groups through the “group(int)” method of the “java.util.regex.Matcher” class.
Grouping Example

If the command is
− /sbin/service <service-name> <command>
− ([^\s]+)\s+([\w-]+)\s+(start|stop|status)
− Group 0 = the whole matched text
− Group 1 = “/sbin/service”
− Group 2 = <service-name>
− Group 3 = <command>
− Command can be start, stop or status
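
The group numbering can be seen by running the pattern on a concrete command. A sketch assuming Python's `re` module; the service name “httpd” is a hypothetical value chosen for illustration.

```python
import re

command_pattern = re.compile(r"([^\s]+)\s+([\w-]+)\s+(start|stop|status)")

match = command_pattern.fullmatch("/sbin/service httpd stop")

whole = match.group(0)      # group 0: the entire matched text
executable = match.group(1)
service = match.group(2)
action = match.group(3)

# A command outside start|stop|status does not match at all.
rejected = command_pattern.fullmatch("/sbin/service httpd reload")

print(whole, executable, service, action, rejected)
```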
Back References

A back reference stands for the text matched by a capturing group, i.e. by the part of the regular expression inside the parentheses

If the same string must occur more than once in the input, a back reference can be used to match the repeated occurrence

Ex: an XML/HTML start-tag should have a matching end-tag

If the start-tag name is captured in the first group, the end-tag name can be written as a back reference (\1)
Back references example

For example take the XML element
− <root id="E12">test</root>
− <([\w-]+)\s*([^<>]+)?>\w+</\1> matches this XML element
− Group 0: <root id="E12">test</root>
− Group 1: root
− Group 2: id="E12"
− \1 in the regex pattern is the back reference to group 1.
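
The back reference can be checked against both a matching and a mismatched end-tag. A sketch assuming Python's `re` module; the mismatched element with `</node>` is an invented counter-example.

```python
import re

element = re.compile(r'<([\w-]+)\s*([^<>]+)?>\w+</\1>')

match = element.fullmatch('<root id="E12">test</root>')
tag_name = match.group(1)
attributes = match.group(2)

# \1 must repeat the captured tag name, so a different end-tag fails.
mismatch = element.fullmatch('<root id="E12">test</node>')

print(tag_name, attributes, mismatch)
```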
No grouping with parenthesis

If the parenthesized pattern does not need to be captured as a group
− Use ?: at the start of the group: (?:<pattern>)
− (text1|text2|text3) matches any one of text1, text2 and text3 and captures it as a group
− (?:text1|text2|text3) matches the same text but does not create a group
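
The difference between the two forms shows up in the number of groups a match reports. A minimal sketch assuming Python's `re` module; the input “see text2” is made up.

```python
import re

capturing = re.compile(r"(text1|text2|text3)")
non_capturing = re.compile(r"(?:text1|text2|text3)")

# Both patterns match the same text...
same_text = capturing.search("see text2").group() == non_capturing.search("see text2").group()

# ...but only the first one records a group.
captured = capturing.search("see text2").groups()
not_captured = non_capturing.search("see text2").groups()

print(same_text, captured, not_captured)
```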
Look ahead and Look behind

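Positive look-ahead
− \w+(?=:) does not select all words, only words that come directly before ':'

Negative look-ahead
− \w+(?!:) selects word characters that are not followed by ':'. Because of backtracking it can still match part of a word that precedes ':' (“ke” in “key:”); \b\w+\b(?!:) excludes such words entirely

In a look-ahead, the regex engine checks the text after the current position against the filtering pattern without making it part of the match.

Positive look-behind
− (?<=a)b selects a 'b' that follows 'a'

Negative look-behind
− (?<!a)b selects a 'b' that doesn't follow 'a'

In a look-behind, the regex engine checks the text before the current position against the filtering pattern without making it part of the match.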
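
The four look-around forms can be compared side by side. A sketch assuming Python's `re` module; the sample strings are invented, and the \b-anchored negative look-ahead is used so that whole words are tested rather than fragments.

```python
import re

text = "name: Bob, age: 30"

# Positive look-ahead: words directly before ':' (the ':' is not part of the match).
keys = re.findall(r"\w+(?=:)", text)

# Negative look-ahead on whole words: words not directly before ':'.
non_keys = re.findall(r"\b\w+\b(?!:)", text)

sample = "ab cb"
# Positive look-behind: positions of 'b' preceded by 'a'.
after_a = [m.start() for m in re.finditer(r"(?<=a)b", sample)]
# Negative look-behind: positions of 'b' not preceded by 'a'.
not_after_a = [m.start() for m in re.finditer(r"(?<!a)b", sample)]

print(keys, non_keys, after_a, not_after_a)
```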
References:
1) http://www.regular-expressions.info/tutorial.html
2) Thinking in Java, 4th Edition, chapter “Strings”, page 392
Thank You
