PYTHON APPLICATION
PROGRAMMING -18EC646
MODULE-3
REGULAR EXPRESSIONS
PROF. KRISHNANANDA L
DEPARTMENT OF ECE
GSKSJTI, BENGALURU
WHAT IS MEANT BY
REGULAR EXPRESSION?
We have seen string/file slicing, searching, parsing etc. and
built-in methods like split, find etc.
This task of searching and extracting finds applications in
email classification, web searching etc.
Python has a very powerful library for regular expressions (the re module)
that handles many of these tasks quite elegantly.
Regular expressions are like a small but powerful programming
language for matching text patterns. They provide a
standardized way of searching, replacing, and parsing text
with complex patterns of characters.
A regular expression can be defined as a sequence of
characters that is used to search for a pattern in a string.
FEATURES OF REGEX
Hundreds of lines of code can be reduced to a few lines with regular
expressions
Used to construct compilers, interpreters and text editors
Used to search and match text patterns
The power of regular expressions comes when we add special
characters to the search string that allow us to do sophisticated
matching and extraction with very little code.
Used to validate text data formats, especially input data
A Regular Expression (or Regex) is a pattern (or filter) that describes
a set of strings. A regex consists of a
sequence of characters, metacharacters (such as . , \d , ? , \W etc.) and
operators (such as + , * , ? , | , ^ ).
Popular programming languages like Python, Perl, JavaScript, Ruby,
Tcl, C# etc. have regex capabilities
GENERAL USES OF REGULAR
EXPRESSIONS
Search a string (search and match)
Replace parts of a string (sub)
Break a string into small pieces (split)
Find all occurrences of a pattern in a string (findall)
The module re provides the support to use regex in a
Python program. The re module raises an exception (re.error) if
the regular expression is invalid.
Before using regular expressions in a program, we have to
import the library using "import re"
REGEX FUNCTIONS
The re module offers a set of functions
FUNCTION   DESCRIPTION
findall    Returns a list containing all matches of a pattern in the string
search     Returns a Match object if there is a match anywhere in the string
split      Returns a list where the string has been split at each match
sub        Replaces one or more matches in a string (substitutes another string)
match      Matches the regex pattern only at the beginning of the string, with
           optional flags. Returns a Match object if a match is found,
           otherwise returns None.
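As a quick orientation, the sketch below calls each of the five functions once. The sample string and variable names are made up for illustration and are not taken from the slides.

```python
import re

text = "hello world, hello python"   # hypothetical sample string

all_hits = re.findall("hello", text)      # list of every non-overlapping match
first_hit = re.search("world", text)      # Match object, or None if not found
pieces = re.split(", ", text)             # split the string at each match
replaced = re.sub("hello", "bye", text)   # substitute every match
at_start = re.match("hello", text)        # only matches at the beginning

print(all_hits)              # ['hello', 'hello']
print(first_hit.group())     # world
print(pieces)                # ['hello world', 'hello python']
print(replaced)              # bye world, bye python
print(at_start is not None)  # True
```

Note that search() and match() return a Match object rather than True or False, so the result is usually tested with "if" or compared with None.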
EXAMPLE PROGRAM
• We open the file, loop through
each line, and use the regular
expression search() to print only
the lines that contain the string
"hello". (The same can be done using
"line.find()".)
# Search for lines that contain 'hello'
import re
with open('d:/18ec646/demo1.txt') as fp:
    for line in fp:
        line = line.rstrip()
        if re.search('hello', line):
            print(line)
Output:
hello and welcome to python class
hello how are you?
# Search for lines that contain 'hello'
import re
with open('d:/18ec646/demo2.txt') as fp:
    for line in fp:
        line = line.rstrip()
        if re.search('hello', line):
            print(line)
Output:
friends,hello and welcome
hello,goodmorning
EXAMPLE PROGRAM
• To get the full power of regex, we need to use special
characters called 'metacharacters'
# Search for lines that start with 'hello'
import re
with open('d:/18ec646/demo1.txt') as fp:
    for line in fp:
        line = line.rstrip()
        # note the 'caret' metacharacter before hello
        if re.search('^hello', line):
            print(line)
Output:
hello and welcome to python class
hello how are you?
# Search for lines that start with 'hello'
import re
with open('d:/18ec646/demo2.txt') as fp:
    for line in fp:
        line = line.rstrip()
        # note the 'caret' metacharacter before hello
        if re.search('^hello', line):
            print(line)
Output:
hello, goodmorning
METACHARACTERS
Metacharacters are characters that are interpreted in a
special way by a RegEx engine.
Metacharacters are very helpful for parsing/extraction
from the given file/string
Metacharacters allow us to build more powerful regular
expressions.
Table-1 provides a summary of metacharacters and their
meaning in RegEx
Here's a list of metacharacters:
[ ] . ^ $ * + ? { } ( ) \ |
Metacharacter  Description                                               Example
[ ]            Represents a set of characters.                           "[a-z]"
\              Signals a special sequence (can also be used to           "\r"
               escape special characters).
.              Matches any single character at that position             "Ja...v."
               (except the newline character).
^              Matches the pattern at the beginning of the string        "^python"
               (indicates "startswith").
$              Matches the pattern at the end of the string              "world$"
               (indicates "endswith").
*              Zero or more occurrences of the preceding character       "hello*"
               or group.
+              One or more occurrences of the preceding character        "hello+"
               or group.
{}             The specified number of occurrences of the preceding      "hello{2}"
               character or group.
|              Either the pattern on the left or the one on the right.   "hello|hi"
()             Capture and group.                                        "(a|b|c)xz"
[ ] - SQUARE BRACKETS
• Square brackets specify a set of characters you wish to match.
• A set is a group of characters given inside a pair of square brackets. The set
matches any one character that belongs to it.
[abc]       Returns a match if the string contains any of the characters a, b or c.
[a-n]       Returns a match if the string contains any character from a to n.
[^arn]      Returns a match if the string contains any character other than a, r and n.
[0123]      Returns a match if the string contains any of the specified digits.
[0-9]       Returns a match if the string contains any digit from 0 to 9.
[0-5][0-9]  Returns a match if the string contains any two-digit number from 00 to 59.
[a-zA-Z]    Returns a match if the string contains any letter (lower-case or upper-case).
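The sets listed above can be checked directly without a file. The following is a minimal sketch; the sample strings are hypothetical and chosen only to exercise each set.

```python
import re

# Hypothetical sample strings, not from the slides.
samples = ["abc", "XYZ", "time 09:45", "n"]
patterns = ["[abc]", "[^arn]", "[0-5][0-9]", "[a-zA-Z]"]

# For each sample, record which patterns find at least one match.
results = {s: [bool(re.search(p, s)) for p in patterns] for s in samples}

for s, flags in results.items():
    print(repr(s), flags)
# 'abc'        [True, True, False, True]
# 'XYZ'        [False, True, False, True]
# 'time 09:45' [False, True, True, True]
# 'n'          [False, False, False, True]
```

The last row shows the negated set at work: "n" consists only of an excluded character, so [^arn] finds nothing to match.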
CONTD..
### illustrating square brackets
## search all the lines where w is present and display
import re
with open('d:/18ec646/demo5.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("[w]", line):
            print(line)
Output:
Hello and welcome
@abhishek,how are you
### illustrating square brackets
### search for characters g or e or both and display
import re
with open('d:/18ec646/demo3.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("[ge]", line):
            print(line)
Output:
Hello and welcome
This is Bangalore
CONTD…
### illustrating square brackets
import re
with open('d:/18ec646/demo3.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("[th]", line):
            print(line)
Output:
This is Bangalore
This is Paris
This is London
### illustrating square brackets
import re
with open('d:/18ec646/demo7.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("[y]", line):
            print(line)
Output:
johny johny yes papa
open your mouth
### illustrating square brackets
import re
with open('d:/18ec646/demo5.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("[x-z]", line):
            print(line)
Output:
to:abhishek@yahoo.com
@abhishek,how are you
. PERIOD (DOT)
A period matches any single character (except the newline character '\n')
Expression                String   Matched?
..  (any two characters)  a        No match
..                        ac       1 match
..                        acd      1 match
..                        acde     2 matches (contains 4 characters)
### illustrating dot metacharacter
import re
with open('d:/18ec646/demo5.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("y.", line):
            print(line)
Output:
to: abhishek@yahoo.com
@abhishek,how are you
CONTD..
### illustrating dot metacharacter
import re
with open('d:/18ec646/demo3.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("P.", line):
            print(line)
Output:
This is Paris
### illustrating dot metacharacter
## any two characters between T and s
import re
with open('d:/18ec646/demo6.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("T..s", line):
            print(line)
Output:
This is London
These are beautiful flowers
Thus we see the great London bridge
### illustrating dot metacharacter
## any two characters between L and d
import re
with open('d:/18ec646/demo6.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("L..d", line):
            print(line)
Output:
This is London
Thus we see the great London bridge
^ - CARET
The caret symbol ^ is used to check if a string starts with a certain
character or pattern
Expression  String  Matched?
^a          a       1 match
^a          abc     1 match
^a          bac     No match
^ab         abc     1 match
^ab         acb     No match (starts with a but not followed by b)
### illustrating caret
import re
with open('d:/18ec646/demo2.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("^h", line):
            print(line)
Output:
hello, goodmorning
### illustrating caret
import re
with open('d:/18ec646/demo5.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("^f", line):
            print(line)
Output:
from:krishna.sksj@gmail.com
$ - DOLLAR
The dollar symbol $ is used to check if a string ends with a certain
character or pattern.
Expression  String   Matched?
a$          a        1 match
a$          formula  1 match
a$          cab      No match
### illustrating metacharacters
import re
with open('d:/18ec646/demo5.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("m$", line):
            print(line)
Output:
from:krishna.sksj@gmail.com
to: abhishek@yahoo.com
### illustrating metacharacters
import re
with open('d:/18ec646/demo7.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("papa$", line):
            print(line)
Output:
johny johny yes papa
eating sugar no papa
* - STAR
The star symbol * matches zero or more occurrences of the pattern to
its left.
Expression  String  Matched?
ma*n        mn      1 match
ma*n        man     1 match
ma*n        maaan   1 match
ma*n        main    No match (a is not followed by n)
### illustrating metacharacters
## "London*" matches "Londo" followed by zero or more "n"
import re
with open('d:/18ec646/demo6.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("London*", line):
            print(line)
Output:
This is London
Thus we see the great London bridge
+ - PLUS
The plus symbol + matches one or more occurrences of the pattern to
its left.
Expression  String  Matched?
ma+n        mn      No match (no a character)
ma+n        man     1 match
ma+n        maaan   1 match
ma+n        main    No match (a is not followed by n)
### illustrating metacharacters
## "see+" matches "se" followed by one or more "e"
import re
with open('d:/18ec646/demo6.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("see+", line):
            print(line)
Output:
Thus we see the great London bridge
### illustrating metacharacters
import re
with open('d:/18ec646/demo6.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("ar+", line):
            print(line)
Output:
These are beautiful flowers
? - QUESTION MARK
The question mark symbol ? matches zero or one occurrence of the pattern to
its left.
Expression  String  Matched?
ma?n        mn      1 match
ma?n        man     1 match
ma?n        maaan   No match (more than one a character)
### illustrating metacharacters
## the ? applies only to the final "l": "@gmai" followed by an optional "l"
import re
with open('d:/18ec646/demo5.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("@gmail?", line):
            print(line)
Output:
from:krishna.sksj@gmail.com
### illustrating metacharacters
## "you?" matches "yo" followed by an optional "u"
import re
with open('d:/18ec646/demo5.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("you?", line):
            print(line)
Output:
@abhishek,how are you
{} - BRACES
Finds the specified number of occurrences of a pattern. Consider {n,m}. This
means at least n, and at most m, repetitions of the pattern to its left.
If a{2} is given, a must be repeated exactly twice.
Expression  String       Matched?
a{2,3}      abc dat      No match
a{2,3}      abc daat     1 match (aa in daat)
a{2,3}      aabc daaat   2 matches (aa in aabc and aaa in daaat)
a{2,3}      aabc daaaat  2 matches (aa in aabc and aaa in daaaat)
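The rows of the braces table can be reproduced with findall(). This is a small sketch using the same four strings; the variable names are chosen here for illustration.

```python
import re

pattern = "a{2,3}"   # two or three consecutive 'a' characters
strings = ["abc dat", "abc daat", "aabc daaat", "aabc daaaat"]

# Collect the matched substrings for each test string.
found = {s: re.findall(pattern, s) for s in strings}

for s, hits in found.items():
    print(repr(s), hits)
# 'abc dat'     []
# 'abc daat'    ['aa']
# 'aabc daaat'  ['aa', 'aaa']
# 'aabc daaaat' ['aa', 'aaa']
```

In the last string the quantifier is greedy: it takes three of the four a's, and the single a that remains is too short to match again.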
| - ALTERNATION
Vertical bar | is used for alternation (the "or" operator).
Expression  String  Matched?
a|b         cde     No match
a|b         ade     1 match (a in ade)
a|b         acdbea  3 matches (a, b and a)
### illustrating metacharacters
import re
with open('d:/18ec646/demo7.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("yes|no", line):
            print(line)
Output:
johny johny yes papa
eating sugar no papa
### illustrating metacharacters
import re
with open('d:/18ec646/demo2.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("hello|how", line):
            print(line)
Output:
friends,hello and welcome
hello,goodmorning
() - GROUP
Parentheses () are used to group sub-patterns.
For example, (a|b|c)xz matches any string that contains
either a or b or c followed by xz
Expression  String     Matched?
(a|b|c)xz   ab xz      No match
(a|b|c)xz   abxz       1 match (bxz in abxz)
(a|b|c)xz   axz cabxz  2 matches (axz, and bxz in cabxz)
### illustrating metacharacters
import re
with open('d:/18ec646/demo5.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("(hello|how) are", line):
            print(line)
Output:
@abhishek,how are you
### illustrating metacharacters
import re
with open('d:/18ec646/demo2.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if re.search("(hello and)", line):
            print(line)
Output:
friends,hello and welcome
\ - BACKSLASH
Backslash \ is used to escape various characters, including all
metacharacters.
For example, \$a matches if a string contains $ followed by a.
Here, $ is not interpreted by the RegEx engine in a special way.
If you are unsure whether a character has a special meaning or not, you
can put \ in front of it. This makes sure the character is not treated
in a special way.
NOTE :- Another way of doing it is putting the special
character in square brackets [ ]
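Both ways of switching off a metacharacter can be compared in a short sketch. The sample string below is hypothetical; raw strings (r"...") are used so that Python passes the backslash to the regex engine unchanged.

```python
import re

text = "price is $a, ratio 3.5 or 345"   # hypothetical sample string

escaped = re.search(r"\$a", text)     # backslash: $ is a literal dollar sign
in_set = re.search("[$]a", text)      # same effect using a character set
loose = re.findall("3.5", text)       # unescaped dot matches any character
strict = re.findall(r"3\.5", text)    # escaped dot matches only a period

print(escaped.group(), in_set.group())   # $a $a
print(loose)                             # ['3.5', '345']
print(strict)                            # ['3.5']
```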
SPECIAL SEQUENCES
A special sequence is a \ followed by one of the characters
(see Table) and has a special meaning
Special sequences make commonly used patterns easier to
write.
SPECIAL SEQUENCES
Character  Description                                                Example
\A         Returns a match if the specified characters are            "\AThe"
           present at the beginning of the string.
\b         Returns a match if the specified characters are            r"\bain"
           present at the beginning or the end of a word.             r"ain\b"
\B         Returns a match if the specified characters are            r"\Bain"
           present, but NOT at the beginning (or the end)             r"ain\B"
           of a word.
\d         Returns a match if the string contains digits [0-9].       "\d"
\D         Returns a match if the string contains any character       "\D"
           that is not a digit [0-9].
\s         Returns a match if the string contains any white           "\s"
           space character.
\S         Returns a match if the string contains any character       "\S"
           that is not white space.
\w         Returns a match if the string contains any word            "\w"
           characters (A to Z, a to z, 0 to 9 and underscore).
\W         Returns a match if the string contains any character       "\W"
           that is not a word character.
\A - Matches if the specified characters are at the start of a string.
Expression  String      Matched?
\Athe       the sun     Match
\Athe       In the sun  No match
\b - Matches if the specified characters are at the beginning or end of a word
Expression  String         Matched?
\bfoo       football       Match
\bfoo       a football     Match
\bfoo       afootball      No match
foo\b       football       No match
foo\b       the afoo test  Match
foo\b       the afootest   No match
\B - Opposite of \b. Matches if the specified characters
are not at the beginning or end of a word.
Expression  String         Matched?
\Bfoo       football       No match
\Bfoo       a football     No match
\Bfoo       afootball      Match
foo\B       the foo        No match
foo\B       the afoo test  No match
foo\B       the afootest   Match
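The word-boundary rows above can be verified with a short sketch. One practical point, stated here as general Python behaviour rather than something from the slides: in an ordinary string literal "\b" means a backspace character, so these patterns should be written as raw strings.

```python
import re

strings = ["football", "a football", "afootball"]

# For each string: (does \bfoo match?, does \Bfoo match?)
boundary = {
    s: (bool(re.search(r"\bfoo", s)), bool(re.search(r"\Bfoo", s)))
    for s in strings
}

for s, flags in boundary.items():
    print(repr(s), flags)
# 'football'   (True, False)
# 'a football' (True, False)
# 'afootball'  (False, True)
```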
\d - Matches any decimal digit. Equivalent to [0-9]
\D - Matches any character that is not a decimal digit. Equivalent to [^0-9]
Expression  String    Matched?
\d          12abc3    3 matches (1, 2 and 3)
\d          Python    No match
Expression  String    Matched?
\D          1ab34"50  3 matches (a, b and ")
\D          1345      No match
\s - Matches where a string contains any whitespace
character. Equivalent to [ \t\n\r\f\v].
\S - Matches where a string contains any non-whitespace
character. Equivalent to [^ \t\n\r\f\v].
Expression  String        Matched?
\s          Python RegEx  1 match
\s          PythonRegEx   No match
Expression  String        Matched?
\S          a b           2 matches (a and b)
\S          (spaces only) No match
\w - Matches any alphanumeric character. Equivalent to [a-zA-Z0-9_].
Underscore is also considered an alphanumeric character
\W - Matches any non-alphanumeric character. Equivalent
to [^a-zA-Z0-9_]
Expression  String   Matched?
\w          12&":;c  3 matches (1, 2 and c)
\w          %"> !    No match
Expression  String   Matched?
\W          1a2%c    1 match (%)
\W          Python   No match
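The match counts in the \d, \D, \s, \w and \W tables can be reproduced with findall(). This sketch reuses the strings from the tables; the variable names are chosen here for illustration.

```python
import re

digits = re.findall(r"\d", "12abc3")          # decimal digits
non_digits = re.findall(r"\D", '1ab34"50')    # everything except digits
spaces = re.findall(r"\s", "Python RegEx")    # whitespace characters
word_chars = re.findall(r"\w", '12&":;c')     # letters, digits, underscore
non_word = re.findall(r"\W", "1a2%c")         # everything else

print(digits)       # ['1', '2', '3']
print(non_digits)   # ['a', 'b', '"']
print(spaces)       # [' ']
print(word_chars)   # ['1', '2', 'c']
print(non_word)     # ['%']
```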
\Z - Matches if the specified characters are at the end of a
string.
Expression  String                     Matched?
Python\Z    I like Python              1 match
Python\Z    I like Python Programming  No match
Python\Z    Python is fun.             No match
# check whether the specified
# characters are at the end of string
import re
with open('d:/18ec646/demo5.txt') as fp:
    for x in fp:
        x = x.rstrip()
        if re.findall(r"com\Z", x):
            print(x)
Output:
from:krishna.sksj@gmail.com
to: abhishek@yahoo.com
THE FINDALL() FUNCTION
The findall() function returns a list containing all matches.
The list contains the matches in the order they are found.
If no matches are found, an empty list is returned
Here is the syntax for this function −
re.findall(pattern, string, flags=0)
import re
text = "How are you. How is everything?"
matches = re.findall("How", text)
print(matches)
Output:
['How', 'How']
# check whether string starts with How
import re
text = "How are you. How is everything?"
x = re.findall("^How", text)
print(text)
print(x)
if x:
    print("string starts with 'How'")
else:
    print("string does not start with 'How'")
Output:
How are you. How is everything?
['How']
string starts with 'How'
CONTD…
# match all lines that start with 'hello'
import re
with open('d:/18ec646/demo1.txt') as fp:
    for x in fp:
        x = x.rstrip()
        if re.findall('^hello', x):   ## note 'caret' metacharacter
            print(x)
Output:
hello and welcome to python class
hello how are you?
# match all lines that start with '@'
import re
with open('d:/18ec646/demo5.txt') as fp:
    for x in fp:
        x = x.rstrip()
        if re.findall('^@', x):   ## note 'caret' metacharacter
            print(x)
Output:
@abhishek,how are you
# check whether the string contains
# non-digit characters
import re
with open('d:/18ec646/demo5.txt') as fp:
    for x in fp:
        x = x.rstrip()
        if re.findall(r"\D", x):   ## special sequence
            print(x)
Output:
from:krishna.sksj@gmail.com
to:abhishek@yahoo.com
Hello and welcome
@abhishek,how are you
THE SEARCH() FUNCTION
The search() function searches the string for a match, and
returns a Match object if there is a match.
If there is more than one match, only the first occurrence
of the match will be returned
If no matches are found, the value None is returned
Here is the syntax for this function −
re.search(pattern, string, flags=0)
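A minimal sketch of search(), assuming the same sample sentence used in the findall() examples (variable names are chosen here for illustration). It shows that only the first occurrence is reported and that a failed search gives None.

```python
import re

text = "How are you. How is everything?"

first = re.search("How", text)     # two occurrences exist, only the first is returned
word_is = re.search("is", text)    # found later in the string
missing = re.search("xyz", text)   # pattern does not occur

print(first.group(), first.start())   # How 0
print(word_is.start())                # 17
print(missing)                        # None
```

Because a failed search returns None, the result should be checked before calling methods such as group() on it.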
THE SPLIT() FUNCTION
The re.split method splits the string where there is a match
and returns a list of strings where the splits have occurred.
You can pass maxsplit argument to the re.split() method. It's
the maximum number of splits that will occur.
If the pattern is not found, re.split() returns a list containing
the original string.
Here is the syntax for this function −
re.split(pattern, string, maxsplit=0, flags=0)
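The effect of maxsplit, and of a pattern that is not found, can be seen in this small sketch. The sample string is hypothetical.

```python
import re

text = "one1two22three333four"   # hypothetical sample string

all_parts = re.split(r"\d+", text)                # split at every run of digits
first_only = re.split(r"\d+", text, maxsplit=1)   # at most one split
no_match = re.split("@", text)                    # pattern not found

print(all_parts)    # ['one', 'two', 'three', 'four']
print(first_only)   # ['one', 'two22three333four']
print(no_match)     # ['one1two22three333four']
```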
EXAMPLES on split() function:-
# split function
import re
with open('d:/18ec646/demo5.txt') as fp:
    for x in fp:
        x = x.rstrip()
        x = re.split("@", x)
        print(x)
Output:
['from:krishna.sksj','gmail.com']
['to: abhishek','yahoo.com']
['Hello and welcome']
['','abhishek,how are you']
CONTD..
# split function
import re
with open('d:/18ec646/demo7.txt') as fp:
    for x in fp:
        x = x.rstrip()
        x = re.split("e", x)
        print(x)
Output:
['johny johny y','s papa']
['', 'ating sugar no papa']
['t','lling li', 's']
['op','n your mouth']
# split function
import re
with open('d:/18ec646/demo7.txt') as fp:
    for x in fp:
        x = x.rstrip()
        x = re.split("papa", x)
        print(x)
Output:
['johny johny yes ', '']
['eating sugar no ','']
['telling lies']
['open your mouth']
# split function
import re
with open('d:/18ec646/demo3.txt') as fp:
    for x in fp:
        x = x.rstrip()
        x = re.split("is", x)
        print(x)
Output:
['Hello and welcome']
['Th',' ',' Bangalore']
['Th',' ',' Par','']
['Th',' ',' London']
THE SUB() FUNCTION
The sub() function replaces the matches with the text of your
choice
You can control the number of replacements by specifying
the count parameter
If the pattern is not found, re.sub() returns the original string
Here is the syntax for this function −
re.sub(pattern, repl, string, count=0, flags=0)
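The count parameter and the not-found case can be illustrated with a short sketch. The sample sentence is reused from the findall() examples; the variable names are assumptions for illustration.

```python
import re

text = "How are you. How is everything?"

all_replaced = re.sub("How", "Where", text)          # replace every match
first_only = re.sub("How", "Where", text, count=1)   # replace only the first match
unchanged = re.sub("Why", "Where", text)             # pattern not found

print(all_replaced)   # Where are you. Where is everything?
print(first_only)     # Where are you. How is everything?
print(unchanged)      # How are you. How is everything?
```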
EXAMPLES on sub() function:-
### illustration of substitute (replace)
import re
text = "How are you.How is everything?"
x = re.sub("How", "where", text)
print(x)
Output:
where are you.where is everything?
# sub function
import re
with open('d:/18ec646/demo3.txt') as fp:
    for x in fp:
        x = x.rstrip()
        x = re.sub("This", "Where", x)
        print(x)
Output:
Hello and welcome
Where is Bangalore
Where is Paris
Where is London
THE MATCH() FUNCTION
If zero or more characters at the beginning of the string match
the regular expression, match() returns a corresponding Match object.
It returns None if the string does not match the pattern.
Here is the syntax for this function:
re.match(pattern, string, flags=0)
A compiled pattern object offers the equivalent method
Pattern.match(string[, pos[, endpos]]). Its optional pos and endpos
parameters have the same meaning as for the Pattern.search() method.
search() Vs match()
Python offers two different primitive operations based on
regular expressions:
re.match() checks for a match only at the beginning of the string,
while re.search() checks for a match anywhere in the string
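The difference can be seen on a single line of text. This sketch uses one of the lines that appears in the sample outputs earlier in the deck; the variable names are chosen here for illustration.

```python
import re

text = "friends,hello and welcome"

at_start = re.match("hello", text)      # 'hello' is not at the beginning
anywhere = re.search("hello", text)     # found further along the string
prefix = re.match("friends", text)      # this pattern is at the beginning

print(at_start)           # None
print(anywhere.start())   # 8
print(prefix.group())     # friends
```

Calling re.search() with a pattern that starts with ^ behaves like re.match() for a single-line string such as this one.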
# match function
import re
with open('d:/18ec646/demo3.txt') as fp:
    for x in fp:
        x = x.rstrip()
        if re.match("This", x):
            print(x)
Output:
This is Bangalore
This is Paris
This is London
MATCH OBJECT
A Match Object is an object containing information about the
search and the result
If there is no match, the value None will be returned, instead
of the Match Object
Some of the commonly used methods and attributes of match
objects are:
match.group(), match.start(), match.end(), match.span(),
match.string
match.group()
The group() method returns the part of the string where
there is a match
match.start(), match.end()
The start() function returns the index of the start of the
matched substring.
Similarly, end() returns the end index of the matched
substring.
match.string
The string attribute returns the passed string.
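A minimal sketch of these methods and the string attribute, using a line that appears in the sample outputs earlier in the deck (the variable names are assumptions for illustration):

```python
import re

text = "This is London"
m = re.search("London", text)

matched_text = m.group()   # the substring that matched
start_index = m.start()    # index of the first matched character
end_index = m.end()        # index just past the last matched character
original = m.string        # the string that was passed to search()

print(matched_text)             # London
print(start_index, end_index)   # 8 14
print(original)                 # This is London
```

The end index is exclusive, so text[m.start():m.end()] gives back the matched substring.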
match.span()
The span() function returns a tuple containing start
and end index of the matched part.
Eg:-
OUTPUT:
(12,17)
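The code that produced this output is not shown. The following is a sketch that yields the same span, assuming a sample sentence and pattern chosen here purely for illustration (neither is confirmed by the slides):

```python
import re

# Hypothetical sentence and pattern, not taken from the slides.
text = "The rain in Spain"
m = re.search(r"\bS\w+", text)   # a word that starts with capital S

result = m.span()                # (start, end) of the matched part
print(result)                    # (12, 17)
print(text[result[0]:result[1]]) # Spain
```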
Computer Communication Networks-Routing protocols 1Computer Communication Networks-Routing protocols 1
Computer Communication Networks-Routing protocols 1
 
Computer Communication Networks-Wireless LAN
Computer Communication Networks-Wireless LANComputer Communication Networks-Wireless LAN
Computer Communication Networks-Wireless LAN
 
Computer Communication Networks-Network Layer
Computer Communication Networks-Network LayerComputer Communication Networks-Network Layer
Computer Communication Networks-Network Layer
 
Lk module3
Lk module3Lk module3
Lk module3
 
Lk module4 structures
Lk module4 structuresLk module4 structures
Lk module4 structures
 
Lk module4 file
Lk module4 fileLk module4 file
Lk module4 file
 
Lk module5 pointers
Lk module5 pointersLk module5 pointers
Lk module5 pointers
 

Dernier

Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgsaravananr517913
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptMadan Karki
 
Internet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxInternet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxVelmuruganTECE
 
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...Amil Baba Dawood bangali
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating SystemRashmi Bhat
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxsiddharthjain2303
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - GuideGOPINATHS437943
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsSachinPawar510423
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 

Dernier (20)

Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
Internet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxInternet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptx
 
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating System
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptx
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - Guide
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documents
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 

Python regular expressions

  • 1. PYTHON APPLICATION PROGRAMMING - 18EC646, MODULE-3: REGULAR EXPRESSIONS. PROF. KRISHNANANDA L, DEPARTMENT OF ECE, GSKSJTI, BENGALURU
  • 2. WHAT IS MEANT BY REGULAR EXPRESSION?
    We have seen string/file slicing, searching and parsing, and built-in methods such as split and find.
    Searching and extracting text has applications in email classification, web searching, etc.
    Python has a very powerful regular expression library (the re module) that handles many of these tasks quite elegantly.
    Regular expressions are like a small but powerful programming language for matching text patterns. They provide a standardized way of searching, replacing, and parsing text with complex patterns of characters.
    A regular expression can be defined as a sequence of characters used to search for a pattern in a string.
  • 3. FEATURES OF REGEX
    Hundreds of lines of code can be reduced to a few lines with regular expressions.
    Used to construct compilers, interpreters and text editors.
    Used to search and match text patterns.
    The power of regular expressions comes when we add special characters to the search string, which allow sophisticated matching and extraction with very little code.
    Used to validate text data formats, especially input data.
    A Regular Expression (or Regex) is a pattern (or filter) that describes the set of strings matching it. A regex consists of a sequence of ordinary characters, metacharacters (such as . , \d , ? , \W) and operators (such as + , * , ? , | , ^).
    Popular programming languages such as Python, Perl, JavaScript, Ruby, Tcl and C# have regex capabilities.
  • 4. GENERAL USES OF REGULAR EXPRESSIONS
    Search a string (search and match)
    Replace parts of a string (sub)
    Break a string into small pieces (split)
    Find all occurrences of a pattern in a string (findall)
    The re module provides regex support in Python programs. It raises an exception (re.error) if the regular expression itself is invalid.
    Before using regular expressions in a program, we have to import the library using "import re".
  • 5. REGEX FUNCTIONS
    The re module offers a set of functions:

    FUNCTION   DESCRIPTION
    findall    Returns a list containing all matches of a pattern in the string
    search     Returns a Match object if there is a match anywhere in the string
    split      Returns a list where the string has been split at each match
    sub        Replaces one or more matches in a string (substitutes another string)
    match      Tries to match the pattern at the beginning of the string, with optional flags. Returns a Match object if it matches, otherwise None.
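The five functions in the table above can be compared side by side. This is a minimal sketch; the sample string and variable names are made up for illustration and do not come from the slides.

```python
import re

# Illustrative sample text (an assumption, not one of the course demo files)
text = "hello and welcome to python class"

# findall: list of every non-overlapping match
all_l = re.findall("l", text)

# search: Match object for the first match anywhere in the string, else None
found = re.search("welcome", text)

# match: succeeds only if the pattern matches at the start of the string
at_start = re.match("hello", text)
not_at_start = re.match("welcome", text)

# split: list of the pieces between matches
pieces = re.split(" ", text)

# sub: new string with every match replaced
replaced = re.sub("python", "regex", text)

print(all_l)
print(found.group(), at_start.group(), not_at_start)
print(pieces)
print(replaced)
```

Note that search() and match() return a Match object or None rather than True or False; both work in an if test because a Match object is truthy and None is falsy.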
  • 6. EXAMPLE PROGRAM
    • We open the file, loop through each line, and use the regular expression search() to print only the lines that contain the string "hello". (The same can be done using line.find().)

        # Search for lines that contain 'hello'
        import re
        fp = open('d:/18ec646/demo1.txt')
        for line in fp:
            line = line.rstrip()
            if re.search('hello', line):
                print(line)

    Output:
        hello and welcome to python class
        hello how are you?

        # Search for lines that contain 'hello'
        import re
        fp = open('d:/18ec646/demo2.txt')
        for line in fp:
            line = line.rstrip()
            if re.search('hello', line):
                print(line)

    Output:
        friends,hello and welcome
        hello,goodmorning
  • 7. EXAMPLE PROGRAM
    • To get the most out of regex, we use special characters called 'metacharacters'.

        # Search for lines that start with 'hello'
        import re
        fp = open('d:/18ec646/demo1.txt')
        for line in fp:
            line = line.rstrip()
            if re.search('^hello', line):   ## note the 'caret' metacharacter before hello
                print(line)

    Output:
        hello and welcome to python class
        hello how are you?

        # Search for lines that start with 'hello'
        import re
        fp = open('d:/18ec646/demo2.txt')
        for line in fp:
            line = line.rstrip()
            if re.search('^hello', line):   ## note the 'caret' metacharacter before hello
                print(line)

    Output:
        hello, goodmorning
  • 8. METACHARACTERS
    Metacharacters are characters that are interpreted in a special way by a RegEx engine.
    Metacharacters are very helpful for parsing/extraction from a given file or string.
    Metacharacters allow us to build more powerful regular expressions.
    Table-1 provides a summary of metacharacters and their meaning in RegEx.
    Here's a list of metacharacters: [ ] . ^ $ * + ? { } ( ) \ |
  • 9. Table-1: Metacharacters

    Metacharacter   Description                                                                 Example
    [ ]             Represents a set of characters.                                             "[a-z]"
    \               Signals a special sequence (can also be used to escape special characters). "\r"
    .               Any single character at that position (except the newline character).       "Ja...v."
    ^               The pattern must be at the beginning of the string ("startswith").          "^python"
    $               The pattern must be at the end of the string ("endswith").                  "world$"
    *               Zero or more occurrences of the preceding element.                          "hello*"
    +               One or more occurrences of the preceding element.                           "hello+"
    {}              The specified number of occurrences of the preceding element.               "hello{2}"
    |               Either this pattern or the other one.                                       "hello|hi"
    ()              Capture and group.
  • 10. [ ] - SQUARE BRACKETS
    • Square brackets specify a set of characters you wish to match.
    • A set is a group of characters given inside a pair of square brackets; the pattern matches any one character from the set.

    Set          Meaning
    [abc]        Returns a match if the string contains any of the specified characters (a, b or c).
    [a-n]        Returns a match if the string contains any lower-case character from a to n.
    [^arn]       Returns a match if the string contains any character other than a, r and n.
    [0123]       Returns a match if the string contains any of the specified digits.
    [0-9]        Returns a match if the string contains any digit from 0 to 9.
    [0-5][0-9]   Returns a match if the string contains any two-digit number from 00 to 59.
    [a-zA-Z]     Returns a match if the string contains any letter (lower-case or upper-case).
  • 11. CONTD..

        ### illustrating square brackets
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[w]", line):
                print(line)   ## display every line in which w is present

    Output:
        Hello and welcome
        @abhishek,how are you

        ### illustrating square brackets
        import re
        fh = open('d:/18ec646/demo3.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[ge]", line):
                print(line)   ### display lines that contain g or e or both

    Output:
        Hello and welcome
        This is Bangalore
  • 12. CONTD…

        ### illustrating square brackets
        import re
        fh = open('d:/18ec646/demo3.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[th]", line):
                print(line)

    Output:
        This is Bangalore
        This is Paris
        This is London

        import re
        fh = open('d:/18ec646/demo7.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[y]", line):
                print(line)

    Output:
        johny johny yes papa
        open your mouth

        ### illustrating square brackets
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[x-z]", line):
                print(line)

    Output:
        to:abhishek@yahoo.com
        @abhishek,how are you
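The square-bracket slides above test whole lines with search(). As a complementary sketch, findall() shows exactly which characters each set picks out. The sample string is an assumption chosen for illustration.

```python
import re

# Illustrative sample (not from the course demo files)
sample = "gate B7 at 09:35"

# [abc]: any one of the listed characters
listed = re.findall("[abc]", sample)

# [^a-z ]: any character that is NOT a lower-case letter or a space
negated = re.findall("[^a-z ]", sample)

# [0-5][0-9]: a two-character sequence, first digit 0-5, second digit 0-9
two_digits = re.findall("[0-5][0-9]", sample)

print(listed)      # ['a', 'a']
print(negated)     # ['B', '7', '0', '9', ':', '3', '5']
print(two_digits)  # ['09', '35']
```

The lone 7 in "B7" is not reported by the last pattern because [0-5][0-9] needs two digits in a row and the first must be between 0 and 5.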
  • 13. . - PERIOD (DOT)
    A period matches any single character (except the newline '\n').

    Expression                String   Matched?
    ..  (any two characters)  a        No match
                              ac       1 match
                              acd      1 match
                              acde     2 matches (contains 4 characters)

        ### illustrating dot metacharacter
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("y.", line):
                print(line)

    Output:
        to: abhishek@yahoo.com
        @abhishek,how are you
  • 14. CONTD..

        ### illustrating dot metacharacter
        import re
        fh = open('d:/18ec646/demo3.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("P.", line):
                print(line)

    Output:
        This is Paris

        ### illustrating dot metacharacter
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("T..s", line):   ## any two characters between T and s
                print(line)

    Output:
        This is London
        These are beautiful flowers
        Thus we see the great London bridge

        ### illustrating dot metacharacter
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("L..d", line):
                print(line)

    Output:
        This is London
        Thus we see the great London bridge
  • 15. ^ - CARET
    The caret symbol ^ is used to check whether a string starts with a certain character or pattern.

    Expression   String   Matched?
    ^a           a        1 match
                 abc      1 match
                 bac      No match
    ^ab          abc      1 match
                 acb      No match (starts with a but not followed by b)

        ### illustrating caret
        import re
        fh = open('d:/18ec646/demo2.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("^h", line):
                print(line)

    Output:
        hello, goodmorning

        ### illustrating caret
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("^f", line):
                print(line)

    Output:
        from:krishna.sksj@gmail.com
  • 16. $ - DOLLAR
    The dollar symbol $ is used to check whether a string ends with a certain character or pattern.

    Expression   String    Matched?
    a$           a         1 match
                 formula   1 match
                 cab       No match

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("m$", line):
                print(line)

    Output:
        from:krishna.sksj@gmail.com
        to: abhishek@yahoo.com

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo7.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("papa$", line):
                print(line)

    Output:
        johny johny yes papa
        eating sugar no papa
  • 17. * - STAR
    The star symbol * matches zero or more occurrences of the element immediately to its left (a single character, set or group).

    Expression   String   Matched?
    ma*n         mn       1 match
                 man      1 match
                 maaan    1 match
                 main     No match (a is not followed by n)

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("London*", line):   ## the * applies only to the final n
                print(line)

    Output:
        This is London
        Thus we see the great London bridge
  • 18. + - PLUS
    The plus symbol + matches one or more occurrences of the element immediately to its left.

    Expression   String   Matched?
    ma+n         mn       No match (no a character)
                 man      1 match
                 maaan    1 match
                 main     No match (a is not followed by n)

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("see+", line):
                print(line)

    Output:
        Thus we see the great London bridge

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("ar+", line):
                print(line)

    Output:
        These are beautiful flowers
  • 19. ? - QUESTION MARK
    The question mark symbol ? matches zero or one occurrence of the element immediately to its left.

    Expression   String   Matched?
    ma?n         mn       1 match
                 man      1 match
                 maaan    No match (more than one a character)

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("@gmail?", line):
                print(line)

    Output:
        from:krishna.sksj@gmail.com

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("you?", line):
                print(line)

    Output:
        @abhishek,how are you
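The match tables on the star, plus and question mark slides can be reproduced in one loop. This is a small sketch using the same patterns and strings as those tables; the variable names are illustrative.

```python
import re

patterns = ["ma*n", "ma+n", "ma?n"]
strings = ["mn", "man", "maaan", "main"]

# results[pattern][string] is True when re.search finds a match
results = {
    p: {s: re.search(p, s) is not None for s in strings}
    for p in patterns
}

for p in patterns:
    print(p, results[p])
```

"main" fails for all three patterns because the character after the a is i rather than n, and no quantifier on a can change that.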
  • 20. {} - BRACES
    Finds the specified number of occurrences of a pattern.
    Consider {n,m}. This means at least n, and at most m, repetitions of the element to its left.
    If a{2} is given, a must be repeated exactly twice.

    Expression   String        Matched?
    a{2,3}       abc dat       No match
                 abc daat      1 match (at daat)
                 aabc daaat    2 matches (at aabc and daaat)
                 aabc daaaat   2 matches (at aabc and daaaat)
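The braces slide has no program, so here is a short sketch that runs its table through findall(). The strings are the ones in the table; the variable names are illustrative.

```python
import re

strings = ["abc dat", "abc daat", "aabc daaat", "aabc daaaat"]

# a{2,3}: at least two and at most three consecutive 'a' characters
found = {s: re.findall("a{2,3}", s) for s in strings}
for s in strings:
    print(repr(s), found[s])

# a{2}: exactly two 'a' characters per match
exactly_two = re.findall("a{2}", "daaaat")
print(exactly_two)   # ['aa', 'aa']
```

In "daaaat" the pattern a{2,3} greedily takes three of the four a characters, and the single a left over is too short to match again, which is why the table reports two matches for the last string.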
  • 21. | - ALTERNATION
    The vertical bar | is used for alternation (the "or" operator).

    Expression   String   Matched?
    a|b          cde      No match
                 ade      1 match (at ade)
                 acdbea   3 matches (at acdbea)

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo7.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("yes|no", line):
                print(line)

    Output:
        johny johny yes papa
        eating sugar no papa

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo2.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("hello|how", line):
                print(line)

    Output:
        friends,hello and welcome
        hello,goodmorning
  • 22. () - GROUP
    Parentheses () are used to group sub-patterns. For example, (a|b|c)xz matches any string that contains a, b or c followed by xz.

    Expression   String      Matched?
    (a|b|c)xz    ab xz       No match
                 abxz        1 match (at abxz)
                 axz cabxz   2 matches (at axz and bxz)

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("(hello|how) are", line):
                print(line)

    Output:
        @abhishek,how are you

        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo2.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("(hello and)", line):
                print(line)

    Output:
        friends,hello and welcome
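Besides grouping, parentheses also capture the text they matched. The sketch below reuses the pattern and strings from the group slide above to show this; the non-capturing form (?:...) is standard re syntax that the slides do not cover.

```python
import re

# group(0) is the whole match, group(1) is what the first () captured
m = re.search("(hello|how) are", "@abhishek,how are you")
print(m.group(0))   # how are
print(m.group(1))   # how

# With one capturing group, findall returns only the captured part
captured = re.findall("(a|b|c)xz", "axz cabxz")
print(captured)     # ['a', 'b']

# (?:...) groups without capturing, so findall returns the whole match
whole = re.findall("(?:a|b|c)xz", "axz cabxz")
print(whole)        # ['axz', 'bxz']
```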
  • 23. \ - BACKSLASH
    The backslash is used to escape various characters, including all metacharacters.
    For example, \$a matches if a string contains $ followed by a. Here, $ is not interpreted by the RegEx engine in a special way.
    If you are unsure whether a character has a special meaning or not, you can put \ in front of it. This makes sure the character is not treated in a special way.
    NOTE: Another way of doing this is to put the special character inside square brackets [ ].
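A minimal sketch of the escaping rules described on the backslash slide. The sample string is made up for illustration; re.escape() is a standard re function that the slides do not mention.

```python
import re

# Illustrative sample (an assumption, not from the slides)
price = "cost: $a and $5"

# Escaped with a backslash: $ is treated as a literal dollar sign
escaped = re.findall(r"\$a", price)

# Same effect by placing the metacharacter inside square brackets
bracketed = re.findall("[$]a", price)

# Unescaped: $ means "end of string", so nothing can follow it
unescaped = re.findall("$a", price)

# re.escape() adds the backslashes automatically
auto_pattern = re.escape("$5")
auto = re.findall(auto_pattern, price)

print(escaped, bracketed, unescaped, auto)
```

Writing the pattern as a raw string (r"...") keeps Python itself from interpreting the backslash before the regex engine sees it.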
  • 24. SPECIAL SEQUENCES
    A special sequence is a \ followed by one of the characters listed in the table, and has a special meaning.
    Special sequences make commonly used patterns easier to write.
  • 25. SPECIAL SEQUENCES

    Sequence   Description                                                                                        Example
    \A         Matches if the specified characters are at the beginning of the string.                           "\AThe"
    \b         Matches if the specified characters are at the beginning or the end of a word.                    r"\bain"  r"ain\b"
    \B         Matches if the specified characters are present, but NOT at the beginning (or end) of a word.     r"\Bain"  r"ain\B"
    \d         Matches if the string contains a digit [0-9].                                                      "\d"
    \D         Matches if the string contains a character that is not a digit.                                    "\D"
    \s         Matches if the string contains a whitespace character.                                             "\s"
    \S         Matches if the string contains a non-whitespace character.                                         "\S"
    \w         Matches if the string contains a word character (A to Z, a to z, 0 to 9 and underscore).           "\w"
    \W         Matches if the string contains a non-word character.                                               "\W"
  • 26. \A - Matches if the specified characters are at the start of a string.

    Expression   String       Matched?
    \Athe        the sun      Match
                 In the sun   No match

    \b - Matches if the specified characters are at the beginning or end of a word.

    Expression   String          Matched?
    \bfoo        football        Match
                 a football      Match
                 afootball       No match
    foo\b        football        No match
                 the afoo test   Match
                 the afootest    No match
  • 27. \B - Opposite of \b. Matches if the specified characters are not at the beginning or end of a word.

    Expression   String          Matched?
    \Bfoo        football        No match
                 a football      No match
                 afootball       Match
    foo\B        the foo         No match
                 the afoo test   No match
                 the afootest    Match
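The \b and \B tables above can be checked with a few lines of code. This is a sketch using the same strings as the tables; the variable names are illustrative. It also shows why these patterns are written as raw strings.

```python
import re

strings = ["football", "a football", "afootball"]

# \bfoo: "foo" must start at a word boundary
at_boundary = {s: bool(re.search(r"\bfoo", s)) for s in strings}

# \Bfoo: "foo" must NOT start at a word boundary
inside_word = {s: bool(re.search(r"\Bfoo", s)) for s in strings}

print(at_boundary)
print(inside_word)

# Without the r prefix, Python turns "\b" into a backspace character,
# so the pattern no longer means "word boundary"
no_raw = re.search("\bfoo", "football")
print(no_raw)   # None
```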
  • 28. \d - Matches any decimal digit. Equivalent to [0-9].
    \D - Matches any character that is not a decimal digit. Equivalent to [^0-9].

    Expression   String     Matched?
    \d           12abc3     3 matches (at 12abc3)
                 Python     No match

    Expression   String     Matched?
    \D           1ab34"50   3 matches (at 1ab34"50)
                 1345       No match
  • 29. \s - Matches where a string contains any whitespace character. Equivalent to [ \t\n\r\f\v].
    \S - Matches where a string contains any non-whitespace character. Equivalent to [^ \t\n\r\f\v].

    Expression   String           Matched?
    \s           Python RegEx     1 match
                 PythonRegEx      No match

    Expression   String           Matched?
    \S           a b              2 matches (at a b)
                 (spaces only)    No match
  • 30. \w - Matches any alphanumeric character. Equivalent to [a-zA-Z0-9_]. The underscore is also considered an alphanumeric character.
    \W - Matches any non-alphanumeric character. Equivalent to [^a-zA-Z0-9_].

    Expression   String    Matched?
    \w           12&":;c   3 matches (at 12&":;c)
                 %"> !     No match

    Expression   String    Matched?
    \W           1a2%c     1 match (at 1a2%c)
                 Python    No match
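The match counts quoted in the \d, \D, \s, \S, \w and \W tables can be verified with findall(), which lists each matching character. This sketch uses the same strings as the tables; only the variable names are added.

```python
import re

digits = re.findall(r"\d", "12abc3")            # decimal digits
non_digits = re.findall(r"\D", '1ab34"50')      # everything except digits
spaces = re.findall(r"\s", "Python RegEx")      # whitespace characters
non_spaces = re.findall(r"\S", "a b")           # non-whitespace characters
word_chars = re.findall(r"\w", '12&":;c')       # letters, digits, underscore
non_word = re.findall(r"\W", "1a2%c")           # everything else

print(digits, non_digits)
print(spaces, non_spaces)
print(word_chars, non_word)
```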
  • 31. \Z - Matches if the specified characters are at the end of a string.

    Expression   String                      Matched?
    Python\Z     I like Python               1 match
                 I like Python Programming   No match
                 Python is fun.              No match

        # check whether the specified
        # characters are at the end of the string
        import re
        fp = open('d:/18ec646/demo5.txt')
        for x in fp:
            x = x.rstrip()
            if re.findall(r"com\Z", x):
                print(x)

    Output:
        from:krishna.sksj@gmail.com
        to: abhishek@yahoo.com
  • 32. REGEX FUNCTIONS
    The re module offers a set of functions:

    FUNCTION   DESCRIPTION
    findall    Returns a list containing all matches of a pattern in the string
    search     Returns a Match object if there is a match anywhere in the string
    split      Returns a list where the string has been split at each match
    sub        Replaces one or more matches in a string (substitutes another string)
    match      Tries to match the pattern at the beginning of the string, with optional flags. Returns a Match object if it matches, otherwise None.
  • 33. THE FINDALL() FUNCTION
    The findall() function returns a list containing all matches.
    The list contains the matches in the order they are found.
    If no matches are found, an empty list is returned.
    Here is the syntax for this function: re.findall(pattern, string, flags=0)

        import re
        text = "How are you. How is everything?"   # renamed from str to avoid shadowing the built-in
        matches = re.findall("How", text)
        print(matches)

    Output:
        ['How', 'How']
  • 35. CONTD..

        # check whether the string starts with How
        import re
        text = "How are you. How is everything?"   # renamed from str to avoid shadowing the built-in
        x = re.findall("^How", text)
        print(text)
        print(x)
        if x:
            print("string starts with 'How'")
        else:
            print("string does not start with 'How'")

    Output:
        How are you. How is everything?
        ['How']
        string starts with 'How'
  • 36. CONTD…

        # match all lines that start with 'hello'
        import re
        fp = open('d:/18ec646/demo1.txt')
        for x in fp:
            x = x.rstrip()
            if re.findall('^hello', x):   ## note the 'caret' metacharacter
                print(x)

    Output:
        hello and welcome to python class
        hello how are you?

        # match all lines that start with '@'
        import re
        fp = open('d:/18ec646/demo5.txt')
        for x in fp:
            x = x.rstrip()
            if re.findall('^@', x):   ## note the 'caret' metacharacter
                print(x)

    Output:
        @abhishek,how are you

        # check whether the string contains
        # non-digit characters
        import re
        fp = open('d:/18ec646/demo5.txt')
        for x in fp:
            x = x.rstrip()
            if re.findall(r"\D", x):   ## special sequence
                print(x)

    Output:
        from:krishna.sksj@gmail.com
        to:abhishek@yahoo.com
        Hello and welcome
        @abhishek,how are you
  • 37. THE SEARCH() FUNCTION
    The search() function searches the string for a match, and returns a Match object if there is a match.
    If there is more than one match, only the first occurrence is returned.
    If no matches are found, the value None is returned.
    Here is the syntax for this function: re.search(pattern, string, flags=0)
  • 38. EXAMPLES on search() function [the example code and its outputs were images on the slide and are not part of this transcript]
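Because the original examples for search() are not preserved in the transcript, the following is only an illustrative sketch of the behaviour described on the search() slide. The sample string is borrowed from the findall() slide and is not the slide's own search() example.

```python
import re

text = "How are you. How is everything?"

# "How" occurs twice, but search() reports only the first occurrence
first = re.search("How", text)
if first:
    print(first.group(), first.start())   # How 0

# No match anywhere in the string: search() returns None
missing = re.search("Python", text)
print(missing)                            # None
```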
  • 39. THE SPLIT() FUNCTION
    The re.split() method splits the string wherever there is a match and returns a list of the resulting strings.
    You can pass the maxsplit argument to re.split(). It is the maximum number of splits that will occur.
    If the pattern is not found, re.split() returns a list containing the original string.
    Here is the syntax for this function: re.split(pattern, string, maxsplit=0, flags=0)
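The slide examples for split() do not use maxsplit, so here is a small sketch of its effect. The strings are taken from the outputs shown for demo3.txt; the variable names are illustrative.

```python
import re

line = "This is Paris"

# Default maxsplit=0 means "split at every match"
all_parts = re.split("is", line)
print(all_parts)    # ['Th', ' ', ' Par', '']

# maxsplit=1 stops after the first split; the rest is kept intact
one_split = re.split("is", line, maxsplit=1)
print(one_split)    # ['Th', ' is Paris']

# Pattern not found: a one-element list holding the original string
no_match = re.split("@", "Hello and welcome")
print(no_match)     # ['Hello and welcome']
```

The empty string at the end of the first result appears because the line ends with a match ("Paris" ends in "is").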
  • 40. EXAMPLES on split() function

        # split function
        import re
        fp = open('d:/18ec646/demo5.txt')
        for x in fp:
            x = x.rstrip()
            x = re.split("@", x)
            print(x)

    Output:
        ['from:krishna.sksj', 'gmail.com']
        ['to: abhishek', 'yahoo.com']
        ['Hello and welcome']
        ['', 'abhishek,how are you']
  • 41. CONTD..

        # split function
        import re
        fp = open('d:/18ec646/demo7.txt')
        for x in fp:
            x = x.rstrip()
            x = re.split("e", x)
            print(x)

    Output:
        ['johny johny y', 's papa']
        ['', 'ating sugar no papa']
        ['t', 'lling li', 's']
        ['op', 'n your mouth']

        # split function
        import re
        fp = open('d:/18ec646/demo7.txt')
        for x in fp:
            x = x.rstrip()
            x = re.split("papa", x)
            print(x)

    Output:
        ['johny johny yes ', '']
        ['eating sugar no ', '']
        ['telling lies']
        ['open your mouth']

        # split function
        import re
        fp = open('d:/18ec646/demo3.txt')
        for x in fp:
            x = x.rstrip()
            x = re.split("is", x)
            print(x)

    Output:
        ['Hello and welcome']
        ['Th', ' ', ' Bangalore']
        ['Th', ' ', ' Par', '']
        ['Th', ' ', ' London']
  • 42. THE SUB() FUNCTION
    The sub() function replaces the matches with the text of your choice.
    You can control the number of replacements by specifying the count parameter.
    If the pattern is not found, re.sub() returns the original string.
    Here is the syntax for this function: re.sub(pattern, repl, string, count=0, flags=0)
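A short sketch of the count parameter and the not-found case described on the sub() slide. The string comes from the sub() example in the deck; re.subn() is a standard re function that the slides do not mention.

```python
import re

text = "How are you.How is everything?"

# count=1 replaces only the first match
once = re.sub("How", "Where", text, count=1)
print(once)        # Where are you.How is everything?

# Pattern not found: the original string comes back unchanged
unchanged = re.sub("Python", "Java", text)
print(unchanged == text)   # True

# subn() also reports how many replacements were made
new_text, n = re.subn("How", "Where", text)
print(new_text, n)
```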
  • 43. EXAMPLES on sub() function

        ### illustration of substitute (replace)
        import re
        text = "How are you.How is everything?"   # renamed from str to avoid shadowing the built-in
        x = re.sub("How", "where", text)
        print(x)

    Output:
        where are you.where is everything?

        # sub function
        import re
        fp = open('d:/18ec646/demo3.txt')
        for x in fp:
            x = x.rstrip()
            x = re.sub("This", "Where", x)
            print(x)

    Output:
        Hello and welcome
        Where is Bangalore
        Where is Paris
        Where is London
  • 44. THE MATCH() FUNCTION
    If zero or more characters at the beginning of the string match the regular expression, match() returns a corresponding Match object.
    It returns None if the string does not match the pattern.
    Here is the syntax for this function: re.match(pattern, string, flags=0)
    A compiled pattern offers the same operation as a method: Pattern.match(string[, pos[, endpos]])
    The optional pos and endpos parameters have the same meaning as for the compiled pattern's search() method.
  • 45. search() Vs match()
    Python offers two different primitive operations based on regular expressions:
    re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string.
    Eg:

        # match function
        import re
        fp = open('d:/18ec646/demo3.txt')
        for x in fp:
            x = x.rstrip()
            if re.match("This", x):
                print(x)

    Output:
        This is Bangalore
        This is Paris
        This is London
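The difference between the two functions is easiest to see on a string where the pattern is present but not at the start. This sketch uses a line from the demo2.txt outputs; the variable names are illustrative.

```python
import re

line = "friends,hello and welcome"

# match() looks only at the beginning of the string
by_match = re.match("hello", line)
print(by_match)            # None

# search() scans the whole string
by_search = re.search("hello", line)
print(by_search.span())    # (8, 13)

# A compiled pattern's match() accepts a start position (pos)
compiled = re.compile("hello")
from_pos = compiled.match(line, 8)
print(from_pos.span())     # (8, 13)
```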
  • 46. MATCH OBJECT
    A Match object is an object containing information about the search and the result.
    If there is no match, the value None is returned instead of a Match object.
    Some of the commonly used methods and attributes of Match objects are: match.group(), match.start(), match.end(), match.span(), match.string
  • 47. match.group()
    The group() method returns the part of the string where there is a match.
    match.start(), match.end()
    The start() method returns the index of the start of the matched substring. Similarly, end() returns the end index of the matched substring.
    match.string
    The string attribute returns the string that was passed to the function.
  • 48. match.span()
    The span() method returns a tuple containing the start and end index of the matched part.
    Eg: [example code shown as an image on the slide]
    OUTPUT: (12, 17)
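The code behind the span() output is not preserved in the transcript. As a hedged sketch, the string and pattern below are assumptions chosen because they happen to produce the same span; they are not necessarily what the slide used. The example exercises all the Match object members listed above.

```python
import re

# Assumed sample string and pattern (not confirmed by the slides)
text = "The rain in Spain"
m = re.search(r"\bS\w+", text)   # a word that starts with upper-case S

print(m.group())    # Spain
print(m.start())    # 12
print(m.end())      # 17
print(m.span())     # (12, 17)
print(m.string)     # The rain in Spain
```

The end index is exclusive, so text[m.start():m.end()] gives back the matched text.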