MODULE 3 – PART 4
REGULAR EXPRESSIONS
By,
Ravi Kumar B N
Assistant professor, Dept. of CSE
BMSIT & M
INTRODUCTION
➢ A regular expression is a sequence of characters that defines a search pattern.
➢ Such patterns are used by string-searching algorithms for "find" or "find and
replace" operations on strings, or for input validation.
➢ The regular expression library “re” must be imported into a program before
it can be used.
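The three uses named in the introduction (find, find and replace, input validation) can be sketched with re.search(), re.sub() and re.fullmatch() from the standard library. The sample text and the four-digit PIN rule are made up for illustration and are not part of the slides.

```python
import re

# Sample text, invented only for this illustration.
text = 'error at line 12, error at line 40'

# "Find": search() returns a match object for the first hit, or None.
found = re.search('[0-9]+', text)
first_number = found.group() if found else None
first_position = found.start() if found else -1

# "Find and replace": sub() replaces every occurrence of the pattern.
replaced = re.sub('error', 'warning', text)

# Input validation: fullmatch() succeeds only if the whole string fits.
# The "exactly four digits" rule is a hypothetical example.
is_valid_pin = re.fullmatch('[0-9][0-9][0-9][0-9]', '4821') is not None

print(first_number, first_position)
print(replaced)
print(is_valid_pin)
```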
SEARCH() FUNCTION:
➢ search() function: searches a string for a pattern. It returns a match object for the first occurrence that
matches the specified pattern, or None if nothing matches. This function is available in the “re” library.
➢ The caret character (^): is used in regular expressions to match the beginning of a line.
➢ The dollar character ($): is used in regular expressions to match the end of a line.
Example: program to match only lines where “From:” is at the beginning of the line
mbox1.txt
From:stephen Sat Jan 5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
Subject: [sakai] svn commit:
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
Return-Path: <postmaster@collab.sakaiproject.org>
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^From:', line):
        print(line)
#Output
From:stephen Sat Jan 5 09:14:16 2008
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
✓ The call re.search('^From:', line) is equivalent to line.startswith('From:'), which uses the
startswith() method of Python strings.
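As a small check of the equivalence with startswith(), and of the dollar anchor, the sketch below runs both tests on a few lines copied from mbox1.txt. The lines are held in a list (an assumption made so that the snippet runs without the file).

```python
import re

# Lines copied from mbox1.txt, kept in a list so no file is needed.
lines = [
    'From:stephen Sat Jan 5 09:14:16 2008',
    'Return-Path: <postmaster@collab.sakaiproject.org>',
    'Subject: [sakai] svn commit:',
]

# '^From:' and str.startswith('From:') select the same lines.
by_regex = [ln for ln in lines if re.search('^From:', ln)]
by_method = [ln for ln in lines if ln.startswith('From:')]

# '$' anchors the pattern at the end of the line instead.
ends_with_2008 = [ln for ln in lines if re.search('2008$', ln)]

print(by_regex == by_method)
print(ends_with_2008)
```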
CHARACTER MATCHING IN REGULAR EXPRESSIONS
➢ The dot character (.): The most commonly used special character is the period (“dot”) or full
stop, which matches any single character (except a newline, by default).
The regular expression “F..m:” would match any of the following strings, since each period
in the regular expression matches any one character.
“From:”, “Fxxm:”, “F12m:”, or “F!@m:”
➢ The program in the previous slide is rewritten using the dot character, which gives the same output.
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^F..m:', line):
        print(line)
#Output
From:stephen Sat Jan 5 09:14:16 2008
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
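A short sketch of the dot wildcard on its own: each “.” stands for exactly one character, so strings with fewer or more than two characters between “F” and “m:” do not match. The last two candidates are added here only as counter-examples.

```python
import re

# The first four strings come from the slide; the last two are counter-examples.
candidates = ['From:', 'Fxxm:', 'F12m:', 'F!@m:', 'Fm:', 'Frooom:']

# Keep only the strings where exactly two characters sit between 'F' and 'm:'.
matched = [c for c in candidates if re.search('^F..m:', c)]

print(matched)
```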
A character can be repeated any number of times using the “*” or “+” characters in a
regular expression.
➢ The asterisk character (*): matches zero or more repetitions of the preceding character
➢ The plus character (+): matches one or more repetitions of the preceding character
Example: Program to match lines that start with “From:”, followed by a mail-id
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^From:.+@', line):
        print(line)
#Output
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
✓ The search string “^From:.+@” will successfully match lines that start with “From:”, followed by one
or more characters (“.+”), followed by an at-sign. The “.+” wildcard matches all the characters
between the colon character and the at-sign.
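The difference between “*” and “+” shows up when nothing sits between “From:” and the at-sign. In this sketch the first line is invented for that purpose; the second is taken from mbox1.txt.

```python
import re

lines = [
    'From:@example.org',               # made-up line: nothing before the '@'
    'From: louis@media.berkeley.edu',  # from mbox1.txt
]

# '.+' needs at least one character between 'From:' and '@'.
plus_hits = [ln for ln in lines if re.search('^From:.+@', ln)]

# '.*' also accepts zero characters there.
star_hits = [ln for ln in lines if re.search('^From:.*@', ln)]

print(plus_hits)
print(star_hits)
```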
➢ Non-whitespace character (\S): matches one non-whitespace character.
➢ findall() function: searches for all occurrences that match a given pattern and returns them as a list.
In contrast, the search() function returns only the first occurrence that matches the specified pattern.
Example1: Program returns a list of all of the strings that look like email addresses from a given line.
import re
s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
# the r prefix (raw string) keeps the backslashes in the pattern intact
lst = re.findall(r'\S+@\S+', s)
print(lst)
#output
['csev@umich.edu', 'cwen@iupui.edu']
“\S+@\S+”: this regular expression matches substrings that have at least one
non-whitespace character, followed by an at-sign, followed by at least one more
non-whitespace character.
# The same program using search() displays only the first mail id (the first matching string)
import re
s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
lst = re.search(r'\S+@\S+', s)
print(lst)
#output
<re.Match object; span=(11, 25), match='csev@umich.edu'>
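Since search() returns a match object rather than a list, the matched text and its position have to be read from that object. The sketch below uses the group() and span() methods of the match object and shows the None result for a string with no address; the second test string is made up.

```python
import re

s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'

m = re.search(r'\S+@\S+', s)
first = m.group() if m else None   # the matched text
span = m.span() if m else None     # (start, end) positions within s

# search() returns None when the pattern is not found anywhere.
no_match = re.search(r'\S+@\S+', 'no address here')

print(first, span, no_match)
```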
Example2: Program returns a list of all of the strings that look like email addresses from a given file.
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'\S+@\S+', line)
    if len(x) > 0:
        print(x)
#Output
['<postmaster@collab.sakaiproject.org>']
['louis@media.berkeley.edu']
['From:zqian@umich.edu']
['<postmaster@collab.sakaiproject.org>']
The third match includes “From:” because that line of mbox1.txt has no space after the colon, and “\S+”
takes every non-whitespace character before the at-sign.
➢ Square brackets “[]”: square brackets indicate a set of acceptable characters, any one of which
we are willing to match.
Example: [a-z] matches a single lowercase letter
[A-Z] matches a single uppercase letter
[a-zA-Z] matches a single lowercase or uppercase letter
[a-zA-Z0-9] matches a single lowercase letter, uppercase letter, or digit
[amk] matches 'a', 'm', or 'k'
[(+*)] matches any of the literal characters '(', '+', '*', or ')'
[0-5][0-9] matches all the two-digit numbers from 00 to 59
➢ Characters that are not within a range can be matched by complementing the set.
If the first character of the set is '^', all the characters that are not in the set will be matched.
For example,
[^5] will match any character except '5'
Some of our email addresses have incorrect characters like “<” or “;” at the beginning or end. We are only
interested in the portion of the string that starts and ends with a letter or a number. To get the proper
output, we use character sets in the pattern.
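The sets listed above can be tried directly with findall(). The test strings in this sketch are invented for illustration.

```python
import re

# One character out of a small set.
amk = re.findall('[amk]', 'makes')

# Inside brackets, '(', '+', '*' and ')' are plain literal characters.
operators = re.findall('[(+*)]', '(a+m)*k')

# Two sets in a row: a digit 0-5 followed by a digit 0-9.
two_digit = re.findall('[0-5][0-9]', 'at 07:45 and 61')

# A leading '^' complements the set: everything except '5'.
not_five = re.findall('[^5]', '1525')

print(amk, operators, two_digit, not_five)
```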
Ex: Program returns a list of all email addresses in proper format.
import re
hand = open('mbox.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
    if len(x) > 0:
        print(x)
#output
['postmaster@collab.sakaiproject.org']
['louis@media.berkeley.edu']
['zqian@umich.edu']
['postmaster@collab.sakaiproject.org']
[a-zA-Z0-9]\S*@\S*[a-zA-Z]: matches substrings that start with a
single lowercase letter, uppercase letter, or number “[a-zA-Z0-9]”,
followed by zero or more non-blank characters “\S*”,
followed by an at-sign, followed by zero or more non-blank
characters “\S*”, followed by an uppercase or lowercase
letter “[a-zA-Z]”.
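To see what the stricter pattern changes, the sketch below applies both patterns to the same two lines. The lines are held in a list rather than read from a file, and the “;” after the second address is added here for illustration.

```python
import re

lines = [
    'Return-Path: <postmaster@collab.sakaiproject.org>',
    'From: louis@media.berkeley.edu; Mon Jan 4 16:10:39 2008',  # ';' added for illustration
]

# Loose pattern: keeps the surrounding '<', '>' and ';'.
loose = [re.findall(r'\S+@\S+', ln) for ln in lines]

# Stricter pattern: must start with a letter or digit and end with a letter.
strict = [re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', ln) for ln in lines]

print(loose)
print(strict)
```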
SEARCH AND EXTRACT
mbox2.txt
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/conten
impl/impl/src/java/org
X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
X-Content-Type-Message-Body: text/plain; charset=UTF-8
Content-Type: text/plain; charset=UTF-8
X-DSPAM-Result: Innocent
X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.9245
Example1: Find numbers on lines that start with the string “X-”,
lines such as: X-DSPAM-Confidence: 0.8475
import re
hand = open('mbox2.txt')
for line in hand:
    line = line.rstrip()
    if re.search(r'^X\S*: [0-9.]+', line):
        print(line)
#Output
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.9245
The output above contains the entire line; we only want to extract the
numbers from lines that have this syntax.
➢ Parentheses “()” in a regular expression: used to extract a portion of the substring that
matches the regular expression. The whole pattern is used for the search, and only the part
inside the parentheses is returned by findall().
import re
hand = open('mbox2.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^X\S*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)
#Output
['0.8475']
['0.9245']
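findall() returns the extracted numbers as strings. Assuming the goal is to compute with them, a minimal sketch converts each one with float() and averages the results. The lines are copied from mbox2.txt into a list so that the snippet runs without the file.

```python
import re

# A few lines copied from mbox2.txt.
lines = [
    'X-DSPAM-Result: Innocent',
    'X-DSPAM-Confidence: 0.8475',
    'X-DSPAM-Probability: 0.9245',
]

values = []
for line in lines:
    # The parentheses mark the part of the match that findall() returns.
    found = re.findall(r'^X\S*: ([0-9.]+)', line)
    if found:
        values.append(float(found[0]))  # convert the extracted text to a number

average = sum(values) / len(values)
print(values, average)
```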
Example2: Program to print the hour of the day at which each mail was received
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall('^From.* ([0-3][0-9]):', line)
    if len(x) > 0:
        print(x)
#Output
['09']
['16']
['16']
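The extracted two-digit values can also be tallied. The sketch below counts how many mails arrived in each hour, using collections.Counter from the standard library (an addition for illustration, not part of the slides) and the three “From” lines of mbox1.txt held in a list.

```python
import re
from collections import Counter

# The three "From" lines of mbox1.txt.
lines = [
    'From:stephen Sat Jan 5 09:14:16 2008',
    'From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008',
    'From:zqian@umich.edu Fri Jan 4 16:10:39 2008',
]

hours = Counter()
for line in lines:
    # The group captures the two digits that follow a space and precede ':'.
    found = re.findall('^From.* ([0-3][0-9]):', line)
    hours.update(found)

print(dict(hours))
```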
RANDOM EXECUTION
>>> import re
>>> s = " 0.9 .90 1.0 1. 138 pqr"
>>> re.findall('[0-9.]+', s)
['0.9', '.90', '1.0', '1.', '138']
>>> re.findall('[0-9]+[.][0-9]', s)
['0.9', '1.0']
>>> re.findall('[0-9]+[.][0-9]+', s)
['0.9', '1.0']
>>> re.findall('[0-9]*[.][0-9]+', s)
['0.9', '.90', '1.0']
>>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1byis112, 1byee190"
>>> re.findall('1bycs...', usn)
['1bycs123', '1bycs009']
>>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
['1bycs123', '1bycs009']
>>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1vecs112, 1svcs190"
>>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
['1bycs123', '1bycs009', '1vecs112', '1svcs190']
>>> re.findall('[0-9]+cs[0-9]+', usn)
[]
>>> re.findall('[a-zA-Z0-9]+cs([0-9]+)', usn)
['123', '009', '112', '190']
ESCAPE CHARACTER
➢ The escape character (backslash “\”) is a metacharacter in regular expressions. It allows special
characters to be used without invoking their special meaning.
If you want to match 1+1=2, the correct regex is 1\+1=2. Otherwise, the plus sign has a
special meaning.
For example, we can find money amounts with the following regular expression.
>>> import re
>>> x = 'We just received $10.00 for cookies.'
>>> y = re.findall(r'\$[0-9.]+', x)
>>> y
['$10.00']
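The effect of leaving out the backslash can be seen by running both versions of each pattern. The sketch also uses re.escape() from the standard library, which adds the needed backslashes to a literal string; the first test text is made up.

```python
import re

text = '1+1=2 and 11=2'  # made-up test string

# Without the backslash, '+' means "one or more of the preceding 1".
plain = re.findall('1+1=2', text)

# With the backslash, '+' is an ordinary plus sign.
escaped = re.findall(r'1\+1=2', text)

# re.escape() builds the escaped pattern from the literal text.
auto = re.findall(re.escape('1+1=2'), text)

# An unescaped '$' means "end of line", so no money amount is found.
x = 'We just received $10.00 for cookies.'
no_amount = re.findall('$[0-9.]+', x)
amount = re.findall(r'\$[0-9.]+', x)

print(plain, escaped, auto, no_amount, amount)
```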
SUMMARY
Character   Meaning
^           Matches the beginning of the line
$           Matches the end of the line
.           Matches any character (a wildcard)
\s          Matches a whitespace character
\S          Matches a non-whitespace character (opposite of \s)
*           Applies to the immediately preceding character and indicates to match zero or more of the preceding character(s)
*?          Applies to the immediately preceding character and indicates to match zero or more of the preceding character(s) in “non-greedy mode”
+           Applies to the immediately preceding character and indicates to match one or more of the preceding character(s)
+?          Applies to the immediately preceding character and indicates to match one or more of the preceding character(s) in “non-greedy mode”
[aeiou]     Matches a single character as long as that character is in the specified set. In this example, it would match “a”, “e”, “i”, “o”, or “u”, but no other characters.
[a-z0-9]    You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.
[^A-Za-z]   When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.
( )         When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall()
\b          Matches the empty string, but only at the start or end of a word
\B          Matches the empty string, but not at the start or end of a word
\d          Matches any decimal digit; for ASCII text, equivalent to the set [0-9]
\D          Matches any non-digit character; for ASCII text, equivalent to the set [^0-9]
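Several entries in the summary (non-greedy mode, \b and \d) are not exercised in the earlier examples. The sketch below tries each of them on short strings invented for illustration.

```python
import re

line = 'From: Using the : character'

# Greedy '.+' runs as far as it can, up to the last ':' on the line.
greedy = re.findall('^F.+:', line)

# Non-greedy '.+?' stops at the first ':' it can reach.
lazy = re.findall('^F.+?:', line)

# '\d+' picks out runs of decimal digits.
digits = re.findall(r'\d+', 'Sat Jan 5 09:14:16 2008')

# '\b' restricts the match to whole words: 'concat' is skipped.
whole_words = re.findall(r'\bcat\b', 'cat concat cat.')
anywhere = re.findall('cat', 'cat concat cat.')

print(greedy, lazy, digits, len(whole_words), len(anywhere))
```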
ASSIGNMENT
1) Write a Python program to check the validity of a password. In this program, we take a
password as a combination of alphanumeric characters along with special characters, and check whether
the password is valid with the help of a few conditions.
Primary conditions for password validation:
1. Minimum 8 characters.
2. The alphabets must be between [a-z].
3. At least one alphabet should be of upper case [A-Z].
4. At least 1 number or digit between [0-9].
5. At least 1 character from [ _ or @ or $ ].
2) Write a pattern for each of the following:
➢ Pattern to extract lines starting with the word From (or from) and ending with edu.
➢ Pattern to extract lines ending with any digit.
➢ Pattern to match text that starts with upper case letters and ends with digits.
➢ Search for the first white-space character in the string and display its position.
➢ Replace every white-space character with the number 9; consider a sample text txt = "The rain in Spain".
THANK
YOU

Python Regular Expressions

  • 1. MODULE 3 – PART 4 REGULAR EXPRESSIONS By, Ravi Kumar B N Assistant professor, Dept. of CSE BMSIT & M
  • 2. ➢ Regular expression is a sequence of characters that define a search pattern. ➢ patterns are used by string searching algorithms for "find" or "find and replace" operations on strings, or for input validation. ➢ The regular expression library “re” must be imported into our program before we can use it. INTRODUCTION
  • 3. ➢ search() function: used to search for a particular string. will only return the first occurrence that matches the specified pattern. This function is available in “re” library. ➢ the caret character (^) : is used in regular expressions to match the beginning of a line. ➢ The dollar character ($) : is used in regular expressions to match the end of a line. Example: program to match only lines where “From:” is at the beginning of the line import re hand = open('mbox1.txt') for line in hand: line = line.rstrip() if re.search('^From:', line) : print(line) #Output From:stephen Sat Jan 5 09:14:16 2008 From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008 From:zqian@umich.edu Fri Jan 4 16:10:39 2008 mbox1.txt From:stephen Sat Jan 5 09:14:16 2008 Return-Path: <postmaster@collab.sakaiproject.org> From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008 Subject: [sakai] svn commit: From:zqian@umich.edu Fri Jan 4 16:10:39 2008 Return-Path: <postmaster@collab.sakaiproject.org> ✓ The instruction re.search('^From:', line) equivalent with the startswith() method from the string library. SEARCH() FUNCTION:
  • 4. CHARACTER MATCHING IN REGULAR EXPRESSIONS
    ➢ The dot character (.): the most commonly used special character is the period (“dot”, or full stop), which matches any single character except a newline.
    The regular expression “F..m:” matches any of the following strings, since each period in the regular expression matches any character:
    “From:”, “Fxxm:”, “F12m:”, or “F!@m:”
    ➢ The program from the previous slide, rewritten using the dot character, gives the same output:
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            if re.search('^F..m:', line):
                print(line)
    #Output
        From:stephen Sat Jan 5 09:14:16 2008
        From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
        From:zqian@umich.edu Fri Jan 4 16:10:39 2008
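A small sketch of the dot wildcard on the strings listed above. The last two candidates are added for contrast and are not from the slides: one has too few characters between “F” and “m:”, the other contains newlines, which the dot does not match by default.

```python
import re

pattern = r"F..m:"
candidates = ["From:", "Fxxm:", "F12m:", "F!@m:", "Frm:", "F\n\nm:"]

# Each dot must consume exactly one character (any character but a newline).
results = {c: bool(re.search(pattern, c)) for c in candidates}
for text, matched in results.items():
    print(repr(text), matched)
```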
  • 5. A character can be repeated any number of times using the “*” or “+” characters in a regular expression.
    ➢ The asterisk character (*): matches zero or more repetitions of the preceding character
    ➢ The plus character (+): matches one or more repetitions of the preceding character
    Example: Program to match lines that start with “From:”, followed by a mail ID
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            if re.search('^From:.+@', line):
                print(line)
    #Output
        From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
        From:zqian@umich.edu Fri Jan 4 16:10:39 2008
    ✓ The search string “^From:.+@” successfully matches lines that start with “From:”, followed by one or more characters (“.+”), followed by an at-sign. The “.+” wildcard matches all the characters between the colon character and the at-sign.
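The difference between “*” and “+” shows up only when there is nothing to repeat. The sketch below reuses the three “From:” lines of the mbox1.txt excerpt and adds one made-up edge case (“From:@umich.edu”) in which no character separates the colon from the at-sign.

```python
import re

from_lines = [
    "From:stephen Sat Jan 5 09:14:16 2008",
    "From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008",
    "From:zqian@umich.edu Fri Jan 4 16:10:39 2008",
]

# The first line has no at-sign, so only two lines match.
with_plus = [ln for ln in from_lines if re.search(r"^From:.+@", ln)]
print(len(with_plus))                                   # 2

edge = "From:@umich.edu"     # made-up line: nothing between ':' and '@'
plus_hit = re.search(r"^From:.+@", edge) is not None    # + needs one or more
star_hit = re.search(r"^From:.*@", edge) is not None    # * accepts zero
print(plus_hit, star_hit)                               # False True
```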
  • 6. ➢ Non-whitespace character (\S): matches one non-whitespace character.
    ➢ findall() function: searches for all occurrences that match a given pattern and returns them as a list. In contrast, the search() function returns only the first occurrence that matches the specified pattern.
    Example1: Program returns a list of all of the strings that look like email addresses from a given line.
        import re
        s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
        lst = re.findall(r'\S+@\S+', s)
        print(lst)
    #Output
        ['csev@umich.edu', 'cwen@iupui.edu']
    The regular expression '\S+@\S+' matches substrings that have at least one non-whitespace character, followed by an at-sign, followed by at least one more non-whitespace character.
    The same program using search() displays only the first mail ID, i.e. the first matching string:
        import re
        s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
        m = re.search(r'\S+@\S+', s)
        print(m)
    #Output
        <re.Match object; span=(11, 25), match='csev@umich.edu'>
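The sketch below puts findall() and search() side by side on the same sample string, and also shows why the backslash in \S matters: without it, “S” is just the literal letter S. The variable names are illustrative.

```python
import re

s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'

# findall() returns every non-overlapping match as a list of strings.
all_hits = re.findall(r'\S+@\S+', s)
print(all_hits)                      # ['csev@umich.edu', 'cwen@iupui.edu']

# search() returns a match object for the first match only.
first = re.search(r'\S+@\S+', s)
print(first.group(), first.span())   # csev@umich.edu (11, 25)

# Without the backslash the pattern looks for the literal letter "S".
no_backslash = re.findall('S+@S+', s)
print(no_backslash)                  # []
```

“@2PM” is not reported because nothing but a space precedes its at-sign, and \S+ needs at least one non-whitespace character there.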
  • 7. Example2: Program returns a list of all of the strings that look like email addresses from a given file.
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'\S+@\S+', line)
            if len(x) > 0:
                print(x)
    #Output
        ['<postmaster@collab.sakaiproject.org>']
        ['louis@media.berkeley.edu']
        ['From:zqian@umich.edu']
        ['<postmaster@collab.sakaiproject.org>']
    (The line “From:zqian@umich.edu …” has no space after the colon, so \S+ also picks up the “From:” prefix.)
    Some of the matched strings have unwanted characters such as “<” or “;” at the beginning or end. We are only interested in the portion of the string that starts and ends with a letter or a number. To get the proper output we use a character set:
    ➢ Square brackets “[]”: square brackets indicate a set of acceptable characters, any one of which may match.
    Example:
        [a-z] matches a single lowercase letter
        [A-Z] matches a single uppercase letter
        [a-zA-Z] matches a single lowercase or uppercase letter
        [a-zA-Z0-9] matches a single lowercase letter, uppercase letter, or digit
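To make the character sets listed above concrete, here is a minimal sketch on a made-up string. Each set matches one character; the “+” added after some of the sets groups consecutive matching characters into runs.

```python
import re

text = "Room B12, floor 3"     # made-up sample string

lower_runs = re.findall(r"[a-z]+", text)        # runs of lowercase letters
uppers = re.findall(r"[A-Z]", text)             # single uppercase letters
digit_runs = re.findall(r"[0-9]+", text)        # runs of digits
alnum_runs = re.findall(r"[a-zA-Z0-9]+", text)  # runs of letters or digits

print(lower_runs)    # ['oom', 'floor']
print(uppers)        # ['R', 'B']
print(digit_runs)    # ['12', '3']
print(alnum_runs)    # ['Room', 'B12', 'floor', '3']
```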
  • 8. [amk] matches 'a', 'm', or 'k'
    [(+*)] matches any of the literal characters '(', '+', '*', or ')'
    [0-5][0-9] matches all the two-digit numbers from 00 to 59
    ➢ Characters that are not within a range can be matched by complementing the set: if the first character of the set is '^', all the characters that are not in the set will be matched. For example, [^5] will match any character except '5'.
    Ex: Program returns a list of all email addresses in proper format.
        import re
        hand = open('mbox.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
            if len(x) > 0:
                print(x)
    #Output
        ['postmaster@collab.sakaiproject.org']
        ['louis@media.berkeley.edu']
        ['zqian@umich.edu']
        ['postmaster@collab.sakaiproject.org']
    [a-zA-Z0-9]\S*@\S*[a-zA-Z]: matches substrings that start with a single lowercase letter, uppercase letter, or number “[a-zA-Z0-9]”, followed by zero or more non-blank characters “\S*”, followed by an at-sign, followed by zero or more non-blank characters “\S*”, followed by an uppercase or lowercase letter “[a-zA-Z]”.
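A short sketch of the sets described above, applied to small made-up strings, followed by the email pattern applied to one Return-Path line from the mbox1.txt excerpt. The variable names are illustrative.

```python
import re

set_hits = re.findall(r"[amk]", "makes")               # ['m', 'a', 'k']
literal_hits = re.findall(r"[(+*)]", "(a+b)*c")        # ['(', '+', ')', '*']
two_digit = re.findall(r"[0-5][0-9]", "09:14:16")      # ['09', '14', '16']
out_of_range = re.findall(r"[0-5][0-9]", "75")         # [] (7 is not in 0-5)
not_five = re.findall(r"[^5]", "1525")                 # ['1', '2']

# The set at each end trims the angle brackets around the address.
line = "Return-Path: <postmaster@collab.sakaiproject.org>"
trimmed = re.findall(r"[a-zA-Z0-9]\S*@\S*[a-zA-Z]", line)
print(trimmed)       # ['postmaster@collab.sakaiproject.org']
```

Inside square brackets, characters such as “(”, “+” and “*” lose their special meaning, which is why they can be listed without escaping.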
  • 9. SEARCH AND EXTRACT
    Example1: Find numbers on lines that start with the string “X-”, such as: X-DSPAM-Confidence: 0.8475
    Search:
        import re
        hand = open('mbox2.txt')
        for line in hand:
            line = line.rstrip()
            if re.search(r'^X\S*: [0-9.]+', line):
                print(line)
    #Output
        X-DSPAM-Confidence: 0.8475
        X-DSPAM-Probability: 0.9245
    The output above contains the entire line; we only want to extract the numbers from lines that have this syntax.
    ➢ Parentheses “()” in a regular expression: used to extract a portion of the substring that matches the regular expression.
    Extract:
        import re
        hand = open('mbox2.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'^X\S*: ([0-9.]+)', line)
            if len(x) > 0:
                print(x)
    #Output
        ['0.8475']
        ['0.9245']
    mbox2.txt
        From: stephen.marquard@uct.ac.za
        Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/conten impl/impl/src/java/org
        X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
        X-Content-Type-Message-Body: text/plain; charset=UTF-8
        Content-Type: text/plain; charset=UTF-8
        X-DSPAM-Result: Innocent
        X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
        X-DSPAM-Confidence: 0.8475
        X-DSPAM-Probability: 0.9245
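Once the parentheses isolate the number, it can be converted and used as data. The sketch below works on four of the X- lines from the mbox2.txt excerpt held in a list; the second pattern, with two groups, is an added illustration showing that findall() then returns tuples.

```python
import re

lines = [
    "X-DSPAM-Result: Innocent",
    "X-DSPAM-Processed: Sat Jan 5 09:14:16 2008",
    "X-DSPAM-Confidence: 0.8475",
    "X-DSPAM-Probability: 0.9245",
]

# One group: findall() returns only the text captured by the parentheses.
values = []
for line in lines:
    for number in re.findall(r'^X\S*: ([0-9.]+)', line):
        values.append(float(number))
print(values)        # [0.8475, 0.9245]

# Two groups: each match becomes a (header, number) tuple.
pairs = {}
for line in lines:
    for header, number in re.findall(r'^(X\S*): ([0-9.]+)', line):
        pairs[header] = float(number)
print(pairs)
```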
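  • 10. Example2: Program to print the hour at which each mail was received
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall('^From.* ([0-3][0-9]):', line)
            if len(x) > 0:
                print(x)
    #Output
        ['09']
        ['16']
        ['16']
    The group captures the two digits that follow a space and precede a colon, i.e. the HH field of the HH:MM:SS timestamp.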
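On the sample lines, the two digits captured by this pattern are the HH field of the HH:MM:SS timestamp. As a possible follow-up, the sketch below tallies the captured values in a dictionary; the three “From:” lines are taken from the mbox1.txt excerpt, and the counting step is an addition, not part of the slides.

```python
import re

from_lines = [
    "From:stephen Sat Jan 5 09:14:16 2008",
    "From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008",
    "From:zqian@umich.edu Fri Jan 4 16:10:39 2008",
]

counts = {}
for line in from_lines:
    # The group is the two digits between a space and the first colon
    # of the timestamp.
    for hour in re.findall(r'^From.* ([0-3][0-9]):', line):
        counts[hour] = counts.get(hour, 0) + 1

print(counts)        # {'09': 1, '16': 2}
```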
  • 11. RANDOM EXECUTION
        >>> import re
        >>> s = " 0.9 .90 1.0 1. 138 pqr"
        >>> re.findall('[0-9.]+', s)
        ['0.9', '.90', '1.0', '1.', '138']
        >>> re.findall('[0-9]+[.][0-9]', s)
        ['0.9', '1.0']
        >>> re.findall('[0-9]+[.][0-9]+', s)
        ['0.9', '1.0']
        >>> re.findall('[0-9]*[.][0-9]+', s)
        ['0.9', '.90', '1.0']
        >>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1byis112, 1byee190"
        >>> re.findall('1bycs...', usn)
        ['1bycs123', '1bycs009']
        >>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
        ['1bycs123', '1bycs009']
        >>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1vecs112, 1svcs190"
        >>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
        ['1bycs123', '1bycs009', '1vecs112', '1svcs190']
        >>> re.findall('[0-9]+cs[0-9]+', usn)
        []
        >>> re.findall('[a-zA-Z0-9]+cs([0-9]+)', usn)
        ['123', '009', '112', '190']
  • 12. ESCAPE CHARACTER
    ➢ The escape character (backslash “\”) is a metacharacter in regular expressions. It allows special characters to be used without invoking their special meaning.
    To match the text 1+1=2, the correct regex is 1\+1=2. Otherwise, the plus sign has its special meaning.
    For example, we can find money amounts with the following regular expression:
        >>> import re
        >>> x = 'We just received $10.00 for cookies.'
        >>> y = re.findall(r'\$[0-9.]+', x)
        >>> y
        ['$10.00']
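The effect of the backslash can be seen by trying the pattern with and without it. This sketch also uses re.escape(), a standard-library helper that adds the backslashes automatically; it is not mentioned on the slides and is shown only as an option.

```python
import re

# Escaped plus: matches the literal text 1+1=2.
escaped_hit = re.search(r"1\+1=2", "1+1=2") is not None

# Unescaped plus: "1+" means "one or more 1s", so the literal text fails
# while a string such as 11=2 matches.
unescaped_hit = re.search(r"1+1=2", "1+1=2") is not None
unescaped_other = re.search(r"1+1=2", "11=2") is not None
print(escaped_hit, unescaped_hit, unescaped_other)     # True False True

# re.escape() builds a pattern that matches its argument literally.
auto_hit = re.search(re.escape("1+1=2"), "1+1=2") is not None

# Escaped dollar sign: a literal "$" instead of "end of line".
amounts = re.findall(r"\$[0-9.]+", "We just received $10.00 for cookies.")
print(amounts)                                         # ['$10.00']
```

Writing patterns as raw strings (the r prefix) keeps Python itself from interpreting the backslash before the “re” module sees it.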
  • 13. SUMMARY
    Character   Meaning
    ^           Matches the beginning of the line
    $           Matches the end of the line
    .           Matches any character (a wildcard)
    \s          Matches a whitespace character
    \S          Matches a non-whitespace character (opposite of \s)
    *           Applies to the immediately preceding character(s) and indicates to match zero or more times
    *?          Applies to the immediately preceding character(s) and indicates to match zero or more times in “non-greedy mode”
    +           Applies to the immediately preceding character(s) and indicates to match one or more times
    +?          Applies to the immediately preceding character(s) and indicates to match one or more times in “non-greedy mode”
    [aeiou]     Matches a single character as long as that character is in the specified set. In this example, it would match “a”, “e”, “i”, “o”, or “u”, but no other characters.
    [a-z0-9]    You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.
  • 14. Character   Meaning
    [^A-Za-z]   When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.
    ( )         When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall()
    \b          Matches the empty string, but only at the start or end of a word
    \B          Matches the empty string, but not at the start or end of a word
    \d          Matches any decimal digit; equivalent to the set [0-9]
    \D          Matches any non-digit character; equivalent to the set [^0-9]
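Several entries in the summary table are not exercised by the earlier examples: non-greedy repetition, \b, \B, \d, \D and \s. The sketch below tries each one on small made-up strings; the variable names are illustrative.

```python
import re

# Greedy vs non-greedy: .+ grabs as much as possible, .+? as little.
greedy = re.findall(r"<.+>", "<a><b>")            # ['<a><b>']
non_greedy = re.findall(r"<.+?>", "<a><b>")       # ['<a>', '<b>']

txt = "The rain in Spain"
whole_word = re.findall(r"\bin\b", txt)           # only the word "in"
word_ending = re.findall(r"\Bin\b", txt)          # "in" ending rain, Spain

stamp = "Sat Jan 5 09:14:16 2008"
digit_runs = re.findall(r"\d+", stamp)            # runs of digits
digits_only = re.sub(r"\D", "", "09:14:16")       # drop every non-digit
fields = re.split(r"\s+", "a  b\tc")              # split on whitespace runs

print(greedy, non_greedy)
print(whole_word, word_ending)
print(digit_runs, digits_only, fields)
```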
  • 15. ASSIGNMENT
    1) Write a Python program to check the validity of a password.
    In this program, a password is a combination of alphanumeric characters along with special characters; check whether the password is valid with the help of a few conditions.
    Primary conditions for password validation:
        1. Minimum 8 characters.
        2. The alphabets must be between [a-z].
        3. At least one alphabet should be upper case [A-Z].
        4. At least 1 number or digit between [0-9].
        5. At least 1 character from [ _ or @ or $ ].
    2) Write a pattern for each of the following:
        ➢ Pattern to extract lines starting with the word From (or from) and ending with edu.
        ➢ Pattern to extract lines ending with any digit.
        ➢ Start with upper case letters and end with digits.
        ➢ Search for the first white-space character in the string and display its position.
        ➢ Replace every white-space character with the number 9.
    Consider the sample text: txt = "The rain in Spain"