Looking for Patterns - Finding them with Regular Expressions
Presented by Keith Wright
One Course Source
keith@OneCourseSource.com
From http://xkcd.com/1171/
If this is how you think of regular expressions now…
Regular expressions…
REGULAR EXPRESSIONS ARE…
➢Strings used to search for patterns in text
➢More powerful than wildcards
➢Available in many programming languages and
programs
➢Also known as "regexp", "RegEx", and "RE"
RE DOS AND DON'TS…
Do this…
✔ Input Validation
✔ Data Extraction
✔ Data Elimination
✔ Search/Replace
Don't do this…
✗ Parsing
✗ Allowing publicly available searches
✗ Using RE where better tools exist
✗ Using RE where a procedure would be better
RE ARE AVAILABLE IN…AND MORE!
• .NET
• C#
• Delphi
• Java
• JavaScript
• Perl
• PCRE
• PHP
• Python
• Ruby
• Tcl
• PowerShell
POSIX PROGRAMS USING RE
awk: pattern scanning and processing language
find: utility to search for files
grep: utility to print lines matching a pattern
sed: stream editor for filtering and transforming text
POSIX PROGRAMS SUPPORT RE…
Basic Regular Expressions (BRE)
Character classes [ ]
Named Character classes
[[:digit:]]
Asterisk *
Dot .
Caret ^
Dollar $
Backslashed Braces \{ \}
Backslashed Parens \( \)
Extended Regular Expressions (ERE)
Question mark ?
Plus sign +
Pipe symbol |
Braces { }
Parentheses ( )
All other BRE
grep [options] 'pattern' [file…]
grep is a command-line tool for printing lines that match a pattern
Useful for demonstrating how regular expressions work
By default, grep interprets regular expressions as BRE
egrep, or grep -E, interprets regular expressions as ERE
• --color=auto highlights the part of the line that matched the pattern
• -i makes grep case-insensitive
• -c makes grep report a count of the lines that matched
• -v prints the lines that don't match the pattern
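The options above can be approximated in Python as a rough sketch. The helper name grep_lines and the sample lines are hypothetical, chosen only for illustration; they are not part of grep or of Python's standard library.

```python
import re

def grep_lines(pattern, lines, ignore_case=False, invert=False):
    """Hypothetical helper: return lines containing a match (or, with invert, those without one)."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [line for line in lines if bool(regex.search(line)) != invert]

# Made-up sample input, one entry per line of text.
lines = ["Error: disk full", "all good", "error: retrying", "done"]

matched = grep_lines("error", lines)                       # roughly: grep 'error'
matched_i = grep_lines("error", lines, ignore_case=True)   # roughly: grep -i 'error'
count_i = len(matched_i)                                   # roughly: grep -c -i 'error'
inverted = grep_lines("error", lines, ignore_case=True, invert=True)  # roughly: grep -v -i 'error'
```

Note that Python's re syntax is closer to ERE than to BRE, so patterns written for plain grep may need their backslashes adjusted.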
BASIC RE LITERALS
Alphanumeric characters and other characters with no special regular expression meaning match themselves
Characters with a special meaning, such as . * [ ^ $, match themselves when preceded by the backslash character
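As a small illustration of literal matching and backslash escaping, here is a sketch using Python's re module and the sample string xyz^abzzz. The variable names are only illustrative.

```python
import re

# A plain alphanumeric pattern matches itself.
plain = re.search(r"abc", "xyzabc")

# An unescaped caret is an anchor, so '^ab' only matches at the start of the text.
unescaped = re.search(r"^ab", "xyz^abzzz")

# A backslashed caret matches a literal '^' character.
escaped = re.search(r"\^ab", "xyz^abzzz")

# re.escape() adds the backslashes needed to treat a whole string literally.
auto = re.search(re.escape("^ab"), "xyz^abzzz")
```

Here unescaped is None because the text does not start with ab, while escaped and auto both find the literal ^ab.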
RE DOT (PERIOD)
The dot . will match any single
character
To match the dot itself, it must be
preceded by a backslash
The RE .* matches any sequence of characters, including none, so it can match an entire line
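A short Python sketch of the dot, with made-up sample text. One difference worth knowing: in Python the dot does not match a newline unless the re.DOTALL flag is given.

```python
import re

text = "cat cot c.t ct"

# The dot stands for any single character, so all three-letter forms match.
any_char = re.findall(r"c.t", text)

# A backslashed dot matches only a literal period.
literal_dot = re.findall(r"c\.t", text)

# .* matches any run of characters, including an empty one.
whole = re.match(r".*", "any text at all").group()
empty = re.match(r".*", "").group()
```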
RE CHARACTER CLASSES
Character classes match a single
character in the list or range enclosed
by brackets [ ]
If the first character enclosed is the caret ^, then the list or range is negated
To match the right square bracket ], it must be the first character enclosed. To not match it, it must come immediately after the caret, as in [^]]
To match a hyphen, it can be the first or last character enclosed. To not match it, it can come immediately after the caret or be the last character enclosed, as in [^-a] or [^a-]
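The bracket rules above can be tried out in Python, whose re module follows the same conventions for the caret, a leading right bracket, and a trailing hyphen. The sample strings are made up for illustration.

```python
import re

# A list of characters: any one of the vowels.
vowels = re.findall(r"[aeiou]", "regex")

# A range, and the same range negated with a leading caret.
digits = re.findall(r"[0-9]", "a1b22")
non_digits = re.findall(r"[^0-9]", "a1b22")

# A right square bracket placed first is taken literally.
brackets = re.findall(r"[]x]", "a]xb")

# Placed right after the caret, it is part of the negated list.
not_bracket = re.findall(r"[^]x]", "a]xb")

# A hyphen placed last is taken literally instead of forming a range.
hyphen = re.findall(r"[a-]", "a-b")
```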
RE NAMED CHARACTER CLASSES
Named character classes must
be enclosed in brackets like
[[:xdigit:]]
Many are available: [:alnum:],
[:alpha:], [:cntrl:], [:digit:],
[:graph:], [:lower:], [:print:],
[:punct:], [:space:], [:upper:],
and [:xdigit:]
RE CARET ANCHOR
The caret character ^ anchors the match: whatever follows it must appear at the beginning of the text
If used as the first character in square brackets, it negates the list or range of characters
If preceded by the backslash, the caret character loses its special meaning
RE DOLLAR SIGN ANCHOR
The dollar sign character $ anchors the match: whatever precedes it must appear at the end of the text
In BRE, if not at the end of the regular expression, the dollar sign loses its special meaning
When combined with the caret character ^, as in ^pattern$, the pattern must match the entire text
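A minimal Python sketch of both anchors, using made-up words. The variable names are illustrative only.

```python
import re

# ^ requires the match to start at the beginning of the text.
at_start = bool(re.search(r"^cat", "catalog"))
not_at_start = bool(re.search(r"^cat", "bobcat"))

# $ requires the match to finish at the end of the text.
at_end = bool(re.search(r"cat$", "bobcat"))
not_at_end = bool(re.search(r"cat$", "catalog"))

# Together they force the pattern to cover the entire text.
whole_yes = bool(re.search(r"^cat$", "cat"))
whole_no = bool(re.search(r"^cat$", "cats"))

# A backslashed caret is just a literal character.
literal_caret = bool(re.search(r"2\^8", "2^8"))
```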
RE REPETITION
Basic Regular Expressions
* preceding item repeated zero or more times or \{0,\}
\+ preceding item repeated one or more times or \{1,\} (GNU extension)
\? preceding item is optional or \{0,1\} (GNU extension)
\{n\} preceding item repeated exactly n times
\{n,\} preceding item repeated n or more times
\{,m\} preceding item matched at most m times (GNU extension)
\{n,m\} preceding item matched at least n times, but not more than m times
Extended Regular Expressions
* preceding item repeated zero or more
times or {0,}
+ preceding item repeated one or more
times or {1,}
? preceding item is optional or {0,1}
{n} preceding item repeated exactly n
times
{n,} preceding item repeated n or more
times
{,m} preceding item matched at most m
times
{n,m} preceding item matched at least n
times, but not more than m times
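The repetition operators can be compared side by side in Python, which uses the unbackslashed ERE-style forms. This is a sketch: the helper name matches is hypothetical, and it tests the whole string with re.fullmatch so each result depends only on the repetition count.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

star = [matches(r"ab*c", s) for s in ("ac", "abc", "abbbc")]           # zero or more b
plus = [matches(r"ab+c", s) for s in ("ac", "abc", "abbbc")]           # one or more b
optional = [matches(r"ab?c", s) for s in ("ac", "abc", "abbc")]        # zero or one b
exactly = [matches(r"ab{2}c", s) for s in ("abc", "abbc", "abbbc")]    # exactly two b
at_least = [matches(r"ab{2,}c", s) for s in ("abc", "abbc", "abbbc")]  # two or more b
at_most = [matches(r"ab{,2}c", s) for s in ("ac", "abbc", "abbbc")]    # at most two b
between = [matches(r"ab{1,2}c", s) for s in ("ac", "abc", "abbbc")]    # one or two b
```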
RE ASTERISK
The asterisk * will match zero or
more of the item that precedes it
The asterisk is equivalent to the BRE \{0,\} and the ERE {0,} expressions for zero or more
A single item followed by an asterisk will always match, because zero occurrences count as a match
To match a literal asterisk, precede it with a backslash
RE PLUS SIGN
In BRE, the backslashed plus sign \+ will match one or more of the item that precedes it
In ERE, the plus sign + will match one or more of the item that precedes it
The plus sign is equivalent to the BRE \{1,\} and the ERE {1,} expressions for one or more
In BRE, the plain plus sign matches itself. In ERE, to match a literal plus sign, precede it with a backslash
RE QUESTION MARK
In BRE, the backslashed question mark \? optionally matches the item that precedes it
In ERE, the question mark ? will optionally match the item that precedes it
The question mark is equivalent to the BRE \{0,1\} and the ERE {0,1} expressions for zero to one
In BRE, the plain question mark matches itself. In ERE, to match a literal question mark, precede it with a backslash
RE GROUPING
In BRE, the backslashed parentheses \( and \) are used to create groups of characters that may repeat as specified by repetition expressions
In ERE, the parentheses ( and ) are used to create groups of characters that may repeat as specified by repetition expressions
In BRE, plain parentheses match themselves, and in ERE they match themselves if backslashed
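A brief Python sketch of grouping, using the ERE-style unbackslashed parentheses that Python's re module shares. The sample strings are made up.

```python
import re

# With a group, the repetition applies to the whole sequence "ab".
grouped = re.fullmatch(r"(ab)+", "ababab") is not None

# Without the group, + applies only to the "b".
ungrouped = re.fullmatch(r"ab+", "ababab") is not None

# Backslashed parentheses match literal parentheses.
literal_parens = re.search(r"\(ab\)", "call (ab) now").group()
```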
RE ALTERNATION
In ERE, the pipe symbol | can
be used to perform alternation
Alternation allows for two or
more alternatives to match as
separated by the pipe symbol |
In BRE, the pipe symbol | will match itself, and in ERE it will match a literal pipe if backslashed
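Alternation behaves the same way in Python's re module as in ERE, so a short sketch with made-up text can show it. The variable names are illustrative only.

```python
import re

# Any one of the alternatives may match.
pets = re.findall(r"cat|dog|bird", "dog, cat and fish")

# Parentheses limit how far the alternation reaches.
grey = re.fullmatch(r"gr(a|e)y", "grey") is not None
gray = re.fullmatch(r"gr(a|e)y", "gray") is not None

# A backslashed pipe matches a literal pipe character.
literal_pipe = re.findall(r"\|", "a|b|c")
```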
PERL US POSTAL CODE EXAMPLE
^\d{5}((-|\s)?\d{4})?$
^ - Starts with
\d{5} - exactly five digits
()? - optional group (there are two)
-|\s - hyphen or whitespace
\d{4} - exactly four digits
$ - Ends with
To use the perl debugger
type:
perl -d -e1
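The same pattern can also be tried in Python, whose re module accepts the Perl-style \d and \s sequences. This is a sketch: the name ZIP_RE and the sample inputs are assumptions made for illustration.

```python
import re

# The postal code pattern from the slide, unchanged.
ZIP_RE = re.compile(r"^\d{5}((-|\s)?\d{4})?$")

# Made-up sample inputs: four that should match and two that should not.
samples = ["12345", "12345-6789", "12345 6789", "123456789", "1234", "12345-678"]

# Map each sample to whether the pattern accepts it.
results = {s: bool(ZIP_RE.search(s)) for s in samples}
```

Because the separator group is optional, nine digits in a row are accepted as well as the hyphenated and spaced forms.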
PERL CHARACTER SEQUENCES
\w Alphanumeric and _ (word characters)
\W Not word characters
\d Digit characters
\D Not digit characters
\s Whitespace characters
\S Not whitespace characters
\b Word boundaries
• grep supports the Perl character sequences in ERE except \d and \D
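Python's re module supports all of these sequences, including \d and \D, so they can be explored there. A sketch with made-up sample text:

```python
import re

text = "id_42 = 3.14;"

# \w+ collects runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", text)

# \d+ collects runs of digits.
digits = re.findall(r"\d+", text)

# \S+ collects runs of anything that is not whitespace.
non_space = re.findall(r"\S+", text)

# \b keeps "cat" from matching inside "concat".
whole_word = re.findall(r"\bcat\b", "cat concat cat.")
```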
PYTHON PROTOCOL EXAMPLE
(mailto:|(news|(ht|f)tp(s?))://){1}
(){1} - group repeats only once
mailto: - mailto followed by a
colon
| - separates alternatives
news|(ht|f)tp - news, http or ftp
(ht|f)tp(s?) - optional s added
:// - added to news, http, https,
ftp, or ftps
• To start the python shell type:
python
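Once in the Python shell, the protocol pattern can be exercised along these lines. This is a sketch: PROTOCOL_RE and protocol_prefix are hypothetical names, and ftp://example.org and gopher://example.org are placeholder addresses.

```python
import re

# The protocol pattern from the slide, unchanged.
PROTOCOL_RE = re.compile(r"(mailto:|(news|(ht|f)tp(s?))://){1}")

def protocol_prefix(url):
    """Hypothetical helper: return the protocol prefix at the start of url, or None."""
    m = PROTOCOL_RE.match(url)
    return m.group(0) if m else None

mail = protocol_prefix("mailto:keith@OneCourseSource.com")
web = protocol_prefix("https://xkcd.com/208/")
ftp = protocol_prefix("ftp://example.org")        # placeholder address
other = protocol_prefix("gopher://example.org")   # placeholder address, not in the pattern
```

re.match only looks at the start of the string, so the prefix has to come first.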
USE THE LIBRARY
RegExLib.com
The Regular Expression Library
Comes with a cheat sheet
A Regular Expression tester
Search thousands of rated expressions
You don't have to reinvent the wheel!
From http://xkcd.com/208/
About One Course Source
➢Online public classes (Linux, Programming & Security)
➢Custom corporate classes
➢Develop custom training programs
www.OneCourseSource.com

Editor's notes

  1. In ed or vi, g/re/p performs a global search for the regular expression and prints the matching lines, which is where grep gets its name
  2. Backslash example: echo 'xyz^abzzz' | grep '\^ab'
  3. # Source: http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-prime-numbers/
     # Source: Abigail -- perl -wle 'print "Prime" if (1 x shift) !~ /^1?$|^(11+?)\1+$/'
     sub is_prime {
         if ((1 x shift) !~ /^1?$|^(11+?)\1+$/) {
             return 1;
         } else {
             return 0;
         }
     }
  4. sub is_what {
         if ((1 x shift) !~ /^1?$|^(11+?)\1+$/) {
             return 1;
         } else {
             return 0;
         }
     }
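For readers who prefer Python, the same regular expression can be ported as a sketch. The name NOT_PRIME_RE is an assumption made for illustration. The number n is written as n ones: the first alternative matches zero or one, and the second matches any string of ones that splits into two or more equal groups of at least two, which means n is composite.

```python
import re

# Matches a run of ones whose length is 0, 1, or a composite number.
NOT_PRIME_RE = re.compile(r"^1?$|^(11+?)\1+$")

def is_prime(n):
    """Return True when n ones do NOT match the pattern above."""
    return NOT_PRIME_RE.search("1" * n) is None

# Check the first twenty non-negative integers.
primes = [n for n in range(20) if is_prime(n)]
```

This is a curiosity rather than a practical primality test, since the backtracking grows quickly with n.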