Regular Expressions in Oracle

     Logan Palanisamy
Agenda

 Introduction to regular expressions
 REGEXP_* functions in Oracle
 Coffee Break
 Examples
 More examples
Meeting Basics

 Put your phones/pagers on vibrate/mute
 Messenger: Change the status to offline or
  in-meeting
 Remote attendees: Mute yourself (*6). Ask
  questions via Adobe Connect.
What are Regular Expressions?

 A way to express patterns
   credit cards, license plate numbers, vehicle identification
    numbers, voter id, driving license
 UNIX (grep, egrep), PHP, JAVA support Regular
  Expressions
 PERL made it popular
String operations before Regular Expression
               support in Oracle

 Pull the data from DB and perform it in middle tier
  or FE
 OWA_PATTERN in 9i and before
 LIKE operator
LIKE operator

 % matches zero or more of any character
 _ matches exactly one character
 Examples
    WHERE col1 LIKE 'abc%';
    WHERE col1 LIKE '%abc';
    WHERE col1 LIKE 'ab_d';
    WHERE col1 LIKE 'ab\_d' ESCAPE '\';
    WHERE col1 NOT LIKE 'abc%';
 Very limited functionality
    Check whether first character is numeric: where c1 like '0%' OR c1
     like '1%' OR .. .. c1 like '9%'
    Very trivial with Regular Exp: where regexp_like(c1, '^[0-9]')
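
A minimal sketch of the two points above, assuming made-up inline rows (the table t and column c1 below are illustrative only): the first query applies the regular-expression test for a leading digit, the second shows LIKE with an escape character so that the underscore is matched literally.

```sql
-- Hypothetical sample rows; c1 stands in for any character column.
WITH t AS (
  SELECT '9 lives' AS c1 FROM dual UNION ALL
  SELECT 'abc123'        FROM dual UNION ALL
  SELECT '42'            FROM dual
)
SELECT c1
FROM   t
WHERE  REGEXP_LIKE(c1, '^[0-9]');   -- keeps '9 lives' and '42'

-- LIKE with an escape character: \_ means a literal underscore,
-- so 'ab_d' matches but 'abXd' would not.
SELECT 'matched' AS result
FROM   dual
WHERE  'ab_d' LIKE 'ab\_d' ESCAPE '\';
```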
Regular Expressions

Meta        Meaning
character
.           Matches any single character except newline.
*           Matches zero or more occurrences of the preceding character,
            e.g.: bugs*, table.*
^           Denotes the beginning of the line. ^A matches lines starting
            with A.
$           Denotes the end of the line. :$ matches lines ending with :
\           Escape character (\., \*, \[, \\, etc.)
[]          Matches any one of the characters within the brackets, e.g.
            [aeiou], [a-z], [a-zA-Z], [0-9], [[:alpha:]], [a-z?,!]
[^]         Negation: matches any one character other than the ones
            inside the brackets, e.g. ^[^13579] matches lines that do not
            start with an odd digit, and [^02468]$ matches lines that do
            not end with an even digit.
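
The metacharacters in this table can be tried out directly against literals. The following is only a sketch with made-up sample strings; the 'c' match parameter is passed so that the result does not depend on the session's NLS_SORT setting.

```sql
-- Hypothetical sample strings, each tested against a few of the patterns above.
WITH t AS (
  SELECT 'Apple:'  AS s FROM dual UNION ALL
  SELECT '3 pears'      FROM dual UNION ALL
  SELECT 'table_x'      FROM dual
)
SELECT s,
       CASE WHEN REGEXP_LIKE(s, '^A', 'c')        THEN 'Y' ELSE 'N' END AS starts_with_a,
       CASE WHEN REGEXP_LIKE(s, ':$', 'c')        THEN 'Y' ELSE 'N' END AS ends_with_colon,
       CASE WHEN REGEXP_LIKE(s, '^[^13579]', 'c') THEN 'Y' ELSE 'N' END AS no_odd_digit_first,
       CASE WHEN REGEXP_LIKE(s, 'table.*', 'c')   THEN 'Y' ELSE 'N' END AS contains_table
FROM   t;
```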
Extended Regular Expressions

Meta character   Meaning
|                alternation. e.g.: ho(use|me), the(y|m),
                 (they|them)
+                one or more occurrences of previous character.
?                zero or one occurrences of previous character.
{n}              exactly n repetitions of the previous char or group
{n,}             n or more repetitions of the previous char or
                 group
{,m}             zero to m repetitions of the previous char or
                 group
{n,m}            n to m repetitions of the previous char or group
(....)           grouping or subexpression
\n               back reference, where n is the number of the
                 sub-expression, e.g.: \1 refers to the first
                 sub-expression.
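
A short sketch of alternation, interval quantifiers, and back references, using made-up literals. The doubled-word pattern is deliberately naive: it would also match a repeated word ending inside a longer word (for example the "is is" in "This is").

```sql
SELECT
  -- alternation: ho(use|me) matches 'house' or 'home'
  REGEXP_SUBSTR('my home town', 'ho(use|me)')                          AS alt_match,     -- 'home'
  -- interval quantifiers: four digits, a dash, one or two digits
  CASE WHEN REGEXP_LIKE('2009-12', '^[0-9]{4}-[0-9]{1,2}$')
       THEN 'Y' ELSE 'N' END                                           AS year_month,    -- 'Y'
  -- back reference: \1 must repeat whatever group 1 matched
  REGEXP_SUBSTR('Paris in the the spring', '([[:alpha:]]+) \1')        AS doubled_word,  -- 'the the'
  REGEXP_REPLACE('Paris in the the spring', '([[:alpha:]]+) \1', '\1') AS collapsed      -- 'Paris in the spring'
FROM dual;
```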
POSIX Character Classes

POSIX         Description
[:alnum:]     Alphanumeric characters
[:alpha:]     Alphabetic characters
[:ascii:]     ASCII characters
[:blank:]     Space and tab
[:cntrl:]     Control characters
[:digit:]     Digits
[:xdigit:]    Hexadecimal digits
[:graph:]     Visible characters (i.e. anything except spaces, control characters,
              etc.)
[:lower:]     Lowercase letters

[:print:]     Visible characters and spaces (i.e. anything except control
              characters)
[:punct:]     Punctuation and symbols.
[:space:]     All whitespace characters, including line breaks
[:upper:]     Uppercase letters
[:word:]      Word characters (letters, numbers and underscores)
Perl Character Classes

Perl   POSIX           Description
\d     [[:digit:]]     [0-9]
\D     [^[:digit:]]    [^0-9]
\w     [[:alnum:]_]    [0-9a-zA-Z_]
\W     [^[:alnum:]_]   [^0-9a-zA-Z_]
\s     [[:space:]]     whitespace characters
\S     [^[:space:]]    non-whitespace characters
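
As a rough illustration of the equivalence in the table, the two calls below replace each run of digits in a made-up literal. This assumes a release that accepts the Perl-style shorthand (10g Release 2 or later, as far as I know); on earlier releases only the POSIX form applies.

```sql
SELECT REGEXP_REPLACE('Order 66, aisle 7', '\d+', '#')          AS perl_style,   -- 'Order #, aisle #'
       REGEXP_REPLACE('Order 66, aisle 7', '[[:digit:]]+', '#') AS posix_style   -- 'Order #, aisle #'
FROM   dual;
```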




Tools to learn Regular Expressions

 http://www.weitz.de/regex-coach/
 http://www.regexbuddy.com/
REGEXP_* functions

 Available from 10g onwards.
 Powerful and flexible, but CPU-hungry.
 Easy and elegant, but sometimes less performant
 Usable on text literal, bind variable, or any column
  that holds character data such as CHAR, NCHAR,
  CLOB, NCLOB, NVARCHAR2, and VARCHAR2
  (but not LONG).
 Useful as column constraint for data validation
REGEXP_LIKE

 Determines whether pattern matches.
 REGEXP_LIKE(source_str, pattern
  [, match_parameter])
 Returns TRUE or FALSE.
 Use in WHERE clause to return rows matching a pattern
 Use as a constraint
    alter table t add constraint alphanum check (regexp_like (x,
     '^[[:alnum:]]+$'));
 Use in PL/SQL to return a boolean.
    IF (REGEXP_LIKE(v_name, '[[:alnum:]]')) THEN ..
 Can't be used directly in the SELECT list (it is a condition, not a
  function that returns a SQL value); wrap it in CASE if needed
 regexp_like.sql
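
The script named above is not reproduced here. The following is only a sketch of the uses listed on this slide, with a made-up variable value. Note that an unanchored '[[:alnum:]]' is satisfied by any string containing at least one alphanumeric character, while '^[[:alnum:]]+$' requires every character to be alphanumeric.

```sql
-- In SQL, REGEXP_LIKE is a condition, so it is wrapped in CASE to show a value.
SELECT CASE WHEN REGEXP_LIKE('abc123', '^[[:alnum:]]+$') THEN 'Y' ELSE 'N' END AS only_alnum,     -- 'Y'
       CASE WHEN REGEXP_LIKE('abc 123', '^[[:alnum:]]+$') THEN 'Y' ELSE 'N' END AS with_space,    -- 'N'
       CASE WHEN REGEXP_LIKE('abc 123', '[[:alnum:]]')    THEN 'Y' ELSE 'N' END AS unanchored     -- 'Y'
FROM   dual;

-- In PL/SQL it can be used as a boolean expression.
-- (In SQL*Plus, run SET SERVEROUTPUT ON first to see the message.)
DECLARE
  v_name VARCHAR2(30) := 'abc123';   -- hypothetical value
BEGIN
  IF REGEXP_LIKE(v_name, '^[[:alnum:]]+$') THEN
    DBMS_OUTPUT.PUT_LINE('v_name is alphanumeric');
  END IF;
END;
/
```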
REGEXP_SUBSTR

 Extracts the matching pattern. Returns NULL when
    nothing matches
   REGEXP_SUBSTR(source_str, pattern [, position
    [, occurrence [, match_parameter]]])
   position: character at which to begin the search.
    Default is 1
   occurrence: The occurrence of pattern you want to
    extract
   regexp_substr.sql
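
A small sketch of the position and occurrence arguments, using made-up literals (this is not the script named above):

```sql
SELECT REGEXP_SUBSTR('one,two,three', '[^,]+')       AS first_item,    -- 'one'
       REGEXP_SUBSTR('one,two,three', '[^,]+', 1, 2) AS second_item,   -- 'two'
       REGEXP_SUBSTR('one,two,three', '[^,]+', 5, 2) AS from_pos_5,    -- 'three' (search starts at 't' of 'two')
       REGEXP_SUBSTR('no digits here', '[0-9]+')     AS no_match       -- NULL
FROM   dual;
```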
REGEXP_INSTR

 Returns the location of match in a string
 REGEXP_INSTR(source_str, pattern [, position [,
  occurrence [, return_option [, match_parameter]]]])
 return_option:
    0, the default, returns the position of the first character.
    1 returns the position of the character following the occurrence.
 regexp_instr.sql
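
To make the two return_option values concrete, here is a sketch on a made-up literal in which the digits occupy positions 4 to 6:

```sql
SELECT REGEXP_INSTR('abc123def', '[0-9]+', 1, 1, 0) AS match_start,   -- 4
       REGEXP_INSTR('abc123def', '[0-9]+', 1, 1, 1) AS after_match,   -- 7
       REGEXP_INSTR('abc123def', '[xyz]')           AS no_match       -- 0
FROM   dual;
```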
REGEXP_REPLACE

 Search and Replace a pattern
 REGEXP_REPLACE(source_str, pattern
  [, replace_str [, position [, occurrence
  [, match_parameter]]]])
 If replace_str is not specified, every match of pattern is
  removed (replaced with an empty string)
 occurrence:
    when 0, the default, replaces all occurrences of the match.
    when n, any positive integer, replaces the nth occurrence.
 regexp_replace.sql
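
The effect of omitting replace_str and of the occurrence argument can be sketched on a made-up literal:

```sql
SELECT REGEXP_REPLACE('a1b22c333', '[0-9]+')            AS digits_removed,   -- 'abc'
       REGEXP_REPLACE('a1b22c333', '[0-9]+', '#')       AS all_replaced,     -- 'a#b#c#'
       REGEXP_REPLACE('a1b22c333', '[0-9]+', '#', 1, 2) AS second_only       -- 'a1b#c333'
FROM   dual;
```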
REGEXP_COUNT

 New in 11g
 Returns the number of times a pattern appears in a
  string.
 REGEXP_COUNT(source_str, pattern [,position
  [,match_param]])
 For a literal pattern (no metacharacters) it is the same as
  (LENGTH(source_str) - LENGTH(REPLACE(source_str, pattern))) / LENGTH(pattern)
 regexp_count.sql
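
A quick check of that equivalence on a made-up literal (REGEXP_COUNT needs 11g or later, as noted above). The LENGTH/REPLACE form returns NULL when the whole string is replaced away, because LENGTH of an empty string is NULL in Oracle, so the two are not interchangeable in every case.

```sql
SELECT REGEXP_COUNT('banana', 'an')                                   AS regexp_cnt,    -- 2
       (LENGTH('banana') - LENGTH(REPLACE('banana', 'an')))
         / LENGTH('an')                                               AS replace_cnt,   -- 2
       REGEXP_COUNT('banana', 'an', 3)                                AS from_pos_3     -- 1
FROM   dual;
```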
Pattern Matching modifiers

 i - specifies case-insensitive matching (ignore case)
 c - specifies case-sensitive matching
 n - allows the period (.) to match the newline character
 m - treats the source string as multiple lines, so ^ and $ match at
  the start and end of each line
 x - ignores whitespace characters in the pattern
 when match_parameter is not specified,
     case sensitivity is determined by NLS_SORT parameter
      (BINARY, BINARY_CI)
     A period (.) doesn't match newline character
     Source string is treated as a single line
 match_params.sql
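
A sketch of each modifier on made-up literals, using REGEXP_COUNT (11g or later) so that the effect shows up as a number. CHR(10) is the newline character.

```sql
SELECT REGEXP_COUNT('Oracle ORACLE oracle', 'oracle', 1, 'c')   AS case_sensitive,     -- 1
       REGEXP_COUNT('Oracle ORACLE oracle', 'oracle', 1, 'i')   AS case_insensitive,   -- 3
       REGEXP_COUNT('ab' || CHR(10) || 'cd', '^[a-z]', 1, 'c')  AS single_line,        -- 1
       REGEXP_COUNT('ab' || CHR(10) || 'cd', '^[a-z]', 1, 'cm') AS multi_line,         -- 2
       REGEXP_COUNT('ab' || CHR(10) || 'cd', 'b.c', 1, 'c')     AS dot_default,        -- 0
       REGEXP_COUNT('ab' || CHR(10) || 'cd', 'b.c', 1, 'cn')    AS dot_with_n,         -- 1
       REGEXP_COUNT('abc', 'a b c', 1, 'cx')                    AS spaces_ignored      -- 1
FROM   dual;
```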
Is a CHAR column all numeric?

 to_number(c1) raises ORA-01722:
  invalid number if a varchar2 column
  contains non-numeric characters.
 is_numeric.sql
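
One way to guard the conversion, shown here only as a sketch with made-up rows: test the value with REGEXP_LIKE inside a CASE, so that TO_NUMBER is reached only for strings made up entirely of digits. The pattern covers unsigned integers only and assumes VARCHAR2 data; a blank-padded CHAR column would need the trailing spaces allowed for (for example ' *$' at the end).

```sql
WITH t AS (
  SELECT '12345' AS c1 FROM dual UNION ALL
  SELECT '12a45'       FROM dual UNION ALL
  SELECT '-17'         FROM dual
)
SELECT c1,
       CASE WHEN REGEXP_LIKE(c1, '^[0-9]+$') THEN 'Y' ELSE 'N' END    AS all_digits,
       CASE WHEN REGEXP_LIKE(c1, '^[0-9]+$') THEN TO_NUMBER(c1) END   AS safe_number
FROM   t;
-- '12345' -> Y, 12345;  '12a45' -> N, NULL;  '-17' -> N, NULL
```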
Check constraints

 Put validation close to where the data is
  stored
 No need to have validation at different
  clients
 check_constraint.sql
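
A minimal sketch of such a constraint. The table, column, and constraint names are made up for illustration, and the pattern (a US-style ZIP code) is only an example of a format rule.

```sql
-- Hypothetical table: zip_code must be 5 digits, optionally followed by -4 digits.
CREATE TABLE demo_contacts (
  zip_code VARCHAR2(10),
  CONSTRAINT demo_zip_chk
    CHECK (REGEXP_LIKE(zip_code, '^[0-9]{5}(-[0-9]{4})?$'))
);

INSERT INTO demo_contacts (zip_code) VALUES ('94089');        -- accepted
INSERT INTO demo_contacts (zip_code) VALUES ('94089-1234');   -- accepted
INSERT INTO demo_contacts (zip_code) VALUES ('9408X');        -- fails with ORA-02290 (check constraint violated)
```

Because the rule lives in the table definition, every client that writes to the table is held to it without repeating the validation.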
Extract email-ids

 Find email-ids embedded in text
  strings. Possible email-id formats:
  abc123@company.com

  namex@mail.company.com

  xyz_1@yahoo.co.in

 extract_emailid.sql
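
The script itself is not shown, so the following is only one possible sketch. The pattern is an assumption chosen to cover the three formats listed above; it is far from a complete e-mail address grammar, and the sample sentences are made up.

```sql
WITH t AS (
  SELECT 'Mail abc123@company.com for details.' AS txt FROM dual UNION ALL
  SELECT 'Owner: namex@mail.company.com'               FROM dual UNION ALL
  SELECT 'xyz_1@yahoo.co.in is on leave'               FROM dual
)
SELECT REGEXP_SUBSTR(
         txt,
         -- local part, '@', then one or more dot-separated domain labels
         '[[:alnum:]._-]+@[[:alnum:]-]+(\.[[:alnum:]-]+)+'
       ) AS email_id
FROM   t;
-- abc123@company.com, namex@mail.company.com, xyz_1@yahoo.co.in
```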
Extract dates

 Extract dates embedded in text strings.
 Possible formats
 1/5/2007, 2-5-03, 12-31-2009,
  1/31/10, 2/5-10

 extract_date.sql
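
As a sketch under the assumption that a "date" means one or two digits, a separator (- or /), one or two digits, a separator, and two to four digits, the occurrence argument can pull out each date in turn. The pattern does not check that the values form a valid calendar date, and the sample text is made up.

```sql
SELECT REGEXP_SUBSTR(txt, '[0-9]{1,2}[-/][0-9]{1,2}[-/][0-9]{2,4}', 1, 1) AS first_date,    -- '12-31-2009'
       REGEXP_SUBSTR(txt, '[0-9]{1,2}[-/][0-9]{1,2}[-/][0-9]{2,4}', 1, 2) AS second_date,   -- '1/31/10'
       REGEXP_SUBSTR(txt, '[0-9]{1,2}[-/][0-9]{1,2}[-/][0-9]{2,4}', 1, 3) AS third_date     -- '2/5-10'
FROM  (SELECT 'Shipped 12-31-2009, invoiced 1/31/10, paid 2/5-10' AS txt FROM dual);
```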
Extracting hostnames from URLs
 Extract hostnames/domain-names embedded in
 text strings. Possible formats
  http://us.mg201.mail.yahoo.com/dc/launch?.partner
   =sbc&.gx=1&.rand=fegr2vucbecu5
  https://www.mybank.com:8080/abc/xyz

  www.mybank.com

  ftp://www.mycharity.org/abc/xyz

 extract_hostname.sql
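
One possible approach, offered as a sketch rather than the contents of the script above: treat the scheme as an optional group, capture everything up to the first '/', ':' or '?' as the host, and keep only that group. It assumes each value holds just the URL; URLs embedded in longer text would need a pattern without the ^ and $ anchors.

```sql
WITH t AS (
  SELECT 'http://us.mg201.mail.yahoo.com/dc/launch?.partner=sbc' AS url FROM dual UNION ALL
  SELECT 'https://www.mybank.com:8080/abc/xyz'                          FROM dual UNION ALL
  SELECT 'www.mybank.com'                                               FROM dual UNION ALL
  SELECT 'ftp://www.mycharity.org/abc/xyz'                              FROM dual
)
SELECT url,
       -- group 1: optional scheme such as http://   group 2: host name
       REGEXP_REPLACE(url, '^([[:alpha:]]+://)?([^/:?]+).*$', '\2') AS hostname
FROM   t;
```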
Convert value pairs to XML
 Input: A string such as 'remain1=1;remain2=2;'
 Output: An XML string
  <remain1><value=1></remain1>
  <remain2><value=2></remain2>
 convert_to_xml.sql
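
A single REGEXP_REPLACE with two sub-expressions can produce the output shown on this slide. This is only a sketch of the idea: it emits the tags back to back on one line, exactly in the shape given above.

```sql
SELECT REGEXP_REPLACE(
         'remain1=1;remain2=2;',
         '([^=;]+)=([^;]*);',        -- \1 = name, \2 = value
         '<\1><value=\2></\1>'
       ) AS xml_string
FROM   dual;
-- <remain1><value=1></remain1><remain2><value=2></remain2>
```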
Sort IP addresses in numerical order
 Sort IP addresses, that are stored as character
  strings, in numerical order.
 Input
    10.10.20.10
    127.0.0.1
    166.22.33.44
    187.222.121.0
    20.23.23.20

 sort_ip_address.sql
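
A sketch of one way to do this (not necessarily the approach in the script above): pull out each of the four octets with REGEXP_SUBSTR, convert it to a number, and sort on the four numbers in turn.

```sql
WITH t AS (
  SELECT '10.10.20.10'   AS ip FROM dual UNION ALL
  SELECT '127.0.0.1'           FROM dual UNION ALL
  SELECT '166.22.33.44'        FROM dual UNION ALL
  SELECT '187.222.121.0'       FROM dual UNION ALL
  SELECT '20.23.23.20'         FROM dual
)
SELECT ip
FROM   t
ORDER  BY TO_NUMBER(REGEXP_SUBSTR(ip, '[0-9]+', 1, 1)),
          TO_NUMBER(REGEXP_SUBSTR(ip, '[0-9]+', 1, 2)),
          TO_NUMBER(REGEXP_SUBSTR(ip, '[0-9]+', 1, 3)),
          TO_NUMBER(REGEXP_SUBSTR(ip, '[0-9]+', 1, 4));
-- 10.10.20.10, 20.23.23.20, 127.0.0.1, 166.22.33.44, 187.222.121.0
```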
Extract first name, last name, and middle initial

 Extract the first name, last name with
  an optional middle initial.
 first_last_mi.sql
Finding the Last Occurrence

 Find the last numeric sequence in a
  string.
 Return 567 from 'abc/123/def567/xyz'
 INSTR and SUBSTR allow backward
  search when position is negative.
  REGEXP functions don't allow backward
  search
 last_occurrence.sql
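
Since the occurrence argument cannot be negative, one workaround (a sketch that assumes 11g or later for REGEXP_COUNT) is to count the matches first and then ask for that occurrence:

```sql
SELECT REGEXP_SUBSTR(s, '[0-9]+', 1, REGEXP_COUNT(s, '[0-9]+')) AS last_number   -- '567'
FROM  (SELECT 'abc/123/def567/xyz' AS s FROM dual);
```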
Fuzzy Match

 Tables t1 and t2 each have a
  varchar2(12) column (t1.x, t2.y).
 A row in t1 is considered a match for a
  row in t2 if any six characters in t1.x
  match any six characters in t2.y
 fuzzy_match.sql
The lazy operator

 A ? placed after a quantifier (*?, +?, ??) makes it lazy (non-greedy):
  it matches as few characters as possible
 greedy_lazy.sql
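
The difference is easiest to see on a string with more than one closing delimiter. A sketch with a made-up literal, assuming a release that supports lazy quantifiers (10g Release 2 or later, as far as I know):

```sql
SELECT REGEXP_SUBSTR('<b>bold</b> and <i>italic</i>', '<.+>')  AS greedy,   -- '<b>bold</b> and <i>italic</i>'
       REGEXP_SUBSTR('<b>bold</b> and <i>italic</i>', '<.+?>') AS lazy      -- '<b>'
FROM   dual;
```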
Meta-characters with multiple meanings

 Same meta characters are used with
  multiple meanings
   ^ used for anchoring and negation.

   ? used as quantifier and lazy operator

   () used for grouping or sub-expression

 metachars_with_multiple_meanings.sql
Nuances

 ? (zero or one), * (zero or more)
  could sometimes mislead you
 nuances.sql
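
The script may cover other cases. One nuance that is easy to demonstrate is that ? and * are both satisfied by zero occurrences, so a pattern built only from them matches any string, including one that contains none of the intended characters:

```sql
SELECT CASE WHEN REGEXP_LIKE('abc', '[0-9]*') THEN 'Y' ELSE 'N' END AS star_matches,       -- 'Y' (zero digits is enough)
       CASE WHEN REGEXP_LIKE('abc', '[0-9]?') THEN 'Y' ELSE 'N' END AS question_matches,   -- 'Y'
       CASE WHEN REGEXP_LIKE('abc', '[0-9]+') THEN 'Y' ELSE 'N' END AS plus_matches        -- 'N' (needs at least one digit)
FROM   dual;
```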
Stored patterns


 patterns can be stored in table
  columns and be referenced in
  REGEXP functions
 No need to hard-code them

 stored_patterns.sql
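
A sketch of the idea with inline stand-ins for the two tables (the names patterns, inputs, pattern_name, and val are made up): the pattern argument is simply a column reference instead of a literal.

```sql
WITH patterns AS (
  SELECT 'five_digit_zip' AS pattern_name, '^[0-9]{5}$' AS pattern FROM dual UNION ALL
  SELECT 'all_uppercase',                  '^[[:upper:]]+$'        FROM dual
),
inputs AS (
  SELECT '94089'  AS val FROM dual UNION ALL
  SELECT 'ORACLE'        FROM dual UNION ALL
  SELECT 'n/a'           FROM dual
)
SELECT i.val, p.pattern_name
FROM   inputs i, patterns p
WHERE  REGEXP_LIKE(i.val, p.pattern, 'c');   -- the pattern comes from a column
-- '94089' matches five_digit_zip, 'ORACLE' matches all_uppercase, 'n/a' matches neither
```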
Random things
 Insert a dash before the two last digits
 Remove a substring
 Get rid of useless commas from a string
 Find the word that comes immediately
  before a substring (e.g. XXX)
 Replace whole words, not parts of words
 Trimming the trailing digits
 random.sql
A few other points

 When not to use Regular Expressions
   If the same thing could be done without regular
    expressions and without too much coding.
 POSIX notations need double brackets: [[:upper:]]. [:upper:]
  won't work. [[:UPPER:]] won't work either. It has to be in
  lower case letters.
 Locale support is provided with Collation Elements [[.ce.]]
  and Equivalence Classes [[=e=]]
 MySQL supports regular expressions with RLIKE
References

 Oracle® Database Advanced Application Developer's Guide
    (http://download.oracle.com/docs/cd/E11882_01/appdev.112/e17125/adfns_regexp.htm#CHDGHBHF)
 Anti-Patterns in Regular Expressions:
    http://gennick.com/antiregex.html
 First Expressions. An article by Jonathan Gennick, Oracle
    Magazine, Sep/Oct 2003.
 Oracle Regular Expressions Pocket Reference by
    Jonathan Gennick
 http://examples.oreilly.com/9780596006013/RegexPocketData.sql
References ...

 http://www.psoug.org/reference/regexp.html
 http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/conditions007.htm#SQLRF00501
 http://www.oracle.com/technology/pub/articles/saternos_regexp.html
 http://www.oracle.com/technology/products/database/application_development/pdf/TWP_Regular_Expressions.pdf
 http://asktom.oracle.com/pls/asktom/asktom.search?p_string=regexp_
References ...

 http://www.oracle.com/technology/obe/10gr2_db_single/develop/regexp/regexp_otn.htm
 http://www.oracle.com/technology/sample_code/tech/pl_sql/index.html
 http://forums.oracle.com/forums/thread.jspa?threadID=427716
 http://forums.oracle.com/forums/search.jspa?threadID=&q=regular+expression&objID=f75&dateRange=all&userID=&numResults=120&rankBy=9
 http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2200894550208#1568589800346862515
Q&A

 devel_oracle@
