22. Readily Available
Support in programming languages:
JavaScript, PHP, Perl, C/C++, etc.
Command-line: grep, awk, sed
Text editors: Vim, Emacs, Notepad++
IDEs: Aptana, NetBeans, Visual Studio .NET
26. Character Classes
A character class matches one and ONLY
one character from a set of characters
[Aa] : matches either ‘A’ or ‘a’
[a-z] : matches ONE lowercase letter
in the range ‘a’ to ‘z’
[^Aa] : matches any one character
except ‘A’ and ‘a’
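A minimal sketch of these three classes, using Python's re module. Python and the sample strings are assumptions for illustration, not part of the slides.

```python
import re

# [Aa] matches exactly one character: either 'A' or 'a'.
either_a = re.findall(r"[Aa]", "Alabama")

# [a-z] matches exactly one lowercase ASCII letter per match.
lowercase = re.findall(r"[a-z]", "Hi Bob")

# [^Aa] matches exactly one character that is neither 'A' nor 'a'.
not_a = re.findall(r"[^Aa]", "Aardvark")

print(either_a)   # ['A', 'a', 'a', 'a']
print(lowercase)  # ['i', 'o', 'b']
print(not_a)      # ['r', 'd', 'v', 'r', 'k']
```

Each list item is a single character, because a class consumes one character per match.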
27. Character Classes
Metacharacters may behave differently
within character classes
[^red] : ^ comes first, so it negates:
matches any one character except ‘r’,
‘e’ and ‘d’
[r^ed] : ^ is not first, so it is
literal: matches only ‘r’, ‘^’, ‘e’
or ‘d’
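The position of the caret decides its meaning. A small Python sketch (the sample strings are made up for illustration):

```python
import re

# Caret first: negation. Keeps every character that is not 'r', 'e' or 'd'.
negated = re.findall(r"[^red]", "redder bird")

# Caret elsewhere: a literal '^' that is simply one more member of the set.
literal_caret = re.findall(r"[r^ed]", "r^2 + e^d")

print(negated)        # [' ', 'b', 'i']
print(literal_caret)  # ['r', '^', 'e', '^', 'd']
```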
28. Shorthand Classes
\d, [0-9] : digits
\w, [\da-zA-Z_] : alphanumeric or _
\s, [ \t\r\n] : whitespace (many engines
also include \f and \v)
\D, \W, \S : the above BUT negated
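A short Python sketch of the shorthand classes. One assumption to note: Python's \d and \w are Unicode-aware by default, so the re.ASCII flag is used here to get the [0-9] and [0-9a-zA-Z_] behavior listed above. The sample text is invented.

```python
import re

text = "Room_42 opens\tat 9"

# \d : one digit per match
digits = re.findall(r"\d", text, flags=re.ASCII)

# \w+ : runs of letters, digits and underscores
words = re.findall(r"\w+", text, flags=re.ASCII)

# \s : one whitespace character per match (space, tab, newline, ...)
spaces = re.findall(r"\s", text)

# \D is the negation of \d: deleting every non-digit leaves only the digits
only_digits = re.sub(r"\D", "", text, flags=re.ASCII)

print(digits)       # ['4', '2', '9']
print(words)        # ['Room_42', 'opens', 'at', '9']
print(spaces)       # [' ', '\t', ' ']
print(only_digits)  # 429
```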
29. The Dot Character
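The Dot (.) character matches any
single character BUT the newline
Equivalent to [^\n] in many engines
(UNIX/Linux/Mac line endings)
Some engines also exclude \r, i.e.
[^\r\n] (Windows line endings)
Use it sparingly: it matches almost
anything, which invites unintended
matches and extra backtracking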
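How the dot treats line breaks depends on the engine. The sketch below shows the behavior of Python's re module only (an assumption about tooling; other engines may differ): the dot skips \n, still matches \r, and matches everything once re.DOTALL is set.

```python
import re

text = "ab\ncd"

# Default mode: the newline is skipped.
dot_default = re.findall(r".", text)

# re.DOTALL makes the dot match the newline as well.
dot_all = re.findall(r".", text, flags=re.DOTALL)

# In Python, a carriage return is NOT excluded by the dot.
dot_cr = re.findall(r".", "a\r\nb")

print(dot_default)  # ['a', 'b', 'c', 'd']
print(dot_all)      # ['a', 'b', '\n', 'c', 'd']
print(dot_cr)       # ['a', '\r', 'b']
```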
30. Alternation
Using a pipe |, match either the left
or right side of the pattern
bear|tiger : matches a string that
contains either “bear” or “tiger”
pedo(bea|tige)r : matches a string
that contains either “pedobear” or
“pedotiger”
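Alternation can be sketched in Python as follows. The gr(?:a|e)y pattern and the sample strings are illustrations added here, not from the slides; (?: ) is a plain grouping that limits how far the pipe reaches.

```python
import re

animal = re.compile(r"bear|tiger")

# search() succeeds when either alternative occurs anywhere in the string.
found_bear = animal.search("a polar bear").group()
found_tiger = animal.search("tiger lily").group()
found_none = animal.search("lion")

# Parentheses restrict the alternation to part of the pattern.
grey = re.compile(r"gr(?:a|e)y")
spellings = grey.findall("gray or grey, not groy")

print(found_bear)   # bear
print(found_tiger)  # tiger
print(found_none)   # None
print(spellings)    # ['gray', 'grey']
```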
31. Quantifiers
{n} : matches the preceding token
exactly n times
{n,} : matches n or more times
{n,m} : matches between n and m times
* : same as {0,}
+ : same as {1,}
? : same as {0,1}
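A quick way to check these equivalences is to test whole strings against each quantifier. The helper name fully_matches and the sample strings are hypothetical, chosen for this sketch.

```python
import re

def fully_matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

exactly_three = fully_matches(r"a{3}", "aaa")    # True
too_few = fully_matches(r"a{3}", "aa")           # False
two_or_more = fully_matches(r"a{2,}", "aaaaa")   # True
two_to_four = fully_matches(r"a{2,4}", "aaaaa")  # False: five is too many
star_zero = fully_matches(r"ab*", "a")           # True: * allows zero b's
plus_zero = fully_matches(r"ab+", "a")           # False: + needs at least one b

# ? makes the 'u' optional: zero or one occurrence.
optional_u = [fully_matches(r"colou?r", w) for w in ("color", "colour", "colouur")]
print(optional_u)  # [True, True, False]
```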
32. Quantifiers
Quantifiers are greedy by default
<.+> : matches all of “<div>holy RegEx,
Batman!</div>” instead of stopping at
“<div>”
Add a ? after the quantifier to make
it lazy
<.+?> : stops at “<div>” in
“<div>holy RegEx, Batman!</div>”
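The difference is easy to see side by side. A minimal Python sketch using the string from the slide:

```python
import re

html = "<div>holy RegEx, Batman!</div>"

# Greedy: .+ grabs as much as it can, then backs off to the LAST '>'.
greedy = re.search(r"<.+>", html).group()

# Lazy: .+? grabs as little as it can, stopping at the FIRST '>'.
lazy = re.search(r"<.+?>", html).group()

# With the lazy form, each tag is found separately.
tags = re.findall(r"<.+?>", html)

print(greedy)  # <div>holy RegEx, Batman!</div>
print(lazy)    # <div>
print(tags)    # ['<div>', '</div>']
```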
33. Anchors
Matches positions instead of
characters
^ : matches the beginning of a string
$ : matches the end of a string
\b : matches a word boundary, the
position between a \w character and
anything that’s not a \w (including
the start or end of the string)
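Anchors consume no characters, which a small Python sketch can show (the sample strings are invented for illustration):

```python
import re

# ^ pins the match to the start of the string.
starts_with_cat = re.search(r"^cat", "cat nap") is not None
not_at_start = re.search(r"^cat", "tomcat") is not None

# $ pins the match to the end of the string.
ends_with_cat = re.search(r"cat$", "tomcat") is not None

# \b requires a word boundary on each side, so the 'cat' inside
# 'concat' is skipped while 'cat.' still counts.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

print(starts_with_cat)  # True
print(not_at_start)     # False
print(ends_with_cat)    # True
print(whole_words)      # ['cat', 'cat']
```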
34. Groupings
Placing parentheses around tokens
groups them together : /nyan(cat)/
It also captures the match for a
backreference : /(cat)\1/ matches
“catcat”
OR if you don’t want a group to be
captured, use (?: ) : /(?:nyan)(cat)\1/
matches “nyancatcat” and not
“nyancatnyan”
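A Python sketch of capturing groups, backreferences and non-capturing groups, using the patterns from the slide with the \1 backreference written out:

```python
import re

# A capturing group is available as group(1) after the match.
m = re.search(r"nyan(cat)", "nyancat")
whole, captured = m.group(0), m.group(1)

# \1 repeats whatever group 1 captured.
double_cat = re.fullmatch(r"(cat)\1", "catcat") is not None
single_cat = re.fullmatch(r"(cat)\1", "cat") is not None

# (?:nyan) groups without capturing, so (cat) is still group 1.
pattern = r"(?:nyan)(cat)\1"
cat_twice = re.fullmatch(pattern, "nyancatcat")
nyan_twice = re.fullmatch(pattern, "nyancatnyan")

print(whole, captured)     # nyancat cat
print(double_cat)          # True
print(single_cat)          # False
print(cat_twice.groups())  # ('cat',)
print(nyan_twice)          # None
```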
35. Lookahead
Positive Lookahead
Iron(?=man) : matches “Iron” only
if it is followed by “man”
Negative Lookahead
Iron(?!man) : matches “Iron” only
if it is not followed by “man”
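A lookahead checks what follows without including it in the match. A minimal Python sketch (the word “Ironing” is an added example):

```python
import re

# Positive lookahead: 'man' must follow, but is not part of the match.
pos = re.search(r"Iron(?=man)", "Ironman")
pos_text, pos_end = pos.group(), pos.end()
pos_fail = re.search(r"Iron(?=man)", "Ironing")

# Negative lookahead: 'man' must NOT follow.
neg_text = re.search(r"Iron(?!man)", "Ironing").group()
neg_fail = re.search(r"Iron(?!man)", "Ironman")

print(pos_text, pos_end)  # Iron 4
print(pos_fail)           # None
print(neg_text)           # Iron
print(neg_fail)           # None
```

The match ends at index 4 in “Ironman”, so “man” is still available to whatever pattern comes next.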
36. Lookbehind
Positive Lookbehind
(?<=Iron)man : matches “man” only
if it is preceded by “Iron”
Negative Lookbehind
(?<!Iron)man : matches “man” only
if it is not preceded by “Iron”
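Lookbehind works the same way in the other direction. A Python sketch follows; note that Python's re module requires the lookbehind pattern to have a fixed width, and support varies between engines. “Batman” is an added example.

```python
import re

# Positive lookbehind: 'Iron' must come right before, but is not matched.
pos = re.search(r"(?<=Iron)man", "Ironman")
pos_text, pos_start = pos.group(), pos.start()
pos_fail = re.search(r"(?<=Iron)man", "Batman")

# Negative lookbehind: 'Iron' must NOT come right before.
neg = re.search(r"(?<!Iron)man", "Batman")
neg_text, neg_start = neg.group(), neg.start()
neg_fail = re.search(r"(?<!Iron)man", "Ironman")

print(pos_text, pos_start)  # man 4
print(pos_fail)             # None
print(neg_text, neg_start)  # man 3
print(neg_fail)             # None
```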
37. Modifiers
Modifiers alter the behavior of the
matching engine (syntax differs
between tools)
/i : case-insensitive match
/m : multi-line mode (^ and $ also
match at the start and end of each line)
/g : global, finds all possible
matches, not just the first
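As an example of how tools differ, Python has no /i, /m or /g suffixes. A rough mapping, as a sketch: re.IGNORECASE for /i, re.MULTILINE for /m, and choosing re.findall over re.search in place of /g. The sample text is invented.

```python
import re

text = "Cat\ncat\nCAT"

# No flags: case-sensitive.
plain = re.findall(r"cat", text)

# /i equivalent: ignore case.
ignore_case = re.findall(r"cat", text, flags=re.IGNORECASE)

# Without /m, ^ and $ only match at the ends of the whole string.
single_line = re.findall(r"^cat$", text, flags=re.IGNORECASE)

# /m equivalent: ^ and $ also match at each line break.
multi_line = re.findall(r"^cat$", text, flags=re.IGNORECASE | re.MULTILINE)

# /g equivalent: search() returns the first match, findall() returns all.
first_only = re.search(r"cat", text, flags=re.IGNORECASE).group()

print(plain)        # ['cat']
print(ignore_case)  # ['Cat', 'cat', 'CAT']
print(single_line)  # []
print(multi_line)   # ['Cat', 'cat', 'CAT']
print(first_only)   # Cat
```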