Regex examples

If-then-else

Match "Ms." if the word "her" appears later in the string, otherwise match "Mr."

M(?(?=.*?\bher\b)s|r)\.

The IF condition is written as a lookaround (here, a lookahead).
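
Python's built-in re module does not accept a lookaround as the IF condition, so the sketch below rewrites the conditional as two alternatives, each guarded by its own lookahead. The helper name title_in and the sample sentences are made up for illustration.

```python
import re

# Each branch carries its own guard: "s" only if "her" follows, "r" only if it does not.
HER_LATER = r".*?\bher\b"
pattern = re.compile(rf"M(?:(?={HER_LATER})s|(?!{HER_LATER})r)\.")

def title_in(text):
    """Return the matched title, or None if neither branch applies."""
    m = pattern.search(text)
    return m.group(0) if m else None

print(title_in("Ms. Jones lost her keys"))  # Ms.
print(title_in("Mr. Jones lost his keys"))  # Mr.
print(title_in("Mr. Jones lost her keys"))  # None
```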

Lookaround

Pattern Meaning
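(?= ) Positive lookahead, succeeds if the pattern CAN be found ahead
(?! ) Negative lookahead, succeeds if the pattern can NOT be found ahead
(?<= ) Positive lookbehind, succeeds if the pattern CAN be found behind
(?<! ) Negative lookbehind, succeeds if the pattern can NOT be found behind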
\b\w+?(?=ing\b) Match the part before "ing" in warbling, string, fishing, …
\b(?!\w+ing\b)\w+\b Words NOT ending in "ing"
(?<=\bpre).*?\b Match the part after "pre" in pretend, present, prefix, …
\b\w{3}(?<!pre)\w*?\b Words NOT starting with "pre"
\b\w+(?<!ing)\b Match words NOT ending in "ing"
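
As a quick check of the lookaround patterns above, the sketch below runs four of them with Python's re module. The sample sentence is an assumption chosen for illustration.

```python
import re

text = "warbling string fishing pretend present prefix sang"

# Lookahead: the "ing" is required but not consumed, so only the stem is returned.
stems = re.findall(r"\b\w+?(?=ing\b)", text)

# Negative lookahead: reject a word at its start if it would end in "ing".
not_ing = re.findall(r"\b(?!\w+ing\b)\w+\b", text)

# Lookbehind: the "pre" is required but not consumed.
after_pre = re.findall(r"(?<=\bpre).*?\b", text)

# Negative lookbehind: the first three letters must not be "pre".
not_pre = re.findall(r"\b\w{3}(?<!pre)\w*?\b", text)

print(stems)      # ['warbl', 'str', 'fish']
print(not_ing)    # ['pretend', 'present', 'prefix', 'sang']
print(after_pre)  # ['tend', 'sent', 'fix']
print(not_pre)    # ['warbling', 'string', 'fishing', 'sang']
```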

Atomic groups

Pattern Meaning
(?>red|green|blue) Faster than a non-capturing group (the engine never backtracks into it)
(?>id|identity)\b Match id, but not identity
"id" matches, but \b fails after the atomic group, and
the engine doesn't backtrack into the group to retry "identity"

If alternatives overlap, order longer to shorter.
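
Python's re module accepts the (?>...) syntax only from version 3.11 on. The sketch below therefore emulates the atomic group with a capturing lookahead followed by a backreference, which is a different technique that should behave the same way here: once the lookahead has settled on "id", the engine does not go back into it.

```python
import re

# Ordinary group: after \b fails behind "id", the engine retries "identity".
plain = re.compile(r"(?:id|identity)\b")

# Emulated atomic group: the lookahead picks the first alternative that fits,
# \1 consumes it, and no second alternative is tried if \b then fails.
atomic_like = re.compile(r"(?=(id|identity))\1\b")

print(plain.search("identity").group(0))   # identity
print(atomic_like.search("identity"))      # None
print(atomic_like.search("id").group(0))   # id
```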

Non-capturing group

Pattern Meaning
on(?:click|load) Faster than: on(click|load)
Use non-capturing or atomic groups when possible
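
The difference is visible in what the match object stores. A small sketch with Python's re module, using "onclick" as an assumed sample input:

```python
import re

capturing = re.search(r"on(click|load)", "onclick")
non_capturing = re.search(r"on(?:click|load)", "onclick")

# Both patterns match the same text, but only the first one stores a group.
print(capturing.group(0), capturing.groups())          # onclick ('click',)
print(non_capturing.group(0), non_capturing.groups())  # onclick ()
```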

Back references

Pattern Matches
(to) (be) or not \1 \2 Match to be or not to be
([^\s])\1{2} Match a non-space character, then the same character twice more: aaa, …
\b(\w+)\s+\1\b Match doubled words
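
The back references above can be tried out with Python's re module. The sample strings are assumptions for illustration. Note that re.findall returns group contents when a pattern has groups, so the sketch uses finditer and group(0) to get the whole match.

```python
import re

# \1 and \2 must repeat exactly what groups 1 and 2 captured.
to_be = re.fullmatch(r"(to) (be) or not \1 \2", "to be or not to be")

# Three identical non-space characters in a row.
triples = [m.group(0) for m in re.finditer(r"([^\s])\1{2}", "aaa bbbb cc")]

# A word followed by whitespace and the same word again.
doubled = re.search(r"\b(\w+)\s+\1\b", "Paris in the the spring")

print(to_be is not None)  # True
print(triples)            # ['aaa', 'bbb']
print(doubled.group(0))   # the the
```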

Groups

Pattern Meaning
(in|out)put Match input or output
\d{5}(-\d{4})? US ZIP code ("+4" extension optional)
If the match fails after the group, the engine backtracks and tries EACH remaining alternative.

This can lead to catastrophic backtracking.
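
A short sketch of the two group patterns above in Python's re module. The ZIP codes and the word list are made-up sample inputs.

```python
import re

# Alternation inside a group: either prefix, then the shared suffix.
io_words = [m.group(0) for m in re.finditer(r"(in|out)put", "input output throughput")]

# Optional group: the "-dddd" extension may be present or absent as a whole.
zip_re = re.compile(r"\d{5}(-\d{4})?")
valid = [z for z in ["98195", "98195-1234", "98195-12", "9819"] if zip_re.fullmatch(z)]

print(io_words)  # ['input', 'output']
print(valid)     # ['98195', '98195-1234']
```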

Modifiers

Pattern Meaning
(?i)[a-z]*(?-i) Ignore case ON / OFF
(?s).*(?-s) Match across multiple lines (causes . to match newline)
(?m)^.*;$(?-m) ^ and $ match at each line, not only the whole string
(?x) #free-spacing mode, this EOL comment ignored
(?-x) free-spacing mode OFF
/regex/ismx Set modes for the entire pattern
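
Modifier syntax varies between regex flavors. As a sketch of the Python equivalents: re accepts (?i) only at the start of a pattern, has no standalone (?-i) switch, and offers scoped groups such as (?i:...) plus flag arguments instead. The sample strings below are assumptions for illustration.

```python
import re

# (?i) at the very start applies to the whole pattern, like re.IGNORECASE.
ci = re.fullmatch(r"(?i)[a-z]*", "HeLLo")

# Scoped form: only the text inside (?i:...) ignores case.
scoped = re.search(r"(?i:mr)\. Smith", "MR. Smith")

# re.DOTALL is the (?s) mode: "." also matches newline.
dotall = re.search(r"a.*c", "a\nb\nc", re.DOTALL).group(0)

# re.MULTILINE is the (?m) mode: ^ and $ match at every line.
lines = re.findall(r"^.*;$", "x = 1;\ny = 2\nz = 3;", re.MULTILINE)

# re.VERBOSE is the (?x) mode: whitespace and # comments are ignored.
ssn = re.compile(r"""
    \d{3}   # three digits
    -
    \d{2}   # two digits
    -
    \d{4}   # four digits
""", re.VERBOSE)

print(lines)  # ['x = 1;', 'z = 3;']
```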

Scope

Pattern Meaning
\b "Word" edge (next to non "word" character)
\bring Word starts with "ring", ex ringtone
ring\b Word ends with "ring", ex spring
\b9\b Match single digit 9, not 19, 91, 99, etc.
\b[a-zA-Z]{6}\b Match 6-letter words
\B Not word edge
\Bring\B Match springs and wringer
^\d*$ Entire string must be digits (also matches an empty string; use \d+ to require at least one)
^[a-zA-Z]{4,20}$ String must have 4-20 letters
^[A-Z] String must begin with capital letter
[\.!?"')]$ String must end with terminal punctuation
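
The boundary and anchor patterns above can be checked with a few lines of Python. The word list is an assumption for illustration.

```python
import re

words = ["ringtone", "spring", "springs", "wringer", "ring"]

starts = [w for w in words if re.search(r"\bring", w)]    # word starts with "ring"
ends = [w for w in words if re.search(r"ring\b", w)]      # word ends with "ring"
inner = [w for w in words if re.search(r"\Bring\B", w)]   # "ring" strictly inside a word

# \b on both sides keeps 9 from matching inside a longer number.
nines = re.findall(r"\b9\b", "9 19 91 99 9")

# ^\d*$ also accepts the empty string because * allows zero digits.
empty_ok = re.search(r"^\d*$", "") is not None

print(starts)    # ['ringtone', 'ring']
print(ends)      # ['spring', 'ring']
print(inner)     # ['springs', 'wringer']
print(nines)     # ['9', '9']
print(empty_ok)  # True
```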

Greedy versus lazy

Pattern Meaning
* + {n,} greedy Match as much as possible
<.+> Finds 1 big match in <b>bold</b>
*? +? {n,}? lazy Match as little as possible
<.+?> Finds 2 matches in <b>bold</b>
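
A minimal sketch of the difference, applying both patterns to the assumed sample string <b>bold</b> with Python's re module:

```python
import re

html = "<b>bold</b>"

# Greedy: .+ runs to the end of the string, then backs up to the last ">".
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can reach.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b>']
print(lazy)    # ['<b>', '</b>']
```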

Occurrences

Pattern Matches
colou?r Match color or colour
[BW]ill[ieamy's]* Match Bill, Willy, William's etc.
[a-zA-Z]+ Match 1 or more letters
\d{3}-\d{2}-\d{4} Match a SSN
[a-z]\w{1,7} Match a UW NetID
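
The quantifier patterns above can be exercised as follows. All names, the SSN, and the NetID candidates are made-up sample values.

```python
import re

# ? makes the preceding "u" optional.
colors = re.findall(r"colou?r", "color colour colr")

# * allows any number of characters from the class, including none.
names = ["Bill", "Willy", "William's", "Jill"]
matched = [n for n in names if re.fullmatch(r"[BW]ill[ieamy's]*", n)]

# {n} repeats exactly n times; {1,7} repeats between 1 and 7 times.
ssn_ok = re.fullmatch(r"\d{3}-\d{2}-\d{4}", "123-45-6789") is not None
netids = [s for s in ["jdoe42", "x", "9lives"] if re.fullmatch(r"[a-z]\w{1,7}", s)]

print(colors)   # ['color', 'colour']
print(matched)  # ['Bill', 'Willy', "William's"]
print(netids)   # ['jdoe42']
```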

Shorthand classes

Pattern Meaning
\w "Word" character (letter, digit, or underscore)
\d Digit
\s Whitespace (space, tab, vtab, newline)
\W, \D, or \S Not word, digit, or whitespace
[\D\S] Matches ANY character, since every character is a non-digit or a non-whitespace
[^\d\s] Match any character that is neither a digit nor whitespace
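
The difference between the last two classes is easy to get wrong, so here is a small check in Python. The sample string is an assumption for illustration.

```python
import re

sample = "a1 _\t"

# [\D\S] is a union: "1" is non-whitespace and " " is non-digit, so everything matches.
either = re.findall(r"[\D\S]", sample)

# [^\d\s] is a negation: digits and whitespace are both excluded.
neither = re.findall(r"[^\d\s]", sample)

print(either)   # ['a', '1', ' ', '_', '\t']
print(neither)  # ['a', '_']
```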

Alternatives

Pattern Matches
cat|dog Match cat or dog
id|identity Match id, but only the "id" part of identity
identity|id Match id or identity
The engine takes the first alternative that matches, so order longer to shorter when alternatives overlap
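
The effect of alternative order can be seen by running both patterns over the same assumed sample text in Python:

```python
import re

text = "id identity"

# "id" is tried first and succeeds, so "identity" is never reached.
short_first = re.findall(r"id|identity", text)

# With the longer alternative first, the full word is matched.
long_first = re.findall(r"identity|id", text)

print(short_first)  # ['id', 'id']
print(long_first)   # ['id', 'identity']
```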

Characters

Pattern Matches
ring Match ring, springboard, etc.
. Match a, 9, + etc.
h.o Match hoo, h2o, h/o etc.
ring\? Match ring?
\(quiet\) Match (quiet)
c:\\windows Match c:\windows
Use \ to search for these special characters:
[ \ ^ $ . | ? * + ( ) { }
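
A short sketch of literal matching and escaping in Python. The sample strings are assumptions for illustration. When the text to search for comes from a variable, re.escape adds the backslashes for you.

```python
import re

# "." matches any single character except newline.
dot = re.findall(r"h.o", "hoo h2o h/o ho")

# Escaped metacharacters match themselves.
question = re.search(r"ring\?", "Did it ring?").group(0)
parens = re.search(r"\(quiet\)", "(quiet) please").group(0)
path = re.search(r"c:\\windows", r"c:\windows\system32").group(0)

# re.escape builds the escaped pattern from a plain string.
escaped = re.search(re.escape("(quiet)"), "(quiet) please").group(0)

print(dot)       # ['hoo', 'h2o', 'h/o']
print(question)  # ring?
print(parens)    # (quiet)
print(path)      # c:\windows
```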