Getting Started

Import the regular expressions module

import re

Character Class	Same as	Meaning
`[[:alnum:]]`	`[0-9A-Za-z]`	Letters and digits
`[[:alpha:]]`	`[A-Za-z]`	Letters
`[[:ascii:]]`	`[\x00-\x7F]`	ASCII codes 0-127
`[[:blank:]]`	`[\t ]`	Space or tab only
`[[:cntrl:]]`	`[\x00-\x1F\x7F]`	Control characters
`[[:digit:]]`	`[0-9]`	Decimal digits
`[[:graph:]]`	`[[:alnum:][:punct:]]`	Visible characters (not space)
`[[:lower:]]`	`[a-z]`	Lowercase letters
`[[:print:]]`	`[ -~] == [ [:graph:]]`	Visible characters and space
`[[:punct:]]`	`[!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~]`	Punctuation symbols
`[[:space:]]`	`[\t\n\v\f\r ]`	Whitespace
`[[:upper:]]`	`[A-Z]`	Uppercase letters
`[[:word:]]`	`[0-9A-Za-z_]`	Word characters
`[[:xdigit:]]`	`[0-9A-Fa-f]`	Hexadecimal digits
`[[:<:]]`	`\b(?=\w)`	Start of word
`[[:>:]]`	`\b(?<=\w)`	End of word
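Python's built-in `re` module, imported above, does not recognize POSIX bracket classes such as `[[:digit:]]`, so in Python the explicit expressions from the "Same as" column are the practical substitute. The following is a minimal sketch under that assumption; `POSIX_EQUIVALENTS` and `find_runs` are hypothetical names introduced only for illustration.

```python
import re

# Hypothetical lookup table for illustration: POSIX class name -> the explicit
# bracket expression listed in the "Same as" column above.
POSIX_EQUIVALENTS = {
    "alnum": r"[0-9A-Za-z]",
    "alpha": r"[A-Za-z]",
    "digit": r"[0-9]",
    "xdigit": r"[0-9A-Fa-f]",
    "space": r"[\t\n\v\f\r ]",
}

def find_runs(class_name, text):
    """Return every run of characters in `text` that belongs to the named class."""
    return re.findall(POSIX_EQUIVALENTS[class_name] + "+", text)

print(find_runs("digit", "Order 66 shipped in 2024"))  # ['66', '2024']
print(find_runs("alpha", "abc123def"))                 # ['abc', 'def']
print(find_runs("xdigit", "#FFaa00"))                  # ['FFaa00']
```

Note that these ASCII ranges are narrower than Python's `\d` and `\w`, which also match non-ASCII digits and letters on `str` patterns unless `re.ASCII` is set.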

Pattern	Description
`(?R)`	Recurse entire pattern
`(?1)`	Recurse first subpattern
`(?+1)`	Recurse first relative subpattern
`(?&name)`	Recurse subpattern `name`
`(?P=name)`	Match subpattern `name`
`(?P>name)`	Recurse subpattern `name`
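Of the constructs in this table, the standard `re` module accepts only the named backreference `(?P=name)`; the recursion forms are PCRE syntax that `re` rejects. As a small sketch of the backreference, the pattern below finds a word that is immediately repeated. The variable names are illustrative.

```python
import re

# (?P<word>...) names a group; (?P=word) must then match the same text again.
doubled = re.compile(r"\b(?P<word>\w+)\s+(?P=word)\b")

m = doubled.search("Paris in the the spring")
print(m.group("word"))  # the
print(m.group(0))       # the the

# No word is repeated here, so there is no match.
print(doubled.search("Paris in the spring"))  # None
```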

Pattern	Description
`(?=...)`	Positive Lookahead
`(?!...)`	Negative Lookahead
`(?<=...)`	Positive Lookbehind
`(?<!...)`	Negative Lookbehind
Lookaround asserts that a pattern matches immediately before (lookbehind) or after (lookahead) the current position without consuming any characters, so the asserted text is not included in the result.
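The four lookaround forms can be tried directly in Python's `re` module. The sketch below uses a made-up sample string; in each case only the digits are returned, never the surrounding text that the assertion checks.

```python
import re

text = "price: $42, discount: $7, total: 35 USD"

# Positive lookbehind: digits preceded by "$"; the "$" is not part of the match.
behind_pos = re.findall(r"(?<=\$)\d+", text)
print(behind_pos)  # ['42', '7']

# Positive lookahead: digits followed by " USD"; " USD" is not part of the match.
ahead_pos = re.findall(r"\d+(?= USD)", text)
print(ahead_pos)   # ['35']

# Negative lookbehind: digit runs not preceded by "$" or by another digit.
behind_neg = re.findall(r"(?<![$\d])\d+", text)
print(behind_neg)  # ['35']

# Negative lookahead: digit runs not followed by another digit or by " USD".
ahead_neg = re.findall(r"\d+(?!\d| USD)", text)
print(ahead_neg)   # ['42', '7']
```

The extra `\d` inside the negative assertions stops the engine from backtracking into the middle of a number and reporting a partial match.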

Pattern	Description
`(?(1)yes|no)`	Conditional on capture group 1 having matched
`(?(R)yes|no)`	Conditional on being inside any recursion
`(?(R#)yes|no)`	Conditional on being inside a recursion into group #
`(?(R&name)yes|no)`	Conditional on being inside a recursion into group `name`
`(?(?=...)yes|no)`	Lookahead conditional
`(?(?<=...)yes|no)`	Lookbehind conditional
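Python's `re` module supports only the first form, a conditional on a group number or name; the recursion and lookaround conditionals are PCRE features. As a sketch of the group-number form, the pattern below (adapted from the example in the Python `re` documentation) accepts an address either fully bracketed or not bracketed at all.

```python
import re

# Group 1 optionally matches "<". The conditional (?(1)>|$) then requires ">"
# if group 1 took part in the match, and end of string otherwise.
addr = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

samples = ["<user@host.com>", "user@host.com", "<user@host.com", "user@host.com>"]
results = {s: addr.match(s) is not None for s in samples}
for s, ok in results.items():
    print(f"{s!r}: {ok}")
# '<user@host.com>': True
# 'user@host.com': True
# '<user@host.com': False
# 'user@host.com>': False
```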

Pattern	Description
`(...)`	Capture everything enclosed
`(a|b)`	Match either a or b
`(?:...)`	Match everything enclosed without capturing (non-capturing group)
`(?>...)`	Atomic group (non-capturing)
`(?|...)`	Branch reset group: alternatives share the same group numbers
`(?#...)`	Comment
`(?'name'...)`	Named Capturing Group
`(?<name>...)`	Named Capturing Group
`(?P<name>...)`	Named Capturing Group
`(?imsxXU)`	Inline modifiers
`(?(DEFINE)...)`	Pre-define patterns before using them
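A short sketch of the group constructs that Python's `re` module accepts: capturing, non-capturing, named groups in the `(?P<name>...)` spelling, and the inline modifier `(?i)`. The `(?<name>...)` and `(?'name'...)` spellings, branch reset, and `(?(DEFINE)...)` are not available in `re`. The sample strings are made up.

```python
import re

# Named capturing groups, read back by name or as a dict.
date = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = date.search("released on 2024-03-15")
print(m.group("year"))  # 2024
print(m.groupdict())    # {'year': '2024', 'month': '03', 'day': '15'}

# With a non-capturing group, findall returns each whole match.
non_capturing = re.findall(r"(?:ab)+", "ababab ab")
print(non_capturing)    # ['ababab', 'ab']

# With a capturing group, findall returns the group's last captured text instead.
capturing = re.findall(r"(ab)+", "ababab ab")
print(capturing)        # ['ab', 'ab']

# Inline modifier: (?i) makes the whole pattern case-insensitive.
any_case = re.findall(r"(?i)python", "Python PYTHON python")
print(any_case)         # ['Python', 'PYTHON', 'python']
```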

Pattern	Description
`\0`	Complete match contents
`\1`	Contents in capture group 1
`$1`	Contents in capture group 1
`${foo}`	Contents in capture group `foo`
`\x20`	Hexadecimal replacement values
`\x{06fa}`	Hexadecimal replacement values
`\t`	Tab
`\r`	Carriage return
`\n`	Newline
`\f`	Form-feed
`\U`	Uppercase Transformation
`\L`	Lowercase Transformation
`\E`	Terminate any Transformation
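Replacement syntax differs between engines. In Python's `re.sub`, numbered groups are written `\1` or `\g<1>`, named groups `\g<name>`, and the whole match `\g<0>`; the `$1`, `${foo}`, and `\U`/`\L`/`\E` forms are not recognized. The sketch below shows these, using a replacement function as a stand-in for the uppercase transformation.

```python
import re

# \1 and \2 refer to numbered capture groups.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")
print(swapped)  # world hello

# \g<name> refers to a named capture group.
full_name = re.sub(r"(?P<last>\w+), (?P<first>\w+)", r"\g<first> \g<last>", "Lovelace, Ada")
print(full_name)  # Ada Lovelace

# \g<0> inserts the complete match.
bracketed = re.sub(r"\d+", r"[\g<0>]", "a1b22")
print(bracketed)  # a[1]b[22]

# A function replacement can stand in for case transformations such as \U.
titled = re.sub(r"\b[a-z]", lambda m: m.group(0).upper(), "regular expressions")
print(titled)  # Regular Expressions
```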

Pattern	Description
`.`	Any single character
`\s`	Any whitespace character
`\S`	Any non-whitespace character
`\d`	Any digit, same as `[0-9]`
`\D`	Any non-digit, same as `[^0-9]`
`\w`	Any word character
`\W`	Any non-word character
`\X`	Any Unicode extended grapheme cluster, line breaks included
`\C`	Match one data unit
`\R`	Unicode newlines
`\v`	Vertical whitespace character
`\V`	Negation of \v - anything except newlines and vertical tabs
`\h`	Horizontal whitespace character
`\H`	Negation of \h
`\K`	Reset match
`\n`	Match nth subpattern (n is a digit, e.g. `\1`)
`\pX`	Unicode property X
`\p{...}`	Unicode property or script category
`\PX`	Negation of \pX
`\P{...}`	Negation of \p
`\Q...\E`	Quote; treat as literals
`\k<name>`	Match subpattern `name`
`\k'name'`	Match subpattern `name`
`\k{name}`	Match subpattern `name`
`\gn`	Match nth subpattern
`\g{n}`	Match nth subpattern
`\g<n>`	Recurse nth capture group
`\g'n'`	Recurse nth capture group
`\g{-n}`	Match nth relative previous subpattern
`\g<+n>`	Recurse nth relative upcoming subpattern
`\g'+n'`	Recurse nth relative upcoming subpattern
`\g'letter'`	Recurse named capture group `letter`
`\g{letter}`	Match previously-named capture group `letter`
`\g<letter>`	Recurse named capture group `letter`
`\xYY`	Hex character YY
`\x{YYYY}`	Hex character YYYY
`\ddd`	Octal character ddd
`\cY`	Control character Y
`[\b]`	Backspace character
`\`	Escapes the following character, making a metacharacter literal
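Many of the escapes above are PCRE-specific, but the core shorthand classes `\d`, `\w`, `\s` and their negations work in Python's `re` module. `\Q...\E` is not available there; `re.escape` plays a similar role by escaping every metacharacter in a literal string. A minimal sketch with made-up input:

```python
import re

text = "Tel: 555-0199\tRoom 12B"

digits = re.findall(r"\d+", text)
print(digits)  # ['555', '0199', '12']

words = re.findall(r"\w+", text)
print(words)   # ['Tel', '555', '0199', 'Room', '12B']

fields = re.split(r"\s+", text)
print(fields)  # ['Tel:', '555-0199', 'Room', '12B']

# Unescaped, "+" is a quantifier, so the pattern looks for "11=2" and fails.
unescaped = re.search("1+1=2", "so 1+1=2 holds")
print(unescaped)  # None

# re.escape makes "+" literal, much as \Q...\E would in PCRE.
escaped = re.search(re.escape("1+1=2"), "so 1+1=2 holds")
print(escaped.group(0))  # 1+1=2
```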

Pattern	Description
`[abc]`	A single character of: a, b or c
`[^abc]`	A character except: a, b or c
`[a-z]`	A character in the range: a-z
`[^a-z]`	A character not in the range: a-z
`[0-9]`	A digit in the range: 0-9
`[a-zA-Z]`	A character in the range: a-z or A-Z
`[a-zA-Z0-9]`	A character in the range: a-z, A-Z or 0-9
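The character sets in this table behave the same way in Python's `re` module. The sketch below applies three of them to made-up strings; adding `+` after a set matches a run of such characters rather than a single one.

```python
import re

# [abc]: each match is a single character that is a, b, or c.
single = re.findall(r"[abc]", "cabbage")
print(single)   # ['c', 'a', 'b', 'b', 'a']

# [^a-z]+: runs of characters outside the range a-z.
negated = re.findall(r"[^a-z]+", "abc-123 def")
print(negated)  # ['-123 ']

# [a-zA-Z0-9]+: runs of ASCII letters and digits; "_", "@", and "." split them.
alnum = re.findall(r"[a-zA-Z0-9]+", "user_42@example.com")
print(alnum)    # ['user', '42', 'example', 'com']
```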

This is a quick cheat sheet for getting started with regular expressions.
