Pattern | Type |
---|---|
(*ACCEPT) | Control verb |
(*FAIL) | Control verb |
(*MARK:NAME) | Control verb |
(*COMMIT) | Control verb |
(*PRUNE) | Control verb |
(*SKIP) | Control verb |
(*THEN) | Control verb |
(*UTF) | Pattern modifier |
(*UTF8) | Pattern modifier |
(*UTF16) | Pattern modifier |
(*UTF32) | Pattern modifier |
(*UCP) | Pattern modifier |
(*CR) | Line break modifier |
(*LF) | Line break modifier |
(*CRLF) | Line break modifier |
(*ANYCRLF) | Line break modifier |
(*ANY) | Line break modifier |
\R | Newline sequence (what it matches is set by the BSR modifiers) |
(*BSR_ANYCRLF) | Line break modifier |
(*BSR_UNICODE) | Line break modifier |
(*LIMIT_MATCH=x) | Regex engine modifier |
(*LIMIT_RECURSION=d) | Regex engine modifier |
(*NO_AUTO_POSSESS) | Regex engine modifier |
(*NO_START_OPT) | Regex engine modifier |
Getting Started
Import the regular expressions module
import re
Control verb
POSIX Character Classes
Character Class | Same as | Meaning |
---|---|---|
[[:alnum:]] | [0-9A-Za-z] | Letters and digits |
[[:alpha:]] | [A-Za-z] | Letters |
[[:ascii:]] | [\x00-\x7F] | ASCII codes 0-127 |
[[:blank:]] | [\t ] | Space or tab only |
[[:cntrl:]] | [\x00-\x1F\x7F] | Control characters |
[[:digit:]] | [0-9] | Decimal digits |
[[:graph:]] | [[:alnum:][:punct:]] | Visible characters (not space) |
[[:lower:]] | [a-z] | Lowercase letters |
[[:print:]] | [ -~] == [ [:graph:]] | Printable characters (visible characters and space) |
[[:punct:]] | [!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{\|}~] | Punctuation characters |
[[:space:]] | [\t\n\v\f\r ] | Whitespace |
[[:upper:]] | [A-Z] | Uppercase letters |
[[:word:]] | [0-9A-Za-z_] | Word characters |
[[:xdigit:]] | [0-9A-Fa-f] | Hexadecimal digits |
[[:<:]] | \b(?=\w) | Start of word |
[[:>:]] | \b(?<=\w) | End of word |
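Python's built-in `re` module does not accept POSIX bracket classes such as `[[:alpha:]]`, so the "Same as" column is what you would write there. The following is a minimal sketch under that assumption; `POSIX_EQUIV` and `posix_class` are hypothetical names introduced only for this illustration.

```python
import re

# Hypothetical lookup table: POSIX class name -> bracket body from the "Same as" column.
POSIX_EQUIV = {
    "alnum": "0-9A-Za-z",
    "alpha": "A-Za-z",
    "blank": r"\t ",
    "digit": "0-9",
    "space": r"\t\n\v\f\r ",
    "xdigit": "0-9A-Fa-f",
}


def posix_class(name: str) -> str:
    """Return a bracket expression that Python's re accepts for a POSIX class name."""
    return "[" + POSIX_EQUIV[name] + "]"


hex_ok = re.fullmatch(posix_class("xdigit") + "+", "1A2b3C") is not None
hex_bad = re.fullmatch(posix_class("xdigit") + "+", "1A2G") is not None
digit_runs = re.findall(posix_class("digit") + "+", "a1b22")
fields = re.split(posix_class("blank") + "+", "a \t b")

print(hex_ok, hex_bad)  # True False
print(digit_runs)       # ['1', '22']
print(fields)           # ['a', 'b']
```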
Recurse
Pattern | Description |
---|---|
(?R) | Recurse entire pattern |
(?1) | Recurse first subpattern |
(?+1) | Recurse first relative subpattern |
(?&name) | Recurse subpattern name |
(?P=name) | Match subpattern name |
(?P>name) | Recurse subpattern name |
Flags/Modifiers
Pattern | Description |
---|---|
g | Global |
m | Multiline |
i | Case insensitive |
x | Ignore whitespace |
s | Single line |
u | Unicode |
X | eXtended |
U | Ungreedy |
A | Anchor |
J | Duplicate group names |
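As a rough illustration of how a few of these flags look in Python's `re` module: `i`, `m`, `s` and `x` correspond to `re.IGNORECASE`, `re.MULTILINE`, `re.DOTALL` and `re.VERBOSE`. There is no `g` flag (use `findall` or `finditer`), and the PCRE-style `U`, `J` and `X` flags have no direct counterpart; note that `re.A` means ASCII-only matching, not "Anchor". The sample strings are made up for this sketch.

```python
import re

text = "Alpha\nbeta\nGAMMA"

# i + m: ignore case, and let ^ and $ match at every line.
lines = re.findall(r"^[a-z]+$", text, re.IGNORECASE | re.MULTILINE)

# s: without DOTALL "." stops at a newline; with it "." matches "\n" too.
no_dotall = re.search(r"Alpha.beta", text)
with_dotall = re.search(r"Alpha.beta", text, re.DOTALL)

# x: whitespace and "#" comments inside the pattern are ignored.
year_month = re.compile(
    r"""
    (\d{4})   # year
    -         # separator
    (\d{2})   # month
    """,
    re.VERBOSE,
)
parts = year_month.search("released 2024-05").groups()

# g: re has no global flag; findall/finditer return every match.
all_digits = re.findall(r"\d", "a1b2c3")

print(lines)       # ['Alpha', 'beta', 'GAMMA']
print(parts)       # ('2024', '05')
print(all_digits)  # ['1', '2', '3']
```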
Lookarounds
Pattern | Description |
---|---|
(?=...) | Positive Lookahead |
(?!...) | Negative Lookahead |
(?<=...) | Positive Lookbehind |
(?<!...) | Negative Lookbehind |

Lookaround lets you match a group before (lookbehind) or after (lookahead) your main pattern without including it in the result.
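The idea that the surrounding context is checked but not returned can be sketched in Python's `re` module. The sample text is invented for this example, and other engines may place different limits on lookbehind.

```python
import re

text = "price: $42, discount: $7, total: 35 USD"

# Positive lookbehind: digits preceded by "$"; the "$" is not part of the match.
dollar_amounts = re.findall(r"(?<=\$)\d+", text)

# Positive lookahead: digits followed by " USD"; " USD" is not part of the match.
usd_amounts = re.findall(r"\d+(?= USD)", text)

# Negative lookbehind: digit runs not preceded by "$" or by another digit.
plain_numbers = re.findall(r"(?<![$\d])\d+", text)

print(dollar_amounts)  # ['42', '7']
print(usd_amounts)     # ['35']
print(plain_numbers)   # ['35']
```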
Assertions
Pattern | Description |
---|---|
(?(1)yes\|no) | Conditional statement |
(?(R)yes\|no) | Conditional statement |
(?(R#)yes\|no) | Recursive Conditional statement |
(?(R&name)yes\|no) | Conditional statement |
(?(?=...)yes\|no) | Lookahead conditional |
(?(?<=...)yes\|no) | Lookbehind conditional |
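Of the forms above, Python's `re` module supports only the group-reference conditional, `(?(1)yes|no)` or `(?(name)yes|no)`. Here is a small sketch, assuming an address that may optionally be wrapped in angle brackets; the `address` pattern and sample inputs are illustrative.

```python
import re

# Group 1 optionally matches "<"; the conditional then requires ">" only if it did,
# and otherwise requires the end of the string.
address = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

samples = ["<user@host.com>", "user@host.com", "<user@host.com", "user@host.com>"]
results = {s: address.match(s) is not None for s in samples}

for sample, ok in results.items():
    print(sample, ok)
```

Only the first two samples match: a closing `>` is accepted exactly when the opening `<` was captured.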
Group Constructs
Pattern | Description |
---|---|
(...) | Capture everything enclosed |
(a\|b) | Match either a or b |
(?:...) | Match everything enclosed (non-capturing) |
(?>...) | Atomic group (non-capturing) |
(?\|...) | Duplicate subpattern group numbers (branch reset) |
(?#...) | Comment |
(?'name'...) | Named Capturing Group |
(?<name>...) | Named Capturing Group |
(?P<name>...) | Named Capturing Group |
(?imsxXU) | Inline modifiers |
(?(DEFINE)...) | Pre-define patterns before using them |
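A short sketch of the group constructs that Python's `re` module accepts: capturing groups, non-capturing groups, alternation, inline modifiers, and the `(?P<name>...)` spelling of named groups. `re` does not accept `(?<name>...)` or `(?'name'...)`. The sample strings are made up.

```python
import re

# Named groups, with a non-capturing group repeated inside the host part.
m = re.search(
    r"(?P<user>[\w.]+)@(?P<host>(?:\w+\.)+\w+)",
    "contact: jane.doe@mail.example.org",
)
user, host = m.group("user"), m.group("host")

# Alternation inside a non-capturing group: findall returns whole matches.
greys = re.findall(r"gr(?:a|e)y", "gray grey groy")

# Inline modifier: (?i) turns on case-insensitive matching for the pattern.
either_case = re.findall(r"(?i)regex", "Regex REGEX")

# Plain capturing groups are numbered from 1, left to right.
numbered = re.search(r"(\d+)-(\d+)", "pages 10-20").groups()

print(user, host)   # jane.doe mail.example.org
print(greys)        # ['gray', 'grey']
print(either_case)  # ['Regex', 'REGEX']
print(numbered)     # ('10', '20')
```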
Substitution
Pattern | Description |
---|---|
\0 | Complete match contents |
\1 | Contents in capture group 1 |
$1 | Contents in capture group 1 |
${foo} | Contents in capture group foo |
\x20 | Hexadecimal replacement values |
\x{06fa} | Hexadecimal replacement values |
\t | Tab |
\r | Carriage return |
\n | Newline |
\f | Form-feed |
\U | Uppercase Transformation |
\L | Lowercase Transformation |
\E | Terminate any Transformation |
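Replacement syntax varies between engines. As a hedged illustration in Python's `re` module: `\1` works as listed, but the whole match is written `\g<0>`, named groups are written `\g<foo>` rather than `${foo}`, and there is no `\U` or `\L`, so a callable replacement stands in for case transformation. The input strings are invented.

```python
import re

date = "2024-05-01"

# Numbered groups: \1, \2, \3 refer to capture groups 1 to 3.
dmy = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", date)

# Named groups: Python's re spells ${foo} as \g<foo>.
named = re.sub(
    r"(?P<year>\d{4})-(?P<month>\d{2})-\d{2}", r"\g<month>/\g<year>", date
)

# Whole match: Python's re uses \g<0>.
bracketed = re.sub(r"\d+", r"[\g<0>]", "a1b22")

# Case transformation: a function receives each match and returns the replacement.
shouted = re.sub(r"\b[a-z]+\b", lambda m: m.group(0).upper(), "make it loud")

print(dmy)        # 01/05/2024
print(named)      # 05/2024
print(bracketed)  # a[1]b[22]
print(shouted)    # MAKE IT LOUD
```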
Anchors
Pattern | Description |
---|---|
\G | Start of match |
^ | Start of string (or of a line in multiline mode) |
$ | End of string (or of a line in multiline mode) |
\A | Start of string |
\Z | End of string |
\z | Absolute end of string |
\b | A word boundary |
\B | Non-word boundary |
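A brief sketch of how these anchors behave in Python's `re` module. Two caveats, stated as assumptions about that engine rather than about regex in general: `re` has no `\G`, and its `\Z` matches only at the absolute end of the string (the behaviour listed for `\z` above).

```python
import re

text = "first line\nsecond line\n"

# ^ matches only at the start of the string unless MULTILINE is set.
first_word = re.findall(r"^\w+", text)
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# $ matches at the end of the string, or just before a trailing newline.
last_word = re.findall(r"\w+$", text)

# \Z in Python is the absolute end, so the trailing "\n" prevents a match.
strict_end = re.search(r"line\Z", text)

# \b is a word boundary; \B is any position that is not one.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")
inside_words = re.findall(r"\Bcat\B", "cat concatenate")

print(first_word)    # ['first']
print(line_starts)   # ['first', 'second']
print(last_word)     # ['line']
print(whole_words)   # ['cat', 'cat']
print(inside_words)  # ['cat']
```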
Meta Sequences
Pattern | Description |
---|---|
. | Any single character |
\s | Any whitespace character |
\S | Any non-whitespace character |
\d | Any digit, same as [0-9] |
\D | Any non-digit, same as [^0-9] |
\w | Any word character |
\W | Any non-word character |
\X | Any Unicode sequence (extended grapheme cluster), line breaks included |
\C | Match one data unit |
\R | Unicode newlines |
\v | Vertical whitespace character |
\V | Negation of \v: anything except newlines and vertical tabs |
\h | Horizontal whitespace character |
\H | Negation of \h |
\K | Reset match |
\n | Match nth subpattern (n is a digit, e.g. \1) |
\pX | Unicode property X |
\p{...} | Unicode property or script category |
\PX | Negation of \pX |
\P{...} | Negation of \p{...} |
\Q...\E | Quote; treat as literals |
\k<name> | Match subpattern name |
\k'name' | Match subpattern name |
\k{name} | Match subpattern name |
\gn | Match nth subpattern |
\g{n} | Match nth subpattern |
\g<n> | Recurse nth capture group |
\g'n' | Recurse nth capture group |
\g{-n} | Match nth relative previous subpattern |
\g<+n> | Recurse nth relative upcoming subpattern |
\g'+n' | Recurse nth relative upcoming subpattern |
\g'letter' | Recurse named capture group letter |
\g{letter} | Match previously-named capture group letter |
\g<letter> | Recurse named capture group letter |
\xYY | Hex character YY |
\x{YYYY} | Hex character YYYY |
\ddd | Octal character ddd |
\cY | Control character Y |
[\b] | Backspace character |
\ | Makes any character literal |
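The most common of these sequences can be tried in Python's `re` module, as sketched below with made-up input. Several entries in the table (for example `\K`, `\X` and `\p{...}`) are PCRE features that `re` does not accept, so they are left out here.

```python
import re

log = "2024-05-01\tERROR\tdisk  full"

# \s+ splits on any run of whitespace (tabs and spaces alike).
fields = re.split(r"\s+", log)

# \d+ finds runs of digits.
numbers = re.findall(r"\d+", log)

# \w+ matches letters, digits and underscore, but not "-" or ",".
words = re.findall(r"\w+", "snake_case, kebab-case")

# \1 is a backreference: it matches the same text group 1 captured.
repeated = re.search(r"\b(\w+) \1\b", "it was the the best").group(1)

print(fields)    # ['2024-05-01', 'ERROR', 'disk', 'full']
print(numbers)   # ['2024', '05', '01']
print(words)     # ['snake_case', 'kebab', 'case']
print(repeated)  # the
```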
Common Metacharacters
- ^
- {
- +
- <
- [
- *
- )
- >
- .
- (
- |
- $
- \
- ?
Escape these special characters with \
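To show why escaping matters, here is a small Python sketch. It escapes metacharacters by hand with `\` and also uses `re.escape`, which escapes a whole literal string; the sample strings are illustrative.

```python
import re

literal = "1+1=2 (maybe?)"

# Unescaped, "+", "(", ")" and "?" act as operators, so the text does not match itself.
unescaped = re.search(literal, literal)

# re.escape backslash-escapes the special characters so the string matches literally.
escaped = re.search(re.escape(literal), literal)

# Escaping by hand: "\$" and "\." match a literal dollar sign and dot.
prices = re.findall(r"\$\d+\.\d{2}", "cost $3.50, not 3x50")

print(unescaped)         # None
print(escaped.group(0))  # 1+1=2 (maybe?)
print(prices)            # ['$3.50']
```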
Quantifiers
Pattern | Description |
---|---|
a? | Zero or one of a |
a* | Zero or more of a |
a+ | One or more of a |
[0-9]+ | One or more of 0-9 |
a{3} | Exactly 3 of a |
a{3,} | 3 or more of a |
a{3,6} | Between 3 and 6 of a |
a* | Greedy quantifier |
a*? | Lazy quantifier |
a*+ | Possessive quantifier |
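The difference between greedy and lazy quantifiers is easiest to see side by side. This is a minimal Python sketch with invented input; possessive quantifiers are omitted because Python's `re` only accepts them from version 3.11 onward.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ takes as much as it can, so one match spans the whole string.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? takes as little as it can, so each tag is matched separately.
lazy = re.findall(r"<.+?>", html)

# Bounded repetition: standalone runs of 3 to 6 digits.
bounded = re.findall(r"\b\d{3,6}\b", "12 123 1234567 123456")

print(greedy)   # ['<b>bold</b> and <i>italic</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
print(bounded)  # ['123', '123456']
```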
Character Classes
Pattern | Description |
---|---|
[abc] | A single character of: a, b or c |
[^abc] | A character except: a, b or c |
[a-z] | A character in the range: a-z |
[^a-z] | A character not in the range: a-z |
[0-9] | A digit in the range: 0-9 |
[a-zA-Z] | A character in the range: a-z or A-Z |
[a-zA-Z0-9] | A character in the range: a-z, A-Z or 0-9 |
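A quick way to check these classes is to run them through `re.findall` in Python, as in this sketch with made-up strings.

```python
import re

# [abc]: any single one of the listed characters.
listed = re.findall(r"[abc]", "cabbage")

# [^a-z]: any single character outside the range a-z.
negated = re.findall(r"[^a-z]", "a1-b2")

# [a-zA-Z0-9]+: runs of ASCII letters and digits.
tokens = re.findall(r"[a-zA-Z0-9]+", "Room B12, floor 3")

print(listed)   # ['c', 'a', 'b', 'b', 'a']
print(negated)  # ['1', '-', '2']
print(tokens)   # ['Room', 'B12', 'floor', '3']
```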
Introduction
This is a quick cheat sheet for getting started with regular expressions.
- Regex in Python (quickref.me)
- Regex in JavaScript (quickref.me)
- Regex in PHP (quickref.me)
- Regex in Java (quickref.me)
- Regex in MySQL (quickref.me)
- Regex in Vim (quickref.me)
- Regex in Emacs (quickref.me)
- Online regex tester (regex101.com)