Python 2.7 Regular Expressions

Non-special chars match themselves. Exceptions are special characters:

\       Escape special char or start a sequence.
.       Match any char except newline, see re.DOTALL
^       Match start of the string, see re.MULTILINE
$       Match end of the string, see re.MULTILINE
[]      Enclose a set of matchable chars
R|S     Match either regex R or regex S.
()      Create capture group, & indicate precedence

Extensions. Do not cause grouping, except 'P<name>':

(?iLmsux)      Match empty string; each letter sets the matching
               flag (re.I, re.L, re.M, re.S, re.U, re.X)
(?:...)        Non-capturing version of regular parens
(?P<name>...)  Create a named capturing group.
(?P=name)      Match whatever matched prev named group
(?#...)        A comment; ignored.
(?=...)        Lookahead assertion, match without consuming
(?!...)        Negative lookahead assertion
(?<=...)       Lookbehind assertion, match if preceded
(?<!...)       Negative lookbehind assertion
(?(id)y|n)     Match 'y' if group 'id' matched, else 'n'

Flags for re.compile(), etc. Combine with '|':

re.I == re.IGNORECASE   Ignore case

re.L == re.LOCALE       Make \w, \b, and \s locale dependent
re.M == re.MULTILINE    Multiline ('^' and '$' also match at line breaks)
re.S == re.DOTALL       Dot matches all (including newline)
re.U == re.UNICODE      Make \w, \b, \d, and \s Unicode dependent
re.X == re.VERBOSE      Verbose (unescaped whitespace in pattern
                        is ignored, and '#' marks comment lines)

After '[', enclose a set, the only special chars are:

]       End the set, if not the 1st char
-       A range, e.g. a-c matches a, b or c
^       Negate the set only if it is the 1st char

Module level functions:

compile(pattern[, flags]) -> RegexObject
match(pattern, string[, flags]) -> MatchObject
search(pattern, string[, flags]) -> MatchObject
findall(pattern, string[, flags]) -> list of strings
finditer(pattern, string[, flags]) -> iter of MatchObjects
split(pattern, string[, maxsplit, flags]) -> list of strings
sub(pattern, repl, string[, count, flags]) -> string
subn(pattern, repl, string[, count, flags]) -> (string, int)
escape(string) -> string
purge() # clear the re cache

Quantifiers (append '?' for non-greedy):

{m}     Exactly m repetitions
{m,n}   From m (default 0) to n (default infinity)
*       0 or more. Same as {,}
+       1 or more. Same as {1,}
?       0 or 1. Same as {,1}
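
A minimal sketch of how the quantifiers above behave with and without the trailing '?'. The sample strings and variable names are illustrative assumptions, not part of the original sheet.

```python
import re

text = '<b>bold</b> and <i>italic</i>'

# Greedy: '.*' takes as much as it can, so one match spans every tag.
greedy = re.findall(r'<.*>', text)
# -> ['<b>bold</b> and <i>italic</i>']

# Non-greedy: '.*?' stops at the first '>' that lets the match succeed.
lazy = re.findall(r'<.*?>', text)
# -> ['<b>', '</b>', '<i>', '</i>']

# {m,n} bounds the repeat count; appending '?' prefers the fewest repeats.
digits_greedy = re.search(r'\d{2,4}', '123456').group()   # -> '1234'
digits_lazy = re.search(r'\d{2,4}?', '123456').group()    # -> '12'
```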

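The module-level functions listed above can be tried out as follows. This is only a sketch: the sample text and variable names are made up for illustration, and flags are passed by keyword, which Python 2.7 accepts.

```python
import re

log = 'alpha=1, beta=22; gamma=333'

# findall() returns tuples instead of strings when the pattern has 2+ groups.
pairs = re.findall(r'(\w+)=(\d+)', log)
# -> [('alpha', '1'), ('beta', '22'), ('gamma', '333')]

# split() cuts at every separator match; maxsplit limits the number of cuts.
parts = re.split(r'[,;]\s*', log)
first_cut = re.split(r'[,;]\s*', log, maxsplit=1)
# -> ['alpha=1', 'beta=22; gamma=333']

# subn() is sub() plus the number of replacements made.
masked, n = re.subn(r'\d+', '#', log)
# -> ('alpha=#, beta=#; gamma=#', 3)

# escape() backslash-escapes metacharacters so the text matches literally.
literal = re.search(re.escape('1+1'), 'is 1+1 two?').group()

# Flags combine with '|': ignore case, and let '^' match at each line start.
heads = re.findall(r'^b\w*', 'Beta\nbravo\ncharlie', flags=re.I | re.M)
# -> ['Beta', 'bravo']
```
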
RegexObjects (returned from compile()):

.match(string[, pos, endpos]) -> MatchObject
.search(string[, pos, endpos]) -> MatchObject
.findall(string[, pos, endpos]) -> list of strings
.finditer(string[, pos, endpos]) -> iter of MatchObjects
.split(string[, maxsplit]) -> list of strings
.sub(repl, string[, count]) -> string
.subn(repl, string[, count]) -> (string, int)
.flags       # int, Passed to compile()
.groups      # int, Number of capturing groups
.groupindex  # {}, Maps group names to ints
.pattern     # string, Passed to compile()

Special sequences:

\A      Start of string
\b      Match empty string at word (\w+) boundary
\B      Match empty string not at word boundary
\d      Digit
\D      Non-digit
\s      Whitespace [ \t\n\r\f\v], see LOCALE,UNICODE
\S      Non-whitespace
\w      Alphanumeric: [0-9a-zA-Z_], see LOCALE
\W      Non-alphanumeric
\Z      End of string
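
The special sequences above are easiest to tell apart on a small input. The following is a sketch only; the sample strings and variable names are assumptions chosen for illustration.

```python
import re

text = 'cat concat cat_2 42cats'

# \b anchors at word boundaries, so only the stand-alone word matches
# ('_' counts as a word char, so 'cat_2' does not qualify).
whole_words = re.findall(r'\bcat\b', text)
# -> ['cat']

# \B demands that we are NOT at a boundary: 'cat' preceded by a word char.
inside = [m.start() for m in re.finditer(r'\Bcat', text)]
# -> [7, 19]

# \d and \s are character classes; \D and \S are their negations.
numbers = re.findall(r'\d+', text)   # -> ['2', '42']
fields = re.split(r'\s+', text)      # -> ['cat', 'concat', 'cat_2', '42cats']

# '^' is affected by re.MULTILINE, \A is not.
lines = 'one\ntwo'
caret = re.findall(r'^\w+', lines, flags=re.M)        # -> ['one', 'two']
start_only = re.findall(r'\A\w+', lines, flags=re.M)  # -> ['one']
```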

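A compiled RegexObject exposes the methods and attributes listed above. The sketch below assumes a made-up date pattern (the name date_re and the sample text are illustrative) and uses re.VERBOSE so the pattern can carry comments.

```python
import re

# Compile once, reuse many times. With re.VERBOSE, unescaped whitespace
# and '#' comments inside the pattern are ignored.
date_re = re.compile(r'''
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
''', re.VERBOSE)

text = 'from 2019-07 to 2021-12'

n_groups = date_re.groups            # -> 2
names = dict(date_re.groupindex)     # -> {'year': 1, 'month': 2}
is_verbose = bool(date_re.flags & re.VERBOSE)

# pos/endpos narrow the search window without slicing the string,
# so offsets still refer to the original text.
later = date_re.search(text, 10)     # skips the first date
later_start = later.start()          # -> 16
clipped = date_re.search(text, 0, 10)  # window 'from 2019-' -> None

all_dates = date_re.findall(text)
# -> [('2019', '07'), ('2021', '12')]

swapped = date_re.sub(r'\g<month>/\g<year>', text)
# -> 'from 07/2019 to 12/2021'
```
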
MatchObjects (returned from match() and search()):

.expand(template) -> string, Backslash & group expansion
.group([group1...]) -> string or tuple of strings, 1 per arg
.groups([default]) -> tuple of all groups, non-matching=default
.groupdict([default]) -> {}, Named groups, non-matching=default
.start([group]) -> int, Start/end of substring match by group
.end([group]) -> int, Group defaults to 0, the whole match
.span([group]) -> tuple (match.start(group), match.end(group))
.pos        int, Passed to search() or match()
.endpos     int, Passed to search() or match()
.lastindex  int, Index of last matched capturing group
.lastgroup  string, Name of last matched capturing group
.re         RegexObject whose match() or search() produced this match
.string     string, Passed to search() or match()

\g<id>  In sub() and expand() templates, insert prev named or
        numbered group; '<' & '>' are literal, e.g. \g<0>
        or \g<name> (not \g0 or \gname)

Special character escapes are much like those already escaped in Python
string literals. Hence regex '\n' is the same as regex '\\n':

\a      ASCII Bell (BEL)
\f      ASCII Formfeed
\n      ASCII Linefeed
\r      ASCII Carriage return
\t      ASCII Tab
\v      ASCII Vertical tab
\\      A single backslash
\xHH    Character with two digit hexadecimal code HH
\OOO    Three digit octal char (or just use an
        initial zero, e.g. \0, \07)
\DD     Decimal number 1 to 99, match
        previous numbered group

Gleaned from the Python 2.7 're' docs.
Version: v0.3.3
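
The escape and back-reference rules above can be checked with a short sketch. All sample strings and variable names here are assumptions for illustration only.

```python
import re

# In a normal literal Python turns '\n' into a newline before re sees it;
# in a raw string re receives backslash + 'n' and decodes it itself.
# Either way the pattern matches a newline.
plain = re.search('a\nb', 'a\nb') is not None
raw = re.search(r'a\nb', 'a\nb') is not None

# \xHH and three-digit octal name a character by code: 0x41 == 0o101 == 'A'.
hex_hit = re.findall(r'\x41', 'ABA')   # -> ['A', 'A']
oct_hit = re.findall(r'\101', 'ABA')   # -> ['A', 'A']

# \1 refers back to numbered group 1: find doubled words.
doubled = re.findall(r'\b(\w+) \1\b', 'it is is what it it is')
# -> ['is', 'it']

# (?P=name) does the same for a named group.
marked = re.search(r'(?P<q>[*_])(.+?)(?P=q)', 'some *bold* text').group(2)
# -> 'bold'

# In replacement templates, \g<1>0 is group 1 followed by a literal '0';
# \10 would be read as group 10 instead.
padded = re.sub(r'(\d)', r'\g<1>0', 'a1b2')
# -> 'a10b20'
```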

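The MatchObject methods and attributes listed in this sheet, sketched on one match. The pattern, the sample string and the variable names are illustrative assumptions.

```python
import re

m = re.search(r'(?P<user>\w+)@(?P<host>\w+)(\.org)?', 'mail bob@example today')

whole = m.group()                     # group 0, the whole match: 'bob@example'
user_host = m.group('user', 'host')   # several args -> ('bob', 'example')
all_groups = m.groups('')             # unmatched group 3 -> default ''
named = m.groupdict()                 # {'user': 'bob', 'host': 'example'}

bounds = (m.start(), m.end())         # -> (5, 16), offsets into m.string
host_span = m.span('host')            # -> (9, 16)

last = (m.lastindex, m.lastgroup)     # -> (2, 'host'); group 3 did not match

# expand() fills a template the same way sub() fills its repl argument.
swapped = m.expand(r'\g<host>:\g<user>')   # -> 'example:bob'
```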