Java Regex Cheat Sheet

Regex cheat sheet

Character classes
[abc]           matches a, b, or c.
[^abc]          negation, matches any character except a, b, or c.
[a-c]           range, matches a, b, or c.
[a-c[f-h]]      union, matches a, b, c, f, g, or h.
[a-c&&[b-c]]    intersection, matches b or c.
[a-c&&[^b-c]]   subtraction, matches a.

Predefined character classes
.     Any character (may or may not match line terminators).
\d    A digit: [0-9]
\D    A non-digit: [^0-9]
\s    A whitespace character: [ \t\n\x0B\f\r]
\S    A non-whitespace character: [^\s]
\w    A word character: [a-zA-Z_0-9]
\W    A non-word character: [^\w]

Boundary matches
^     The beginning of a line.
$     The end of a line.
\b    A word boundary.
\B    A non-word boundary.
\A    The beginning of the input.
\G    The end of the previous match.
\Z    The end of the input but for the final terminator, if any.
\z    The end of the input.

Pattern flags
Pattern.CASE_INSENSITIVE - enables case-insensitive matching.
Pattern.COMMENTS - whitespace and comments starting with # are ignored until the end of a line.
Pattern.MULTILINE - ^ and $ match at the beginning and end of every line, not only of the whole input.
Pattern.UNIX_LINES - only the '\n' line terminator is recognized in the behavior of ., ^, and $.

Quantifiers
Greedy    Reluctant   Possessive   Description
X?        X??         X?+          X, once or not at all.
X*        X*?         X*+          X, zero or more times.
X+        X+?         X++          X, one or more times.
X{n}      X{n}?       X{n}+        X, exactly n times.
X{n,}     X{n,}?      X{n,}+       X, at least n times.
X{n,m}    X{n,m}?     X{n,m}+      X, at least n but not more than m times.

Greedy - matches as much as possible, backing off only when the rest of the pattern fails.
Reluctant - matches as little as possible, extending only when the rest of the pattern fails.
Possessive - matches as much as possible and never backs off (no backtracking).

Groups & backreferences
A group is a captured subsequence of characters which may be used later in the expression with a backreference.
(...) - defines a group.
\N - refers to the text matched by group number N.
(\d\d) - a group of two digits.
(\d\d)/\1 - two digits, a slash, then the same two digits again (for example 12/12).
\1 - refers to the text matched by the first group.

Logical operations
XY      X then Y.
X|Y     X or Y.

Useful Java classes & methods

PATTERN
A pattern is a compiled representation of a regular expression.

Pattern compile(String regex)
Compiles the given regular expression into a pattern.

Pattern compile(String regex, int flags)
Compiles the given regular expression into a pattern with the given flags.

boolean matches(String regex)
Tells whether or not this string matches the given regular expression (a method of String, not of Pattern).

String[] split(CharSequence input)
Splits the given input sequence around matches of this pattern.

String quote(String s)
Returns a literal pattern String for the specified String.

Predicate<String> asPredicate()
Creates a predicate which can be used to match a string.

MATCHER
An engine that performs match operations on a character sequence by interpreting a Pattern.

boolean matches()
Attempts to match the entire region against the pattern.

boolean find()
Attempts to find the next subsequence of the input sequence that matches the pattern.

int start()
Returns the start index of the previous match.

int end()
Returns the offset after the last character matched.
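
The entries above can be tried out with a short Java program. What follows is only a minimal sketch: the class name RegexCheatSheetDemo and the helpers findAll and firstSpan are hypothetical names made up for illustration, and the sample inputs are arbitrary. It uses only Pattern, Matcher, and String.matches as listed in this sheet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative and not part of the cheat sheet.
class RegexCheatSheetDemo {

    // Collects every match of regex in input by looping over Matcher.find().
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    // Returns {start, end} of the first match, or null if there is none.
    static int[] firstSpan(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? new int[] {m.start(), m.end()} : null;
    }

    public static void main(String[] args) {
        String tags = "<a><b>";

        // Greedy: .+ grabs as much as it can, then backs off to the last '>'.
        System.out.println(findAll("<.+>", tags));   // [<a><b>]
        // Reluctant: .+? stops at the first '>' it can.
        System.out.println(findAll("<.+?>", tags));  // [<a>, <b>]
        // Possessive: .++ swallows the final '>' and never gives it back.
        System.out.println(findAll("<.++>", tags));  // []

        // Backreference: \1 must repeat the text captured by group 1.
        System.out.println("12/12".matches("(\\d\\d)/\\1")); // true
        System.out.println("12/34".matches("(\\d\\d)/\\1")); // false

        // Character class subtraction: a-c minus b-c leaves only 'a'.
        System.out.println(findAll("[a-c&&[^b-c]]", "abcabc")); // [a, a]

        // MULTILINE lets ^ and $ match at line breaks inside the input.
        String lines = "a\nb\nc";
        System.out.println(Pattern.compile("^b$").matcher(lines).find());                    // false
        System.out.println(Pattern.compile("^b$", Pattern.MULTILINE).matcher(lines).find()); // true

        // start() and end() of the first run of digits.
        int[] span = firstSpan("\\d+", "ab123cd");
        System.out.println(span[0] + ".." + span[1]); // 2..5

        // split() around commas with optional surrounding whitespace.
        String[] parts = Pattern.compile("\\s*,\\s*").split("a , b,c");
        System.out.println(String.join("|", parts)); // a|b|c

        // quote() makes '+' literal instead of a quantifier.
        System.out.println("1+1".matches(Pattern.quote("1+1"))); // true
        System.out.println("1+1".matches("1+1"));                // false
    }
}
```

Note that backslashes are doubled in Java string literals, so the regex \d is written "\\d" in source code.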
