Java Regex Cheat Sheet

Juneau Lim

This post is more of a note to myself. I didn't plan to write one, but I couldn't find a comprehensive cheat sheet that I liked. So it is not well polished, and I am not going to cover regular expressions in general, since there are plenty of great resources for that.

Compared to JavaScript, using regex in Java is a bit annoying. That is the reason I ended up writing this post.

Syntax

In a pattern written as a Java string literal, every backslash must be doubled: \\ instead of a single \.
One good thing, though: I don't know about other editors, but at least in NetBeans, when you paste code from the clipboard, the extra \ is added automatically.
In addition, each input string needs its own Matcher (or a call to reset(input) on an existing one), unless you use one of the one-shot matches() methods.
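
To make the two points above concrete, here is a minimal sketch. The class name EscapeDemo and the helper firstNumber are made up for illustration; only Pattern, Matcher, and String.matches come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {
    // The regex \d+ is written "\\d+" in Java source, because the
    // compiler consumes one level of backslashes in string literals.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns the first run of digits, or null if none.
    static String firstNumber(String input) {
        Matcher m = DIGITS.matcher(input); // a fresh Matcher per input string
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("room 42"));   // 42
        System.out.println(firstNumber("no digits")); // null
        // One-shot check, no explicit Matcher needed:
        System.out.println("2024".matches("\\d+"));   // true
    }
}
```

The compiled Pattern is reused across calls, while a new Matcher is created for each input.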

Methods

Pattern

static Pattern compile(String regex[, int flags])    // flags optional, combine fields with |
static boolean matches(String regex, CharSequence input)    // one-shot, whole-input match
String[] split(CharSequence input[, int limit])    // limit optional; the regex is the compiled pattern
static String quote(String s)    // returns a literal pattern String for the specified String
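
A short sketch of these Pattern methods in use. The class and helper names (PatternMethodsDemo, containsHello, splitOnDot) are assumptions made for this example, not part of any API.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternMethodsDemo {
    // Flags are int constants, combined with the bitwise OR operator.
    static boolean containsHello(String input) {
        Pattern p = Pattern.compile("hello", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        return p.matcher(input).find();
    }

    // Pattern.quote turns "." into a literal dot instead of "any character".
    static String[] splitOnDot(String input) {
        return Pattern.compile(Pattern.quote(".")).split(input);
    }

    public static void main(String[] args) {
        System.out.println(containsHello("Say HELLO"));              // true
        System.out.println(Arrays.toString(splitOnDot("a.b.c")));    // [a, b, c]
        System.out.println(Pattern.matches("\\d+", "123"));          // true
        // With a limit of 2, the rest of the input stays in the last element.
        System.out.println(Arrays.toString(Pattern.compile(",").split("a,b,c", 2))); // [a, b,c]
    }
}
```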

Matcher

// named groups are written (?<name>pattern) in the regex
int start([int group | String name])    // argument optional
int end([int group | String name])    // argument optional
boolean find([int start])    // start optional; find(start) resets the matcher and searches from that index
String group([int group | String name])    // argument optional
Matcher reset([CharSequence input])    // input optional
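
Here is a small sketch of how these Matcher methods fit together, assuming Java 8 or later (start(String name) is not available before that). The class name, the KEY_VALUE pattern, and describeFirst are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherMethodsDemo {
    // Two named groups: "key" and "value".
    static final Pattern KEY_VALUE = Pattern.compile("(?<key>\\w+)=(?<value>\\w+)");

    // Hypothetical helper: describes the first key=value pair in the input.
    static String describeFirst(String input) {
        Matcher m = KEY_VALUE.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return m.group("key") + " -> " + m.group("value")
                + " at [" + m.start() + ", " + m.end() + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeFirst("a=1 b=2")); // a -> 1 at [0, 3)

        Matcher m = KEY_VALUE.matcher("a=1 b=2");
        System.out.println(m.find(1));        // true, search starts at index 1
        System.out.println(m.group());        // b=2
        System.out.println(m.start("key"));   // 4

        m.reset();                            // back to the start of the input
        m.find();
        System.out.println(m.group(1));       // a
    }
}
```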

String

boolean matches(String regex)
String replaceAll(String regex, String replacement)
String[] split(String regex[, int limit])    // limit optional
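
The String shortcuts compile the regex behind the scenes on each call. A minimal sketch, with made-up helper names (collapseSpaces, swapNames):

```java
import java.util.Arrays;

class StringRegexDemo {
    // Collapses every run of whitespace into a single space.
    static String collapseSpaces(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // $1 and $2 in the replacement refer to the captured groups.
    static String swapNames(String s) {
        return s.replaceAll("(\\w+) (\\w+)", "$2, $1");
    }

    public static void main(String[] args) {
        System.out.println(collapseSpaces("  a   b  c "));              // a b c
        System.out.println(swapNames("Ada Lovelace"));                  // Lovelace, Ada
        System.out.println(Arrays.toString("a1b22c".split("\\d+")));    // [a, b, c]
        System.out.println("abc".matches("b"));                         // false, whole string must match
    }
}
```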

There are more methods.

Usages

Check whether a match exists

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
System.out.println(m.find());    // true

Check whether the whole string matches

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
System.out.println(m.matches());    // false: matches() is true only when
                                    // the pattern matches the entire string,
                                    // with nothing left over.

System.out.println("seashore".matches(reg));    // true

or,

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
System.out.println(Pattern.matches(reg, str));    // false

List of all matches

import java.util.ArrayList;
import java.util.List;

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
List<String> matches = new ArrayList<>();
while (m.find()) {
    matches.add(m.group());
}
System.out.println(matches);    // [sells, seashells, seashore]

Count the number of matches

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
int counter = 0;
while (m.find()) {
    counter++;
}
System.out.println(counter);    // 3

I found that shorter "code golf" tricks don't work well for every case, so I use the loop.
The same goes for collecting the list of matches.
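
For what it's worth, one shorter alternative to the loops is Matcher.results(), which returns a stream of matches. This is only a sketch and assumes Java 9 or later; the class name ResultsDemo and its two helpers are invented for the example.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ResultsDemo {
    static final Pattern WORD_WITH_SE = Pattern.compile("\\w*se\\w*");

    // Counts matches without an explicit while loop (Java 9+).
    static long countMatches(String input) {
        return WORD_WITH_SE.matcher(input).results().count();
    }

    // Collects the matched text of every match into a list (Java 9+).
    static List<String> allMatches(String input) {
        return WORD_WITH_SE.matcher(input)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String str = "She sells seashells by the seashore";
        System.out.println(countMatches(str)); // 3
        System.out.println(allMatches(str));   // [sells, seashells, seashore]
    }
}
```

On Java 8 this does not compile, so the while loop remains the portable option.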

Flags

All flags are fields of the Pattern class.
In Java, you can think of every regex as global: there is no g flag, and each call to find() continues from where the previous match ended.

| Java flag        | JavaScript equivalent | Description                                                |
|------------------|-----------------------|------------------------------------------------------------|
| CANON_EQ         | -                     | Enables canonical equivalence.                             |
| CASE_INSENSITIVE | i                     | Enables case-insensitive matching.                         |
| COMMENTS         | -                     | White space and comments are ignored in the pattern.       |
| DOTALL           | s                     | Dot matches line terminators as well.                      |
| LITERAL          | -                     | Enables literal parsing of the pattern.                    |
| MULTILINE        | m                     | Enables multiline mode.                                    |
| UNICODE_CASE     | u                     | Enables Unicode-aware case folding.                        |
| UNIX_LINES       | -                     | Only '\n' is recognized as a line terminator by ., ^, and $. |

There may be JavaScript equivalents I've missed.
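
A rough sketch of a few of these flags in action. The class name FlagsDemo and its helpers are assumptions for illustration; the (?i) inline form is the embedded equivalent of CASE_INSENSITIVE.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsDemo {
    // With DOTALL, "." also matches the newline between a and b.
    static boolean dotAllMatches(String input) {
        return Pattern.compile("a.b", Pattern.DOTALL).matcher(input).matches();
    }

    // With MULTILINE, "^" matches at the start of every line, not just the input.
    static int countCommentLines(String input) {
        Matcher m = Pattern.compile("^#", Pattern.MULTILINE).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(dotAllMatches("a\nb"));                       // true
        System.out.println(Pattern.matches("a.b", "a\nb"));             // false, no DOTALL
        System.out.println(countCommentLines("#a\nb\n#c"));             // 2
        System.out.println("HELLO".matches("(?i)hello"));               // true, inline flag
        // COMMENTS lets you annotate the pattern itself.
        Pattern digits = Pattern.compile("\\d+  # one or more digits", Pattern.COMMENTS);
        System.out.println(digits.matcher("123").matches());            // true
    }
}
```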
