Boost your Productiveness with RegEx (a little)

Jan Wedel ・4 min read

I love RegEx. I use it every day, and I will show you how to use it to get some smaller and larger tasks done easily.

But...

Don’t use it in production

Ok, first things first: Be very careful using RegEx for anything in production code if you're not absolutely certain it's actually necessary.

This is an example of what could happen. In 95% of cases, it's much safer and easier to understand if you use simple loops to go over the data, with something like String.contains() or String.split(delimiter) to search and break up strings in a simple and readable way.
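As a rough illustration of that loop-based alternative, here is a minimal Python sketch. The log lines and their key=value layout are made up for this example; the `in` operator and `str.split()` stand in for String.contains() and String.split(delimiter).

```python
# Hypothetical log lines; the format is an assumption for illustration only.
log_lines = [
    "level=INFO user=bob action=login",
    "level=ERROR user=eric action=upload",
    "level=INFO user=jimi action=logout",
]

error_users = []
for line in log_lines:
    # Plain substring check instead of a RegEx search.
    if "level=ERROR" not in line:
        continue
    # Break the line into key=value pairs with simple splits.
    fields = dict(pair.split("=", 1) for pair in line.split(" "))
    error_users.append(fields["user"])

print(error_users)  # ['eric']
```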

[EDIT] To be very clear: I mean what I said above. Don’t use anything I show you here in production. I personally only use that on log files, test data and manual data creation.

Tools

There is actually no special tool I use. Every more or less sophisticated text editor or IDE supports RegEx in search and replace. I do most of my work in Sublime Text, sometimes in IntelliJ.

Useful RegEx

This is how I most often use RegEx in my day-to-day life.

Replace start and end of line

Suppose you have the following text:

Flour
Eggs
Milk
Salt
Maple sirup

And you want to make a bulleted list. You could obviously enter a * in front of every line manually. But you can, of course, use RegEx.

Search    Replace by
^         * (an asterisk followed by a space)

This will result in:

* Flour
* Eggs
* Milk
* Salt
* Maple sirup

The ^ is a special character that matches the beginning of a line. Replacing this with one or more characters will prefix each line.
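Outside an editor, the same replacement can be sketched in Python. This is only an illustration, not part of the editor workflow; note that Python's `re` module needs the `re.MULTILINE` flag for ^ to match at every line start.

```python
import re

text = "Flour\nEggs\nMilk\nSalt\nMaple sirup"

# With re.MULTILINE, ^ matches at the start of every line,
# not only at the start of the whole string.
bulleted = re.sub(r"^", "* ", text, flags=re.MULTILINE)

print(bulleted)
```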

The same goes for the end of a line. Let's say you need to add a comma at the end of each line.

"Foo"
"Bar"
"Baz"

Search    Replace by
$         ,

Result:

"Foo",
"Bar",
"Baz",

The last comma might be unnecessary and thus must be removed manually. There is a more sophisticated search to fix this, but most of the time it's not worth the effort. It's always good to let RegEx do the heavy lifting and fix the remaining 2% manually.
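For reference, here is a small Python sketch of the end-of-line replacement. The second pattern is just one possible "more sophisticated" variant and is an assumption, not necessarily the search the article has in mind: it only matches line ends that are followed by a newline, so the last line is left alone.

```python
import re

text = '"Foo"\n"Bar"\n"Baz"'  # no trailing newline

# With re.MULTILINE, $ matches at the end of every line.
with_commas = re.sub(r"$", ",", text, flags=re.MULTILINE)

# Assumed variant: the lookahead (?=\n) requires a newline after the
# line end, so the final line gets no comma.
without_last = re.sub(r"$(?=\n)", ",", text, flags=re.MULTILINE)

print(with_commas)
print(without_last)
```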

Swapping Columns

Assume we have the following data:

"foo":8,
"bar":42,
"baz":13,

Search              Replace by
"(\w+)":(\d+),      "$2":"$1",

Result:

"8":"foo",
"42":"bar",
"13":"baz",

What's happening here? We are using groups. A group is delimited by parentheses, so in general a pattern looks like (group1)(group2)(group3). The nice thing about groups is that you can refer to them later in the replacement. In Sublime, $n is used, where n is the group index starting at 1. Notice that we did not include the , and " inside the groups. Inside the groups, I am using \d, which matches a single digit, and \w, which matches a word character such as a-z, A-Z, 0-9 and _, but not -, for example. The + matches one or more characters of that kind.
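The group swap can also be tried in Python as a quick sketch. One difference to keep in mind: Python's `re.sub` refers to groups in the replacement as \1 and \2 rather than $1 and $2.

```python
import re

text = '"foo":8,\n"bar":42,\n"baz":13,'

# Group 1 captures the key (word characters), group 2 the number (digits).
# The replacement writes them back in swapped order.
swapped = re.sub(r'"(\w+)":(\d+),', r'"\2":"\1",', text)

print(swapped)
```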

Convert CSV to JSON

Let's assume we have the following CSV:

1,35,"Bob"
2,42,"Eric"
3,27,"Jimi"

Search                   Replace by
(\d+),(\d+),"(\w+)"      {"id":$1,"age":$2,"name":"$3"},

Result:

{"id":1,"age":35,"name":"Bob"},
{"id":2,"age":42,"name":"Eric"},
{"id":3,"age":27,"name":"Jimi"},

Again, we're using groups and digit or word matchers.

The transformed result could easily be turned into valid JSON by adding a wrapper object and arrays as well as removing the last comma. But the heavy lifting is done by RegEx.
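To show the whole round trip, here is a minimal Python sketch of the conversion, including the manual cleanup step. As a simplification, it wraps the lines in a plain array only (no wrapper object) and uses `json.loads` to check that the result really is valid JSON.

```python
import json
import re

csv_text = '1,35,"Bob"\n2,42,"Eric"\n3,27,"Jimi"'

# Same pattern as above; Python uses \1, \2, \3 instead of $1, $2, $3.
objects = re.sub(
    r'(\d+),(\d+),"(\w+)"',
    r'{"id":\1,"age":\2,"name":"\3"},',
    csv_text,
)

# The "manual" part: drop the last comma and wrap everything in an array.
json_text = "[" + objects.rstrip(",") + "]"

# Parsing succeeds only if the text is valid JSON.
people = json.loads(json_text)
print(people[0]["name"])  # Bob
```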

Create Test Data

Sometimes I need test data, and a lot of it.

What I usually do is create a sequence of numbers using... Excel. Yep, Excel. Excel is pretty smart when it comes to sequences. E.g. you can enter something like:

#
10
20

Then select both and drag the bottom right corner to fill the cells below. Excel is able to determine that the next number is 30. Based on that, copy the rows into Sublime:

10
20
30
40

Then I apply the same strategy as before:

Search    Replace by
(\d+)     {"id":$1,"username":"user$1"},

Result:

{"id":10,"username":"user10"},
{"id":20,"username":"user20"},
{"id":30,"username":"user30"},
{"id":40,"username":"user40"},
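If no spreadsheet is at hand, the same steps can be sketched in Python. This is an alternative to the Excel step rather than what the article does: `range()` produces the sequence, and the replacement is the same as above with \1 in place of $1.

```python
import re

# Stand-in for the Excel fill: 10, 20, 30, 40, one number per line.
numbers = "\n".join(str(n) for n in range(10, 50, 10))

# Each run of digits becomes one JSON-like test record.
test_data = re.sub(r"(\d+)", r'{"id":\1,"username":"user\1"},', numbers)

print(test_data)
```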

Learning

RegEx101

There is RegEx101, where you can test whether a RegEx matches. Modern editors like Sublime and IntelliJ will dynamically highlight matches in your current window. However, this page is also great for finding errors and for learning what actually matches and why, using hover and the explanation section.

RegEx Golf

Then, you can use RegEx Golf as a fun way to learn RegEx.

And of course, here on dev.to

Summary

As you can see, there are plenty of use cases for RegEx to help you with small and larger tasks that would take hours to do manually, especially with large data sets.

Discussion

Larry Lancaster

Great article, good topic. If you’re an expert, there’s no reason not to add regexes to your bag of tricks. The key is to understand not only what happens logically, but also the runtime consequences. For example, take PCRE2, a ubiquitously available library. In this flavor of extended regex, you can use possessive matching (e.g., \d++). Used right, along with other constructs, you can judiciously avoid backtracking by the regex state machine and make your regexes fast and lean. I would advise not to be afraid of them, but like swords, to respect them and understand how to work with them. So it is often with powerful things. :)

Jan Wedel Author

Thanks 🙏
Since I would not consider myself an expert, I would not do it :)
I would still vote against it if there is a more readable alternative. Strive for readability/maintainability and only optimize for speed if it’s necessary.

Larry Lancaster

Of course! Makes sense.

Mike S

As an alternative to RegEx Golf, I've found Regex Crossword to be pretty fun!

kip

Good!

Jan Wedel Author

Thanks, I will have a look!
Looks like you’ve started a markdown link but missed the url... ;)

Justin Cooksey

I really have to learn more regex. I use the online tools to figure out what I need, but I really need to learn more about it, so it's more ingrained. Especially for search and replace in editors.
Thanks for the article.

Aadi Bajpai

I love regular expressions! I was able to circumvent using two whole different APIs by employing some very clever regex string manipulation in one of my projects. The speed improvement is unparalleled.

Jan Wedel Author

It depends on the circumstances and requirements but I’d still reply with:
dev.to/stealthmusic/comment/cnm2

Aadi Bajpai

I'd say that regex should be used if they can make a significant difference and you're aware of the scope of the problem being solved by it. That's where Cloudflare went wrong, I'd say. I use it for url formatting so even if it goes wrong, all I get is a 404 hopefully :P

Jan Wedel Author

Thanks for your advice. I absolutely share your views, so I have to ask if you actually read the first section about „not to use it in production“? ;)
I even use the results of such an operation only for testing purposes.

Jan Wedel Author

BTW, I just added a disclaimer, just in case what I wrote here could be misunderstood. I don’t mean something like „using regex is dangerous but I will show you how to do it right“. That’s absolutely not what I intended.

Rick McGavin

I read this, and fatefully was given the task of taking two Excel columns of 6,000 zip codes and turning them into arrays. Made incredibly quick work of that, so thanks!

Jan Wedel Author

Haha, I’m glad my article could help! 😊

NIKHIL CM

Thanks @stealthmusic. It was very helpful. I tried it, and it is amazing!

Jan Wedel Author

Glad it helps. There are certainly more things to explore and learn. :)

cyr1l

Nice article. Another alternative, an online visual regex tester: extendsclass.com/regex-tester.html