I made a thread yesterday called "How do you feel about regex?"
Lots of fun discussion. Lots of people weighed in on their tools and tactics, so I'd love to move the conversation that way.
How do you go about using regex?
- What tools do you use? (if not from full memory)
- How do you encapsulate/label/comment regex?
- What types of problems do you most solve with regex?
Looking forward to any and all comments!
Top comments (23)
For building & explaining regular expressions, I always use regex101.com/. It's a great tool that really breaks down how your regular expression will work, which is very useful when the expression gets complex. It also has multi-language support to match the regex flavors across different major languages & distributions.
For learning regex, I usually just Google and dive through StackOverflow posts similar to what I'm trying to do. If I write a regex that is more complicated than just some alphanumeric grouping, I try to comment it with some human-readable text such as "capture the ID from the post slug" or something of the sort... not very in-depth, but 😅
Regular expressions are most commonly used when I am validating string structure, which could be form inputs, hostname information, or CSV data. If I am parsing strings, my first go-to for basic cases is some combination of .split() and .join() on the resulting array, as it's a bit easier to reason about and requires no understanding of regular expressions, just some basic logic. I tend to fall back to regular expressions if the string parsing is more complicated, has more than one match, or requires some more advanced substitution.
Just curious, have you ever considered taking on a dedicated approach to learning Regex, like a book or course?
Definitely not a judgment, just a curiosity.
No judgements found!
I tend to learn on the fly as I'm trying to implement something, so it probably means I am going to immediately use the idea if I am learning something new concerning regular expressions. This is also predicated on the fact that I've been using them with regular consistency for 4-5 years so I have a firm foundation in the basics. If I am looking something up, it's usually related to Named Groups (I always forget the exact syntax), Lookarounds, and Conditions. The latter two only get used in very rare & complex scenarios, so I don't bother committing them to memory fully.
When I was just getting started, I just spent a day on regex101.com/ trying to build different expressions to match different strings to see how it would work. This learning was compounded by the fact that Django used to exclusively use regular expressions for route matching, so it "forced" me to work with them.
I usually work from memory and try to use as many named capturing groups as possible because I find that they provide basic inline documentation of the pattern itself, and a more expressive way of accessing the groups on the match result.
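A minimal sketch of that idea in JavaScript, not the commenter's own code. The slug format, the POST_SLUG name, and the parseSlug helper are all hypothetical and chosen only for illustration:

```javascript
// Hypothetical slug format: "2021-03-14-hello-world".
// Each named group documents what that part of the pattern is for.
const POST_SLUG =
  /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})-(?<title>[a-z0-9-]+)$/;

function parseSlug(slug) {
  const match = POST_SLUG.exec(slug);
  if (match === null) return null;
  // Named groups are read by name instead of by position (match[1], match[2], ...).
  const { year, month, day, title } = match.groups;
  return { year, month, day, title };
}

console.log(parseSlug("2021-03-14-hello-world"));
```

Reading match.groups.year instead of match[1] means the calling code keeps working if another group is later added in front of it.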
I just write them down; usually they work. I comment more complex RegExp by splitting up the parts of it in a comment, e.g. for a simple example:
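One way such a part-by-part comment might look, as a sketch only. The 24-hour time pattern and the TIME_24H and isTime24h names are hypothetical stand-ins, not the commenter's example:

```javascript
// Hypothetical example: match a 24-hour time such as "09:30" or "23:59".
//
//   ^                 start of string
//   ([01]\d|2[0-3])   hours: 00-19 or 20-23
//   :                 literal colon
//   ([0-5]\d)         minutes: 00-59
//   $                 end of string
const TIME_24H = /^([01]\d|2[0-3]):([0-5]\d)$/;

function isTime24h(text) {
  return TIME_24H.test(text);
}

console.log(isTime24h("09:30"), isTime24h("24:00"));
```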
Well I did use regex quite heavily to build a basic JS syntax highlighter (instead of doing it properly with tokenisation)
Other than that I just tend to use them for validation of input!
I like using regexr.com/ to test my regex and make sure it behaves correctly for edge cases. It's a wonderful tool!
These days, I mainly use regex at the editor level (e.g., in VS Code) to mass-replace certain patterns with other patterns when it's not possible to easily rename them using built-in editor shortcuts. For example, in early 2021, I migrated my site from Jekyll to 11ty, and as part of that migration, I had to convert a bunch of my Liquid shortcodes to use a new syntax. Since I had hundreds of matches, I relied on regex to mass-replace them rather than doing it by hand.
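As a rough illustration of that kind of capture-and-replace migration, here is a sketch in JavaScript. Both shortcode syntaxes below are made up for the example (the actual shortcodes from the migration are not shown in the comment), and OLD_IMAGE_TAG and migrateImageTags are hypothetical names:

```javascript
// Hypothetical "before" syntax: {% include image.html src="/a.png" alt="A cat" %}
// Hypothetical "after" syntax:  {% image "/a.png", "A cat" %}
const OLD_IMAGE_TAG =
  /\{%\s*include image\.html src="([^"]*)" alt="([^"]*)"\s*%\}/g;

function migrateImageTags(source) {
  // $1 and $2 refer to the captured src and alt values.
  return source.replace(OLD_IMAGE_TAG, '{% image "$1", "$2" %}');
}

const before = 'Intro {% include image.html src="/a.png" alt="A cat" %} outro';
console.log(migrateImageTags(before));
```

The same pattern and the same $1 / $2 placeholders can be pasted into VS Code's find-and-replace with regex mode enabled, which is closer to the editor-level workflow described above.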
More recently, I also learned about the HTML pattern attribute, which accepts any valid regex to validate a form input, and have been using it where appropriate for client-side validation. For example, in a recent project, I used the pattern ^[a-zA-Z0-9-](?:(,\s*)?[a-zA-Z0-9-])*$ to match a comma-separated list of identifiers, with potential spaces after the commas. It seemed difficult to arrive at this solution initially, but then I realized that it was just a more complex case of the slug regex pattern I used here: npmjs.com/package/is-slug.
While I find it easy to compose basic regex patterns, I do think it's harder to read regex, especially for complex patterns like the one above, and especially if I'm reading other people's regex.
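To make the comma-separated pattern quoted above easier to read, here is a small sketch that runs it against a few inputs in plain JavaScript (outside of an HTML form). The ID_LIST and isIdList names and the sample strings are assumptions added for illustration:

```javascript
// The pattern from the comment, used as an ordinary JavaScript regex.
const ID_LIST = /^[a-zA-Z0-9-](?:(,\s*)?[a-zA-Z0-9-])*$/;

function isIdList(text) {
  return ID_LIST.test(text);
}

// Hypothetical sample inputs.
const samples = ["alpha", "alpha,beta", "alpha, beta-2", "alpha,,beta", "alpha,", ", alpha", ""];
for (const sample of samples) {
  console.log(JSON.stringify(sample), isIdList(sample));
}
```

The first three samples are accepted; a doubled comma, a trailing comma, a leading comma, and the empty string are rejected, because every comma must be followed (after optional whitespace) by another identifier character.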
For Ruby regex I've always used Rubular as a lightweight reference tool. I do much more from knowledge/memory than I used to, but I'm still mostly a user of search and tools like this for anything complicated.
/[a-z][0-9]+/, in Ada it is a bit more involved.
so $askagain
curl_cat_whatever $something | rg -n -w $regex
I wrote a (WIP) article explaining regex, which I shared with my colleagues. They have little to no knowledge of how regex works, hence the urge to write it and help them get started, since we're the last people who work on clients' projects before handing them back. Our work includes making sure the data is clean and consistent. notion.so/vicentereyes/Introductio...
I use Regexr (regexr.com) to quickly test out my regexes. It supports JS-style regex and is very handy, since it has features like a live preview and an explanation of your regex.
For explaining regex, I mostly try to create small, simple regexes and assign each one to a const (or function) to make it understandable.
e.g. I'll never do if (/regex/.test(str)), but always const EMAIL_REGEX = /regex/, and then my code.
For solving problems, I sometimes use it to modify data structures on the fly and to refactor code.
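The named-constant habit described above could look something like this. It is a sketch only: the pattern is a deliberately loose, hypothetical stand-in for the /regex/ placeholder, not a complete email validator, and isEmail and describeInput are invented names:

```javascript
// Deliberately loose, hypothetical pattern: "something@something.something".
const EMAIL_REGEX = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Wrapping the test in a function gives the check a readable name at the call site.
function isEmail(str) {
  return EMAIL_REGEX.test(str);
}

function describeInput(str) {
  if (isEmail(str)) {
    return `${str} looks like an email address`;
  }
  return `${str} does not look like an email address`;
}

console.log(describeInput("jane@example.com"));
console.log(describeInput("not an email"));
```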
I also recently wrote a post about it: dev.to/admitkard/regexp-cheatsheet...
RegExp Cheatsheet to speed up code editing and refactor
Piyush Kumar Baliyan for AdmitKard ・ Jan 4 ・ 5 min read