DEV Community

Regex..why can’t you just be normal? 🤯

Elias Groll ・1 min read

The problem with regular expressions is that they are powerful enough to be mistaken for a parser + lexer, and that leads to code which is VERY HARD TO MAINTAIN.

Please devs, when you need to match a C struct - use ANTLR or another parser generator and avoid what I did back in the day:

/((\w+\s*(,\w+\s*))(\n|\s))?{([^}])}(\n|\s)(\w+\s*(,\s*\w+\s*)*)?;/g 🤪

(ps: it’s only one of many, yes it might have bugs, no I did not find them yet, no it’s not in production anymore..)

Discussion (9)

nickmaris

Terminals and programming languages already ship with some form of regex, whereas ANTLR is something you have to install, and you still need comments anyway:

github.com/antlr/grammars-v4/blob/...

A regex can be written across multiple lines with comments, and backed by clear commit messages and tests.
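
As a rough illustration of the multi-line, commented style mentioned above, here is a minimal Python sketch using `re.VERBOSE`. The version-number format and the names `VERSION` and `parse_version` are assumptions made for this example, not something from the post, and the pattern is deliberately simpler than full SemVer.

```python
import re

# Hypothetical pattern for versions like "1.2.3" or "1.2.3-beta.2".
# In VERBOSE mode, whitespace in the pattern is ignored and "#" starts a comment.
VERSION = re.compile(
    r"""
    (?P<major>\d+) \.                 # major number, then a literal dot
    (?P<minor>\d+) \.                 # minor number, then a literal dot
    (?P<patch>\d+)                    # patch number
    (?: - (?P<tag>[0-9A-Za-z.]+) )?   # optional pre-release tag after a hyphen
    """,
    re.VERBOSE,
)


def parse_version(text):
    """Return (major, minor, patch, tag), or None if text is not a version."""
    m = VERSION.fullmatch(text)
    if m is None:
        return None
    return int(m["major"]), int(m["minor"]), int(m["patch"]), m["tag"]
```

A handful of assertions next to the pattern, for example that `parse_version("1.2")` returns `None`, covers the "tests" part of the same point.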

Elias Groll Author

Sure, there is nothing wrong with regex for small things like matching an email, a version number or similar.

It's just not the best IDEA to treat it as a real parser :)

Alex Lohr

Regex is a simple yet powerful concept, comparable to a handgun - and just as easily abused to shoot oneself in the foot. There are valid uses for it, and a lot more invalid ones. But don't be too harsh on yourself. If all you have is a hammer, everything starts to look like a nail.

Ilya Nevolin

Use tokenizers

Madza

I always use regexr.com or regex101.com 😉
Given how rarely I need it, I never even considered learning it fully 😀😀

Elias Groll Author

I still use them! It is valuable to be good at regex, especially for one-time tasks where you want to reformat some data :)

Ben Halpern

Can you explain ANTLR?

Elias Groll Author

"ANother Tool for Language Recognition" - It is a parser generator :)

You can use it in similar ways to regex, but it is meant for bigger tasks (like detecting a C struct) and it can also detect and locate errors.

Every programming language implementation runs a parser over the source before it interprets the code or translates it to machine code.
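
To make "detect and locate errors" concrete, here is a small hand-written recursive-descent sketch in Python. It is not ANTLR and not what ANTLR generates; it only shows the kind of position-aware error a parser can give where a single regex would simply fail to match. The grammar is a heavy simplification assumed for this example (`struct Name { type field; ... };` with one-word types), and `tokenize`, `parse_struct` and `ParseError` are hypothetical names.

```python
import re


class ParseError(Exception):
    """Raised with a human-readable location when the input is malformed."""


def tokenize(src):
    """Yield (text, line, column) for each word or punctuation character."""
    for m in re.finditer(r"\w+|\S", src):
        line = src.count("\n", 0, m.start()) + 1
        column = m.start() - (src.rfind("\n", 0, m.start()) + 1) + 1
        yield m.group(), line, column


def parse_struct(src):
    """Parse the simplified struct syntax; return (name, [(type, field), ...])."""
    tokens = list(tokenize(src))
    pos = 0

    def expect(what):
        # Consume one token; 'what' is a literal or the word "identifier".
        nonlocal pos
        if pos >= len(tokens):
            raise ParseError(f"expected {what} but reached end of input")
        text, line, column = tokens[pos]
        if what == "identifier":
            ok = re.fullmatch(r"\w+", text) is not None
        else:
            ok = text == what
        if not ok:
            raise ParseError(
                f"line {line}, column {column}: expected {what}, got {text!r}"
            )
        pos += 1
        return text

    expect("struct")
    name = expect("identifier")
    expect("{")
    fields = []
    while pos < len(tokens) and tokens[pos][0] != "}":
        field_type = expect("identifier")
        field_name = expect("identifier")
        expect(";")
        fields.append((field_type, field_name))
    expect("}")
    expect(";")
    return name, fields
```

With a missing semicolon after the first field, `parse_struct` raises a `ParseError` that names the line and column of the token where parsing went wrong, which is information a failed regex match cannot give you.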

Fun fact: Some parser generators even use regex to create the lexer, which is responsible for splitting the input into tokens.
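
A minimal Python sketch of that idea: one small regex per token kind, joined into a single pattern with named groups, so the regex engine does the splitting and the parser only ever sees tokens. The token kinds and the names `TOKEN_SPEC`, `MASTER` and `lex` are assumptions chosen for this example; real lexer generators work differently in the details.

```python
import re

# Hypothetical token kinds for a tiny C-like fragment; order matters,
# because the first alternative that matches wins.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("PUNCT", r"[{};,\[\]*]"),
    ("SKIP", r"\s+"),   # whitespace, dropped below
    ("ERROR", r"."),    # anything else is reported as an error
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def lex(src):
    """Split src into a list of (kind, text) tokens, skipping whitespace."""
    tokens = []
    for m in MASTER.finditer(src):
        kind = m.lastgroup  # name of the alternative that matched
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise ValueError(
                f"unexpected character {m.group()!r} at offset {m.start()}"
            )
        tokens.append((kind, m.group()))
    return tokens
```

Here the regexes stay tiny and readable because each one only describes a single token; the nesting and ordering rules live in the parser instead.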