Regex seems to have a broad array of love and hate. How do you feel about it? Do you seek to use or avoid it as a problem solver, and how much do you understand it?
I like it but I tend to tell people to avoid it if possible just because it makes code hard to read.
If people on the team do use regex I ask them to include a link to regexper.com/ in a comment.
It generates a flow chart like this one.
Oooh. I like this tool you introduced upon everyone.
Personally, if a regex is confusing me, I'll just pop over to regex101 for a quick test. It's also a great tool.
I prefer regexr.com because regex101 is not embeddable in the FigJam document (figma.com) I use to keep code notes, as it supports code blocks.
Other tools I find pretty but buggy for complex regex are :
regexper.com/
jex.im/regulex
extendsclass.com/regex-tester.html
I didn't know about this one. Definitely useful. Thanks for sharing.
Wow thanks for sharing that tip.
Now, this is really cool and useful - thank you!!
didn't know that service, seems awesome, also worth a try cucumber.io/docs/cucumber/cucumber..., makes regular expressions more human friendly and readable
Have tried them all I think, that one is nice but it doesn't work for very complex regular expressions (like with some lookaround expressions) whereas it works with regexr.com
I don't write regex. I look up what I need to when I need it and I never think about it again. 🤣
I've found that the specificity of what I need when I do need it can vary so vastly, and that the library for it is so large, that the best use of my time is figuring out what I need at that specific moment and moving on.
It's not needed often enough to warrant full deep dive into it, for me at least.
You write a regex and don't write a comment next to it describing what it does, you and I gonna have a lil' chat ;)
I recently had a case where the person I was reviewing added regex, and they did comment what it did well, and honestly the comment looked exactly correct. They were missing tests, and it wasn't until they ran tests that they realized that what they had done was not what they described in the comment.
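A minimal sketch of how a correct-looking comment and the actual behaviour can drift apart. The ZIP-code rule and the function names are hypothetical, not the code from that review:

```python
import re

# Comment as a (hypothetical) author might write it: "matches a 5-digit ZIP code".
ZIP_RE = re.compile(r"[0-9]{5}")

def is_zip_as_written(text):
    # match() only anchors at the start, so trailing characters are ignored.
    return ZIP_RE.match(text) is not None

def is_zip_as_described(text):
    # fullmatch() requires the whole string to be exactly five digits.
    return ZIP_RE.fullmatch(text) is not None
```

Both versions accept "12345", so the comment looks right at a glance. Only a test with an input like "123456" shows that the first version does not do what the comment says.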
Hey, tests go without saying, right?
;)
As long as it can be tested, proven, and injectable (i.e. not hard coded), it should be fine? :)
Works great in my terminal and editor. I can search for something till it looks right, then replace it with what I want.
In production code it's so hard to see all the edge cases. As a Data Engineer, performance is about last on my list of metrics, as most of what I do runs on a schedule, not when someone clicks something, so an extra 5s on a 10 minute run is no big deal. With that said, every time I see a regex in a code review I ask whether we can do this without a regex, even with a huge performance hit, or else write tests for every possible edge case we can brainstorm.
I am what you might call an occasional developer. As an author, I created an app that helps me convert my work to ePub format.
One of the requirements with ePub is that you output all your files to xHTML. However, the output from my word processing software (Microsoft Word) outputs to some very haphazardly developed HTML that is not xHTML compliant.
I found a whole bunch of libraries that allow me to manipulate the output to xHTML, but in fact they did not do many of the things required to pass basic specifications. Specifically, in xHTML, tags must be lowercase. All of the libraries I worked with at the time made broad assumptions about the HTML structure and worse, did not do the basic conversion from uppercase tags to lowercase tags.
After spending way too much time dealing with this, I paid a developer to figure out the problem. One day later, he delivered three lines of regex code that handled the uppercase to lowercase issue and two other problems I was dealing with. Nearly a month of work and me trying to understand how all these libraries work, and suddenly I had working code. My books are thousands of pages long and broken up into dozens of files. The regex code worked great every time.
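For readers wondering what such a fix might look like, here is a rough sketch of lowercasing tag names with a single substitution. This is not the developer's actual code, and the function name is made up. It assumes every "<" followed by a letter starts a tag, which will not hold inside comments, scripts or CDATA sections:

```python
import re

# "<" or "</" followed by a tag name; the name is captured so it can be lowercased.
_TAG_NAME_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)")

def lowercase_tag_names(html):
    """Lowercase element names in opening and closing tags (attributes are left alone)."""
    return _TAG_NAME_RE.sub(lambda m: "<" + m.group(1) + m.group(2).lower(), html)
```

For example, lowercase_tag_names('<P CLASS="x">Hi <B>there</B></P>') gives '<p CLASS="x">Hi <b>there</b></p>'. Attribute names would need a second pass.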
Having not known about regex until that moment, I went about using it everywhere. I used it when my XML output was not correct. I used it to fix the file output where special characters should be renamed with escape codes and much more.
I am willing to bet that while my code technically works, all that regex is probably a bad idea. Also, regex is not easy to read, so you really have to document it well.
Overall, I really do not like the structure because of it being hard to read, but will say that without it, I probably would have just tossed my app into the trash bin had I spent two more weeks on something as simple as fixing uppercase and lowercase letters along with a few edge case issues.
The drive to complete my app drove me to use regex in areas I probably should not have used it. For example, I used regex to modify some XML files and since I did not really know XML that well, I simply gave up learning it. Instead, I used the XML output from a library and then modified the file with REGEX. Really, I should have sat down and learned XML a little more.
Regex feels like one of those things we all refer to as monolith applications. It seems like you can do anything and everything with it, but the complexities in how you write proper regex and then create test cases all feels very convoluted. At the same time, there is something very tantalizing about 1-3 lines of regex code that would otherwise require customizing libraries, creating a custom API, or doing something else to solve some basic problems.
I am curious, do we know if there are alternatives that are easier to understand and use?
If there's a problem where it makes sense, I love it.
Regular expressions can be written in a clear and easy-to-read way, across multiple lines and with comments. They can condense a lot of logic into something easy to parse by humans, even though their reputation says otherwise. Trying to replicate what they do with a bunch of separate if contains(..) and startsWith(..) and not contains(..) methods is a hack, imo.
Using them for anything where a simple single, named method would suffice is a bad idea.
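As an illustration of the multi-line, commented style, here is a small sketch using Python's re.VERBOSE flag. The date format and the names DATE_RE and parse_date are just an example chosen for this sketch:

```python
import re

# Hypothetical example: an ISO-style date such as 2024-01-31.
DATE_RE = re.compile(
    r"""
    (?P<year>[0-9]{4})                 # four-digit year
    -
    (?P<month>0[1-9]|1[0-2])           # month 01-12
    -
    (?P<day>0[1-9]|[12][0-9]|3[01])    # day 01-31 (no per-month check)
    """,
    re.VERBOSE,
)

def parse_date(text):
    """Return (year, month, day) as ints, or None if text is not a date."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        return None
    return int(m["year"]), int(m["month"]), int(m["day"])
```

In verbose mode, whitespace in the pattern is ignored and "#" starts a comment, so each part of the pattern can be explained where it sits.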
Good. It's a useful tool.
I treat it similarly to SQL or CSS – use it where appropriate, but try to keep it behind an abstraction if possible. E.g. I would wrap a phone number RegEx in a function such as isValidPhoneNumber.
If it's simple to solve the problem without a RegEx then I'll solve it without the RegEx. But sometimes a RegEx is simpler, e.g. the above example.
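A sketch of what that kind of wrapper could look like in Python (spelled is_valid_phone_number here). The rule itself is an assumption made for the example: a "+" followed by 8 to 15 digits, loosely modelled on E.164, after stripping common separators. Real validation rules will differ.

```python
import re

# Assumed rule: "+", a non-zero first digit, then 7 to 14 more digits.
_PHONE_RE = re.compile(r"\+[1-9][0-9]{7,14}")
# Visual separators people commonly type in phone numbers.
_SEPARATORS_RE = re.compile(r"[\s\-().]")

def is_valid_phone_number(text):
    """Return True if text looks like an international phone number."""
    compact = _SEPARATORS_RE.sub("", text)
    return _PHONE_RE.fullmatch(compact) is not None
```

Callers only see the function name, so the pattern can be changed or replaced with non-regex code later without touching them.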
RegEx language generally – I know the basic concepts well enough, but I always check reference materials and/or use a RegEx tool when implementing one.
Specific RegExs – I almost never re-use a RegEx without first taking it apart and making sure I understand what's going on (same with any code snippet really).
I love working with Regular Expressions.
I use them frequently in VS Code and have a few articles out here on implementing them for Search-and-Replace.
I had a project where I had to replace an AS400 Custom Script Search Language with a JavaScript version. I quickly learned how slow they are when running hundreds of them per line. It also prompted me to create a new tool for documenting Regular Expressions (github.com/bob-fornal/reggie-docs).
I think they are great. They don't work in all situations.
ultrapico.com/expresso.htm is for .Net and is a great tool for creating regular expressions, though it doesn't support all elements of regex. I'm not a super genius on them, but they are an essential element of development for me when handling information.
Here is an example of configuration from a translation file from my application, which is an awesome DevOps Deployment application I am hoping to market in future.
The configuration below is to either find a value or a regex expression in an input file (typically itself an application configuration file) to clean a development configuration file. Sounds complex, but a great way of translating source configuration into something generic for different environmental deployment.
Regexes provide more control, but aren't always the best approach.
inforhino.co.uk/beta/automation-an...
For being terse it's surprisingly hard to read compared to other forms of computer programming line noise, like old Perl that runs on oak barrels and mules or everyday J.
Requiring visualisation tools to be somewhat interpretable outside the trivial case means one can't just skim it and be fairly confident about what it does, unlike the surrounding application code (hopefully).
If I can take the performance hit or development time I'll probably avoid regex if I can, usually it's possible to implement a parser that's easier to understand at a quick glance.
So I mostly use them in CLI settings, like patterns for ripgrep or sed.
What's the alternative? Just curious. I use REs when I need REs, and when I need one it's because the alternative seems to be to write a whole string parser or some clunky, seriously code-verbose combination of finds, splits, replaces, etc. that is even harder to read than a regex.
But then I guess I grew up with REs and have no real issue with them. Be nice if advanced RE syntax had evolved a little more standardly and didn't diverge into flavours, which it did but hey.
A few years ago I bought a book (I can't find right now) about the theory of computation, and it started with finite deterministic automata (FDA), moving on to non-deterministic finite automata (NDA) and how FDA and NDA are mathematically equivalent, then it showed how to turn NDA into regular expressions.
The explanation opened my mind to a new way to look at regex, so I found it a lot easier to understand and write them, and also understand their limitations.
The main issue I'm left with is that regex engines these days have added look-ahead and look-behind which means they are no longer mathematically equivalent to NDAs.
Regex is one of our most useful tools, when used appropriately. But that requires good knowledge of regex (such that you actually understand what you're doing) and good knowledge of the limitations.
If you're touching HTML with regex, chances are you're doing it wrong.
If you need backreferences and readaheads, chances are you're doing it wrong.
If you're only using .* , chances are you're doing it wrong.
I recommend that anyone who wants to understand regex implement a small regex parser for themselves, to get a greater understanding of how it works. Alternatively: try to rewrite your regex pattern exclusively in terms of |+.() to explain what it actually does. Everything else in the basic regex syntax builds off of that.
If I can avoid it in my code, I will. They can grow like crazy over time, and they become difficult to read and understand.
I love them in my editor (I use Vim btw) or to perform operations in the shell, however. It's super powerful for everything plain text: search, search and replace, repeating an action on specific lines... the list goes on.
I admit, when I first became really fluent in RegExp, I tended to overuse them for some time. Even now, I have the impulse to solve string-based issues with RegExp, but I have learned to stop and think about if they really are an improvement over other solutions. So you could say I seek to use it and try to avoid it at the same time, mostly because I understand it.
My current stance is that RegExps are hard to read, have performance issues and some dangerous pitfalls (e.g. regular expression denial of service (ReDoS)).
I find it more pleasant to write small custom parsers instead.
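To make the ReDoS pitfall mentioned above concrete, here is a small sketch for a backtracking engine such as Python's re module. The two patterns and the helper name seconds_to_reject are chosen for this example:

```python
import re
import time

# Nested quantifiers: on a near-miss input, a backtracking engine tries
# exponentially many ways to split the run of "a"s between the two "+".
RISKY = re.compile(r"^(a+)+$")
# Accepts exactly the same strings, without the nesting.
SAFE = re.compile(r"^a+$")

def seconds_to_reject(pattern, n):
    """Time how long pattern takes to reject n "a"s followed by "!"."""
    text = "a" * n + "!"
    start = time.perf_counter()
    result = pattern.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing "!" must make both patterns fail
    return elapsed
```

Calling seconds_to_reject(RISKY, n) for n around 18, 20, 22 should show the time roughly doubling with each extra character, while SAFE stays flat. Keep n small when trying this.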
regexone.com/ is BY FAR the best resource to learn RegEx from scratch.
Pretty much everything you would get in this book (but also worth reading!)
amazon.co.uk/Mastering-Regular-Exp...
And actually, core RegEx is fairly straightforward:
Good luck!
Never liked it, I tried to but never.
While writing or reading some, for a very short while I get the feeling that I understand it now, but the next second, here and there, I get confused again.
I guess it's just one of those skills where one would *either get it entirely or nothing at all.*
Writing regex is especially tricky for matching IPv4 addresses.
I hope I am not alone :/
Thank god for these parser/convertor tools.
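On the IPv4 point above: the tricky part is that each octet has to be limited to 0-255, which a plain "one to three digits" pattern does not do. A sketch of one way to write it, assuming leading zeros should be rejected (the names are made up for the example):

```python
import re

# One octet, 0-255, without leading zeros:
# 250-255 | 200-249 | 100-199 | 0-99
_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
# Four octets separated by literal dots.
IPV4_RE = re.compile(rf"{_OCTET}(?:\.{_OCTET}){{3}}")

def is_ipv4(text):
    """Return True if text is a dotted-quad IPv4 address."""
    return IPV4_RE.fullmatch(text) is not None
```

Building the pattern from a named octet piece keeps it readable. If a regex is not required, Python's standard ipaddress module is an alternative worth considering.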
It's generated code, according to a declarative syntax, that has been tested and iterated on by millions of projects over decades... What isn't to love?
Prefer if we can use simpler things, but if it's needed. Then it is needed.
If you're trying to pattern match a string, is there really a better option? If so, I don't know it, but feel free to let me know.
There's definitely a rabbit hole to fall into and a lot of complexity along the way, but I've found regex to be one of the more reusable and powerful tools in my arsenal even though most of the time they seem to be on the simple side rather than complex.
If it ever gets confusing I've always got my goto: regex101.com/
Hmmm...I would advise using it only for search, data scraping and data validation/cleaning purposes. Beyond that I won't advise anyone to use it unless there's a real reason to, like reducing the complexity of your code for the above purposes.
Plus there are tons of websites that allow you to copy and paste regex for specific uses depending on your language of choice or the regex engine you are using.
To me it's sort of like Leetcode: readability is an issue unless you have learnt it, and beyond that it's kind of being too clever with your code.
I like to use Regex and I use it frequently but not for anything too complicated. I generally build all my RegEx from Regex101.
I really understood it at its core during my university Computer Science Program where we talked about DFAs -> NDFAs -> Regular Languages -> Regular Expressions
I started on Perl, so I'm very familiar with regular expressions. That said, they definitely are not that easy to read, so for code, keep it simple and comment what you are doing. It solves a lot of problems, but not necessarily better than more readable approaches. Still, you won't completely avoid them. Rewrite rules, for example, still require a basic understanding of regular expressions, as well as some network device configurations.
Pretty good. I remember the first time I saw a guy typing a long, multi-group, nested regex to grep logs out of a file. I was amazed. Then I learnt it on my own. Wikis are still helpful, but for most daily cases it's possible to learn.
Indeed. The very point: REs shine for small ad hoc jobs. Grammars and parsers are from a different context altogether. To epitomise the role of REs, consider a job, for example, that simply wants to detect all lines that contain one word somewhere before the other. This is trivially simple in an RE, and precisely whence their definition stems (as in, I can write that on the command line for a grep or find), and the niche in which their ubiquity reigns: that small need, right now, for a mildly complex pattern test or field extraction.
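The "one word somewhere before the other" job can be sketched in a few lines. The function name and the sample words are just for illustration:

```python
import re

def lines_with_order(lines, first, second):
    """Return the lines in which the word `first` appears before the word `second`."""
    # \b keeps "error" from matching inside "errors"; re.escape guards
    # against words that contain regex metacharacters.
    pattern = re.compile(rf"\b{re.escape(first)}\b.*\b{re.escape(second)}\b")
    return [line for line in lines if pattern.search(line)]
```

The whole job is the pattern "first, anything, second". Doing the same with find and slice calls takes noticeably more code.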
It's a great tool for solving certain types of problems that would be very hard to solve otherwise. I've seen some very arcane solutions for problems that could have been solved very simply with regex - because the author of the code didn't know how to use regex. Almost everyone learns about regex later than they should - I know I did.
Any non-trivial regex should be clearly commented, though. That's often missing.
I love regex and I'm fairly confident with creating complex regex's from memory, but I understand that many people find it difficult, so I try to either use it sparingly or place a lot of comments around it to help others who may not be so comfortable.
When I use RegEx's I try to follow these rules (more like a guideline 🏴☠️)
My context is usually used in file name parsing as my users are supposed to name their documents in a meaningful, structured way for ingestion into other systems. The exception report (files not match-able) typically runs longer than the source code ;) Turns out people just can't spell common words consistently when it matters. (I'm no exception)
I love them. I find them really powerful and flexible, especially in script-like languages like Ruby.
Yes, I know, sometimes they look like line noise... That is maybe their major drawback, but it is compensated by their power.
How much do I understand regexp? Well, it is difficult to judge yourself, but I think I know them pretty well, although there are maybe some features (e.g. greedy vs not greedy) that I do not use often and I need to check the manual.
Oh, yes, BTW, with regexp-search-and-replace in emacs you can do miracles...
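For anyone else who has to look up greedy vs non-greedy each time, a tiny Python sketch of the difference (the sample text is made up):

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: ".+" grabs as much as it can, so the match runs to the LAST ">".
greedy = re.findall(r"<.+>", text)

# Non-greedy: ".+?" grabs as little as it can, so each match stops at the NEXT ">".
lazy = re.findall(r"<.+?>", text)
```

Here greedy holds a single match covering the whole string, while lazy holds the four separate tags.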
I think they're fun to figure out but I only understand the very basics. My own regexps are usually way longer than the stack overflow suggestions. It seems like a fun topic to dive into, although I've never had a real need to do so. Maybe we need a game for learning regexp 🤔
My go to example of a love/hate relationship. It’s a super useful tool, but used in the wrong ways can lead to a lot of pain. One thing I always like to do is leave useful comments around a regex when I have to use it in code. This is good for future me just as much as it is good for other team members who stumble along it in the future.
youtube.com/watch?v=WDaNJW_jEBo
I've learned to love
more than hate
and
that's the overall thing
of it
all just to try to learn
to love more
because it's easy to hate
but it's hard to love
-- Snoop Dogg
It's a tool in the toolbox. Like the other tools, my understanding is incomplete, but generally good enough to know when to use it and when to choose a different tool. Can use it sufficiently well to get done what I need to, which normally isn't particularly demanding. While it'd be nice to understand the mathematics behind it, it probably wouldn't actually change how I'd use it.
I usually take reference and modify it as per needs while working in the terminal.
It takes time for me to digest what's going on; after tinkering a little I feel better and start to move towards the result.
Working with regex feels really messy to me sometimes, but the feeling when that works is second to none :)
Very useful BUT it's almost an entire language in its own right. I kind of respect that and just skirt around it. I use online regex builders when I need a regex for a filter style func call and that's me and regex done with until next time!
There was a 300-page O'Reilly book sold at one point just on Regex and nothing else! That's how deep the rabbit hole goes on regex. If you can avoid regex, then do so; if you can't, then make the expression simple and watertight by testing the heck out of it. Regex is greedy and if you don't test it properly you will regret it later.
In the past I hated them and just searched, copied and pasted, because my needs were pretty simple, but as I'm currently developing a Visual Meta Programming tool I'm obliged to really understand what I'm doing, like Lookaround and Lookbehind, so I'm now starting to like them more :)
Code Smell 41 - Regular Expression Abusers
Maxi Contieri ・ Dec 3 '20 ・ 1 min read
Regex was my second programming “aha moment.” I love regex, all thanks to Apache’s mod_rewrite way back in like 2004.
If you're here and want to learn more about regex (regular expressions)...
Getting Started with Regular Expressions
Nick Taylor ・ Jul 18 '21 ・ 4 min read
I like how regex makes it easier with the task, but personally I really don't understand it most of the time. But I will try my best to learn it.
I'd rather master OOP than master regex
As a vim/nvim user I would say that regex is my bread and butter. I also love using GNU grep, "Global REgular exPression Print", sed and a bunch of other programs that make use of it.
Regex is difficult to read and can have unanticipated side effects. I tend to run from it if I can.
I use it for validations many times, I think it is better (sometimes) instead of using multiple if-else statements. I think the code looks more clear when using regex.
I have some beef with regex, because it doesn't want me to add support for it to ParseJS... 😭🤬
It's great, the only problem is I can never remember the syntax ... I just need to look stuff up almost every time I use it, lol
One of the key building blocks of tech over the past 30 years, but one that you still have to reach for the cheat sheet. Once you have written your first mod_rewrite rules, you never look back...
Does sed count?
Regex is like recursion, I don't need it very often but when I do it is amazing at solving the task at hand.
Hello Developers
Super useful, interesting to write
In my opinion regex is awesome for validating strings, but npm packages like yup remove its necessity by introducing more readable code and validation callbacks and responses.
Very underrated in writing tests
It's something I use often, but still 100% only copy from Stack Overflow.