
I hated Regex so much that I made iHateRegex.io

Geon George ・ 1 min read

In my years as a developer, I never really spent the time to learn regular expressions. It always seemed like a hard nut to crack.

I used to spend so much time trying to find regular expressions for my use case.

now you have two problems

Enter iHateRegex.io

https://iHateRegex.io

I made a simple tool that will explain to you how commonly used regular expressions work.

It is a simple tool that will show a visual graph as well as matches and highlighting for code.

Hope you like the tool :D

Tech

Public Repo: https://github.com/geongeorge/i-hate-regex

The application is built using:

Update

  • Thank you for all the love and support guys 💓 Posting this has given me enough motivation to work on this more.
  • The application is just an MVP right now. With the support of everyone, I can make this into a great project for beginners.
  • I'm also working on tutorial pages to get started on regex. Here's a sample

Producthunt launch

I just launched iHateRegex on Producthunt 😺

👉👉
https://www.producthunt.com/posts/i-hate-regex

Posted by:

Geon George

@geongeorge

Full-Stack Maker 🚀 • Blogger • http://ihateregex.io • http://igbro.com • http://geongeorge.com

Discussion

 

The main problem with regex is that they are overused.

For simple problems, regex are not needed, you have simpler, more readable solutions.

A combination of split(), substring(), removePrefix(), removeSuffix() is usually enough.

For complex problems, regex are not good at all.

Do not use a regex to try to parse an email, a URL, HTML, ...

If the regex is not trivial, do not use a regex.

My advice:

Have a rule for pull requests that insists that every regex must come with a unit test that includes input that the regex is supposed to match and the ones that it's supposed to reject.
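A rough sketch of what such a test could look like in JavaScript. The ZIP code pattern and the `checkRegex` helper are made up for illustration; any test framework would do the same job with less code.

```javascript
// Hypothetical pattern under test: a 5-digit US ZIP code with an optional +4 suffix.
const ZIP_CODE = /^\d{5}(?:-\d{4})?$/;

// Inputs the pattern is supposed to match, and inputs it is supposed to reject.
const shouldMatch = ["12345", "12345-6789"];
const shouldReject = ["1234", "123456", "12345-678", "12345 6789", "abcde", ""];

// Returns a list of failure messages; an empty list means the regex passed.
function checkRegex(pattern, accepted, rejected) {
  const failures = [];
  for (const input of accepted) {
    if (!pattern.test(input)) failures.push(`should match: "${input}"`);
  }
  for (const input of rejected) {
    if (pattern.test(input)) failures.push(`should reject: "${input}"`);
  }
  return failures;
}

const zipFailures = checkRegex(ZIP_CODE, shouldMatch, shouldReject);
console.log(zipFailures.length === 0 ? "all cases pass" : zipFailures.join("\n"));
```

The reject list is the part that tends to get skipped, and it is usually the more informative half.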

 

I love the "regex must come with a unit test" bit.

I would say though, sometimes validating real life values requires non-trivial expressions. Thinking about phone numbers or proprietary unique identifiers here.

 

phone numbers are a good example
for a simple validation I don't need a regex
for bulletproof validation, you shouldn't use a regex either; you should send an SMS to the number via Twilio and ask for the confirmation code

I'm having a hard time picturing what simple validation of a phone number looks like without regex.

Also I would mention the Twilio scenario doesn't work for landlines (to my knowledge).

Would you mind giving an example of something you might do? Let's say you have only to validate for 3 patterns:

5555555555
+1 (555) 555-5555
555-555-5555

Let's also say any of the patterns above may contain an optional extension:
x345

Thinking about this now, I would say the regex would be trivial for simple validation. Something like: "number of digits is greater than or equal to 10" would do, so your original statement stands. Still curious how you would go about this without regex.
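For comparison, here is one way the two checks might look in JavaScript. This is only a sketch: the function names are invented, and it assumes the extension is an "x" followed by digits, optionally preceded by a single space.

```javascript
// Assumption: an extension, when present, is "x" plus digits,
// optionally preceded by one space.
const PHONE_PATTERN =
  /^(?:\d{10}|\+1 \(\d{3}\) \d{3}-\d{4}|\d{3}-\d{3}-\d{4})(?: ?x\d+)?$/;

// Strict check: the input must be exactly one of the three listed shapes.
function matchesKnownPhoneShape(input) {
  return PHONE_PATTERN.test(input.trim());
}

// Loose check: at least 10 digits anywhere in the input.
function hasAtLeastTenDigits(input) {
  return input.replace(/\D/g, "").length >= 10;
}

const samples = ["5555555555", "+1 (555) 555-5555", "555-555-5555 x345", "555-5555"];
for (const sample of samples) {
  console.log(sample, matchesKnownPhoneShape(sample), hasAtLeastTenDigits(sample));
}
```

The strict version rejects anything outside the three shapes, while the loose one only counts digits, so which one counts as "simple validation" depends on how much the format matters.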

No regex can validate that a phone number is valid.

The phone number 0049234523453452 does not exist, how do you know it?

Sending a SMS - or calling a landline, Twilio can do that as well - is the only reliable method.
Same for emails, sending an email and waiting for the user to click is the only reliable method.

I'm not saying a regex can tell you a phone number is valid. I'm refuting the point:

phone numbers are a good example
for a simple validation I don't need a regex

Just curious to see how you approach that.

To pre-validate, you can keep it simple like this:

fun main() {
    val emails = listOf("john@example.com", "John Doe")
    emails.forEach { println("input=$it isEmail=${it.isEmail()}")}
    // input=john@example.com isEmail=true
    // input=John Doe isEmail=false

    val phones = listOf(" 5555555555 ", "+1 (555) 555-5555", "555 - 555 - 5555", "invalid")
    phones.forEach { println(it.normalizePhoneNumber()) }
    // 5555555555
    // 15555555555
    // 5555555555
    // null
}

fun String.isEmail(): Boolean = 
    length > 6 && contains("@")

fun String.normalizePhoneNumber(): String? =
    this.filter { it.isDigit() }.takeIf { it.length > 6 }

That's pretty readable. Still we can see it fail in an edge case right away: +1 555-555. Not to mention the plus is a valid character for the phone number.

This isn't a hill I'm going to die on or anything, but regex is great for pattern matching.

You had a great way to avoid it for the equivalent

return phone.replace(/\D/g, '').length > 6;

I just don't understand why using something a bit more complex to catch obviously invalid patterns is so offensive to some people.

Anyways, thanks for sharing how you would do it. Much appreciated.

@bpedroza
You have a good point, which will allow me to refine my position.

I think that we should stop over focusing on the HOW you did the pattern matching part (oh see, a nice regex! let me make it even better) and focus more on the WHAT it's supposed to do and WHY it's important.

Having the unit tests for the regex allows us to reframe the question this way.
Then it's an implementation detail whether you use a regex or a parsing library or whatever, and I am fine with all of those solutions.

 

For simple problems, regex are not needed, you have simpler, more readable solutions.

I used to feel the same.

Then I actually learned how to use regex.

Now I think that non-regex solutions that use more than one function are better done with regex. Because 80% of the time you only need simple regex patterns.

For complex problems, regex are not good at all.

(Almost) totally agree.

 

Have a rule for pull requests that insists that every regex must come with a unit test that includes input that the regex is supposed to match and the ones that it's supposed to reject.

Will do <3

 

I agree about the part where people shove in regex where it's absolutely overkill.

 
 

Delightful tool. Bookmarking!

I also hate regex, so much that I'm helping design an expression parser that addresses many of its flaws. This new parser won't replace regex (trying to be everything to everyone is how it came to be so annoyingly esoteric to begin with), but it'll hopefully be better than what we have for a number of common cases.

 

Thank you.
Your parser would be amazing. If it plugs into javascript I really would love to try it out.

 

I'm implementing it for our C++ game engine, so this implementation won't be Javascript compatible. That said, both the code and the specification will be open source, so there's nothing stopping you (or anyone else) from implementing it elsewhere.

is it already under construction? I'd love to see the repo

We're fairly early in the implementation stage (the code is currently under a serious breaking refactor), but you can certainly check it out. The project page is here, and you can follow our development on Phabricator.

I really wanted a parser built like regex101 for my tool. Really don't know where to start
github.com/geongeorge/i-hate-regex

 

I have been a programmer for over a quarter of a century. And I used to think that I would never "get" regular expressions. Then one day about 6 or 7 years ago I realized I could write regular expressions correctly about 80% of the time, after which it occurred to me that I now "get" regex. What changed?

I started using PhpStorm (or any JetBrains IDE) and forced myself to use regular expressions to do search and replace. Because of how their UI works (it includes a preview of the result for each occurrence when it asks whether it's okay to replace), it basically trained me to grok regex.

So, trying to learn regular expressions only when you need to use one in your code will, if you are like me, leave you forever unable to become one with regex. But if you would like to learn the 20% of regex like the back of your hand that you'll use 80% of the time, grabbing a JetBrains IDE (or maybe some other IDE or text editor) and forcing yourself to use regex on almost every search is really all it takes.

#jmtcw

 

Thanks for sharing the story. Although I didn't use any special IDE other than VS Code, building ihateregex.io made me understand a lot of it. I can build my own expressions now.

 

This is a nice, clean resource, with a great little graphical description system. I like it.
It's different to all the other regex pages because they're geared around you typing things in and testing them, where this is a "tl;dr" of sorts.

I don't understand why people don't like regex though. It's kind of like saying you don't like arrays.

 

Thank you. I like clean as well; clean is good.

I don't understand why people don't like regex though. It's kind of like saying you don't like arrays.

In my case, I just put a clickbaity url so people would actually look at it and I thought it's funny..lol

PS: I'd love it if you put this review on Product Hunt. This is my first launch there. producthunt.com/posts/i-hate-regex

 

You really did a great job at self-improvement here, even if it wasn't your intention at first! :)

Take a problem, solve it and overcome it.
What I like especially about your site is that you are greeted with a couple of popular choices and just a searchbox to get started. On other sites I always find myself getting lost in information like tips, setting options, understanding different flags and stuff that is visible right after loading the page.
The clean UI helps to stay focused on getting the regex and leaving again.

 

Thank you for the coffee and the feedback <3

I'm happy to see you liked it

 

according to your regex, this is a valid email address: geon@ihate@regex.io

 

I have 2 email expressions in the app (try a search). Just use the simple one if you don't know what you want. It works 99% of the time.
If you want a 99.9% success rate, use the complicated one (but please don't)

read the comment thread in this post by @hyftar and me

 

I see. Well, I just clicked "email" on the homepage and I landed on the simple email :-)

With the complicated email, I think you need to consider word boundaries, since you match a part of theproblem@test@gmail.com, but the whole input should be either entirely right or entirely wrong.
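To illustrate the partial-match problem, here is a small JavaScript sketch. The pattern is a simplified stand-in, not the expression used on the site, and it uses the `^` and `$` anchors rather than word boundaries to force a whole-input match.

```javascript
// Hypothetical simplified pattern, not the one used on the site.
const EMAIL_BODY = "[^@\\s]+@[^@\\s]+\\.[^@\\s]+";

const unanchoredEmail = new RegExp(EMAIL_BODY);
const anchoredEmail = new RegExp(`^${EMAIL_BODY}$`);

const suspicious = "geon@ihate@regex.io";

// Unanchored: the engine is free to find a valid-looking substring.
const partial = suspicious.match(unanchoredEmail);
console.log(partial ? partial[0] : null);

// Anchored: the whole input has to fit, so the second "@" makes it fail.
console.log(anchoredEmail.test(suspicious));
console.log(anchoredEmail.test("geon@ihateregex.io"));
```

With the anchors in place the input is accepted or rejected as a whole, which is the "entirely right or entirely wrong" behaviour described above.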

I'm starting to see why you hate regex ;-)

😂 Yeah. I will spend some time to properly understand the email part

 
 
 

Many people forget that you can use multiple regex patterns for one input. While everybody knows that a function for everything is bad and you should write one function for one purpose, everybody still writes regexes that include everything in one command. Maybe it's time to write an article on encapsulation of regex?
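A minimal sketch of that idea in JavaScript, using invented username rules: each rule gets its own small pattern and a readable name, so a failure tells you which rule was broken.

```javascript
// Hypothetical username rules, each expressed as its own small, named pattern.
const usernameRules = [
  { name: "3 to 16 characters long", pattern: /^.{3,16}$/ },
  { name: "only letters, digits, underscores", pattern: /^\w+$/ },
  { name: "starts with a letter", pattern: /^[A-Za-z]/ },
  { name: "no consecutive underscores", pattern: /^(?!.*__)/ },
];

// Returns the names of the rules the input violates (empty array means valid).
function brokenUsernameRules(input) {
  return usernameRules
    .filter((rule) => !rule.pattern.test(input))
    .map((rule) => rule.name);
}

console.log(brokenUsernameRules("geon_george"));
console.log(brokenUsernameRules("9lives"));
```

Each pattern stays short enough to read at a glance, and adding or removing a rule does not require touching the others.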

 

Where does all this hate for Regex come from?? I never thought it was complicated and learning it can be done within a few hours.

 

it comes from fear and lack of clarity.
Take the case of a simple expression to validate an email:
emailregex.com/
Why does it have to be this hard?
even if you check any stack overflow thread you'll find 100s of opinions.
stackoverflow.com/questions/46155/...

(read that again imagining you are a beginner)

 

Well simply put, Regex is not the right tool for validating email addresses and should be only used as feedback on the frontend of your app. If the patterns you're using get this complicated, it's probably a sign the thing you're using it for isn't the right task for the tool (and considering the amount of people in the comments of your website that are finding all sorts of edge cases, I think this is accurate).

Of course if you want to do things right, it often gets really tricky, but that's like showing the code base of a huge website as an example to someone who wants to learn a programming language, of course he's going to be overwhelmed and will feel defeated, but no one should start with the complicated stuff.

Regex at its core is fairly simple, but sadly, it's often overused and misused.

Regex at its core is fairly simple, but sadly, it's often overused and misused.

That was one of the main reasons why I started this idea. It's a simple cheat sheet for commonly used regular expressions.

Also, I don't really hate regex (hypocrite me 😂). The name is just a funny thing I came up with. I've seen a lot of hate for this online.

I hope the idea became clear when you opened the website. It's not a rant about how many people and I hate the thing.

PS: If you ask me the only email validation expression that I might ever use with regex is this: ihateregex.io/expr/email

I completely agree with the email one, that's also the one I went with in most of my projects. I convinced some teachers in the past that Regex is not the best tool for email validation and that we should use that pattern in frontend instead.

Also I figured you didn't actually hate the thing if you created a whole website about the thing 😂.

You've done a good job of explaining it, although some patterns (such as the IP one) could probably be simplified a bit.

I'll definitely forward your website if I see someone struggling. Great job!

Thank you. I need to improve the explanations as well. (Will fix those :) )

at some point, I really felt embarrassed to share this.
It's been sitting in my Github for a while and I finally decided to share and see what everyone else thinks about it.
Your feedback has been of great help :D (there is also a beginners tutorial article in the making)

If your repo is public, I'd be glad to help with a PR or two! 😊

Here you go 😊:
github.com/geongeorge/i-hate-regex

The code is not the most elegant (I have to warn you)

Regex is not the right tool for validating email addresses

Huh. I always do complex stuff like this with regex, in frontend as well as in backend.
What would be your tool of choice?

Depends on the language, in PHP for example, there's this library: github.com/egulias/EmailValidator

Basically, since there are so many rules and edge cases about emails, creating a Regex to solve that problem is not a productive approach since if new edge cases are created or found, maintenance instantly becomes hell and the pattern most likely has to be recreated from scratch.

Fortunately, people have already poured tons of hours into solving that problem for us and we can build on their shoulders.

Another example where regex looks like an appropriate tool but isn't is parsing HTML. It spawned a famous Stack Overflow answer.

I just launched iHateRegex on Producthunt 😺

👉👉
producthunt.com/posts/i-hate-regex

 

The problem with Regexes (especially the complex ones) is that they are really prone to errors so you have to write tons of tests to verify them. But who wants to write a million tests? We just copy and paste from Stack Overflow or use a trusted library and pray for the best. That's why they have a bad reputation.

 
 

For a newbie, it's too complicated. When you write regex only for email and username, it's cool. But when it gets into complicated stuff, I also sometimes get stuck with it and hate it. But in the end I like it...!! Because I'm also a programmer. :D I have no other option. :)

 
 

Great tool, bookmarked! ⚡

 

Thank you <3 More features on the way

 
 

I'm happy you got the joke :P
Many took it personally

 

I like regex. I hate when people invent weird custom 'subsets' of regexp, instead of using this thing that works well. (albeit, I agree: it's not very readable. Free space regex with comments solves that, though.)
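As far as I know, JavaScript's RegExp has no free-spacing flag of its own, so the sketch below emulates one with a made-up `freeSpacing` helper that strips layout whitespace and `# comment` tails before compiling. Languages with a real extended mode do this natively.

```javascript
// Hypothetical helper: drops "# comment" tails and all whitespace, then compiles.
// Limitation: the pattern itself cannot contain literal spaces; use \s or \x20.
function freeSpacing(source, flags = "") {
  const compact = source
    .split("\n")
    .map((line) => line.replace(/\s+#.*$/, "")) // strip trailing comments
    .join("")
    .replace(/\s+/g, ""); // strip layout whitespace
  return new RegExp(compact, flags);
}

// A date in YYYY-MM-DD form, written one piece per line.
const ISO_DATE = freeSpacing(String.raw`
  ^
  (\d{4})               # year
  -
  (0[1-9]|1[0-2])       # month, 01 to 12
  -
  (0[1-9]|[12]\d|3[01]) # day, 01 to 31
  $
`);

console.log(ISO_DATE.source);
console.log(ISO_DATE.test("2020-03-15"), ISO_DATE.test("2020-13-01"));
```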

 

Great thing that you came up with something this useful when a lot of people still hate all the complexities behind regex. So, does ihateregex help to generate regular expressions with a set of strings?

 

not in its current form. It just allows you to edit and play around with already saved expressions

 

Is the site down? I can't reach it.

 

Yes, it was for a couple of minutes. It's back online :)
Don't know what happened

 

Did a huge update and had some issue with ssl.
I guess it's fine now :)

 

Ooh, I really liked your tool, why don't you make more like this?

 

Awesome work!

I don't particularly hate Regex's but I hate getting them right!

I have used this online tool (No connections to the site) regex101.com/ for a few years as something to test and help at least validate the cRaZyNeSs that sometimes happen when you "over do it" with Regex's

Keep up the great work!

Sandy

 

I love regex101.com for understanding someone else's regex and breaking it down into component parts. Really handy for that.

 

One of the things about regexes is that they are (generally) not regular in two senses of the word:

1) they have different syntax for perl/PCRE, Python, Emacs, Vim, ...;
2) they are not regular in the linguistic/CompSci sense of the "regular languages", and even when they are, backtracking parsers are (almost always) used on them and this can lead to denial-of-service bugs :(
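The denial-of-service point can be seen with a small JavaScript experiment. This is a sketch that assumes a backtracking engine (which is what common JavaScript runtimes use); the helper name and input sizes are arbitrary, and actual timings will vary by machine.

```javascript
// Classic nested-quantifier pattern that invites catastrophic backtracking.
const RISKY = /^(a+)+$/;
// Accepts the same strings, but with a single quantifier there is little to retry.
const SAFER = /^a+$/;

// Times a single test() call in milliseconds.
function timeMatch(pattern, input) {
  const start = Date.now();
  const matched = pattern.test(input);
  return { matched, elapsedMs: Date.now() - start };
}

for (const length of [14, 18, 22]) {
  // The trailing "!" forces the match to fail after trying every split of the a's.
  const input = "a".repeat(length) + "!";
  const risky = timeMatch(RISKY, input);
  const safer = timeMatch(SAFER, input);
  console.log(`length=${length} risky=${risky.elapsedMs}ms safer=${safer.elapsedMs}ms`);
}
```

Both patterns describe a regular language, yet the risky one can take time that grows exponentially with input length, which is why user-supplied input and nested quantifiers are a bad mix.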

 

Regular expressions are extremely useful in (ad hoc) scripts and sometimes in search-replace operations in text editors (also for pure search to become more specific (fewer false positives)).

Typically one-off scripts to extract and output information from line-oriented input (even HTML). They don't need to be robust and are (usually) not maintained. They get the job done quickly and efficiently.
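A throwaway script of that kind might look like the JavaScript sketch below. The log format and field names are invented for the example; a real script would read from a file or stdin.

```javascript
// Hypothetical log excerpt standing in for real line-oriented input.
const logText = `
2020-03-01 12:00:01 INFO  user=alice action=login
2020-03-01 12:00:07 ERROR user=bob action=upload code=507
2020-03-01 12:01:15 ERROR user=carol action=login code=401
`;

// Captures date, time, user and error code from ERROR lines only.
const ERROR_LINE = /^(\S+) (\S+) ERROR\s+user=(\w+).*\bcode=(\d+)/;

const errors = logText
  .split("\n")
  .map((line) => line.match(ERROR_LINE))
  .filter((m) => m !== null)
  .map(([, date, time, user, code]) => ({ date, time, user, code: Number(code) }));

console.log(errors);
```

It makes no attempt to handle malformed lines, which is the trade-off described above: good enough for a one-off, not something to maintain.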

 

This is great, I hate regex too!

 
 

Nice service, Regex works nicely on simple things.
Thanks for the link - you deserve at least a Github * for your efforts.

 

Github stars are much appreciated -^ Thank you

 
 

Awesome, having the diagrams and cheatsheet right there is very helpful. It will be much easier to figure out how to customize it and learn something along the way. Great work!

 

Thank you. More features are coming as we speak. Do check us out in GitHub: github.com/geongeorge/i-hate-regex

 
 

Thank you. Doing as we speak

 

Hey Geon, cool project. Pretty sure it'll be handy to many.

Even the other day at work we were talking about how clueless we are about regex.
I'll share it with them.

 

Thank you. that'd be awesome

 

Really useful app ! Thanks ;)

 

oooohhhh, I will definitely take a look at this. I hate regex also and the visuals you are providing will definitely help me debug my expressions

 

Thanks brother.. i hate regex too..nice share for people like us

 
 

🤩 This is awesome!
How can I contribute more builtin patterns?

 

Thank you <3 ! Right now there is a google form on the website.

You can also send a PR: github.com/geongeorge/i-hate-regex

Just add your regex to static/regexdata.json

(excuse the bad code. will slowly fix that :P)

 
 

Love regex, it's honestly not so bad, go to regextester and write some tests, besides, regex can be used in any language 😍