DEV Community
Keff

Can you find the bug in this piece of code? - RegExp edition

Hey there! 👋

I'm back with another installment of Find the bug, this time with TypeScript/JavaScript. Regular expressions are useful, but they can behave in unexpected ways. Can you tell me what the code below will output, and why?

!! Don't look at the comments to prevent spoilers if you want to solve it by yourself !!


Buggy code

const TEST_REGEXP = /[a-z0-9]+_[a-z0-9]+/gi;

function isValidName(value) {
    if (typeof value !== 'string') return false;

    return TEST_REGEXP.test(value);
}

const filenames = [
  "test_1",
  "test_1",
  "test_2",
  "other_test",
  "some_file"
];

for (let name of filenames) {
    console.log(isValidName(name));
}

Now then, can you find the bug?

Top comments (24)

Martin

The regex instance keeps track of lastIndex, so it does not start from index zero for every item. You can remove the g flag, so the regex no longer advances lastIndex, and add string start and end anchors so it has to match the whole name: /^[a-z0-9]+_[a-z0-9]+$/i. Or you can reset lastIndex before each test: isValidName = (value) => { TEST_REGEXP.lastIndex = 0; return TEST_REGEXP.test(value); }
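As a rough sketch of the second suggestion (resetting lastIndex), here is the function from the post with the g flag left in place. The `results` variable is only there for illustration and is not part of the original code.

```javascript
// Same pattern as in the post, still with the g flag.
const TEST_REGEXP = /[a-z0-9]+_[a-z0-9]+/gi;

function isValidName(value) {
  if (typeof value !== 'string') return false;

  // Reset the state stored on the regex so every call scans from index 0.
  TEST_REGEXP.lastIndex = 0;
  return TEST_REGEXP.test(value);
}

const filenames = ["test_1", "test_1", "test_2", "other_test", "some_file"];

// `results` is a hypothetical helper variable for this sketch.
const results = filenames.map((name) => isValidName(name));
console.log(results); // [ true, true, true, true, true ]
```

Dropping the g flag is probably the simpler fix here, since nothing in this code relies on resuming a search.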

𝒎Wii 🏳️‍⚧️ • Edited

Here I was, looking for an error in the actual regex xD

I was not even aware of this weird behaviour, and would probably have written it as

/* line 6 */ return value.match(TEST_REGEXP)

instead, and anchored the regex to beginning and end, of course :D
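A minimal sketch of that variant, assuming the anchored pattern and keeping the g flag on purpose: String.prototype.match with a global regex starts from index 0 on every call, so the stale lastIndex does not leak between calls. Note that match returns an array or null rather than a boolean, so the sketch compares against null. The `results` variable is hypothetical.

```javascript
// Anchored pattern; the g flag is kept only to show that match() is unaffected by it.
const TEST_REGEXP = /^[a-z0-9]+_[a-z0-9]+$/gi;

function isValidName(value) {
  if (typeof value !== 'string') return false;

  // match() returns an array of matches or null, so convert it to a boolean.
  return value.match(TEST_REGEXP) !== null;
}

const filenames = ["test_1", "test_1", "test_2", "other_test", "some_file"];

// `results` is a hypothetical helper variable for this sketch.
const results = filenames.map((name) => isValidName(name));
console.log(results);                 // [ true, true, true, true, true ]
console.log(isValidName("bad name")); // false
```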


And for the actual answer:

--- before.js   2021-11-03 16:31:00.256761809 +0100
+++ after.js    2021-11-03 16:30:46.961086329 +0100
@@ -1,4 +1,4 @@
-const TEST_REGEXP = /[a-z0-9]+_[a-z0-9]+/gi;
+const TEST_REGEXP = /^[a-z0-9]+_[a-z0-9]+$/i;

 function isValidName(value) {
     if (typeof value !== 'string') return false;
Keff

Fantastic, thanks for the solutions! I'd go with removing the global flag. The other solution is interesting though, I did not think of resetting lastIndex manually!

Siddharth • Edited

Got it! Because with the g flag enabled, .test() resumes from where the previous match ended and only tests the rest of the string!

Siddharth

Just remove the g flag to fix

Keff

There you go, you found it!! Kinda weird if you don't expect that behaviour, right?

Siddharth

It is! I never knew that till today!

Keff

Nice, that's wonderful! That's what I want this series to be about, so I'm glad you learned something new!

𝒎Wii 🏳️‍⚧️

I also didn't know this. It's probably very useful when building a parser, but when you don't expect it, it's a super confusing error 😖

Keff

Yup, I was super confused when I first encountered this error. I found it while writing some unit tests for work, and lost a bit of time trying to figure it out... just for the issue to be a silly little flag xD

Mrinmay Mukherjee

With the g (global) flag, .test() and .exec() store the index just past the end of the match in the regex's lastIndex property. If the pattern does not match, lastIndex is reset to 0.

If the previous search was successful, the next search starts at the stored index, even if the string is different, because the state lives on the regular expression object and not on the string. I encountered this bug while writing an input validator. Took me half a day to figure this out. 😅😅

Here is the explanation from the docs

So for the above code:

  1. 'test_1' matches. So the stored index is 6.
  2. 'test_1' does not match, since the pattern matching starts at index 6, which leaves just an empty string. So the stored index becomes 0.
  3. 'test_2' matches. So the stored index becomes 6.
  4. 'other_test' does not match, since the search starts at index 6 and the remaining "test" has no underscore. So the stored index becomes 0.
  5. 'some_file' matches.

Thus the output is:

true
false
true
false
true
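To watch this happen, a small sketch like the one below records lastIndex before and after each call. The `trace` array and its field names are made up for this illustration; only TEST_REGEXP and the filenames come from the post.

```javascript
const TEST_REGEXP = /[a-z0-9]+_[a-z0-9]+/gi;
const filenames = ["test_1", "test_1", "test_2", "other_test", "some_file"];

// `trace` is a hypothetical helper: one record per call to test().
const trace = filenames.map((name) => {
  const startedAt = TEST_REGEXP.lastIndex;   // where this search begins
  const matched = TEST_REGEXP.test(name);    // advances or resets lastIndex
  return { name, startedAt, matched, lastIndex: TEST_REGEXP.lastIndex };
});

console.table(trace);
// name          startedAt  matched  lastIndex
// test_1        0          true     6
// test_1        6          false    0
// test_2        0          true     6
// other_test    6          false    0
// some_file     0          true     9
```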
Keff

Great breakdown, thanks for taking the time to explain it in such detail!!

Eljay-Adobe

Tricky. Just needed to remove the g. Now it's a ReExp.

lionel-rowe

2nd bug no-one has mentioned yet: the regex matches anywhere in the string, so for example, isValidName('叽叽喳喳 o_O 咔嚓咔嚓咔嚓') is true, because the o_O in the middle matches.
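A quick sketch of the difference, using a made-up ASCII input and two hypothetical constants, LOOSE and STRICT, that are not in the original post. Both drop the g flag so that lastIndex plays no part in the comparison.

```javascript
// LOOSE is the post's pattern without g; STRICT adds start and end anchors.
const LOOSE = /[a-z0-9]+_[a-z0-9]+/i;
const STRICT = /^[a-z0-9]+_[a-z0-9]+$/i;

// Hypothetical input: a valid-looking name surrounded by other characters.
const input = "-- o_O --";

console.log(LOOSE.test(input));   // true, the o_O in the middle is enough
console.log(STRICT.test(input));  // false, the whole string must match
console.log(STRICT.test("o_O"));  // true
```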

Keff

Good catch, that's my mistake. When simplifying the example for this post I must've missed that.

Lars Grammel

Great riddle!

Adam Crockett 🌀

A little note on the TypeScript in this file: it might as well not be there, because it's better to state what type is expected than to guard against it and fall through to a runtime check.

Keff • Edited

Yup, good point, my bad. I found this bug in a TypeScript project, so I thought I'd include it as TS, but I simplified the example so much that it does not make sense to have it now!

I will fix the type to make it more relevant (I've removed the any type actually).

Adam Crockett 🌀

false
false
false
true
true
🤷‍♂️

Keff

Good try, but no, the result is weirder!!! 🤔

jk-kaluga

just remove the g flag

Keff

Nice! There you go

Siddharth

I'm really close... But can't seem to get it! (Don't tell me)

Keff

I won't 😜