This post has been archived and you can now find it on my personal blog at https://www.marcusturewicz.com/blog/speed-up-regex-performance-with-dotnet-5.
Top comments (2)
Thanks for the nice write-up, Marcus. One issue I noticed in your final benchmark on Regex: Regex.Matches is lazy, meaning it doesn't actually execute the regex until it needs to. When you run Regex.Matches, it's just fetching the relevant regex from the cache (or parsing/compiling it if it can't find it) and returning the collection object that will let you retrieve all the matches, but it hasn't done any matching yet. It's only when you iterate the collection, ask for its Count, index into it, etc., that it computes as little as it needs to in order to answer your question (e.g. if you ask for matches[2], it'll ensure it has the 0th, 1st, and 2nd matches, but it needn't go beyond that yet). So in your benchmark, it's not actually running the regex at all. That's also why, even though the three regexes being tested have various levels of complexity, the benchmarks are all coming back with approximately the same answer. I'd suggest redoing the benchmark with .Count appended to each Matches call, or something like that.

Ahh thanks Stephen! I did wonder why I was not seeing the 3-6x speed up that you were... I just thought the benchmark I had chosen was not a good candidate, but clearly I've got some reading to do on Regex! I've now updated to include Count and the results look much better.
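
For readers following the thread, here is a minimal sketch of the laziness described above. It is not the benchmark from the original post; the class name, input string, and pattern are made up purely for illustration. It contrasts a call that only creates the MatchCollection with one that forces matching through Count.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyMatchesDemo
{
    // Hypothetical input and pattern, chosen only for illustration.
    private const string Input = "alpha beta gamma delta";
    private const string Pattern = @"\w+";

    // Only creates the MatchCollection. As explained in the comment above,
    // no matching is performed until the collection is actually used.
    public static MatchCollection CreateOnly() => Regex.Matches(Input, Pattern);

    // Reading Count forces the collection to find every match.
    public static int CreateAndCount() => Regex.Matches(Input, Pattern).Count;

    public static void Main()
    {
        MatchCollection lazy = CreateOnly();

        // Indexing needs only the matches up to and including index 1.
        Console.WriteLine(lazy[1].Value);     // prints "beta"

        // Count has to run the regex over the whole input.
        Console.WriteLine(CreateAndCount());  // prints 4
    }
}
```

In a benchmark, a method shaped like CreateOnly would mostly measure cache lookup and object creation, whereas one shaped like CreateAndCount includes the matching work itself, which is presumably what the comparison was meant to measure.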