🤪 Tell me how to build a duplicate detection system!

#javascript #discuss #webdev #productivity

I'm working on a linter right now, and one of the requested features is code duplication detection. I've already opened an issue for it, but now I need to actually start working on it, and that's where my question comes in.

One option is to detect duplicates based on plain text. This is how most systems work because it is the simpler of the two options, but it is also the more failure-prone. For instance, if the exact same code appeared in two places but one copy had a comment in the middle, it would not register as a duplicate.
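To make the plain-text idea concrete, here is a minimal sketch, assuming duplicates are defined as runs of identical consecutive lines after trimming whitespace. The function name `findDuplicateBlocks` and the `windowSize` parameter are hypothetical, not taken from the linter.

```javascript
// Hypothetical helper: report windows of `windowSize` consecutive non-blank
// lines whose trimmed text has already been seen earlier in the source.
function findDuplicateBlocks(source, windowSize = 3) {
  // Normalize: trim each line and drop blank ones, keeping original line numbers.
  const lines = source
    .split("\n")
    .map((text, index) => ({ text: text.trim(), line: index + 1 }))
    .filter((entry) => entry.text !== "");

  const seen = new Map(); // window text -> line where it first started
  const duplicates = [];

  for (let i = 0; i + windowSize <= lines.length; i++) {
    const key = lines
      .slice(i, i + windowSize)
      .map((entry) => entry.text)
      .join("\n");
    if (seen.has(key)) {
      duplicates.push({ firstLine: seen.get(key), duplicateLine: lines[i].line });
    } else {
      seen.set(key, lines[i].line);
    }
  }
  return duplicates;
}

// Usage: the same three statements appear twice, starting at lines 1 and 5.
const sample = ["a();", "b();", "c();", "x();", "a();", "b();", "c();"].join("\n");
console.log(findDuplicateBlocks(sample));
```

Putting a `// note` line between `a();` and `b();` in the second copy makes every window differ, so nothing is reported. That is exactly the weakness described above.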

Alternatively, I can use an abstract syntax tree to detect the duplicates. But there's another problem there: what is the most lightweight and all-around best JavaScript parser out there? I'm planning on using the Babel parser, but I'm already running into a problem because it doesn't handle comments the way I would like.

So, if you have an opinion on what I should do, please leave a comment below. Also, please star the project and contribute if you have time. If you can, that would be amazing, and I thank you so much!

Top comments (3)

Shrihan • Jan 2 '22

Let me add something: if comments are a roadblock for deduplication via plain text, why not ignore them? For example, skip lines starting with // and blocks surrounded by /* */.
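Here is a rough sketch of that idea, assuming a small hand-written scanner rather than a real parser. The name `stripComments` is hypothetical. It skips string literals so that something like "http://example.com" is not mistaken for a comment, but it does not handle regex literals or expressions nested inside template strings.

```javascript
// Hypothetical helper: remove // and /* */ comments, leaving string contents alone.
// Simplification: regex literals and ${...} inside template strings are not handled.
function stripComments(source) {
  let out = "";
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];
    if (ch === '"' || ch === "'" || ch === "`") {
      // Copy the string literal verbatim, honoring backslash escapes.
      let j = i + 1;
      while (j < source.length && source[j] !== ch) {
        if (source[j] === "\\") j++;
        j++;
      }
      out += source.slice(i, j + 1);
      i = j + 1;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip up to, but not including, the newline.
      while (i < source.length && source[i] !== "\n") i++;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing */, or to the end if unterminated.
      const end = source.indexOf("*/", i + 2);
      i = end === -1 ? source.length : end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

// Usage: the URL inside the string survives, the comments do not.
console.log(stripComments('const url = "http://x"; // note\nfoo(); /* block */ bar();'));
```

Running the source through something like this before comparing text would let two copies that differ only in comments match, though leftover whitespace still needs to be normalized afterwards.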