DEV Community

How I developed a captcha cracker for my University's website

Priyansh Jain on April 06, 2018

Hello again! Consider this a spinoff of my original article. I had some requests from the readers to explain how I developed the parser, and hen...
Peter Kim Frank

This was a really fun and entertaining read! I really appreciated the step-by-step breakdown for how you thought about the problem, identified some key insights, and then implemented a solution. Thank you for sharing :)

Priyansh Jain

Thank you!

Lauri Elias

Good to see you didn't run to TensorFlow right away. Sometimes you just don't need a billion samples and a 16-layer neural net.

Rafal Pienkowski

Very nice article.
When I was studying, I really liked my classes on signal processing. Reading your article, I felt as if I had gone back in time. 😁
I'm not sure if you have heard of salt-and-pepper noise? It refers to the stray dots on an image. There are several techniques for removing such noise, like the median filter. Maybe it would give even better results.

Thank you for your post.
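
For readers who have not met the median filter mentioned above, here is a minimal sketch in plain Python. It assumes the image is a grayscale list of rows with 0 for ink and 255 for background; the function name `median_filter` and the `radius` parameter are illustrative and not taken from the article's code.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Return a filtered copy of a 2D grayscale image (list of rows).

    Each pixel is replaced by the median of the square window of side
    2 * radius + 1 centred on it; the window is clipped at the borders.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]  # leave the input untouched
    for r in range(height):
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            # median_low always returns a value that occurs in the window,
            # so a black-and-white image stays black-and-white.
            result[r][c] = median_low(window)
    return result


# A single stray dark pixel on a white background is removed.
noisy = [[255] * 5 for _ in range(5)]
noisy[2][2] = 0
cleaned = median_filter(noisy)
```

Isolated dots vanish because they are always outvoted by their neighbours, while thicker strokes mostly survive. A larger radius removes bigger specks but also erodes thin parts of the letters.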

Priyansh Jain

Thanks!
Yes, I know about the median filter. I very recently had a semester-long course in image processing. I had implemented region growing with this, to get great noise-free skeletons. Will try with the median filter!

Priyansh Jain • Edited

Got this result with the median filter, using a 2px radius.

Rafal Pienkowski

I thought it would be better 🤔 Thanks for your time, I appreciate it.

balaji radhakrishnan

Image processing is a beautiful subject. With no knowledge about it, I did my final-year project on it. The fun part is that the algorithm you build will usually already exist, which gives you the good feeling that you are doing something awesome. Good luck :)

Priyansh Jain

So true.

Mārtiņš Briedis

After the clean-up step, did you try a simple OCR approach?

Greg Chabala

I appreciate the detail in the article, but that captcha looks so trivial that I bet off-the-shelf OCR libraries could handle it without any preprocessing.

Priyansh Jain

Sorry for the late reply, but yeah, they most definitely would. I just wanted to do something really new because I was new to programming, and this was actually something that wasn't taught in class. Felt nice.

Mark Johnson 👔

Thanks for writing this up! I enjoyed reading about how you solved the problem, and your explanation was super clear and helpful :) I'm amazed how easy this was to accomplish... I would have expected clearing the lines to be more difficult. What do you think you would have done if they were pure black like the letters?

Priyansh Jain • Edited

Actually, it's cases like these where picking the best skeleton out of all the skeletons comes in handy.

Priyansh Jain

I would have eliminated lines of single-pixel thickness by checking the top and bottom pixels, I guess.
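
The idea of checking the top and bottom pixels could be sketched roughly as follows. This is only an illustration under stated assumptions: the image is a list of rows with 0 for ink and 255 for background, and the function name `remove_thin_horizontal_lines` is hypothetical rather than part of the article's script.

```python
def remove_thin_horizontal_lines(image, ink=0, background=255):
    """Erase ink pixels whose neighbours directly above and below are background.

    Such pixels belong to strokes that are exactly one pixel thick
    vertically, which is typical of thin horizontal noise lines.
    The first and last rows are skipped because they lack one neighbour.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]  # decisions are made on the original
    for r in range(1, height - 1):
        for c in range(width):
            if (
                image[r][c] == ink
                and image[r - 1][c] == background
                and image[r + 1][c] == background
            ):
                result[r][c] = background
    return result


# A thin horizontal line (row 2) crossing a vertical letter stroke (column 1).
sample = [[255] * 5 for _ in range(5)]
for c in range(5):
    sample[2][c] = 0
for r in range(5):
    sample[r][1] = 0
without_line = remove_thin_horizontal_lines(sample)
```

The pixel where the line crosses the stroke is kept, since it has ink above and below. The trade-off is that genuinely thin horizontal parts of a letter would be erased too, so this only makes sense when the letters are thicker than the noise lines.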

Hussain Tamboli

Nicely explained! I was also trying to do a similar thing some time ago using tesseract-ocr. It was not that accurate though :)

Sean Killeen

Hey! I've noticed that in this post you use "guys" as a reference to the entire community, which is not made up of only guys but a variety of community members.

I'm running an experiment and hope you'll participate. Would you consider changing "guys" to a more inclusive term? If you're open to that, please let me know when you've changed it and I'll delete this comment.

For more information and some alternate suggestions, see dev.to/seankilleen/a-quick-experim....

Thanks for considering!

Ash • Edited

This is why you use Google's reCAPTCHA :) The old traditional captcha is worthless nowadays; you might as well not have it, because it won't make much of a difference.

5bentz

You may be surprised at the effectiveness of such simple CAPTCHAs.
On a website of mine, a CAPTCHA has neutralized spam messages. The CAPTCHA is more elaborate than this one, but still... Bad CAPTCHAs are better than nothing ;)

fuadzulfikar29 • Edited

Hi, I ran into a problem when running your script. Could you help me? For the picture/captcha I am sending you, which code in the script should I change?
(I can make the bitmaps myself later.)

thepracticaldev.s3.amazonaws.com/i...

Basani Rakesh • Edited

What if the lines in the captcha are pure black as well? Could you help remove the unnecessary lines in the captcha and make the blurred letters normal?

thepracticaldev.s3.amazonaws.com/i...

Faizan ALi • Edited

Hi, in this code I am getting a LIST INDEX OUT OF RANGE error on this line:

    if pixel_matrix[row, column] == 0 \
        and pixel_matrix[row, column - 1] == 255 and pixel_matrix[row, column + 1] == 255

Also, could you please explain how to create the bitmaps.json file?
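
One plausible cause of an out-of-range error on that line is that `column - 1` or `column + 1` steps outside the image, or that the image is smaller than the loop bounds assume. The sketch below keeps every access in range. It assumes `pixel_matrix` is indexable with a pair of integers as in the quoted line; the function name `find_thin_pixels` and its size parameters are hypothetical. If `pixel_matrix` came from Pillow's `Image.load()`, note that it is indexed as `[x, y]` and that `image.size` is `(width, height)`, so `first_size` would be the width and `second_size` the height.

```python
def find_thin_pixels(pixel_matrix, first_size, second_size):
    """Return the (first, second) positions matching the quoted condition.

    The second index runs from 1 to second_size - 2 so that both
    second - 1 and second + 1 are always valid positions.
    """
    hits = []
    for first in range(first_size):
        for second in range(1, second_size - 1):
            if (
                pixel_matrix[first, second] == 0
                and pixel_matrix[first, second - 1] == 255
                and pixel_matrix[first, second + 1] == 255
            ):
                hits.append((first, second))
    return hits


# Stand-in for an image: a dict keyed by index pairs, which raises KeyError
# on any out-of-range access, so a clean run shows the bounds are safe.
first_size, second_size = 4, 3
fake_pixels = {(a, b): 255 for a in range(first_size) for b in range(second_size)}
fake_pixels[(2, 1)] = 0
found = find_thin_pixels(fake_pixels, first_size, second_size)
```

Deriving the loop limits from the actual image size, rather than hard-coding them, also protects against captchas whose dimensions differ from the ones the script was written for.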

Thomas Bnt ☕

Woah nice !

vishalsharma95570

Priyansh Jain, I need a little help from you. Could you send me your email address or ping me at vishalsha95570@gmail.com?

Shalvah

Brilliant!

Priyansh Jain • Edited

Thank you!

Piotr Słupski

Awesome work!

A great read, very nicely solved problem, super work :)

Thanks for this article!!!

paalavi

What about captchas whose character width is not a fixed value, like the captcha at this URL?

irsherkat.ssaa.ir/Captcha/Captcha....

Faizan ALi

Could you please help me

laoquocthaivhu

Please help me with this captcha

laoquocthaivhu