DEV Community

Cover image for Decoding captcha created by library svg-captcha
Sushrut Kasture
Sushrut Kasture

Posted on • Updated on

Decoding captcha created by library svg-captcha

The Cowin portal set up for booking vaccination slots in India has (had?) a captcha to make sure bots are not able to book slots. They used an open source library called svg-captcha to generate the captcha.

I created a chrome extension to speed up the booking process by auto filling inputs which will mostly be constant - and I included piece of code in it which could read the captcha and decode it.

In this post I will be explaining how that captcha decoder works!

Now that my github README is gone where credits for captcha decode were visible, I would like to give credit to @ayushchd for the idea. I used the captcha decode code snippet from his extension "Cowin Bot". His idea was the driver to this analysis I did to understand the logic behind the decode.

What is svg?

SVG is scalable vector graphics. SVG image is made by defining a starting point, lines drawn from one point to another, curves and fill colours! It is simple collection of definitions of all these attributes.

I will explain how we can draw the character "L" with svg.

Screenshot 2021-06-06 at 11.17.10 AM

Numbers in red are the co-ordinates of points that make the shape "L"
Numbers in blue are line numbers and these will help you understand which line we are drawing when we do as I have referenced below

How is "L" made?

The shape "L" we are drawing is made of following

  1. Starting point at (150,0)
  2. A line from (150,0) to (150,300) [1]
  3. A line from (150,300) to (300,300) [2]
  4. A line from (300,300) to (300,250) [3]
  5. A line from (300,250) to (200,250) [4]
  6. A line from (200,250) to (200,0) [5]
  7. A line from (200,0) to (150,0) [6]

Numbers enclosed between "[" and "]" correspond to line numbers in the figure above

Starting point (origin)

To draw any new shape, we define a starting point - relative to (0,0) for this particular shape. We define that by writing "M" (M-command or moveto) followed by absolute point on a 2d plane. So, if we want to leave some margin on left of the shape, we would start at (150, 0) [this is in pixels] - we define origin as

M150 0
Enter fullscreen mode Exit fullscreen mode

'M' followed by a point denotes the starting point of the shape

Line from (x1, y1) to (x2, y2)

To achieve the shape above, we will have to draw multiple lines. For example we have to draw a line from our starting point (150,0) to (150,300).

To understand how to draw a line, we need to know that we can only draw a line from the point where our cursor currently is to an ending point that we define (I know this is a line segment and not a line, because line extends to infinity and blah, but for sake of simplicity, let us call it as a line). For example, as we define a starting point with 'M', the cursor moves to the starting point.

Therefore, to draw the line[1] from (150,0) to (150,300), as we defined the starting point, our cursor is already at (150,0). We now just have to define the ending point for line [1].
We do this with "L" (L-command or lineto)

L150 300
Enter fullscreen mode Exit fullscreen mode

Thus we have drawn a line form (150,0) to (150,300).
In similar fashion we can draw other lines too

  L300 300 [2]
  L300 250 [3]
  L200 250 [4]
  L200 0 [5]
  L150 0 [6]

Enter fullscreen mode Exit fullscreen mode

Numbers in the square brackets are just for our information and are not written in the actual svg definition

[There is some other way to do this. Instead of always specifying endpoints of lines, if we know that we just want to move vertically or horizontally, we can use "H" (H-command or horizontal lineto) and "V" (V-command or vertical lineto) commands to just define length of that line segment, but let us save that for sometime later for now.]

Writing these as svg

I explained how we can define points and lines, we need to tell the processor or parser that these are attributes of an SVG.

All we discussed is part of "path" element of svg and those points and lines are values for the "d" attribute of the path element.

Thus we define the full svg as:

<svg xmlns="http://www.w3.org/2000/svg" height="500" width="400" >
    <path d="
        M150 0
        L150 300
        L300 300
        L300 250
        L200 250
        L200 0
        L150 0
    "/>
</svg>
Enter fullscreen mode Exit fullscreen mode

[You should read more about the lowercase variants of "L", "H" and "V" to understand what they do and when to use them!]
[There are few more commands like "C", "Q", "Z" which you can read more about]
[There is a command Z which defines a line segment from the cursor to origin "M", we could use it above by replacing L150 0 with Z]

Now if you take the svg above and put it in html file, the html file in the browser will show the shape "L"! (like in the figure above, except for the annotations)

How the library creates captcha?

The svg-captcha library which I mentioned above has shapes of alphanumeric characters defined in .ttf file.

The font file is loaded in the library and with openType.js library the ASCII characters are converted to "path" - as we discussed above - these paths are basically the values of d attribute of path element in the svg - for example, "S" is created as

Screenshot 2021-06-06 at 12.26.47 PM

<svg xmlns="http://www.w3.org/2000/svg" height="300" width="300">
  <path d="M70.85 40.26L70.90 40.31L70.75 40.15Q67.42 40.40 65.78 39.87L65.80 39.89L65.74 39.83Q63.71 39.14 63.41 35.75L63.48 35.82L64.91 34.74L64.81 34.63Q65.61 34.18 66.37 33.68L66.42 33.73L66.43 33.74Q66.03 35.52 67.63 36.73L67.69 36.79L67.72 36.83Q68.92 37.76 71.05 37.57L71.04 37.55L71.11 37.62Q75.53 37.24 75.34 33.70L75.31 33.67L75.25 33.62Q75.20 31.47 72.57 30.25L72.56 30.24L72.54 30.22Q69.71 29.18 67.35 27.96L67.36 27.98L67.23 27.84Q64.81 26.64 63.82 21.96L63.76 21.89L63.74 21.88Q63.65 21.45 63.54 20.69L63.58 20.73L63.51 20.66Q63.51 19.97 63.59 19.40L63.64 19.46L63.61 19.43Q63.77 17.83 65.02 17.37L64.96 17.31L64.92 17.28Q67.33 16.41 71.33 16.60L71.41 16.69L71.30 16.57Q73.28 16.83 74.08 16.91L73.93 16.77L73.93 16.77Q75.50 17.04 76.57 17.50L76.61 17.54L76.57 17.51Q78.59 17.88 78.82 20.36L78.94 20.48L78.81 20.35Q77.85 21.14 75.65 22.48L75.69 22.52L75.57 22.41Q75.18 19.43 70.81 19.43L70.76 19.39L70.79 19.42Q68.93 19.46 67.94 20.14L68.01 20.21L67.88 20.08Q66.80 20.68 67.03 22.47L67.05 22.48L67.05 22.49Q67.22 24.60 70.19 26.12L70.25 26.18L70.27 26.20Q70.68 26.31 75.18 27.95L75.15 27.92L75.27 28.04Q77.91 29.54 78.33 33.91L78.20 33.78L78.25 33.84Q78.21 33.95 78.29 35.21L78.32 35.24L78.44 35.36Q78.39 38.01 76.83 39.12L76.94 39.23L76.88 39.17Q74.93 40.03 70.74 40.15ZM73.13 42.61L73.17 42.66L73.20 42.68Q74.50 42.62 76.48 42.62L76.48 42.61L76.62 42.75Q78.73 42.77 79.98 42.35L79.84 42.20L79.82 42.19Q81.19 41.50 81.11 39.71L81.03 39.62L81.05 39.65Q81.01 38.65 80.63 36.59L80.60 36.57L80.65 36.62Q79.66 31.86 77.57 29.99L77.61 30.04L77.59 30.01Q76.86 28.48 75.52 27.76L75.46 27.69L70.14 25.61L70.12 25.59Q69.82 25.45 69.37 25.22L69.40 25.25L69.25 24.84L69.22 24.46L69.37 24.61Q69.11 23.09 70.25 22.48L70.23 22.47L70.34 22.57Q70.97 21.87 72.68 21.68L72.85 21.85L72.78 21.78Q73.86 21.60 75.00 22.06L74.96 22.02L74.96 22.02Q75.03 22.12 75.22 22.96L75.32 23.06L75.26 23.00Q75.41 22.82 75.91 22.59L75.93 22.61L75.91 22.59Q76.75 23.62 76.86 24.76L76.83 24.72L76.83 24.72Q76.98 24.76 80.63 22.17L80.54 22.08L80.62 22.16Q80.40 19.58 78.95 18.89L78.94 18.88L78.86 18.80Q78.36 17.61 76.91 17.08L76.90 17.06L76.83 17.00Q74.63 16.28 71.43 16.28L71.39 16.24L71.43 16.28Q66.35 16.12 64.49 16.80L64.56 16.87L64.65 16.96Q63.29 17.39 63.18 19.07L63.16 19.06L63.15 19.04Q63.04 19.50 63.46 21.71L63.61 21.86L63.51 21.77Q64.11 25.45 66.28 27.70L66.43 27.84L66.46 27.87Q67.26 29.58 68.76 30.27L68.77 30.27L68.73 30.24Q70.11 30.80 74.03 32.36L74.03 32.36L74.10 32.46L74.76 32.82L74.84 32.94L74.77 32.87Q74.98 33.35 75.02 33.69L75.02 33.69L74.88 33.55Q75.08 37.14 71.16 37.29L71.22 37.35L71.06 37.19Q69.98 37.29 68.68 36.91L68.69 36.91L68.72 36.94Q68.44 36.17 68.44 35.44L68.30 35.31L68.36 35.37Q68.43 35.17 68.47 34.90L68.43 34.87L68.47 34.91Q68.00 35.09 67.20 35.58L67.17 35.55L67.15 35.53Q66.75 34.52 66.90 33.15L66.83 33.07L66.82 33.06Q64.72 34.24 63.12 35.65L63.10 35.63L63.16 35.69Q63.26 36.55 63.34 37.58L63.33 37.57L63.31 37.55Q63.68 39.33 64.97 40.09L64.95 40.06L64.95 40.07Q66.16 41.89 68.71 42.27L68.66 42.22L68.57 42.12Q70.14 42.37 73.07 42.56Z" />
</svg>
Enter fullscreen mode Exit fullscreen mode

Now if you look at the library clearly, it also adds noise in terms of lines which are drawn over the captcha - for example

Screenshot 2021-06-06 at 12.31.40 PM

<svg xmlns="http://www.w3.org/2000/svg" width="150" height="50" viewBox="0,0,150,50"><path fill="#333" d="M64.89 40.36L64.86 40.33L64.86 40.33Q61.98 36.54 59.36 27.90L59.36 27.90L59.29 27.83Q58.61 25.67 57.81 23.54L57.90 23.63L55.05 32.42L55.07 32.44Q53.17 37.29 50.78 40.52L50.80 40.55L50.68 40.42Q49.88 40.54 48.24 40.69L48.26 40.71L48.24 40.69Q48.45 39.61 48.45 38.35L48.32 38.22L48.45 38.34Q48.43 32.32 45.69 26.19L45.73 26.22L45.58 26.07Q42.48 19.20 36.61 14.21L36.69 14.29L36.75 14.36Q38.87 15.03 41.23 15.56L41.07 15.40L41.19 15.52Q49.82 23.92 51.12 34.89L51.10 34.87L51.05 34.82Q52.79 31.61 54.24 26.25L54.15 26.16L54.14 26.14Q56.16 18.69 56.66 17.28L56.74 17.36L59.02 17.40L58.93 17.31Q59.73 19.44 60.37 21.76L60.32 21.71L61.43 26.01L61.60 26.18Q63.08 31.55 64.49 34.78L64.45 34.75L64.58 34.88Q66.23 23.24 74.15 15.89L74.06 15.80L74.02 15.76Q75.61 15.49 78.58 14.84L78.50 14.76L78.63 14.88Q73.26 19.15 70.22 25.28L70.28 25.34L70.27 25.34Q67.08 31.47 67.08 38.25L67.22 38.38L67.07 38.24Q67.19 39.46 67.26 40.60L67.20 40.54L66.12 40.49L66.12 40.49Q65.51 40.41 64.94 40.41ZM71.21 43.33L71.16 43.28L71.19 43.31Q69.86 39.66 69.97 35.77L69.99 35.79L70.00 35.80Q70.37 24.29 79.74 16.03L79.64 15.94L79.74 16.03Q78.69 16.16 76.75 16.70L76.79 16.74L76.88 16.83Q77.76 15.93 79.67 14.14L79.62 14.09L79.72 14.19Q77.00 14.90 74.07 15.39L74.00 15.32L74.05 15.37Q66.56 22.50 64.65 31.98L64.59 31.91L64.69 32.01Q64.12 30.38 61.27 18.92L61.22 18.88L61.34 18.99Q60.74 18.89 59.79 18.89L59.96 19.06L59.52 17.86L59.50 17.84Q59.39 17.31 59.16 16.81L59.30 16.95L56.25 16.80L56.26 16.80Q55.60 19.42 54.32 24.54L54.26 24.48L54.22 24.43Q53.03 29.65 52.04 32.12L51.92 31.99L52.03 32.11Q50.37 24.13 44.81 17.81L44.93 17.93L44.87 17.86Q44.37 17.79 43.42 17.60L43.36 17.54L43.40 17.57Q42.62 16.68 41.05 15.04L41.21 15.20L41.14 15.13Q37.65 14.23 35.48 13.43L35.65 13.60L35.62 13.56Q41.28 18.23 44.55 24.44L44.44 24.33L44.58 24.47Q48.00 31.05 48.00 38.05L47.88 37.93L47.92 37.97Q47.92 39.53 47.73 41.13L47.86 41.26L47.83 41.23Q47.97 41.14 48.42 41.04L48.50 41.12L48.41 41.03Q48.85 40.91 49.07 40.91L49.13 40.97L49.24 42.11L49.22 42.08Q49.20 42.53 49.28 43.06L49.17 42.95L49.21 42.99Q50.38 42.94 52.66 42.79L52.65 42.78L52.57 42.70Q56.31 37.42 59.05 28.01L59.10 28.06L59.05 28.01Q61.82 36.80 64.79 40.75L64.68 40.64L64.71 40.67Q65.29 40.80 66.13 40.92L66.07 40.85L66.04 40.83Q66.87 41.84 67.93 42.91L68.00 42.98L67.97 42.94Q68.78 43.03 71.18 43.30Z"/><path fill="#222" d="M26.73 41.05L26.85 41.17L25.07 41.10L25.16 41.19Q21.71 41.13 20.76 39.00L20.84 39.09L20.76 39.00Q22.03 38.07 23.48 36.96L23.46 36.95L23.44 36.92Q24.03 39.11 26.88 38.96L26.91 38.98L26.87 38.94Q27.71 38.95 28.74 38.72L28.68 38.66L28.74 38.72Q29.63 38.17 29.56 37.21L29.47 37.12L29.49 37.15Q29.35 35.98 27.56 35.30L27.70 35.44L23.81 33.83L23.74 33.75Q21.52 32.53 21.17 28.99L21.15 28.96L21.15 28.96Q20.89 26.76 23.06 26.19L23.04 26.17L23.16 26.29Q24.09 26.04 27.06 26.04L27.03 26.01L27.06 26.04Q31.99 26.09 32.90 28.61L32.74 28.45L32.79 28.50Q32.19 29.07 31.46 29.53L31.45 29.51L30.09 30.52L30.10 30.53Q29.52 28.53 26.28 28.30L26.23 28.26L26.29 28.31Q25.83 28.46 24.80 28.88L24.74 28.82L24.81 28.89Q24.07 29.07 24.07 30.17L24.24 30.34L24.15 30.24Q24.43 31.25 26.26 31.94L26.23 31.91L26.38 32.06Q27.33 32.32 29.99 33.42L30.02 33.45L30.11 33.54Q31.82 34.26 32.01 36.81L32.09 36.90L32.03 36.83Q32.16 37.53 32.08 38.52L32.05 38.49L32.01 38.45Q32.10 39.45 31.52 40.10L31.41 39.98L31.48 40.05Q29.77 41.04 26.72 41.04ZM31.51 43.66L31.52 43.67L31.61 43.76Q32.55 43.82 33.92 43.52L33.84 43.44L33.93 43.53Q35.03 42.88 34.84 41.66L34.68 41.50L34.84 41.66Q34.60 40.92 34.29 39.32L34.43 39.47L34.36 39.39Q33.87 36.50 32.08 35.28L32.04 35.24L32.12 35.32Q31.43 33.75 30.17 33.03L30.33 33.19L30.21 33.08Q28.97 32.67 26.50 31.72L26.53 31.75L26.60 31.82Q26.73 31.35 27.00 31.23L26.97 31.21L26.89 31.12Q27.49 30.66 28.14 30.62L28.25 30.74L28.19 30.67Q28.93 30.58 29.70 30.84L29.73 30.88L29.73 30.96L29.92 31.03L30.03 30.95L30.03 30.99L30.02 30.98Q31.10 31.45 31.29 32.71L31.32 32.74L31.49 32.90Q32.61 31.93 34.51 30.26L34.54 30.28L34.46 30.20Q34.35 29.63 33.36 28.45L33.38 28.47L33.22 28.31Q32.12 25.65 27.01 25.54L27.01 25.53L27.11 25.63Q23.81 25.46 21.83 26.03L21.95 26.15L21.96 26.15Q20.42 26.59 20.65 28.69L20.69 28.73L20.75 28.79Q20.85 30.03 21.76 32.12L21.78 32.15L21.78 32.14Q22.29 33.22 23.31 33.94L23.39 34.01L23.33 33.96Q23.93 35.35 25.22 36.04L25.34 36.16L25.30 36.12Q26.18 36.31 27.13 36.69L27.12 36.68L29.14 37.56L29.11 37.53Q28.78 38.65 26.84 38.53L26.74 38.43L26.84 38.54Q26.31 38.50 25.09 38.19L25.21 38.31L25.09 38.04L25.00 38.18L25.03 38.21Q24.20 37.88 23.55 36.39L23.50 36.33L23.52 36.35Q21.44 37.96 20.33 39.11L20.20 38.97L20.18 38.95Q20.52 39.90 21.54 40.70L21.65 40.81L21.45 40.91L21.38 40.84Q22.94 43.09 27.17 43.47L27.05 43.35L27.07 43.37Q28.41 43.57 31.65 43.80Z"/><path fill="#333" d="M116.76 18.55L116.71 18.50L116.62 18.41Q116.01 18.90 115.86 19.70L115.83 19.67L115.73 24.71L115.82 24.80Q116.85 24.69 119.06 24.42L119.15 24.51L119.18 24.54Q119.14 24.96 119.10 25.80L119.09 25.79L119.08 25.78Q118.99 26.49 118.99 26.94L119.09 27.04L117.49 27.11L117.41 27.03Q116.58 27.08 115.70 27.08L115.70 27.08L115.79 27.17Q115.71 31.28 115.64 39.65L115.69 39.70L115.79 39.81Q113.44 39.74 111.95 40.42L112.02 40.49L112.04 40.51Q113.06 34.30 112.95 27.06L112.89 27.01L113.08 27.19Q112.25 26.98 110.96 26.71L111.08 26.84L111.02 26.78Q110.98 25.75 110.79 23.81L110.82 23.83L110.90 23.91Q112.04 24.45 112.99 24.64L112.94 24.59L112.95 24.60Q112.81 23.84 112.50 21.37L112.57 21.43L112.53 21.39Q112.30 19.34 112.30 18.39L112.33 18.42L112.34 18.43Q112.42 16.83 113.83 16.26L113.69 16.12L113.79 16.23Q114.49 15.74 119.32 14.90L119.40 14.98L119.43 15.01Q120.11 14.74 120.95 14.63L121.02 14.70L121.07 14.75Q120.84 15.51 120.69 16.38L120.63 16.32L120.30 17.85L120.44 18.00Q119.67 17.72 119.17 17.83L119.26 17.93L119.12 17.78Q118.65 17.92 116.74 18.53ZM122.03 19.52L122.03 19.51L122.02 19.50Q122.44 17.37 123.05 14.82L123.05 14.83L123.12 14.89Q122.49 15.21 121.20 15.82L121.09 15.72L121.54 14.08L121.55 14.08Q120.39 14.34 117.92 14.87L117.87 14.82L117.85 14.80Q116.68 14.88 113.48 15.80L113.40 15.72L113.56 15.88Q111.83 16.13 111.83 17.99L111.94 18.11L111.92 18.08Q111.94 18.41 112.02 18.75L112.00 18.73L112.04 18.77Q112.28 20.12 112.40 21.45L112.30 21.35L112.56 24.09L112.52 24.05Q111.10 23.62 110.45 23.16L110.59 23.30L110.51 23.22Q110.56 24.11 110.60 25.13L110.68 25.22L110.75 27.15L110.76 27.16Q111.43 27.30 112.23 27.37L112.36 27.51L112.27 29.51L112.74 29.60L112.65 29.51Q112.64 35.74 111.65 41.19L111.47 41.01L111.52 41.06Q112.49 40.55 113.67 40.28L113.74 40.34L113.50 42.36L113.50 42.35Q114.71 42.04 116.01 42.00L116.05 42.04L116.02 42.02Q117.31 42.05 118.56 42.31L118.52 42.27L118.61 42.36Q117.77 37.37 117.77 29.41L117.72 29.36L117.66 29.30Q118.86 29.40 120.95 29.48L120.98 29.50L121.01 27.66L121.00 27.66Q120.99 26.70 121.03 25.75L120.95 25.67L120.94 25.65Q120.49 25.78 119.43 26.05L119.34 25.96L119.38 26.00Q119.38 25.32 119.50 23.99L119.54 24.02L119.51 24.00Q118.61 24.16 117.81 24.24L117.93 24.36L117.85 24.27Q117.84 22.86 117.95 21.45L118.04 21.54L117.97 21.47Q118.16 20.86 118.69 20.51L118.70 20.52L118.78 20.60Q119.60 19.79 120.42 19.71L120.45 19.74L120.38 19.67Q121.37 19.76 122.21 19.69Z"/><path fill="#333" d="M85.94 41.03L85.92 41.01L85.86 40.94Q81.76 29.77 76.63 24.78L76.58 24.74L76.54 24.69Q77.97 25.29 80.94 25.94L80.95 25.94L81.01 26.00Q84.63 30.01 87.18 36.67L87.11 36.60L87.03 36.51Q89.87 29.34 92.50 26.30L92.38 26.18L92.46 26.26Q94.27 25.98 96.71 25.30L96.71 25.30L96.86 25.45Q93.67 28.16 91.50 32.73L91.45 32.68L91.45 32.67Q91.04 33.29 87.65 40.90L87.79 41.04L87.64 40.90Q87.16 41.10 85.94 41.02ZM90.62 43.50L90.46 43.34L90.61 43.49Q93.81 31.04 98.38 26.36L98.38 26.36L96.88 26.84L96.99 26.95Q96.12 27.10 95.28 27.22L95.38 27.32L95.35 27.29Q95.86 26.92 96.67 26.04L96.57 25.94L96.56 25.92Q97.34 25.02 97.80 24.60L97.94 24.73L97.96 24.76Q95.60 25.25 92.17 25.75L92.34 25.91L92.36 25.94Q89.52 28.99 87.46 34.59L87.45 34.57L87.49 34.61Q85.85 30.43 84.02 27.95L84.07 28.00L83.58 28.04L83.48 27.95Q83.19 27.92 82.96 27.92L82.91 27.87L83.01 27.97Q82.81 27.65 81.10 25.60L81.03 25.52L81.12 25.62Q77.76 25.11 75.51 23.97L75.46 23.92L75.57 24.03Q81.62 29.66 85.61 41.38L85.64 41.41L86.64 41.42L86.53 41.31Q86.87 41.96 87.71 43.25L87.76 43.30L89.13 43.30L89.25 43.42Q89.83 43.28 90.48 43.36Z"/><path d="M7 30 C80 13,82 19,139 22" stroke="#222" fill="none"/></svg>
Enter fullscreen mode Exit fullscreen mode

The library basically generates the random group of characters (4 characters in this case, but you can tell the library how many characters should be there in the captcha). - The basic shape of the character is the base

  • Each character is translated to path element
  • Character is shifted back and forth depending on its position in the captcha - this is done by setting the arguments for different commands (M, C, L, Q, H, V, etc.)
  • While defining the arguments for commands above, random distortion is added by adding +/- (~0.1) in each value.
  • Noise is added by drawing random curves over the captcha
  • The order of path elements in svg is shuffled - but the definition of path contains information about character's position in captcha (because the origin "M" point and other lines, curves are well defined for their start and endpoints). Thus the order of elements inside the captcha does change the order of characters in the image.
  • The full svg is returned as an output.

Reverse engineering - understanding

Break down the problem

Now that we know how captcha is created, we just have to

  1. Filter out the noise in the captcha
  2. Define the position of each path element
  3. Map back each path element to the original character

Filter out the noise

If you take a closer look at the captcha generated, the noise lines are of the form

<path d="M7 30 C80 13,82 19,139 22" stroke="#222" fill="none"/>
Enter fullscreen mode Exit fullscreen mode

The d attribute is too small to make a real character and there is no fill attribute.
I observed many captchas generated from the library and came to a conclusion that it is safe to assume - All the path elements with fill="none" contribute to noise lines.

Thus if we just leave out all such path elements, we are done with filtering the noise out.

Define the position of each path element

Or we can say arrange all the path elements in correct order!

As we know,

  • each path element has a defined starting point (origin) defined with M in its d attribute
  • the characters in captcha are arranged horizontally - making the origin points lie on same horizontal line making them differ just in x-coordinate

Thus sorting the path elements by the x-coordinates of their origin should do our job!
(There are random distortions added, which distort each point by +/- (< 0.1px) - but we can ignore that for sorting as they are not going to affect the relative position of paths)

Mapping path to original character

This is the most important and tricky part!
Assume in this section that we are talking about a single character in the captcha.

The library generates character and distorts each part of its d attribute by +/- (less than 0.1) px

let distortion = (Math.random() * 0.2) - 0.1;
Enter fullscreen mode Exit fullscreen mode

From this, it is clear that a character "R" generated by this library may not be exactly similar to another "R" generated by it.

But, the important thing to note in this is, the order of commands (M, L, Q, C, Z, etc.) to generate R will always be the same! Thus if we ignore the arguments of these commands, the order of commands is going to be same for any "R" generated by the library with same font style!

Also, by observation, I can say that all the alphanumeric characters in both upper and lower cases - each one of them, has a unique order of commands used to draw them!

Algorithm

Did it click yet?
Yes,

  • I generated paths for all alphanumeric characters (uppercase and lowercase) using the library
  • For each character, I found the series of commands which draw it (just ignored the numbers in the d attribute of path)
  • Stored the series of commands as string for each of the character in a map with the series of commands as key and the original character as the value

This map now serves as the model to decode the captcha.

All we have to do is,

  • For each path element (corresponding to a character) in captcha

    • find the d attribute

      • For instance in the svg shown for "S" above, d attribute is:
        M70.85 40.26L70.90 40.31L70.75 40.15Q67.42 40.40 65.78 39.87L65.80 39.89L65.74 39.83Q63.71 39.14 63.41 35.75L63.48 35.82L64.91 34.74L64.81 34.63Q65.61 34.18 66.37 33.68L66.42 33.73L66.43 33.74Q66.03 35.52 67.63 36.73L67.69 36.79L67.72 36.83Q68.92 37.76 71.05 37.57L71.04 37.55L71.11 37.62Q75.53 37.24 75.34 33.70L75.31 33.67L75.25 33.62Q75.20 31.47 72.57 30.25L72.56 30.24L72.54 30.22Q69.71 29.18 67.35 27.96L67.36 27.98L67.23 27.84Q64.81 26.64 63.82 21.96L63.76 21.89L63.74 21.88Q63.65 21.45 63.54 20.69L63.58 20.73L63.51 20.66Q63.51 19.97 63.59 19.40L63.64 19.46L63.61 19.43Q63.77 17.83 65.02 17.37L64.96 17.31L64.92 17.28Q67.33 16.41 71.33 16.60L71.41 16.69L71.30 16.57Q73.28 16.83 74.08 16.91L73.93 16.77L73.93 16.77Q75.50 17.04 76.57 17.50L76.61 17.54L76.57 17.51Q78.59 17.88 78.82 20.36L78.94 20.48L78.81 20.35Q77.85 21.14 75.65 22.48L75.69 22.52L75.57 22.41Q75.18 19.43 70.81 19.43L70.76 19.39L70.79 19.42Q68.93 19.46 67.94 20.14L68.01 20.21L67.88 20.08Q66.80 20.68 67.03 22.47L67.05 22.48L67.05 22.49Q67.22 24.60 70.19 26.12L70.25 26.18L70.27 26.20Q70.68 26.31 75.18 27.95L75.15 27.92L75.27 28.04Q77.91 29.54 78.33 33.91L78.20 33.78L78.25 33.84Q78.21 33.95 78.29 35.21L78.32 35.24L78.44 35.36Q78.39 38.01 76.83 39.12L76.94 39.23L76.88 39.17Q74.93 40.03 70.74 40.15ZM73.13 42.61L73.17 42.66L73.20 42.68Q74.50 42.62 76.48 42.62L76.48 42.61L76.62 42.75Q78.73 42.77 79.98 42.35L79.84 42.20L79.82 42.19Q81.19 41.50 81.11 39.71L81.03 39.62L81.05 39.65Q81.01 38.65 80.63 36.59L80.60 36.57L80.65 36.62Q79.66 31.86 77.57 29.99L77.61 30.04L77.59 30.01Q76.86 28.48 75.52 27.76L75.46 27.69L70.14 25.61L70.12 25.59Q69.82 25.45 69.37 25.22L69.40 25.25L69.25 24.84L69.22 24.46L69.37 24.61Q69.11 23.09 70.25 22.48L70.23 22.47L70.34 22.57Q70.97 21.87 72.68 21.68L72.85 21.85L72.78 21.78Q73.86 21.60 75.00 22.06L74.96 22.02L74.96 22.02Q75.03 22.12 75.22 22.96L75.32 23.06L75.26 23.00Q75.41 22.82 75.91 22.59L75.93 22.61L75.91 22.59Q76.75 23.62 76.86 24.76L76.83 24.72L76.83 24.72Q76.98 24.76 80.63 22.17L80.54 22.08L80.62 22.16Q80.40 19.58 78.95 18.89L78.94 18.88L78.86 18.80Q78.36 17.61 76.91 17.08L76.90 17.06L76.83 17.00Q74.63 16.28 71.43 16.28L71.39 16.24L71.43 16.28Q66.35 16.12 64.49 16.80L64.56 16.87L64.65 16.96Q63.29 17.39 63.18 19.07L63.16 19.06L63.15 19.04Q63.04 19.50 63.46 21.71L63.61 21.86L63.51 21.77Q64.11 25.45 66.28 27.70L66.43 27.84L66.46 27.87Q67.26 29.58 68.76 30.27L68.77 30.27L68.73 30.24Q70.11 30.80 74.03 32.36L74.03 32.36L74.10 32.46L74.76 32.82L74.84 32.94L74.77 32.87Q74.98 33.35 75.02 33.69L75.02 33.69L74.88 33.55Q75.08 37.14 71.16 37.29L71.22 37.35L71.06 37.19Q69.98 37.29 68.68 36.91L68.69 36.91L68.72 36.94Q68.44 36.17 68.44 35.44L68.30 35.31L68.36 35.37Q68.43 35.17 68.47 34.90L68.43 34.87L68.47 34.91Q68.00 35.09 67.20 35.58L67.17 35.55L67.15 35.53Q66.75 34.52 66.90 33.15L66.83 33.07L66.82 33.06Q64.72 34.24 63.12 35.65L63.10 35.63L63.16 35.69Q63.26 36.55 63.34 37.58L63.33 37.57L63.31 37.55Q63.68 39.33 64.97 40.09L64.95 40.06L64.95 40.07Q66.16 41.89 68.71 42.27L68.66 42.22L68.57 42.12Q70.14 42.37 73.07 42.56Z
      
    • ignore all the whitespaces and numbers to create string containing series of commands

      • Corresponding series for "S" above would be:
      MLLQLLQLLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQZMLLQLLQLLQLLQLLQLLQLLLQLLLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLLLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQLLQZ
      
    • find the character corresponding to this string in the map we created

Codes

Script to find svg for all alphanumerics (node js)

This node script runs on modified version of library svg-captcha - I have modified the files in my fork - check default branch

The following script can be found at ./index.js in my fork

let captchacreator = require('./lib'); // import the svg-captcha library
const fs = require('fs');
let chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
let obj = {};
for(let i=0; i<chars.length; i++){
    obj[chars[i]] = captchacreator.create(chars[i]).data;
}
let ans = JSON.stringify(obj);
try {
  const data = fs.writeFileSync('texts.json', ans)  
} catch (err) {
  console.error(err)
}
Enter fullscreen mode Exit fullscreen mode

Script to filter out numbers in d attribute to create the model (reverse map)

import json
import base64
def isval(c):
    nms = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
    nms = nms + [" ", "."]
    return c not in nms

def filter(t):
    ans = ''
    for i in t:
        if isval(i):
            ans += i
    return ans

def readfileandgetjson():
    with open("texts.json") as f_:
        return json.load(f_)

def filter_data(data):
    retdata = {}
    for k, v in data.items():
        retdata[filter(v)] = k
    return json.dumps(retdata)

def write_file(jsonstr):
    with open("final.json", "w+") as f_:
        f_.write(jsonstr)

write_file(
    filter_data(
        readfileandgetjson()
    )
)


def create_base64(jsonstr):
    encoded = base64.b64encode(jsonstr)
    print(encoded)


create_base64(
    filter_data(
        readfileandgetjson()
    )
)
Enter fullscreen mode Exit fullscreen mode

Code which decodes captcha in the webpage

Note that the model is stored as a base64 encoded string in the following script. This base64 string is generated by using the function create_base64 defined in the python script above

let model = "BASE64 ENCODED STRING OF REVERSE MAP CREATED";
var parsed_model = JSON.parse(atob(model));
var parser = new DOMParser();

var svg = parser.parseFromString(atob($("img#captchaImage").attr("src").split("base64,")[1]), "image/svg+xml");
$(svg).find('path').each((_, p) => {
  if ($(p).attr('stroke') != undefined) $(p).remove();
})
vals = [];
$(svg).find('path').each(
  (_, p) => {
    idx = parseInt($(p).attr("d").split(".")[0].replace("M", ""));
    vals.push(idx);
  }
)
var sorted = [...vals].sort(function (a, b) {
  return a - b;
})

var solution = ['', '', '', '', ''];

$(svg).find('path').each(
  (idx, p) => {
    var pattern = $(p).attr('d').replace(/[\d\.\s]/g, "");
    solution[sorted.indexOf(vals[idx])] = parsed_model[pattern];
  }
);
console.log(solution); //solution array contains the decoded captcha
Enter fullscreen mode Exit fullscreen mode

Oldest comments (4)

Collapse
 
palashgdev profile image
Palash Gupta

reverse eng. is awesome

Collapse
 
fatardy profile image
Aradhya Alamuru

Nicely done!

Collapse
 
laurentiustroia profile image
laurentiustroia

Nice, very good structured, introduction help a lot to understanding the reverse engineering

Collapse
 
ha3an profile image
ha3an

Hello,

Thank you for sharing this tutorial.

I have an issue. When I convert and compare the SVG code according to the tutorial, I realize that the letters are different in the display compared to the code. What I mean is that the order of <path> in the code is different from the display order of <path>. Is there a way to solve this issue?

Some comments may only be visible to logged-in visitors. Sign in to view all comments.