DEV Community

fabulous.yap


Zero-Width Characters

(Note: all credit goes to the original author(s) linked within the text. The converted C# code is public domain; feel free to have fun with it.)

I read an interesting article by Tom on Medium about using zero-width characters to fingerprint text. The technique does not always work (it depends on which editor or word processor you are using), but it is interesting nonetheless. Tom has even created a demo website to demonstrate it.
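Since the marks only survive if the editor keeps these code points, a quick way to see whether a piece of text still carries any of them is handy. The following is only a minimal sketch in C#: the class name `ZeroWidthCheck`, its methods, and the choice of four code points (U+200B, U+200C, U+200D, U+FEFF) are my own assumptions for illustration, not something taken from Tom's code.

```csharp
using System;
using System.Linq;

static class ZeroWidthCheck
{
    // Assumed set of "invisible" code points to look for (hypothetical, not exhaustive).
    const string ZeroWidthChars = "\u200B\u200C\u200D\uFEFF";

    // Number of zero-width characters found in the text.
    public static int Count(string text) =>
        text.Count(c => ZeroWidthChars.IndexOf(c) >= 0);

    // The same text with every zero-width character removed.
    public static string Strip(string text) =>
        new string(text.Where(c => ZeroWidthChars.IndexOf(c) < 0).ToArray());

    static void Main()
    {
        string suspicious = "Hel\u200Blo\u200C wor\uFEFFld";
        Console.WriteLine(Count(suspicious));   // 3
        Console.WriteLine(Strip(suspicious));   // Hello world
    }
}
```

Pasting text through an editor and running `Count` on the result before and after gives a rough idea of whether that editor strips the characters.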

His JS source code is available at https://github.com/umpox/zero-width-detection.

I found the JS code interesting and wanted to try it offline in C#. As a LINQPad user, I'm fond of using it to write tiny offline routines as quick (and sometimes dirty) utilities for my occasional needs. The C# source follows.

using System;
using System.Text;
using System.Linq;

namespace zerowidth
{
    class Program
    {
        static void Main(string[] args)
        {
            string userName = "名字.Yap";

            // Step 1: turn the UTF-16 bytes of the string into space-separated 8-bit binary groups.
            string binaryText = string.Join(" ", Encoding.Unicode.GetBytes(userName).Select(c => Convert.ToString(c, 2).PadLeft(8, '0')).ToArray());
            Console.WriteLine(binaryText);

            // Step 2: map each character of the binary text to a zero-width character,
            // joining the results with U+FEFF (Zero Width No-Break Space).
            string zeroWidthText2 = string.Join("\ufeff", binaryText.ToCharArray().Select(c => {
                if (c == '1') return '\u200B';          // U+200B Zero Width Space
                else if (c == '0') return '\u200C';     // U+200C Zero Width Non-Joiner
                return '\u200D';                        // U+200D Zero Width Joiner (stands in for the space between groups)
            }).ToArray());
            // Nothing visible should appear between the bars; Length confirms the hidden characters are present.
            Console.WriteLine("|{0}|{1}", zeroWidthText2, zeroWidthText2.Length);

            // Step 3: reverse the mapping. The U+FEFF separators map to null,
            // which contributes nothing to the joined string.
            string binaryText3 = string.Join("", zeroWidthText2.ToCharArray().Select(c => {
                if (c == '\u200B') return '1';
                else if (c == '\u200C') return '0';
                else if (c == '\u200D') return ' ';
                else return (char?)null;
            }).ToArray());
            Console.WriteLine(binaryText3);

            // Step 4: parse each 8-bit group back into a byte and decode the bytes as UTF-16.
            string originalUserName2 = Encoding.Unicode.GetString(binaryText3.Split(' ').Select(s => Convert.ToByte(s, 2)).ToArray());
            Console.WriteLine(originalUserName2);
        }
    }
}
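To actually fingerprint something, the invisible string has to be placed inside some visible text and recovered from it later. Here is a rough sketch of how that could look, reusing the same 1/0/space mapping as the listing above but leaving out the U+FEFF separator to keep things short. The names `ZeroWidthFingerprint`, `Encode`, `Decode`, and `Embed` are made up for this example, and it assumes the visible text contains no zero-width characters of its own.

```csharp
using System;
using System.Linq;
using System.Text;

static class ZeroWidthFingerprint
{
    const char One = '\u200B';   // Zero Width Space      -> bit 1
    const char Zero = '\u200C';  // Zero Width Non-Joiner -> bit 0
    const char Gap = '\u200D';   // Zero Width Joiner     -> boundary between bytes

    // Convert the secret's UTF-16 bytes into a run of zero-width characters.
    public static string Encode(string secret)
    {
        var hidden = new StringBuilder();
        byte[] bytes = Encoding.Unicode.GetBytes(secret);
        for (int i = 0; i < bytes.Length; i++)
        {
            if (i > 0) hidden.Append(Gap);
            foreach (char bit in Convert.ToString(bytes[i], 2).PadLeft(8, '0'))
                hidden.Append(bit == '1' ? One : Zero);
        }
        return hidden.ToString();
    }

    // Pick out the marker characters from any text and rebuild the secret.
    public static string Decode(string text)
    {
        var bits = new StringBuilder();
        foreach (char c in text)
        {
            if (c == One) bits.Append('1');
            else if (c == Zero) bits.Append('0');
            else if (c == Gap) bits.Append(' ');
            // every other character is visible text and is ignored
        }
        byte[] bytes = bits.ToString()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(group => Convert.ToByte(group, 2))
            .ToArray();
        return Encoding.Unicode.GetString(bytes);
    }

    // Naive placement in the middle of the visible text; this could split
    // a surrogate pair, so it is only suitable for simple input.
    public static string Embed(string visible, string secret) =>
        visible.Insert(visible.Length / 2, Encode(secret));

    static void Main()
    {
        string marked = Embed("Quarterly report", "名字.Yap");
        Console.WriteLine("|{0}|{1}", marked, marked.Length);
        Console.WriteLine(Decode(marked) == "名字.Yap");   // True
    }
}
```

Whether the mark survives a copy and paste still depends on the editor in between, so treat this as an experiment rather than a reliable watermark.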

Do follow and read the linked article for further details. Tom (and Zach Aysan, linked in Tom's article) have done an excellent job of elaborating on and explaining the techniques.

It's been a long time since I last worked on steganography in image processing and document fingerprinting; this has re-ignited my interest somewhat... :)
