(Note: all credit goes to the original author(s) linked within the text. The converted C# code is public domain; feel free to have fun with it.)
I read an interesting article on Medium by Tom about using zero-width characters to fingerprint text. It does not always work (that depends on which editor or word processor you are using), but it is interesting nonetheless. Tom has even created a demo website to demonstrate the technique.
You can access his JS source code at https://github.com/umpox/zero-width-detection
I found the JS code interesting and wanted to try it offline in C#. As a LINQPad user, I'm fond of using it to write tiny offline routines: quick (and sometimes dirty) utilities for my occasional needs. The C# source follows.
using System;
using System.Linq;
using System.Text;

namespace zerowidth
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");
            string userName = "名字.Yap";

            // Step 1: UTF-16LE bytes of the name, each written as 8 binary digits, separated by spaces.
            string binaryText = string.Join(" ", Encoding.Unicode.GetBytes(userName).Select(c => Convert.ToString(c, 2).PadLeft(8, '0')).ToArray());
            Console.WriteLine(binaryText);

            // Step 2: map every character of the binary text to a zero-width character.
            // The mapped characters are joined with U+FEFF (Zero Width No-Break Space).
            string zeroWidthText2 = string.Join("\ufeff", binaryText.ToCharArray().Select(c =>
            {
                if (c == '1') return '\u200B';      // Zero Width Space
                else if (c == '0') return '\u200C'; // Zero Width Non-Joiner
                return '\u200D';                    // Zero Width Joiner (stands for the space between bytes)
            }).ToArray());
            // The text between the bars looks empty, but its length shows the hidden characters.
            Console.WriteLine("|{0}|{1}", zeroWidthText2, zeroWidthText2.Length);

            // Step 3: reverse the mapping. Any other character (including U+FEFF) becomes null,
            // which string.Join treats as an empty string.
            string binaryText3 = string.Join("", zeroWidthText2.ToCharArray().Select(c =>
            {
                if (c == '\u200B') return '1';
                else if (c == '\u200C') return '0';
                else if (c == '\u200D') return ' ';
                else return (char?)null;
            }).ToArray());
            Console.WriteLine(binaryText3);

            // Step 4: parse each 8-bit group back into a byte and decode the bytes as UTF-16LE.
            string originalUserName2 = Encoding.Unicode.GetString(binaryText3.Split(' ').Select(s => Convert.ToByte(s, 2)).ToArray());
            Console.WriteLine(originalUserName2);
        }
    }
}
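To use this for actual fingerprinting, the invisible string has to travel inside some visible text and be recovered from it later. Here is a minimal sketch of that, assuming the same character mapping as the listing above. The names `ZeroWidthFingerprint`, `Encode`, `Embed` and `Extract` are my own hypothetical helpers, not something from Tom's code. As a simplification it leaves out the U+FEFF joiner, and it assumes the visible text does not already contain any of the three marker characters.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper class for illustration; not part of Tom's JS code.
public static class ZeroWidthFingerprint
{
    private const char One = '\u200B';       // Zero Width Space      -> bit 1
    private const char Zero = '\u200C';      // Zero Width Non-Joiner -> bit 0
    private const char Separator = '\u200D'; // Zero Width Joiner     -> gap between bytes

    // Turn the payload into UTF-16LE bytes, then into invisible characters.
    public static string Encode(string payload)
    {
        var groups = Encoding.Unicode.GetBytes(payload)
            .Select(b => Convert.ToString(b, 2).PadLeft(8, '0')
                .Replace('1', One)
                .Replace('0', Zero));
        return string.Join(Separator.ToString(), groups);
    }

    // Assumption: the hidden payload is simply appended to the visible text.
    public static string Embed(string visibleText, string payload)
    {
        return visibleText + Encode(payload);
    }

    // Recover the payload. Every character that is not one of the three
    // markers (that is, all of the visible text) is ignored.
    public static string Extract(string text)
    {
        var bits = new string(text
            .Where(c => c == One || c == Zero || c == Separator)
            .Select(c => c == One ? '1' : c == Zero ? '0' : ' ')
            .ToArray());
        var bytes = bits
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => Convert.ToByte(s, 2))
            .ToArray();
        return Encoding.Unicode.GetString(bytes);
    }
}
```

In LINQPad, calling `ZeroWidthFingerprint.Extract(ZeroWidthFingerprint.Embed("Confidential memo", "名字.Yap"))` should give back the hidden name, while the marked string looks identical to the plain one on screen. Text without any markers yields an empty string.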
Do follow and read the linked article for further details. Tom (and Zach Aysan, linked in Tom's article) have done an excellent job of elaborating on and explaining the techniques.
It's been a long time since I last worked on steganography in image processing and document fingerprinting, and this has re-ignited my interest somewhat... :)