DEV Community

Discussion on: Daily Challenge #42 - Caesar Cipher

Vincent Thibault • Edited

Here is a JavaScript version that uses English letter-frequency analysis. It only works if you already know which language the plaintext is written in:

// English letter frequencies a-z, stored below as log2(probability)
// source: https://en.wikipedia.org/wiki/Letter_frequency
const ENGLISH_FREQUENCIES = new Float32Array([
  0.0816, 0.0149, 0.0278, 0.0425, 0.1270, 0.0222, 0.0201, 0.0609, 0.0696, 0.0015, 0.0077, 0.0402, 0.0240,
  0.0674, 0.0750, 0.0192, 0.0009, 0.0598, 0.0632, 0.0905, 0.0275, 0.0097, 0.0236, 0.0015, 0.0197, 0.0007,
].map(v => Math.log(v) / Math.LN2));

// Try all 26 rotations and score each candidate
const decipher = str =>
  Array.from({ length: 26 }, (_, key) => {
    // Rotate text
    const text = str
      .toLowerCase()
      .replace(/[a-z]/g, $0 =>
        String.fromCharCode(97 + (($0.charCodeAt(0) - 97 + key) % 26))
      );

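    // Score the candidate: average of -log2(p) over its letters, in bits per letter.
    // Lower means closer to English (a cross-entropy against the table above).
    // https://en.wikipedia.org/wiki/Entropy_(information_theory)
    const entropy = (text.match(/[a-z]/g) || []) // match() returns null when there are no letters
      .map(c => ENGLISH_FREQUENCIES[c.charCodeAt(0) - 97])
      .reduce((sum, log2p, _, arr) => sum - log2p / arr.length, 0);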

    return { key, text, entropy };
  }).sort((a, b) => a.entropy - b.entropy); // Lowest score (most English-like) first
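
For anyone wondering what the score means: each candidate is rated by the average of -log2(p) over its letters, where p is the English frequency of that letter. Common letters cost few bits and rare letters cost many, so real English ends up with the lowest average. Below is a minimal standalone sketch of that idea. The names `LETTER_PROBABILITY` and `averageSurprisal` are hypothetical, and only a handful of letters from the table above are included:

```javascript
// Hypothetical helper: a small subset of the frequency table above,
// just enough to show how the score behaves.
const LETTER_PROBABILITY = new Map([
  ["e", 0.1270],
  ["t", 0.0905],
  ["a", 0.0816],
  ["q", 0.0009],
  ["z", 0.0007],
]);

// Average number of bits per letter, assuming the text is English.
// Characters missing from the table are ignored.
const averageSurprisal = text => {
  const letters = [...text].filter(c => LETTER_PROBABILITY.has(c));
  if (letters.length === 0) return 0; // nothing to score
  const totalBits = letters.reduce(
    (sum, c) => sum - Math.log2(LETTER_PROBABILITY.get(c)),
    0
  );
  return totalBits / letters.length;
};

console.log(averageSurprisal("tea") < averageSurprisal("qzz")); // true
```

Because the score is an average, it does not depend on the length of the message, only on which letters it uses.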

The results come back sorted by entropy score, with the most English-like candidate first:

console.log(
  decipher("dwwdfn iurp wkh zrrgv dw gdzq")
    .map(({ key, text, entropy }) => `${entropy.toFixed(3)} - ${text} (${key})`)
    .join("\n")
);

// 4.227 - attack from the woods at dawn (23)
// 4.936 - piiprz ugdb iwt lddsh pi splc (12)
// 4.974 - ohhoqy tfca hvs kccrg oh rokb (11)
// 5.098 - tmmtvd ykhf max phhwl tm wtpg (16)
// ... (22 more candidates)
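
Note that the `key` in each result is the rotation applied to the ciphertext, not the shift that was used to encrypt it. The two add up to 26, so a key of 23 corresponds to the classic shift of 3. Here is a minimal sketch of the forward direction, assuming non-negative integer shifts; `encipher` and `decodingKey` are hypothetical helpers that are not part of the snippet above:

```javascript
// Hypothetical helper: Caesar-shift the lowercase letters of `str`
// forward by `shift` (a non-negative integer), leaving other characters alone.
const encipher = (str, shift) =>
  str
    .toLowerCase()
    .replace(/[a-z]/g, c =>
      String.fromCharCode(97 + ((c.charCodeAt(0) - 97 + shift) % 26))
    );

// Hypothetical helper: the rotation that undoes a given encryption shift.
const decodingKey = shift => (26 - (shift % 26)) % 26;

const secret = encipher("attack from the woods at dawn", 3);
console.log(secret);                           // dwwdfn iurp wkh zrrgv dw gdzq
console.log(decodingKey(3));                   // 23
console.log(encipher(secret, decodingKey(3))); // attack from the woods at dawn
```

Rotating forward by the decoding key is the same as rotating backward by the original shift, which is why a single rotation function covers both directions.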