DEV Community

Hargunbeer Singh

Cryptographic Frequency Analysis Explained

Frequency analysis is the study of how often letters, or groups of letters, occur in a ciphertext. It is one of the older methods of cryptanalysis and is used to break classical ciphers, substitution ciphers in particular. It relies on the fact that, in any given language, certain letters and groups of letters occur with characteristic frequencies. For example, E, T, A and O are the most common letters in English text. Similarly, the letter pairs TH, ER, ON and AN are among the most common in English text; these pairs are referred to as bigrams.
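
To make the idea of counting concrete, here is a minimal Python sketch that tallies single letters and bigrams in a piece of text. The function names `letter_frequencies` and `bigram_counts` and the sample sentence are made up for illustration, and a sentence this short will only roughly resemble the statistics of English as a whole.

```python
from collections import Counter


def letter_frequencies(text):
    """Return the relative frequency of each letter, ignoring case and non-letters."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {letter: count / total for letter, count in counts.items()}


def bigram_counts(text):
    """Count adjacent letter pairs inside each word (pairs never span a space)."""
    counts = Counter()
    for word in text.upper().split():
        letters = [ch for ch in word if "A" <= ch <= "Z"]
        counts.update(a + b for a, b in zip(letters, letters[1:]))
    return counts


# Hypothetical sample text, chosen only to have something to count.
sample = "the weather there is better than the weather here"

freqs = letter_frequencies(sample)
top_letters = sorted(freqs, key=freqs.get, reverse=True)[:3]
top_bigrams = bigram_counts(sample).most_common(3)

print(top_letters)
print(top_bigrams)
```

On real ciphertext the same two functions would be run on the encrypted text, and the resulting ranking compared with the known ranking for the plaintext language.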

In some ciphers, the properties and patterns of the plaintext are preserved in the ciphertext, so they can be studied and used to break the cipher. The ciphertext retains these properties when each plaintext letter is always encrypted to the same ciphertext letter, as happens in simple substitution ciphers. Frequency analysis is typically used in a ciphertext-only attack, in which the cryptanalyst has access to the ciphertext and nothing else. In some cases the cryptanalyst also knows the language of the plaintext, and can then use techniques such as frequency analysis and the index of coincidence.
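
The following sketch illustrates why a simple substitution cipher leaks these patterns. It encrypts a made-up plaintext with a randomly shuffled alphabet and checks that the set of letter counts, and therefore the index of coincidence, is unchanged. The index of coincidence is computed here as the probability that two letters picked at random from the text are the same. The helper names, the key and the plaintext are all assumptions for illustration.

```python
from collections import Counter
import random
import string

ALPHABET = string.ascii_uppercase


def letter_counts(text):
    """Count letters only, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.upper() if ch in ALPHABET)


def index_of_coincidence(text):
    """Probability that two letters drawn at random from the text are equal."""
    counts = letter_counts(text)
    n = sum(counts.values())
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def substitute(text, key):
    """Encrypt with a simple substitution: ALPHABET[i] always becomes key[i]."""
    return text.upper().translate(str.maketrans(ALPHABET, key))


# Hypothetical key: a shuffled alphabet from a seeded generator.
rng = random.Random(0)
key = "".join(rng.sample(ALPHABET, len(ALPHABET)))

plaintext = "frequency analysis exploits the patterns that a simple substitution cipher leaves behind"
ciphertext = substitute(plaintext, key)

# The letters change, but the shape of the distribution does not.
same_shape = sorted(letter_counts(plaintext).values()) == sorted(letter_counts(ciphertext).values())
print(same_shape)
print(index_of_coincidence(plaintext) == index_of_coincidence(ciphertext))
```

Because each letter is only renamed, every count survives encryption, which is exactly what frequency analysis takes advantage of.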

When a plaintext is encrypted with a substitution cipher and the cryptanalyst knows the language of the plaintext, they can, in most cases, easily find the frequency distribution and recurring sequences in the ciphertext. For example, if R occurs very frequently in a ciphertext and the plaintext language is English, then R most probably stands for E, T or A, since these letters occur the most in English. The cryptanalyst still needs to try several combinations, but the frequency distribution leaves far fewer combinations to try.
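
The R example can be sketched in a few lines of Python. To keep it short, the sketch assumes the cipher is a Caesar shift, which is a special case of a substitution cipher; a general substitution cipher would need more guesses than this. The ciphertext and the helper names are invented for illustration. Instead of trying all 25 possible shifts, the code tries only the three shifts that would turn the most frequent ciphertext letter into E, T or A.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase
COMMON_ENGLISH = "ETA"  # the candidate letters named in the text


def most_frequent_letter(text):
    """Return the letter that occurs most often in the text."""
    counts = Counter(ch for ch in text.upper() if ch in ALPHABET)
    return counts.most_common(1)[0][0]


def shift_back(text, k):
    """Undo a Caesar shift of k positions; non-letters are left alone."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - k) % 26])
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical ciphertext in which R is by far the most frequent letter.
ciphertext = "ZRRG ZR ARNE GUR GERR ORSBER GUR RIRAVAT"

top = most_frequent_letter(ciphertext)

# One trial decryption per candidate: assume `top` stands for E, then T, then A.
candidates = {}
for guess in COMMON_ENGLISH:
    k = (ALPHABET.index(top) - ALPHABET.index(guess)) % 26
    candidates[guess] = shift_back(ciphertext, k)

for guess, text in candidates.items():
    print(f"{top} -> {guess}: {text}")
```

Only one of the three trial decryptions reads as English, and a human or a dictionary check can pick it out, which is the sense in which the frequency distribution cuts down the work.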
