Caesar Cipher: Encryption and Decryption

Introduction

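This exercise demonstrates encrypting text with the Caesar cipher and then recovering the plaintext without being told the key. The process involves four steps: compiling a large text sample, analyzing its letter frequencies, encrypting the text, and using frequency analysis to find the shift and decrypt the ciphertext.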

Step 1: Compile a Large Text Sample

Frequency analysis is only reliable on a sufficiently large piece of text, because the letter proportions in a short passage can differ widely from those of typical English. For this example, we use "Pride and Prejudice" by Jane Austen, a public domain work.
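
Loading the sample might look like the minimal sketch below. It assumes the novel has already been saved locally as a UTF-8 plain-text file; the helper name `load_sample` and the filename in the comment are hypothetical and not part of the original exercise.

```python
from pathlib import Path

def load_sample(path):
    """Read a plain-text file and return its contents as a single string."""
    return Path(path).read_text(encoding="utf-8")

# Hypothetical filename; point this at wherever the novel is saved locally.
# text = load_sample("pride_and_prejudice.txt")
```

The resulting string plays the role of the `text` variable used in the next step.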

Step 2: Analyze Letter Frequency

The frequency of each letter in the text is computed as a proportion of the total number of letters. The following Python code performs this analysis:

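from collections import Counter

# `text` is assumed to already hold the full sample as a single string
# (for example, the novel read from a local plain-text file).

# Keep only the ASCII letters a-z, lower-cased, so that digits, punctuation,
# and accented characters are not counted
filtered_text = ''.join(c for c in text.lower() if 'a' <= c <= 'z')
letter_counts = Counter(filtered_text)
total_letters = sum(letter_counts.values())
# Proportion of each letter among all counted letters
letter_frequency = {letter: count / total_letters for letter, count in letter_counts.items()}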
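
To make the proportions concrete, here is a small self-contained sketch of the same computation applied to a short string instead of the whole novel. The function name `letter_frequencies` is an illustrative choice, not something defined elsewhere in this exercise.

```python
from collections import Counter

def letter_frequencies(text):
    """Return each ASCII letter's share of all letters in `text` (case-insensitive)."""
    letters = [c for c in text.lower() if 'a' <= c <= 'z']
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        return {}  # no letters to count
    return {letter: count / total for letter, count in counts.items()}

freqs = letter_frequencies("Hello, World!")
# Ten letters in total: 'l' occurs three times and 'o' twice
print(freqs['l'], freqs['o'])  # 0.3 0.2
```

On a sample this short the proportions say little about English, which is why the exercise uses a full novel.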
    

Step 3: Encrypt the Text

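The Caesar cipher shifts each letter by a fixed number of positions in the alphabet. For example, with a shift of 5, 'a' becomes 'f', 'b' becomes 'g', and so on. Letters that would run past the end of the alphabet wrap around to the start, so 'z' becomes 'e'. Case is preserved, and characters that are not letters are left unchanged.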

def caesar_cipher_encrypt(text, shift):
    """Shift each ASCII letter by `shift` positions, preserving case."""
    encrypted_text = []
    for char in text:
        if 'a' <= char <= 'z':
            base = ord('a')
        elif 'A' <= char <= 'Z':
            base = ord('A')
        else:
            # Digits, punctuation, whitespace, and non-ASCII letters pass through
            encrypted_text.append(char)
            continue
        # Modulo 26 wraps in both directions, so negative shifts also work
        encrypted_text.append(chr((ord(char) - base + shift) % 26 + base))
    return ''.join(encrypted_text)
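
As an alternative to shifting character codes one at a time, the same mapping can be expressed as a translation table built with `str.maketrans`. This is a sketch of a different technique from the function above, and the name `caesar_translate` is illustrative only.

```python
import string

def caesar_translate(text, shift):
    """Caesar-shift ASCII letters using a translation table; other characters are unchanged."""
    k = shift % 26  # normalizes negative and large shifts into the range 0..25
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    # Map each alphabet onto a copy of itself rotated left by k positions
    table = str.maketrans(lower + upper,
                          lower[k:] + lower[:k] + upper[k:] + upper[:k])
    return text.translate(table)

print(caesar_translate("Hello, World!", 5))   # Mjqqt, Btwqi!
print(caesar_translate("Mjqqt, Btwqi!", -5))  # Hello, World!
```

Because the shift is reduced modulo 26 before the table is built, a negative shift reverses the encryption.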
    

Step 4: Decrypt the Ciphertext

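Frequency analysis can determine the shift value without knowing the key. We try each of the 26 possible shifts, compare the letter frequencies of each candidate decryption with known English letter frequencies using the chi-squared statistic, and keep the shift that gives the smallest value.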

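from collections import Counter

def caesar_cipher_decrypt(text, shift):
    # Shifting forward by (26 - shift) undoes a forward shift of `shift`
    return caesar_cipher_encrypt(text, (-shift) % 26)

# Assumes `encrypted_text` holds the ciphertext and `english_letter_frequency`
# maps each lowercase letter to its expected proportion in English text
# (for example, the letter_frequency computed in Step 2).

# Find the shift with the smallest chi-squared value
chi_squared_statistics = []
for shift in range(26):
    decrypted_text = caesar_cipher_decrypt(encrypted_text, shift)
    # Letter counts of this candidate decryption
    letters = [c for c in decrypted_text.lower() if 'a' <= c <= 'z']
    decrypted_letter_counts = Counter(letters)
    total = len(letters) or 1  # avoid dividing by zero on letter-free input
    # Compute chi-squared over every expected letter, so that letters
    # absent from the candidate still contribute to the statistic
    chi_squared = sum(
        (decrypted_letter_counts[letter] / total - expected) ** 2 / expected
        for letter, expected in english_letter_frequency.items()
        if expected > 0
    )
    chi_squared_statistics.append((shift, chi_squared))

best_shift = min(chi_squared_statistics, key=lambda x: x[1])[0]
decrypted_text = caesar_cipher_decrypt(encrypted_text, best_shift)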
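
The whole attack can be tried end to end with the self-contained sketch below. It makes several assumptions: the reference table `ENGLISH_FREQ` holds commonly cited approximate English letter proportions rather than values measured from the novel, the helper names `shift_text`, `chi_squared`, and `guess_shift` are illustrative, and the sample is a single sentence rather than the full text.

```python
from collections import Counter

# Commonly cited approximate letter proportions for English text (assumed
# reference values; the frequencies measured in Step 2 could be used instead).
ENGLISH_FREQ = {
    'a': 0.08167, 'b': 0.01492, 'c': 0.02782, 'd': 0.04253, 'e': 0.12702,
    'f': 0.02228, 'g': 0.02015, 'h': 0.06094, 'i': 0.06966, 'j': 0.00153,
    'k': 0.00772, 'l': 0.04025, 'm': 0.02406, 'n': 0.06749, 'o': 0.07507,
    'p': 0.01929, 'q': 0.00095, 'r': 0.05987, 's': 0.06327, 't': 0.09056,
    'u': 0.02758, 'v': 0.00978, 'w': 0.02360, 'x': 0.00150, 'y': 0.01974,
    'z': 0.00074,
}

def shift_text(text, shift):
    """Caesar-shift ASCII letters by `shift` (negative values shift backwards)."""
    def move(ch):
        if 'a' <= ch <= 'z':
            return chr((ord(ch) - ord('a') + shift) % 26 + ord('a'))
        if 'A' <= ch <= 'Z':
            return chr((ord(ch) - ord('A') + shift) % 26 + ord('A'))
        return ch
    return ''.join(move(ch) for ch in text)

def chi_squared(text):
    """Chi-squared distance between the letter proportions of `text` and ENGLISH_FREQ."""
    counts = Counter(c for c in text.lower() if c in ENGLISH_FREQ)
    total = sum(counts.values())
    if total == 0:
        return float('inf')  # nothing to compare
    return sum((counts[letter] / total - expected) ** 2 / expected
               for letter, expected in ENGLISH_FREQ.items())

def guess_shift(ciphertext):
    """Return the shift in 0..25 whose decryption looks most like English."""
    return min(range(26), key=lambda s: chi_squared(shift_text(ciphertext, -s)))

sample = ("It is a truth universally acknowledged, that a single man in "
          "possession of a good fortune, must be in want of a wife.")
ciphertext = shift_text(sample, 5)
recovered = guess_shift(ciphertext)
print(recovered)                           # 5
print(shift_text(ciphertext, -recovered))  # prints the original sentence
```

Very short or unusual ciphertexts may not resemble the reference frequencies closely enough for the smallest chi-squared value to pick out the right shift, so longer inputs give more dependable results.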
    

Results

After identifying the correct shift using frequency analysis, the original text is successfully decrypted.

Conclusion

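The Caesar cipher provides a simple yet effective demonstration of encryption and decryption. Because it has only 25 non-trivial keys and leaves the shape of the letter distribution intact, it is easily broken by frequency analysis and offers no real security, but it serves as an excellent introduction to cryptography.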