Crypto Analysis to Crack Vigenere Ciphers

(This post assumes some familiarity with both Vigenere and Caesar Shift Ciphers.  You can do some background reading on them here first).

We can crack a Vigenere Cipher using mathematical analysis.  Vigenere Ciphers are more difficult to crack than Caesar Shifts, but they are still susceptible to mathematical techniques.  As an example, say we receive the code:

VVLWKGDRGLDQRZHSHVRAVVHZKUHRGFHGKDKITKRVMG

If we know it is a Vigenere Cipher encoded with the word CODE then we can create the following decoding table.

VIGENERE4

Here we have 4 alphabets, each starting with one of the letters of the code word.  To decode we cycle through the alphabets.  The first code letter is V so we find this in the C row and then look at the letter at the top of the column, which is T.  This is our first letter.  Next the second code letter is also V, but this time we find it in the O row.  The column letter corresponding to this is H.  Continuing this method gives the decoded sentence:

THIS IS AN EXAMPLE OF HOW THE VIGENERE CIPHER WORKS
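The table lookup above can also be done with arithmetic.  Here is a minimal Python sketch, assuming the message contains only capital letters A to Z with no spaces; the function name vigenere_decrypt is just an illustrative choice.

```python
from itertools import cycle


def vigenere_decrypt(ciphertext: str, keyword: str) -> str:
    """Decode A-Z ciphertext by cycling through the letters of the keyword."""
    plain = []
    for code_letter, key_letter in zip(ciphertext, cycle(keyword)):
        # Each keyword letter names an alphabet: C is a shift of 2, O of 14, and so on.
        shift = ord(key_letter) - ord("A")
        plain.append(chr((ord(code_letter) - ord("A") - shift) % 26 + ord("A")))
    return "".join(plain)


print(vigenere_decrypt("VVLWKGDRGLDQRZHSHVRAVVHZKUHRGFHGKDKITKRVMG", "CODE"))
# THISISANEXAMPLEOFHOWTHEVIGENERECIPHERWORKS
```

Subtracting the shift and wrapping round with % 26 is the same as finding the code letter in the keyword's row and reading off the column heading.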

How do we know what cipher to use? 

In any kind of cryptanalysis we first need to decide which technique has been used.  Say for example we receive the message:

GZEFWCEWTPGDRASPGNGSIAWDVFTUASZWSFSGRQOHEUFLAQVTUWFV
JSGHRVEEAMMOWRGGTUWSRUOAVSDMAEWNHEBRJTBURNUKGZIFOHR
FYBMHNNEQGNRLHNLCYACXTEYGWNFDRFTRJTUWNHEBRJ

In real code breaking we won’t have a message alongside it saying, “Use a Vigenere Cipher.”  A large part of the skill of code breaking is deciding which encoding technique has been used.  For our received message we have the following letter frequencies:

VINEGERE7
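The numbers behind a bar chart like this are easy to tally ourselves.  The sketch below is one way to do it in Python; letter_frequencies is a made-up helper name, and it simply ignores anything that is not a letter.

```python
from collections import Counter
from string import ascii_uppercase


def letter_frequencies(text: str) -> dict[str, float]:
    """Return the relative frequency of each letter A-Z in the text."""
    letters = [ch for ch in text.upper() if ch in ascii_uppercase]
    if not letters:
        return {letter: 0.0 for letter in ascii_uppercase}
    counts = Counter(letters)
    return {letter: counts[letter] / len(letters) for letter in ascii_uppercase}


message = (
    "GZEFWCEWTPGDRASPGNGSIAWDVFTUASZWSFSGRQOHEUFLAQVTUWFV"
    "JSGHRVEEAMMOWRGGTUWSRUOAVSDMAEWNHEBRJTBURNUKGZIFOHR"
    "FYBMHNNEQGNRLHNLCYACXTEYGWNFDRFTRJTUWNHEBRJ"
)
freqs = letter_frequencies(message)
# Show the five most common letters, largest first.
for letter in sorted(freqs, key=freqs.get, reverse=True)[:5]:
    print(letter, round(freqs[letter], 3))
```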

So, in this case is it best to look for a Caesar Shift or a Vigenere Cipher?  To find out, we need to measure how “smooth” the bar chart is and compare it with the expected frequencies.  The expected values in English are:

vigenere3

A Caesar Shift simply shifts every letter in the message by a given number of letters in the alphabet, so we would expect a frequency bar chart for a Caesar Shift to have the same peaks and troughs as English (just shifted along).  The Vigenere makes frequency analysis more difficult because it “smooths out” the frequencies: each plain text letter is encoded by several different alphabets, so the bar chart for the frequency will be less spiky and more uniform.
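To see this effect on a small scale, here is a hedged Python sketch.  The helper names shift_encrypt and profile are our own, and the text is assumed to be capital letters only.  A one-letter keyword gives a Caesar Shift, a longer keyword gives a Vigenere Cipher.

```python
from collections import Counter


def shift_encrypt(plain: str, keyword: str) -> str:
    """Encode A-Z text, cycling through the keyword letters as shifts."""
    out = []
    for i, ch in enumerate(plain):
        shift = ord(keyword[i % len(keyword)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)


def profile(text: str) -> list[int]:
    """Letter counts from most to least common, ignoring which letter is which."""
    return sorted(Counter(text).values(), reverse=True)


plain = "THISISANEXAMPLEOFHOWTHEVIGENERECIPHERWORKS"
# A Caesar Shift only relabels the bars, so the shape of the chart is unchanged.
print(profile(plain) == profile(shift_encrypt(plain, "D")))  # True
# A Vigenere Cipher spreads one plain text letter over several code letters.
print(profile("EEEE"), profile(shift_encrypt("EEEE", "AB")))  # [4] [2, 2]
```

In the toy example the single tall bar for E is split into two shorter bars for E and F, which is exactly the smoothing described above.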

Index of Coincidence

A mathematical way to check how smooth the bar chart is, is to use the Index of Coincidence (I.C.).  This method is outlined in this post on Practical Cryptography, and uses this formula:

VIGENERE5

There is also a script on the site to work out the I.C. for us.  If we enter our received code we get an I.C. of 0.045.  We would expect an I.C. of around 0.067 for a regular distribution of English letters (which we would find in a Caesar Shift for example), and around 0.038 (that is, 1/26) for letters chosen completely at random.  Since 0.045 sits well below 0.067, this is a clue that we have a Vigenere Cipher rather than a Caesar Shift.
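We can also compute the I.C. ourselves.  The sketch below assumes the usual definition: for each letter multiply its count n by (n - 1), add these up, and divide by N(N - 1), where N is the length of the text.  We are assuming this matches the formula pictured above, and the function name is our own.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Chance that two letters picked at random from the text are the same."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total < 2:
        return 0.0  # not enough letters to pick a pair
    counts = Counter(letters)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))


print(index_of_coincidence("AAAA"))  # 1.0, every pair matches
print(index_of_coincidence("ABCD"))  # 0.0, no pair matches
```

A spiky bar chart gives a high value and a flat one gives a low value, which is why the I.C. works as a smoothness measure.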

Exploiting the cyclic nature of the Vigenere Cipher

So, we suspect it is a Vigenere Cipher.  Next we want to find the code word that was used to generate the code table.  To do this we can look at the received code for repeating groups of letters.  Because the code word is reused over and over, pieces of plain text that line up with the same part of the code word are encoded in the same way, so there will also be a cyclic nature to the encoded message.

Using the site Crypto Corner we can analyse the text for repeating patterns of letters.  This gives us:

VINEGER8

This clearly indicates that there are a lot of letters repeating with a period of 3.  Therefore it is a good guess that the keyword also has length 3.
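We don't know exactly how the Crypto Corner tool does its counting, but one simple way to look for a period is to slide the message along itself and count how many letters line up with an identical letter.  The Python sketch below does this under that assumption; coincidences_by_shift is a hypothetical helper name.

```python
def coincidences_by_shift(text: str, max_shift: int = 10) -> dict[int, int]:
    """For each shift, count positions where the text matches a shifted copy of itself."""
    return {
        shift: sum(1 for a, b in zip(text, text[shift:]) if a == b)
        for shift in range(1, max_shift + 1)
    }


# A toy message with an obvious period of 3.
print(coincidences_by_shift("ABCABCABC", max_shift=6))
# {1: 0, 2: 0, 3: 6, 4: 0, 5: 0, 6: 3}
```

On a real message the counts are noisier, but shifts that are multiples of the keyword length should tend to stand out.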

So, next we can split the received message into 3 separate messages:

GFEPRPGAVUZFRHFQUVGVAOGURADEHRBNGFRBNQRNYXYNRRUHR
ZWWGAGSWFAWSQELVWJHEMWGWUVMWEJUUZOFMNGLLATGFFJWEJ
ECTDSNIDTSSGOUATFSREMRTSOSANBTRKIHYHENHCCEWDTTNB

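Here we have simply generated the first line by taking the first, fourth, seventh, tenth etc. letters, the second line by taking the second, fifth, eighth etc. letters, and the third line from what is left.  If the keyword really has length 3, every letter in a given line was encoded with the same alphabet, so each line is just a Caesar Shift.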
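Splitting the message like this takes one line of Python using slice steps.  This is a small sketch with our own function name, split_into_columns.

```python
def split_into_columns(text: str, key_length: int) -> list[str]:
    """Column i holds letters i, i + key_length, i + 2 * key_length, ... of the text."""
    return [text[i::key_length] for i in range(key_length)]


ciphertext = (
    "GZEFWCEWTPGDRASPGNGSIAWDVFTUASZWSFSGRQOHEUFLAQVTUWFV"
    "JSGHRVEEAMMOWRGGTUWSRUOAVSDMAEWNHEBRJTBURNUKGZIFOHR"
    "FYBMHNNEQGNRLHNLCYACXTEYGWNFDRFTRJTUWNHEBRJ"
)
for line in split_into_columns(ciphertext, 3):
    print(line)
# GFEPRPGAVUZFRHFQUVGVAOGURADEHRBNGFRBNQRNYXYNRRUHR
# ZWWGAGSWFAWSQELVWJHEMWGWUVMWEJUUZOFMNGLLATGFFJWEJ
# ECTDSNIDTSSGOUATFSREMRTSOSANBTRKIHYHENHCCEWDTTNB
```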

Cracking the code

Now we can do three separate Caesar Shift tests on these separate lines.

The first line has frequency:

vigg1

which strongly suggests that R in the cipher text is going to E, the most common letter in English.  This gives us the following Caesar Shift:

vigenere10

The second line has the following frequency:

VINEGER11

which strongly suggests that W in the cipher text is going to E.  This gives us:

vigenere11
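The guess "the most common code letter stands for E" can be turned into a keyword letter directly.  This is only a rough sketch of that heuristic, with a made-up function name, and it assumes each line is long enough for E to come out on top.

```python
from collections import Counter


def guess_key_letter(column: str, assumed_plain: str = "E") -> str:
    """Guess the keyword letter by assuming the most common code letter stands for E."""
    most_common = Counter(column).most_common(1)[0][0]
    shift = (ord(most_common) - ord(assumed_plain)) % 26
    return chr(shift + ord("A"))


first = "GFEPRPGAVUZFRHFQUVGVAOGURADEHRBNGFRBNQRNYXYNRRUHR"
second = "ZWWGAGSWFAWSQELVWJHEMWGWUVMWEJUUZOFMNGLLATGFFJWEJ"
print(guess_key_letter(first), guess_key_letter(second))  # N S
```

The heuristic is not guaranteed to work.  In the third line the most common letter is T rather than E, so this function would suggest P, and we need a different idea for the last letter of the keyword.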

Lastly we notice that this will give us the codeword NS_.  NSA (the American digital spy agency) would be a good guess, so for the third Caesar Shift we try the letter A, which leaves the third line unchanged:

vigg2

Putting these together we have the Vigenere Cipher:

vigg3

and this decodes our received code as:

THE SECRET CODE IS CONTAINED IN THIS MESSAGE.  YOU MUST ADD THE FIRST PRIME NUMBER TO THE SECOND SQUARE NUMBER TO CRACK THIS. WHEN YOU HAVE DONE THAT CLICK BELOW AND ENTER THE NUMBER.
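As a check, we can undo the three Caesar Shifts separately and then weave the three lines back together.  The Python sketch below does this with the keyword letters N, S and A; the helper name caesar_decrypt is our own, and the output has no spaces or punctuation because the code has none.

```python
from itertools import zip_longest


def caesar_decrypt(text: str, key_letter: str) -> str:
    """Undo a Caesar Shift whose size is given by a keyword letter (A = 0, B = 1, ...)."""
    shift = ord(key_letter) - ord("A")
    return "".join(chr((ord(c) - ord("A") - shift) % 26 + ord("A")) for c in text)


columns = [
    "GFEPRPGAVUZFRHFQUVGVAOGURADEHRBNGFRBNQRNYXYNRRUHR",
    "ZWWGAGSWFAWSQELVWJHEMWGWUVMWEJUUZOFMNGLLATGFFJWEJ",
    "ECTDSNIDTSSGOUATFSREMRTSOSANBTRKIHYHENHCCEWDTTNB",
]
decoded = [caesar_decrypt(col, key) for col, key in zip(columns, "NSA")]

# Take one letter from each line in turn; the third line is one letter shorter.
message = "".join("".join(group) for group in zip_longest(*decoded, fillvalue=""))
print(message[:37])  # THESECRETCODEISCONTAINEDINTHISMESSAGE
```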

We have done it!  We have cracked the Vigenere Cipher using a mixture of statistics, logic and intuition.  The method may seem long, but this was a cipher that was thought to be unbreakable – and indeed took nearly 300 years to crack.  Today, using statistical algorithms it can be cracked in seconds.  Codes have moved on from the Vigenere Cipher – but maths remains at the heart of both making and breaking them.
