You are currently browsing the category archive for the ‘Code Challenge’ category.

**Crack the Beale Papers and find a $65 Million buried treasure?**

The story of a priceless buried treasure of gold, silver and jewels (worth around $65 million in today’s money) began in January 1822. A stranger by the name of Thomas Beale walked into the Washington Hotel Virginia with a locked iron box, which he gave to the hotel owner, Robert Morriss. Morriss was to look after the box for Beale as he went off on his travels.

In May 1822 Morriss received a letter from Beale which stated that the box contained papers of huge value – but that they were encoded for protection. Beale went on to ask that Morriss continue to look after the box until his return. He added that if he did not return in the next 10 years then he had instructed a close friend to send the cipher key on June 1832. After that time Morriss would be able to decipher the code and learn of the box’s secrets.

Well, Beale never returned, nor did Morriss receive the promised cipher key. Eventually he decided to open the box. Inside were three sheets of paper written in code, and an explanatory note. The note detailed that Beale had, with a group of friends discovered a seam of gold and other precious metals in Santa Fe. They had mined this over a number of years – burying the treasure in a secret location for safe keeping. The note then explained that the coded messages would give the precise location of the treasure as well as detailing which men were due a share.

Morriss devoted many years to trying to decipher the code in vain – before deciding at the age of 84 in 1862 that he should share his secret with a close friend. That friend would later publish the Beale Papers in 1885. The pamphlet that was published stirred huge interest in America – inspiring treasure hunters and amateur cryptographers to try and crack the code. The second of the 3 coded messages was cracked by the author of the pamphlet using what is known as a book code. The United States Declaration of Independence was used as the book to encode the message above.

The first number 115 refers to the 115th word in the Declaration of Independence, which is the word “instituted”. Therefore the first letter of the decoded message is “I”. The second number is 73, which refers to the 73rd word in the declaration – which is “hold”, so the second letter of the decoded message is “h”. Following this method, the following message was revealed:

*I have deposited in the county of Bedford, about four miles from Buford’s, in an excavation or vault, six feet below the surface of the ground, the following articles, belonging jointly to the parties whose names are given in number three, herewith:*

*The first deposit consisted of ten hundred and fourteen pounds of gold, and thirty-eight hundred and twelve pounds of silver, deposited Nov. eighteen nineteen. The second was made Dec. eighteen twenty-one, and consisted of nineteen hundred and seven pounds of gold, and twelve hundred and eighty-eight of silver; also jewels, obtained in St. Louis in exchange for silver to save transportation, and valued at thirteen thousand dollars.*

*The above is securely packed in iron pots, with iron covers. The vault is roughly lined with stone, and the vessels rest on solid stone, and are covered with others. Paper number one describes the exact locality of the vault, so that no difficulty will be had in finding it. Source*

After the pamphlet was published there was great interest in cracking the 2 remaining papers, an interest which has persisted into modern times. One of the uncracked papers is shown below:

In 1983 2 amateur treasure hunters were jailed for trying to dig up graves in Bedford, sure that they were about to find the missing gold. In 1989 a professional treasure hunter called Mel Fisher secretly bought a large plot of land after believing that the treasure was buried underneath. However nothing was found. Up until now all efforts to crack the code above have ended in failure. Perhaps the pamphlet was a giant hoax? Or perhaps the treasure is still waiting to be found.

The town of Bedford still receives visitors from around the world, keen to try and crack this centuries old puzzle. You can hire metal detectors and go looking for it yourself. The map above from 1891 shows the 4 mile radius from Buford’s tavern which is thought to contain the treasure. Maybe one day Beale’s papers will finally be cracked.

For more information on this topic read Simon Singh’s excellent The Code Book – which has more details on this case and many other code breaking puzzles throughout history.

If you want to try your own codebreaking skills, head over to our Schoolcodebreaking site – to test your wits against students from schools around the world!

**Murder in the Maths Department **

A murder has been committed in the maths department! A body has been discovered surrounded by mathematical objects and only the hardworking maths teachers were in school, doing long division sums for fun at the weekend. One of them must be the murderer. *(The wall of fame of successful detectives will be posted here)*

**Your task, should you choose to accept it, is to find:
**

**1) The murderer**

2) The room

3) The murder weapon

**The murder suspects are:**

1) Al Jabra – who was wearing a white, T-shirt with 2 stripes and ripped jeans on the day of the murder.

2) Polly Gon – who was wearing a knee-length green skirt, white blouse and gold watch.

3) Lisa Perbound- who was wearing a blue Adidas T-shirt with 3 stripes on the sleeves, Bermuda shorts and a baseball cap.

4) May Trix- who was wearing a black and white pin-stripe suit with shiny black shoes.

5) Ella Ment- who was wearing a blue knitted jumper with a picture of pi on the front, and brown cords.

The possible rooms are:

**The possible rooms are:
**

1) The Canteen

2) The Tuck-shop

3) Room 20

4) Room 18

5) Room 17

6) Room 7

**The possible murder weapons are:**

1) A wooden metre ruler

2) A large metal stapler

3) A dusty trundle wheel

4) A sharp compass

5) A large maths textbook

6) An oversized calculator

**Clue Number 1**

When you have solved this clue – click here, and enter the last word only of the decoded message (no capital letters).

This is inspired from the great site, Practical Cryptography which is a really good resource for code making and code breaking. One of their articles is about how we can use the Chi Squared test to crack a Caesar Shift code. Indeed, if you use an online program to crack a Caesar shift, they are probably using this technique.

This is the formula that you will be using for Chi Squared. It looks more complicated than it is. Say we have the following message (also from Practical Cryptography):

*AOLJHLZHYJPWOLYPZVULVMAOLLHYSPLZARUVDUHUKZPTWSLZAJPWOLY ZPAPZHAFWLVMZBIZAPABAPVUJPWOLYPUDOPJOLHJOSLAALYPUAOLWSH PUALEAPZZOPMALKHJLYAHPUUBTILYVMWSHJLZKVDUAOLHSWOHILA*

We first work out the frequency of each letter which we do using the Counton site.

We next need to work out the expected values for each letter. To do this we first need the expected percentages for the English language:

Then we can count the number of letters in the code we want to crack (162 – again we can use an online tool)

Now, to find the expected number of As in the code we simply do 162 x 0.082 = 13.284.

The actual number of As in the code is 18.

Therefore we can do (13.284-18)^{2}/18 following the formula at the top of the page.

We then do exactly the same for the Bs in the code. The expected number is 162 x 0.015 = 2.43. The actual number is 3.

Therefore we can do (3-2.43)^{2 }/2.43

We do this same method for all the letters A-Z and then add all those numbers together. This is our Chi Squared statistic. The lower the value, the closer the 2 distributions are. If the expected values and the observed values are the same then there will be a chi squared of zero.

If you add all the values together you get a Chi Squared value of ≈1634 – which is quite large! This is what we would expect – because we already know that the code we have received has letter frequencies quite different to normal English sentences. Now, what a Caesar Shift decoder can do is shift the received code through all the permutations and then for each one find out the Chi Squared value. The permutation with the lowest Chi Squared will be the solution.

For example, if we shift every letter in our received code back by one – using the Counton tool (so A goes to Z etc) we get:

*ZNKIGKYGXIOVNKXOYUTKULZNKKGXROKYZQTUCTGTJYOSVRKYZIOVNKX YOZOYGZEVKULYAHYZOZAZOUTIOVNKXOTCNOINKGINRKZZKXOTZNKVRG OTZKDZOYYNOLZKJGIKXZGOTTASHKXULVRGIKYJUCTZNKGRVNGHKZ*

We can then do the same Chi Squared calculations as before. This will give a Chi Squared of ≈3440 – which is an even worse fit than the last calculation. If we carried this on so that A goes to T we would get:

THECAESARCIPHERISONEOFTHEEARLIESTKNOWNANDSIMPLESTCIPHER SITISATYPEOFSUBSTITUTIONCIPHERINWHICHEACHLETTERINTHEPLA INTEXTISSHIFTEDACERTAINNUMBEROFPLACESDOWNTHEALPHABET

and a Chi Squared on this would show that this has a Chi Squared of ≈33 – ie it is a very good fit. (You will get closer to zero on very long code texts which follow standard English usage). Now, obviously we could see that this is the correct decryption without even working out the Chi Squared value – but this method allows a computer to do it, without needing the ability to understand English. Additionally a codebreaker who spoke no English would still be able to decipher this code, on mathematics alone.

The Practical Cryptography site have a tool for quickly working out Chi Squared values from texts – so you can experiment with your own codes. Note that this is a slightly different use of Chi-Squared as here we are not comparing with a critical value, but instead comparing all Chi Squared to find the lowest value.

If you liked this post you might also like:

Code Breakers Wanted by the NSA – A look at some other code breaking techniques.

RSA Public Key Encryption – The Code that Secures the internet – How understanding RSA code is essential for all people involved in internet security.

**IB Revision**

If you’re already thinking about your coursework then it’s probably also time to start planning some revision, either for the end of Year 12 school exams or Year 13 final exams. There’s a really great website that I would strongly recommend students use – you choose your subject (HL/SL/Studies if your exam is in 2020 or Applications/Analysis if your exam is in 2021), and then have the following resources:

The Questionbank takes you to a breakdown of each main subject area (e.g. Algebra, Calculus etc) and each area then has a number of graded questions. What I like about this is that you are given a difficulty rating, as well as a mark scheme and also a worked video tutorial. Really useful!

The Practice Exams section takes you to ready made exams on each topic – again with worked solutions. This also has some harder exams for those students aiming for 6s and 7s and the Past IB Exams section takes you to full video worked solutions to every question on every past paper – and you can also get a prediction exam for the upcoming year.

I would really recommend everyone making use of this – there is a mixture of a lot of free content as well as premium content so have a look and see what you think.

**Crypto Analysis to Crack Vigenere Ciphers**

*(This post assumes some familiarity with both Vigenere and Ceasar Shift Ciphers. You can do some background reading on them here first).*

We can crack a Vigenere Cipher using mathematical analysis. Vigenere Ciphers are more difficult to crack than Caesar Shifts, however they are still susceptible to mathematical techniques. As an example, say we receive the code:

*VVLWKGDRGLDQRZHSHVRAVVHZKUHRGFHGKDKITKRVMG*

If we know it is a Vigenere Cipher encoded with the word CODE then we can create the following decoding table.

Here we have 4 alphabets, each starting with the letters of the code word. To decode we cycle through the alphabets. The first code letter is V so we find this in the C row and then look at the letter at the top of the column – this is T. This is our first letter. Next the second code letter is also V, but this time we find it the O row. The column letter corresponding to this is H. We continue this method which gives the decoded sentence:

*THIS IS AN EXAMPLE OF HOW THE VIGENERE CIPHER WORKS*

**How do we know what cipher to use? **

In any kind of crypto-analyis we need to decide which technique has been used. Say for example we receive the message:

*GZEFWCEWTPGDRASPGNGSIAWDVFTUASZWSFSGRQOHEUFLAQVTUWFV*

* JSGHRVEEAMMOWRGGTUWSRUOAVSDMAEWNHEBRJTBURNUKGZIFOHR*

* FYBMHNNEQGNRLHNLCYACXTEYGWNFDRFTRJTUWNHEBRJ*

In real code breaking we won’t have a message alongside it saying, “Use a Vigenere Cipher.” A large part of the skill of code breaking is deciding which encoding technique has been used. For our received message we have the frequency:

So, in this case is it best to do look for a Caesar Shift or a Vigenere Cipher? To find this out, we could do with finding out how “smooth” the bar chart is and how it compares with the expected frequencies. The expected values in English are:

A Caesar Shift simply shifts every letter in the message by a given number of letters in the alphabet, so we would expect a frequency barchart for a Caesar Shift to have the same peaks and troughs (just shifted along). The Vigenere makes frequency analysis more difficult because it “smooths out” the frequencies – this means that the bar chart for the frequency will be less spiky and more uniform.

**Incidence of Coincidence**

A mathematical method to check how smooth the bar chart is, is to use the Incidence of Coincidence – this method is outlined in this post on Practical Cryptography, and uses this formula:

There is also a script on the site to work out the I.C for us. If we enter our received code we get an I.C of 0.045. We would expect an I.C of around 0.067 for a regular distribution of English letters (which we would find in a Caesar Shift for example). Therefore this I.C value is a clue that we have a Vigenere Cipher rather than a Caesar shift.

**Exploiting the cyclic nature of the Vigenere Cipher**

So, we suspect it is a Vigenere Cipher, next we want to find out what the code word that was used to generate the code table is. To do this we can look at the received code for repeating groups of letters. There is a cyclic nature to the Vigenere Cipher, so there will also be a cyclic nature to the encoded message.

Using the site Crypto Corner we can analyse the text for repeating patterns of letters. This gives us:

This clearly indicates that there are a lot of letters repeating with period of 3. Therefore it is a good guess that the keyword is also length 3.

So, next we can split the received message into 3 separate messages:

*GFEPRPGAVUZFRHFQUVGVAOGURADEHRBNGFRBNQRNYXYNRRUHR*

* ZWWGAGSWFAWSQELVWJHEMWGWUVMWEJUUZOFMNGLLATGFFJWEJ*

* ECTDSNIDTSSGOUATFSREMRTSOSANBTRKIHYHENHCCEWDTTNB*

Here we have simply generated the first line by taking the first, fourth, seventh, tenth etc. letters.

**Cracking the code**

Now we can do three separate Cesar Shift tests on these separate lines:

The first line has frequency:

which strongly suggests that R in the cipher text is going to E. This gives us the following Caesar Shift:

The second line has the following frequency:

Which strongly suggests that W in the cipher text is going to E. This gives us:

Lastly we notice that this will give us the codeword NS_. Well NSA, (the American digital spy agency) would be a good guess so for the third Caesar Shift we try:

Putting these together we have the Vigenere Cipher:

and this decodes our received code as:

*THE SECRET CODE IS CONTAINED IN THIS MESSAGE. YOU MUST ADD THE FIRST PRIME NUMBER TO THE SECOND SQUARE NUMBER TO CRACK THIS. WHEN YOU HAVE DONE THAT CLICK BELOW AND ENTER THE NUMBER.*

We have done it! We have cracked the Vigenere Cipher using a mixture of statistics, logic and intuition. The method may seem long, but this was a cipher that was thought to be unbreakable – and indeed took nearly 300 years to crack. Today, using statistical algorithms it can be cracked in seconds. Codes have moved on from the Vigenere Cipher – but maths remains at the heart of both making and breaking them.

If you enjoyed this post you might also like:

The Maths Code Challenge – three levels of codes to attempt, each one providing a password to access the next code in the series. Can you make it onto the leaderboard?

RSA public key encryption – the code that secures the internet.

**NSA Code Breaking Puzzle Number 2**

Here is the second of the NSA’s code breaking tweets – crack them all and you might have a chance of a job with the super-secret digital spying agency.

Rimfinnpeqcnvqauuagcrdokvdisn

drdcrpigaisacpsdffaicvhakcfdqfpq

detrkilfaecnpqacakqisacpfampoa

cfimannicfakdumfalddnraprf

After a few attempts it looks a bit harder than the first one – it’s not a substitution cipher like the first code, it’s not a Caesar shift, it’s not a transposition cipher, it’s not an atbash code. The number of letters is 117 – which has factors 1,3,9,13, 39,117 However the frequency distribution is:

This distribution does look similar to the expected frequencies for the alphabet:

so maybe nothing too fancy is happening. However, the fact that there are no letters at all for w,x,y,z is a little odd – you would expect the letters with no frequency to be spaced out if it was any kind of substitution.

Trying something like a Caesar shift and then a transposition cipher doesn’t seem to throw anything up. There’s also a repeated string of letters in the text –* isacp* which is repeated after 42 letters. This might suggest that this is a Vigenere cipher – with a keyword of length 2,3,6,7 or 21.

We can use the awesome Crypto Corner website to analyse this. However, I’ve not been able to get any further – I’ve tried guessing some obvious keywords (NSA, Careers, agency) etc – without any luck. I have also tried guessing possible first words (The, This, NSA) etc – to try and work out a likely keyword. Again, nothing! Maybe it’s not a Vigenere cipher after all.

So, time to throw in the towel and find the solution online. It turns out that actually it wasn’t as complicated as a Vignere cipher – but it did require using 2 separate actions. First, we need to reverse the text in the message:

frparnddlafmudkafcinnamifcaopmaf

pcasiqkacaqpnceaflikrtedqpfqdfckah

vciaffdspcasiagiprcdrdnsidvkod

rcgauuaqvncqepnnifmiR

and now we then use some software to analyse the string for a substitution cipher.

This gives the solution:

*nsa is looking for intelligent imaginative critical thinkers who can contribute innovative ideas to solve our most difficult challenges*

**Code Breakers Wanted by the NSA**

The American National Security Agency have just launched a new code breaking challenge. The tweet above is the first in their series for those interested in a career in code breaking. The NSA are possibly in search of some good publicity after the revelations of Edward Snowden with regards to just how much data they collect on citizens, and people who can crack all their codes may get a chance to work with the organisation.

The first code above looks like quite an easy one – based on a Caesar shift (where we shift letters along by a given amount). The only difficulty is deciding how many letters to shift the alphabet along by. To help we can use a frequency analysis table like that below:

This counts how often each letter occurs in the code. We can see that the most common letters are C I and P. It’s a good chance that one of these corresponds to the letter E (the most common letter in the English alphabet).

C to E corresponds to a shift of 2. But if we try this in the first word we get:

vrh as the first 3 letters – not very promising!

I to E corresponds to a shift of -4. Trying this with the first word we get:

plb as the first 3 letters – again not likely!

P to E corresponds to a shift of -11. Trying this with the first word we also get gibberish. So we could continue with this method, but if we do, we’ll not be able to break the code. It’s not a Caesar Shift – but a more complicated substitution cipher, where every code letter is randomly assigned an alphabet letter. If the message was long enough, the frequency analysis would still help us.

This shows that the most frequent letters in the English alphabet are e,t,a,0,i. We could then to match the letter frequencies with all the code letters by trial and error til we found a match. However, the message is quite short so instead there is an online tool that will do all this analysis for us and work out the most likely solutions:

From these suggestions we can see that the solution is actually:

*Want to know what it takes to work at NSA? Check back each Monday in May as we explore career essentials to protect our nation*

(notice that I does indeed go to E – as we thought it might do based on frequency analysis. C also goes to T and P also goes to A. Therefore using frequency analysis the 3 most frequent letters in the code word correspond to the 3 most frequent letters in the solution).

Let’s see what new messages are posted next Monday.

The School Code Challenge is based on a similar competition that GCHQ (The UK digital spy agency) are running. My clues will however be a little more accessible! I have created a number of codes that need to be broken. Each code will give a password. When you crack the code, follow the link and enter the password to access the next code. There are five codes in each level. After completing the final code there will be a submission box where you can enter you name. I will put up a leaderboard of all the successful entrants.

There are now 3 levels of codes – Level 1 is designed for KS3 and KS4 students (11-16). Level 2 is designed for KS4 (14-16). Level 3 is designed for post 16 students (A level and IB). Can you break all levels? Scroll down to see a leaderboard of students who have completed this challenge!

**Level 1 Code Number 1:**

**Level 2 Code Number 1:**

**Level 3 Code Number 1:**

**Level 1 Code Number 1 Information:**

1) A Roman emperor may help you discover the method.

2) Frequency analysis may also help you – what is the most common letter in the English alphabet?

When you have solved this puzzle click here to enter you password and gain access to the second code.

**Level 2 Code Number 1 Information:
**

Actually, this is quite a nice gentle start to Level 2 – but don’t worry, they do get harder! When you’ve cracked it, click here for the second code.

**Level 3 Code Number 1 Information:
**

Binary strings are used as a method of transmitting data across large distances (Voyager 1 used this method to transmit pictures from Jupiter in 1979 – a massive 600 million miles away). Make a rectangular grid by factorising the length of the binary string above into prime factors to reveal a secret message. When you’ve solved it write the password (in word(s)) here. Warning – this level is not for the faint hearted!

**Leaderboard: Click here (and scroll down) to see a list of students who have completed this challenge!
**

**Teachers**

If you are a teacher and would like to run this as an in-house school competition then if you contact me I will send out the full questions, answers and code breaking pack (with student worksheets and murder mysteries etc). It is also available on TES here.

For more code breaking content, check out:

Crack the Code to Become a Spy : Information about the current GCHQ competition – with prizes to be won for those who can solve the puzzle.

Cracking Codes Lesson. An example of 2 double period lessons on code breaking

NASA, Aliens and Binary Codes from the Stars. A post about how pictures can be transmitted across millions of miles using binary strings.