Friday, March 26, 2021

The Beale Papers, Part Four: What Can We Learn From The Cipher We Can Read?

 

The story behind the Beale “treasure” is in a sense meaningless, because even if we could absolutely prove that the entire story was false (leaving aside the problems in “proving” a negative), the legend would still have believers. They would simply join the group of people who propose that of course the story is a fake, meant to cover up that the real treasure was Jean Lafitte’s, or the Confederate treasury, or some other secret source.

The Beale story is in that sense meaningless because no problem found in it can ever finally break it. What we can reliably analyze are the ciphers themselves. If anything here is going to be real, it has to be the numbers. Are they real? To talk about that, we first have to gather what we can from B2: The Treasure. What have we learned about the encipherer by “breaking” B2?

The cipher in B2 was made using the Declaration of Independence as a book cipher, with every number representing a letter. Leaving aside for the moment the substantial number of mistakes that were made in the encoding, the message itself is written as if it were being sent to TJB’s tenth-grade English teacher. Any recognizable word that surfaces tells someone trying to break a cipher that they are on the right track, so a truly cautious coder will toss in the occasional nonsense word, include numbers or symbols that don’t have any meaning in the cipher (nulls), and spell words phonetically as opposed to being wedded to the Queen’s English (or King’s English, if one holds that this was written circa 1822), such as “siks” for “six” (which also would have helped with the VYX problem, further addressed below), along with other techniques. The encoder did none of these in B2. The only important thing to the PA was to make sure that the note was stylistically what they were aiming for. (Seriously speaking, if the cipher is real, the encoder could have written "six" as "five and one" in the same way they picked "ten hundred".)

THE KEY

I propose that the encoder made a key while constructing B2. They started with the alphabet listed on a sheet of paper. Each time a letter was needed, they scanned the Declaration of Independence until a word starting with the correct letter was found, and then wrote that word’s number on the list. For example, the first letter is “I”; when the encoder found that the 115th word starts with “I”, the number 115 was written on the cipher paper and on the key. After several different numbers had been found for a particular letter, the encoder began reusing numbers instead of looking them up individually, speeding up the ciphering process tremendously.
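The procedure proposed above can be sketched in a few lines of Python. This is only an illustration of the idea, not a reconstruction of what the encoder actually did: the function name `encipher`, the `max_per_letter` cutoff, and the rule of always scanning forward from the last word found are all my own assumptions, and the sample key text is just the opening words of the Declaration, used as a stand-in for the whole document.

```python
import random

def encipher(message, key_text, max_per_letter=2, rng=None):
    """Book-cipher sketch: look up new word numbers until a letter has
    max_per_letter of them (a hypothetical cutoff), then reuse the key."""
    rng = rng or random.Random(0)
    words = key_text.lower().split()
    key = {}        # letter -> word numbers written on the key so far
    resume_at = {}  # letter -> index where the next scan for it starts
    cipher = []
    for ch in message.lower():
        if not ch.isalpha():
            continue
        found = key.setdefault(ch, [])
        number = None
        if len(found) < max_per_letter:
            # Scan forward for the next word starting with this letter.
            for i in range(resume_at.get(ch, 0), len(words)):
                if words[i].startswith(ch):
                    number = i + 1          # word numbers are 1-based
                    resume_at[ch] = i + 1
                    found.append(number)
                    break
            else:
                resume_at[ch] = len(words)  # nothing left to find
        if number is None:
            if not found:
                raise ValueError(f"no word starts with {ch!r}")
            number = rng.choice(found)      # reuse a number from the key
        cipher.append(number)
    return cipher, key

KEY_TEXT = ("When in the Course of human events it becomes necessary for "
            "one people to dissolve the political bands which have "
            "connected them with another")

cipher, key = encipher("hide the tin", KEY_TEXT)
words = KEY_TEXT.lower().split()
decoded = "".join(words[n - 1][0] for n in cipher)
```

With this stand-in text, the second “e” has to reuse the number for “events” because no other word in the sample starts with “e”. A letter with no matching word at all raises an error, which is exactly the spot the encoder landed in with “X”.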

THE VYX PROBLEM, REVISITED

In the Declaration of Independence, two words begin with the letter “V”; the first is 819 words in. Due to the series of errors made by the encoder, this appears as #807, which seems to indicate that the encoder spent the effort to count out that far, although not far enough to reach 1133 (which would be 1121, after mistakes). They used 807 for every appearance of “V”, of which there were eighteen. This is by far the sloppiest approach to any letter. If we take the number of times that a letter is used in the code and divide it by the number of cipher numbers used to represent that letter, on average each number is used 4.5 times. This is where the encoder balanced security (the more numbers the better!) with convenience (looking up a new letter takes time!). The value for “V” is therefore eighteen, with “Y” coming in second with nine (nine appearances and, again, only one number used in enciphering), and “N” and “E” coming in with 8.6 and 7.2. (There are NO words that begin with “Y” in the Declaration; the encoder might have grabbed a word with a “Y” somewhere in it, or just taken a big number.) (Doing all this, I learned that the letters “E” and “N” appear *in* words far more often than they *begin* words. There, now there’s a possibility that you have learned something too; if you already knew this, please leave your age in the comments so I can work out how far behind I am.) We get into downright admirable security for “B” (appears 11 times, 7 substitution numbers are used), “M” (appears 7 times, 5 substitution numbers are used), and “J” (appears 3 times, using a different number each time). Only one number appears for the letter “K”, but it only gets used once. The only other letter ciphered with a single number was “X”.
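For anyone who wants to check the uses-per-number figures, here is a minimal sketch of the arithmetic. The function name `reuse_ratios` is mine, and the sample pairs are built only from the counts quoted above: eighteen uses of 807 for “V”, and three different numbers for “J”. The three “J” numbers are placeholders, not the real ones from B2.

```python
from collections import defaultdict

def reuse_ratios(pairs):
    """For each letter, divide how often it appears in the message by
    how many distinct cipher numbers were used to stand for it."""
    uses = defaultdict(int)
    numbers = defaultdict(set)
    for number, letter in pairs:
        uses[letter] += 1
        numbers[letter].add(number)
    return {letter: uses[letter] / len(numbers[letter]) for letter in uses}

# (cipher number, decoded letter) pairs; the J numbers are placeholders.
pairs = [(807, "V")] * 18 + [(1, "J"), (2, "J"), (3, "J")]
ratios = reuse_ratios(pairs)
```

A ratio of 1.0 means every appearance got its own number; the higher the ratio, the more the encoder leaned on numbers already written on the key.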

There are no words (duh) beginning with the letter “X” in the Declaration of Independence. You could have easily guessed that. I easily guessed that. Somehow, the encoder missed this until they sat down to translate their work, which includes the words “exchange”, “exact”, “excavation”, and “six”. Instead of using the word “trade” for “exchange”, “precise” for “exact”, “big freaking hole” for “excavation”, and “Rosencrantz! Guildenstern! Grab your shovels, we gotta add one (no, dammit, “V” is already a problem), TWO extra feet of dirt onto the treasure!” for “six”, they apparently grabbed a number way larger than any other number they had picked (possibly a number they thought was greater than the number of words in the Declaration of Independence) and trusted to context to let their audience (future TJB or MTF) work out that it was an “X”. A cautious encoder could also have used actual code words in addition to the ciphering, and more. The encoder of B2 did none of these.

The key looked like this at the end of encoding B2.


As an argument that this is what the writer did, the following table shows how often each number is used. Notice that in almost all cases, the first numbers found are the ones most used. (Perhaps the encoder finally reached the point of saying, "Screw it! I'm tired of looking for 'E's!"?) Since there are numbers that are used quite often, it stands to reason that these numbers had been written down.
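A tally like that is easy to rebuild from any transcription of the cipher. The sketch below is one way to do it, under stated assumptions: `usage_table` is a name I made up, the key is taken to list each letter's numbers in the order they were found, and the sample key and cipher are invented numbers, not values from B2.

```python
from collections import Counter

def usage_table(cipher, key):
    """For each letter, pair every number on the key with how many
    times that number shows up in the cipher, in key order."""
    counts = Counter(cipher)
    return {letter: [(n, counts[n]) for n in numbers]
            for letter, numbers in key.items()}

# Hypothetical key and cipher, for illustration only.
sample_key = {"e": [7, 33, 49]}
sample_cipher = [7, 7, 33, 7, 49, 7, 33]
table = usage_table(sample_cipher, sample_key)
```

If the encoder really worked from a written key, the counts should tend to fall off from left to right within each letter, as they do in this made-up example.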

So in enciphering the message we have as B2, the writer collected the numbers used so as to reuse them more easily, and did not worry about reusing the same number eighteen times for a hard-to-find letter, or about just grabbing two numbers and depending on context to demonstrate their meaning. To be fair, for a book code, this still represented a difficult-to-overcome level of security, but that was all due to the choice of a book code, and none of it to clever efforts on the part of the encoder. I have seen many theories about clever techniques that will explain why the other ciphers can't be made to make sense, but all of those depend on the encoder doing something with B1: The Location and B3: Names & Addresses that was drastically different to anything seen with regard to B2.

In the post that will go up on Monday morning, I will try to show how there *is* a way in which what we have as B1 and B3 can be explained by appealing to what we have, without a multiplication of outside forces.
 
