This is the third in a series of x posts (see the previous entry for my cunning
plan on this score) examining the Beale Papers, centered on a series of alleged
ciphers contained in a pamphlet printed in Virginia in 1885. Three series of
numbers were included, with the pamphlet stating that one cipher listed the
contents of the treasure, one listed the location of the treasure, and one
listed the names and addresses of the heirs among whom the treasure was
supposed to be divided.
If none
of the original ciphers had been broken, then I argue that far more people,
even those readers that had spent money on it, would have sooner or later
simply assumed that the numbers were random gibberish, and tossed it aside.
However, the higher-than-average price conveyed a sense that there must be
something real at the base to justify that price, and the claim that one
encryption had been broken implied that the other two were real ciphers that
could be broken as well.
Without at least one cipher being
broken, I very much doubt that the pamphlet would have drawn any public attention,
so it is fortunate for the story that one (if the message was really ‘hidden’
in the first place) of the ciphers was broken, and that it was the one listing
the treasure.
If the cipher listing the location
of the treasure had been translated, I can’t see why the pamphlet would ever
have been published. I, at least, would have been off like a shot and as long
as the treasure had been anything above mildewed counterfeit Pokémon cards, I
would have been happy. If the cipher containing names and addresses of heirs
had been decoded, at least the story could be checked, and the heirs would be
running around chasing the people wandering about with shovels.
So it was
really amazingly good luck for the staying power of the story and the sales
draw of the expensive pamphlet that the page that was deciphered just
listed the contents of the treasure.
It would make some sense for a would-be codebreaker to attack what we call B2
first. It is the longest set of numbers, so if any successful attack on the
code is made, there are more places for clear meaning to appear as words. As we
shall also see, if B1: The Location and B3: Names & Addresses are real ciphers,
they appear (for reasons that I will go into in Part Five) to be enciphered
with far more care and discipline. Here are the numbers, as given in the
pamphlet. There are 762 numbers listed (actually 763; at one point two numbers
appear to be combined into one).
115, 73, 24, 807, 37, 52, 49, 17, 31, 62, 647,
22, 7, 15, 140, 47, 29, 107, 79, 84, 56, 239, 10, 26, 811, 5, 196, 308, 85, 52,
160, 136, 59, 211, 36, 9, 46, 316, 554, 122, 106, 95, 53, 58, 2, 42, 7, 35,
122, 53, 31, 82, 77, 250, 196, 56, 96, 118, 71, 140, 287, 28, 353, 37, 1005,
65, 147, 807, 24, 3, 8, 12, 47, 43, 59, 807, 45, 316, 101, 41, 78, 154, 1005,
122, 138, 191, 16, 77, 49, 102, 57, 72, 34, 73, 85, 35, 371, 59, 196, 81, 92,
191, 106, 273, 60, 394, 620, 270, 220, 106, 388, 287, 63, 3, 6, 191, 122, 43,
234, 400, 106, 290, 314, 47, 48, 81, 96, 26, 115, 92, 158, 191, 110, 77, 85,
197, 46, 10, 113, 140, 353, 48, 120, 106, 2, 607, 61, 420, 811, 29, 125, 14,
20, 37, 105, 28, 248, 16, 159, 7, 35, 19, 301, 125, 110, 486, 287, 98, 117,
511, 62, 51, 220, 37, 113, 140, 807, 138, 540, 8, 44, 287, 388, 117, 18, 79,
344, 34, 20, 59, 511, 548, 107, 603, 220, 7, 66, 154, 41, 20, 50, 6, 575, 122,
154, 248, 110, 61, 52, 33, 30, 5, 38, 8, 14, 84, 57, 540, 217, 115, 71, 29, 84,
63, 43, 131, 29, 138, 47, 73, 239, 540, 52, 53, 79, 118, 51, 44, 63, 196, 12,
239, 112, 3, 49, 79, 353, 105, 56, 371, 557, 211, 505, 125, 360, 133, 143, 101,
15, 284, 540, 252, 14, 205, 140, 344, 26, 811, 138, 115, 48, 73, 34, 205, 316,
607, 63, 220, 7, 52, 150, 44, 52, 16, 40, 37, 158, 807, 37, 121, 12, 95, 10,
15, 35, 12, 131, 62, 115, 102, 807, 49, 53, 135, 138, 30, 31, 62, 67, 41, 85,
63, 10, 106, 807, 138, 8, 113, 20, 32, 33, 37, 353, 287, 140, 47, 85, 50, 37,
49, 47, 64, 6, 7, 71, 33, 4, 43, 47, 63, 1, 27, 600, 208, 230, 15, 191, 246,
85, 94, 511, 2, 270, 20, 39, 7, 33, 44, 22, 40, 7, 10, 3, 811, 106, 44, 486, 230,
353, 211, 200, 31, 10, 38, 140, 297, 61, 603, 320, 302, 666, 287, 2, 44, 33,
32, 511, 548, 10, 6, 250, 557, 246, 53, 37, 52, 83, 47, 320, 38, 33, 807, 7,
44, 30, 31, 250, 10, 15, 35, 106, 160, 113, 31, 102, 406, 230, 540, 320, 29,
66, 33, 101, 807, 138, 301, 316, 353, 320, 220, 37, 52, 28, 540, 320, 33, 8,
48, 107, 50, 811, 7, 2, 113, 73, 16, 125, 11, 110, 67, 102, 807, 33, 59, 81,
158, 38, 43, 581, 138, 19, 85, 400, 38, 43, 77, 14, 27, 8, 47, 138, 63, 140,
44, 35, 22, 177, 106, 250, 314, 217, 2, 10, 7, 1005, 4, 20, 25, 44, 48, 7, 26,
46, 110, 230, 807, 191, 34, 112, 147, 44, 110, 121, 125, 96, 41, 51, 50, 140,
56, 47, 152, 540, 63, 807, 28, 42, 250, 138, 582, 98, 643, 32, 107, 140, 112,
26, 85, 138, 540, 53, 20, 125, 371, 38, 36, 10, 52, 118, 136, 102, 420, 150,
112, 71, 14, 20, 7, 24, 18, 12, 807, 37, 67, 110, 62, 33, 21, 95, 220, 511,
102, 811, 30, 83, 84, 305, 620, 15, 2, 108, 220, 106, 353, 105, 106, 60, 275,
72, 8, 50, 205, 185, 112, 125, 540, 65, 106, 807, 138, 96, 110, 16, 73, 33,
807, 150, 409, 400, 50, 154, 285, 96, 106, 316, 270, 205, 101, 811, 400, 8, 44,
37, 52, 40, 241, 34, 205, 38, 16, 46, 47, 85, 24, 44, 15, 64, 73, 138, 807, 85,
78, 110, 33, 420, 505, 53, 37, 38, 22, 31, 10, 110, 106, 101, 140, 15, 38, 3,
5, 44, 7, 98, 287, 135, 150, 96, 33, 84, 125, 807, 191, 96, 511, 118, 40, 370,
643, 466, 106, 41, 107, 603, 220, 275, 30, 150, 105, 49, 53, 287, 250, 208,
134, 7, 53, 12, 47, 85, 63, 138, 110, 21, 112, 140, 485, 486, 505, 14, 73, 84,
575, 1005, 150, 200, 16, 42, 5, 4, 25, 42, 8, 16, 811, 125, 160, 32, 205, 603,
807, 81, 96, 405, 41, 600, 136, 14, 20, 28, 26, 353, 302, 246, 8, 131, 160,
140, 84, 440, 42, 16, 811, 40, 67, 101, 102, 194, 138, 205, 51, 63, 241, 540,
122, 8, 10, 63, 140, 47, 48, 140, 288
Wasn’t
that exciting? I left the numbers as text instead of using an image in case
anyone wanted to take them for themselves. (Although if anyone wants to do
this, drop me an email at Astronoharry at gmail dot net, and I’ll send you a
copy of an Excel file with all the numbers tied to a way to test any text you
want to see if that will work as a book code for the others.)
There are
many ways to encipher information. Most of the codes available at the time were
substitution ciphers, in which each letter was replaced by another letter or
number. This actually isn’t a good way
to encode a secret, especially a secret this long, because the letters aren’t
used evenly in the English language. The most commonly used letter is E,
followed by T, then A, then OINSHRDLCUMWFGYPBVKJXQZ. The two graphs shown below
demonstrate the letter distribution for a large number of English-language texts,
followed by the letter distribution for B2: the Treasure. I will come back to
these in my analysis of B1: The Location and B3: Names and Addresses.


The vertical axis represents the percentage of letters made up of that particular letter, and the horizontal axis is the alphabet, where "A" is represented by 1, "E" is represented by 5, "T" by 20, and so on. It could be that the differences are due to the set being limited to 763 letters; maybe a longer message would have fit the first plot even better. In case the differences (though small - see the graph below) represent a difference in style between the author of B2 (TJB or Ward) and general English, I'll use both graphs when I try to play around with B1 and B3.
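
For anyone who would rather count than squint at a graph, here is a minimal Python sketch of the letter-percentage tally that such a plot is built from. The function name `letter_percentages` and the short sample string (the opening of the deciphered B2 text) are my own choices for illustration; this is not the spreadsheet used for the graphs.

```python
from collections import Counter
import string

def letter_percentages(text):
    """Return {letter: percentage of all letters} for a-z, ignoring case,
    spaces, digits and punctuation."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        return {ch: 0.0 for ch in string.ascii_lowercase}
    return {ch: 100.0 * counts[ch] / total for ch in string.ascii_lowercase}

# Sample: the opening words of the deciphered B2 text.
sample = "I have deposited in the county of Bedford about four miles from Buford's"
percentages = letter_percentages(sample)

# Show the five most common letters in the sample.
for letter, pct in sorted(percentages.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{letter}: {pct:.1f}%")
```

Running the same function over a long stretch of ordinary English and over the full B2 plaintext would give the two sets of numbers to compare; with only a few hundred letters, some wobble from the general-English figures is to be expected.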

It can’t be the case that this is a simple substitution cipher, as there are
more than 26 different numbers, so this would seem to be a homophonic
substitution cipher, in which each letter can be replaced by any of several
numbers, so that a reader can’t break this code by the letter counting that
breaks a simple substitution cipher. If this cipher was constructed by
selecting numbers at random to replace letters, this would be an extremely
difficult code to break, and the more numbers assigned to each letter, the more
difficult it becomes. A code in which each number appeared only once would be
unbreakable without the key.
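
To make the idea concrete, here is a small Python sketch of a homophonic substitution in which every letter gets its own pool of numbers and the encipherer picks from the pool at random. Everything in it (the function names, four numbers per letter, the seed, the test word) is an assumption chosen for illustration; it is not a claim about how B2 was actually built.

```python
import random

def build_homophonic_key(alphabet, per_letter, rng):
    """Give each letter its own pool of distinct numbers (hypothetical scheme)."""
    numbers = list(range(1, len(alphabet) * per_letter + 1))
    rng.shuffle(numbers)
    return {
        letter: numbers[i * per_letter:(i + 1) * per_letter]
        for i, letter in enumerate(alphabet)
    }

def encipher(message, key, rng):
    """Replace each letter with a randomly chosen number from its pool."""
    return [rng.choice(key[ch]) for ch in message.lower() if ch in key]

def decipher(numbers, key):
    """Invert the key: every number maps back to exactly one letter."""
    reverse = {n: letter for letter, pool in key.items() for n in pool}
    return "".join(reverse[n] for n in numbers)

rng = random.Random(1885)  # fixed seed so the run is repeatable
key = build_homophonic_key("abcdefghijklmnopqrstuvwxyz", 4, rng)
cipher = encipher("vault", key, rng)
print(cipher)
print(decipher(cipher, key))
```

Because a common letter such as "e" is spread over several numbers, counting how often each number appears no longer points straight at the letter behind it.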
This, then, is where the key becomes a weakness, because one presumes that someone
encoding a message would like at least one other person (even if it’s
themselves at a later time) to be able to decode it. Therefore, the key must be
provided to the other person. If the key is intercepted, the enciphering is
worthless, and perhaps worse than worthless if the person who enciphered it
does not know that communication has been compromised. How to get the key to
the other person?
If it is a book code, one could
tell the other person what the book would be ahead of time, mention the book in
another communication, or just mail them a copy of the book. Or one person
might have the book, and another person the numbers, keeping security as long
as the two don’t know who each other are.
If we agree with MP that this is a
book code, and we take the logical guess that each number refers to the first
letter of a word given by its position in the document, then we are looking at
simply testing this with one document after another. This would take us basically infinite time,
even if we constrained ourselves to using books available in 1822. (And it
could always be that TJB wrote up a single essay that was then used as a basis,
so we would be doomed in any case.) Happily (choose your level of sarcasm), MP
discovered (adjust sarcasm level again) that the Declaration of Independence
served as the key text for this set of numbers.
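
The guess described above (each number points to a word in the key document, and the first letter of that word is the plaintext letter) can be sketched in a few lines of Python. The function names are mine, and the key text here is only the first 24 words of the Declaration, so any number above 24 comes back as "?"; a full key text would have to be supplied by the reader.

```python
import re

def first_letters(key_text):
    """Return the lowercase first letter of every word in the key text."""
    words = re.findall(r"[A-Za-z][A-Za-z']*", key_text)
    return [w[0].lower() for w in words]

def decode_book_cipher(numbers, key_text, unknown="?"):
    """Decode each number as the first letter of the word at that
    (1-based) position; numbers outside the key text become `unknown`."""
    letters = first_letters(key_text)
    out = []
    for n in numbers:
        if 1 <= n <= len(letters):
            out.append(letters[n - 1])
        else:
            out.append(unknown)
    return "".join(out)

# Only the opening 24 words of the Declaration, as a toy key.
KEY_TEXT = (
    "When in the course of human events, it becomes necessary for one "
    "people to dissolve the political bands which have connected them "
    "with another"
)

print(decode_book_cipher([20, 7, 24, 15], KEY_TEXT))
```

Testing a candidate book then amounts to swapping in a different `KEY_TEXT` and seeing whether anything readable falls out, which is exactly why trying books one after another is so hopeless.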
It surprised me to learn that a number of different versions of the
Declaration of Independence exist. I grabbed the one that I’m going to start
with through a quick Google search. Only after trying the decoding with my own
hand using this version will I compare it to the version included in the
pamphlet. When using this version, I produced the text reproduced below. Some
words are clearly there. I am going to cheat just a smidge and, from the
beginning, include word spacing where it will eventually appear.
I hare depostted in the countf of gedford aboar four
miles from btfords ii aa ehcaration or raalt sih feet below the sirface ot tta nrohid
the foolonrng articles belongiag joiattf to the ptrties wtosa iamls are girep
in ihmbeh thrlt horewith tht fitst deposit copsistcd of ten hopdred and fourteea
poiodh oh gold cpa thihtf eight haadred and twelre pounds of silrer deposited
nor eighteea iineteen the second wat made aec linhteen twentf ona aad consiftoi
hp iinetltn htoared ani seren potnds of gold api twelre taaired api eightf eight
of silrer also heweos obtained in st lotrs in ehchange to sare transportation apd
raltea ap thirteep rhoisand doltars the abore is securllf pacpad itron pots
with irop cortrs tht rapot is roanhlf oined with stone amd the resseth rest on
solid stone aid are cororld uppr othors paper itmber one descrifah thc thact
localitf of tho rarot to that ah aifficultf will be had ip finding in
The
greatest weakness in using a book code, even if both parties have exactly the
same version of the book (and it turns out that there are a lot of versions of
the Declaration of Independence with minor differences), is that one error in
counting could doom the effort. Whether this was encoded by TJB or James Ward,
the encoder did not feel that it was worth the effort to double-check the work.
Still, some words can be seen. I have cheated by cutting the text to fit the
final words, but surely the gentle reader is intelligent enough to consider the
problems inherent in sorting the wheat from the chaff, and no one has any
desire to see this get longer through the use of a couple of theoretical blind
alleys! Here, let me even go so far as
to print words in bold as they reach their final form.
I hare depostted in the countf of
gedford aboar four miles from btfords ii aa ehcaration or raalt
sih feet below the sirface ot tta nrohid the foolonrng articles
belongiag joiattf to the ptrties wtosa iamls are girep in
ihmbeh thrlt horewith tht fitst deposit copsistcd of ten
hopdred and fourteea poiodh oh gold cpa thihtf eight
haadred and twelre pounds of silrer deposited nor
eighteea iineteen the second wat made aec linhteen twentf
ona aad consiftoi hp iinetltn htoared ani seren potnds of gold api
twelre taaired api eightf eight of silrer also heweos obtained
in St. lotrs in ehchange to sare transportation
apd raltea ap thirteep rhoisand doltars the abore is securllf
pacpad itron pots with irop cortrs tht rapot is roanhlf oined
with stone amd the resseth rest on solid
stone aid are cororld uppr othors paper itmber one
descrifah thc thact localitf of tho rarot to that ah aifficultf will
be had ip finding in
THE ENCIPHERER’S VYX PROBLEM
Just from this pass, we can see
enough clear words to indicate that this must be the key for the book code, and
also that there are problems. Before trying to address some of the letters that
seem at first to make no sense, there is one quick cut that I want to make. From
context, using an examination of what we can already work out, it seems likely
that 807 is the letter “v”, the number
811 represents the letter “y”, and 1005 represents “x”. Replacing these three
numbers with these three letters does add clarity to the message, and agrees
with the translation in the pamphlet (which carries extra weight if PA/MP is the
encoder – i.e., if it’s a fake). I have also begun adding in capitalization and
punctuation where it seems to make sense.
I have depostted in the county
of gedford aboar four miles from btfords ii aa excaration or vaalt
six feet below the sirface ot tta nrohid the foolonrng articles
belongiag joiatty to the ptrties wtosa iamls are givep in
ihmbeh thrlt herewith. Tht fitst deposit copsistcd of ten
hopdred and fourteea poiodh oh gold cpa thihty eight
haadred and twelve pounds of silver deposited
Nov. eighteea iineteen. The second wat made aec
linhteen twenty ona aad consiftoi hp iinetltn htoared ani seven
potnds of gold api twelve taaired api eighty eight of
silver, also heweos obtained in St. lotrs in
exchange to save transportation apd valtea ap
thirteep rhoisand doltars. The above is securlly pacpad
itron pots with irop covtrs. Tht vapot is roanhlf oined with
stone and the vesseth rest on solid stone
aid are covorld uppr othors. Paper itmber one descrifah
thc txact locality of tho varot to that ah aifficulty will
be had ip finding in.
There is still quite a lot of noise in the above signal. If one has done this
using a spreadsheet, as I have (take THAT, nineteenth century treasure
hunters!), or if one marks with a dot every letter that doesn’t seem to fit,
one will see that along with a couple of scattered problems, every number above
240 is incongruous. Look at an example: there is no Gedford County, but there
is a “Bedford”, the pamphlet was published in Bedford County, and there is a
“B” one number off from the “G”. If we assume that somewhere around 240 the
encoder miscounted by one ...
I have deposcted in the county
of Bedford about four miles from Buford’s in an
excavation or vault six feet below the surface of
ths ground the following articles belonging
joiptly to the parties whosl namfs are giveo in number
thrfl harewith. Tha first deposit coosistcd of ten
huodred and fourteen poitds of gold aod thirty
eight hupdred and twelve pounds of silver
deposited Nov. eighteen nineteen. The second
wao made Dec. fighteen twenty onl and consistad oh ninetfln
hutdred and seven pounds of gold aod twelve hundred
aod eighty eight of silver, also aewels obtained in
St. Louis in exchange to save transportation
aod valuet as thirteeo rhosand dollars. The above is
securlly packsd itron pots with iroo covtrs. Tht vault is
roughly lined with stone and the vessetr
rest on solid stone and are coverfd uish
othors. Paper number one descrialr thc axact locality
of tho varlt oo that no difficulty will be had io finding
it.
This is
much better, but still not perfect. Let’s take another look through to see if
we can see another pattern. (After all, we’re not doing this to read B2; we
could pretty much do that already. We want the key that the encoder is building
as B2 is being enciphered.)
Looking
back at the spreadsheet/dots/whatever you’re using, we now see that almost all of the
problems are at numbers of 485 and above. We could do as before, guess what
letter should be in a slot, and then look at what shifts are needed to fix the
problem. Number 486 should encipher as an “E”. Strangely, there is only one “E”
even relatively close by, and that requires a shift of ten. Number 511 should
also be an “E”, not an “F”, and a shift of adding ten brings that in accordance
as well. Checking a couple other numbers
in the low 500’s shows the same effect. (I’m trying to stay close to the
problem break in case the encoder makes another mistake later.) If we peek at
the version of the Declaration in the pamphlet, we see that the inset numbers,
included after every ten words, read “450 … 460 … 470 … 480 … 480 … 490 … 500”
and so on. Let us apply that ten shift to every number above 480:
I have deposmted in the county
of Bedford about four miles from Buford’s in an
excavation or vault six feet below the surface of
the ground the following articles belonging
jointly to the parties whose names are
given in number three herewith. The first
deposit consistcd of ten hundred and fourteen
pounds of gold and thirty eight hundred and twelve pounds of
silver deposited Nov. eighteen nineteen. The second was made Dec. eighteen
twenty one and consisted oj nineteen hundred and seven
pounds of gold and twelve hundred and
eighty eight of silver, also jewels obtained
in St. Louis in exchange to save
transportation and valued ao thirteen rhosand dollars.
The above is securely packed itron pots
with iron covtrs. Tht vault is roughly lined
with stone and the vessels rest on solid
stone and are covered uioh others. Paper
number one describes thc exact locality of
the varlt so that no difficulty will be had in
finding it.
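
The two corrections so far (a slip of one partway through, then a slip of ten where the pamphlet's count repeats "480") are both cases of "past a certain number, look a fixed distance away in the key text." Here is a hedged Python sketch of that idea. The function names, the toy alphabet key, and the thresholds and offsets in `rules` are all illustrative assumptions; the real break points, the size and sign of each offset, and whether they stack depend on the copy of the Declaration being used.

```python
def corrected_position(number, rules):
    """Apply the offset belonging to the largest threshold that `number`
    has reached; numbers below every threshold are left alone.
    `rules` is a list of (threshold, total_offset) pairs."""
    offset = 0
    for threshold, shift in sorted(rules):
        if number >= threshold:
            offset = shift
    return number + offset

def decode_with_rules(numbers, letters, rules, unknown="?"):
    """Decode 1-based positions into `letters` after correcting each one."""
    out = []
    for n in numbers:
        pos = corrected_position(n, rules)
        out.append(letters[pos - 1] if 1 <= pos <= len(letters) else unknown)
    return "".join(out)

# Toy key: position 1 is "a", position 2 is "b", and so on.
letters = list("abcdefghijklmnopqrstuvwxyz")

# Hypothetical miscounts: off by 1 from position 5, off by 3 from position 15.
rules = [(5, 1), (15, 3)]

print(decode_with_rules([1, 4, 5, 14, 15, 20], letters, rules))
```

Keeping the offsets in a small table like `rules` makes it easy to try a different break point and immediately see whether more words snap into place.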
In order
to get the final key, let’s take a look at the leftover misfits.
| Number | Decodes as | But Should Be | The smallest shift is |
| 647 | M | I | +1 |
| 84 | C | E (twice) | |
| 666 | J | F | +1 |
| 643 | O | T (twice) | +1 |
| 53 | R | T | |
| 188 | T | E | |
| 32 | T | E | |
| 440 | U | W | |
| 96 | R | U | |
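
The "smallest shift" column in the table above is the sort of thing a short search can fill in: given a number and the letter it ought to produce, look outward from that position in the key until a word starting with the right letter turns up. The sketch below is an assumption-laden illustration. The name `smallest_shift`, the `max_shift` limit, and the rule that a forward shift wins a tie are my own choices, and the toy key is just the first letters of the opening 24 words of the Declaration.

```python
def smallest_shift(number, expected, letters, max_shift=15):
    """Return the signed shift of smallest magnitude for which the letter at
    (number + shift) equals `expected`, or None if none is found within
    `max_shift`. On a tie, the forward (positive) shift is returned."""
    for distance in range(0, max_shift + 1):
        candidates = (0,) if distance == 0 else (distance, -distance)
        for shift in candidates:
            pos = number + shift
            if 1 <= pos <= len(letters) and letters[pos - 1] == expected:
                return shift
    return None

# First letters of "When in the course of human events it becomes necessary
# for one people to dissolve the political bands which have connected them
# with another" (24 words).
letters = list("witcoheibnfoptdtpbwhctwa")

print(smallest_shift(6, "e", letters))   # position 6 is "h"; "e" is next door
print(smallest_shift(9, "t", letters))   # position 9 is "b"; nearest "t" is further off
print(smallest_shift(1, "x", letters))   # no word here starts with "x"
```

A shift of 0 means the number already decodes correctly, and a result of `None` means the expected letter is nowhere nearby, which is a hint that the error is something other than a simple miscount.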
Number 108 appears just once, and
decodes as “T”, but if there had been a decoding or typesetting error, and this
should read “10 8”, then it decodes as “N I” and “itron” becomes “in iron”. If
another word is missed between 620 and 643
If I now make a mark at every
letter that seems like it does not help create a proper word, I find that with
six exceptions (to be examined individually) we find that this is every number
from 466 to 666 (and it is possible that the original encipherer counted beyond
this point). The first appearance of the letter “v” beginning a word in the
Declaration is at word 819, and no word begins with “y” or “x”. There are two
instances of 440, which decodes as “u” according to the text of the Declaration
that I used. Somewhere between 421 and
439 a word was skipped, or a version was used with a word left out. (Though I
cannot see where a word could be dropped as a variant and leave the flow of the
text continuous in the area, “… right inestimable to them, and formidable to
tyrants only. He has called together legislative bodies at places unusual,
uncomfortable, …”)
This leaves us with eight more
errors. Errors using “84” for “E”, “666” for “F”, “53” for “T”, “32” for “E”,
or “96” for “U”, can be explained by the encoder looking up these letters
individually and miscounting by one. This is not explainable if the encoder
made a key before starting, and it is not consistent if each word was numbered,
so I hold that the copy of the Declaration used had marks every ten words, as
in the version printed in the pamphlet. (I further hold that this is because this
was the exact copy used to encode B2: The Treasure, but I address that more
fully elsewhere.)
Encoding “188” for “E” and the occasion on which
“440” is encoded as “W” are more complete errors. The final reconstruction of
the original text follows.
I have deposited in the county of Bedford about four
miles from Buford’s in an excavation or vault six feet below the surface of the
ground the following articles belonging jointly to the parties whose names are
given in number three herewith. The first deposit consisted of ten hundred and
fourteen pounds of gold and thirty-eight hundred and twelve pounds of silver
deposited Nov. eighteen nineteen. The second was made Dec. eighteen twenty-one
and consisted of nineteen hundred and seven pounds of gold and twelve hundred
and eighty eight of silver, also jewels obtained in St. Louis in exchange to
save transportation and valued at thirteen thousand dollars. The above is
securely packed in iron pots with iron covers. The vault is roughly lined with
stone and the vessels rest on solid stone and are covered with others. Paper
number one describes the exact locality of the vault so that no difficulty will
be had in finding it.
This
text, again, could have been reasoned out much earlier in the process, but due
to the prevalence of several key errors, we needed to go through each of these
steps to recreate the key the encoder had created by the end of this process.
We will find (spoiler alert) that this key will not translate either of the two
other ciphers, but that this key was used in creating at least one of the other
two lists of numbers.
I am trying to complete the model of the encoder's Declaration of Independence because I want to argue that the encoder only constructed a key for the Declaration of Independence, and then did not use it to encode a real text in B1/B3, and this is going to lead me to make a bit of a leap. The locations of the skips are between 630 and 640, 670 and 680, and xxx and xxx. In each of those intervals, there is a point at which the same letter appears twice in a row (for 630-640, the first letters are TOPSAWTCOOL - "...times of peace, standing armies without the consent of our legislatures."). I am going to make the assumption that the miscounting occurs at that doubled letter, so that if the encoder had used one of these numbers, they would have used TOPSAWTCOL. If I am wrong, then this only affects a couple of letters falling in these gaps.
The ninth word, "four", has a bit of a story behind it as well. Using the version of the Declaration that I happened to grab, this translates as "four"; using the Declaration of Independence as given in the pamphlet, this should translate as "foir", going back to a long-standing argument over whether the Declaration declares that there are "unalienable" rights or "inalienable" rights. I have seen arguments putting this error forward as positive evidence for the truth of the Beale cipher since in this one respect the pamphlet version differs from the version used to encipher B2. I think that it is just as likely, if not more so, that the typesetter was carrying words in their head from the text to the letter bins, and switched versions because they had grown up with "inalienable" themselves.
There is
yet one more “piece of luck” to be added to the mix. As James Gillogly noted,
the PA enciphers 87 characters at the end to include “Paper number one describes
the exact locality of the vault so that no difficulty will be had in finding it.”
Since MTF was to decipher all of the lists anyway, this is just meaningless
extra work, if the cipher is real, but if the cipher is fake then this line
enhances the claim that there is value in breaking the other two ciphers.