Friday, April 20, 2012

Hex puzzle [-1]

Here is a fourth (and final) post in the saga of the "Let us have hex" puzzle. (Actual puzzle, reddit comments, and previous posts here, here and here).

Although I know the redditeers have already solved the problem, I had hoped to work my way through it more systematically. But I ran into a constraint I can't get around very easily (or perhaps at all), which is that the available data is not complete. It's not enough to yield a systematic solution.

An important feature of the puzzle is that visitors to the puzzle page get served 100 digit hex strings that are refreshed to a new value slowly (maybe every half hour), and also might (at least theoretically) be restricted by ip address, so that a social venue like reddit is critical to gathering all the data. But the reddit faction quit too soon, after they'd guessed the answer.

The total number of strings I found in the reddit comments is 340. Let me show two demonstrations that the complete data is not there. First, recall that the base64 encodings have "codons" of length 4. So two adjacent 50-character base64 strings that are not correctly aligned (because they abut without overlap, and were decoded individually), might be missing as many as 4 base64 characters at the joint. This would affect in turn at most 2 adjacent characters in the C or PHP program, since the ratio is 4:3.

A simple decoding of string i = 6 gives:

< 73; i++) { pri
{ echo $a[$i];

The first line is C and the second PHP. The problem is that there's nothing in the known data that corresponds to what likely comes next---there's no "echo $b[$i];" in the data. It's just missing. I suppose there could be a series of very short overlaps that would build it, but it seems pretty unsporting. Similarly, the formatting of the C output is not there.

More conclusive evidence comes from an analysis of the solution. The first word is "Design" and it is derived from the C program, which has this fragment:

int a[] = {4, 4, 6, 5, 7, 3, 6, 9, 6, 7..

If we combine these as hex digits and then print the ASCII equivalent we get:

>>> L = ['44','65','73','69','67']
>>> L = ['0x' + s for s in L]
>>> L = [chr(int(s,base=16)) for s in L]
>>> L
['D', 'e', 's', 'i', 'g']

We need the "n" in design which is '0x6e' in hex:

>>> hex(ord('n'))
'0x6e'

But there is no 'e' in the data with a leading 6. The overlap is not there.

I think I'm done now. I had fun. I wrote a lot of code to try to deal with the challenge of how to assemble a bunch of strings where the candidate strings with a large overlap might be misleading. I was prepared to do the thing by hand, as shown in the screenshot. But there is not enough data, you have to guess. Here is what I was reduced to: