We received a PNG file that had somehow been corrupted in transit. Reading the
PNG specification and looking at the first few bytes of the file, we saw that a
0x0d byte had been dropped: the file started with 89 50 4e 47 0a 1a 0a instead
of the correct signature 89 50 4e 47 0d 0a 1a 0a. According to the PNG
specification, the CR LF pair in the signature is there precisely to detect
transfers that convert line endings (CR LF to LF or the other way around). Now
if that's not a huge hint.
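The signature check is easy to sketch in Python. This is a minimal illustration
rather than the tool used during the contest; the names PNG_SIGNATURE,
STRIPPED_SIGNATURE and repair_signature are made up for this example.

```python
# The 8-byte PNG signature: 89 50 4e 47 0d 0a 1a 0a.
PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")
# What is left of it after a CR LF -> LF conversion (the 0x0d is gone).
STRIPPED_SIGNATURE = PNG_SIGNATURE.replace(b"\r\n", b"\n")


def repair_signature(data: bytes) -> bytes:
    """Return data with an intact signature, restoring a dropped 0x0d."""
    if data.startswith(PNG_SIGNATURE):
        return data
    if data.startswith(STRIPPED_SIGNATURE):
        return PNG_SIGNATURE + data[len(STRIPPED_SIGNATURE):]
    raise ValueError("not a PNG, or damaged in some other way")
```

Note that this only repairs the first 8 bytes; any 0x0d bytes dropped further
into the file are still missing.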
Ignoring the hint, we soldiered on, kept reading the PNG documentation and
played around with the PNG after fixing the header. PNG files consist of
multiple chunks of different types. Each chunk consists of a length (4 bytes,
big endian), a chunk type (4 bytes), the data (length bytes) and a CRC-32
(4 bytes) computed over the chunk type and data. Writing a parser, we saw that
the first couple of chunks decoded just fine, but then came a bunch of IDAT
chunks (which carry a deflate-compressed stream of data) that did not line up:
the data was slightly shorter than the length field claimed, so every following
IDAT started at the wrong offset. We tried both padding the data sections with
additional 0x00 bytes and shortening the data sections, but then the CRC-32 no
longer matched. After we patched the CRC-32 as well, we got a deflate decode
error (in the first IDAT chunk). Whenever decoding fails in a chunk, PNG readers
give up as they can no longer resynchronize.
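The chunk layout described above translates into a short walker. The following
is a sketch under the assumption that the file is structurally intact (which
ours was not past the first few chunks); iter_chunks is a hypothetical name,
not a function from any PNG library.

```python
import binascii
import struct


def iter_chunks(data: bytes, offset: int = 8):
    """Yield (chunk_type, body, crc_ok) for each chunk after the signature."""
    while offset + 12 <= len(data):
        # 4-byte big-endian length, then the 4-byte chunk type.
        (length,) = struct.unpack(">I", data[offset:offset + 4])
        ctype = data[offset + 4:offset + 8]
        body = data[offset + 8:offset + 8 + length]
        crc_bytes = data[offset + 8 + length:offset + 12 + length]
        if len(body) < length or len(crc_bytes) < 4:
            return  # file ends in the middle of a chunk
        (stored,) = struct.unpack(">I", crc_bytes)
        # The CRC-32 covers the chunk type and the data, not the length.
        crc_ok = (binascii.crc32(ctype + body) & 0xffffffff) == stored
        yield ctype, body, crc_ok
        offset += 12 + length
```

On a damaged file the first chunk with crc_ok == False marks the point where
the declared length and the actual data stop agreeing; everything the walker
yields after that is misaligned garbage.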
Oh well, back to square one. Let's look again at that PNG linefeed conversion
thingy. After looking more closely we discovered that each chunk had 0 to 3
bytes missing. Hm, this looks suspiciously like a Windows to Linux text format
conversion (CR LF to LF) and suggests that the file was converted in error. An
earlier internet search told us that it is impossible to undo this conversion
in general (that's why we tried the other options first), but if you try hard
it might be possible to recover.
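Why the conversion cannot simply be inverted is easy to see with a toy example.
The sketch below assumes the converter did a plain replacement of CR LF by LF
(we do not know which tool actually mangled the file); to_unix is a made-up
helper.

```python
def to_unix(data: bytes) -> bytes:
    """Model of the assumed conversion: every CR LF pair becomes a bare LF."""
    return data.replace(b"\r\n", b"\n")


# Two different binary inputs ...
first = b"\x01\r\n\x02\n"
second = b"\x01\n\x02\r\n"
# ... collapse to the same output, so the output alone cannot tell us
# which of its linefeeds used to have a 0x0d in front of it.
converted = to_unix(first)
```

Here to_unix(second) gives the same bytes as converted. What saves us in a PNG
is the extra information per chunk: the length field says how many bytes are
missing and the CRC-32 lets us verify a guess.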
So, firing up our Python skills, we wrote a quick tool that parses the PNG file,
fixes the header and copies all correct chunks. For incorrect chunks we extract
the data and recursively try to find the correct placements of the 0x0d bytes.
We walk through the data section and for every 0x0a byte we find, we try to
place a 0x0d byte before it until the length matches. We then compute the
CRC-32, and if it matches the CRC-32 stored in the file we have successfully
recovered the data of this one chunk. This recovery option only works if not
too many 0x0d bytes were removed from a specific chunk, otherwise the search
space explodes. Even with only a couple of chunks that had 3 bytes missing, the
search took roughly 15 minutes on a fast desktop (well, maybe we should not
have coded the search in Python).
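To get a feeling for how fast the search space grows: if a damaged chunk
contains n bytes with value 0x0a and k bytes are missing, there are "n choose k"
ways to place the 0x0d bytes (assuming each 0x0a gets at most one). A small
sketch, with candidate_count as a hypothetical helper name:

```python
import math


def candidate_count(damaged: bytes, missing: int) -> int:
    """Number of ways to put `missing` 0x0d bytes in front of distinct 0x0a bytes."""
    linefeeds = damaged.count(b"\n")
    return math.comb(linefeeds, missing)
```

For instance, a chunk with 100 linefeeds and 3 missing bytes already gives
161700 candidates, each of which needs a CRC-32 over the whole chunk.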
Here's the recursive search function:
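
    import binascii

    def find0a(buf, nra, crc, loc):
        """Insert nra 0x0d bytes before 0x0a bytes in buf (from loc on) so that
        the CRC-32 of buf equals crc. buf must be the bytes the CRC covers
        (in PNG: chunk type plus data). Returns b"" if nothing matches."""
        ptr = buf.find(b"\x0a", loc)
        while ptr != -1:
            # Candidate: put a carriage return in front of this linefeed.
            fbuf = buf[:ptr] + b"\x0d" + buf[ptr:]
            if nra == 1:
                # Last missing byte placed: compare with the stored CRC-32.
                tcrc = binascii.crc32(fbuf) & 0xffffffff
                if tcrc == crc:
                    print("Found a match: " + hex(tcrc))
                    return fbuf
            else:
                # Place the remaining bytes behind the CR LF pair just built
                # (the linefeed now sits at ptr + 1).
                tmp = find0a(fbuf, nra - 1, crc, ptr + 2)
                if tmp != b"":
                    return tmp
            ptr = buf.find(b"\x0a", ptr + 1)
        return b""
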
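For comparison, the same brute force can be written without recursion using
itertools.combinations. This is an alternative sketch, not the code we ran;
restore_crs is a made-up name, and the input is assumed to be the chunk type
followed by the damaged data, since that is what the PNG CRC-32 covers.

```python
import binascii
from itertools import combinations


def restore_crs(damaged: bytes, missing: int, crc: int):
    """Try every way to put `missing` 0x0d bytes in front of 0x0a bytes."""
    lf_positions = [i for i, byte in enumerate(damaged) if byte == 0x0a]
    for chosen in combinations(lf_positions, missing):
        # Rebuild the buffer with a 0x0d before each chosen linefeed.
        parts, prev = [], 0
        for pos in chosen:
            parts.append(damaged[prev:pos])
            parts.append(b"\r")
            prev = pos
        parts.append(damaged[prev:])
        candidate = b"".join(parts)
        if (binascii.crc32(candidate) & 0xffffffff) == crc:
            return candidate
    return None  # no placement reproduces the stored CRC-32


# Round trip on toy data: damage a buffer, then recover it.
original = b"IDAT" + b"ab\r\ncd\nef\r\ngh\n"
stored_crc = binascii.crc32(original) & 0xffffffff
damaged = original.replace(b"\r\n", b"\n")
recovered = restore_crs(damaged, len(original) - len(damaged), stored_crc)
```

In the toy round trip, recovered is identical to original even though two of
the four linefeeds never had a carriage return.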
You can guess the remainder of the program (parsing the file format, then
parsing each chunk, looking at the length and searching for the next chunk by
guessing how many bytes are missing). In the end we got a nice StarCraft
screenshot that told us the flag is flag{have_a_wonderful_starcrafts}. Hooray,
150 points!
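The "guessing how many bytes are missing" step could look roughly like this.
It is a sketch under two assumptions of ours: the length, type and CRC fields
themselves lost no bytes, and a guess is plausible if the next chunk type is
four ASCII letters (or the file ends). guess_missing is a hypothetical name.

```python
import struct


def guess_missing(data: bytes, offset: int, max_missing: int = 3):
    """For the chunk at offset, list plausible (k, crc_input, stored_crc,
    next_offset) tuples, where k is the number of bytes assumed missing."""
    (length,) = struct.unpack(">I", data[offset:offset + 4])
    guesses = []
    for k in range(max_missing + 1):
        body_end = offset + 8 + length - k   # where the damaged data would end
        next_offset = body_end + 4           # skip the 4-byte CRC-32
        if body_end < offset + 8 or next_offset > len(data):
            continue
        next_type = data[next_offset + 4:next_offset + 8]
        at_eof = next_offset == len(data)
        if at_eof or (len(next_type) == 4 and next_type.isalpha()):
            (stored,) = struct.unpack(">I", data[body_end:next_offset])
            # Chunk type plus damaged data: the bytes the CRC-32 covers.
            guesses.append((k, data[offset + 4:body_end], stored, next_offset))
    return guesses
```

Each guess with k > 0 can then be handed to the 0x0d search together with its
stored CRC-32; a guess with k == 0 only needs a plain CRC check.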