08-19-2018, 12:19 AM
First off, sorry: I did look at these earlier but gave up pretty quickly. Raccoon Sam is a bit more persistent than me and actually bothers to try to find some patterns. With the information he posted I've managed to find out some more stuff though.
I decided to look into this at the bit level, and it seems that there's some XORing going on. A quick primer on what the heck XORing is:
Bits are ones and zeros, right? For example, 5 in binary is 0101. Why the zero at the start? When representing binary we usually do it in nibbles (chunks of 4 bits) to make them easier to read. As a byte (8 bits, i.e. 2 nibbles) 5 is 0000 0101. Pretty simple.
XORing is when you take two numbers, in binary, and do an operation. The operation takes each two corresponding bits, and compares them. If they're the same, the result is a 0. If they're different, the result is a 1.
As an example, let's say we have 1101 (13) and 0101 (5). If we XOR them, we get 1000 (8). Here's how it works:
13 = 1101
5 = 0101
13 XOR 5 = 1000
You can see that everywhere there were two different digits we put a 1, and if they were the same we put a 0. Everything matched except the first digit, so the result is 1000. Hopefully that made sense!
(A final note for those interested: XOR is reversible. If A XOR B = C, you can XOR C with A to get B, or B to get A. This means that if we XOR 1000 (8) with 0101 (5), we'll get back to 1101 (13). Very handy, and this is why XORing is a common technique for obfuscation.)
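If you'd rather check the primer than take my word for it, here's a quick Python sketch. Python's `^` operator is bitwise XOR; the variable names are just mine for illustration.

```python
a = 0b1101  # 13
b = 0b0101  # 5

c = a ^ b  # bits that differ become 1, bits that match become 0
print(format(c, "04b"), c)

# XOR is reversible: XORing the result with either input gives the other one
assert c ^ b == a
assert c ^ a == b
```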
I checked it out, and for the first bytes you guys were onto it. Every first byte is XORed with $11 (17), which is 0001 0001. This means you either add or subtract 1 from each nibble, depending on what's already there.
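To see the "add or subtract 1 from each nibble" effect, here's a small sketch. The helper name `xor_11` and the sample bytes are mine, picked only for illustration.

```python
KEY = 0x11  # 0001 0001: flips the lowest bit of each nibble

def xor_11(byte: int) -> int:
    """XOR a single byte (0-255) with $11."""
    return byte ^ KEY

for value in (0x4E, 0x5F, 0x00, 0xFF):
    result = xor_11(value)
    # Show the high and low nibble before and after
    print(f"${value:02X} -> ${result:02X}  "
          f"(nibbles {value >> 4:X},{value & 0xF:X} -> {result >> 4:X},{result & 0xF:X})")
```

An even nibble goes up by 1 and an odd nibble goes down by 1, because only the lowest bit of each nibble gets flipped.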
The second bytes are a bit harder. It seems that every second byte is XORed with $E5 (229), which is 1110 0101, except the last one. However, Sam was looking at the first $14 bytes, not $12. Looking at some PNGs, bytes $12 and $13 seem to differ between them.
I also checked out some of the other fake PNGs. Some were XORed with different values, some in differently sized chunks. Wasn't sure how to find the proper numbers, but I tried some and then I decided to take a look at the file size.
BINGO.
Do you know what the size of box2-open.png is? $11 E5. (Putting spaces to make numbers easier to read.)
That's not even all. The file that had differently sized chunks was cg01_di.png. For each chunk of 4 bytes (long or double word) in this file, the first byte was fine, and the next 3 were obfuscated. I checked the size, it's $03 35 93. That's 3 bytes' worth of data, so written as 4 bytes it starts with $00, and XORing with $00 changes nothing.
Basically, what it seems you do is you take the file's size, express it as 4 bytes, then XOR that with every 4 bytes of the file. $00 00 11 E5 XORed with $89 50 4E 47 is $89 50 5F A2. Perfect.
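Here's a rough sketch of that idea in Python. It's only my guess at the scheme so far: it assumes the size is written big-endian (most significant byte first) and that the 4-byte key repeats from offset 0. The function name `xor_with_size` is made up, and the caller has to supply the size, since I'm not sure yet how the extra bytes at the end of the file figure in.

```python
def xor_with_size(data: bytes, size: int) -> bytes:
    """XOR data with `size` written as a repeating 4-byte big-endian key."""
    key = size.to_bytes(4, "big")  # e.g. $11E5 -> 00 00 11 E5
    return bytes(b ^ key[i % 4] for i, b in enumerate(data))

# The PNG magic number XORed with the size of box2-open.png ($11 E5)
png_magic = bytes.fromhex("89504E47")
obfuscated = xor_with_size(png_magic, 0x11E5)
print(obfuscated.hex())  # 89505fa2

# Same function undoes it, because XOR is reversible
assert xor_with_size(obfuscated, 0x11E5) == png_magic

# With a 3-byte size like $03 35 93 the key is 00 03 35 93,
# so the first byte of every 4-byte chunk comes out unchanged
assert xor_with_size(png_magic, 0x033593)[0] == png_magic[0]
```

Since XOR undoes itself, the same function works for both obfuscating and recovering the data, as long as the right size goes in.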
At this point, I decided to go to the end of the file to see if I could find anything, since that's usually where any info is if it's not at the start. Sure enough, there was the size, plus $800. Not sure why it's $800, but it does look like the last $0C or so bytes of the file should be shaved off.
Phew. So it seems like that's most of it. I don't think it's as simple as XORing the whole file (I couldn't find any traces of an IEND chunk) but I'll see if I can make a small program to test this out and see what discrepancies pop out.