Dizzyspiral

Posted on Oct 26, 2022

"Beating" Checksums

#security #python #research #programming

Checksums. The often-ignored string of hex characters that accompany most software downloads. Checksums are meant to help you, the end user downloading and installing software from the wild wild internet, verify that what you got is actually what the developers released.

Being the very responsible and security-conscious person you are, I'm sure you verify the checksum on everything you download. Do you actually diff the checksums? Or do you, like me, just eyeball it sometimes?

I bet you eyeball it. It's faster. We're lazy. And checksums are designed to be wildly different with just small changes in the input. So if, say, the beginning and the end of the checksums match, chances are the whole thing matches. Right?

Right?

That's the subject of today's blog post! Beating checksums is a thing security researchers often have to, though typically they're not packaging their stuff to be a particular checksum if they can help it. That's hard to do, and usually it's easier to find a way to defeat the check - like stuffing your code/data/whatever into something that doesn't get checksum'd at all. But what if you have to beat the checksum? And what if the checksum you're beating is checked by a human?

I hypothesize (from my very rigorous sample size of me and my coworkers) that lots of people only check the beginning and end of a checksum, and maybe a few stand-out patterns in the middle if they're feeling particularly paranoid. Can we build something that'll pack arbitrary data to match the beginning and end of a checksum? Let's call the beginning/end about 5 characters each, in keeping with the researched limits of short-term memory.

I know nothing about checksums, and today isn't the day I'm going to get all hot and bothered for math, so I'm going to do a little light research on manipulating checksums and see what techniques are generally used.

Research

Okay so first off, I found this paper that explained something I knew to be true - CRC checksums aren't secure (and are largely not used anymore because of this). I mostly see sha256 these days, so that's what we'll try to "beat."

Second... it turned out there's no way to get a handle on this problem without learning some math (ugh) so here we go on SHA-256:

By kockmeyer - Own work, CC BY-SA 3.0

SHA-256 is a hashing algorithm that turns your input into 512-bit chunks and does some math on them, then uses that to produce a 256-bit value. This hash is computed 64 times, with new input data for each round based on the previous round. If you want a deep dive into the math on this, here's a video of doing a round out by hand. If you want something a bit more digestible, this video is an excellent visual breakdown.

Third, there's lots of bitcoin bullshit videos out there saying that hashes are always unique. This isn't true. 256 bits is a lot of playground to work with - 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,935 unique values, to be exact. And while conceptually this means you could have a unique hash for every bit of unique data that could ever be created, things are just not quite that simple. Hash collisions (where the same hash is generated for two completely different inputs) exist. They're just hard to find.

Credit

Fourth, hey look bitcoin miners are basically already doing this. I knew that, but I guess I forgot. Proof-of-work crypto schemes take a transaction and append a nonce to them in order to achieve a particular prefix on the resultant sha256 hash of transaction + nonce. This seems like a great place to start this research.

Implementation

Okay so what we want to do is manipulate arbitrary data so that the first and last five characters of its sha256-sum match an arbitrary sha256-sum. We think we can take the same thing crypto miners are doing - fuzzing an appended nonce - to find a hash that has a given prefix and suffix.

While ostensibly I could take some code already written for this by the crypto bros, as a practical matter that's all going to have coin-specific code and blockchain network blah blah in it, and anyway I didn't find anything suitable in my 10 seconds of looking.

Instead, I'm going to try rolling my own. I'm itching to code - let's go!

The code

This problem is actually easy enough that we could probably get a prototype going using bash. But since I'm confident from my research that the approach is going to work, and I'm really not a fan of debugging bash, I'm going to throw this together in python.

Quick aside, python is really not the right choice for a problem like this, since it's an obviously intense and parallelizable problem, and python is shit at parallel processing. However, in the long run I want to experiment with this by modifying known data types (e.g. ELF, PDF) in non-destructive ways other than just appending a value. Python has a lot of great tools to do that quickly, so that's why python.

Here's our initial implementation:

import argparse
import hashlib

NONCE_SIZE = 1  # 1 byte
MAX_NONCE = (2 ** (NONCE_SIZE * 8)) - 1

def parse_args():
    parser = argparse.ArgumentParser(description="SHA256 prefix and suffix mutator")
    parser.add_argument('filename', metavar='filename', type=str, help='file to hash')
    parser.add_argument('--prefix', dest='prefix', action='store', default="", help='prefix to match')
    parser.add_argument('--suffix', dest='suffix', action='store', default="", help='suffix to match')

    return parser.parse_args()

def mutate(data, prefix, suffix):
    message = hashlib.sha256()
    message.update(data)
    digest = message.hexdigest()

    print(f"Initial SHA256 digest: {digest}")
    print(f"MAX_NONCE = {MAX_NONCE}")

    nonce = -1
    # Precompute for speed
    prefix_len = len(prefix) if prefix else 0
    suffix_len = len(suffix) if suffix else 0

    while (digest[:prefix_len] != prefix or digest[-suffix_len:] != suffix) and nonce <= MAX_NONCE:
        nonce += 1
        message = hashlib.sha256()
        message.update(data)
        message.update(nonce.to_bytes(NONCE_SIZE, "big"))
        digest = message.hexdigest()

    print(f"Got hash {digest} using nonce {nonce}")

if __name__ == '__main__':
    args = parse_args()

    with open(args.filename, 'rb') as f:
        data = f.read()

    mutate(data, args.prefix, args.suffix)

Running that, we get:

┌─[dizzyspiral@akatsuki:~/repos/github/sha-bam]
└─╼ python3 main.py main.py --prefix a --suffix b
Initial SHA256 digest: 35064e6ca1abd7b5f51f353e5d558aa947e389480ca6bdadc5660a32e615f88d
MAX_NONCE = 255
Traceback (most recent call last):
  File "main.py", line 47, in <module>
    mutate(data, args.prefix, args.suffix)
  File "main.py", line 32, in mutate
    message.update(nonce.to_bytes(NONCE_SIZE, "big"))
OverflowError: int too big to convert

So we up our nonce size:

NONCE_SIZE = 4

And now:

┌─[dizzyspiral@akatsuki:~/repos/github/sha-bam]
└─╼ python3 main.py main.py --prefix a --suffix b
Initial SHA256 digest: 9482419d29e5a2579d7ce71373d3aa5fc4a12a45a2378629c8e34bdd95174015
MAX_NONCE = 4294967295
Got hash aca28c372935e0b10acacbb81bf3b427ca5d64ae91b80870be3d71a4b0a9023b using nonce 28

You'll note that this modified code only took 28 tries, which is less than 255 - because we're running this with our code as input (because lazy), this isn't very deterministic or repeatable as we iterate on the code.

Let's use some static sample data.

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

And let's add some profiling

import cProfile
...
cProfile.run("mutate(data, args.prefix, args.suffix)")

┌─[dizzyspiral@akatsuki:~/repos/github/sha-bam]
└─╼ python3 main.py lorem.txt --prefix a --suffix b
Initial SHA256 digest: 56293a80e0394d252e995f2debccea8223e4b5b2b150bee212729b3b39ac4d46
MAX_NONCE = 4294967295
Got hash ae7ee87838542a984c3b82c9884f75231c870776500d6dfed3777566c5cdf07b using nonce 269
         1362 function calls in 0.001 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
        1    0.000    0.000    0.001    0.001 main.py:16(mutate)
      271    0.000    0.000    0.000    0.000 {built-in method _hashlib.openssl_sha256}
        1    0.000    0.000    0.001    0.001 {built-in method builtins.exec}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        3    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      271    0.000    0.000    0.000    0.000 {method 'hexdigest' of '_hashlib.HASH' objects}
      270    0.000    0.000    0.000    0.000 {method 'to_bytes' of 'int' objects}
      541    0.000    0.000    0.000    0.000 {method 'update' of '_hashlib.HASH' objects}

The profiling will be nice when we start upping the prefix and suffix lengths... as we'll do now.

┌─[dizzyspiral@akatsuki:~/repos/github/sha-bam]
└─╼ python3 main.py lorem.txt --prefix aa --suffix bb
Initial SHA256 digest: 56293a80e0394d252e995f2debccea8223e4b5b2b150bee212729b3b39ac4d46
MAX_NONCE = 4294967295
Got hash aa08146063c11f7b153146027543b0d0de02207048ce024d074b6010f86d6fbb using nonce 111950
         559767 function calls in 0.258 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.258    0.258 <string>:1(<module>)
        1    0.076    0.076    0.258    0.258 main.py:16(mutate)
   111952    0.013    0.000    0.013    0.000 {built-in method _hashlib.openssl_sha256}
        1    0.000    0.000    0.258    0.258 {built-in method builtins.exec}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        3    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   111952    0.039    0.000    0.039    0.000 {method 'hexdigest' of '_hashlib.HASH' objects}
   111951    0.013    0.000    0.013    0.000 {method 'to_bytes' of 'int' objects}
   223903    0.117    0.000    0.117    0.000 {method 'update' of '_hashlib.HASH' objects}

Adding two more bytes (one to the prefix, one to the suffix) has drastically increased our running time. Add two more, and:

┌─[dizzyspiral@akatsuki:~/repos/github/sha-bam]
└─╼ python3 main.py lorem.txt --prefix aaa --suffix bbb
Initial SHA256 digest: 56293a80e0394d252e995f2debccea8223e4b5b2b150bee212729b3b39ac4d46
MAX_NONCE = 4294967295
Got hash aaaace2fd053c08a86ed9aa2d6b9bb4cd2ee98935f197fc9f28c7020fe6debbb using nonce 5950739
         29753712 function calls in 13.329 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   13.329   13.329 <string>:1(<module>)
        1    3.800    3.800   13.329   13.329 main.py:16(mutate)
  5950741    0.648    0.000    0.648    0.000 {built-in method _hashlib.openssl_sha256}
        1    0.000    0.000   13.329   13.329 {built-in method builtins.exec}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        3    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  5950741    2.035    0.000    2.035    0.000 {method 'hexdigest' of '_hashlib.HASH' objects}
  5950740    0.734    0.000    0.734    0.000 {method 'to_bytes' of 'int' objects}
 11901481    6.112    0.000    6.112    0.000 {method 'update' of '_hashlib.HASH' objects}

It is very obvious where the time is spent. Let's graph those data points, for science.

We want to get it to match 5 characters (so 5 bytes) from the prefix and suffix each, so we're really aiming to mutate 10 bytes. Initial findings are... not good, on the runtime scale. Which is not a surprise.

We can get some extra performance by multi-threading, but as mentioned earlier, python is not the best language for this. Still, let's give it a go and see how we do.

import argparse
import hashlib
import threading
import math
import cProfile

NONCE_SIZE = 4  # 4 bytes
MAX_NONCE = (2 ** (NONCE_SIZE * 8)) - 1  # Largest int you can make with the number of nonce bytes
NUM_THREADS = 8

done = False

def parse_args():
    parser = argparse.ArgumentParser(description="SHA256 prefix and suffix mutator")
    parser.add_argument('filename', metavar='filename', type=str, help='file to hash')
    parser.add_argument('--prefix', dest='prefix', action='store', default="", help='prefix to match')
    parser.add_argument('--suffix', dest='suffix', action='store', default="", help='suffix to match')
    parser.add_argument('--multithread', dest='multithread', action='store_const', default=False, const=True, help='enable multithreading')

    return parser.parse_args()

def mutate(data, prefix, suffix, nonces):
    global done
    message = hashlib.sha256()
    message.update(data)

    # Precompute for speed
    prefix_len = len(prefix) if prefix else 0
    suffix_len = len(suffix) if suffix else 0

    for nonce in nonces:
        # If another thread got the answer, quit
        if done:
            break

        message = hashlib.sha256()
        message.update(data)
        message.update(nonce.to_bytes(NONCE_SIZE, "little"))  # Change to little, because intel
        digest = message.hexdigest()

        # If we got the anwswer, set the done flag and quit
        if digest[:prefix_len] == prefix and digest[-suffix_len:] == suffix:
            done = True
            print(f"Got hash {digest} using nonce {nonce}")
            break

def start_threads(data, prefix, suffix):
    threads = []

    for i in range(1, NUM_THREADS + 1):
        start_nonce = math.floor(MAX_NONCE / NUM_THREADS) * (i - 1)
        end_nonce = math.ceil((MAX_NONCE / NUM_THREADS) * i)
        print(f"Starting thread with nonce range {start_nonce} to {end_nonce}")
        threads.append(threading.Thread(target=mutate, args=(data, args.prefix, args.suffix, range(start_nonce, end_nonce))))
        threads[-1].start()

    for thread in threads:
        thread.join()

if __name__ == '__main__':
    args = parse_args()

    with open(args.filename, 'rb') as f:
        data = f.read()

    if args.multithread:
        cProfile.run("start_threads(data, args.prefix, args.suffix)")
    else:
        cProfile.run("mutate(data, args.prefix, args.suffix, range(MAX_NONCE))")

Since I changed the byte ordering to little endian, and we changed some other stuff about how the algorithm works, let's re-benchmark our single-threaded execution:

┌─[dizzyspiral@akatsuki:~/repos/github/sha-bam]
└─╼ python3 main.py lorem.txt --prefix aaa --suffix bbb
Got hash aaa07ea3b6faf37e8ad580ad7b60bf5b58fa1d30cf898c22819fff562261abbb using nonce 35489965
         177449839 function calls in 77.409 seconds

And now, multithreaded:

┌─[dizzyspiral@akatsuki:~/repos/github/sha-bam]
└─╼ python3 main.py lorem.txt --prefix aaa --suffix bbb --multithread
Starting thread with nonce range 0 to 536870912
Starting thread with nonce range 536870911 to 1073741824
Starting thread with nonce range 1073741822 to 1610612736
Starting thread with nonce range 1610612733 to 2147483648
Starting thread with nonce range 2147483644 to 2684354560
Starting thread with nonce range 2684354555 to 3221225472
Starting thread with nonce range 3221225466 to 3758096384
Starting thread with nonce range 3758096377 to 4294967295
Got hash aaae5c17d3d40008c490879d1d227c63dda58d9fc4e9244025965f191df5fbbb using nonce 1076940590
         356 function calls in 36.090 seconds

So we've cut execution time in about half, based on this one datapoint. I guess let's let it rip on just one more byte, in the prefix:

┌─[dizzyspiral@akatsuki:~/repos/github/sha-bam]
└─╼ python3 main.py lorem.txt --prefix aaaa --suffix bbb --multithread
Starting thread with nonce range 0 to 536870912
Starting thread with nonce range 536870911 to 1073741824
Starting thread with nonce range 1073741822 to 1610612736
Starting thread with nonce range 1610612733 to 2147483648
Starting thread with nonce range 2147483644 to 2684354560
Starting thread with nonce range 2684354555 to 3221225472
Starting thread with nonce range 3221225466 to 3758096384
Starting thread with nonce range 3758096377 to 4294967295
^C         274 function calls in 534.320 seconds

It ran for a while, but I got impatient and killed it.

So now we're faced with a fundamental question. Do we throw more compute at this, or do we try to do something clever to reduce the compute required? The former option isn't very interesting; crypto-mining has shown us that we can solve this problem with beefy GPUs. But that's not super attainable for the average dudette. Is there a way we can give ourselves a better chance of affecting the prefix and suffix of the hash?

First thing that comes to mind is changing up where we're inserting our nonce. If we put it at the beginning of the data, is it more likely to affect the prefix? We can do some benchmarking on randomly generated data to find out.

For what it's worth, I do doubt that will make a difference, because it would indicate a flaw in the checksum that could allow you to learn something about the original data. And sha256 has been in use for so long that such a trivial attack would be well known by now if it existed. But I take nothing for granted, so I'll definitely try it. In the next blog post ;)

DEV Community

"Beating" Checksums

Research

Implementation

The code

Oldest comments (0)