Thomas Pegler

Posted on Aug 6, 2023

Steganography: Part 2 - Advanced LSB

#python #steganography #tutorial #security

In Part 1 I showed a simple example of LSB steganography. Today I'll show how one more simple step can improve its resilience and make the hidden data harder for classic steganalysis tools to detect.

Note: One thing I didn't mention in Part 1 is that the code in this series will always favour preserving the integrity of the image over the integrity, and amount, of the data that can be embedded. I operate under the assumption that an adversary has access to the original copies of the images. This means that the amount of data that can be stored will always be lower than with algorithms that alter the images more heavily.

The easiest way to improve LSB steganography is to change how the data is embedded. There are a few proposed methods but for now, let's use simple for-loops to create blocks of pixels, like the process used in JPEG compression.

from pathlib import Path

from PIL import Image


def encode(filepath):
    start = '#####'
    stop = '*****'
    full = start + 'Some string that you want to encode into an image' + stop

    # 8 bits per character (assumes every character fits in one byte)
    binary_text = ''.join('{0:08b}'.format(ord(x)) for x in full)
    print(binary_text, len(binary_text))

    with Image.open(filepath) as im:
        # The code below reads and writes (R, G, B) tuples
        im = im.convert('RGB')
        i = 0
        w, h = im.size

        # A good block size is 8x8 or a multiple of 8
        min_block_size = 24

        print(f'Minimum block size: {min_block_size}')

        # The loops below only visit blocks that start before
        # w - min_block_size and h - min_block_size
        if min_block_size >= w or min_block_size >= h:
            print('Image too small to hold a single block')
            return

        for x in range(0, w - min_block_size, min_block_size):
            for y in range(0, h - min_block_size, min_block_size):

                for j in range(x, x + min_block_size):
                    for k in range(y, y + min_block_size):
                        # Repeat the message once we reach its end
                        if i >= len(binary_text):
                            i = 0

                        bit = binary_text[i]
                        pixel = im.getpixel((j, k))

                        if bit == "0":
                            # Is odd, should be even.
                            if pixel[0] % 2 != 0:
                                new_pix = (pixel[0] - 1, pixel[1], pixel[2])
                                im.putpixel((j, k), new_pix)

                        else:
                            # Is even, should be odd.
                            if pixel[0] % 2 == 0:
                                new_pix = (pixel[0] + 1, pixel[1], pixel[2])
                                im.putpixel((j, k), new_pix)

                        i += 1

        # Save next to the source image, in a lossless format such as PNG
        path = Path(filepath)
        im.save(path.with_name(f'encoded_{path.name}'))

As you can see, the above is almost identical to the previous example. The only difference is the pair of inner for loops:

for j in range(x, x + min_block_size):
    for k in range(y, y + min_block_size):
        if i >= len(binary_text):
            i = 0

        bit = binary_text[i]
        pixel = im.getpixel((j, k))

        if bit == "0":
            # Is odd, should be even.
            if pixel[0] % 2 != 0:
                new_pix = (pixel[0] - 1, pixel[1], pixel[2])
                im.putpixel((j, k), new_pix)

        else:
            # Is even, should be odd.
            if pixel[0] % 2 == 0:
                new_pix = (pixel[0] + 1, pixel[1], pixel[2])
                im.putpixel((j, k), new_pix)

        i += 1

This iterates over a square of min_block_size x min_block_size pixels and encodes the data sequentially within it. In theory, this makes the encoding more robust and harder for standard steganalysis tools to extract, since you have to know the block size to retrieve the data. That is both the strength and the weakness of this approach: you either have to fix a block size in advance, base it on the length of the input text, or send the block size some other way so that whoever is decoding knows which block size to use.
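To make the traversal order concrete, here is a minimal sketch that yields pixel coordinates in the same order as the nested loops above. The helper name block_coords is hypothetical and not part of the script; it only mirrors the loop bounds used in encode, on a tiny 5x5 grid.

```python
def block_coords(w, h, block_size):
    """Yield (x, y) pixel coordinates in the order the nested loops visit them."""
    for x in range(0, w - block_size, block_size):
        for y in range(0, h - block_size, block_size):
            for j in range(x, x + block_size):
                for k in range(y, y + block_size):
                    yield (j, k)


# A tiny 5x5 "image" with 2x2 blocks
coords = list(block_coords(5, 5, 2))
print(coords[:4])  # [(0, 0), (0, 1), (1, 0), (1, 1)]

# The same image read with 3x3 blocks visits pixels in a different order
print(list(block_coords(5, 5, 3))[:4])  # [(0, 0), (0, 1), (0, 2), (1, 0)]
```

The two orders diverge after the second pixel, which is why a decoder that guesses the wrong block size reads the bits out of sequence.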

Speaking of decoding, this method is essentially the same, just with the inner double for loop again.

import string


def decode(filepath, block_size=None):
    start = '#####'
    stop = '*****'
    binary_text = ''
    message = ''

    with Image.open(filepath) as im:
        w, h = im.size
        # Must match the block size used when encoding
        min_block_size = block_size or 24

        for x in range(0, w - min_block_size, min_block_size):
            for y in range(0, h - min_block_size, min_block_size):

                for j in range(x, x + min_block_size):
                    for k in range(y, y + min_block_size):
                        pixel = im.getpixel((j, k))

                        # Since we always want to get the LSB, we
                        # can just use the result of the modulo as
                        # our value
                        binary_text += f'{pixel[0] % 2}'

                        if len(binary_text) < 8:
                            continue

                        # A full byte has been read
                        char = chr(int(binary_text, 2))
                        binary_text = ''

                        if char in string.printable:
                            message += char

                        # The message repeats, so stop at the first
                        # complete start...stop pair
                        if message.endswith(stop) and start in message:
                            start_point = message.find(start) + len(start)
                            return message[start_point:-len(stop)]

    # No complete message was found
    return None

As you can see, this is essentially the same loop as in encode. The block size is passed as an argument in this example, with the known 24 as a fallback. I've also added a check that each decoded character is printable (less helpful for this example, but much more so later when we attempt to process cropped images).
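The bit handling can be tried without an image at all. The sketch below uses two hypothetical helpers, text_to_bits and bits_to_text, which are not part of the script; they mirror the '{0:08b}' formatting in encode and the printable check in decode, and assume every character fits in 8 bits.

```python
import string


def text_to_bits(text):
    """Turn text into a string of 0s and 1s, 8 bits per character."""
    return ''.join('{0:08b}'.format(ord(c)) for c in text)


def bits_to_text(bits):
    """Read 8 bits at a time and keep only the printable characters."""
    chars = []
    for i in range(0, len(bits) - 7, 8):
        char = chr(int(bits[i:i + 8], 2))
        if char in string.printable:
            chars.append(char)
    return ''.join(chars)


bits = text_to_bits('#####hi*****')
print(len(bits))           # 96
print(bits_to_text(bits))  # #####hi*****
```

A byte such as 00000001 decodes to a non-printable character, so bits_to_text drops it instead of adding noise to the message.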

Conclusion

Putting both parts together with a little argparse for ease of command line use, we get:

import argparse
import string
from pathlib import Path

from PIL import Image


def encode(filepath):
    start = '#####'
    stop = '*****'
    full = start + 'Some string that you want to encode into an image' + stop

    # 8 bits per character (assumes every character fits in one byte)
    binary_text = ''.join('{0:08b}'.format(ord(x)) for x in full)
    print(binary_text, len(binary_text))

    with Image.open(filepath) as im:
        # The code below reads and writes (R, G, B) tuples
        im = im.convert('RGB')
        i = 0
        w, h = im.size

        # A good block size is 8x8 or a multiple of 8
        min_block_size = 24

        print(f'Minimum block size: {min_block_size}')

        # The loops below only visit blocks that start before
        # w - min_block_size and h - min_block_size
        if min_block_size >= w or min_block_size >= h:
            print('Image too small to hold a single block')
            return

        for x in range(0, w - min_block_size, min_block_size):
            for y in range(0, h - min_block_size, min_block_size):

                for j in range(x, x + min_block_size):
                    for k in range(y, y + min_block_size):
                        # Repeat the message once we reach its end
                        if i >= len(binary_text):
                            i = 0

                        bit = binary_text[i]
                        pixel = im.getpixel((j, k))

                        if bit == "0":
                            # Is odd, should be even.
                            if pixel[0] % 2 != 0:
                                new_pix = (pixel[0] - 1, pixel[1], pixel[2])
                                im.putpixel((j, k), new_pix)

                        else:
                            # Is even, should be odd.
                            if pixel[0] % 2 == 0:
                                new_pix = (pixel[0] + 1, pixel[1], pixel[2])
                                im.putpixel((j, k), new_pix)

                        i += 1

        # Save next to the source image, in a lossless format such as PNG
        path = Path(filepath)
        im.save(path.with_name(f'encoded_{path.name}'))


def decode(filepath, block_size=None):
    start = '#####'
    stop = '*****'
    binary_text = ''
    message = ''

    with Image.open(filepath) as im:
        w, h = im.size
        # Must match the block size used when encoding
        min_block_size = block_size or 24

        for x in range(0, w - min_block_size, min_block_size):
            for y in range(0, h - min_block_size, min_block_size):

                for j in range(x, x + min_block_size):
                    for k in range(y, y + min_block_size):
                        pixel = im.getpixel((j, k))

                        # Since we always want to get the LSB, we
                        # can just use the result of the modulo as
                        # our value
                        binary_text += f'{pixel[0] % 2}'

                        if len(binary_text) < 8:
                            continue

                        # A full byte has been read
                        char = chr(int(binary_text, 2))
                        binary_text = ''

                        if char in string.printable:
                            message += char

                        # The message repeats, so stop at the first
                        # complete start...stop pair
                        if message.endswith(stop) and start in message:
                            start_point = message.find(start) + len(start)
                            return message[start_point:-len(stop)]

    # No complete message was found
    return None


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-a", "--action", help="encode or decode")
    parser.add_argument("-f", "--filepath", help="path to image")
    parser.add_argument("-b", "--block_size", required=False, type=int, help="block size")
    args = parser.parse_args()

    if args.filepath:
        if args.action == "encode":
            encode(args.filepath)
        elif args.action == "decode":
            # Pass the block size through so that -b actually takes effect
            print(decode(args.filepath, args.block_size))
        else:
            print("Invalid action")
    else:
        print("No filepath provided")

With that very simple script, you have your very own PNG steganographic tool. Simply ensure you have Pillow installed and from the terminal run something like:

python ./advanced.py -a encode -f file.png

Then:

python ./advanced.py -a decode -f encoded_file.png -b 24

And you'll have your very own, secretly encoded messaging system. The changes are undetectable to the human eye, even with the original.
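To put a number on how small the changes are, the sketch below isolates the even/odd adjustment from encode in a hypothetical helper, set_lsb (not part of the script), and checks every possible 8-bit red value. It assumes, as the script does, that only the red channel is touched.

```python
def set_lsb(value, bit):
    """Return value with its least significant bit forced to bit ('0' or '1')."""
    if bit == "0":
        # Is odd, should be even.
        return value - 1 if value % 2 != 0 else value
    # Is even, should be odd.
    return value + 1 if value % 2 == 0 else value


# Largest change to any 8-bit channel value, for either bit
max_change = max(abs(set_lsb(v, b) - v) for v in range(256) for b in "01")

# The adjusted value never leaves the valid 0-255 range
in_range = all(0 <= set_lsb(v, b) <= 255 for v in range(256) for b in "01")

print(max_change, in_range)  # 1 True
```

Each affected pixel's red value therefore moves by at most 1 out of 255.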

Header by Isis França on Unsplash

Top comments (3)

Retiago Drago • Sep 12 '23 • Edited

The first thing I notice about your approach is the depth of your loop. I believe this increases the time complexity since you go as deep as 4-5 levels. Do you think there's a faster way to do this? I'm considering flattening the image and using a specific mathematical formula to reduce the level of the loop. What are your thoughts? 😁

Thomas Pegler • Sep 12 '23

You're absolutely right, I really didn't optimise this at all. I mostly wanted to just get something simple, with only 1 or 2 libraries to try and make the topic a little easier to start with. I ended up doing my own implementation in C++ to improve the speed (because this Python one was far too slow).

I think a good approach might be to flatten the image into a single array and then use another algorithm to figure out the correct pixels to alter or just use Pandas. Do you know any better ways of handling it?

Retiago Drago • Sep 12 '23

In my post, I just utilized NumPy index operation and lookup table aka memoization for better and faster performance. Let me know what you think about my approach there. I'm still new and always exploring new stuff as time goes on.

Exploring Steganography in the Wild - Part 1

Retiago Drago ・ Sep 12

#python #security #cybersecurity #steganography
