In Part 1 I had a simple example of LSB steganography. Today I'll show how another simple step can improve resiliency and make it harder for classic steganalysis tools to detect.
Note: One thing I didn't mention in part 1 is that the code in these will always attempt to preserve the integrity of the images over the integrity, and amount, of data that can be embedded. I operate under the assumption that an adversary has access to the original copies of the images. This means that the amount of data that can be stored will always be lower than some other algorithms that more heavily alter the images.
The easiest way to improve LSB steganography is to change how the data is embedded. There are a few proposed methods but for now, let's use simple for-loops to create blocks of pixels, like the process used in JPEG compression.
from PIL import Image
def encode(filepath):
start = '#####'
stop = '*****'
full = start + 'Some string that you want to encode into an image' + stop
binary_text = ''.join('{0:08b}'.format(ord(x), 'b') for x in full)
print(binary_text, len(binary_text))
with Image.open(filepath) as im:
i = 0
w, h = im.size
# A good block size is 8x8 or a multiple of 8
min_block_size = 24
print(f'Minimum block size: {min_block_size}')
if min_block_size > w or min_block_size > h:
print('Data too large to store in image')
return
for x in range(0, w - min_block_size, min_block_size):
for y in range(0, h - min_block_size, min_block_size):
for j in range(x, x + min_block_size):
for k in range(y, y + min_block_size):
if i >= len(binary_text):
i = 0
bit = binary_text[i]
pixel = im.getpixel((j, k))
if bit == "0":
# Is odd, should be even.
if pixel[0] % 2 != 0:
new_pix = (pixel[0] - 1, pixel[1], pixel[2])
im.putpixel((j, k), new_pix)
else:
# Is even, should be odd.
if pixel[0] % 2 == 0:
new_pix = (pixel[0] + 1, pixel[1], pixel[2])
im.putpixel((j, k), new_pix)
i += 1
im.save(f'encoded_{filepath}')
As you can see the above is almost identical to the previous example, the only difference is the pair of inner for loops:
for j in range(x, x + min_block_size):
for k in range(y, y + min_block_size):
if i >= len(binary_text):
i = 0
bit = binary_text[i]
pixel = im.getpixel((j, k))
if bit == "0":
# Is odd, should be even.
if pixel[0] % 2 != 0:
new_pix = (pixel[0] - 1, pixel[1], pixel[2])
im.putpixel((j, k), new_pix)
else:
# Is even, should be odd.
if pixel[0] % 2 == 0:
new_pix = (pixel[0] + 1, pixel[1], pixel[2])
im.putpixel((j, k), new_pix)
i += 1
This iterates over a square of min_block_size X min_block_size
and encodes the data sequentially there. In theory, this makes the encoding more robust and harder for standard steganalysis tools to extract since you have to know the block size to retrieve it. This is the strength and weakness of this approach. You have to either define a block size, the length of the input text or send the block size some other way so that whoever is decoding it can know what block size to use.
Speaking of decoding, this method is essentially the same, just with the inner double for loop again.
def decode(filepath, block_size=None):
start = '#####'
stop = '*****'
found = False
binary_stop = ''.join('{0:08b}'.format(ord(x), 'b') for x in stop)
bit_count = 0
message = ''
with Image.open(filepath) as im:
w, h = im.size
binary_text = ''
# A good block size is 8x8 or a multiple of 8
min_block_size = block_size or 24
while not found:
for x in range(0, w - min_block_size, min_block_size):
for y in range(0, h - min_block_size, min_block_size):
if message.endswith(stop):
found = True
break
for j in range(x, x + min_block_size):
for k in range(y, y + min_block_size):
if bit_count == 8:
char = chr(int(binary_text, 2))
if char in string.printable:
message += char
bit_count = 0
binary_text = ''
pixel = im.getpixel((j, k))
# Since we always want to get the LSB, we
# can just use the result of the modulo as
# our value
binary_text += f'{pixel[0] % 2}'
bit_count += 1
if found:
start_point = message.find(start) + len(start)
end = message.find(stop)
message = message[start_point:end]
return message
As you can see, this is essentially the same loop as the encode, the block size is passed as an argument in this example with the known 24 as a backup. I've also added a check for the found char, to ensure it is printable (less helpful for this example but much more so later when we attempt to process cropped images).
Conclusion
Putting both parts together with a little argparse
for ease of command line use, we get:
import argparse
import string
from PIL import Image
def encode(filepath):
start = '#####'
stop = '*****'
full = start + 'Some string that you want to encode into an image' + stop
binary_text = ''.join('{0:08b}'.format(ord(x), 'b') for x in full)
print(binary_text, len(binary_text))
with Image.open(filepath) as im:
i = 0
w, h = im.size
# A good block size is 8x8 or a multiple of 8
min_block_size = 24
print(f'Minimum block size: {min_block_size}')
if min_block_size > w or min_block_size > h:
print('Data too large to store in image')
return
for x in range(0, w - min_block_size, min_block_size):
for y in range(0, h - min_block_size, min_block_size):
for j in range(x, x + min_block_size):
for k in range(y, y + min_block_size):
if i >= len(binary_text):
i = 0
bit = binary_text[i]
pixel = im.getpixel((j, k))
if bit == "0":
# Is odd, should be even.
if pixel[0] % 2 != 0:
new_pix = (pixel[0] - 1, pixel[1], pixel[2])
im.putpixel((j, k), new_pix)
else:
# Is even, should be odd.
if pixel[0] % 2 == 0:
new_pix = (pixel[0] + 1, pixel[1], pixel[2])
im.putpixel((j, k), new_pix)
i += 1
im.save(f'encoded_{filepath}')
def decode(filepath, block_size=None):
start = '#####'
stop = '*****'
found = False
binary_stop = ''.join('{0:08b}'.format(ord(x), 'b') for x in stop)
bit_count = 0
message = ''
with Image.open(filepath) as im:
w, h = im.size
binary_text = ''
# A good block size is 8x8 or a multiple of 8
min_block_size = block_size or 24
while not found:
for x in range(0, w - min_block_size, min_block_size):
for y in range(0, h - min_block_size, min_block_size):
if message.endswith(stop):
found = True
break
for j in range(x, x + min_block_size):
for k in range(y, y + min_block_size):
if bit_count == 8:
char = chr(int(binary_text, 2))
if char in string.printable:
message += char
bit_count = 0
binary_text = ''
pixel = im.getpixel((j, k))
# Since we always want to get the LSB, we
# can just use the result of the modulo as
# our value
binary_text += f'{pixel[0] % 2}'
bit_count += 1
if found:
start_point = message.find(start) + len(start)
end = message.find(stop)
message = message[start_point:end]
return message
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("-a", "--action", help="encode or decode")
parser.add_argument("-f", "--filepath", help="path to image")
parser.add_argument("-b", "--block_size", required=False, type=int, help="block size")
args = parser.parse_args()
if args.filepath:
if args.action == "encode":
encode(args.filepath)
elif args.action == "decode":
print(decode(args.filepath))
else:
print("Invalid action")
else:
print("No filepath provided")
With that very simple script, you have your very own PNG steganographic tool. Simply ensure you have Pillow installed and from the terminal run something like:
python ./advanced.py -a encode -f file.png
Then:
python ./advanced.py -a decode -f encoded_file.png -b 24
And you'll have your very own, secretly encoded messaging system. The changes are undetectable to the human eye, even with the original.
Header by Isis França on Unsplash
Top comments (3)
The first thing I notice about your approach is the depth of your loop. I believe this increases the time complexity since you go as deep as 4-5 levels. Do you think there's a faster way to do this? I'm considering flattening the image and using a specific mathematical formula to reduce the level of the loop. What are your thoughts? 😁
You're absolutely right, I really didn't optimise this at all. I mostly wanted to just get something simple, with only 1 or 2 libraries to try and make the topic a little easier to start with. I ended up doing my own implementation in C++ to improve the speed (because this Python one was far too slow).
I think a good approach might be to flatten the image into a single array and then use another algorithm to figure out the correct pixels to alter or just use Pandas. Do you know any better ways of handling it?
In my post, I just utilized NumPy index operation and lookup table aka memoization for better and faster performance. Let me know what you think about my approach there. I'm still new and always exploring new stuff as time goes on.
Exploring Steganography in the Wild - Part 1
Retiago Drago ・ Sep 12