Disclaimer! I am not a C++ developer, I just write a little bit of likely horrible C++ when I need to. Any tips on improving the code below are always welcome. It's quick and dirty and, since I'm still learning, works well enough for my purposes. If anyone reads this and actually wants a Python version (even though it'll be much slower), drop a comment and I'll do a write-up of this in Python.
For part 3 of this little series, we'll look at one implementation of JPEG steganography. This is still an evolving field with a lot to look at and many ways to do it but, in keeping with the rest of the series, we'll mostly be focusing on the visual integrity of the image itself.
Sadly no TL;DR this time. The code is a little long and involved and it really helps to understand why it's done the way that it is. Lossy format steganography is weird and more error prone than lossless due to how the lossy-ness is achieved.
Before diving into the actual code, it's worth talking about why it needs to be done the way we'll be doing it. So far, we've been using standard LSB steganography on lossless images, on the actual pixels themselves, but for JPEGs that won't cut it. That's because the pixel values, whether extracted manually or using a library, will be altered when the JPEG image is saved. JPEG compression is lossy, so the pixel values you write aren't guaranteed to be the ones you read back, and a single flipped bit is enough to corrupt the hidden data.
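If you want to see that effect in isolation, here's a rough sketch. It's a simplification: real JPEG quantizes DCT coefficients rather than raw pixel values, and the function name and the step size of 16 are made up purely for illustration, but the rounding that eats the least significant bit is the same idea.

```cpp
// Hypothetical helper, not part of the encoder below: quantize a
// non-negative value with step `q` (round to nearest), then reconstruct it,
// the way a lossy save followed by a load would.
int quantizeRoundTrip(int value, int q) {
    int quantized = (value + q / 2) / q;  // what gets stored in the file
    return quantized * q;                 // what comes back on load
}
```

With a step of 16, both 202 and 203 come back as 208, so whatever bit was hidden in the last position is gone.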
So for steganography, we'll focus on the parts of the file that don't change: the Quantization tables themselves! That's right, we're using the JPEG algorithm against itself. The tables used to compress and decompress the pixel values are mostly static, especially if the compression value of the file remains the same, and that is the crux of the approach. The pixel data, on the other hand, is compressed, so the values you get back can change depending on how you extract them (something I came up against when trying to figure this out myself).
You could also use the Huffman tables, perhaps both together, which would increase capacity but at the risk of more visual imperfections.
There are some pretty big downsides to this approach, the main one being the size of the data that we can store. The Quantization tables themselves are small; they're meant to be small. They also generally don't scale with the size of the image, so you can't just pick a big image and hope for the best. That being said, hopefully this will help you get started.
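To get a feel for just how small, here's a minimal sketch that walks the segments at the start of a JPEG and adds up the payload bytes of every Quantization table (DQT, marker FF DB) segment. The function name is hypothetical and this is not what the encoder below does; it simply relies on the standard segment layout of a two-byte marker followed by a two-byte big-endian length that counts itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: total payload bytes across all DQT segments.
// The payload includes one precision/table-id byte per table.
std::size_t quantTableBytes(const std::vector<std::uint8_t>& jpeg) {
    std::size_t total = 0;
    std::size_t pos = 2;  // skip the start-of-image marker (FF D8)
    while (pos + 4 <= jpeg.size() && jpeg[pos] == 0xFF) {
        const std::uint8_t marker = jpeg[pos + 1];
        if (marker == 0xDA) {
            break;  // start of scan: compressed image data follows
        }
        const std::size_t length =
            (static_cast<std::size_t>(jpeg[pos + 2]) << 8) | jpeg[pos + 3];
        if (length < 2) {
            break;  // malformed segment, give up
        }
        if (marker == 0xDB) {
            total += length - 2;  // exclude the two length bytes
        }
        pos += 2 + length;  // jump to the next marker
    }
    return total;
}
```

At one hidden bit per byte, divide the result by 8 to get a rough character count. A file with two 8-bit tables has 130 such bytes, so there's room for about 16 characters, and that has to include the start and end sequences.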
First, let's add some helpers:
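// constants.hpp
#include <string>
#include <vector>
namespace constants {
    typedef unsigned char BYTE;
    // Marker that starts a Quantization table segment
    inline std::vector<BYTE> QUANT = {0xFF, 0xDB};
    // Assumed values: the null-terminated ASCII identifiers "Exif" and "JFIF"
    inline std::vector<BYTE> EXIF = {0x45, 0x78, 0x69, 0x66, 0x00};
    inline std::vector<BYTE> JFIF = {0x4A, 0x46, 0x49, 0x46, 0x00};
    inline std::string START_SEQUENCE = "###";
    inline std::string END_SEQUENCE = "***";
}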
// helpers.hpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>
#include "constants.hpp"

static bool endsWith(std::string_view str, std::string_view suffix) {
    return str.size() >= suffix.size() &&
           0 == str.compare(str.size() - suffix.size(), suffix.size(), suffix);
}

static std::vector<constants::BYTE> binaryFileToVector(const std::string& filename) {
    // Open the file and check that worked before touching the stream.
    std::ifstream file(filename, std::ios::binary);
    if ( !file.is_open() ) {
        throw std::invalid_argument("Could not process. File could not be opened.");
    }
    // Stop eating new lines in binary mode!!!
    file.unsetf(std::ios::skipws);

    // Work out the file size so the vector can be allocated up front.
    file.seekg(0, std::ios::end);
    const std::streampos size = file.tellg();
    file.seekg(0, std::ios::beg);

    std::vector<constants::BYTE> inputBuffer;
    inputBuffer.reserve(static_cast<std::size_t>(size));
    inputBuffer.insert(inputBuffer.begin(),
                       std::istream_iterator<constants::BYTE>(file),
                       std::istream_iterator<constants::BYTE>());
    return inputBuffer;
}
This is a pretty basic function to take a file, get its size, create a vector of that size and stream the contents into it. Having the file as a vector makes processing it in the next step a little easier, as we can run the standard library's search algorithms (like std::search) over it.
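As a small illustration of that kind of searching, here's a sketch that collects the offset of every FF DB pair in a buffer using std::search. The function name and the Byte alias are mine, not part of the code in this post, and a raw byte search like this can also match FF DB pairs that aren't real table markers (inside an embedded thumbnail, for example), so treat it as a starting point only.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

using Byte = unsigned char;

// Hypothetical helper: offsets of every FF DB byte pair in the buffer.
std::vector<std::size_t> findQuantMarkers(const std::vector<Byte>& buffer) {
    static const std::vector<Byte> marker = {0xFF, 0xDB};
    std::vector<std::size_t> offsets;
    auto it = buffer.begin();
    while (true) {
        it = std::search(it, buffer.end(), marker.begin(), marker.end());
        if (it == buffer.end()) {
            break;  // no further matches
        }
        offsets.push_back(static_cast<std::size_t>(it - buffer.begin()));
        // Continue after this match; a match guarantees two bytes remain.
        std::advance(it, static_cast<std::ptrdiff_t>(marker.size()));
    }
    return offsets;
}
```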
Now for the encode function:
#include <algorithm>
#include <bitset>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>
#include "helpers.hpp"

std::string jpeg_encode(const std::string& image, const std::string& outputFile, const std::string& text) {
    std::vector<constants::BYTE> buffer { binaryFileToVector(image) };
    uint32_t i { 0 }, j { 0 };
    uint32_t dataSize { 0 };
    auto size { buffer.size() };
    // Start every iterator at end() so the checks below never read an uninitialised one.
    auto quantIter { buffer.end() };
    auto exifItr { buffer.end() };
    auto jfifItr { buffer.end() };
    bool headerFound {false}, initialQuantFound {false}, quantFound {false};

    // Let's convert our text to a binary string representation
    // This could just be kept as bitsets if you wanted.
    std::string binaryText;
    for ( unsigned char c : text ) {
        binaryText += std::bitset<8>(c).to_string();
    }

    // Now we're checking that the file is actually a valid JPEG: it has to start
    // with FF D8 and be long enough to hold the marker that follows.
    if ( size < 4 || buffer[0] != 0xFF || buffer[1] != 0xD8 ) {
        throw std::invalid_argument("Did not find a valid start tag, check the file is correct and not corrupt");
    }
    i += 2; // File start is correct, begin looking for valid JFIF or Exif marker

    if ( buffer[i] == 0xFF && buffer[i + 1] == 0xE1 ) {
        // APP1 segment: look for the Exif identifier
        exifItr = std::search(buffer.begin(), buffer.end(),
                              constants::EXIF.begin(), constants::EXIF.end());
    } else if ( buffer[i] == 0xFF && buffer[i + 1] == 0xE0 ) {
        // APP0 segment: look for the JFIF identifier, the same way jpeg_decode does
        jfifItr = std::search(buffer.begin(), buffer.end(),
                              constants::JFIF.begin(), constants::JFIF.end());
    }
    if ( exifItr != buffer.end() ) {
        headerFound = true;
        i = static_cast<uint32_t>(exifItr - buffer.begin() + constants::EXIF.size());
    } else if ( jfifItr != buffer.end() ) {
        headerFound = true;
        i = static_cast<uint32_t>(jfifItr - buffer.begin() + constants::JFIF.size());
    }
    if ( !headerFound ) {
        throw std::invalid_argument("Did not find a valid JFIF or Exif header, check the file is correct and not corrupt");
    }

    // JPEG files can be processed.
    quantIter = std::search(buffer.begin() + i, buffer.end(),
                            constants::QUANT.begin(), constants::QUANT.end());
    if ( quantIter != buffer.end() ) {
        initialQuantFound = true;
        quantFound = true;
        // Step back one byte so the loop below sees the FF DB marker at i + 1.
        i = static_cast<uint32_t>(quantIter - buffer.begin() - 1);
    }
    if ( initialQuantFound ) {
        // i + 2 < size keeps the two-byte look-ahead inside the buffer.
        while ( i + 2 < size ) {
            if ( buffer[i + 1] == 0xFF ) {
                if ( buffer[i + 2] == 0xDB ) {
                    // Start of a table. With the i++ below, the next byte visited is
                    // marker + 3: the low length byte, then the precision/id byte,
                    // then the table entries.
                    quantFound = true;
                    i += 3;
                } else {
                    // Any other FF byte ends the current table.
                    quantFound = false;
                }
            } else if ( quantFound && j < binaryText.size() ) {
                // Overwrite the least significant bit of this byte with the next
                // message bit. Each message bit is written exactly once.
                buffer[i] = static_cast<constants::BYTE>((buffer[i] & 0xFE) | (binaryText[j] - '0'));
                j++;
                dataSize++;
            }
            i++;
        }
    }
    if ( dataSize < binaryText.size() ) {
        throw std::invalid_argument("Could not store encoded string in this file as the string is too large.");
    }
    std::ofstream outputImageData(outputFile, std::ios::binary);
    outputImageData.write(reinterpret_cast<const char *>(buffer.data()), buffer.size());
    outputImageData.close();
    return outputFile;
}
So, what are we doing here? First, we're checking that the file is valid. Always a good call, especially with JFIF files since there are multiple possible valid headers. If it's correct, we find the first Quantization table (using FFDB as the beginning marker). While we're still in this block, we encode the information into the least significant bit of each byte. If the block ends, we stop encoding and search for the next block. Repeat until there is no data left to encode or we run out of Quantization tables.
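One thing the encode function doesn't do is add the START_SEQUENCE and END_SEQUENCE itself, so the text has to be wrapped before it's passed in or the decoder will never find the end. Here's a minimal sketch of that framing, plus the single-bit write done with a mask instead of a bitset string. Both helper names are hypothetical and the sequences are hard-coded to match the constants above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: wrap the message in "###" and "***" and expand it
// into individual bits, most significant bit of each character first.
std::vector<int> messageToBits(const std::string& message) {
    const std::string framed = "###" + message + "***";
    std::vector<int> bits;
    bits.reserve(framed.size() * 8);
    for (unsigned char c : framed) {
        for (int shift = 7; shift >= 0; --shift) {
            bits.push_back((c >> shift) & 1);
        }
    }
    return bits;
}

// Hypothetical helper: overwrite the least significant bit of `byte`.
std::uint8_t embedBit(std::uint8_t byte, int bit) {
    return static_cast<std::uint8_t>((byte & 0xFE) | (bit & 1));
}
```

The mask version changes a byte by at most 1, which is the whole reason the image still looks the same afterwards.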
And now for the decode function!
#include <algorithm>
#include <bitset>
#include <cctype>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>
#include "helpers.hpp"

std::string jpeg_decode(const std::string& image) {
    std::vector<constants::BYTE> buffer = binaryFileToVector(image);
    std::string decodedString;
    std::string binaryText;
    uint32_t i { 0 };
    uint32_t bitCount { 0 };
    auto size { buffer.size() };
    // Start every iterator at end() so the checks below never read an uninitialised one.
    auto quantIter { buffer.end() };
    auto jfifItr { buffer.end() };
    auto exifItr { buffer.end() };
    bool jfifFound {false}, exifFound {false}, initialQuantFound {false}, quantFound {false};

    if ( size < 4 || buffer[0] != 0xFF || buffer[1] != 0xD8 ) {
        throw std::invalid_argument("Did not find a valid start tag, check the file is correct and not corrupt");
    }
    i += 2; // File start is correct, begin looking for valid JFIF or Exif marker

    if ( buffer[i] == 0xFF && buffer[i + 1] == 0xE1 ) {
        exifItr = std::search(buffer.begin(), buffer.end(),
                              constants::EXIF.begin(), constants::EXIF.end());
    } else if ( buffer[i] == 0xFF && buffer[i + 1] == 0xE0 ) {
        jfifItr = std::search(buffer.begin(), buffer.end(),
                              constants::JFIF.begin(), constants::JFIF.end());
    }
    if ( exifItr != buffer.end() ) {
        exifFound = true;
        i = static_cast<uint32_t>(exifItr - buffer.begin() + constants::EXIF.size());
    } else if ( jfifItr != buffer.end() ) {
        jfifFound = true;
        i = static_cast<uint32_t>(jfifItr - buffer.begin() + constants::JFIF.size());
    }
    if ( !jfifFound && !exifFound ) {
        throw std::invalid_argument("Did not find a valid JFIF or Exif header, check the file is correct and not corrupt");
    }

    quantIter = std::search(buffer.begin() + i, buffer.end(),
                            constants::QUANT.begin(), constants::QUANT.end());
    if ( quantIter != buffer.end() ) {
        initialQuantFound = true;
        quantFound = true;
        // Start right after the marker. That is one byte earlier than the encoder's
        // first write, so the first bit read is noise; the resync below absorbs it.
        i = static_cast<uint32_t>(quantIter - buffer.begin() + constants::QUANT.size());
    }
    if ( initialQuantFound ) {
        // i + 2 < size keeps the two-byte look-ahead inside the buffer.
        while ( i + 2 < size ) {
            if ( buffer[i + 1] == 0xFF ) {
                if ( buffer[i + 2] == 0xDB ) {
                    quantFound = true;
                    i += 3;
                } else {
                    quantFound = false;
                }
            } else if ( quantFound ) {
                if ( bitCount != 0 && bitCount % 8 == 0 ) {
                    char letter = char(std::bitset<8>(binaryText).to_ulong());
                    if ( std::isprint(static_cast<unsigned char>(letter)) ) {
                        decodedString.push_back(letter);
                        binaryText.clear();
                        bitCount = 0;
                    } else {
                        // Not printable, so assume we're misaligned: drop the
                        // oldest bit and slide the window along by one.
                        binaryText.erase(binaryText.begin());
                        bitCount = 7;
                    }
                }
                // Collect the least significant bit of this byte.
                binaryText.push_back((buffer[i] & 1) ? '1' : '0');
                if ( endsWith(decodedString, constants::END_SEQUENCE) ) {
                    break; // Found the end of the message, stop reading
                }
                bitCount++;
            }
            i++;
        }
    }

    if ( decodedString.empty() ) {
        throw std::invalid_argument("Could not find an encoded string in this file.");
    }
    if ( endsWith(decodedString, constants::END_SEQUENCE) ) {
        decodedString.erase(decodedString.length() - constants::END_SEQUENCE.length());
        if ( decodedString.starts_with(constants::START_SEQUENCE) ) {
            decodedString.erase(0, constants::START_SEQUENCE.length());
        }
    } else {
        decodedString = "FAILED";
    }
    return decodedString;
}
Here we're functionally doing the same as with the encoding step, just extracting the information from the Quantization blocks. Once we hit the END_SEQUENCE or run out of blocks to check, we look at the string we found and see if it ends with our END_SEQUENCE. If so, we trim the START_SEQUENCE and END_SEQUENCE and return the string. If not, well, we've failed somewhere or the image was never encoded to begin with.
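If you'd rather be stricter about that last step, here's a small sketch of a variation that only accepts a result when both sequences are present and returns whatever sits between them. The function name is hypothetical and the behaviour differs from the decode function above, which only requires the END_SEQUENCE.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: return the text between "###" and "***",
// or std::nullopt if either sequence is missing.
std::optional<std::string> extractPayload(const std::string& decoded) {
    const std::string start = "###";
    const std::string end = "***";
    const auto startPos = decoded.find(start);
    if (startPos == std::string::npos) {
        return std::nullopt;
    }
    const auto bodyPos = startPos + start.size();
    const auto endPos = decoded.find(end, bodyPos);
    if (endPos == std::string::npos) {
        return std::nullopt;
    }
    return decoded.substr(bodyPos, endPos - bodyPos);
}
```

Returning an optional rather than the string "FAILED" also means a message that happens to be the word FAILED can't be confused with an error.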
Final disclaimer, I promise. This process is incredibly fragile. This is not the best method for steganography within JPEGs, because it is so fragile. Utilising the parts of the image file that don't tend to change makes it slightly more robust, but it is still not perfect.
If you're really looking into using steganography for images, I highly recommend looking up the file structure and encode/decode process for the image files so you can find areas of the process that you can take advantage of to make this better. Another step would be to check out some of the amazing ML tools for this. SteganoGAN is probably the most popular and robust so far.
That's everything. As ever, if you have any questions, comments or just want to let me know I write terrible code, leave a comment. I always want to learn more and I hope I will always be the dumbest person in the room so I can get the chance to learn and grow.
Header by Drew Dizzy Graham on Unsplash
Top comments (4)
Very interesting series, do you have any recommendation on resources to learn more about steganography?
Unfortunately nothing easily accessible beyond the types of steganography that I've already discussed. In trying to learn it myself I found a few articles that discussed the basics, some that had concrete examples and others that just point to popular libraries such as JSteg. The vast majority of the available resources are academic papers, which are not fun to read.
However, if you want to learn more about it and know of any means to find the academic papers then here are a few I read in my attempts to learn:
sciencedirect.com/science/article/...
arxiv.org/abs/2107.13151
arxiv.org/ftp/arxiv/papers/1909/19...
A lot of what I've talked about I figured out from the Wikipedia page and a lot of trial and error. The content of this article in particular came from just understanding how JPEG compression works, the file structure and a lot of testing (several weeks' worth). I'm sorry that's likely not the answer you were looking for.
The best means I have for learning more about it is to learn the basics of the file format you want to attempt it on, look for any existing library that supports it and reverse engineer it. I really love steganography as a means of concealment so I'm always willing to discuss it.
Thank you very much for the informative reply!
Not a problem, always happy to help! The best way to get into it is really to try it out. It's a rapidly evolving technology and is likely to just get more important. Google recently announced that they're going to be "invisibly tagging" AI generated content which is likely a form of steganography so I have a feeling it's going to explode fairly soon.
Best of luck in your learning journey and if you have any questions I'm always around to help whenever I can.
Looking forward to seeing what you create!