I came up with an algorithm to compress numbers into Base64, so I'd like to introduce it.
The compression rate is about 55%.
Because the leading zeros disappear, I hope you will use it for numbers that do not start with a zero, such as when you want to save long-type numbers.
one byte character There should be 95 , so you can also use Base95.
This will allow for more compression, but the compression rate should be just under 50%, so there shouldn't be much difference between the two.
If you don't want to use commas to make management easier, you'll have to compromise and use Base94, but making such modifications later is a hassle, so it's probably easier to stick to Base64 in the end.
Code to compress/decompress
public static string CompressNumberToBase64(string targetNumber)
{
string thinking = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
System.Text.StringBuilder create = new();
System.Numerics.BigInteger nowBigInteger = System.Numerics.BigInteger.Parse(targetNumber);
System.Numerics.BigInteger bigInteger64 = new(64);
System.Numerics.BigInteger outBigInteger;
while (0 <= nowBigInteger.CompareTo(bigInteger64))
{
System.Numerics.BigInteger.DivRem(nowBigInteger, bigInteger64, out outBigInteger);
create.Append(thinking[(int)outBigInteger]);
nowBigInteger = System.Numerics.BigInteger.Divide(nowBigInteger, bigInteger64);
}
System.Numerics.BigInteger.DivRem(nowBigInteger, bigInteger64, out outBigInteger);
create.Append(thinking[(int)outBigInteger]);
return create.ToString();
}
public static string DecompressBase64ToNumber(string targetComperssNumber)
{
string thinking = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
System.Numerics.BigInteger nowBigInteger = new(0);
System.Numerics.BigInteger bigInteger64 = new(64);
for (int i = targetComperssNumber.Length - 1; 0 <= i; i--)
{
nowBigInteger = BigInteger.Multiply(nowBigInteger, bigInteger64);
nowBigInteger = BigInteger.Add(nowBigInteger, thinking.IndexOf("" + targetComperssNumber[i]));
}
return nowBigInteger.ToString();
}
When compressing a 10,000-digit number with this code,
it is compressed to 5,537 characters.
A 10-digit number can also be compressed to 6 characters.
This means that the compression rate is about 55%.
If you know that only long type numbers will be converted,
it would definitely be faster to rewrite BigInteger to long.
From the perspective of cheat prevention,
it is assumed that it will be implemented in combination with encryption,
but it may actually work well with checksums.
Above, I introduced an algorithm that compresses numbers to about 55% of the number of characters.
A compression rate of 55% is about right when you consider that 10 to the power of 10 is 1 billion and 64 to the power of 5 is about 1.07 billion.
If anyone knows of a better algorithm that "runs faster" or has a higher compression rate, I would be grateful if you could let me know in the comments.
That's all for now.
Thank you for reading.
Top comments (0)