Introduction
Hash functions are commonly used for verifying data integrity and cryptographic purposes. In this article, I will implement the widely-used SHA-256 hash function and examine why it is irreversible (i.e., why the original value cannot be restored).
The conclusion is that the irreversibility of hash functions is achieved through information loss.
The Rust implementation of SHA-256 for this article is available here:
https://github.com/akira-19/algorithms_rust/tree/main/sha-256
The Flow of SHA-256
The irreversibility of SHA-256 can be understood by analyzing its flow. Let’s hash the string "msg"
as an example.
- First, convert the string
msg
into its character codes (hexadecimal representation). -
Next, format the message into a 64-byte block. Add
0x80
just after the original message, and append the binary length of the original message to the last 8 bytes of the block. Sincemsg
is 3 bytes, its binary length is 24 bits, represented as0x18
in hexadecimal. Pad the gap between0x80
and the message length with zeros. Split the 64-byte block into 4-byte segments. For example, the first segment
6d 73 67 80
becomes6d736780
(in decimal: 1836279680).-
Calculate the value
W
, which includes irreversible processing. A part of sample Rust code snippet is as follows.
fn calc_w(msg: [u32; 16]) -> [u32; 64] { let mut w = vec![0; 64]; for i in 0..16 { w[i] = msg[i]; } for i in 16..64 { w[i] = lower_sigma_1(w[i - 2]) .wrapping_add(w[i - 7]) .wrapping_add(lower_sigma_0(w[i - 15])) .wrapping_add(w[i - 16]); } w } fn lower_sigma_1(x: u32) -> u32 { rotr(x, 17) ^ rotr(x, 19) ^ shr(x, 10) } fn rotr(x: u32, n: u32) -> u32 { (x >> n) | (x << (32 - n)) } fn shr(x: u32, n: u32) -> u32 { x >> n }
The
calc_w
function calls thelower_sigma_1
function, which in turn callsrotr
. Let’s visualize how this works. -
As an example, let
x = 31
andn = 3
. In binary, 31 is11111
. Sincex
andn
are unsigned 32-bit integers,x
is represented as000...000011111
(27 leading zeros). The operation
x >> n
shiftsx
to the right byn
bits, adding000
to the left and removing the last111
.Similarly,
x << (32 - n)
shiftsx
to the left by32 - 3 = 29
bits. The first three bits become111
, followed by zeros.Finally, the bitwise OR of steps 6 and 7 results in a value like
111000...0011
.
The information loss occurs in the shr
operation. In step 6, the bits shifted out to the right cannot be recovered.
Additionally, in lower_sigma_1
, the operation rotr(x, 17) ^ rotr(x, 19)
results in information loss. To simplify, this process involves XORing 10000
shifted right by 1 bit with 10000
shifted right by 2 bits, yielding 01000 ^ 00100 = 00000
. This result could also originate from an original value of 00000
, making the original value unrecoverable.
By repeating such operations, it becomes impossible to restore the original value.
Top comments (0)