Given a bitmask and an input, pext selects the input bits at every position where the mask has a set bit, and compresses them together to produce a new result.

|0000100000001111100000100001111100010000000010000001001000010000| < Operand A
  ^      ^             ^       ^                ^      ^    ^   ^
|0100000010000000000000100000001000000000000000010000001000010001| < Mask
                                | Extract bits at mask
                                V
|.0......0.............1.......1................0......1....1...0|
                                | Compress into new 64-bit integer
                                V
|0000000000000000000000000000000000000000000000000000000000110110| < Result
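To make the extract-and-compress step concrete, here is a minimal portable sketch of what a parallel bit extract computes. `SoftwarePext` is a hypothetical helper name used only for illustration; it is not the hardware instruction and makes no claim about how the instruction is implemented.

```cpp
#include <cstdint>

// Portable sketch of a parallel bit extract (the operation pext performs).
// SoftwarePext is a hypothetical name, not part of any library.
constexpr std::uint64_t SoftwarePext( std::uint64_t Source, std::uint64_t Mask )
{
    std::uint64_t Result = 0;
    // Walk the set bits of the mask from least to most significant;
    // OutBit is the next free bit position in the compressed result
    for( std::uint64_t OutBit = 1; Mask != 0; OutBit <<= 1 )
    {
        // (~Mask + 1) is the two's complement of Mask, so the AND
        // isolates the lowest set bit still left in the mask
        const std::uint64_t Lowest = Mask & ( ~Mask + 1 );
        if( Source & Lowest )
        {
            Result |= OutBit;
        }
        // Clear that mask bit and move on to the next one
        Mask &= ( Mask - 1 );
    }
    return Result;
}
```

For instance, `SoftwarePext(0b11100100, 0b01010101)` reads bits 0, 2, 4 and 6 of the input (0, 1, 0, 1) and packs them into `0b1010`.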

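#include <cstdint>
#if defined(__BMI2__)
#include <immintrin.h> // _pext_u64
#endif

// Goes from ascii "00100101" to binary byte 0b00100101('%')
// the ascii string "00100101" is 8 bytes, so it fits perfectly
// within a 64-bit integer
inline std::uint8_t DecodeBase2Word( std::uint64_t BinAscii )
{
    // A little-endian load puts the first character in the lowest byte;
    // swap so the first character ends up as the most significant bit
    const std::uint64_t CurInput = __builtin_bswap64(BinAscii);
    std::uint8_t Binary = 0;
#if defined(__BMI2__)
    // Much faster, or is it?
    Binary = static_cast<std::uint8_t>(_pext_u64(CurInput, 0x0101010101010101ULL));
#else
    // Serial bit extraction
    std::uint64_t Mask = 0x0101010101010101ULL;
    for( std::uint64_t CurBit = 1ULL; Mask != 0; CurBit <<= 1 )
    {
        // Mask & -Mask isolates the lowest set bit of the mask
        if( CurInput & Mask & -Mask )
        {
            Binary |= CurBit;
        }
        // Clear the lowest set bit
        Mask &= (Mask - 1ULL);
    }
#endif
    return Binary;
}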
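`DecodeBase2Word` takes an already-loaded 64-bit integer, so the caller still has to get eight characters of text into one. A minimal sketch of one way to do that is below; `DecodeBase2Text` is a hypothetical wrapper, and it assumes a little-endian target and a GCC/Clang-style `__builtin_bswap64`, as the code above does. It uses `std::memcpy` for the load so no alignment or aliasing assumptions are needed.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical wrapper: decodes exactly 8 characters of '0'/'1' text.
// Assumes a little-endian target and __builtin_bswap64 (GCC/Clang).
inline std::uint8_t DecodeBase2Text( const char* Text )
{
    std::uint64_t Word;
    std::memcpy(&Word, Text, sizeof(Word)); // alignment-safe 8-byte load
    Word = __builtin_bswap64(Word);         // first character -> top byte
    std::uint8_t Binary = 0;
    for( unsigned Bit = 0; Bit < 8; ++Bit )
    {
        // Bit 0 of each byte is all that separates '1' (0x31) from '0' (0x30)
        Binary |= static_cast<std::uint8_t>(((Word >> (Bit * 8)) & 1u) << Bit);
    }
    return Binary;
}
```

With this sketch, `DecodeBase2Text("01000001")` yields `0x41`, the byte for `'A'`.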

3210987654321098765432109876543210987654321098765432109876543210
666655555555554444444444333333333322222222221111111111
-----------------------------------------------------------------
0000100000001111100000100001111100010000000010000001001000010000| < Operand A
 ^      ^             ^       ^                ^      ^    ^   ^
+-------+-------+-------+-------+-------+-------+-------+-------+
|  62   |  55   |  41   |  33   |  16   |   9   |   4   |   0   | < Operand B
+-------+-------+-------+-------+-------+-------+-------+-------+
                                | Get bits at index
                                V
+-------+-------+-------+-------+-------+-------+-------+-------+
|   0   |   0   |   1   |   1   |   0   |   1   |   1   |   0   |
+-------+-------+-------+-------+-------+-------+-------+-------+
                                | Compress into new 8-bit integer
                                V
                        +----------------+
                        |   0b00110110   |
                        +----------------+
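The index-based selection in the diagram can be modelled for a single 64-bit lane with a few lines of scalar code. This is only a sketch of the idea: `BitShuffleLane` is a hypothetical name, and the byte ordering (lowest control byte produces result bit 0) is chosen to match how the control constant `0x00'08'10'18'20'28'30'38` is used in the decoder in this article.

```cpp
#include <cstdint>

// Scalar model of one 64-bit lane of a bit shuffle: each of the 8 control
// bytes names a bit index (0-63) in Data, and the selected bits become
// bits 0-7 of the result. BitShuffleLane is a hypothetical name.
constexpr std::uint8_t BitShuffleLane( std::uint64_t Data, std::uint64_t Control )
{
    std::uint8_t Result = 0;
    for( unsigned i = 0; i < 8; ++i )
    {
        // Control byte i (lowest byte first) holds the bit index to fetch;
        // only its low 6 bits are used
        const unsigned Index = static_cast<unsigned>( Control >> (i * 8) ) & 63u;
        Result = static_cast<std::uint8_t>( Result | (((Data >> Index) & 1u) << i) );
    }
    return Result;
}
```

Feeding it the little-endian load of the text `"01000001"` (the integer `0x3130303030303130`) with the control word `0x0008101820283038` selects bit 0 of each byte, last character first, and gives `0x41` without any byte swap.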

#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <immintrin.h> // needs AVX512-BITALG (plus AVX512-VL and AVX512-BW)

// '0' : 0b00110000
// '1' : 0b00110001
//                ^ Extract and compress these bits
// the rest of the bits stay the same! (0x30)
// (assuming you've validated your input)
void Base2Decode( const std::uint64_t Input[], std::uint8_t Output[], std::size_t Length )
{
    std::size_t i = 0;
    // 8 at a time
    for( ; i + 8 <= Length; i += 8 )
    {
        const __mmask64 Compressed = _mm512_bitshuffle_epi64_mask(
            _mm512_loadu_si512(reinterpret_cast<const __m512i*>(Input + i)),
            _mm512_set1_epi64(0x00'08'10'18'20'28'30'38)
        );
        _store_mask64(reinterpret_cast<__mmask64*>(Output + i), Compressed);
    }
    // 4 at a time
    for( ; i + 4 <= Length; i += 4 )
    {
        const __mmask32 Compressed = _mm256_bitshuffle_epi64_mask(
            _mm256_loadu_si256(reinterpret_cast<const __m256i*>(Input + i)),
            _mm256_set1_epi64x(0x00'08'10'18'20'28'30'38)
        );
        _store_mask32(reinterpret_cast<__mmask32*>(Output + i), Compressed);
    }
    // 2 at a time
    for( ; i + 2 <= Length; i += 2 )
    {
        const __mmask16 Compressed = _mm_bitshuffle_epi64_mask(
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(Input + i)),
            _mm_set1_epi64x(0x00'08'10'18'20'28'30'38)
        );
        _store_mask16(reinterpret_cast<__mmask16*>(Output + i), Compressed);
    }
    // Serial (could probably just use the pext implementation here,
    // but I'm demonstrating bitshuffle_epi64)
    for( ; i < Length; ++i )
    {
        const __mmask16 Compressed = _mm_bitshuffle_epi64_mask(
            _mm_loadl_epi64(reinterpret_cast<const __m128i*>(Input + i)),
            _mm_set1_epi64x(0x00'08'10'18'20'28'30'38)
        );
        Output[i] = static_cast<std::uint8_t>(_cvtmask16_u32(Compressed));
    }
}

int main()
{
    // "Hello World!"
    const std::uint64_t* Input = (const std::uint64_t*)"010010000110010101101100011011000110111100100000010101110110111101110010011011000110010000100001";
    std::uint8_t Output[12] = {0};
    Base2Decode(Input, Output, 12);
    std::printf("Output: '%.12s'\n", reinterpret_cast<const char*>(Output));
}
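The decoder's comment notes that the input is assumed to be validated, and that every byte other than its lowest bit is the constant `0x30`. That observation suggests a cheap check per 64-bit word, sketched below. `IsBase2Word` is a hypothetical helper written for this illustration; it is not taken from the base2 source.

```cpp
#include <cstdint>

// Checks that all 8 bytes of a loaded word are ASCII '0' (0x30) or '1' (0x31).
// IsBase2Word is a hypothetical helper, not taken from the base2 source.
constexpr bool IsBase2Word( std::uint64_t Word )
{
    // Masking off bit 0 of every byte must leave exactly 0x30 in each byte
    return ( Word & 0xFEFEFEFEFEFEFEFEULL ) == 0x3030303030303030ULL;
}
```

A word loaded from `"01000001"` passes, while a word containing any other character (for example `'a'`, `0x61`, or `'2'`, `0x32`) fails, since clearing the low bit of those bytes does not produce `0x30`.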

base2

In the same spirit as the GNU coreutils program base64, base2 transforms data read from a file, or standard input, into (or from) base2 (binary text) encoded form.

Because I was bored.

base2 - Wunkolo <wunkolo@gmail.com>
Usage: base2 [Options]... [File]
       base2 --decode [Options]... [File]
Options
  -h, --help            Display this help/usage information
  -d, --decode          Decode's incoming binary ascii into bytes
  -i, --ignore-garbage  When decoding, ignores non-ascii-binary `0`, `1` bytes
  -w, --wrap=Columns    Wrap encoded binary output within columns
                        Default is `76`. `0` Disables linewrapping

Encoding:

% base2 <<< 'QWERTY'
01010001010101110100010101010010010101000101100100001010
% base2 --wrap=8 <<< 'QWERTY'
01010001                          # 'Q'
01010111                          # 'W'
01000101                          # 'E'
01010010                          # 'R'
01010100                          # 'T'
01011001                          # 'Y'
00001010                          # '\n'
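The encoding direction shown above is simple enough to sketch in a few lines. The function below is a hypothetical illustration, not the tool's implementation: `EncodeBase2` and its parameters are made-up names, and placing line breaks only between lines (never after the last one) is a choice of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical encoder sketch: each input byte becomes 8 characters,
// most significant bit first. A line break is inserted after every
// `Columns` characters (0 disables wrapping), but not at the very end.
inline std::string EncodeBase2( const std::string& Bytes, std::size_t Columns )
{
    std::string Text;
    std::size_t LineLength = 0;
    for( const char Byte : Bytes )
    {
        const std::uint8_t Value = static_cast<std::uint8_t>(Byte);
        for( int Bit = 7; Bit >= 0; --Bit )
        {
            // Start a new line once the current one is full
            if( Columns != 0 && LineLength == Columns )
            {
                Text += '\n';
                LineLength = 0;
            }
            Text += static_cast<char>('0' + ((Value >> Bit) & 1));
            ++LineLength;
        }
    }
    return Text;
}
```

Under these assumptions, `EncodeBase2("QWERTY\n", 0)` reproduces the 56-character line from the first example, and a `Columns` value of 8 puts one byte on each line.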

Decoding:

% base2 -d <<< '01010001010101110100010101010010010101000101100100001010'
QWERTY
% base2 -d <<< '010100010101
011101000garbage1010blah101001001010garbage1000101100100001010'
QWFz*J�B
% base2 -d -i <<< '010100010
101011101000garbage1010blah101001001010garbage1000101100100001010'
QWERTY
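The effect of `--ignore-garbage` in the last example can be sketched as a filter in front of the bit packing. This is a hypothetical illustration only: `DecodeBase2IgnoreGarbage` is a made-up name, and dropping a trailing group of fewer than 8 bits is a choice of this sketch rather than documented tool behaviour.

```cpp
#include <cstdint>
#include <string>

// Hypothetical decoder sketch for the --ignore-garbage behaviour: every
// character other than '0' or '1' is skipped, and each run of 8 accepted
// characters becomes one byte. Leftover bits (fewer than 8) are dropped.
inline std::string DecodeBase2IgnoreGarbage( const std::string& Text )
{
    std::string Bytes;
    std::uint8_t Value = 0;
    unsigned Count = 0;
    for( const char Character : Text )
    {
        if( Character != '0' && Character != '1' )
        {
            continue; // garbage: newlines, letters, anything else
        }
        // Shift the accepted bit in from the right, first character = MSB
        Value = static_cast<std::uint8_t>((Value << 1) | (Character - '0'));
        if( ++Count == 8 )
        {
            Bytes += static_cast<char>(Value);
            Value = 0;
            Count = 0;
        }
    }
    return Bytes;
}
```

Run on the garbage-laden input from the example above, including its embedded line break, this sketch recovers `QWERTY` followed by a newline.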

Did I mention it's fast:

i3-6100

inxi -C
CPU:       Topology: Dual Core model:

…

Fast base2 decoding on the upcoming Intel Icelake

Wunkolo / base2

A base2 implementation similar to gnu coreutil's base64
