I have the following problem: I need to decode a base64 string and read specific bit ranges as integers, such as:
- first 6 bits (0 to 5) are the "version"
- next 36 bits (6 to 41) are the "created epoch time"
- etc
Fields are stored in big-endian format. Bits are numbered left to right.
After trying several combinations, I managed to build a sequence of octets and use bitwise operations to extract what I need, as in the example below.
The version part is easy: since I want the first 6 bits, I can shift the first octet right by 2 (>> 2).
The next 36 bits, however, are laid out as [000000xx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xx000000]
starting at the first byte, which is much more difficult.
I'd like to know if there is a better way to do it. I managed to create an array of bits using unpack and can take a slice with @bits[start..end], but I don't know how to continue from there.
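One possible way to continue from the bit-array idea, as a minimal sketch: keep the output of unpack 'B*' as a single string of '0' and '1' characters, cut the field out with substr, and convert it with oct and a 0b prefix. The helper name bits_to_int is hypothetical, and the sketch assumes a 64-bit perl, since the 36-bit field does not fit in 32 bits.

```perl
use strict;
use warnings;
use MIME::Base64;

# Hypothetical helper: read $len bits starting at bit $start (0 = leftmost bit)
# from a string of '0'/'1' characters and return them as an unsigned integer.
sub bits_to_int {
    my ($bits, $start, $len) = @_;
    # oct() warns about values wider than 32 bits; assumes a 64-bit perl.
    no warnings 'portable';
    return oct('0b' . substr($bits, $start, $len));
}

my $data = "COyiILmOyiILmADACHENAPCAAAAAAAAAAAAAE5QBgALgAqgD8AQACSwEygJyAAAAAA";

# 'B*' yields the bits of each byte in descending order (leftmost bit first).
my $bits = unpack 'B*', decode_base64($data);

my $version     = bits_to_int($bits, 0, 6);
my $deciseconds = bits_to_int($bits, 6, 36);

print "version: $version\n";          # version: 2
print "deciseconds: $deciseconds\n";  # deciseconds: 15880192742
```

Each field then needs only its start bit and length, with no per-field shifting.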
My specific problem is this: I can port the bitwise operations from another implementation that works on an array of octets/bytes, but in Perl I need to add EXTRA operations (such as masking with 0xFF) to get the correct answer, and there are a LOT of fields.
I have never had to work with pack/unpack before, and I want to avoid XS and do everything in pure Perl (today we have a version that binds the Go library, which requires CGO and makes the build process really complex).
I also found a Python version that uses a module called bitarray, which simplifies their work a lot, but I did not find an equivalent in Perl.
My proof of concept:
use strict;
use warnings;
use feature 'say';
use MIME::Base64;
my $data = "COyiILmOyiILmADACHENAPCAAAAAAAAAAAAAE5QBgALgAqgD8AQACSwEygJyAAAAAA";
my @octets = unpack "C*", decode_base64($data);
my $version = $octets[0] >> 2;
say "version: $version";
# Rebuild bits 6..41 as an 8-byte big-endian value; & 0xFF keeps each result to one octet.
my $deciseconds = unpack "Q>", pack "C*", (
    0x0,
    0x0,
    0x0,
    (($octets[0] & 0x3) << 2 | $octets[1] >> 6) & 0xFF,
    ($octets[1]<<2 | $octets[2]>>6) & 0xFF,
    ($octets[2]<<2 | $octets[3]>>6) & 0xFF,
    ($octets[3]<<2 | $octets[4]>>6) & 0xFF,
    ($octets[4]<<2 | $octets[5]>>6) & 0xFF,
);
say "deciseconds: $deciseconds";
It prints, as expected:
version: 2
deciseconds: 15880192742
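The per-octet shifting above could also be folded into one generic routine, sketched here under some assumptions: it reads the 8 octets starting at the one that contains the first bit as a single big-endian 64-bit word, then shifts and masks once. The name field_from_octets is hypothetical, fields are assumed to be at most 57 bits long, and a 64-bit perl is assumed (as with the "Q>" template already used).

```perl
use strict;
use warnings;
use MIME::Base64;

# Hypothetical helper: extract $len bits starting at bit $start
# (0 = leftmost bit of the first octet) from an array ref of octets.
# Assumes $len <= 57 so the field always fits in the 8-octet window.
sub field_from_octets {
    my ($octets, $start, $len) = @_;
    my $first  = $start >> 3;                       # octet holding the first bit
    my @window = map { $octets->[$first + $_] // 0 } 0 .. 7;   # pad with zeros past the end
    my $word   = unpack 'Q>', pack 'C8', @window;   # 64-bit big-endian word
    my $shift  = 64 - ($start & 7) - $len;          # bits to drop on the right
    return ($word >> $shift) & ((1 << $len) - 1);
}

my $data   = "COyiILmOyiILmADACHENAPCAAAAAAAAAAAAAE5QBgALgAqgD8AQACSwEygJyAAAAAA";
my @octets = unpack 'C*', decode_base64($data);

print "version: ",     field_from_octets(\@octets, 0, 6),  "\n";   # version: 2
print "deciseconds: ", field_from_octets(\@octets, 6, 36), "\n";   # deciseconds: 15880192742
```

With a helper like this, the long list of fields can be described as (start, length) pairs instead of hand-written shifts.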
For reference, this is the Go equivalent of the deciseconds decoding:
var data []byte
// decode base64 ...
deciseconds := int64(binary.BigEndian.Uint64([]byte{
    0x0,
    0x0,
    0x0,
    (data[0]&0x3)<<2 | data[1]>>6,
    data[1]<<2 | data[2]>>6,
    data[2]<<2 | data[3]>>6,
    data[3]<<2 | data[4]>>6,
    data[4]<<2 | data[5]>>6,
}))
Thanks
vec and bitwise string operators, fwiw.
The vec function provides functionality similar to a Python bitarray.
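A minimal sketch of the vec suggestion, with some assumptions spelled out: with a width of 1, vec numbers bits from the least significant bit of each byte, so the offset is XORed with 7 here to get the left-to-right numbering used in the question. The helper name get_bits is hypothetical, and a 64-bit perl is assumed for fields wider than 32 bits.

```perl
use strict;
use warnings;
use MIME::Base64;

# Hypothetical helper: read $len bits starting at bit $start
# (0 = leftmost bit of the first byte) directly from the byte string.
sub get_bits {
    my ($bytes, $start, $len) = @_;
    my $value = 0;
    for my $i ($start .. $start + $len - 1) {
        # vec(..., 1) counts bits LSB-first within a byte; $i ^ 7 flips that.
        $value = ($value << 1) | vec($bytes, $i ^ 7, 1);
    }
    return $value;
}

my $data  = "COyiILmOyiILmADACHENAPCAAAAAAAAAAAAAE5QBgALgAqgD8AQACSwEygJyAAAAAA";
my $bytes = decode_base64($data);

print "version: ",     get_bits($bytes, 0, 6),  "\n";   # version: 2
print "deciseconds: ", get_bits($bytes, 6, 36), "\n";   # deciseconds: 15880192742
```

This reads one bit per loop iteration, so it trades some speed for not needing any per-field shift arithmetic.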