Integer + Floating Point Compression Filter
- Fastest transpose/shuffle
- Byte/Nibble transpose/shuffle for improving compression of binary data (ex. floating point data)
β¨ Scalar/SIMD Transpose/Shuffle 8,16,32,64,... bitsπ Dynamic CPU detection and JIT scalar/sse/avx2 switching- 100% C (C++ headers), usage as simple as memcpy
- Byte Transpose
- Fastest byte transpose
- Nibble Transpose
- nearly as fast as byte transpose
- more efficient, up to 6 times! faster than Bitshuffle
π better compression (w/ lz77) and
10 times! faster than one of the best floating-point compressors SPDP- can compress/decompress (w/ lz77) better and faster than other domain specific floating point compressors
- Scalar and SIMD Transform
- Delta encoding for sorted lists
- Zigzag encoding for unsorted lists
- Xor encoding
π lossy floating point compression with user-defined error
Transpose Benchmark:
- CPU: Skylake i7-6700 3.4GHz gcc 7.2 single thread
- Speed test
Benchmark w/ 16k buffer
BOLD = pareto frontier.
E:Encode, D:Decode
./tpbench -s# file -B16K (# = 8,4,2)
| Size | E Time cycles/byte | D Time cycles/byte | Transpose 64 bits AVX2 |
|---|---|---|---|
| 16,000 | .199 | .134 | tpbyte 8 |
| 16,000 | .326 | .201 | Blosc_shuffle 8 |
| 16,000 | .394 | .260 | tpnibble 8 |
| 16,000 | .848 | .478 | Bitshuffle 8 |
| Size | E Time cycles/byte | D Time cycles/byte | Transpose 32 bits AVX2 |
|---|---|---|---|
| 16,000 | .121 | .102 | tpbyte 4 |
| 16,000 | .451 | .139 | Blosc_shuffle 4 |
| 16,000 | .345 | .229 | tpnibble 4 |
| 16,000 | .773 | .476 | Bitshuffle 4 |
| Size | E Time cycles/byte | D Time cycles/byte | Transpose 16 bits AVX2 |
|---|---|---|---|
| 16,000 | .095 | .071 | tpbyte 2 |
| 16,000 | .640 | .108 | Blosc_shuffle 2 |
| 16,000 | .329 | .198 | tpnibble 2 |
| 16,000 | .758 | 1.177 | Bitshuffle 2 |
| 16,000 | .067 | .067 | memcpy |
Transpose/Shuffle benchmark w/ large files.
MB/s: 1,000,000 bytes/second
./tpbench -s# file (# = 8,4,2)
| Size | E Time MB/s | D Time MB/s | Transpose 64 bits AVX2 |
|---|---|---|---|
| 100,000,000 | 8387 | 9408 | tpbyte 8 |
| 100,000,000 | 8134 | 8598 | Blosc_shuffle 8 |
| 100,000,000 | 7797 | 9145 | tpnibble 8 |
| 100,000,000 | 3548 | 3459 | Bitshuffle 8 |
| 100,000,000 | 13366 | 13366 | memcpy |
| Size | E Time MB/s | D Time MB/s | Transpose 32 bits AVX2 |
|---|---|---|---|
| 100,000,000 | 8398 | 9533 | tpbyte 4 |
| 100,000,000 | 8198 | 9307 | tpnibble 4 |
| 100,000,000 | 8193 | 8796 | Blosc_shuffle 4 |
| 100,000,000 | 3679 | 3666 | Bitshuffle 4 |
| Size | E Time MB/s | D Time MB/s | Transpose 16 bits AVX2 |
|---|---|---|---|
| 100,000,000 | 7878 | 9542 | tpbyte 2 |
| 100,000,000 | 8987 | 9412 | tpnibble 2 |
| 100,000,000 | 7739 | 9404 | Blosc_shuffle 2 |
| 100,000,000 | 3879 | 2547 | Bitshuffle 2 |
- Compression test (transpose/shuffle+lz4)
for testing allmost all integer and floating point file types.
Note: Lossy compression benchmark with icapp only.
- Speed test (file msg_sweep3d)
| C size | ratio % | C MB/s | D MB/s | Name |
|---|---|---|---|---|
| 11,348,554 | 18.1 | 2276 | 4425 | tpnibble+lz |
| 22,489,691 | 35.8 | 1670 | 3881 | tpbyte+lz |
| 43,471,376 | 69.2 | 348 | 402 | SPDP |
| 44,626,407 | 71.0 | 1065 | 2101 | bitshuffle+lz |
| 62,865,612 | 100.0 | 13300 | 13300 | memcpy |
./tpbench -s4 -z *.sp
| File | File size | lz % | Tp8lz | Tp4lz | BSlz | spdp1 | spdp9 | Tp4lzt | eTp4lzt | |
|---|---|---|---|---|---|---|---|---|---|---|
| msg_bt | 133194716 | 94.3 | 70.4 | 66.4 | 73.9 | 70.0 | |
67.4 | 54.7 | 32.4 |
| msg_lu | 97059484 | 100.4 | 77.1 | 70.4 | 75.4 | 76.8 | |
74.0 | 61.0 | 42.2 |
| msg_sppm | 139497932 | 11.7 | 11.6 | 12.6 | 15.4 | 14.4 | |
13.7 | 9.0 | 5.6 |
| msg_sp | 145052928 | 100.3 | 68.8 | 63.7 | 68.1 | 67.9 | |
65.3 | 52.6 | 24.9 |
| msg_sweep3d | 62865612 | 98.7 | 35.8 | 18.1 | 71.0 | 69.6 | |
13.7 | 9.8 | 3.8 |
| num_brain | 70920000 | 100.4 | 76.5 | 71.1 | 77.4 | 79.1 | |
73.9 | 63.4 | 32.6 |
| num_comet | 53673984 | 92.4 | 79.0 | 77.6 | 82.1 | 84.5 | |
84.6 | 70.1 | 41.7 |
| num_control | 79752372 | 99.4 | 89.5 | 90.7 | 88.1 | 98.3 | |
98.5 | 81.4 | 51.2 |
| num_plasma | 17544800 | 100.4 | 0.7 | 0.7 | 75.5 | 30.7 | |
2.9 | 0.3 | 0.2 |
| obs_error | 31080408 | 89.2 | 73.1 | 70.0 | 76.9 | 78.3 | |
49.4 | 20.5 | 12.2 |
| obs_info | 9465264 | 93.6 | 70.2 | 61.9 | 72.9 | 62.4 | |
43.8 | 27.3 | 15.1 |
| obs_spitzer | 99090432 | 98.3 | 90.4 | 95.6 | 93.6 | 100.1 | |
100.7 | 80.2 | 52.3 |
| obs_temp | 19967136 | 100.4 | 89.5 | 92.4 | 91.0 | 99.4 | |
100.1 | 84.0 | 55.8 |
Tp8=Byte transpose, Tp4=Nibble transpose, lz = lz4
eTp4Lzt = lossy compression with lzturbo and allowed error = 0.0001 (1e-4)
Slow but best compression: SPDP9 and lzt = lzturbo,39
-
Scientific IEEE 754 64-Bit Double-Precision Floating-Point Datasets
./tpbench -s8 -z *.trace
| File | File size | lz % | Tp8lz | Tp4lz | BSlz | spdp1 | spdp9 | Tp4lzt | eTp4lzt | |
|---|---|---|---|---|---|---|---|---|---|---|
| msg_bt | 266389432 | 94.5 | 77.2 | 76.5 | 81.6 | 77.9 | |
75.4 | 69.9 | 16.0 |
| msg_lu | 194118968 | 100.4 | 82.7 | 81.0 | 83.7 | 83.3 | |
79.6 | 75.5 | 21.0 |
| msg_sppm | 278995864 | 18.9 | 14.5 | 14.9 | 19.5 | 21.5 | |
19.8 | 11.2 | 2.8 |
| msg_sp | 290105856 | 100.4 | 79.2 | 77.5 | 80.2 | 78.8 | |
77.1 | 71.3 | 12.4 |
| msg_sweep3d | 125731224 | 98.7 | 50.7 | 36.7 | 80.4 | 76.2 | |
33.2 | 27.3 | 1.9 |
| num_brain | 141840000 | 100.4 | 82.6 | 81.1 | 84.5 | 87.8 | |
83.3 | 77.0 | 16.3 |
| num_comet | 107347968 | 92.8 | 83.3 | 78.8 | 76.3 | 86.5 | |
86.0 | 69.8 | 21.2 |
| num_control | 159504744 | 99.6 | 92.2 | 90.9 | 89.4 | 97.6 | |
98.9 | 85.5 | 25.8 |
| num_plasma | 35089600 | 75.2 | 0.7 | 0.7 | 84.5 | 77.3 | |
3.0 | 0.3 | 0.1 |
| obs_error | 62160816 | 78.7 | 81.0 | 77.5 | 84.4 | 87.9 | |
62.3 | 23.4 | 6.3 |
| obs_info | 18930528 | 92.3 | 75.4 | 70.6 | 82.4 | 81.7 | |
51.2 | 33.1 | 7.7 |
| obs_spitzer | 198180864 | 95.4 | 93.2 | 93.7 | 86.4 | 100.1 | |
102.4 | 78.0 | 26.9 |
| obs_temp | 39934272 | 100.4 | 93.1 | 93.8 | 91.7 | 98.0 | |
97.4 | 88.2 | 28.8 |
eTp4Lzt = lossy compression with allowed error = 0.0001
Compile:
git clone git://github.com/powturbo/TurboTranspose.git
cd TurboTranspose
Linux + Windows MingW
make
or
make AVX2=1
Windows Visual C++
nmake /f makefile.vs
or
nmake AVX2=1 /f makefile.vs
-
benchmark with other libraries
download or clone bitshuffle or blosc and typemake AVX2=1 BLOSC=1 or make AVX2=1 BITSHUFFLE=1
Testing:
-
benchmark "transpose" functions
./tpbench [-s#] [-z] file s# = element size #=2,4,8,16,... (default 4) -z = only lz77 compression benchmark (bitshuffle package mandatory)
Function usage:
Byte transpose:
void tpenc( unsigned char *in, unsigned n, unsigned char *out, unsigned esize);
void tpdec( unsigned char *in, unsigned n, unsigned char *out, unsigned esize)
in : input buffer
n : number of bytes
out : output buffer
esize : element size in bytes (2,4,8,...)
Nibble transpose:
void tp4enc( unsigned char *in, unsigned n, unsigned char *out, unsigned esize);
void tp4dec( unsigned char *in, unsigned n, unsigned char *out, unsigned esize)
in : input buffer
n : number of bytes
out : output buffer
esize : element size in bytes (2,4,8,...)
Environment:
OS/Compiler (64 bits):
- Linux: GNU GCC (>=4.6)
- clang (>=3.2)
- Windows: MinGW-w64
- Windows: Visual C++ (>=VS2008)
Multithreading:
- All TurboTranspose functions are thread safe
References:
- BS - Bitshuffle: Filter for improving compression of typed binary data.
π A compression scheme for radio data in high performance computing - Blosc: A blocking, shuffling and loss-less compression
- SPDP is a compression/decompression algorithm for binary IEEE 754 32/64 bits floating-point data
π SPDP - An Automatically Synthesized Lossless Compression Algorithm for Floating-Point Data + DCC 2018 π FPC: A High-Speed Compressor for Double-Precision Floating-Point Data
Last update: 11 Jun 2018

Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is 100% composed of volunteers and interested parties, and has expanded into a large amount of related projects for saving online and digital history.
