boost::dynamic_bitset 的倒序
Reverse order of boost::dynamic_bitset
是否有一种干净的方法来 return 反向排序 boost::dynamic_bitset
对象?
例如:
01001100
成为
00110010
。
我能想到的最简单的解决方案是将bitset转换为字符串,反转字符串并将其转换回bitset,但这似乎是一种相当慢的方法,使位串操作的速度无效。
提前致谢!
boost::dynamic_bitset
没有迭代器,所以有很多舒适的 STL 解决方案,例如,std::reverse
或 std::swap
或它们的 boost
没有对应的方法,我认为一个好方法是自己制作简单的逆向方法:
#include <iostream>
#include <boost/dynamic_bitset.hpp>
void reverse(boost::dynamic_bitset<> &bs)
{
for (size_t begin = 0, end = bs.size() - 1; begin < end; begin++, end--)
{
bool b = bs[end];
bs[end] = bs[begin];
bs[begin] = b;
}
}
int main()
{
size_t size = 8;
boost::dynamic_bitset<> bs(size, 50);
std::cout << "Normal: " << bs << std::endl;
reverse(bs);
std::cout << "Reverse: " << bs << std::endl;
}
输出:
Normal: 00110010
Reverse: 01001100
你可以和非常幸运的人一起做得更好
#define BOOST_DYNAMIC_BITSET_DONT_USE_FRIENDS
我最先注意到它是因为它在其他第三方库中使用(我忘了名字,但它是 AI/ML 相关的)。
我这里有一个不是很通用的版本,因为它使用了特定于大小的位旋转技巧(例如反转字节或 uint32)。您可能对这些感兴趣:
- 见http://aggregate.org/MAGIC/#Bit%20Reversal
- 或https://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
您仍然可以看到 uint32 专用版本 Live On Compiler Explorer.
普通版
从那时起,我发现了这个不错的答案:In C/C++ what's the simplest way to reverse the order of bits in a byte?,它为 2 的幂宽度整数类型提供了一个相当有效的就地反向算法。所以,现在我们有了完全通用的:
// make sure it's globally defined
#define BOOST_DYNAMIC_BITSET_DONT_USE_FRIENDS
#include <boost/dynamic_bitset.hpp>
#include <iostream>
template <typename Block, typename Allocator>
void reverse(boost::dynamic_bitset<Block, Allocator>& bs) {
auto constexpr BLOCK_BIT = sizeof(Block) * CHAR_BIT;
auto original_size = bs.size();
if (auto partial_block = bs.size() % BLOCK_BIT) {
auto pad = (BLOCK_BIT - partial_block);
bs.resize(bs.size() + pad);
bs <<= pad;
}
// see
auto inplace = [](Block& n) {
static_assert(std::is_unsigned_v<Block>);
short bits = sizeof(n) * 8;
Block mask = ~Block(0); // equivalent to uint32_t mask =
// 0b11111111111111111111111111111111;
while (bits >>= 1) {
mask ^= mask << (bits); // will convert mask to
// 0b00000000000000001111111111111111;
n = (n & ~mask) >> bits | (n & mask) << bits; // divide and conquer
}
};
for (auto& b : bs.m_bits) {
inplace(b);
}
std::reverse(begin(bs.m_bits), end(bs.m_bits));
bs.resize(original_size);
}
NOTE the reversal will MUCH more efficient for size()
multiples of BLOCK_BIT
. This might in some scenario even lead one to prefer Block = std::uint8_t
.
通用测试仪/基准测试
让我们写一些随机测试。为了便于实现,我们将反转的字符串表示与反转的字符串表示进行比较。
我添加了针对不同块大小的测试以及有关大小和时间的统计信息:
// For quick testing
#include <random>
#include <chrono>
#include <boost/range/algorithm/reverse.hpp>
#include <boost/lexical_cast.hpp>
static auto now = std::chrono::high_resolution_clock::now;
using namespace std::chrono_literals;
#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics.hpp>
namespace ba = boost::accumulators;
namespace bat = ba::tag;
template <typename Block> bool run_test(unsigned n, auto& stats) {
using BitSet = boost::dynamic_bitset<Block>;
auto gen = std::bind(std::uniform_int_distribution<Block>{},
std::mt19937{std::random_device{}()});
while (n--) {
Block init[]{gen(), gen(), gen()};
auto sz = std::bind(
std::uniform_int_distribution(0ul, sizeof(init) * CHAR_BIT),
std::mt19937{std::random_device{}()});
BitSet bs(std::begin(init), std::end(init));
bs.resize(sz());
stats(bs.size());
std::string expected = boost::lexical_cast<std::string>(bs);
boost::reverse(expected);
BitSet revclone = bs;
reverse(revclone);
auto actual = boost::lexical_cast<std::string>(revclone);
if (expected != actual) {
std::cout << __PRETTY_FUNCTION__ << " '" << bs << "': \n"
<< " expected: " << expected << "\n"
<< " actual: " << actual << "\n";
return false;
}
}
return true;
}
int main() {
auto start = now();
ba::accumulator_set<double,
ba::stats<bat::mean, bat::variance, bat::min, bat::max>>
stats;
if (run_test<unsigned char>(10'000, stats))
std::cout << "Completed 10'000 tests with char blocks\n";
if (run_test<uint16_t>(10'000, stats))
std::cout << "Completed 10'000 tests with uint16_t blocks\n";
if (run_test<uint32_t>(100'000, stats))
std::cout << "Completed 100'000 tests with uint32_t blocks\n";
if (run_test<uintmax_t>(1'000'000, stats))
std::cout << "Completed 1'000'000 tests with uintmax_t blocks\n";
auto cost = ((now() - start)/1.us) / ba::count(stats);
std::cout
<< "Samples " << ba::count(stats)
<< " mean: " << ba::mean(stats)
<< " min: " << ba::min(stats)
<< " max: " << ba::max(stats)
<< " stddev: " << std::sqrt(ba::variance(stats))
<< "\n";
std::cout << "Average cost " << cost << "μs\n";
}
在我的机器上打印:
Completed 10'000 tests with char blocks
Completed 10'000 tests with uint16_t blocks
Completed 100'000 tests with uint32_t blocks
Completed 1'000'000 tests with uintmax_t blocks
Samples 1120000 mean: 90.3283 min: 0 max: 192 stddev: 55.9233
Average cost 3.69335μs
real 0m4,141s
user 0m4,061s
sys 0m0,003s
因此,平均大小为 90 位,最多 192 位的位集可以在不到 4μs 的时间内反转。还不错。
适当的微基准测试
使用Nonius,我们可以从可预测的测试中获得可靠的数据。对于尺寸 31、32、37 位,净时序在 10-30ns 范围内。
使用的代码:
#define NONIUS_RUNNER
#include <nonius/nonius.h++>
#include <nonius/main.h++>
template <typename Block> void run_test(nonius::chronometer& cm, size_t target_size) {
using BitSet = boost::dynamic_bitset<Block>;
static const std::string data{
"0100110111010010010001100111010010010001011100100100111010100010011010"
"01100000011000010001110111"};
BitSet bs(data, 0, target_size);
assert(bs.size() == target_size);
cm.measure([&] { reverse(bs); });
}
NONIUS_BENCHMARK("Block=uchar, sz=32", [](nonius::chronometer cm) { run_test<uint8_t>(cm, 32); })
NONIUS_BENCHMARK("Block=uint16_t, sz=32", [](nonius::chronometer cm) { run_test<uint16_t>(cm, 32); })
NONIUS_BENCHMARK("Block=uint32_t, sz=32", [](nonius::chronometer cm) { run_test<uint32_t>(cm, 32); })
NONIUS_BENCHMARK("Block=uintmax_t, sz=32", [](nonius::chronometer cm) { run_test<uintmax_t>(cm, 32); })
NONIUS_BENCHMARK("Block=uchar, sz=31", [](nonius::chronometer cm) { run_test<uint8_t>(cm, 31); })
NONIUS_BENCHMARK("Block=uint16_t, sz=31", [](nonius::chronometer cm) { run_test<uint16_t>(cm, 31); })
NONIUS_BENCHMARK("Block=uint32_t, sz=31", [](nonius::chronometer cm) { run_test<uint32_t>(cm, 31); })
NONIUS_BENCHMARK("Block=uintmax_t, sz=31", [](nonius::chronometer cm) { run_test<uintmax_t>(cm, 31); })
NONIUS_BENCHMARK("Block=uchar, sz=37", [](nonius::chronometer cm) { run_test<uint8_t>(cm, 37); })
NONIUS_BENCHMARK("Block=uint16_t, sz=37", [](nonius::chronometer cm) { run_test<uint16_t>(cm, 37); })
NONIUS_BENCHMARK("Block=uint32_t, sz=37", [](nonius::chronometer cm) { run_test<uint32_t>(cm, 37); })
NONIUS_BENCHMARK("Block=uintmax_t, sz=37", [](nonius::chronometer cm) { run_test<uintmax_t>(cm, 37); })
Full interactive chart: http://Whosebug-sehe.s3.amazonaws.com/974d10e8-74ae-4fcf-be03-6a0d0e01b5ad/stats.html
是否有一种干净的方法来 return 反向排序 boost::dynamic_bitset
对象?
例如:
01001100
成为
00110010
。
我能想到的最简单的解决方案是将bitset转换为字符串,反转字符串并将其转换回bitset,但这似乎是一种相当慢的方法,使位串操作的速度无效。
提前致谢!
boost::dynamic_bitset
没有迭代器,所以有很多舒适的 STL 解决方案,例如,std::reverse
或 std::swap
或它们的 boost
没有对应的方法,我认为一个好方法是自己制作简单的逆向方法:
#include <iostream>
#include <boost/dynamic_bitset.hpp>
void reverse(boost::dynamic_bitset<> &bs)
{
for (size_t begin = 0, end = bs.size() - 1; begin < end; begin++, end--)
{
bool b = bs[end];
bs[end] = bs[begin];
bs[begin] = b;
}
}
int main()
{
size_t size = 8;
boost::dynamic_bitset<> bs(size, 50);
std::cout << "Normal: " << bs << std::endl;
reverse(bs);
std::cout << "Reverse: " << bs << std::endl;
}
输出:
Normal: 00110010
Reverse: 01001100
你可以和非常幸运的人一起做得更好
#define BOOST_DYNAMIC_BITSET_DONT_USE_FRIENDS
我最先注意到它是因为它在其他第三方库中使用(我忘了名字,但它是 AI/ML 相关的)。
我这里有一个不是很通用的版本,因为它使用了特定于大小的位旋转技巧(例如反转字节或 uint32)。您可能对这些感兴趣:
- 见http://aggregate.org/MAGIC/#Bit%20Reversal
- 或https://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
您仍然可以看到 uint32 专用版本 Live On Compiler Explorer.
普通版
从那时起,我发现了这个不错的答案:In C/C++ what's the simplest way to reverse the order of bits in a byte?,它为 2 的幂宽度整数类型提供了一个相当有效的就地反向算法。所以,现在我们有了完全通用的:
// make sure it's globally defined
#define BOOST_DYNAMIC_BITSET_DONT_USE_FRIENDS
#include <boost/dynamic_bitset.hpp>
#include <iostream>
template <typename Block, typename Allocator>
void reverse(boost::dynamic_bitset<Block, Allocator>& bs) {
auto constexpr BLOCK_BIT = sizeof(Block) * CHAR_BIT;
auto original_size = bs.size();
if (auto partial_block = bs.size() % BLOCK_BIT) {
auto pad = (BLOCK_BIT - partial_block);
bs.resize(bs.size() + pad);
bs <<= pad;
}
// see
auto inplace = [](Block& n) {
static_assert(std::is_unsigned_v<Block>);
short bits = sizeof(n) * 8;
Block mask = ~Block(0); // equivalent to uint32_t mask =
// 0b11111111111111111111111111111111;
while (bits >>= 1) {
mask ^= mask << (bits); // will convert mask to
// 0b00000000000000001111111111111111;
n = (n & ~mask) >> bits | (n & mask) << bits; // divide and conquer
}
};
for (auto& b : bs.m_bits) {
inplace(b);
}
std::reverse(begin(bs.m_bits), end(bs.m_bits));
bs.resize(original_size);
}
NOTE the reversal will MUCH more efficient for
size()
multiples ofBLOCK_BIT
. This might in some scenario even lead one to preferBlock = std::uint8_t
.
通用测试仪/基准测试
让我们写一些随机测试。为了便于实现,我们将反转的字符串表示与反转的字符串表示进行比较。
我添加了针对不同块大小的测试以及有关大小和时间的统计信息:
// For quick testing
#include <random>
#include <chrono>
#include <boost/range/algorithm/reverse.hpp>
#include <boost/lexical_cast.hpp>
static auto now = std::chrono::high_resolution_clock::now;
using namespace std::chrono_literals;
#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics.hpp>
namespace ba = boost::accumulators;
namespace bat = ba::tag;
template <typename Block> bool run_test(unsigned n, auto& stats) {
using BitSet = boost::dynamic_bitset<Block>;
auto gen = std::bind(std::uniform_int_distribution<Block>{},
std::mt19937{std::random_device{}()});
while (n--) {
Block init[]{gen(), gen(), gen()};
auto sz = std::bind(
std::uniform_int_distribution(0ul, sizeof(init) * CHAR_BIT),
std::mt19937{std::random_device{}()});
BitSet bs(std::begin(init), std::end(init));
bs.resize(sz());
stats(bs.size());
std::string expected = boost::lexical_cast<std::string>(bs);
boost::reverse(expected);
BitSet revclone = bs;
reverse(revclone);
auto actual = boost::lexical_cast<std::string>(revclone);
if (expected != actual) {
std::cout << __PRETTY_FUNCTION__ << " '" << bs << "': \n"
<< " expected: " << expected << "\n"
<< " actual: " << actual << "\n";
return false;
}
}
return true;
}
int main() {
auto start = now();
ba::accumulator_set<double,
ba::stats<bat::mean, bat::variance, bat::min, bat::max>>
stats;
if (run_test<unsigned char>(10'000, stats))
std::cout << "Completed 10'000 tests with char blocks\n";
if (run_test<uint16_t>(10'000, stats))
std::cout << "Completed 10'000 tests with uint16_t blocks\n";
if (run_test<uint32_t>(100'000, stats))
std::cout << "Completed 100'000 tests with uint32_t blocks\n";
if (run_test<uintmax_t>(1'000'000, stats))
std::cout << "Completed 1'000'000 tests with uintmax_t blocks\n";
auto cost = ((now() - start)/1.us) / ba::count(stats);
std::cout
<< "Samples " << ba::count(stats)
<< " mean: " << ba::mean(stats)
<< " min: " << ba::min(stats)
<< " max: " << ba::max(stats)
<< " stddev: " << std::sqrt(ba::variance(stats))
<< "\n";
std::cout << "Average cost " << cost << "μs\n";
}
在我的机器上打印:
Completed 10'000 tests with char blocks
Completed 10'000 tests with uint16_t blocks
Completed 100'000 tests with uint32_t blocks
Completed 1'000'000 tests with uintmax_t blocks
Samples 1120000 mean: 90.3283 min: 0 max: 192 stddev: 55.9233
Average cost 3.69335μs
real 0m4,141s
user 0m4,061s
sys 0m0,003s
因此,平均大小为 90 位,最多 192 位的位集可以在不到 4μs 的时间内反转。还不错。
适当的微基准测试
使用Nonius,我们可以从可预测的测试中获得可靠的数据。对于尺寸 31、32、37 位,净时序在 10-30ns 范围内。
使用的代码:
#define NONIUS_RUNNER
#include <nonius/nonius.h++>
#include <nonius/main.h++>
template <typename Block> void run_test(nonius::chronometer& cm, size_t target_size) {
using BitSet = boost::dynamic_bitset<Block>;
static const std::string data{
"0100110111010010010001100111010010010001011100100100111010100010011010"
"01100000011000010001110111"};
BitSet bs(data, 0, target_size);
assert(bs.size() == target_size);
cm.measure([&] { reverse(bs); });
}
NONIUS_BENCHMARK("Block=uchar, sz=32", [](nonius::chronometer cm) { run_test<uint8_t>(cm, 32); })
NONIUS_BENCHMARK("Block=uint16_t, sz=32", [](nonius::chronometer cm) { run_test<uint16_t>(cm, 32); })
NONIUS_BENCHMARK("Block=uint32_t, sz=32", [](nonius::chronometer cm) { run_test<uint32_t>(cm, 32); })
NONIUS_BENCHMARK("Block=uintmax_t, sz=32", [](nonius::chronometer cm) { run_test<uintmax_t>(cm, 32); })
NONIUS_BENCHMARK("Block=uchar, sz=31", [](nonius::chronometer cm) { run_test<uint8_t>(cm, 31); })
NONIUS_BENCHMARK("Block=uint16_t, sz=31", [](nonius::chronometer cm) { run_test<uint16_t>(cm, 31); })
NONIUS_BENCHMARK("Block=uint32_t, sz=31", [](nonius::chronometer cm) { run_test<uint32_t>(cm, 31); })
NONIUS_BENCHMARK("Block=uintmax_t, sz=31", [](nonius::chronometer cm) { run_test<uintmax_t>(cm, 31); })
NONIUS_BENCHMARK("Block=uchar, sz=37", [](nonius::chronometer cm) { run_test<uint8_t>(cm, 37); })
NONIUS_BENCHMARK("Block=uint16_t, sz=37", [](nonius::chronometer cm) { run_test<uint16_t>(cm, 37); })
NONIUS_BENCHMARK("Block=uint32_t, sz=37", [](nonius::chronometer cm) { run_test<uint32_t>(cm, 37); })
NONIUS_BENCHMARK("Block=uintmax_t, sz=37", [](nonius::chronometer cm) { run_test<uintmax_t>(cm, 37); })
Full interactive chart: http://Whosebug-sehe.s3.amazonaws.com/974d10e8-74ae-4fcf-be03-6a0d0e01b5ad/stats.html