将位掩码集映射到出现次数

Map set of bitmasks to number of occurrences

假设我有 n 个位掩码(n 是任意的,可能是 9-25 之间的数字)。这些位掩码的长度(对所有位掩码都相等)无关紧要,但举个例子,假设我们有 3 个长度为 3 的位掩码:

b1: 1 0 0
b2: 1 1 1
b3: 1 0 1

我现在想将这 3 个位掩码(或一般情况下的 n 个)映射到 4 (n + 1) 个其他位掩码,每个可能的出现次数对应一个。对于此示例,生成的位掩码应为:

0: 0 0 0
1: 0 1 0
2: 0 0 1
3: 1 0 0

现在的问题是:有人知道实现这一目标的巧妙策略吗?最简单的尝试显然是遍历每个位并计算原始位掩码集中出现的次数。但是,我想知道是否有一些巧妙的位操作方法可以同时跟踪所有位。创建中间位掩码很好。操作次数当然应该低于上面概述的“天真”方式。有什么想法吗?

根据您有多少个掩码以及每个掩码中有多少位,可能有一种方法可以通过位旋转操作来实现。对于最多三个 6 位掩码,应该这样做:

let mut counts = 0;
for mask in &masks {
    counts += (mask * 0x0101) & 0xAA55;
}

let mut result = [0; 4];
for i in 0..3 {
    result[(counts >> (2 * i)) & 0x3] |= 1 << (2 * i);
    result[(counts >> (2 * i + 9)) & 0x3] |= 1 << (2 * i + 1);
}

Playground

编辑:

这是一个更好的解决方案,适用于多达 255 个八位掩码:

let mut counts = 0;
for &mask in &masks {
    counts += ((mask as u64).wrapping_mul (0x8040201008040201) >> 7)
            & 0x0101010101010101;
}
    
let mut result = vec![0; masks.len() + 1];
for i in 0..8 {
    result[(counts >> (8*i)) as usize & 0xFF] |= 1 << (7-i);
}

Playground

Criterion 基准测试表明,对于单个输入掩码,此解决方案从比简单实现稍快到 256 个输入的三倍:

Nb inputs naive simd ratio
1 41 ns 39 ns 1.1
16 86 ns 43 ns 2.0
256 737 ns 232 ns 3.2

参考基准代码如下:

use criterion::{ black_box, criterion_group, criterion_main, Criterion };

fn naive (masks: impl Iterator<Item=u8>) -> Vec<u8> {
   let mut n = 1;
   let mut counts = vec![0; 8];
   for mask in masks {
      for i in 0..8 {
         if (mask >> i) & 0x1 != 0 {
            counts[i] += 1;
         }
      }
      n +=1;
   }

   let mut result = vec![0; n];
   for i in 0..8 {
      result[counts[i]] |= 1 << i;
   }

   result
}

fn simd (masks: impl Iterator<Item=u8>) -> Vec<u8> {
   let mut n = 1;
   let mut counts = 0;
   for mask in masks {
      counts += ((mask as u64).wrapping_mul (0x8040201008040201) >> 7)
         & 0x0101010101010101;
      n += 1;
   }

   let mut result = vec![0; n];
   for i in 0..8 {
      result[(counts >> (8*i)) as usize & 0xFF] |= 1 << (7-i);
   }

   result
}

pub fn bench_naive (c: &mut Criterion) {
   c.bench_function ("naive - 1", |b| b.iter (|| {
      black_box (naive (black_box (0..1))); }));
   c.bench_function ("naive - 16", |b| b.iter (|| {
      black_box (naive (black_box (0..16))); }));
   c.bench_function ("naive - 256", |b| b.iter (|| {
      black_box (naive (black_box (0..=255))); }));
}

pub fn bench_simd (c: &mut Criterion) {
   c.bench_function ("simd - 1", |b| b.iter (|| {
      black_box (simd (black_box (0..1))); }));
   c.bench_function ("simd - 16", |b| b.iter (|| {
      black_box (simd (black_box (0..16))); }));
   c.bench_function ("simd - 256", |b| b.iter (|| {
      black_box (simd (black_box (0..=255))); }));
}

criterion_group!(benches, bench_naive, bench_simd);
criterion_main!(benches);