In Perl, how do I count bits in a bit vector which has bits set higher than 2_147_483_639?
Perl is very good at handling bit strings/vectors. Setting a bit is as easy as
vec($bit_string, 123, 1) = 1;
Getting a count of the set bits is very quick:
$count = unpack("%32b*", $bit_string);
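But if you set a bit any higher than 2_147_483_639, the count silently drops to zero, with no apparent warning or error.
Is there a workaround?
The code below demonstrates the problem: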
#!/usr/bin/env perl
# create a string to use as our bit vector
my $bit_string = undef;
# set bits at positions 10 and 2_000_000_000
# and the apparently last valid integer position 2_147_483_639
vec($bit_string, 10, 1) = 1;
vec($bit_string, 2_000_000_000, 1) = 1;
vec($bit_string, 2_147_483_639, 1) = 1;
# get a count of the bits which are set
my $bit_count = unpack("%32b*", $bit_string);
print("Bits set in bit string: $bit_count\n");
## Bits set in bit string: 3
# check the bits at positions 10, 11, 2_000_000_000, 2_147_483_639
for my $position (10,11,2_000_000_000, 2_147_483_639) {
my $bit_value = vec($bit_string, $position, 1);
print("Bit at $position is $bit_value\n");
}
## Bit at 10 is 1
## Bit at 11 is 0
## Bit at 2000000000 is 1
## Bit at 2147483639 is 1
# Adding the next highest bit, 2_147_483_640, causes the count to become 0
# with no complaint, error or warning
vec($bit_string, 2_147_483_640, 1) = 1;
$bit_count = unpack("%32b*", $bit_string);
print("Bits set in bit string after setting bit 2_147_483_640: $bit_count\n");
## Bits set in bit string after setting bit 2_147_483_640: 0
# But the bits are still actually set
for my $position (10, 2_000_000_000, 2_147_483_639, 2_147_483_640) {
my $bit_value = vec($bit_string, $position, 1);
print("Bit at $position is $bit_value\n");
}
## Bit at 10 is 1
## Bit at 2000000000 is 1
## Bit at 2147483639 is 1
## Bit at 2147483640 is 1
# Set even higher bits
vec($bit_string, 3_000_000_000, 1) = 1;
vec($bit_string, 4_000_000_000, 1) = 1;
# verify these are also set
for my $position (3_000_000_000, 4_000_000_000) {
my $bit_value = vec($bit_string, $position, 1);
print("Bit at $position is $bit_value\n");
}
## Bit at 3000000000 is 1
## Bit at 4000000000 is 1
You can try counting in smaller chunks. It's slower, but it seems to work:
$bit_count = 0;
# /s lets "." match newline bytes too, so no byte of the vector is skipped
$bit_count += unpack '%32b*', $1
    while $bit_string =~ /(.{1,32766})/gs;
Or use substr instead of m//, which is a bit faster:
$bit_count = 0;
my ($pos, $step) = (0, 2 ** 17);
$bit_count += unpack '%32b*', substr $bit_string, $step * $pos++, $step
while $pos * $step <= length $bit_string;
2 ** 17 seems to give the best performance on my machine, but YMMV.
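The substr loop above can be wrapped in a small reusable sub. This is only a sketch: the name `count_bits_chunked`, passing the vector by reference, and the 4096-byte chunk size in the usage lines are illustrative choices, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: count set bits by unpacking the vector one chunk
# at a time, so no single unpack call sees the whole string.
sub count_bits_chunked {
    my ($bits_ref, $step) = @_;
    $step //= 2 ** 17;                    # chunk size in bytes (tunable)
    my $len   = length($$bits_ref) // 0;  # an undef vector counts as empty
    my $count = 0;
    for (my $offset = 0; $offset < $len; $offset += $step) {
        $count += unpack '%32b*', substr($$bits_ref, $offset, $step);
    }
    return $count;
}

# Small demonstration vector; the positions are arbitrary.
my $bit_string = '';
vec($bit_string, $_, 1) = 1 for 10, 11, 1_000, 70_000;
print count_bits_chunked(\$bit_string, 4096), "\n";   # prints 4
```

Passing a reference avoids copying a vector that may be hundreds of megabytes long; each substr call still copies one chunk, which is where the chunk-size trade-off comes from.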
Another possibility (slower, by the way) is to build a table of bit counts for every possible byte value and use it:
my %by_bits;
for my $byte (1 ..255) {
my $bits_in_byte = sprintf('%b', $byte) =~ tr/1//; # Fix SO hiliting bug: /
$by_bits{$bits_in_byte} .= sprintf '\x%02x', $byte;
}
$bit_count = 0;
for my $count (keys %by_bits) {
$bit_count += $count * eval('$bit_string =~ tr/' . $by_bits{$count}. '//');
}
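To see what the table approach does, here is a small self-check on a tiny vector, assuming the same `%by_bits` construction as above. Each hash value is a list of `\xNN` escapes for all bytes with a given number of set bits, and `tr///` counts how many bytes of the vector fall into that list. The test positions are arbitrary.

```perl
use strict;
use warnings;

# Group the byte values 1..255 by how many bits each has set.
my %by_bits;
for my $byte (1 .. 255) {
    my $bits_in_byte = sprintf('%b', $byte) =~ tr/1//;
    $by_bits{$bits_in_byte} .= sprintf '\x%02x', $byte;
}

# Tiny vector: byte 0 is 0x07, byte 1 is 0x02, byte 8 is 0x03.
my $bit_string = '';
vec($bit_string, $_, 1) = 1 for 0, 1, 2, 9, 64, 65;

# For each bit count, tr/// (built via eval) counts the matching bytes.
my $table_count = 0;
for my $count (keys %by_bits) {
    $table_count += $count * eval('$bit_string =~ tr/' . $by_bits{$count} . '//');
}

my $unpack_count = unpack '%32b*', $bit_string;
print "table: $table_count, unpack: $unpack_count\n";   # table: 6, unpack: 6
```

The string eval is needed because `tr///` builds its character list at compile time and does not interpolate variables.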
Update:
It works correctly in recent Perl versions. See Another 32-bit residual in 64-bit perl 5.18.
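The exact threshold in the question lines up with a signed 32-bit limit. The arithmetic below is a sketch under the assumption (not confirmed by the thread) that the affected unpack code kept the total bit count of the string in a signed 32-bit integer; the variable names are illustrative.

```perl
use strict;
use warnings;

# Assumption: the bit count handled by "b*" was stored in a signed 32-bit int.
my $int32_max = 2 ** 31 - 1;

for my $highest_bit (2_147_483_639, 2_147_483_640) {
    my $bytes  = int($highest_bit / 8) + 1;   # string length vec() produces
    my $bits   = $bytes * 8;                  # bits that "b*" has to cover
    my $status = $bits > $int32_max ? 'exceeds' : 'fits in';
    print "highest bit $highest_bit: $bytes bytes, $bits bits, $status a signed 32-bit int\n";
}
```

Bit 2_147_483_639 is the last bit of byte 268_435_454, so the string holds 2_147_483_640 bits, which still fits. Bit 2_147_483_640 starts a new byte and brings the total to 2 ** 31 bits, one past the signed 32-bit maximum.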