How to determine whether a 32-bit/64-bit value has a certain 16-bit value?

Question

Consider the following 32-bit and 64-bit values:

  uint32_t iVal32 = 0xABCD1234;
  uint64_t iVal64 = 0xABCD12345678CABE;

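How can I determine whether iVal32 contains a certain 16-bit (word) value? ~~Say 0xCD12.~~
How can I determine whether iVal64 contains a certain 16-bit (word) value? ~~Say 0x3456.~~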

Update 1 (added later)

~~0xCD12~~ The word value to check can be anywhere in iVal32, on any word boundary.
~~0x3456~~ The word value to check can be anywhere in iVal64, on any word boundary.

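Update 2 (added later)

I admit there is a ridiculous mistake in the question. In my earlier examples, the word values to check do not lie on word boundaries within iVal32 and iVal64. So my correction is:

For iVal32, the word value to check can be 0xABCD or 0x1234. So, for example, 0xCD12 should not be found in iVal32.

For iVal64, the word value to check can be any one of 0xABCD, 0x1234, 0x5678, or 0xCABE. So, for example, 0xCD12, 0x3456, or 0x78CA should not be found in iVal64.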

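Remarks

The solution is meant for a function that searches a Unicode string for a 16-bit character. On x86 the function reads two characters at a time; on x64 it reads four characters at a time.
I ask because I noticed that the glibc implementation of strchr() (which works on 8-bit characters) tries to test one longword at a time, but I did not understand the code very well.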
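For readers unfamiliar with the word-at-a-time idea mentioned above, here is a minimal sketch of how it might look with 16-bit lanes. It is not glibc's code: it XORs the haystack with the needle copied into every lane, then applies the well-known "does this word contain a zero byte" bit trick, widened to 16-bit lanes. The function names (`has_zero_word`, `broadcast16`, `contains16_in32`, `contains16_in64`) are made up for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper names, chosen for this sketch only.

// True if any 16-bit lane of v is zero (16-bit variant of the classic
// "has zero byte" test: a lane's top bit ends up set only via a borrow
// out of a zero lane).
constexpr bool has_zero_word(std::uint64_t v)
{
    return ((v - 0x0001000100010001ULL) & ~v & 0x8000800080008000ULL) != 0;
}

// Copy the needle into all four lanes: 0x1234 -> 0x1234123412341234.
constexpr std::uint64_t broadcast16(std::uint16_t needle)
{
    return needle * 0x0001000100010001ULL;
}

// After the XOR, a lane is zero exactly where haystack and needle agree.
constexpr bool contains16_in64(std::uint64_t haystack, std::uint16_t needle)
{
    return has_zero_word(haystack ^ broadcast16(needle));
}

// Same idea with two lanes.
constexpr bool contains16_in32(std::uint32_t haystack, std::uint16_t needle)
{
    return (((haystack ^ (needle * 0x00010001u)) - 0x00010001u)
            & ~(haystack ^ (needle * 0x00010001u))
            & 0x80008000u) != 0;
}
```

With the values from the question, `contains16_in32(iVal32, 0x1234)` is true and `contains16_in32(iVal32, 0xCD12)` is false. The test only reports whether some lane matches, not which one, so a search function would still need a second step to locate the character.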

Answer 1

bool contains (uint32_t haystack, uint16_t needle)
{
    return ((haystack & 0xffff) == needle) ||
           (((haystack >> 16) & 0xffff) == needle);
}

bool contains (uint64_t haystack, uint16_t needle)
{
    return ((haystack & 0xffff) == needle) ||
           (((haystack >> 16) & 0xffff) == needle) ||
           (((haystack >> 32) & 0xffff) == needle) ||
           (((haystack >> 48) & 0xffff) == needle);
}

Answer 2

It's not entirely clear what the OP wants, but based on the description and the follow-up comments, something like this:

#include <iostream>
#include <stdint.h>
using namespace std;

bool contains(uint16_t item, uint32_t source);
bool contains(uint16_t item, uint64_t source);

int main()
{
    uint32_t iVal32 = 0xABCD1234;
    uint64_t iVal64 = 0xABCD12345678CABEULL;

    cout << contains(0x1234, iVal32) << endl;
    cout << contains(0xABCD, iVal32) << endl;

    cout << contains(0xABCD, iVal64) << endl;
    cout << contains(0x1234, iVal64) << endl;
    cout << contains(0x5678, iVal64) << endl;
    cout << contains(0xCABE, iVal64) << endl;   
    return 0;
}

bool contains(uint16_t item, uint32_t source)
{
    for (int i = 16 ; i >= 0 ; i -= 16)
    {
        if (((source << i) >> 16) == item)
        {
           return true;
        }    
    }
    return false;
}

bool contains(uint16_t item, uint64_t source)
{
    for(int i = 48 ; i >= 0 ; i -= 16)
    {
        if (((source << i) >> 48) == item)
        {
            return true;
        }
    }
    return false;
}

Answer 3

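Here is a possible solution for 32 bits. find_pos returns the bit position of the 16-bit value within the 32-bit value, or -1 if the 32-bit value does not contain the 16-bit value. Note that it steps through the value 4 bits at a time, so it also reports matches that are not on a word boundary, such as 0xCD12 from the original version of the question.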

#include <stdio.h>
#include <stdint.h>

uint32_t iVal32 = 0xABCD1234;
uint64_t iVal64 = 0xABCD12345678CABE;

int find_pos(uint32_t value, uint16_t pattern)
{
    int i;

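    /* Stop at bit 16 so the 16-bit window never runs past bit 31. */
    for (i = 0; i <= 16; i += 4) {
        /* Shift as uint32_t: shifting the int constant 0xFFFF can overflow. */
        uint32_t v = value & ((uint32_t)0xFFFF << i);
        uint32_t p = (uint32_t)pattern << i;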
        if (v == p)
            return i;
    }

    return -1;
}

int main ()
{
    printf("i = %i\n", find_pos(iVal32, 0xCD12));

    return 0;
}

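Update:

Here is a possible solution for both 32 and 64 bits, using a macro. It relies on the GCC statement-expression and typeof extensions.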

#include <stdio.h>
#include <stdint.h>

uint32_t iVal32 = 0xABCD1234;
uint64_t iVal64 = 0xABCD12345678CABE;

#define find_pos(value, pattern, width)             \
({                                                  \
    int i;                                          \
    int found = 0;                                  \
                                                    \
    for (i = 0; i + 16 <= (width); i += 4) {        \
        typeof(value) v = (value) &                 \
            ((typeof(value))0xFFFF << i);           \
        typeof(value) p =                           \
            (typeof(value))(pattern) << i;          \
        if (v == p) {                               \
            found = 1;                              \
            break;                                  \
        }                                           \
    }                                               \
                                                    \
    if (!found)                                     \
        i = -1;                                     \
    i;                                              \
})

#define find_pos32(value, pattern) \
    find_pos(value, pattern, 32)

#define find_pos64(value, pattern) \
    find_pos(value, pattern, 64)

int main ()
{
    printf("i = %i\n", find_pos32(iVal32, 0xCD12));
    printf("i = %i\n", find_pos64(iVal64, 0x678C));

    return 0;
}

Answer 4

A slight variation on David's answer, without branches.

bool contains (uint32_t haystack, uint16_t needle)
{
    uint32_t h1 = (haystack ^ needle) & 0xFFFF;
    haystack >>= 16;
    uint32_t h2 = haystack ^ needle; // No need for mask.
    // If and only if needle was in haystack, h1 or h2 will now be 0
    return (h1*h2) == 0;
}

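Two XORs, one shift, one mask, one multiplication, and a comparison that is probably free. Only 3 registers are needed. The obvious extension to x64 computes h1*h2*h3*h4, but it is probably more efficient to keep a running product so that we don't need registers for 4 intermediate results. (A decent optimizer will do that as well.)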
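The 64-bit extension with a running product could look roughly like the sketch below. The function name `contains16_in64_branchless` is made up for this illustration, and the code assumes the same word-boundary semantics as the 32-bit version above.

```cpp
#include <cstdint>

// Hypothetical name; a sketch of the running-product variant for 64 bits.
bool contains16_in64_branchless(std::uint64_t haystack, std::uint16_t needle)
{
    // Every factor is at most 0xFFFF, so the product of four factors is
    // below 2^64 and cannot wrap around to 0 by accident.
    std::uint64_t product = (haystack ^ needle) & 0xFFFF;
    haystack >>= 16;
    product *= (haystack ^ needle) & 0xFFFF;
    haystack >>= 16;
    product *= (haystack ^ needle) & 0xFFFF;
    haystack >>= 16;
    product *= haystack ^ needle;  // top word: nothing above it, no mask needed
    // The product is 0 if and only if at least one word equals the needle.
    return product == 0;
}
```

Only the running product and the shifted haystack are live at any time, which is the register saving the paragraph above refers to.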

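Because there is no short-circuit evaluation, this may sometimes execute more instructions, but the lack of short-circuit evaluation also means there are no branches. I'd bet that's a net win for string searching.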

Slightly off-topic: the question mentions x86. SSE is the logical option there, especially for 64 bits.
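As a rough illustration of the SSE route, the sketch below compares all four 16-bit words at once with SSE2 intrinsics. The function name `contains16_in64_simd` and the `HAVE_SSE2_SKETCH` macro are made up for this example, and the scalar fallback is only there so that the code still compiles on targets without SSE2.

```cpp
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2_SKETCH 1  // hypothetical macro, used only in this sketch
#endif

// Hypothetical name; checks the four word-aligned 16-bit lanes of haystack.
bool contains16_in64_simd(std::uint64_t haystack, std::uint16_t needle)
{
#ifdef HAVE_SSE2_SKETCH
    // Load the 64-bit value into the low half of a 128-bit register;
    // the upper half is zeroed.
    const __m128i h =
        _mm_loadl_epi64(reinterpret_cast<const __m128i*>(&haystack));
    // Copy the needle into all eight 16-bit lanes.
    const __m128i n = _mm_set1_epi16(static_cast<short>(needle));
    // Matching lanes become 0xFFFF, all others 0x0000.
    const __m128i eq = _mm_cmpeq_epi16(h, n);
    // movemask yields one bit per byte. Keep only the 8 bits that belong to
    // the low 64 bits, so the zeroed upper lanes cannot match a needle of 0.
    return (_mm_movemask_epi8(eq) & 0xFF) != 0;
#else
    // Portable fallback: test each word in turn.
    for (int shift = 0; shift < 64; shift += 16) {
        if (((haystack >> shift) & 0xFFFF) == needle)
            return true;
    }
    return false;
#endif
}
```

In a real string search one would more likely load 16 bytes of the string directly, so that eight characters are compared per instruction instead of four.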
