Why is a bitmask (0x1F) commonly ANDed to the character encoding bytes of NFC tag NDEF payloads?

Question

I'm writing an Android app that writes NFC tags, and I often see examples like this:

private NdefRecord createTextRecord(String content){
    try {
        byte[] language;
        language = Locale.getDefault().getLanguage().getBytes("UTF-8");

        final byte[] text = content.getBytes("UTF-8");
        final int languageSize = language.length;
        final int textLength = text.length;
        final ByteArrayOutputStream payload = new ByteArrayOutputStream(1 + languageSize + textLength);

        payload.write((byte) (languageSize & 0x1F)); // <----- LOOK HERE
        payload.write(language, 0, languageSize);
        payload.write(text, 0, textLength);

        return new NdefRecord(NdefRecord.TNF_WELL_KNOWN, NdefRecord.RTD_TEXT, new byte[0], payload.toByteArray());
    }
    catch (UnsupportedEncodingException e){
        Log.e("createNdefMessage",e.getMessage());
    }
    return null;
}

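Note the payload.write((byte) (languageSize & 0x1F)); part. What is the 0x1F bitmask for? At first I thought the spec allowed only 5 bits to describe the length of the encoding, but that doesn't make sense because we are writing a whole byte.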

For more examples of this mysterious 0x1F mask, see here and here for examples of the NDEF spec, and see here and here.

Am I missing something?

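Edit: since I have answered my own question and I'm not entirely sure I'm correct, if someone else can provide a better explanation or more insight, I will select your answer.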

Answer 1

Thanks to a comment in the code here ...

byte MASK = (byte) 0x1F;
if ((tagFirstOctet & MASK) == MASK) { // EMV book 3, Page 178 or Annex B1 (EMV4.3)

... I found a partial answer to my question on page 156 of EMV 4.3 Book 3.

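It seems that the lower 5 bits, which I took to describe the encoding, are actually the tag number, and the upper 3 bits describe the class and the object type, as follows: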

b8 | b7 | b6 | b5 | b4 | b3 | b2 | b1 | Meaning
---------------------------------------------------------------
 0 |  0 |    |    |    |    |    |    | Universal class
 0 |  1 |    |    |    |    |    |    | Application class
 1 |  0 |    |    |    |    |    |    | Context-specific class
 1 |  1 |    |    |    |    |    |    | Private class
   |    |  0 |    |    |    |    |    | Primitive data object
   |    |  1 |    |    |    |    |    | Constructed data object
   |    |    |  1 |  1 |  1 |  1 |  1 | See subsequent bytes
   |    |    |   Any other value <31  | Tag number

According to ISO/IEC 8825, Table 36 defines the coding rules of the 
subsequent bytes of a BER-TLV tag when tag numbers ≥ 31 are used
(that is, bits b5 - b1 of the first byte equal '11111').

b8 | b7 | b6 | b5 | b4 | b3 | b2 | b1 | Meaning
---------------------------------------------------------------
 1 |    |    |    |    |    |    |    | Another byte follows
 0 |    |    |    |    |    |    |    | Last tag byte
   |           Any value > 0          | (Part of) tag number

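So the suggested use of (languageSize & 0x1F) appears to be incorrect, for at least the following reasons:

The value should represent a tag number, not a character encoding.
It assumes that every tag is universal class and a primitive data object.
If the lower 5 bits are all 1 (that is, the value is 31), the format is wrong, because the next byte should then describe the tag number.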
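
The BER-TLV tag layout in the two tables above can be sketched in plain Java as follows. This is only an illustration of that layout; the class name BerTlvTag and its method names are hypothetical, and the sketch assumes the tag number fits in an int.

```java
// Hypothetical helper illustrating the BER-TLV tag byte layout from the tables above.
final class BerTlvTag {
    // Bits b8-b7: 0 = universal, 1 = application, 2 = context-specific, 3 = private.
    static int tagClass(int firstOctet) {
        return (firstOctet >> 6) & 0x03;
    }

    // Bit b6: 0 = primitive data object, 1 = constructed data object.
    static boolean isConstructed(int firstOctet) {
        return (firstOctet & 0x20) != 0;
    }

    // Bits b5-b1 hold the tag number, unless they are all 1,
    // in which case the number is carried by the subsequent bytes.
    static int tagNumber(byte[] data) {
        int low = data[0] & 0x1F;
        if (low != 0x1F) {
            return low;
        }
        int number = 0;
        for (int i = 1; i < data.length; i++) {
            int b = data[i] & 0xFF;
            number = (number << 7) | (b & 0x7F); // b7-b1 are part of the tag number
            if ((b & 0x80) == 0) {               // b8 = 0 marks the last tag byte
                return number;
            }
        }
        throw new IllegalArgumentException("truncated tag");
    }

    public static void main(String[] args) {
        byte[] tag = {(byte) 0x9F, 0x02};
        System.out.println(tagClass(tag[0] & 0xFF));      // 2 (context-specific)
        System.out.println(isConstructed(tag[0] & 0xFF)); // false
        System.out.println(tagNumber(tag));               // 2
    }
}
```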

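Since I have answered my own question and I'm not entirely sure I'm correct, if someone else can provide a better explanation or more insight, I will select your answer instead.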

Answer 2

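The NDEF Text record is a version of the generic NDEF record structure, characterized by Type Name Format (TNF field) code 1 (NFC Forum well-known record type name) and the type name (TYPE field) "T" (0x54).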

For the NFC Forum well-known type name "T", the structure of the NDEF record PAYLOAD is given by the "NFC Forum Text Record Type Definition" specification.

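The Text record payload consists of a status byte, followed by a variable-length language code and the actual text content, encoded as UTF-8 or UTF-16. The most significant bit of the status byte is 0 for UTF-8 encoding and 1 for UTF-16 encoding. The next bit is reserved. The lowest 6 bits give the number of bytes occupied by the language code. The bitmask 0x1F covers only the 5 least significant bits of a byte, which does not match the text of the specification. Moreover, the following line writes languageSize bytes without applying the same mask, so the code can produce a malformed NDEF Text record in which the tail of the language code becomes part of the text content.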
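
As a rough illustration of the layout described above, here is a minimal sketch in plain Java (no Android classes) that builds a UTF-8 Text record payload and rejects a language code that does not fit in the 6-bit length field instead of silently masking it. The class name TextRecordPayload and the method build are hypothetical, and the sketch assumes the language code is plain ASCII.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds only the payload, not the surrounding NdefRecord.
final class TextRecordPayload {
    // Bits 5..0 of the status byte: length of the language code in bytes.
    static final int LANGUAGE_LENGTH_MASK = 0x3F;

    static byte[] build(String languageCode, String text) {
        // Assumption: the language code is ASCII, e.g. "en".
        byte[] language = languageCode.getBytes(StandardCharsets.US_ASCII);
        if (language.length > LANGUAGE_LENGTH_MASK) {
            // Masking here would make the status byte disagree with the bytes written.
            throw new IllegalArgumentException("language code longer than 63 bytes");
        }
        byte[] content = text.getBytes(StandardCharsets.UTF_8);

        byte[] payload = new byte[1 + language.length + content.length];
        // Status byte: bit 7 = 0 (UTF-8), bit 6 = 0 (reserved), bits 5..0 = length.
        payload[0] = (byte) language.length;
        System.arraycopy(language, 0, payload, 1, language.length);
        System.arraycopy(content, 0, payload, 1 + language.length, content.length);
        return payload;
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : build("en", "Hello World")) {
            hex.append(String.format("%02x", b));
        }
        System.out.println(hex); // 02656e48656c6c6f20576f726c64
    }
}
```

On Android, the resulting array could be passed as the payload argument of the NdefRecord constructor used in the question.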

As an example payload, the byte sequence 02656e48656c6c6f20576f726c64 starts with the status byte 0x02 for the 2-byte language code "en" (0x65, 0x6e), followed by the UTF-8 encoded text "Hello World".
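
To check the example payload by hand, a small decoder along the following lines may help. It is only a sketch: the class name TextRecordDecoder and the helpers decode and fromHex are hypothetical, and it assumes an ASCII language code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: splits a Text record payload into language code and text.
final class TextRecordDecoder {
    static String[] decode(byte[] payload) {
        if (payload.length == 0) {
            throw new IllegalArgumentException("empty payload");
        }
        int status = payload[0] & 0xFF;
        // Bit 7 selects the text encoding: 0 = UTF-8, 1 = UTF-16.
        Charset charset = (status & 0x80) == 0
                ? StandardCharsets.UTF_8 : StandardCharsets.UTF_16;
        // Bits 5..0 give the language code length (bit 6 is reserved).
        int languageLength = status & 0x3F;
        if (1 + languageLength > payload.length) {
            throw new IllegalArgumentException("language code exceeds payload");
        }
        String language = new String(payload, 1, languageLength, StandardCharsets.US_ASCII);
        String text = new String(payload, 1 + languageLength,
                payload.length - 1 - languageLength, charset);
        return new String[] {language, text};
    }

    // Converts a hex string such as "02656e" into bytes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] parts = decode(fromHex("02656e48656c6c6f20576f726c64"));
        System.out.println(parts[0]); // en
        System.out.println(parts[1]); // Hello World
    }
}
```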
