How do I compare a utf-16 buffer to a string?

Question

I have a buffer that looks like this:

<Buffer 50 00 6f 00 77 00 65 00 72 00 50 00 6f 00 69 00 6e 00 74 00 20 00 44 00 6f 00 63 00 75 00 6d 00 65 00 6e 00 74 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...>

When I read this buffer as utf-16le, or print it with .toString(), it shows "PowerPoint Document".

However, if I do this:

var fs = require('fs');

// With an encoding set, each chunk is already a string, so toString() is a no-op.
var stream = fs.createReadStream('test.ppt',{start:1152,end:1215,encoding:'utf16le'})
    stream
    .on('data',function(chunk){
        console.log(chunk.toString().trim());
        console.log(chunk.toString().trim().length);
        if(chunk.toString().trim() === "PowerPoint Document"){
            console.log('yay');
        }else{
            console.log('boo');
        }
    });

This prints:

PowerPoint Document  
32
boo

How can I compare these?

Answer 1

Your string is padded with trailing null characters. Bytes 1152-1215 look like this:

0480h: 50 00 6F 00 77 00 65 00 72 00 50 00 6F 00 69 00  P.o.w.e.r.P.o.i. 
0490h: 6E 00 74 00 20 00 44 00 6F 00 63 00 75 00 6D 00  n.t. .D.o.c.u.m. 
04A0h: 65 00 6E 00 74 00 00 00 00 00 00 00 00 00 00 00  e.n.t........... 
04B0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

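Every pair of null bytes at the end is decoded to \u0000, and trim() does not remove them because U+0000 is not whitespace (which is why the length is 32 rather than 19). So you are actually comparing: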

'PowerPoint Document\u0000\u0000...' === 'PowerPoint Document'

...which is obviously false.

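End the read at byte 1189 instead. The 19 characters take 38 bytes, and the end option is inclusive, so the text occupies bytes 1152 through 1189.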
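If the length of the text is not known in advance, another option is to cut the decoded string at the first \u0000. The following is only a minimal sketch: decodeUtf16Field is a hypothetical helper name, and the 64-byte buffer is built by hand to stand in for the bytes read from the file.

```javascript
// Hypothetical helper: decode a fixed-width UTF-16LE field and drop the NUL padding.
function decodeUtf16Field(buf) {
  const text = buf.toString('utf16le');
  const nul = text.indexOf('\u0000');
  // Keep everything before the first NUL; if there is none, keep the whole string.
  return nul === -1 ? text : text.slice(0, nul);
}

// Stand-in for bytes 1152-1215: the name followed by zero bytes (Buffer.alloc zero-fills).
const field = Buffer.alloc(64);
field.write('PowerPoint Document', 0, 'utf16le');

console.log(decodeUtf16Field(field) === 'PowerPoint Document'); // true
```

Cutting at the first NUL assumes the field never contains a legitimate \u0000 inside the text, which seems reasonable for a name field like this one.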

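As an aside: a stream's data event is not guaranteed to deliver all the data you requested in one go. It may fire several times with only part of the data each time (which is why the argument is called a chunk). You have to buffer every data event until you get the end event, and only then do the comparison.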
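A rough sketch of that buffering, under some assumptions: readUtf16Range is a hypothetical helper name, and a small temp file stands in for test.ppt so the snippet can run on its own. No encoding is passed to the stream, so the chunks are Buffers and are decoded once after concatenation, which avoids decoding a 2-byte code unit that happens to be split across two chunks.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read bytes start..end (inclusive) and resolve with the
// decoded text only after 'end', so the comparison never sees a partial chunk.
function readUtf16Range(file, start, end) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    fs.createReadStream(file, { start, end }) // no encoding: chunks are Buffers
      .on('data', (chunk) => chunks.push(chunk))
      .on('error', reject)
      .on('end', () => {
        const text = Buffer.concat(chunks).toString('utf16le');
        resolve(text.replace(/\u0000+$/, '')); // drop trailing NUL padding
      });
  });
}

// Stand-in for test.ppt: a 64-byte NUL-padded field written to a temp file.
const demoFile = path.join(os.tmpdir(), 'utf16-field-demo.bin');
const demoField = Buffer.alloc(64);
demoField.write('PowerPoint Document', 0, 'utf16le');
fs.writeFileSync(demoFile, demoField);

readUtf16Range(demoFile, 0, 63).then((text) => {
  console.log(text === 'PowerPoint Document' ? 'yay' : 'boo');
});
```

With the real file, the call would presumably be readUtf16Range('test.ppt', 1152, 1215), or with 1189 as the end so that no padding is read at all.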

javascript

utf-16

node.js