Why does Node.js allow this seemingly invalid character sequence?

Question

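I wanted to see whether there is a way to tell apart a return in a file (a line break) and a typed newline (a \n in the file). While playing around in the REPL I made a typo in a comparison and, to my surprise, Node.js did not care. It even produced what I would consider undefined behavior, unless I have completely missed something in my years of close acquaintance with Node.js. I also found a few other things while playing around, which I ask about below.

The code is at the bottom of the post.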

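The main question is:

Why does Node.js not complain about the syntax of the last two comparisons (==+ and ==-)? Is that somehow valid syntax? And why does it make the comparison true when, without the trailing +/-, it is false? (Updated in the post comments.)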

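The main side question is:

Why are the results of 'Buffer separate self comparison' and 'Buffer comparison' false when all the other tests are true? Why does a buffer not compare equal to another buffer holding the same data?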

Also:

How can I reliably tell apart a return in a file and a typed newline, as described above?

The code is as follows:


const nl = '\n'
const newline = `
`

const NL = Buffer.from('\n')
const NEWLINE = Buffer.from(`
`)
const NEWLINE2 = Buffer.from(`
`)
console.log("Buffer separate self comparison: "+(NEWLINE2 == NEWLINE))
console.log("Buffer comparison: "+(NL == NEWLINE))
console.log("Non buffer comparison: "+(nl == newline))
console.log("Buffer self comparison 1: "+(NL == NL))
console.log("Buffer self comparison 2: "+(NEWLINE == NEWLINE))
console.log("Buffer/String comparison 1: "+(nl == NL))
console.log("Buffer/String comparison 2: "+(newline == NEWLINE))
console.log("Buffer/String cross comparison 1: "+(nl == NEWLINE))
console.log("Buffer/String cross comparison 2: "+(newline == NL))
console.log("Buffer toString comparison: "+(NL.toString() == NEWLINE.toString()))
console.log("Strange operator comparison 1: "+(NL ==+ NEWLINE))
console.log("Strange operator comparison 2: "+(NL ==- NEWLINE))

Answer 1

NEWLINE2 == NEWLINE (false)
NL == NEWLINE (false)

An expression comparing Objects is only true if the operands reference the same Object. src

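That is not the case here: they are two separate objects, even though their initial values are the same, so the result is false.

Edit: If you want to compare the values of two buffers rather than their identity, you can use Buffer.compare. Buffer.compare(NEWLINE2, NEWLINE) === 0 means the two are equal.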
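As a minimal sketch of the difference between identity and value comparison (the variable names are illustrative, not taken from the question), the snippet below builds two buffers with the same byte and compares them three ways. Besides Buffer.compare, Node.js also provides the instance method buf.equals for the same purpose.

```javascript
// Two buffers with the same bytes are still two distinct objects.
const a = Buffer.from('\n');
const b = Buffer.from('\n');

const sameIdentity = a === b;                        // false: different objects
const sameBytesEquals = a.equals(b);                 // true: byte-wise comparison
const sameBytesCompare = Buffer.compare(a, b) === 0; // true: 0 means "equal"

console.log({ sameIdentity, sameBytesEquals, sameBytesCompare });
```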

nl == newline (true)

Two strings are strictly equal when they have the same sequence of characters, same length, and same characters in corresponding positions. src

The strings are equal, so true.

NL == NL (true)
NEWLINE == NEWLINE (true)

An expression comparing Objects is only true if the operands reference the same Object. src

nl == NL (true)
newline == NEWLINE (true)
nl == NEWLINE (true)
newline == NL (true)

What happens here is that you are comparing two different types. One is a string, the other is an object.

Each of these operators will coerce its operands to primitives before a comparison is made. If both end up as strings, they are compared using lexicographic order, otherwise they are cast to numbers to be compared. A comparison against NaN will always yield false. src

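Buffer has a toString method, so it is called in order to have the same primitive type on both sides of ==. The result of this method is a string containing \n. '\n' == '\n' is true.

Incidentally, if your comparison were NEWLINE == 0, this is what would happen:

' 1 ' == 1 evaluates to true. During conversion, whitespace is discarded, so ' 1 ' is converted to a number with the value 1. The resulting comparison is 1 == 1.

A string containing only whitespace characters is coerced to 0. The Buffer is first converted to a string and then to a number, so the comparison becomes 0 == 0 and the result is true.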
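The conversion described above can be observed directly. This is only a sketch with made-up variable names; it relies on Buffer's toString decoding the bytes as UTF-8 when called without arguments.

```javascript
const buf = Buffer.from('\n');

// String(buf) goes through Buffer.prototype.toString (UTF-8 by default).
const asString = String(buf);

const looseEqual = buf == '\n';   // object vs string: buf is converted first
const strictEqual = buf === '\n'; // no conversion: the types differ

console.log(JSON.stringify(asString), looseEqual, strictEqual);
```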
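A short sketch of that string-to-number step, using illustrative variable names:

```javascript
const fromNewline = Number('\n'); // whitespace-only string becomes 0
const fromPadded = Number(' 1 '); // surrounding whitespace is ignored, giving 1

// The buffer becomes the string "\n", which becomes 0, so this is 0 == 0.
const bufferEqualsZero = Buffer.from('\n') == 0;

console.log(fromNewline, fromPadded, bufferEqualsZero);
```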

NL.toString() == NEWLINE.toString() (true)

Two strings are strictly equal when they have the same sequence of characters, same length, and same characters in corresponding positions. src

The strings are equal, so true.

NL ==+ NEWLINE (true)
NL ==- NEWLINE (true)

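This is the same as == +NEWLINE. You are using the unary + or - to convert explicitly to a number. The interesting part is that, after the conversion, you are making these comparisons: 0 == +0 and 0 == -0. Positive and negative zero are considered equal.

None of the behavior here is 'undefined'.

Beyond "huh, that's neat", there is really very little reason not to use the strict equality operator (===), which does not convert its operands to the same primitive type.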
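To make the parsing visible, here is a small sketch (variable names are illustrative) showing that ==+ and ==- are just == followed by a unary operator applied to the right-hand operand:

```javascript
const left = Buffer.from('\n');
const right = Buffer.from('\n');

const plain = left == right;      // false: two distinct objects
const withPlus = left ==+ right;  // parsed as left == (+right)
const withMinus = left ==- right; // parsed as left == (-right)

const plusValue = +right;  // 0
const minusValue = -right; // -0

console.log(plain, withPlus, withMinus, plusValue, Object.is(minusValue, -0));
```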

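Regarding your question:

A newline in a file (\n) is the same as a newline in a string you typed yourself ('\n'). Both are the ASCII or Unicode character 0x0A, as a single byte.

Some documents contain both a line feed and a carriage return. A line break then consists of two characters: 0x0D 0x0A (or \r\n).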
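If the goal is to find out which kind of line break a piece of text uses, one possible approach is to count the two-character sequence separately from the bare line feed. The helper below is hypothetical (countLineEndings is not a Node.js API) and assumes the text has already been read into a string.

```javascript
// Hypothetical helper: counts CRLF pairs and bare LF characters in a string.
function countLineEndings(text) {
  const crlf = (text.match(/\r\n/g) || []).length;
  // Every CRLF also contains one LF, so subtract those to get bare LFs.
  const lf = (text.match(/\n/g) || []).length - crlf;
  return { crlf, lf };
}

const unixStyle = countLineEndings('a\nb\n');
const windowsStyle = countLineEndings('a\r\nb\r\n');

// The raw bytes of a CRLF line break: 0x0D followed by 0x0A.
const crlfBytes = [...Buffer.from('\r\n')];

console.log(unixStyle, windowsStyle, crlfBytes);
```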

undefined-behavior

comparison-operators

node.js