TextEncoder produces UTF-8 instead of the requested charset encoding

Question

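As part of transitioning my Thunderbird extension to Thunderbird 60, I need to switch from nsIScriptableUnicodeConverter (never mind what that is if you don't know Mozilla) to the more popular, cross-browser TextDecoder and TextEncoder. The problem is that they don't behave the way I expect.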

Specifically, suppose my string str contains "ùìåí," (without the quotes, of course). Now, when I run:

undecoded_str = new TextEncoder("windows-1252").encode(str);

I expect to get the sequence

F9, EC, E5, ED, 2C

(the one-octet windows-1252 value of each of the 5 characters). But what I actually get is:

C3, B9, C3, AC, C3, A5, C3, AD, 2C

which appears to be the UTF-8 encoding of the string. Why is that?

Answer 1

Annoyingly, many browsers have simply dropped support for charset encodings other than UTF-8 in TextEncoder (though not in TextDecoder):

Note: Firefox, Chrome and Opera used to have support for encoding types other than utf-8 (such as utf-16, iso-8859-2, koi8, cp1261, and gbk). As of Firefox 48 (ticket), Chrome 54 (ticket) and Opera 41, no other encoding types are available other than utf-8, in order to match the spec. In all cases, passing in an encoding type to the constructor will be ignored and a utf-8 TextEncoder will be created (the TextDecoder still allows for other decoding types).
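You can check this behavior yourself. The following is only a minimal sketch, assuming an environment with a spec-conforming TextEncoder (a current browser or Node.js); the variable names are mine, and the string is written with escapes so it does not depend on the file's encoding:

```javascript
// The constructor argument is silently ignored; the encoder is always UTF-8.
const ignoredLabelEncoder = new TextEncoder("windows-1252");
console.log(ignoredLabelEncoder.encoding); // "utf-8"

// "\u00F9\u00EC\u00E5\u00ED," is the string "ùìåí," from the question.
const utf8Bytes = ignoredLabelEncoder.encode("\u00F9\u00EC\u00E5\u00ED,");

// Format the bytes as upper-case hex, the way the question lists them.
const utf8Hex = Array.from(utf8Bytes, (b) =>
  b.toString(16).toUpperCase().padStart(2, "0")
).join(", ");
console.log(utf8Hex); // C3, B9, C3, AC, C3, A5, C3, AD, 2C
```

Reading the encoding property is probably the quickest way to see that the requested label had no effect.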

Damn!
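Since the note says TextDecoder still accepts other encodings, one possible workaround for single-byte charsets is to decode each of the 256 byte values once and invert the result into a lookup table. This is only a sketch under that assumption, not an official API: buildSingleByteEncoder and encodeWindows1252 are hypothetical names I made up, and it needs a runtime whose TextDecoder actually supports the label (for example Node.js built with full ICU).

```javascript
// Hypothetical helper: builds an encoder for a single-byte charset by
// inverting what TextDecoder produces for each possible byte value.
function buildSingleByteEncoder(label) {
  const decoder = new TextDecoder(label);
  const charToByte = new Map();
  for (let byte = 0; byte < 256; byte++) {
    const ch = decoder.decode(new Uint8Array([byte]));
    // Skip bytes the decoder cannot map (they come back as U+FFFD),
    // and keep the first byte seen for any given character.
    if (ch === "\uFFFD" || charToByte.has(ch)) continue;
    charToByte.set(ch, byte);
  }
  return function encode(str) {
    const out = [];
    for (const ch of str) { // iterates by code point
      const byte = charToByte.get(ch);
      if (byte === undefined) {
        throw new RangeError("Character not representable in " + label + ": " + ch);
      }
      out.push(byte);
    }
    return Uint8Array.from(out);
  };
}

const encodeWindows1252 = buildSingleByteEncoder("windows-1252");

// "\u00F9\u00EC\u00E5\u00ED," is the string "ùìåí," from the question.
const cp1252Bytes = encodeWindows1252("\u00F9\u00EC\u00E5\u00ED,");
const cp1252Hex = Array.from(cp1252Bytes, (b) =>
  b.toString(16).toUpperCase().padStart(2, "0")
).join(", ");
console.log(cp1252Hex); // F9, EC, E5, ED, 2C
```

This sketch throws on characters the charset cannot represent; you may prefer to substitute "?" instead. It does not work for multi-byte encodings such as gbk.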

thunderbird

character-encoding