location.href on a domain with an umlaut (ü) reports a different domain

Visit the following domain: https://obs.bürgerhaus.de

In the browser console, when I evaluate document.location.href, I get the following:

> document.location.href
"https://obs.xn--brgerhaus-q9a.de/"

Why is this value different from the actual domain? Is this some kind of URL encoding? How can I get the original domain with the umlaut in it?

The Domain Name System, which performs a lookup service to translate user-friendly names into network addresses for locating Internet resources, is restricted in practice to the use of ASCII characters, a practical limitation that initially set the standard for acceptable domain names.

(See: https://en.wikipedia.org/wiki/Internationalized_domain_name)

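As the article states, the domain names we use every day are technically restricted to ASCII characters. To support a wider range of characters, Unicode domain names are encoded as so-called Punycode (see the RFC: https://www.ietf.org/rfc/rfc3492.txt).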

Visiting a site with an umlaut (or a similar character) in its name forces the browser to encode it. For example, http://öbb.at is transformed to http://xn--bb-eka.at. The transformed form is called ASCII Compatible Encoding (ACE) and is made up of the four-character prefix (xn--) followed by the Punycode representation of the Unicode characters. See more details here ...

To decode it, you can take a look at:

Punycode JS on GitHub

Solution from some - Stack Overflow
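
If you would rather not pull in a library, here is a minimal sketch of what such a decoder does, following the decoding procedure in RFC 3492. The names decodeLabel and hostnameToUnicode are made up for this example and are not part of any browser API. The sketch assumes well-formed input and performs none of the IDNA validation that a real library does, so treat it as an illustration rather than a drop-in replacement.

```javascript
// Punycode parameters from RFC 3492, section 5.
const BASE = 36, T_MIN = 1, T_MAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Bias adaptation function (RFC 3492, section 6.1).
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / DAMP) : Math.floor(delta / 2);
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - T_MIN) * T_MAX) / 2)) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

// Map a character to its digit value: a-z => 0..25, 0-9 => 26..35.
function digitValue(ch) {
  const c = ch.toLowerCase().charCodeAt(0);
  if (c >= 97 && c <= 122) return c - 97;
  if (c >= 48 && c <= 57) return c - 48 + 26;
  throw new RangeError("Invalid Punycode digit: " + ch);
}

// Hypothetical helper: decode one label with the "xn--" prefix already removed.
function decodeLabel(input) {
  // Everything before the last "-" is copied through as plain ASCII.
  const lastDash = input.lastIndexOf("-");
  const output = lastDash > 0
    ? Array.from(input.slice(0, lastDash), (ch) => ch.codePointAt(0))
    : [];
  let pos = lastDash > 0 ? lastDash + 1 : 0;
  let n = INITIAL_N, i = 0, bias = INITIAL_BIAS;

  while (pos < input.length) {
    const oldI = i;
    let w = 1;
    // Read one variable-length integer that encodes both the code point
    // and the position at which to insert it.
    for (let k = BASE; ; k += BASE) {
      if (pos >= input.length) throw new RangeError("Truncated Punycode input");
      const digit = digitValue(input[pos++]);
      i += digit * w;
      const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
      if (digit < t) break;
      w *= BASE - t;
    }
    bias = adapt(i - oldI, output.length + 1, oldI === 0);
    n += Math.floor(i / (output.length + 1));
    i %= output.length + 1;
    output.splice(i, 0, n); // insert code point n at index i
    i++;
  }
  return String.fromCodePoint(...output);
}

// Hypothetical helper: decode every "xn--" label of a host name.
function hostnameToUnicode(hostname) {
  return hostname
    .split(".")
    .map((label) =>
      label.toLowerCase().startsWith("xn--") ? decodeLabel(label.slice(4)) : label
    )
    .join(".");
}

console.log(hostnameToUnicode("obs.xn--brgerhaus-q9a.de")); // "obs.bürgerhaus.de"
console.log(hostnameToUnicode("xn--bb-eka.at"));            // "öbb.at"
```

In the browser console you could then call hostnameToUnicode(location.hostname) to get the host name with the umlaut restored.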