Extracting the domain for "asterisk-prefixed" suffixes

I use tldextract (version 2.2.2) to extract the subdomain/domain/suffix from URLs.

I recently noticed a result that surprised me:

>>> from tldextract import extract
>>> extract('http://althawrah.ye/archives/597366')
ExtractResult(subdomain='', domain='', suffix='althawrah.ye')

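althawrah was not picked as the domain; it was treated as part of the suffix. Why is that?

Looking more closely, I noticed that in the Public Suffix List itself, .ye is one of the few suffixes that use a leading asterisk, e.g.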

// fj : https://en.wikipedia.org/wiki/.fj
*.fj
// ye : http://www.y.net.ye/services/domain_name.htm
*.ye

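The implication is that these suffixes do not allow domains to be registered directly under the suffix; they must be registered as third-level domains instead. However, that is not the case for http://althawrah.ye/; that is, althawrah is not listed as a second-level domain of .ye. So what is going on?

Judging by the list's history and the description of its update process, the Yemen entry appears to be simply wrong or out of date. The entry was added before 2007 (when the list was migrated from CVS to git), while the list guidelines state: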
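To make the effect of the `*.ye` rule concrete, here is a minimal sketch of the matching logic as I understand it from the list format: the longest matching rule wins, and `*` stands for exactly one label. This is not tldextract's actual implementation; `suffix_length` and `split_host` are hypothetical helpers, and exception rules (`!`) and IDNA handling are left out.

```python
def suffix_length(labels, rules):
    """Number of trailing labels that form the public suffix.

    Simplified sketch: no exception rules ("!") and no IDNA handling.
    """
    best = 1  # implicit default rule "*": an unlisted TLD is itself a suffix
    for rule in rules:
        rule_labels = rule.split(".")
        n = len(rule_labels)
        if n > len(labels):
            continue
        tail = labels[-n:]
        # "*" matches any single label; other labels must match exactly.
        if all(r == "*" or r == h for r, h in zip(rule_labels, tail)):
            best = max(best, n)
    return best


def split_host(hostname, rules):
    """Return (subdomain, domain, suffix) for a bare hostname."""
    labels = hostname.lower().split(".")
    n = suffix_length(labels, rules)
    rest = labels[:-n]  # everything to the left of the suffix
    domain = rest[-1] if rest else ""
    subdomain = ".".join(rest[:-1])
    return subdomain, domain, ".".join(labels[-n:])


wildcard_rules = ["*.ye"]
print(split_host("althawrah.ye", wildcard_rules))
# ('', '', 'althawrah.ye')
print(split_host("www.example.com.ye", wildcard_rules))
# ('www', 'example', 'com.ye')
```

With only the wildcard rule, every label directly under .ye is swallowed into the suffix, which matches the result tldextract gave above.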

Changes [for ICANN Domains] need to either come from a representative of the registry (authenticated in a similar manner to below) or be from public sources such as a registry website.

The website linked in the list (which hasn't changed since 2002) gives little detail but does mention URLs of the format www.yourcompany.com.ye, which is presumably where the *.ye rule came from. IANA's root zone database names TeleYemen as the current TLD manager, but their site makes no mention of domain registration. The Wikipedia list of supposed "second level domains" was added in 2008 by a Canadian user, citing a since-deleted website of a company called phpcomet (archived here) that claimed to sell domains under the listed second-level domains. However, a Google search for "site:ye" turns up plenty of sites outside those domains (e.g. press24.ye, ndc.ye) and returns no results at all for many of the listed ones (me.ye, co.ye, ltd.ye, plc.ye).

I'm not sure how to get the official list updated, but I wouldn't be surprised if the correct entries looked like this:

ye
com.ye
edu.ye
gov.ye
org.ye
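As a quick sanity check of these entries, the sketch below parses them in the list's text format and applies a simplified longest-match lookup. It is an illustration only, not the code tldextract or the list maintainers use; `parse_rules` and `split_with_rules` are hypothetical names, and exception rules (`!`) are not handled.

```python
PROPOSED = """
// ye : proposed entries from this answer, not the official list
ye
com.ye
edu.ye
gov.ye
org.ye
"""


def parse_rules(text):
    """Parse suffix-list style text into a set of label tuples."""
    rules = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("//"):  # skip blanks and comments
            rules.add(tuple(line.split(".")))
    return rules


def split_with_rules(hostname, rules):
    """Return (subdomain, domain, suffix) using the longest matching rule."""
    labels = hostname.lower().split(".")
    n = 1  # implicit default: the TLD alone is the suffix
    for size in range(1, len(labels) + 1):
        tail = tuple(labels[-size:])
        wildcard = ("*",) + tail[1:]  # same tail with a wildcard first label
        if tail in rules or wildcard in rules:
            n = size  # sizes increase, so the last hit is the longest
    rest = labels[:-n]
    domain = rest[-1] if rest else ""
    return ".".join(rest[:-1]), domain, ".".join(labels[-n:])


rules = parse_rules(PROPOSED)
print(split_with_rules("althawrah.ye", rules))
# ('', 'althawrah', 'ye')
print(split_with_rules("www.example.com.ye", rules))
# ('www', 'example', 'com.ye')
```

Under explicit entries, althawrah comes out as the domain, while names registered under com.ye still split the same way as before.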

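Thanks to TeleYemen and the project maintainers, these changes have been merged into publicsuffix/list in pull request 1189.

The list now specifies the second-level suffixes explicitly and drops the * wildcard.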