Why are double-quotes urlencoded as %22?

As far as I know, URL encoding exists because URLs only support ASCII. But since " is already in the ASCII table, why does URL encoding turn it into %22?

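The " character falls under the "Unsafe" part of section 2.2 (URL Character Encoding Issues) of RFC 1738 (Uniform Resource Locators). The reason given for including it is: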

The quote mark (""") is used to delimit URLs in some systems.

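One example I can think of is HTML attributes. If you have an <a> tag with an href attribute, you would normally enclose the URL in double quotes. If a " character inside the URL is not encoded, the attribute value ends early and the markup is invalid: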

<a href="https://example.com/this"should-be-quoted">...</a>
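As a minimal sketch of the encoding itself, assuming Python's standard `urllib.parse` module (the path string is just the example from the markup above): the " character has ASCII code 0x22, which is where %22 comes from.

```python
from urllib.parse import quote, unquote

# The double quote is ASCII code point 0x22, hence the escape "%22".
assert format(ord('"'), "02X") == "22"

# Illustrative path taken from the broken markup above.
raw_path = 'this"should-be-quoted'
encoded_path = quote(raw_path)

print(encoded_path)  # this%22should-be-quoted

# With the quote escaped, the attribute value is delimited unambiguously.
link = f'<a href="https://example.com/{encoded_path}">...</a>'
print(link)

# Decoding recovers the original text.
assert unquote(encoded_path) == raw_path
```

Once the quote is written as %22, the only literal " characters left in the tag are the two that delimit the attribute value.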

The RFC goes on to say:

All unsafe characters must always be encoded within a URL.


Some examples of other unsafe characters:

The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text.

The character "%" is unsafe because it is used for encodings of other characters.

The character "#" is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it.
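To make these cases concrete, here is a small sketch that percent-encodes each of the unsafe characters quoted above. It assumes Python's standard `urllib.parse.quote`; the names `unsafe_chars` and `escapes` are only illustrative.

```python
from urllib.parse import quote

# Characters that the RFC 1738 excerpts above call out as unsafe.
unsafe_chars = ['"', "<", ">", "%", "#"]

# safe="" tells quote() to escape everything outside its always-safe set.
escapes = {ch: quote(ch, safe="") for ch in unsafe_chars}

for ch, esc in escapes.items():
    print(f"{ch!r} -> {esc}")
```

Each escape is simply a percent sign followed by the character's ASCII code in hexadecimal.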

URLs only support ASCII encoding

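That's not true. For example, URLs don't support spaces or /&? as plain data, even though they are valid ASCII characters, because they have special meaning in URLs.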

The characters that are always valid in a URL without encoding are:

  • A-Z
  • a-z
  • 0-9
  • -
  • _
  • .
  • ~

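No other characters are supported as plain data. Some, such as space and tab, are not supported because they have special meaning in the protocols where URLs are commonly used (for example HTTP). Others, such as ? and &, are not supported because they have special meaning in the URL syntax itself.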
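A quick way to check this split is sketched below. It assumes Python 3.7 or later, where `urllib.parse.quote` treats `~` as never needing an escape; the variable name `unreserved` is just a label for the list above.

```python
import string
from urllib.parse import quote

# The characters listed above: letters, digits, and - _ . ~
unreserved = string.ascii_letters + string.digits + "-_.~"

# None of them is changed, even when no extra "safe" characters are allowed.
assert quote(unreserved, safe="") == unreserved

# Space, tab, and characters with meaning in URL syntax are escaped
# when they appear as plain data.
for ch in [" ", "\t", "/", "?", "&"]:
    print(repr(ch), "->", quote(ch, safe=""))
```

Passing `safe=""` matters here: by default `quote` leaves `/` alone because it is usually meant as a path separator rather than as data.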