What is the correct way to prevent XSS attacks from being included in user-provided links?

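I'm trying to fix an XSS issue on a website where a user-provided link is sent to the server and then rendered back into the web page. An attacker can carry out XSS by having the link close the HTML tag, appending something like the following to the end of the link: "/><img+src/onerror%3d'alert(document.domain)'><"

I'm experimenting with the OWASP Java HTML Sanitizer library, but I can't get it to work.

It seems to break the link. For example, if I pass this link to the default LINKS policy, the result is mangled: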

Before: https://www.google.com/search?client=firefox-b-d&q=xss+encoding+url

After: https://www.google.com/search?client&#61;firefox-b-d&amp;q&#61;xss&#43;encoding&#43;url

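If I paste the encoded link into the browser, it does not take me directly to the Google search.

I think I'm misunderstanding how XSS attacks work against URLs, and I'd appreciate help understanding why the sanitizer doesn't behave the way I expected. I expected the sanitizer to encode characters such as "<" and '"', but not characters such as "=".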

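As the name suggests, the HTML Sanitizer is for sanitizing HTML content (in particular generated body content, JavaScript, etc.). Its output is meant to be embedded in an HTML page, not pasted into the address bar: the browser decodes the character references when it parses the page, so the sanitized string works perfectly once it is placed in HTML.

Just try the following: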

<html>
<body>
<a href="https://www.google.com/search?client&#61;firefox-b-d&amp;q&#61;xss&#43;encoding&#43;url">
   Click here.
</a>
</body>
</html>

Clicking the sanitized link does take you to the intended Google search.

As OWASP states:
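
To see why the link in the page above still works, here is a rough Java sketch of what the browser does with the `href` value before following it. The class `AttributeDecoding` and its `decode` method are hypothetical names made up for this illustration; they are not part of the OWASP library, and the sketch only handles decimal references and `&amp;`, not the full HTML parsing rules.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not a complete HTML entity decoder. */
final class AttributeDecoding {
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d+);");

    /** Decodes decimal character references (e.g. &#61;) and the named reference &amp;. */
    static String decode(String attributeValue) {
        Matcher m = DECIMAL_REF.matcher(attributeValue);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // Turn the decimal code point into the character it stands for.
            String ch = new String(Character.toChars(Integer.parseInt(m.group(1))));
            m.appendReplacement(out, Matcher.quoteReplacement(ch));
        }
        m.appendTail(out);
        // Handle &amp; last so that "&amp;#61;" becomes "&#61;" and is not decoded twice.
        return out.toString().replace("&amp;", "&");
    }
}
```

Calling `AttributeDecoding.decode` on the sanitized value gives back the original URL with `=`, `&` and `+` restored, which is why the encoded form is harmless inside an attribute but useless when pasted into the address bar.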

A Positive XSS Prevention Model (...) treats an HTML page like a template, with slots where a developer is allowed to put untrusted data. These slots cover the vast majority of the common places where a developer might want to put untrusted data. Putting untrusted data in other places in the HTML is not allowed. This is a "whitelist" model, that denies everything that is not specifically allowed.

Given the way browsers parse HTML, each of the different types of slots has slightly different security rules. When you put untrusted data into these slots, you need to take certain steps to make sure that the data does not break out of that slot into a context that allows code execution. In a way, this approach treats an HTML document like a parameterized database query - the data is kept in specific places and is isolated from code contexts with escaping.

Your sanitizer is meant to make those slots "safer" places.
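
For the specific case of a user-provided link, the slot is the `href` attribute, and a common approach is to validate the URL first and then escape it for that attribute context. The following is only a minimal sketch using the Java standard library rather than the OWASP sanitizer; `SafeLink`, `isAllowedUrl`, `escapeAttribute` and `renderLink` are hypothetical names, and the http/https allowlist is an assumption about what the site wants to permit.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper: validate a user-provided URL, then escape it for an href slot. */
final class SafeLink {

    /** Accepts only syntactically valid absolute http/https URLs with a host. */
    static boolean isAllowedUrl(String url) {
        try {
            URI uri = new URI(url); // rejects raw characters such as '"', '<', '>' and spaces
            String scheme = uri.getScheme();
            return scheme != null
                    && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))
                    && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /** Escapes the characters that could break out of a quoted HTML attribute value. */
    static String escapeAttribute(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Renders an anchor for allowed URLs, and a harmless placeholder otherwise. */
    static String renderLink(String url) {
        if (!isAllowedUrl(url)) {
            return "<span>[link removed]</span>";
        }
        return "<a href=\"" + escapeAttribute(url) + "\">Click here.</a>";
    }
}
```

With this sketch, the Google search URL from the question is rendered with only `&` turned into `&amp;`, while a link carrying the `"/><img ...` payload or a `javascript:` scheme fails validation and is never placed in the page. In a real application a maintained encoder or the sanitizer's own policy would normally do this work.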