How to sanitize input to avoid malicious attributes in Django?
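I want to allow users to post images, so I need to add |safe to the template tag, and I use Beautiful Soup to whitelist some tags using this snippet.
But how can I avoid potentially malicious attributes like the one below?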
<img src="puppy.png" onload="(function(){/* do bad stuff */}());" />
Update: Note that the snippet linked above has some XSS flaws, as mentioned here.
You also need to whitelist attributes, not just tags.
Using Beautiful Soup 3:
from BeautifulSoup import BeautifulSoup, Comment

def safe_html(html):
    tag_whitelist = ['img']
    attr_whitelist = ['src', 'alt', 'width', 'height']
    soup = BeautifulSoup(html)
    for tag in soup.findAll():
        if tag.name.lower() in tag_whitelist:
            # BS3 stores attributes as a list of (name, value) tuples
            tag.attrs = [a for a in tag.attrs if a[0].lower() in attr_whitelist]
        else:
            # drop the tag but keep its children (BS3's name for BS4's unwrap())
            tag.replaceWithChildren()
    # scripts can be executed from comments in some cases (citation needed)
    comments = soup.findAll(text=lambda text: isinstance(text, Comment))
    for comment in comments:
        comment.extract()
    return unicode(soup)
Using Beautiful Soup 4:
from bs4 import BeautifulSoup, Comment

def safe_html(html):
    tag_whitelist = ['img']
    attr_whitelist = ['src', 'alt', 'width', 'height']
    # name the parser explicitly so the result does not depend on what is installed
    soup = BeautifulSoup(html, 'html.parser')
    for tag in soup.find_all():
        if tag.name.lower() in tag_whitelist:
            # BS4 stores attributes as a dict
            tag.attrs = {name: value for name, value in tag.attrs.items()
                         if name.lower() in attr_whitelist}
        else:
            # drop the tag but keep its children
            tag.unwrap()
    # scripts can be executed from comments in some cases (citation needed)
    comments = soup.find_all(text=lambda text: isinstance(text, Comment))
    for comment in comments:
        comment.extract()
    return str(soup)  # use unicode(soup) on Python 2
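The attribute whitelist keeps src but does not look at its value, and some older browsers have been reported to run script from values such as javascript: URLs. The following is only a minimal sketch of one possible extra check, not part of the original snippet: the helper name is_safe_src and the ALLOWED_SCHEMES policy are assumptions made for illustration, and it uses only the standard library.

```python
from urllib.parse import urlsplit

# Hypothetical policy: URL schemes we are willing to load images from.
ALLOWED_SCHEMES = {"http", "https"}

def is_safe_src(value):
    """Return True for scheme-less (relative) URLs and for http(s) URLs."""
    # Strip surrounding whitespace and embedded tabs/newlines, which
    # browsers ignore when they parse a URL.
    cleaned = "".join(ch for ch in value.strip() if ch not in "\t\r\n")
    scheme = urlsplit(cleaned).scheme.lower()
    return scheme == "" or scheme in ALLOWED_SCHEMES
```

Inside the loop of safe_html, one could then delete the src attribute of any whitelisted tag whose value fails this check. Whether data: URLs should be allowed depends on your application; this sketch rejects them.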