Python Input Sanitization
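I need to do some very quick input sanitization, and I basically want to convert every <, > to &lt;, &gt;.
I would like to get the same result as '<script></script>'.replace('<', '&lt;').replace('>', '&gt;') without iterating over the string multiple times. I know about maketrans used together with str.translate (see http://www.tutorialspoint.com/python/string_translate.htm), but that only converts one character to another single character. In other words, one cannot do something like this: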
inList = '<>'
outList = ['&lt;', '&gt;']
transform = maketrans(inList, outList)
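Is there a builtin function that can do this conversion in a single pass?
I would like to use builtin functionality rather than an external module. I already know about Bleach.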
You can use cgi.escape():
import cgi
inlist = '<>'
transform = cgi.escape(inlist)
print transform
Output:
&lt;&gt;
https://docs.python.org/2/library/cgi.html#cgi.escape
cgi.escape(s[, quote]) Convert the characters '&', '<' and '>' in
string s to HTML-safe sequences. Use this if you need to display text
that might contain such characters in HTML. If the optional flag quote
is true, the quotation mark character (") is also translated; this
helps for inclusion in an HTML attribute value delimited by double
quotes, as in <a href="...">. Note that single quotes are never
translated.
You can define your own function that loops over the string once and replaces whichever characters you specify.
def sanitize(input_string):
    output_string = ''
    for i in input_string:
        # Swap each angle bracket for its HTML entity; copy everything else.
        if i == '>':
            outchar = '&gt;'
        elif i == '<':
            outchar = '&lt;'
        else:
            outchar = i
        output_string += outchar
    return output_string
Then calling
sanitize('<3 because I am > all of you')
yields
'&lt;3 because I am &gt; all of you'
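For what the question was originally reaching for, here is a minimal sketch that assumes Python 3, where str.maketrans accepts a dict whose values may be strings of any length. The names ESCAPE_TABLE and sanitize_translate are illustrative and do not come from the answers above. Escaping '&' as well is an added assumption, so that text which already contains entities stays unambiguous.

```python
# Python 3 only: a dict passed to str.maketrans may map one character
# to a replacement string of any length.
ESCAPE_TABLE = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
})


def sanitize_translate(text):
    """Escape &, < and > in a single pass over the string."""
    return text.translate(ESCAPE_TABLE)


print(sanitize_translate("<3 because I am > all of you"))
# prints: &lt;3 because I am &gt; all of you
```

Because each character is looked up exactly once, the inserted '&' of an entity is never escaped again, which is the pitfall of chaining replace() calls in the wrong order.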
Use html.escape(); cgi.escape() is deprecated in Python 3 (and was removed in Python 3.8):
import html
user_input = '<>&'  # avoid shadowing the builtin input()
output = html.escape(user_input)
print(output)
&lt;&gt;&amp;
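One detail worth knowing when moving from cgi.escape() to html.escape() is that the defaults differ: html.escape() also escapes both quote characters unless quote=False is passed. The following is a small sketch of that behaviour; the sample string and variable names are made up for illustration.

```python
import html

sample = '<a href="x">Tom & Jerry\'s</a>'  # hypothetical user input

# Default (quote=True): &, <, > plus both quote characters are escaped.
escaped_all = html.escape(sample)

# quote=False: only &, < and > are escaped, quotes are left alone.
escaped_no_quotes = html.escape(sample, quote=False)

print(escaped_all)
print(escaped_no_quotes)

# html.unescape reverses the escaping.
assert html.unescape(escaped_all) == sample
```

The default is the safer choice when the escaped text may end up inside an HTML attribute value; quote=False is enough for text placed between tags.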