How do I handle UTF-8 vs. punycode issues in Django's CSRF middleware?
I have a domain containing non-ASCII characters, similar to http://blå.no. The domain is registered using its punycode equivalent:
xn--bl-zia.no
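The relationship between the two spellings can be checked with Python's built-in idna codec. This is a small sketch in Python 3 (the code elsewhere in this post is Python 2); the variable names are only illustrative.

```python
# Python 3 sketch: convert between the Unicode and punycode (IDNA) forms of a host.
unicode_host = "bl\u00e5.no"  # "blå.no"

# str -> ASCII-compatible form; the codec returns bytes, so decode to get text.
ascii_host = unicode_host.encode("idna").decode("ascii")
print(ascii_host)  # xn--bl-zia.no

# The conversion is reversible.
round_trip = ascii_host.encode("ascii").decode("idna")
print(round_trip == unicode_host)
```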
This is also what is configured in the Apache virtual host:
<VirtualHost *:443>
ServerName xn--bl-zia.no
...
The problem I'm seeing comes from requests containing the following:
'HTTP_USER_AGENT': 'Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko',
'HTTP_HOST': 'xn--bl-zia.no',
'SERVER_NAME': 'xn--bl-zia.no',
'HTTP_REFERER': 'https://bl\xc3\xa5.no/login/ka/?next=/start-exam/participant-login/',
'HTTP_X_REQUESTED_WITH': 'XMLHttpRequest',
I.e., the Referer is sent as UTF-8 rather than punycode. The exception I get is:
Traceback (most recent call last):
  File "/srv/cleanup-project/venv/dev/lib/python2.7/site-packages/django/core/handlers/base.py", line 153, in get_response
    response = callback(request, **param_dict)
  File "/srv/cleanup-project/venv/dev/lib/python2.7/site-packages/django/utils/decorators.py", line 87, in _wrapped_view
    result = middleware.process_view(request, view_func, args, kwargs)
  File "/srv/cleanup-project/venv/dev/lib/python2.7/site-packages/django/middleware/csrf.py", line 157, in process_view
    reason = REASON_BAD_REFERER % (referer, good_referer)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 10: ordinal not in range(128)
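The error is consistent with Python 2 implicitly decoding the byte-string referer as ASCII while formatting the message, presumably because REASON_BAD_REFERER is a unicode string. A minimal Python 3 sketch that performs the same decode explicitly (the shortened referer and the variable names are just for illustration):

```python
# Python 3 sketch: Python 2 does this ASCII decode implicitly when a byte
# string is interpolated into a unicode format string.
raw_referer = b"https://bl\xc3\xa5.no/login/ka/"

error = None
try:
    raw_referer.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc  # keep a reference; "exc" is cleared when the block ends
    print(exc)

# Decoding the same bytes as UTF-8 succeeds.
decoded = raw_referer.decode("utf-8")
print(decoded)
```

The failing byte sits right after "https://bl", which matches the position reported in the traceback.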
The relevant code in csrf.py is:
good_referer = 'https://%s/' % request.get_host()
if not same_origin(referer, good_referer):
    reason = REASON_BAD_REFERER % (referer, good_referer)
(get_host() takes the host from the request: normally the Host header, falling back to SERVER_NAME. Both are xn--bl-zia.no here.)
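To illustrate why the origin check fails before the error message is even built, here is a rough Python 3 sketch. simple_same_origin is a hypothetical stand-in that compares scheme, host and port; it is not Django's actual same_origin implementation.

```python
from urllib.parse import urlsplit


def simple_same_origin(url1, url2):
    """Hypothetical stand-in: two URLs match if scheme, host and port are equal."""
    a, b = urlsplit(url1), urlsplit(url2)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)


# What the middleware builds from the host seen by the server.
good_referer = "https://%s/" % "xn--bl-zia.no"

# The referer as sent by the browser (UTF-8 host) and its punycode spelling.
utf8_referer = "https://bl\u00e5.no/login/ka/?next=/start-exam/participant-login/"
puny_referer = "https://xn--bl-zia.no/login/ka/?next=/start-exam/participant-login/"

print(simple_same_origin(utf8_referer, good_referer))  # False
print(simple_same_origin(puny_referer, good_referer))  # True
```

The two spellings name the same domain, but a plain string comparison of the hosts treats them as different origins.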
Is there a native Django way to handle this, or do I need to write a middleware that converts the domain part of the Referer header from UTF-8 to punycode?
Here is a middleware solution:
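# Python 2 code (urlparse module, byte-string headers).
import urlparse


class PunyCodeU8RefererFixerMiddleware(object):
    def process_request(self, request):
        servername = request.META['SERVER_NAME']
        # Only act when the server itself is configured with a punycode name.
        if 'xn--' not in servername:
            return None

        referer = request.META.get("HTTP_REFERER")
        if not referer:
            return None

        url = urlparse.urlparse(referer)
        try:
            # The header is a byte string; decode the host part as UTF-8.
            netloc = url.netloc.decode('u8')
        except UnicodeDecodeError:
            return None

        def isascii(txt):
            return all(ord(ch) < 128 for ch in txt)

        # Convert each non-ASCII label to its punycode ("xn--") form.
        netloc = '.'.join([
            str(p) if isascii(p) else 'xn--' + p.encode('punycode')
            for p in netloc.split('.')
        ])

        # Put the rewritten referer back so the CSRF check sees ASCII only.
        url = url._replace(netloc=netloc)
        request.META['HTTP_REFERER'] = urlparse.urlunparse(url)
        return None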
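It tries to exit as early as possible when it detects that it cannot do anything useful. It must, of course, be installed before the CSRF middleware.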
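For anyone on Python 3, the host conversion could be sketched as a standalone function along these lines. This is only an assumption-laden sketch, not a drop-in replacement: punycode_referer is a hypothetical helper name, it uses the idna codec instead of the raw punycode codec, it assumes the referer has already been decoded to text, and it drops any user:password part of the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def punycode_referer(referer):
    """Return the referer with a non-ASCII host rewritten in IDNA/punycode form.

    Hypothetical helper; the input is returned unchanged whenever there is
    nothing to convert or the host cannot be encoded.
    """
    parts = urlsplit(referer)
    host = parts.hostname
    if host is None or all(ord(ch) < 128 for ch in host):
        return referer  # already ASCII, nothing to do

    try:
        ascii_host = host.encode("idna").decode("ascii")
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:  # UnicodeError is a subclass of ValueError
        return referer

    # Rebuild the netloc from host and optional port (userinfo is dropped).
    netloc = ascii_host if port is None else "%s:%d" % (ascii_host, port)
    return urlunsplit(parts._replace(netloc=netloc))


fixed = punycode_referer(
    "https://bl\u00e5.no/login/ka/?next=/start-exam/participant-login/"
)
print(fixed)
```

A middleware would call such a helper on the Referer header and write the result back, in the same position (before the CSRF middleware) as the class above.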