Configuring uWSGI to not interpret PATH_INFO

How can I configure uwsgi to pass the request path through unmodified as PATH_INFO? I.e., if there is a request for https://example.com/foo%5F/../%5Fbar?x=y, I want PATH_INFO to be literally /foo%5F/../%5Fbar, and not /_bar.

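The uWSGI documentation says uWSGI can rewrite request variables in lots of advanced ways, but I can't find any way to set an individual request variable, at least not without modifying uwsgi's source code.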

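The reason I want this is that I have a frontend application that takes user input and then sends a request to http://backend.app/get/USER_INPUT. The trouble is that there is a uwsgi in between, and when the user input is ../admin/delete-everything, the request goes to http://backend.app/admin/delete-everything!
(This uwsgi change would not be the only fix: the frontend app should of course validate user input, and the backend app shouldn't expose /admin to the frontend app in the first place. But as defense in depth, I would like my requests to pass through uwsgi unmodified.)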

I am running bare uWSGI without nginx, i.e. uwsgi --http 0.0.0.0:8000 --wsgi-file myapp/wsgi.py --master --processes 8 --threads 2

For what it's worth, the backend application that looks at PATH_INFO is Django.
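To see what uWSGI actually hands to the application, a tiny diagnostic WSGI app can help. This is only a sketch: the file name echo_wsgi.py is hypothetical, and it assumes the server sets the non-standard REQUEST_URI variable (uWSGI does; other servers may not).

```python
# echo_wsgi.py (hypothetical file name): report the path variables the server passes in.
def application(environ, start_response):
    # PATH_INFO is the standard WSGI variable; REQUEST_URI is a server-specific
    # extra, so .get() is used in case it is absent.
    body = "PATH_INFO={!r}\nREQUEST_URI={!r}\n".format(
        environ.get("PATH_INFO"),
        environ.get("REQUEST_URI"),
    ).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Pointing --wsgi-file at this file and requesting a path with curl --path-as-is should show whether the two variables differ for a given request.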

So your problem has almost nothing to do with uwsgi or Django as such. To demonstrate the issue, I created a simple Flask app with a catch-all handler

from flask import Flask
app = Flask(__name__)

@app.route('/', defaults={'path': ''})
@app.route('/<path:path>')
def catch_all(path):
    return 'You want path: %s' % path

if __name__ == '__main__':
    app.run()

Now when you run this and make a curl request

$ curl -v http://127.0.0.1:5000/tarun/../lalwani
* Rebuilt URL to: http://127.0.0.1:5000/lalwani
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 5000 (#0)
> GET /lalwani HTTP/1.1
> Host: 127.0.0.1:5000
> User-Agent: curl/7.54.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text/html; charset=utf-8
< Content-Length: 22
< Server: Werkzeug/0.15.2 Python/3.7.3
< Date: Fri, 26 Jul 2019 07:45:16 GMT
<
* Closing connection 0
You want path: lalwani%

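As you can see, the server never even got a chance to know that we had requested /tarun/../lalwani, because curl rewrote the URL before sending it. Now let's do it again and ask curl not to tamper with the URL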

$ curl -v --path-as-is http://127.0.0.1:5000/tarun/../lalwani
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 5000 (#0)
> GET /tarun/../lalwani HTTP/1.1
> Host: 127.0.0.1:5000
> User-Agent: curl/7.54.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text/html; charset=utf-8
< Content-Length: 31
< Server: Werkzeug/0.15.2 Python/3.7.3
< Date: Fri, 26 Jul 2019 07:48:17 GMT
<
* Closing connection 0
You want path: tarun/../lalwani%

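Now you can see that my app did receive the actual path. Now let's look at the same case in a browser, with the app not even running.

Even though my service is not even running, the browser itself rewrites the call to /lalwani instead of /tarun/../lalwani. So there is nothing you can do on your end unless the client you are using supports disabling URL resolution at the source.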
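The rewriting that curl and browsers perform here is dot-segment removal as described in RFC 3986. A simplified sketch for absolute paths, assuming no query string or fragment is attached (the function name is made up for illustration and is not part of curl or any library):

```python
def remove_dot_segments(path):
    """Collapse "." and ".." segments of an absolute URL path."""
    segments = path.split("/")
    out = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg == ".":
            # "." is dropped; a trailing "." leaves a trailing slash.
            if is_last:
                out.append("")
        elif seg == "..":
            # ".." removes the previous segment but never climbs above the root.
            if len(out) > 1:
                out.pop()
            if is_last:
                out.append("")
        else:
            out.append(seg)
    return "/".join(out)


print(remove_dot_segments("/tarun/../lalwani"))
```

Because this happens inside the client, the server only ever sees the collapsed path.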

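My previous answer applies to clients that resolve the URL at the source. This answer applies when the correct, unmodified request actually reaches the server.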

uwsgi runs wsgi.py, and the application object in it is used as the callable. In the case of Django this is a WSGIHandler, which has the following code

class WSGIHandler(base.BaseHandler):
    request_class = WSGIRequest

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.load_middleware()

    def __call__(self, environ, start_response):
        set_script_prefix(get_script_name(environ))
        signals.request_started.send(sender=self.__class__, environ=environ)
        print(environ)
        request = self.request_class(environ)
        response = self.get_response(request)

        response._handler_class = self.__class__

        status = '%d %s' % (response.status_code, response.reason_phrase)
        response_headers = [
            *response.items(),
            *(('Set-Cookie', c.output(header='')) for c in response.cookies.values()),
        ]
        start_response(status, response_headers)
        if getattr(response, 'file_to_stream', None) is not None and environ.get('wsgi.file_wrapper'):
            response = environ['wsgi.file_wrapper'](response.file_to_stream)
        return response

I created sample views to test this

from django.http import HttpResponse


def index(request, **kwargs):
    return HttpResponse("Hello, world. You're at the polls index. " + request.environ['PATH_INFO'])

def index2(request, **kwargs):
    return HttpResponse("Hello, world. You're at the polls index2. " + request.environ['PATH_INFO'])

and registered them using the following code

from django.urls import include, path
from polls.views import index2, index

urlpatterns = [
    path('polls2/', index2, name='index2'),
    path('polls2/<path:resource>', index2, name='index2'),
    path('polls/', index, name='index'),
    path('polls/<path:resource>', index, name='index'),
]

So what you need to do is override this class. Below is an example

import django

from django.core.handlers.wsgi import WSGIHandler


class MyWSGIHandler(WSGIHandler):
    def get_response(self, request):
        request.environ['ORIGINAL_PATH_INFO'] = request.environ['PATH_INFO']
        request.environ['PATH_INFO'] = request.environ['REQUEST_URI']

        return super(MyWSGIHandler, self).get_response(request)


def get_wsgi_application():
    """
    The public interface to Django's WSGI support. Should return a WSGI
    callable.

    Allows us to avoid making django.core.handlers.WSGIHandler public API, in
    case the internal WSGI implementation changes or moves in the future.
    """
    django.setup()
    return MyWSGIHandler()

application = get_wsgi_application()

After doing that, you can see the following results

$ curl --path-as-is "http://127.0.0.1:8000/polls/"
Hello, world. You're at the polls index. /polls/

$ curl --path-as-is "http://127.0.0.1:8000/polls2/"
Hello, world. You're at the polls index2. /polls2/

$ curl "http://127.0.0.1:8000/polls2/../polls/"
Hello, world. You're at the polls index. /polls/

$ curl --path-as-is "http://127.0.0.1:8000/polls2/../polls/"
Hello, world. You're at the polls index. /polls2/../polls/%

As you can see, changing PATH_INFO does not change which view is selected: the request for /polls2/../polls/ still picks the index function.

After digging further, I realized that the request also has path and path_info attributes. The view is selected using path_info.

So we update our handler as below

class MyWSGIHandler(WSGIHandler):
    def get_response(self, request):
        request.environ['ORIGINAL_PATH_INFO'] = request.environ['PATH_INFO']
        request.environ['PATH_INFO'] = request.environ.get('REQUEST_URI', request.environ['ORIGINAL_PATH_INFO'])
        request.path = request.environ['PATH_INFO']
        request.path_info = request.environ.get('REQUEST_URI', request.environ['PATH_INFO'])
        return super(MyWSGIHandler, self).get_response(request)

After this change, we get the desired result

$ curl --path-as-is "http://127.0.0.1:8000/polls2/../polls/"
Hello, world. You're at the polls index2. /polls2/../polls/
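One caveat with the handler above: on uWSGI, REQUEST_URI normally carries the query string too, so a request such as /polls2/../polls/?x=y would put ?x=y into PATH_INFO. A possible alternative, sketched here as plain WSGI middleware (the name raw_path_middleware is made up for illustration), rewrites PATH_INFO before Django builds its request object and strips the query string first. It assumes the server provides REQUEST_URI and leaves the environ untouched when it does not.

```python
def raw_path_middleware(app):
    """Wrap a WSGI app so PATH_INFO holds the raw, unnormalized request path."""
    def wrapper(environ, start_response):
        raw = environ.get("REQUEST_URI")  # non-standard; set by uWSGI
        if raw is not None:
            # Keep the server-normalized value around for debugging.
            environ["ORIGINAL_PATH_INFO"] = environ.get("PATH_INFO", "")
            # Drop the query string; it is already available as QUERY_STRING.
            environ["PATH_INFO"] = raw.split("?", 1)[0]
        return app(environ, start_response)
    return wrapper


# Hypothetical wiring in wsgi.py:
# from django.core.wsgi import get_wsgi_application
# application = raw_path_middleware(get_wsgi_application())
```

Since the path is taken from REQUEST_URI verbatim, it stays percent-encoded (for example %5F is not decoded), so URL patterns and views need to expect that.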