Nginx rewrite rule omits the forward slash (/) when it is the first character of the encoded URL part

I'm trying to use an Nginx (1.21.3) rewrite, but it somehow drops the first / character of the string.

The rewrite rule:

#nginx not relevant conf here
location / {
    rewrite ^(.*)data/([0-9]+)/(.+)?$ $1processor.php?key=$2&data=$3 last;
}
#nginx not relevant conf here

This rewrite rule works fine for every other URL I have tested. But when I try a URL like the one below, it somehow drops the leading /

https://example.com/data/9/%2F*-%2B.%60!%40%23%24%25%5E%26*()_%2B%60-%3D%5B%5D%3B%27%5C%2C.%2F%7B%7D%3A%22%7C%3C%3E%3F

When I reload nginx with the notice log level and rewrite_log on; enabled, I get this output:

2021/09/25 13:08:29 [notice] 528#528: *11710 "^(.*)data/([0-9]+)/(.+)?$" matches "/data/199/*-+.`!@#$%^&*()_+`-=[];'\,./{}:"|<>?", client: 192.168.255.107, server: localhost, request: "GET /data/199/%2F%2A-%2B.%60%21%40%23%24%25%5E%26%2A%28%29_%2B%60-%3D%5B%5D%3B%27%5C%2C.%2F%7B%7D%3A%22%7C%3C%3E%3F HTTP/2.0", host: "example.com", referrer: "https://example.com/"

The PHP (8.0.10) $_GET["data"] output is (as you can see, the leading / is missing, so it is not an exact match):

*-+.`!@#$%^&*()_+`-=[];'\,./{}:"|<>?

How can I fix this?

Both the rewrite and location directives work with the so-called normalized URI:

The matching is performed against a normalized URI, after decoding the text encoded in the “%XX” form, resolving references to relative path components “.” and “..”, and possible compression of two or more adjacent slashes into a single slash.

This means that in the first stage your URL /data/9/%2F*-%2B.%60!%40%23%24%25%5E%26*()_%2B%60-%3D%5B%5D%3B%27%5C%2C.%2F%7B%7D%3A%22%7C%3C%3E%3F is URL-decoded:

/data/9//*-+.`!@#$%^&*()_+`-=[];'\,./{}:"|<>?

and in the second stage the two adjacent slashes are compressed into one:

/data/9/*-+.`!@#$%^&*()_+`-=[];'\,./{}:"|<>?
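
The two stages can be reproduced outside nginx. The following is only a rough Python sketch, not nginx's actual implementation: the function name normalize_uri is hypothetical, urllib.parse.unquote is assumed to be a close enough stand-in for nginx's %XX decoding, and the resolution of "." and ".." components is left out.

```python
import re
from urllib.parse import unquote


def normalize_uri(raw_path: str) -> str:
    """Approximate the two normalization stages described above (illustration only)."""
    decoded = unquote(raw_path)            # stage 1: decode %XX sequences
    return re.sub(r"/{2,}", "/", decoded)  # stage 2: merge adjacent slashes


raw = ("/data/9/%2F*-%2B.%60!%40%23%24%25%5E%26*()_%2B%60-%3D%5B%5D%3B"
       "%27%5C%2C.%2F%7B%7D%3A%22%7C%3C%3E%3F")
print(normalize_uri(raw))
```

The encoded %2F becomes a second slash right after /data/9/, and the merge step then collapses the pair, which is where the leading slash of the data value is lost.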

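The string above is what actually gets tested against the rewrite directive, which is why the first URL-encoded slash is missing. However, you can use the $request_uri variable, which contains the request URI in its unmodified form. You can use the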

if ($request_uri ~ ^(?<prefix>.*/)data/(?<key>\d+)/(?<data>[^?]+)) {
    rewrite ^ ${prefix}processor.php?key=$key&data=$data;
}

block, placed in the server context, or the

location / {
    if ($request_uri ~ ^(?<prefix>.*/)data/(?<key>\d+)/(?<data>[^?]+)) {
        rewrite ^ ${prefix}processor.php?key=$key&data=$data last;
    }
    ...
}

block, placed in the location context.
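
To see what the named captures would hold, the same regular expression can be tried against the raw request URI in Python. This is a sketch under assumptions: Python spells named groups (?P<name>...) where nginx's PCRE accepts (?<name>...), the sample URI with its ?x=1 query string is made up for illustration, and unquote_plus is used as an approximation of how PHP decodes a query string value into $_GET.

```python
import re
from urllib.parse import unquote_plus

# Same pattern as in the if block above, using Python's named-group syntax.
PATTERN = re.compile(r"^(?P<prefix>.*/)data/(?P<key>\d+)/(?P<data>[^?]+)")

# Hypothetical raw request URI; nginx exposes this unmodified as $request_uri.
request_uri = "/data/9/%2F*-%2B.%60!%40%23?x=1"

m = PATTERN.match(request_uri)
prefix, key, data = m.group("prefix"), m.group("key"), m.group("data")

print(prefix)              # part before "data/"
print(key)                 # numeric key
print(data)                # still percent-encoded, %2F intact
print(unquote_plus(data))  # roughly what PHP would place in $_GET["data"]
```

Because the match runs on the undecoded string, the %2F survives into the data argument and is only decoded once, on the PHP side.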