NGINX location block regex and proxy pass

Question

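I hope you're all doing well.

I'm an NGINX beginner and would like to understand the following NGINX configuration block. I'd really appreciate it if someone could help me understand it.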

location ~ ^/search/google(/.*)?$ {
  set $proxy_uri $1$is_args$args;
  proxy_pass http://google.com$proxy_uri;
}

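From an SO post, I understand the following:

For location ~ ^/search/google(/.*)?$
- ~ means that it will perform a regex search (case sensitive)
- ^/search/google means that the route should start with /search/google (e.g. http://<ip or domain>/search/google). Is there any difference if we have a trailing / at the end (e.g. http://<ip or domain>/search/google/ instead of http://<ip or domain>/search/google)?
- (/.*)?$ This is the part I'm a bit confused about.
  - Why use a () group in this case? What's the common use case for a group?
  - Why use ? in this case? Doesn't .* already match zero or more of any character, so why do we still need ?
  - Can we simply remove () and ? (e.g. /search/google/.*$) to get the same behavior as the original?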
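set $proxy_uri $1$is_args$args;
- I know that we are setting a variable named proxy_uri
- What will $1 be replaced with? Sometimes people also include $2 and so on.
- I think $is_args$args means that if there's a query string (e.g. http://<ip or domain>/search/google?fruit=apple), $is_args$args will be replaced with ?fruit=apple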
proxy_pass http://google.com$proxy_uri
- I would assume it just redirects the user to http://google.com$proxy_uri? Is that the same as an HTTP 301 redirect?

Thank you very much!

Answer 1

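As a non-native English speaker, I expected someone to answer your question in better English than mine, but since nobody has answered in the past five days, I'll try to answer it myself.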

~ means that it will perform regex search (case sensitive)

I think it is more accurate to say "performs matching against a regular expression pattern".

^/search/google means that the route should start with /search/google (e.g. http://<ip or domain>/search/google. Is there any difference if we have trailing / at the end (e.g. http://<ip or domain>/search/google/ instead of http://<ip or domain>/search/google

This is answered below.

why use () group in this case? What's the common use case of using group?

This is a numbered capturing group. The part of the string matched by this group can be referenced later as $1. A second numbered capture group, if present in the regex pattern, can be referenced as $2, and so on. There are also named capture groups, where you can use your own variable name instead of $1, $2, etc. A good example of using named capture groups is given in this ServerFault thread.
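
As a rough illustration of the named form, here is the block from the question rewritten with a named capture group. The name suffix is my own choice for this sketch and does not come from the question; nginx accepts the PCRE syntax (?<name>...) for this.

```nginx
# Same pattern as in the question, but the optional "/<any suffix>" part
# is captured under the hypothetical name "suffix" instead of $1.
location ~ ^/search/google(?<suffix>/.*)?$ {
    # $suffix plays the role that $1 plays in the numbered version.
    set $proxy_uri $suffix$is_args$args;

    # Note: because proxy_pass contains a variable and a domain name,
    # nginx resolves the name at request time, which typically requires
    # a "resolver" directive to be configured.
    proxy_pass http://google.com$proxy_uri;
}
```

A named group is mostly a readability choice: a name like $suffix is harder to mix up than $1 when the pattern contains several groups.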

By the way, the answer you referenced mentions numbered capture groups (but not named ones).

why use ? in this case? Isn't .* already includes any char zero or more, why do we still need ?

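Did you notice that our capture group is (/.*), not (.*)? This way it will match /search/google/<any suffix> but not /search/googles, etc. The question mark makes this capture group optional (so /search/google will also match our regex pattern).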
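
To make this concrete, here is the location from the question annotated with a few sample request paths and what I would expect $1 to contain for each. The sample paths are made up for illustration. This also covers the trailing slash question: both forms match, and only the value of $1 differs.

```nginx
location ~ ^/search/google(/.*)?$ {
    # Request path              matches?   $1
    # /search/google            yes        (empty string)
    # /search/google/           yes        /
    # /search/google/foo/bar    yes        /foo/bar
    # /search/googles           no         (the group must start with "/")
    #
    # Location matching is done against the path only; the query string
    # is not part of what the regex sees.
    set $proxy_uri $1$is_args$args;
    proxy_pass http://google.com$proxy_uri;
}
```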

Can we simply remove () and ? such as /search/google/.*$ to get the same behavior as the original one?

No, because we need the $1 value later. If you understood all of the above correctly, you should see that it can be either /<any suffix> or an empty string.
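
For comparison, here is a sketch of the simplified pattern from the question (keeping the ^ anchor), with the differences noted as comments. The sample paths are only illustrative.

```nginx
# Simplified variant: no group and no "?".
location ~ ^/search/google/.*$ {
    # /search/google        -> no match: the "/" after "google" is now mandatory
    # /search/google/       -> match
    # /search/google/foo    -> match
    #
    # There is no capture group in this pattern, so nothing is captured
    # for $1 and the "/<any suffix>" part is not available for building
    # $proxy_uri.
}
```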

what will $1 be replaced with, sometimes someone also include $2 and so on?

Already answered.

I think $is_args$args means that if there's a query string (i.e. http://<ip or domain>/search/google?fruit=apple, $is_args$args will be replaced with ?fruit=apple

Yes, exactly right.
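
Here is how I would expect the pieces to combine for two made-up requests, assuming the block from the question with $1 in the set directive:

```nginx
location ~ ^/search/google(/.*)?$ {
    # Request: /search/google/images?q=cat
    #   $1         = /images
    #   $is_args   = ?
    #   $args      = q=cat
    #   $proxy_uri = /images?q=cat
    #
    # Request: /search/google/images   (no query string)
    #   $is_args and $args are both empty strings,
    #   so $proxy_uri = /images
    set $proxy_uri $1$is_args$args;
    proxy_pass http://google.com$proxy_uri;
}
```

$is_args exists exactly for this purpose: it is "?" when the request has arguments and empty otherwise, so no stray "?" is appended to the URI.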

I would assume it just redirects the user to http://google.com$proxy_uri??? same as http redirect 301???

Completely wrong. The difference is briefly described here, although that answer doesn't mention that you can additionally modify the response before sending it to the client (for example, using the sub_filter module).
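
A minimal sketch contrasting the two behaviors. The /go/google path and the sub_filter strings are hypothetical and only there for illustration, and sub_filter assumes nginx was built with the sub module (--with-http_sub_module).

```nginx
# Reverse proxy: nginx itself requests the upstream, receives the response
# and relays it to the client. The client only ever talks to nginx and
# never sees a redirect.
location ~ ^/search/google(/.*)?$ {
    set $proxy_uri $1$is_args$args;
    proxy_pass http://google.com$proxy_uri;

    # Optional: rewrite parts of the proxied response body before it is
    # sent to the client (hypothetical replacement strings).
    sub_filter 'http://google.com/' '/search/google/';
    sub_filter_once off;
}

# Redirect: nginx replies with 301 and a Location header. The client then
# makes a second request directly to the target, and nginx is no longer
# involved in that exchange.
location ~ ^/go/google(/.*)?$ {
    return 301 http://google.com$1$is_args$args;
}
```

With the redirect, the URL in the browser changes to the target and nginx has no chance to touch the response; with proxy_pass, the URL stays the same and the response passes through nginx.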

regex

nginx

proxypass