Is there some caching or forking happening in `HTTPServer` or `BaseHTTPRequestHandler`?

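It may be that my code is simply wrong, but I'm finding that while I can serve GET requests from literal data, I can't update that data and have the change show up in subsequent GET requests. I also can't get POST requests to update the data.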

So it behaves as if some caching or forking is happening somewhere in Python's `HTTPServer` or `BaseHTTPRequestHandler`.

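Thanks in advance for taking a look, but, gently: no, I don't want to use modules outside the Python 3.8 standard library, or rewrite this with a completely different framework or some Flask. I think this should work, but it misbehaves and I can't work out why. If I were using the built-in libraries of C or Go, it would not be this much of a headache for me.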

To demonstrate, run the following Python implementation and load http://127.0.0.1:8081/ two or three times:

"""
A Quick test server on 8081.
"""
from http.server import HTTPServer, BaseHTTPRequestHandler
import cgi
import json
import os
import sys

ADDR = '127.0.0.1'
PORT = 8081


def run(server_class=HTTPServer, handler_class=BaseHTTPRequestHandler):
    server_address = (ADDR, PORT)
    with server_class(server_address, handler_class) as httpd:
        print("serving at", ADDR, "on", PORT, f"[ http://{ADDR}:{PORT} ]")
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            print(" stopping web server due to interrupt signal...")
            httpd.socket.close()


class SimpleHandler(BaseHTTPRequestHandler):
    """
    Implements responses to GET POST
    """

    def __init__(self, request, client_address, server):
        """Sets up the server's memory, a favicon, and one text pseudo-file."""
        self.files = {
            '/oh': ['text/plain', "It's me", ],
            '/favicon.ico': [
                'image/svg+xml',
                '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 48 48"><text y="1em" font-size="48">⁇</text></svg>',
            ],
        }
        self.head = '<link rel="icon" type="image/svg+xml" sizes="48x48" '\
                    'href="/favicon.ico">'
        super(SimpleHandler, self).__init__(request, client_address, server)

    def _set_headers(self, content_type='application/json', response=200):
        self.send_response(response)
        self.send_header("Content-type", content_type)
        self.end_headers()

    def _html(self, message, title='Simple Server', extra=""):
        """This generates HTML with `message` in the h1 of body."""
        content = f"<html><head><title>{title}</title>{self.head}</head>" \
                  f"<body><h1>{message}</h1>{extra}</body></html>"
        return content.encode("utf8")  # NOTE: must return a bytes object!

    def do_GET(self):
        """Respond to a GET request."""
        if self.path == "/":
            self._set_headers('text/html')
            fnames = [f'<li><a href="{fn}">{fn}</a></li>' for fn in self.files.keys()]
            fnames.sort()
            self.wfile.write(self._html(
                "Welcome",
                extra='Try:'
                      '<ul>'
                      '<li><a href="/hello">/hello</a></li>'
                      f'{"".join(fnames)}'
                      '</ul>'
            ))
        elif self.path == "/hello":
            self._set_headers('text/html')
            self.wfile.write(self._html("hello you"))
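        elif self.path in self.files:
            content_type, content = self.files[self.path]
            self._set_headers(content_type)  # already sends the 200 status line
            # Stored content may be str, bytes (from _html) or parsed JSON.
            if isinstance(content, str):
                content = content.encode()
            elif not isinstance(content, bytes):
                content = json.dumps(content).encode()
            self.wfile.write(content)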
        else:
            self.send_error(404)
        # Note this update doesn't seem to happen to the in memory dict.
        self.files[f"/{len(self.files)}"] = [
            "text/html", self._html(len(self.files))]

    def do_HEAD(self):
        if self.path in ["/", "/hello"]:
            self._set_headers('text/html')
        elif self.path in self.files:
            content_type, _ = self.files[self.path]
            self._set_headers(content_type)
        else:
            self.send_error(404)

    def do_POST(self):
        """Should update pseudo-files with posted file contents."""
        ctype, pdict = cgi.parse_header(
            self.headers.get('content-type', self.headers.get_content_type()))
        print("POSTED with content type", ctype)
        content = None
        if ctype == 'application/x-www-form-urlencoded':
            print(" * This application/x-www-form-urlencoded method might not work")
            content = {"content": str(self.rfile.read(int(self.headers['Content-Length'])).decode())}
        elif ctype == 'multipart/form-data':
            print(" * This multipart/form-data method might not work")
            fields = cgi.parse_multipart(self.rfile, pdict)
            content = {"content": fields.get('content')}
        elif ctype == 'application/json':
            data_string = self.rfile.read(int(self.headers['Content-Length']))
            content = json.loads(data_string)
        else:
            # Unsupported content type: reply and stop, so nothing is stored.
            self.send_error(415)
            return
        print(" * Received content:", content)
        # Note this update doesn't seem to happen to the in memory dict.
        self.files[self.path] = ['application/json', content]
        self._set_headers(response=201)
        self.wfile.write(json.dumps(content).encode())


if __name__ == '__main__':
    print('FYI:')
    print('  LANG =', os.getenv('LANG'))
    print('  Default Charset Encoding =', sys.getdefaultencoding())
    path_to_script = os.path.dirname(os.path.realpath(__file__))
    print('Serving from path:', path_to_script)
    os.chdir(path_to_script)
    run(handler_class=SimpleHandler)

Even before loading http://127.0.0.1:8081/, you can try POSTing to add something to the `self.files` dictionary. For example:

curl -v -H 'content-type: application/json' \
     --data-binary '{"this": "should work"}' http://127.0.0.1:8081/new_file

You can see the server respond and print the data it received, which should now be in `self.files`, so `/` should list it. You can mix this up with:

curl -v --data-urlencode 'content={"this": "should work"}' http://127.0.0.1:8081/new_file2

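But neither of these adds `self.files['/new_file']` or `self.files['/new_file2']`, and it is not clear why.

One should be able to request `/new_file` or `/new_file2`, but both return 404.

Given the last few lines of `do_GET`, repeated `GET /` requests should each list more items than the one before.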

$ curl http://127.0.0.1:8081
<html><head><title>Simple Server</title><link rel="icon" type="image/svg+xml" sizes="48x48" href="/favicon.ico"></head><body><h1>Welcome</h1>Try:<ul><li><a href="/hello">/hello</a></li><li><a href="/favicon.ico">/favicon.ico</a></li><li><a href="/oh">/oh</a></li></ul></body></html>
$ curl http://127.0.0.1:8081
<html><head><title>Simple Server</title><link rel="icon" type="image/svg+xml" sizes="48x48" href="/favicon.ico"></head><body><h1>Welcome</h1>Try:<ul><li><a href="/hello">/hello</a></li><li><a href="/favicon.ico">/favicon.ico</a></li><li><a href="/oh">/oh</a></li></ul></body></html>

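Moving the lines that add a new key and value to `self.files` up to the top of `do_GET` shows that the dict does get updated, but only once, which seems even stranger: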

$ curl http://127.0.0.1:8081
<html><head><title>Simple Server</title><link rel="icon" type="image/svg+xml" sizes="48x48" href="/favicon.ico"></head><body><h1>Welcome</h1>Try:<ul><li><a href="/hello">/hello</a></li><li><a href="/2">/2</a></li><li><a href="/favicon.ico">/favicon.ico</a></li><li><a href="/oh">/oh</a></li></ul></body></html>
$ curl http://127.0.0.1:8081
<html><head><title>Simple Server</title><link rel="icon" type="image/svg+xml" sizes="48x48" href="/favicon.ico"></head><body><h1>Welcome</h1>Try:<ul><li><a href="/hello">/hello</a></li><li><a href="/2">/2</a></li><li><a href="/favicon.ico">/favicon.ico</a></li><li><a href="/oh">/oh</a></li></ul></body></html>
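One way to probe the "only once" behaviour is to count how many handler objects actually get constructed. The following is a minimal sketch, not part of the server above: `ProbeHandler` and `probe` are names made up for illustration, and it binds to port 0 so the OS picks a free port. Each response body reports the number of handler instances created so far, followed by the number of paths that this particular instance has seen.

```python
import http.client
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler


class ProbeHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that reports how often it is constructed."""

    instances = 0  # class attribute: one value shared by every handler object

    def __init__(self, request, client_address, server):
        ProbeHandler.instances += 1
        self.seen = []  # instance attribute: rebuilt on every construction
        # The base __init__ handles the request, so set attributes first.
        super().__init__(request, client_address, server)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

    def do_GET(self):
        self.seen.append(self.path)
        body = f"{ProbeHandler.instances} {len(self.seen)}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def probe(n=3):
    """Start a throwaway server, send n GET requests, return the bodies."""
    ProbeHandler.instances = 0
    httpd = HTTPServer(("127.0.0.1", 0), ProbeHandler)  # port 0: any free port
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    bodies = []
    try:
        for _ in range(n):
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")
            bodies.append(conn.getresponse().read().decode())
            conn.close()
    finally:
        httpd.shutdown()
        httpd.server_close()
    return bodies


if __name__ == "__main__":
    print(probe())
```

If the first number climbs with every request while the second stays at 1, the handler is being rebuilt per request, and anything assigned to `self` in `__init__` starts from scratch each time.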

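Well, it turns out that a new `SimpleHandler` is created for every request, so I had to move `self.files` out to an outer scope and also be careful about what gets set up in `SimpleHandler.__init__`. That basically made it behave as I expected.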
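For anyone landing here later, this is roughly what that change looks like. It is a stripped-down sketch rather than the full server above: `FILES`, `FILES_LOCK`, `SharedStateHandler`, `_request` and `demo` are illustrative names, and the lock is only there on the assumption that you might later switch to `ThreadingHTTPServer`. The key point is that the dict lives at module level, so it is created once and survives across handler instances.

```python
import http.client
import json
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

# Shared store: created once at import time, not once per request.
FILES = {"/oh": ["text/plain", b"It's me"]}
FILES_LOCK = threading.Lock()  # assumption: only needed with a threaded server


class SharedStateHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that keeps its data outside of `self`."""

    def log_message(self, format, *args):
        pass  # keep the demo quiet

    def _send(self, status, content_type, body):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        with FILES_LOCK:
            entry = FILES.get(self.path)
        if entry is None:
            self.send_error(404)
            return
        content_type, body = entry
        self._send(200, content_type, body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        with FILES_LOCK:
            FILES[self.path] = ["application/json", body]
        self._send(201, "application/json", body)


def _request(port, method, path, body=None):
    """Send one request on a fresh connection; return (status, body bytes)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    headers = {"Content-Type": "application/json"} if body else {}
    conn.request(method, path, body=body, headers=headers)
    resp = conn.getresponse()
    data = resp.read()
    conn.close()
    return resp.status, data


def demo():
    """POST a pseudo-file, then GET it back from a throwaway server."""
    httpd = HTTPServer(("127.0.0.1", 0), SharedStateHandler)  # any free port
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    try:
        post_status, _ = _request(port, "POST", "/new_file",
                                  b'{"this": "should work"}')
        _, get_body = _request(port, "GET", "/new_file")
    finally:
        httpd.shutdown()
        httpd.server_close()
    return post_status, json.loads(get_body)


if __name__ == "__main__":
    print(demo())
```

A class attribute on the handler would work the same way as the module-level dict, as long as the code mutates it in place rather than rebinding it through `self`. Either way, the data now disappears only when the process exits.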