How does Django's per-site cache work exactly?
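I always thought the per-site cache was refreshed only when it expired, and other people thought so too. However, I got a different result after running some tests on my website. My website, you see, is a typical blog.
Here is how I tested:
I started memcached with memcached -vv so that I could see what was happening inside memcached.
Then I performed a few operations:
visit homepage -> visit homepage -> update homepage article -> visit homepage
cache set         nothing           nothing                    cache set (strange!)
The homepage did get updated on my last visit.
My cache timeout is 600 seconds, so I can assure you that the second cache set has nothing to do with expiry. (I actually repeated this several times, with the same result every time.)
So what explains this? The documentation doesn't say much about it. Or is my testing method wrong?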
You can figure it out by reading the source of process_response in the UpdateCacheMiddleware middleware:
def process_response(self, request, response):
    """Sets the cache, if needed."""
    if not self._should_update_cache(request, response):
        # We don't need to update the cache, just return.
        return response

    if response.streaming or response.status_code != 200:
        return response

    # Don't cache responses that set a user-specific (and maybe security
    # sensitive) cookie in response to a cookie-less request.
    if not request.COOKIES and response.cookies and has_vary_header(response, 'Cookie'):
        return response

    # Try to get the timeout from the "max-age" section of the "Cache-
    # Control" header before reverting to using the default cache_timeout
    # length.
    timeout = get_max_age(response)
    if timeout is None:
        timeout = self.cache_timeout
    elif timeout == 0:
        # max-age was set to 0, don't bother caching.
        return response
    patch_response_headers(response, timeout)
    if timeout:
        cache_key = learn_cache_key(request, response, timeout, self.key_prefix, cache=self.cache)
        if hasattr(response, 'render') and callable(response.render):
            response.add_post_render_callback(
                lambda r: self.cache.set(cache_key, r, timeout)
            )
        else:
            self.cache.set(cache_key, response, timeout)
    return response
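First check whether your blog is meant to have authenticated users, because of the condition if not request.COOKIES and response.cookies and has_vary_header(response, 'Cookie'). When it is true, that is, when a cookie-less request gets a response that sets cookies and varies on Cookie, the response is returned without being stored, so the cache won't work for that request.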
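To make the early exits in process_response easier to follow, here is a minimal pure-Python sketch of the same decision logic as the snippet quoted above. It does not import Django: the function name effective_cache_timeout and its flat parameters are made up for illustration and stand in for the request and response objects, and the _should_update_cache check is left out.

```python
def effective_cache_timeout(status_code, streaming, request_has_cookies,
                            response_sets_cookies, varies_on_cookie,
                            max_age, default_timeout):
    """Return the timeout a response would be stored with, or None if it is skipped.

    Hypothetical helper that mirrors the branches of the quoted
    process_response; it is not part of Django.
    """
    # Streaming responses and non-200 responses are never stored.
    if streaming or status_code != 200:
        return None
    # A cookie-less request answered with Set-Cookie plus "Vary: Cookie"
    # is treated as user-specific and is not stored.
    if not request_has_cookies and response_sets_cookies and varies_on_cookie:
        return None
    # max-age from Cache-Control wins over the middleware default.
    timeout = max_age
    if timeout is None:
        timeout = default_timeout
    elif timeout == 0:
        return None
    # A falsy default timeout also means "do not store".
    return timeout or None


if __name__ == "__main__":
    # Plain 200 response, no cookies involved, 600 second default.
    print(effective_cache_timeout(200, False, False, False, False, None, 600))
    # First visit that sets a session cookie and varies on Cookie.
    print(effective_cache_timeout(200, False, False, True, True, None, 600))
```

With the 600 second default from the question, the first call yields 600 and the second yields None, which is the case the cookie check is guarding against.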
If there are no authenticated users, then you have to look at learn_cache_key, the function that computes the key for the cached view:
def learn_cache_key(request, response, cache_timeout=None, key_prefix=None, cache=None):
    """
    Learns what headers to take into account for some request URL from the
    response object. It stores those headers in a global URL registry so that
    later access to that URL will know what headers to take into account
    without building the response object itself. The headers are named in the
    Vary header of the response, but we want to prevent response generation.

    The list of headers to use for cache key generation is stored in the same
    cache as the pages themselves. If the cache ages some data out of the
    cache, this just means that we have to build the response once to get at
    the Vary header and so at the list of headers to use for the cache key.
    """
    if key_prefix is None:
        key_prefix = settings.CACHE_MIDDLEWARE_KEY_PREFIX
    if cache_timeout is None:
        cache_timeout = settings.CACHE_MIDDLEWARE_SECONDS
    cache_key = _generate_cache_header_key(key_prefix, request)
    if cache is None:
        cache = caches[settings.CACHE_MIDDLEWARE_ALIAS]
    if response.has_header('Vary'):
        is_accept_language_redundant = settings.USE_I18N or settings.USE_L10N
        # If i18n or l10n are used, the generated cache key will be suffixed
        # with the current locale. Adding the raw value of Accept-Language is
        # redundant in that case and would result in storing the same content
        # under multiple keys in the cache. See #18191 for details.
        headerlist = []
        for header in cc_delim_re.split(response['Vary']):
            header = header.upper().replace('-', '_')
            if header == 'ACCEPT_LANGUAGE' and is_accept_language_redundant:
                continue
            headerlist.append('HTTP_' + header)
        headerlist.sort()
        cache.set(cache_key, headerlist, cache_timeout)
        return _generate_cache_key(request, request.method, headerlist, key_prefix)
    else:
        # if there is no Vary header, we still need a cache key
        # for the request.build_absolute_uri()
        cache.set(cache_key, [], cache_timeout)
        return _generate_cache_key(request, request.method, [], key_prefix)
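Note that Django inspects the response['Vary'] header and creates a key variant for each header named in it.
So first check whether your response adds a Vary header and, if so, which value it has. If that value differs from the one in place when the entry was saved, you will know why the page was not served from the cache.
Also note that if multi-language support is active, Django generates a different key for each language as well.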
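As a rough illustration of how the Vary header feeds into the key, the loop from learn_cache_key can be pulled out into a standalone function. This is only a sketch: vary_to_meta_keys is a made-up name, and the comma-splitting pattern is my assumption of what cc_delim_re looks like, not something shown in the quoted source.

```python
import re

# Assumed equivalent of Django's cc_delim_re: split on commas, ignoring spaces.
CC_DELIM_RE = re.compile(r"\s*,\s*")


def vary_to_meta_keys(vary, accept_language_redundant=True):
    """Turn a Vary header value into the sorted request.META names used for the key.

    Hypothetical helper mirroring the loop in the quoted learn_cache_key.
    """
    keys = []
    for header in CC_DELIM_RE.split(vary):
        header = header.upper().replace("-", "_")
        # With i18n/l10n on, the locale suffix already covers the language.
        if header == "ACCEPT_LANGUAGE" and accept_language_redundant:
            continue
        keys.append("HTTP_" + header)
    keys.sort()
    return keys


if __name__ == "__main__":
    print(vary_to_meta_keys("Cookie, Accept-Language"))
    print(vary_to_meta_keys("Cookie, Accept-Encoding"))
```

Two requests to the same URL only share a cached page if the values of every header in this list match, so a view or middleware that adds Cookie to Vary effectively gives each distinct cookie value its own cache entry.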
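When does Django add or modify the Vary header?
The Django docs describe when and how this happens, and also how to control caching from within views.
Finally, keep in mind that if your blog uses third-party apps, they may use cache control mechanisms of their own to refresh the cache when data changes, for example when an entry is updated.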
Django version
Django >= 1.7 uses the fully-qualified URL, so if you are on one of these versions, also take the host (HOSTS) into account:
Changed in Django 1.7:
Cache keys use the request's fully-qualified URL rather than just the path and query string.
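A small sketch of why the host matters once the fully-qualified URL goes into the key. It assumes the URL part of the key is an MD5 hex digest of the absolute URL, which is what the 32-character hashes in memcached keys suggest; url_hash is a made-up helper, the example.com URLs are placeholders, and any URL normalisation Django performs before hashing is not reproduced here.

```python
import hashlib


def url_hash(absolute_url):
    """MD5 hex digest of an absolute URL (hypothetical stand-in for Django's URL hash)."""
    return hashlib.md5(absolute_url.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Same path, different host: the digests differ, so the cache keys differ.
    print(url_hash("http://example.com/"))
    print(url_hash("http://www.example.com/"))
```

Under that assumption, the same page reached through two host names ends up under two separate cache entries, each with its own timeout.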
Well, it turns out my test was flawed. I tested again, carefully, at a time when nobody else was visiting my site. Here is the output of memcached -vv:
# first visit
<30 new auto-negotiating client connection
30: Client using the ascii protocol
<30 get :1:views.decorators.cache.cache_header..f384b899ecab7abd6fb0a567608b97b2.en-us.CST
>30 END
<30 set :1:views.decorators.cache.cache_header..f384b899ecab7abd6fb0a567608b97b2.en-us.CST 1 600 14
>30 STORED
<30 set :1:views.decorators.cache.cache_page..GET.f384b899ecab7abd6fb0a567608b97b2.d41d8cd98f00b204e9800998ecf8427e.en-us.CST 1 600 33215
>30 STORED
<30 connection closed.
# second visit
<30 new auto-negotiating client connection
30: Client using the ascii protocol
<30 get :1:views.decorators.cache.cache_header..f384b899ecab7abd6fb0a567608b97b2.en-us.CST
>30 sending key :1:views.decorators.cache.cache_header..f384b899ecab7abd6fb0a567608b97b2.en-us.CST
>30 END
<30 get :1:views.decorators.cache.cache_page..GET.f384b899ecab7abd6fb0a567608b97b2.d41d8cd98f00b204e9800998ecf8427e.en-us.CST
>30 sending key :1:views.decorators.cache.cache_page..GET.f384b899ecab7abd6fb0a567608b97b2.d41d8cd98f00b204e9800998ecf8427e.en-us.CST
>30 END
<30 connection closed.
# modified and save
<30 new auto-negotiating client connection
30: Client using the ascii protocol
<30 get :1:views.decorators.cache.cache_header..7029e9375fc4657a73dae1f9bddb73e5.en-us.CST
>30 END
<30 connection closed.
# visit again
<30 new auto-negotiating client connection
30: Client using the ascii protocol
<30 get :1:views.decorators.cache.cache_header..f384b899ecab7abd6fb0a567608b97b2.en-us.CST
>30 sending key :1:views.decorators.cache.cache_header..f384b899ecab7abd6fb0a567608b97b2.en-us.CST
>30 END
<30 get :1:views.decorators.cache.cache_page..GET.f384b899ecab7abd6fb0a567608b97b2.d41d8cd98f00b204e9800998ecf8427e.en-us.CST
>30 sending key :1:views.decorators.cache.cache_page..GET.f384b899ecab7abd6fb0a567608b97b2.d41d8cd98f00b204e9800998ecf8427e.en-us.CST
>30 END
<30 connection closed.
This output shows that memcached is working as expected. I'm not sure what explains the earlier output.
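For anyone reading a log like this, the cache_page keys can be taken apart with a few lines of Python. This is a sketch under an assumed layout, inferred from the log and from learn_cache_key rather than documented here: backend prefix and version separated by colons, then the middleware key prefix, the request method, a hash of the URL, a hash of the Vary header values, and a locale/time zone suffix. The name split_page_key is made up, and the split breaks if the middleware key prefix itself contains a dot.

```python
import hashlib

PAGE_KEY = (":1:views.decorators.cache.cache_page..GET."
            "f384b899ecab7abd6fb0a567608b97b2."
            "d41d8cd98f00b204e9800998ecf8427e.en-us.CST")

MARKER = "views.decorators.cache.cache_page."


def split_page_key(key):
    """Split a cache_page key from the log into its assumed components."""
    backend_prefix, version, rest = key.split(":", 2)
    if not rest.startswith(MARKER):
        raise ValueError("not a cache_page key: %r" % key)
    fields = rest[len(MARKER):].split(".")
    return {
        "backend_prefix": backend_prefix,
        "version": version,
        "middleware_prefix": fields[0],
        "method": fields[1],
        "url_hash": fields[2],
        "header_hash": fields[3],
        "locale_suffix": ".".join(fields[4:]),
        # MD5 of empty input means no header values were mixed into the key.
        "no_header_values": fields[3] == hashlib.md5(b"").hexdigest(),
    }


if __name__ == "__main__":
    for name, value in split_page_key(PAGE_KEY).items():
        print(name, "=", repr(value))
```

The header hash in these keys, d41d8cd98f00b204e9800998ecf8427e, is the MD5 of empty input, so the page key here depends only on the URL, the method and the locale suffix. The "modified and save" request looks up a different URL hash (7029e9...) and stores nothing, and the "visit again" request is answered from the entry stored on the first visit, which is what you would expect before the 600 second timeout runs out.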