Getting memory limit exceeded error on App Engine when doing a db clean

I have the following code, which I run weekly via a cron job to purge old database entries. After 3-4 minutes I get: Exceeded soft private memory limit of 128 MB with 189 MB after servicing 1006 requests total

Then there is also this message: "While handling this request, the process that handled this request was found to be using too much memory and was terminated. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may have a memory leak in your application." Here is the code:

from google.appengine.ext import ndb

def clean_user_older_stories(user):
  # Keep the 200 newest stories per user; fetch the keys of the next
  # 500 and delete them in one batch.
  stories = Story.query(Story.user == user.key).order(-Story.created_time).fetch(offset=200, limit=500, keys_only=True)
  print 'stories len ' + str(len(stories))
  ndb.delete_multi(stories)


def clean_older_stories():
  for user in User.query():
    clean_user_older_stories(user)

I imagine there is a better way to handle this. How should I approach it?

Have you tried making your User query a keys_only query? You aren't using any User properties other than the key, so fetching keys only would help reduce memory usage.
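For example, a minimal sketch of that change (assuming the body of clean_user_older_stories is inlined or adjusted to take a key, since a keys-only query yields keys rather than entities):

from google.appengine.ext import ndb

def clean_older_stories():
  # Iterate over User keys only; full User entities are never loaded,
  # which keeps per-request memory low. Story.user is compared against
  # the key directly.
  for user_key in User.query().iter(keys_only=True):
    stories = Story.query(Story.user == user_key).order(-Story.created_time).fetch(offset=200, limit=500, keys_only=True)
    ndb.delete_multi(stories)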

You should paginate through large queries by setting a page size and using a Cursor.

Your handler can re-invoke itself via the task queue with the next cursor until it reaches the end of the result set. Optionally, you can use the deferred API to cut down on boilerplate code for this kind of task; see the sketch below.
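A minimal sketch of that pattern, assuming the deferred builtin is enabled in app.yaml and that clean_user_older_stories is adjusted to accept a user key (PAGE_SIZE is an illustrative value):

from google.appengine.ext import deferred
from google.appengine.datastore.datastore_query import Cursor

PAGE_SIZE = 100

def clean_older_stories(cursor_urlsafe=None):
  # Process one page of users per task invocation, then re-enqueue
  # this function with the next cursor until the result set is done.
  cursor = Cursor(urlsafe=cursor_urlsafe) if cursor_urlsafe else None
  user_keys, next_cursor, more = User.query().fetch_page(
      PAGE_SIZE, start_cursor=cursor, keys_only=True)
  for user_key in user_keys:
    clean_user_older_stories(user_key)  # assumed to take a key here
  if more and next_cursor:
    deferred.defer(clean_older_stories, next_cursor.urlsafe())

Each task stays well under the memory and deadline limits because it only ever touches one page of users before handing off.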

That said, the 'join' you are doing between User and Story may make this challenging. I would page over Users first, because from what you describe the number of users grows over time, while the number of stories per user is bounded.

This is caused by NDB's in-context cache:

When executing long-running queries in background tasks, it's possible for the in-context cache to consume large amounts of memory. This is because the cache keeps a copy of every entity that is retrieved or stored in the current context.

Try disabling the cache:

To avoid memory exceptions in long-running tasks, you can disable the cache or set a policy that excludes whichever entities are consuming the most memory.

ctx = ndb.get_context()  # note: get_context is a function and must be called
ctx.set_cache_policy(False)     # disable the in-context cache
ctx.set_memcache_policy(False)  # skip memcache for NDB operations as well
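For example, applied to the cron job from the question, the policy change could sit at the top of the handler (a sketch only; combining it with cursor pagination as described above would reduce memory further):

from google.appengine.ext import ndb

def clean_older_stories():
  # Turn off NDB's caches before the long scan so the User entities
  # retrieved by the query are not retained for the whole request.
  ctx = ndb.get_context()
  ctx.set_cache_policy(False)
  ctx.set_memcache_policy(False)
  for user in User.query():
    clean_user_older_stories(user)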