Get all of a collection's document IDs in RavenDB for a "per-document" modification

Question

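I'm currently trying to update documents in a RavenDB database. The problem is that I have a method that updates a single document, but it takes the document's ID as a parameter. I'm using Python, so pyravendb is my interface.

The method looks like this: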

def updateDocument(self, id, newAttribute):
    # Load one document by ID, set the attribute, and persist the change.
    with self.store.open_session() as session:
        doc = session.load(id)
        doc.newAttribute = newAttribute
        session.save_changes()

My idea is to run a simple for loop over all the IDs in the target collection and call updateDocument for each one.
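The loop described above could be sketched roughly as follows. This is only an illustration: `DocumentUpdater` and `update_many` are hypothetical names, `store` is assumed to be an initialized pyravendb document store, and `doc_ids` is assumed to be available already, which is exactly the part this question is still missing.

```python
class DocumentUpdater:
    """Hypothetical wrapper around a document store (e.g. a pyravendb store)."""

    def __init__(self, store):
        self.store = store

    def update_document(self, doc_id, new_attribute):
        # One session per document, mirroring the method shown above.
        with self.store.open_session() as session:
            doc = session.load(doc_id)
            doc.newAttribute = new_attribute
            session.save_changes()

    def update_many(self, doc_ids, new_attribute):
        # Assumes the caller already knows the IDs of the target collection.
        updated = 0
        for doc_id in doc_ids:
            self.update_document(doc_id, new_attribute)
            updated += 1
        return updated
```

Each document costs a separate load and save here, so this may be slow for large collections.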

I believe there is an update_by_index method, but I don't know how to adapt it to my use case.

How can I achieve this?

Thanks!

Answer 1

I'm not a Python expert, but from a quick look at the PyRavenDb source code I found store.database_commands, which is defined in database_commands.py.

The syntax is just like that of the equivalent C# command:

def update_by_index(self, index_name, query, scripted_patch=None, options=None):
    """
    @param index_name: name of an index to perform a query on
    :type str
    @param query: query that will be performed
    :type IndexQuery
    @param options: various operation options e.g. AllowStale or MaxOpsPerSec
    :type BulkOperationOptions
    @param scripted_patch: JavaScript patch that will be executed on query results( Used only when update)
    :type ScriptedPatchRequest
    @return: json
    :rtype: dict
    """
    if not isinstance(query, IndexQuery):
        raise ValueError("query must be IndexQuery Type")
    path = Utils.build_path(index_name, query, options)
    if scripted_patch:
        if not isinstance(scripted_patch, ScriptedPatchRequest):
            raise ValueError("scripted_patch must be ScriptedPatchRequest Type")
        scripted_patch = scripted_patch.to_json()

    response = self._requests_handler.http_request_handler(path, "EVAL", data=scripted_patch)
    if response.status_code != 200 and response.status_code != 202:
        raise response.raise_for_status()
    return response.json()

The function takes the name of an index, a query used to find the documents to update, and a JavaScript patch that will modify the documents' data.

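If you need to update all documents of a particular collection, consider updating them through the Raven/DocumentsByEntityName index. It is a system index that is created automatically and holds references to all documents in the entire database. So you can write a query that finds all documents whose Tag value corresponds to your collection's name, e.g. Query = "Tag:Groups", and pass that query to the update_by_index method.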
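As a rough sketch of that idea, assuming `store` is an initialized pyravendb document store and that `IndexQuery` and `ScriptedPatchRequest` live in the module paths imported below (`patch_collection` is a made-up helper name, not part of pyravendb):

```python
def patch_collection(store, collection_name, script):
    """Run a JavaScript patch on every document of one collection.

    Hypothetical helper. The imports are done lazily so the function can be
    defined even where pyravendb is not installed.
    """
    from pyravendb.data.indexes import IndexQuery
    from pyravendb.data.patches import ScriptedPatchRequest

    # Tag holds the collection name in the built-in system index.
    query = IndexQuery(query="Tag:{0}".format(collection_name))
    return store.database_commands.update_by_index(
        index_name="Raven/DocumentsByEntityName",
        query=query,
        scripted_patch=ScriptedPatchRequest(script),
    )
```

A call might then look like `patch_collection(store, "Groups", "this.Active = true;")`, where `Groups` and `Active` are placeholder names.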

You can also update documents through the batch command, which is also defined in database_commands.py and documented here. Note: this only works if you know the document IDs.

If you're interested in C# examples, you can use the demo project I created after attending the RavenDB conference in Dallas last year: https://github.com/maqduni/RavenDb-Demo.

Answer 2

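As maqduni said, update_by_index is the method you want to use. Just create an index that indexes the documents you want. If you run into trouble, you can try querying for the documents you want, and RavenDB will create an auto index for you. Once the index exists, call update_by_index with index_name and query (just make sure the index is not stale).

Your code needs to look like this: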

from pyravendb.data.indexes import IndexQuery
from pyravendb.data.patches import ScriptedPatchRequest

# The patch script runs as JavaScript on every document the query matches.
self.store.database_commands.update_by_index(
    index_name="YOUR_INDEX_NAME",
    query=IndexQuery(query="TAG:collection_name;"),
    scripted_patch=ScriptedPatchRequest("this.Attribute = newAttribute;"))

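The query in IndexQuery uses Lucene syntax; in the example, TAG is the field in my index that holds the collection names. scripted_patch uses JavaScript syntax; it is the script that will run on every document matched by your query.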
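Note that `newAttribute` inside the script string is a JavaScript identifier, not the Python variable of the same name. One possible way to get a Python value into the script, sketched here with a hypothetical helper `build_set_script` (not part of pyravendb), is to embed it as a JSON literal, since the literals produced by `json.dumps` with its default settings are also valid JavaScript expressions:

```python
import json
import re

# Only plain identifiers are accepted, to avoid injecting arbitrary script.
_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def build_set_script(attribute, value):
    """Return a patch script that assigns `value` to `this.<attribute>`."""
    if not _IDENTIFIER.fullmatch(attribute):
        raise ValueError("attribute must be a plain JavaScript identifier")
    # json.dumps renders str, numbers, bool, None, lists and dicts as
    # literals that JavaScript can evaluate directly.
    return "this.{0} = {1};".format(attribute, json.dumps(value))


print(build_set_script("Attribute", "hello"))  # this.Attribute = "hello";
print(build_set_script("Count", 3))            # this.Count = 3;
```

The resulting string can be passed to `ScriptedPatchRequest` in place of the hard-coded script.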

I'll try to explain the difference between the two methods:

The get_index method gives you information about the index; the response is an IndexDefinition.

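update_by_index is a long-running operation, which is why you only get an operation_id back; you need to wait until it finishes. (A feature for this will be added in the next pyravendb version.) This operation does not return the patched documents. The new feature will give you information about the progress of the operation.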
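Until such a feature exists, waiting could be sketched as a plain polling loop. Everything here is an assumption: `fetch_status` is a hypothetical callable you would have to supply yourself (for example one that asks the server about `operation_id`), and the `"Completed"` key is an assumed shape for its result, not a documented pyravendb API.

```python
import time


def wait_for_operation(fetch_status, operation_id, timeout=60.0, interval=0.5):
    """Poll until the operation reports completion or the timeout expires.

    fetch_status: hypothetical callable that takes operation_id and returns
    a dict; a truthy "Completed" entry is assumed to mean the job is done.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(operation_id)
        if status.get("Completed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "operation {0} did not finish in time".format(operation_id))
        time.sleep(interval)
```

A fixed interval keeps the sketch short; a backoff between polls may be kinder to the server for very long patches.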

page_size applies only to query results, not to index operations.

Tags: python, database, nosql, ravendb