How to read data from Azure's CosmosDB in python

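I have a trial Azure account and have uploaded some JSON files to CosmosDB. I am writing a Python program to review the data, but I'm having trouble doing so. This is the code I have so far: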

import pydocumentdb.documents as documents
import pydocumentdb.document_client as document_client
import pydocumentdb.errors as errors

url = 'https://ronyazrak.documents.azure.com:443/'
key = '' # primary key

# Initialize the Python DocumentDB client
client = document_client.DocumentClient(url, {'masterKey': key})

collection_link = '/dbs/test1/colls/test1'

collection = client.ReadCollection(collection_link)

result_iterable = client.QueryDocuments(collection)

query = { 'query': 'SELECT * FROM server s' }

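I read somewhere that the key should be the primary key, which I can find in my Azure account under Keys. I have filled the key string with my primary key, but I've left it empty here for privacy.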

I also read somewhere that collection_link should be '/dbs/test1/colls/test1' if my data is in the collection 'test1' (as listed under Collections).

My code fails at the call to client.ReadCollection().

This is the error I'm getting: "pydocumentdb.errors.HTTPFailure: Status code: 401 {"code":"Unauthorized","message":"The input authorization token can't serve the request. Please check that the expected payload is built as per the protocol, and check the key being used. Server used the following payload to sign: 'get\ncolls\ndbs/test1/colls/test1\nmon, 29 may 2017 19:47:28 gmt\n\n'\r\nActivityId: 03e13e74-8db4-4661-837a-f8d81a2804cc"}"

Once this error is fixed, what else needs to be done? I want to get the JSON files as one big dictionary so that I can review the data.

Am I on the right path? Am I approaching this the wrong way? How can I read the documents in my database? Thanks.

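Judging by your error message, this seems to be caused by authentication failing with your key, as described in the official documentation.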

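So please check your key, but I think the main point is that pydocumentdb is being used incorrectly. The ids of a Database, Collection, and Document are not the same thing as their links. APIs such as ReadCollection and QueryDocuments expect the corresponding link to be passed in. You need to retrieve every resource in Azure CosmosDB by its resource link, not by its resource id.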
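To make the id versus link distinction concrete: an id is only the short name of a resource (for example test1), while a link is a path that addresses it. One form of link is the id-based path that appears elsewhere in this thread (dbs/<database id>/colls/<collection id>); the other is the _self value returned by the service. The sketch below only illustrates the id-based form. Both helper names are hypothetical and are not part of pydocumentdb.

```python
def build_collection_link(database_id, collection_id):
    """Hypothetical helper: compose an id-based collection path from two ids."""
    # No leading slash, matching the dbs/<db>/colls/<coll> form used in this thread.
    return "dbs/{0}/colls/{1}".format(database_id, collection_id)


def build_document_link(database_id, collection_id, document_id):
    """Hypothetical helper: extend the collection path down to one document."""
    return "{0}/docs/{1}".format(
        build_collection_link(database_id, collection_id), document_id
    )


# The id alone is just a name; the link is the full path to the resource.
print(build_collection_link("test1", "test1"))
print(build_document_link("test1", "test1", "doc1"))
```

Whether an id-based path or a _self link is the better choice depends on the SDK version in use, so treat this as an illustration of the shape of a link rather than a recommendation.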

From your description, I think you want to list all documents under the collection id path /dbs/test1/colls/test1. For reference, here is my sample code.

from pydocumentdb import document_client

uri = 'https://ronyazrak.documents.azure.com:443/'
key = '<your-primary-key>'

client = document_client.DocumentClient(uri, {'masterKey': key})

db_id = 'test1'
db_query = "select * from r where r.id = '{0}'".format(db_id)
db = list(client.QueryDatabases(db_query))[0]
db_link = db['_self']

coll_id = 'test1'
coll_query = "select * from r where r.id = '{0}'".format(coll_id)
coll = list(client.QueryCollections(db_link, coll_query))[0]
coll_link = coll['_self']

# ReadDocuments returns an iterable; list() pulls every document into memory
docs = client.ReadDocuments(coll_link)
print(list(docs))

See the DocumentDB Python SDK documentation for details.
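The queries in the sample above splice the id into the SQL text with str.format. Cosmos DB query specs can also carry the value separately in a parameters list, which avoids quoting problems if an id ever contains a quote character. A minimal sketch of building such a spec follows; by_id_query is a hypothetical helper, and passing its result to QueryDatabases or QueryCollections in place of the plain string is an assumption that those methods accept a dict query spec in your SDK version.

```python
def by_id_query(resource_id):
    """Hypothetical helper: build a parameterized query spec matching one id."""
    return {
        "query": "SELECT * FROM r WHERE r.id = @id",
        # The value travels separately from the SQL text.
        "parameters": [{"name": "@id", "value": resource_id}],
    }


db_query = by_id_query("test1")
print(db_query["query"])
print(db_query["parameters"])
```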

For those using azure-cosmos, the current library (2019), I opened a doc bug and provided a sample in GitHub.

Sample

from azure.cosmos import cosmos_client
import json

CONFIG = {
    "ENDPOINT": "ENDPOINT_FROM_YOUR_COSMOS_ACCOUNT",
    "PRIMARYKEY": "KEY_FROM_YOUR_COSMOS_ACCOUNT",
    "DATABASE": "DATABASE_ID",  # Probably looks more like a name to you
    "CONTAINER": "YOUR_CONTAINER_ID"  # Probably looks more like a name to you
}

CONTAINER_LINK = f"dbs/{CONFIG['DATABASE']}/colls/{CONFIG['CONTAINER']}"
FEEDOPTIONS = {}
FEEDOPTIONS["enableCrossPartitionQuery"] = True
# There is also a partitionKey feed option, but I was unable to figure out how to use it.

QUERY = {
    "query": "SELECT * FROM c"
}

# Initialize the Cosmos client
client = cosmos_client.CosmosClient(
    url_connection=CONFIG["ENDPOINT"], auth={"masterKey": CONFIG["PRIMARYKEY"]}
)

# Query for some data
results = client.QueryItems(CONTAINER_LINK, QUERY, FEEDOPTIONS)

# Look at your data (materialize the results once so they can be reused)
items = list(results)
print(items)

# You can also serialize the list as JSON
print(json.dumps(items, indent=4))
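The original question asks for the documents as one big dictionary. Once the results are in a list of dicts, that is a plain Python step. The sketch below keys each document by its id and drops the system properties Cosmos DB attaches (_rid, _self, _etag, _attachments, _ts). It assumes the id values are unique across the returned documents; if they are not, later documents overwrite earlier ones. documents_to_dict and the sample documents are made up for illustration.

```python
# System properties that Cosmos DB adds to every stored document.
SYSTEM_KEYS = ("_rid", "_self", "_etag", "_attachments", "_ts")


def documents_to_dict(documents):
    """Hypothetical helper: index documents by id, without system properties."""
    big = {}
    for doc in documents:
        big[doc["id"]] = {k: v for k, v in doc.items() if k not in SYSTEM_KEYS}
    return big


# Made-up stand-ins for what list(results) might contain.
sample_docs = [
    {"id": "server-1", "cpu": 4, "_rid": "abc", "_ts": 1496087248},
    {"id": "server-2", "cpu": 8, "_etag": "xyz"},
]

data = documents_to_dict(sample_docs)
print(sorted(data))
print(data["server-2"])
```

With real data, the list of query results would be passed in place of sample_docs.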