Can't write dataframe in Cosmos DB / documentDB from Databricks with pySpark

Question

While trying to save a dataframe I have been working on to a documentDB collection, I ran into an error I can't make sense of.

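Other similar questions on Stack Overflow point to an incorrect database or collection name, or to case sensitivity, but I have checked those. What other explanations are there? The partition key? The region?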

On a side note, I could not find complete documentation of the options that can be passed to this line:

df.write.format("com.microsoft.azure.cosmosdb.spark").mode('overwrite').options(**ddbconfig).save()
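For reference, here is a minimal sketch of how the `ddbconfig` dictionary for that write call might be assembled. The option keys (`Endpoint`, `Masterkey`, `Database`, `Collection`, `Upsert`) are the ones used in the azure-cosmosdb-spark connector's README examples; the helper names `build_write_config` and `write_dataframe`, the whitespace check, and all values are hypothetical and only illustrate the idea. This is not a complete list of supported options.

```python
# Format string taken from the write call in the question.
COSMOS_FORMAT = "com.microsoft.azure.cosmosdb.spark"


def build_write_config(endpoint, master_key, database, collection, upsert=True):
    """Assemble a write config dict (hypothetical helper, not part of the connector)."""
    # Database and collection names are case sensitive, so reject empty values
    # and stray whitespace early instead of waiting for a 404 from the service.
    for label, value in (("database", database), ("collection", collection)):
        if not value or value != value.strip():
            raise ValueError(f"suspicious {label} name: {value!r}")
    return {
        "Endpoint": endpoint,
        "Masterkey": master_key,
        "Database": database,
        "Collection": collection,
        # Option values are passed as strings.
        "Upsert": "true" if upsert else "false",
    }


def write_dataframe(df, config, mode="overwrite"):
    """Write a Spark DataFrame using the same call chain as in the question."""
    df.write.format(COSMOS_FORMAT).mode(mode).options(**config).save()
```

On a cluster with the connector jar attached, this would be used as `write_dataframe(df, build_write_config(endpoint, key, "mydb", "mycoll"))`, where every argument is a placeholder for your own account values.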

The error Spark gives on Databricks is:

com.microsoft.azure.documentdb.DocumentClientException: Message: {'Errors':['Owner resource does not exist']}

The stack trace shows:

Py4JJavaError: 
  An error occurred while calling o646.save. :
    com.microsoft.azure.documentdb.DocumentClientException: 
      Message: {"Errors":["Owner resource does not exist"]}

with this response in the StoreReadResult:

LSN: 623, GlobalCommittedLsn: 623, PartitionKeyRangeId: , IsValid: True, StatusCode: 404, IsGone: False, IsNotFound: True, IsInvalidPartition: False, RequestCharge: 1, ItemLSN: -1, SessionToken: -1#623, ResourceType: Collection, OperationType: Read

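Edit: This is different from the case in the similar linked post. The error occurs when trying to write data to a new, empty collection, not when reading existing data. As stated in my question, I have already explored every path I found in those similar posts (mostly collection/database name mismatches).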

Answer 1

After further investigation, this turned out to be a bug in the version of the library I was using.

Solved by switching from azure-cosmosdb-spark_2.3.0_2.11-1.2.2-uber.jar to azure-cosmosdb-spark_2.3.0_2.11-1.2.7-uber.jar.

As seen on GitHub: https://github.com/Azure/azure-cosmosdb-spark/issues/268
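A quick way to check which connector version a jar name refers to is sketched below. It assumes the jar follows the naming pattern of the two files mentioned above; `connector_version` and `REPORTED_WORKING` are hypothetical names, and 1.2.7 is only the version reported to work in this answer. Whether releases between 1.2.2 and 1.2.7 are affected is not established here.

```python
import re

# Version reported to work in this answer (not a documented minimum).
REPORTED_WORKING = (1, 2, 7)


def connector_version(jar_name):
    """Extract the connector version tuple from an uber jar file name."""
    # The connector version sits between the last hyphen-delimited
    # segment and the "-uber.jar" suffix, e.g. "...-1.2.7-uber.jar".
    match = re.search(r"-(\d+(?:\.\d+)+)-uber\.jar$", jar_name)
    if match is None:
        raise ValueError(f"unrecognized jar name: {jar_name!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_at_least_reported_working(jar_name):
    """True if the jar's version is >= the version reported to work."""
    return connector_version(jar_name) >= REPORTED_WORKING
```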

apache-spark

pyspark

azure-cosmosdb

azure-databricks