How to solve an AWS Glue PySpark script throwing a retryable writes error from DocumentDB
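I am running the code below in AWS Glue. The job is able to read data from the database, but it fails when writing.
An error occurred while calling o102.pyWriteDynamicFrame. Command failed with error 301: 'Retryable writes are not supported' on server. The full response is {"ok": 0.0, "code": 301, "errmsg": "Retryable writes are not supported", "operationTime": {"$timestamp": {"t": 1647921685, "i": 1}}}
I am using the catalog DocumentDB connection in the job details section.
I tried using retryWrite=false in the connection string, but I still get the error.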
documentdb_uri = "mongodb://<host name>:27017"
documentdb_write_uri = "mongodb://<host name>:27017"

read_docdb_options = {
    "uri": documentdb_uri,
    "database": "test",
    "collection": "profiles",
    "username": "<username>",
    "password": "<password>",
    "ssl": "true",
    "ssl.domain_match": "false"
}

write_documentdb_options = {
    "uri": documentdb_write_uri,
    "database": "test",
    "collection": "collection1",
    "username": "<username>",
    "password": "<password>",
    "ssl": "true",
    "ssl.domain_match": "false"
}

# Get DynamicFrame from DocumentDB
dynamic_frame2 = glueContext.create_dynamic_frame.from_options(connection_type="documentdb",
                                                               connection_options=read_docdb_options)

# Write DynamicFrame to DocumentDB
glueContext.write_dynamic_frame.from_options(dynamic_frame2, connection_type="documentdb",
                                             connection_options=write_documentdb_options)

job.commit()
The correct option is retryWrites=false (note the trailing "s"), and it needs to go at the end of the uri.
In your case: documentdb_write_uri = "mongodb://<host name>:27017/?retryWrites=false"
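As a rough sketch of the suggestion above, the option can be appended to the write URI programmatically rather than by hand. The helper name with_retry_writes_disabled and the host example-host are made up for illustration, and this only builds the string; whether Glue honours it depends on the Glue version, as the other answer in this thread reports.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_retry_writes_disabled(uri: str) -> str:
    """Return the MongoDB-style URI with retryWrites=false in its query string.

    Hypothetical helper for illustration; it is not part of AWS Glue or any driver.
    """
    parts = urlsplit(uri)
    # Keep every existing option except a previous retryWrites value
    # (matched case-insensitively so a stale setting is not left behind).
    options = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "retrywrites"
    ]
    options.append(("retryWrites", "false"))
    # The options must follow a "/" after the host list, so add one if missing.
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(options), parts.fragment))


# "example-host" is a placeholder, not a real endpoint.
documentdb_write_uri = with_retry_writes_disabled("mongodb://example-host:27017")
print(documentdb_write_uri)
```

The resulting string would then be passed as the "uri" entry of write_documentdb_options in place of the bare host URI.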
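I solved this by downgrading the Glue version from 3.0 to 2.0.
In 3.0, the retryWrites setting cannot be set when using dynamic frames.
I have opened a ticket in their forum, but it has not been resolved yet.
Issue on the AWS board for reference - https://github.com/awslabs/aws-glue-libs/issues/111 [An error occurred while calling o365.pyWriteDynamicFrame. Command failed with error 301: 'Retryable writes are not supported' on server ****.*****.docdb.amazonaws.com:27017.]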