Lambda Python request Athena error OutputLocation

I am using AWS Lambda and I want to run a simple query in Athena and store my data in S3.

My code:

import boto3

def lambda_handler(event, context):
    query_1 = "SELECT * FROM test_athena_laurent.stage limit 5;"
    database = "test_athena_laurent"
    s3_output = "s3://athena-laurent-result/lambda/"

    client = boto3.client('athena')

    response = client.start_query_execution(
        QueryString=query_1,
        ClientRequestToken='string',
        QueryExecutionContext={
            'Database': database
        },
        ResultConfiguration={
            'OutputLocation': 's3://athena-laurent-result/lambda/'
        }
    )
    return response

It works in Spyder 2.7, but on AWS I get this error:

Parameter validation failed:
Invalid length for parameter ClientRequestToken, value: 6, valid range: 32-inf: ParamValidationError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 18, in lambda_handler
    'OutputLocation': 's3://athena-laurent-result/lambda/'

I think it doesn't understand my path, and I don't know why.

Thanks

ClientRequestToken (string) -- A unique case-sensitive string used to ensure the request to create the query is idempotent (executes only once). If another StartQueryExecution request is received, the same response is returned and another query is not created. If a parameter has changed, for example, the QueryString, an error is returned. [Boto3 Docs]

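If this field is not provided, it is populated automatically.

If you do provide a string value for ClientRequestToken, make sure its length is within the limits of 32 to 128 characters. The value 'string' in your code is only 6 characters long, which is what the ParamValidationError is reporting; the OutputLocation path is not the problem.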
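If you do want to pass your own token, here is a minimal sketch of one way to build one that satisfies the length limit. The helper names make_client_request_token and build_query_kwargs are hypothetical and not part of boto3; the sketch assumes a random UUID string (36 characters) is an acceptable token and only assembles the keyword arguments, so it runs without AWS credentials.

```python
import uuid


def make_client_request_token():
    """Return a random token; str(uuid.uuid4()) is 36 characters long,
    which falls inside the 32-128 range quoted above."""
    return str(uuid.uuid4())


def build_query_kwargs(query, database, s3_output, token=None):
    """Assemble keyword arguments for client.start_query_execution.

    Hypothetical helper: the token is optional, and is checked against
    the documented length limits before it is included.
    """
    kwargs = {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": s3_output},
    }
    if token is not None:
        if not 32 <= len(token) <= 128:
            raise ValueError(
                "ClientRequestToken must be 32 to 128 characters long"
            )
        kwargs["ClientRequestToken"] = token
    return kwargs


# Usage inside the Lambda handler (not executed here):
# client = boto3.client("athena")
# response = client.start_query_execution(
#     **build_query_kwargs(query_1, database, s3_output,
#                          make_client_request_token())
# )
```

Note that a fresh random token on every call gives no idempotency across retries; if that matters, reuse the same token when retrying the same request.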

As @Tomalak pointed out, the problem is the ClientRequestToken string. However, according to the documentation I just linked, you don't need it at all when using an SDK.

This token is listed as not required because AWS SDKs (for example the AWS SDK for Java) auto-generate the token for users. If you are not using the AWS SDK or the AWS CLI, you must provide this token or the action will fail.

So, I would refactor it like this:

import boto3


def lambda_handler(event, context):
    query_1 = "SELECT * FROM some_database.some_table limit 5;"
    database = "some_database"
    s3_output = "s3://some_bucket/some_tag/"

    client = boto3.client('athena')

    response = client.start_query_execution(
        QueryString=query_1,
        QueryExecutionContext={
            'Database': database
        },
        ResultConfiguration={
            'OutputLocation': s3_output
        }
    )
    return response