S3 Select with boto3 - InternalError
Has anyone gotten "S3 Select" (https://aws.amazon.com/blogs/aws/s3-glacier-select/ ,
https://aws.amazon.com/about-aws/whats-new/2018/04/amazon-s3-select-is-now-generally-available/) to work with boto3 (or even the CLI or another SDK)? I'm getting the cryptic InternalError below.
Running on an EC2 instance with an IAM role:
[ec2-user@ip-blah bin]$ ./python
Python 2.7.13 (default, Jan 31 2018, 00:17:36)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import boto3
>>> s3 = boto3.client('s3')
>>> r = s3.select_object_content(
... Bucket='mybucketname',
... Key='mypath/file.txt',
... ExpressionType='SQL',
... Expression="select count(*) from s3object s",
... InputSerialization = {'CSV': {"FileHeaderInfo": "Use"}},
... OutputSerialization = {'CSV': {}},
... )
Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
  File "/home/ec2-user/venv/local/lib/python2.7/site-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/ec2-user/venv/local/lib/python2.7/site-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InternalError) when calling the SelectObjectContent operation (reached max retries: 4): We encountered an internal error. Please try again.
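My guesses:
- Check the S3 permissions.
- Set 'RecordDelimiter', 'FieldDelimiter', and 'QuoteCharacter' in InputSerialization if your file needs them.
- Check the structure of the CSV file (the header has the same number of columns as the data rows, special characters are escaped, stray whitespace, \n as the newline delimiter, etc.).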
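The CSV structure check in the last guess can be done locally before involving S3 at all. Below is a minimal sketch, assuming Python 3 and a file small enough to read into memory; `find_ragged_rows` is a hypothetical helper name, not part of boto3. It only reports rows whose field count differs from the header and does not validate encoding or escaping.

```python
import csv
import io


def find_ragged_rows(csv_text, delimiter=",", quotechar='"'):
    """Return (line_number, field_count) for each row whose field count
    differs from the header row. Hypothetical helper for illustration."""
    reader = csv.reader(
        io.StringIO(csv_text, newline=""),
        delimiter=delimiter,
        quotechar=quotechar,
    )
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to compare against
    expected = len(header)
    # reader.line_num is the physical line the reader has just consumed
    return [(reader.line_num, len(row)) for row in reader if len(row) != expected]


if __name__ == "__main__":
    sample = "a,b,c\n1,2,3\n4,5\n6,7,8\n"
    print(find_ragged_rows(sample))
```

An empty result means every row matches the header width; it says nothing about whether the delimiters passed here match the ones passed to S3 Select.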
Try:
...
Expression="SELECT * FROM S3Object s", InputSerialization={'CSV': {}}, OutputSerialization={'CSV': {}}, ...
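Putting the suggestions above together, here is a sketch of a complete call that also reads the result, since `select_object_content` returns an event stream under `response['Payload']` rather than the data itself. The helper names (`build_select_kwargs`, `collect_records`, `run_select`) are hypothetical, and the bucket and key are the placeholders from the question. The boto3 documentation lists the `FileHeaderInfo` values in upper case (`'USE'`, `'IGNORE'`, `'NONE'`), so the sketch uses that spelling; whether the `"Use"` spelling in the question contributes to the error is not established here.

```python
def build_select_kwargs(bucket, key, expression, has_header=True,
                        field_delimiter=",", record_delimiter="\n",
                        quote_character='"'):
    """Assemble keyword arguments for s3.select_object_content (CSV in, CSV out)."""
    csv_input = {
        # Documented values are 'USE', 'IGNORE' and 'NONE'
        "FileHeaderInfo": "USE" if has_header else "NONE",
        "FieldDelimiter": field_delimiter,
        "RecordDelimiter": record_delimiter,
        "QuoteCharacter": quote_character,
    }
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {"CSV": csv_input},
        "OutputSerialization": {"CSV": {}},
    }


def collect_records(event_stream):
    """Join the 'Records' payloads of a SelectObjectContent event stream.

    Other event types ('Stats', 'Progress', 'Cont', 'End') are skipped.
    Bytes are joined before decoding so a multi-byte character split
    across two events still decodes correctly.
    """
    chunks = []
    for event in event_stream:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


def run_select(bucket, key, expression, **csv_options):
    """Run the query against S3; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3

    s3 = boto3.client("s3")
    response = s3.select_object_content(
        **build_select_kwargs(bucket, key, expression, **csv_options)
    )
    return collect_records(response["Payload"])


if __name__ == "__main__":
    # Placeholder bucket and key taken from the question
    print(run_select("mybucketname", "mypath/file.txt",
                     "SELECT * FROM S3Object s", has_header=False))
```

Starting with `has_header=False` and `SELECT *` mirrors the "try the simplest query first" advice: if that works, the header setting and the aggregate expression can be reintroduced one at a time.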
Hope this helps!