Invalid timestamp error when trying to get a CSV file from an S3 bucket using the boto3 module in Python 2.7

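I am trying to fetch a .csv file that is stored in an S3 bucket. The CSV is uploaded to the S3 bucket from a Mac, and my code (Python 2.7) runs in a Unix environment. The CSV looks like this (I have included the carriage return characters):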

Order,Item,Date,Quantity\r
1,34975,8/4/15,10\r
2,921644,3/10/15,2\r
3,N18DAJ,1/7/15,10\r
4,20816,12/12/15,9\r

My code to fetch the file from the S3 bucket:

import boto3

def readcsvFromS3(bucket_name, key):
    s3 = boto3.resource('s3')
    obj = s3.Object(bucket_name=bucket_name, key=key)
    response = obj.get()
    data = response['Body'].read()

The error occurs on the line response = obj.get(). The error I get is:

Traceback (most recent call last):
  File "slot.py", line 163, in <module>
    columnNames, rowArray = neo.readcsvFromS3(bucket_name=config.s3bucket, key=config.orde
  File "/home/jcgarciaram/WMSight/wmsight-api/api/utilities/pythonScripts/slotting/neo4jUt
    response = obj.get()
  File "/usr/local/lib/python2.7/dist-packages/boto3/resources/factory.py", line 481, in d
    response = action(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/boto3/resources/action.py", line 83, in __c
    response = getattr(parent.meta.client, operation_name)(**params)
  File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 228, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 481, in _make_api
    operation_model, request_dict)
  File "/usr/local/lib/python2.7/dist-packages/botocore/endpoint.py", line 117, in make_re
    return self._send_request(request_dict, operation_model)
  File "/usr/local/lib/python2.7/dist-packages/botocore/endpoint.py", line 144, in _send_r
    request, operation_model, attempts)
  File "/usr/local/lib/python2.7/dist-packages/botocore/endpoint.py", line 203, in _get_re
    parser.parse(response_dict, operation_model.output_shape)),
  File "/usr/local/lib/python2.7/dist-packages/botocore/parsers.py", line 208, in parse
    parsed = self._do_parse(response, shape)
  File "/usr/local/lib/python2.7/dist-packages/botocore/parsers.py", line 570, in _do_pars
    member_shapes, final_parsed)
  File "/usr/local/lib/python2.7/dist-packages/botocore/parsers.py", line 626, in _parse_n
    member_shape, headers[header_name])
  File "/usr/local/lib/python2.7/dist-packages/botocore/parsers.py", line 226, in _parse_s
    return handler(shape, node)
  File "/usr/local/lib/python2.7/dist-packages/botocore/parsers.py", line 149, in _get_tex
    return func(self, shape, text)
  File "/usr/local/lib/python2.7/dist-packages/botocore/parsers.py", line 380, in _handle_
    return self._timestamp_parser(text)
  File "/usr/local/lib/python2.7/dist-packages/botocore/utils.py", line 344, in parse_time
    raise ValueError('Invalid timestamp "%s": %s' % (value, e))
ValueError: Invalid timestamp "Wed, 16 Jan 48199 20:37:02 GMT": year is out of range

I have been researching this but can't seem to figure out what the problem is. Any ideas?

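After a few days of searching and debugging, we finally identified the cause of the problem. We tried uploading the file as JSON instead of CSV, and imagine our surprise when we saw the same error while trying to download the file with boto3 in Python.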

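We then started looking at the properties of the file itself in S3 (right-click the file and click Properties) rather than at its contents.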

We found a section called Metadata, and in it the following entry:

Key: Expires / Value: Tue, 15 Jan 48199 02:16:52 GMT.
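
Judging by the traceback, botocore tries to turn this Expires header into a Python datetime, and datetime only supports years up to datetime.MAXYEAR (9999). The sketch below is just an illustration of that limit, not botocore's actual parsing code; the helper names expires_year and fits_in_datetime are made up for this example.

```python
import datetime


def expires_year(http_date):
    """Return the year field of an HTTP-date string.

    The year is the fourth whitespace-separated field, e.g.
    'Tue, 15 Jan 48199 02:16:52 GMT' -> 48199.
    """
    return int(http_date.split()[3])


def fits_in_datetime(http_date):
    """True if the year can be represented by datetime.datetime."""
    return datetime.MINYEAR <= expires_year(http_date) <= datetime.MAXYEAR


bad = "Tue, 15 Jan 48199 02:16:52 GMT"   # value found in the S3 metadata
good = "Tue, 15 Jan 2200 02:16:52 GMT"   # value after editing the year

print(fits_in_datetime(bad))
print(fits_in_datetime(good))

# Building the datetime directly fails the same way the parser does.
try:
    datetime.datetime(48199, 1, 15, 2, 16, 52)
except ValueError as exc:
    print("datetime rejected it: %s" % exc)
```

Any year above 9999 trips the same check, so the file contents (CSV or JSON) never come into play.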

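After changing the year in the value to something like 2200, everything worked! We are now looking into the upload process in Node.js to figure out how to make sure this value is set correctly.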
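
One possible explanation, which we have not confirmed, is that the uploader passed an epoch time in milliseconds where seconds were expected: a millisecond timestamp from early 2016, read as seconds, lands roughly 46,000 years after 1970, which is in the neighbourhood of the year 48199. Below is a minimal Python sketch of a guard against that mistake. The function name, the threshold, and the sample timestamp are all assumptions for illustration, not part of our actual upload code.

```python
import datetime

# Assumption: epoch values above this are milliseconds, not seconds.
# 1e11 seconds after 1970 is roughly the year 5138, far beyond any
# sensible expiry date.
MILLISECONDS_THRESHOLD = 1e11

EPOCH = datetime.datetime(1970, 1, 1)


def expires_from_epoch(value):
    """Convert an epoch timestamp (seconds or milliseconds) to a naive UTC datetime.

    Values that look like milliseconds are scaled down to seconds first.
    """
    if value > MILLISECONDS_THRESHOLD:
        seconds = value / 1000.0
    else:
        seconds = float(value)
    return EPOCH + datetime.timedelta(seconds=seconds)


# Hypothetical sample: the same instant expressed in milliseconds and in seconds.
from_millis = expires_from_epoch(1458847630000)
from_seconds = expires_from_epoch(1458847630)

print(from_millis)
print(from_millis == from_seconds)
```

The resulting datetime has a four-digit year, so it can be formatted as the Expires value without producing a date that boto3 cannot parse on download.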