How to fix "OverflowError: Unsupported UTF-8 sequence length when encoding string"
I get the following error when converting a pandas DataFrame to JSON:
OverflowError: Unsupported UTF-8 sequence length when encoding string
This is the code:
import s3fs

# Serialize the DataFrame to a JSON string, then encode it to bytes (UTF-8 by default)
bytes_to_write = data.to_json(orient='records').encode()
fs = s3fs.S3FileSystem(key=aws_access_key_id, secret=aws_secret_access_key)
with fs.open(file, 'wb') as f:
    f.write(bytes_to_write)
The data I am trying to convert to JSON contains a lot of UTF-8 characters.
How can I fix this?
As this answer suggests, I converted the DataFrame using the .to_json() function with the default_handler parameter; you can find the documentation here.
You need to pass the default_handler=str parameter to avoid the error above. The details are in the documentation linked above.
dataframe.to_json('foo.json', default_handler=str)
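To illustrate what default_handler=str does, here is a minimal sketch. It assumes the problem comes from values in an object column that pandas' JSON encoder cannot serialize natively; the Tag class and the column names are hypothetical and only stand in for such values. It does not try to reproduce the exact OverflowError, it only shows the fallback to str().

```python
import json

import pandas as pd


class Tag:
    """Hypothetical custom type that the JSON encoder has no native support for."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return f"Tag({self.name})"


# An object column holding values the encoder does not know how to serialize.
df = pd.DataFrame({"id": [1, 2], "tag": [Tag("a"), Tag("b")]})

# default_handler is called for each unsupported value; str() turns it into text.
json_text = df.to_json(orient="records", default_handler=str)

# Parse the result back to check that every unsupported value became a string.
records = json.loads(json_text)
print(records)
```

If the string form of your values is not what you want in the output, any callable that takes one object and returns something JSON-serializable can be passed instead of str.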
Also keep in mind that the function can lay out the JSON in different ways, selected with the orient='<option>' parameter. As the documentation says:
orient: str
Indication of expected JSON string format.
...
The format of the JSON string:
- ‘split’ : dict like {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]}
- ‘records’ : list like [{column -> value}, … , {column -> value}]
- ‘index’ : dict like {index -> {column -> value}}
- ‘columns’ : dict like {column -> {index -> value}}
- ‘values’ : just the values array
- ‘table’ : dict like {‘schema’: {schema}, ‘data’: {data}}
Describing the data, where data component is like orient='records'.
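The layouts listed above are easier to compare on a concrete frame. The following is a small sketch on a made-up two-row DataFrame (the column names and index labels are assumptions for illustration); each result is parsed back with json.loads so the structure is visible as Python objects.

```python
import json

import pandas as pd

# Made-up sample data with an explicit index so the index shows up in the output.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]}, index=["r1", "r2"])

# Serialize with every orient value from the documentation excerpt
# and parse the JSON text back into Python objects.
layouts = {
    orient: json.loads(df.to_json(orient=orient))
    for orient in ("split", "records", "index", "columns", "values", "table")
}

for orient, parsed in layouts.items():
    print(orient, "->", parsed)
```

Note that 'records' and 'values' drop the index labels, so pick 'split', 'index', 'columns' or 'table' if the index has to survive a round trip.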