Pandas UnicodeDecodeError while using read_sql

Question

I am trying to execute SQL queries using pandas.read_sql. It usually works, but for some queries I run into this error:

  File "C:\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1454, in _fetchall_as_list
    result = cur.fetchall()

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 3: ordinal not in range(128)
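The message can be reproduced without a database, which helps show what is going wrong. This is a minimal sketch, assuming the driver receives raw bytes from the server and decodes them as ASCII; the sample bytes are hypothetical, and Latin-1 is only an illustrative guess at the real source encoding.

```python
# Hypothetical row value: three ASCII letters followed by byte 0xb4 at index 3.
raw = b"abc\xb4"

try:
    # ASCII only covers bytes 0x00-0x7f, so 0xb4 cannot be decoded.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)  # same wording as the error in the traceback above

# Decoding with an 8-bit encoding succeeds. Latin-1 is an assumption here;
# the encoding actually used when the data was written is not known.
text = raw.decode("latin-1")
print(text)
```

The point of the sketch is that the failure comes from the decoding step on the client, not from pandas itself.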

I tried the solution suggested for a very similar question here (), but it did not solve the problem.

I am using the cx_Oracle library for the database connection.

I tried

db = cx_Oracle.connect(user,pwd, dsn_dict[dbname],encoding='utf-8')

but when I check the encoding using

print(db.encoding)
print(db.nencoding)

I always get

ASCII
ASCII

I tried changing NLS_LANG using

os.environ['NLS_LANG'] = 'AMERICAN_AMERICA.US7ASCII'

but it resulted in the same error.

These are the database NLS parameters:

NLS_CHARACTERSET    US7ASCII

NLS_NCHAR_CHARACTERSET  AL16UTF16

I ran the same query in Access and noticed this character in the query results, which is probably what causes the problem:

¿

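Basically, I don't know how to set the right encoding to deal with this. Any help is appreciated. Thanks.

Solution:

For reference, I solved the problem by setting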

os.environ['NLS_LANG'] = 'AMERICAN_AMERICA.UTF8'

although I don't like doing it this way. A better solution would be appreciated.

Answer 1

With cx_Oracle 6 this should work for you:

cx_Oracle.connect("user/pw@dsn", encoding = "UTF-8", nencoding = "UTF-8")

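Since your database encoding is ASCII, you could even set only the nencoding parameter. If you are going to use the NLS_LANG environment variable, make sure to use the real UTF-8 encoding, which for historical reasons is called AL32UTF8 in Oracle!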
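Put together with pandas, the connection parameters above might be used as in the following sketch. The helper name read_query, the ENCODING_KWARGS constant, and the credentials in the usage line are hypothetical and not part of the original question; it assumes cx_Oracle 6 or later and has not been run against the asker's database.

```python
# Encoding arguments taken from the connect() call above.
ENCODING_KWARGS = {"encoding": "UTF-8", "nencoding": "UTF-8"}


def read_query(sql, user, password, dsn):
    """Run a query and return the result as a DataFrame (hypothetical helper)."""
    # Imported here so the sketch can be loaded on a machine that has no
    # Oracle client installed; top-level imports work just as well.
    import cx_Oracle
    import pandas as pd

    # Ask the driver to exchange character data as UTF-8 rather than ASCII.
    conn = cx_Oracle.connect(user, password, dsn, **ENCODING_KWARGS)
    try:
        return pd.read_sql(sql, conn)
    finally:
        # Close the connection even if the query raises.
        conn.close()
```

A call would look like read_query("SELECT * FROM some_table", user, pwd, dsn_dict[dbname]), where some_table is a placeholder. If NLS_LANG is used instead, it probably needs to be set before the first connection is made, since the Oracle client reads it when it initializes.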

Tags: python, oracle, cx-oracle, pandas