Why is the SQLite Blob type automatically converted to a unicode sequence in pandas (Python)?

Question

I have a SQLite table with some columns of type Blob.

The table contains:

DB_Index     DB_ColumnBlob
619823       0A 00 00 00 4E 04

I query it with this code:

sql = r'''SELECT DB_ColumnBlob WHERE DB_Index = 619823'''
conn = sqlite3.connect(Settings.databasepath)
df = pd.read_sql_query(sql = sql, con = conn)
print df.iloc[0,0]

I get this output:

N z X a& + �,

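DB_ColumnBlob contains a sequence of hexadecimal digits. I want to get those digits as output so that I can split them into 2-digit segments and then convert each segment to an integer. With the output I am getting, however, I do not understand what is happening.

Thanks in advance.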

Answer 1

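Neither pandas nor sqlite3 converts it to unicode (sqlite3 maps a Blob to a Python buffer object, see the docs). What you are seeing is simply the default Python string representation of a buffer object.

A small example: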

In [46]: buffer('\x01\x42\x55')
Out[46]: <read-only buffer for 0x000000000DCCA210, size -1, offset 0 at 0x000000000DCB29D0>

In [47]: print buffer('\x01\x42\x55')
☺BU

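As you can see, printing a buffer gives you its string representation, but the object is still a buffer.
If you ask for the type of df.iloc[0,0], or just return it (without print), you will see that it is indeed still a buffer: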

In [64]: df.iloc[0,0]
Out[64]: <read-write buffer ptr 0x000000000DCC9E68, size 3 at 0x000000000DCC9E30>

In [65]: type(df.iloc[0,0])
Out[65]: buffer
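
To get the 2-digit hex segments and integers the question asks for, one option is to read the raw bytes of the Blob rather than printing it. The following is only a minimal sketch: the table name demo and the helper names blob_to_ints and blob_to_hex_pairs are made up for illustration, and the in-memory database stands in for the real one. It relies on bytearray() accepting a buffer on Python 2 as well as bytes or memoryview on Python 3.

```python
import sqlite3


def blob_to_ints(blob):
    """Return the byte values of a Blob as a list of integers (0-255)."""
    # Iterating over a bytearray yields integers on both Python 2 and 3.
    return list(bytearray(blob))


def blob_to_hex_pairs(blob):
    """Return the Blob as a list of 2-digit uppercase hex strings."""
    return ["%02X" % b for b in bytearray(blob)]


# Hypothetical stand-in for the real database: one row with the Blob
# from the question (0A 00 00 00 4E 04).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (DB_Index INTEGER, DB_ColumnBlob BLOB)")
conn.execute(
    "INSERT INTO demo VALUES (?, ?)",
    (619823, sqlite3.Binary(b"\x0A\x00\x00\x00\x4E\x04")),
)
blob = conn.execute(
    "SELECT DB_ColumnBlob FROM demo WHERE DB_Index = ?", (619823,)
).fetchone()[0]
conn.close()

hex_pairs = blob_to_hex_pairs(blob)
ints = blob_to_ints(blob)
print(hex_pairs)  # ['0A', '00', '00', '00', '4E', '04']
print(ints)       # [10, 0, 0, 0, 78, 4]
```

With pandas, the same helpers could presumably be applied to the value returned by df.iloc[0,0], or to the whole column with df['DB_ColumnBlob'].apply(blob_to_ints).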


Tags: python, sqlite, blob, pandas