pymysql encoding error while inserting binary data into a longblob column
I am trying to insert the contents of a binary file into a longblob column:
Python code:
import pymysql

conn = pymysql.connect(...)
cursor = conn.cursor()
with open('test.bz2', 'rb') as fp:
    data = fp.read()  # raw bytes, not str
cursor.execute('insert into test_t (test) values (%s)', [data])
Error stack trace:
Traceback (most recent call last):
  File "./doit2", line 9, in <module>
    cursor.execute('insert into test_t (test) values (%s)', [data])
  File "/u02/srm_tp/local/lib/python3.4/site-packages/pymysql/cursors.py", line 127, in execute
    result = self._query(query)
  File "/u02/srm_tp/local/lib/python3.4/site-packages/pymysql/cursors.py", line 275, in _query
    conn.query(q)
  File "/u02/srm_tp/local/lib/python3.4/site-packages/pymysql/connections.py", line 763, in query
    sql = sql.encode(self.encoding)
UnicodeEncodeError: 'latin-1' codec can't encode character '\udcae' in position 45: ordinal not in range(256)
Create table script:
mysql> show create table test_t;
+--------+--------------------------------------------------------------------------+
| Table | Create Table |
+--------+--------------------------------------------------------------------------+
| test_t | CREATE TABLE `test_t` (
`test` longblob
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
+--------+--------------------------------------------------------------------------+
Default encoding:
=->python3 -c 'import sys; print(sys.getdefaultencoding())'
utf-8
Adding "charset='utf8', use_unicode=True" to the connect call changes the error to:
Traceback (most recent call last):
  File "./doit2", line 13, in <module>
    cursor.execute('insert into test_t (test) values (%s)', [data])
  File "/u02/srm_tp/local/lib/python3.4/site-packages/pymysql/cursors.py", line 127, in execute
    result = self._query(query)
  File "/u02/srm_tp/local/lib/python3.4/site-packages/pymysql/cursors.py", line 275, in _query
    conn.query(q)
  File "/u02/srm_tp/local/lib/python3.4/site-packages/pymysql/connections.py", line 763, in query
    sql = sql.encode(self.encoding)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcae' in position 45: surrogates not allowed
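For readers wondering where a character like '\udcae' comes from: one plausible source (an assumption, not confirmed from the pymysql code in this thread) is that the raw bytes were turned into a str with the "surrogateescape" error handler somewhere before the query was encoded. The following sketch uses only the standard library and a made-up 4-byte sample to reproduce both error messages above:

```python
# Hypothetical sample: a few bytes like those at the start of a .bz2 file.
# 0xAE is not valid ASCII, just like most bytes in compressed data.
raw = b"BZh\xae"

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
text = raw.decode("ascii", "surrogateescape")


def try_encode(s, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError reason on failure."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc.reason


# Strict encoding cannot represent a lone surrogate in either codec.
latin1_result = try_encode(text, "latin-1")
utf8_result = try_encode(text, "utf-8")

# Only encoding with the same error handler recovers the original bytes.
roundtrip = text.encode("ascii", "surrogateescape")

print(repr(text), latin1_result, utf8_result, roundtrip == raw)
```

If that is what happens, changing the connection charset alone would not help, which matches the second traceback: the surrogate is rejected by both 'latin-1' and 'utf-8'.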
This should solve the problem:
conn = pymysql.connect(...)
conn.set_character_set('utf8')
cursor = conn.cursor()
cursor.execute('SET NAMES utf8;')
cursor.execute('SET CHARACTER SET utf8;')
cursor.execute('SET character_set_connection=utf8;')
with open('test.bz2', 'rb') as fp:
    data = fp.read()
cursor.execute('insert into test_t (test) values (%s)', [data])
This looks like a pymysql bug. I upgraded from 0.6.4 to 0.6.6 (the latest at the time of writing), and the problem no longer occurs.
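If you want to guard against the affected release in a script, a small version check is one option. This is only a sketch: the helper names are hypothetical, and the threshold assumes 0.6.6 is the first good release, which is all this thread establishes (0.6.5 was not tested here).

```python
import re

# Assumption: 0.6.6 is the first release reported in this thread to work.
FIRST_KNOWN_GOOD = (0, 6, 6)


def version_tuple(version):
    """Parse the leading 'X.Y' or 'X.Y.Z' of a version string into three ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    # A missing third component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_known_good(version):
    """True if the version is at or above the first release known to work."""
    return version_tuple(version) >= FIRST_KNOWN_GOOD


print(is_known_good("0.6.4"), is_known_good("0.6.6"))
```

In a real script you would pass in the installed version string, for example pymysql.__version__ if your installation exposes it.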