HTML posted form data gets written as gibberish into MySQL database

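When I try to enter Greek text into the form's text fields, my WSGI script saves that data as gibberish in the MySQL database, and I don't know why. Here is the relevant code that builds the form which sends the data: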

pdata = pdata + '''
<form methods="POST" enctype="multipart/form-data" action="%s">
    <tr>
            <td> <center>   <input type="text"  name="task"     size=50>    </td>
            <td> <center>   <input type="text"  name="price"    size=5>     </td>
            <td> <center>   <input type="text"  name="lastvisit">           </td>
        </table><br><br>
        <td>    <input type="image" src="/static/img/submit.gif" name="update" value="Ενημέρωση!">  </td>
    </tr>
</form>
''' % app.get_url( '/update/<name>', name=name )


pdata = pdata + "<meta http-equiv='REFRESH' content='200;%s'>" % app.get_url( '/' )
return pdata

And here is the relevant callback function, which tries to insert the posted form data into the MySQL database:

@app.route( '/update/<name>' )
def update( name ):
    pdata = ''

    task = request.query.get('task')
    price = request.query.get('price')
    lastvisit = request.query.get('lastvisit')

    # check if date entered as intended, format it properly for MySQL
    lastvisit = datetime.strptime(lastvisit, '%d %m %Y').strftime('%Y-%m-%d')

    if( ( task and len(task) <= 200 ) and ( price and price.isdigit() and len(price) <= 3 ) and lastvisit != "error" ):
        # find the requested client based on its name
        cur.execute('''SELECT ID FROM clients WHERE name = %s''', name )
        clientID = cur.fetchone()[0]

        try:
            # found the client, save primary key and use it to issue hits & money UPDATE
            cur.execute('''UPDATE clients SET hits = hits + 1, money = money + %s WHERE ID = %s''', ( int(price), clientID ))

            # update client profile by adding a new record
            cur.execute('''INSERT INTO jobs (clientID, task, price, lastvisit) VALUES (%s, %s, %s, %s)''', ( clientID, task, price, lastvisit ))
        except:
            cur.rollback()

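I can't understand why the data is stored in the database as gibberish rather than as proper UTF-8. Trying utf-8 as the form's encoding type did not help either: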

<form methods="POST" enctype="utf-8" action="%s">

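According to the mod_wsgi documentation, the default encoding for a WSGIDaemonProcess is ASCII. ASCII does not include Greek characters, so your input is not decoded correctly. If you want to allow Greek characters, you must use UTF-8 or a Greek-capable legacy encoding such as ISO-8859-7 (ISO-8859-1 does not contain Greek letters). Servers are usually daemons started by the init system, and 99% of the time they still use ASCII as the default encoding. You usually don't run into these problems while developing or debugging, because the Python script inherits the environment of the current user, which typically uses UTF-8.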

$env
.....
LANG=en_GB.UTF-8
.....
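
To compare what the daemon process actually sees with what your shell shows, you can log the encodings from inside the application. This is a minimal sketch using only the standard library; the helper name `describe_encodings` is made up for illustration, and the values it reports depend on the Python version and on how the process was started.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings this Python process is currently using."""
    return {
        # Encoding derived from the process locale (LANG / LC_ALL)
        "preferred": locale.getpreferredencoding(False),
        # Encoding used for file names and OS-level strings
        "filesystem": sys.getfilesystemencoding(),
        # Default str <-> bytes codec of the interpreter
        "default": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(f"{name}: {value}")
```

Calling the same helper once from an interactive shell and once from a request handler running under the daemon should show whether the two environments differ.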

Quoting the mod_wsgi documentation for Apache:

lang=locale Set the current language locale. This is the same as having set the LANG environment variable. You will need to set this on many Linux systems where Apache when started up from system init scripts uses the default C locale, meaning that the default system encoding is ASCII. Unless you need a special language locale, set this to en_US.UTF-8. Whether the lang or locale option works best can depend on the system being used. Set both if you aren’t sure which is appropriate.

locale=locale Set the current language locale. This is the same as having set the LC_ALL environment variable. You will need to set this on many Linux systems where Apache when started up from system init scripts uses the default C locale, meaning that the default system encoding is ASCII. Unless you need a special language locale, set this to en_US.UTF-8. Whether the lang or locale option works best can depend on the system being used. Set both if you aren’t sure which is appropriate.
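
The two options quoted above are arguments of the `WSGIDaemonProcess` directive. As a sketch, an Apache configuration that sets both could look like the fragment below; the process group name `myapp` is a placeholder, and any other options your existing directive already carries would stay as they are.

```apache
# "myapp" is a hypothetical process group name; reuse the one from your config.
WSGIDaemonProcess myapp lang='en_US.UTF-8' locale='en_US.UTF-8'
WSGIProcessGroup myapp
```

Apache has to be restarted for the daemon processes to pick up the new locale.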

The html form data to be posted is "αυτή είναι μια δοκιμή" and the end result inside database is "αÏÏή είναι μια δοκιμή"

However, the stored result quoted above is clearly not valid UTF-8, because the byte at position 38 (ή) marks the start of a two-byte UTF-8 character but is followed by only 1 byte (reference).

If that is exactly the data being passed to your code, then you need to check and confirm that the HTML form submits its data as proper UTF-8:

<form accept-charset='UTF-8'>
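
One way to confirm what the form really sent is to test the raw bytes before they go anywhere near the database. The helper below is a small sketch (the name `is_valid_utf8` is made up for illustration); it also shows the failure described above, where a two-byte sequence is cut off after its first byte.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


complete = "δοκιμή".encode("utf-8")
truncated = complete[:-1]  # drops the second byte of the final two-byte character

print(is_valid_utf8(complete))
print(is_valid_utf8(truncated))
```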

Assuming your input string is correctly UTF-8 encoded, your output string “αεÏÏε® εεεε½α±ε εεεεα εεεεμε” is UTF-7 or, more likely, ISO-8859-1 encoded (reference).
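
The ISO-8859-1 hypothesis is easy to reproduce. The sketch below encodes the test sentence as UTF-8, decodes those bytes with the wrong codec to get the typical garbled form, and then reverses the mistake. It only illustrates the mechanism and does not prove at which layer of your stack the wrong decoding happens.

```python
original = "αυτή είναι μια δοκιμή"

# What a correct browser submission looks like on the wire
utf8_bytes = original.encode("utf-8")

# Decoding UTF-8 bytes as ISO-8859-1 never fails, it just yields gibberish
garbled = utf8_bytes.decode("iso-8859-1")

# Because ISO-8859-1 maps every byte to one character, the mistake is reversible
restored = garbled.encode("iso-8859-1").decode("utf-8")

print(repr(garbled))
print(restored == original)
```

If the text in your database can be repaired with the same encode and decode round trip, the data was stored as UTF-8 bytes that were read as ISO-8859-1 somewhere along the way.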

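So the problem probably lies either in the transport mechanism (as defined above, in the HTML form submission) or in the database storage encoding.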

Yes MySQL Tables and Columns are configured to be utf8_general_ci.

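This could also be an issue. MySQL's utf8_ is NOT full UTF-8 (wat?!), because it stores at most 3 bytes per character rather than 4. So if you store a 4-byte UTF-8 character, it is not stored correctly, and the rest of the value can come out as garbage or be cut off.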
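
To see which characters would actually exceed the 3-byte limit, you can measure their UTF-8 width. This is a small sketch with a made-up helper name, `max_utf8_width`; note that Greek letters need only 2 bytes each, so this limit matters mainly for characters such as emoji.

```python
def max_utf8_width(text: str) -> int:
    """Return the largest number of UTF-8 bytes used by any character in text."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


print(max_utf8_width("δοκιμή"))      # Greek letters
print(max_utf8_width("test \U0001F600"))  # text containing an emoji
```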

Solution:

Update your MySQL columns and all collations to utf8mb4_unicode_ci.

Also check that your MySQL transport mechanism (the client connection) is using utf8mb4_ as well.
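
As a sketch of what those two steps could look like, the helper below only builds the SQL strings; running them through your own cursor is left to you. The function name is made up for illustration, the table names `clients` and `jobs` are taken from the question's queries, and you should back up the data before converting real tables.

```python
def utf8mb4_statements(tables):
    """Build SQL that switches the connection and the given tables to utf8mb4."""
    # Make the current client connection talk utf8mb4
    statements = ["SET NAMES utf8mb4"]
    for table in tables:
        # Convert the table's existing text columns and its default character set
        statements.append(
            f"ALTER TABLE `{table}` "
            "CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci"
        )
    return statements


for sql in utf8mb4_statements(["clients", "jobs"]):
    print(sql)
```

Many Python MySQL drivers also accept a character set when the connection is opened (for example a `charset` argument), which avoids having to issue `SET NAMES` by hand; check the documentation of the driver you use.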
