Weird behavior when doing a POST with PyCurl

I have some simple code that POSTs data to a remote server:

import json
from StringIO import StringIO

import pycurl

def main():
    headers = {}
    headers['Content-Type'] = 'application/json'

    target_url = r'the_url'

    data = {"bodyTextPlain": "O estimulante concorrente dos azulzinhos\r\nConhe\u00e7a a nova tend\u00eancia em estimulante masculino e feminino\r\n\r\nEste estimulante ficou conhecido por seus efeitos similares as p\u00edlulas\r\nazuis,\r\ndestacando-se por n\u00e3o possuir contraindica\u00e7\u00e3o ou efeito colateral.\r\n\r\nSucesso de vendas e principal concorrente natural dos azulzinhos,\r\nsua f\u00f3rmula \u00e9 totalmente natural e livre de qu\u00edmicos.\r\n\r\nPossuindo registro no Minist\u00e9rio da Sa\u00fade (ANVISA) e atestado de\r\nautenticidade.\r\n\r\nSaiba mais http://www5.somenteagora.com.br/maca\r\nAdquirindo 3 frascos voc\u00ea ganha +1 de brinde. Somente esta semana!\r\n\r\n\r\n\r\n\r\nPare de receber\r\nhttp://www5.somenteagora.com.br/app/sair/3056321/1\r\n\r\n"}

    buffer = StringIO()
    curl = pycurl.Curl()
    curl.setopt(curl.URL, target_url)
    curl.setopt(pycurl.HTTPHEADER, ['%s: %s' % (k, v) for k, v in headers.items()])

    # this line causes the problem
    curl.setopt(curl.POSTFIELDS, json.dumps(data))

    curl.setopt(pycurl.SSL_VERIFYPEER, False)
    curl.setopt(pycurl.SSL_VERIFYHOST, False)
    curl.setopt(pycurl.WRITEFUNCTION, buffer.write)
    curl.perform()

    response = buffer.getvalue()

    print curl.getinfo(pycurl.HTTP_CODE)
    print response

The remote server fails while parsing the JSON string I send:

500 { "status" : "Error", "message" : "Unexpected IOException (of type java.io.CharConversionException): Invalid UTF-32 character 0x3081a901(above 10ffff) at char #7, byte #31)" }

However, if I first save the result of json.dumps to a variable and then do the POST:

    #curl.setopt(curl.POSTFIELDS, json.dumps(data))

    data_s = json.dumps(data)
    curl.setopt(curl.POSTFIELDS, data_s)

then there is no error:

200

Is there any difference between these two cases?

Thanks.

This is a very subtle one. The answer lies in this warning from the documentation for Curl.setopt_string(option, value):

Warning: No checking is performed that option does, in fact, expect a string value. Using this method incorrectly can crash the program and may lead to a security vulnerability. Furthermore, it is on the application to ensure that the value object does not get garbage collected while libcurl is using it. libcurl copies most string options but not all; one option whose value is not copied by libcurl is CURLOPT_POSTFIELDS.

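When you use a variable, it holds a reference to the string, so the string is not garbage collected. When you inline the expression, nothing references the string once setopt returns, so it can be freed before libcurl has finished using it, and the result is unpredictable.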
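The lifetime difference can be illustrated without libcurl at all. The following is only an analogy, a minimal sketch assuming CPython's reference counting: `Payload`, `NonOwningClient` and `make_payload` are hypothetical names made up for the illustration, and a weak reference stands in for the raw pointer that libcurl keeps. In the real case the pointer is left dangling rather than becoming None, which is why the server saw garbage bytes.

```python
import weakref


class Payload:
    """Stand-in for the POST body (a plain str cannot be weakly referenced)."""

    def __init__(self, text):
        self.text = text


class NonOwningClient:
    """Hypothetical client that remembers the body without keeping it alive,
    loosely mimicking how libcurl treats CURLOPT_POSTFIELDS."""

    def __init__(self):
        self._body_ref = None

    def set_body(self, payload):
        # Store a weak reference: this does not extend the payload's lifetime.
        self._body_ref = weakref.ref(payload)

    def body_is_alive(self):
        return self._body_ref is not None and self._body_ref() is not None


def make_payload():
    return Payload('{"key": "value"}')


# Inline expression: the temporary has no other reference after the call.
inline_client = NonOwningClient()
inline_client.set_body(make_payload())

# Named variable: `kept` holds a reference for as long as it is in scope.
kept = make_payload()
named_client = NonOwningClient()
named_client.set_body(kept)

print(inline_client.body_is_alive())  # False on CPython
print(named_client.body_is_alive())   # True
```

Other Python implementations may free the temporary later, so the inline case is unreliable rather than guaranteed to fail.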

To avoid having to worry about object lifetimes, you can use CURLOPT_COPYPOSTFIELDS instead.
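A rough sketch of what that might look like, written for Python 3 and assuming your pycurl build exposes `pycurl.COPYPOSTFIELDS`. The helper names `encode_json_body` and `post_json` are made up for this example and are not part of pycurl.

```python
import json
from io import BytesIO


def encode_json_body(data):
    """Serialize data to JSON and return it as UTF-8 bytes."""
    return json.dumps(data).encode("utf-8")


def post_json(url, data):
    """POST data as JSON and return (status_code, response_bytes)."""
    # Imported here so encode_json_body can be used without pycurl installed.
    import pycurl

    body = encode_json_body(data)
    buffer = BytesIO()
    curl = pycurl.Curl()
    try:
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.HTTPHEADER, ["Content-Type: application/json"])
        # Set the size first: libcurl copies that many bytes at the moment
        # COPYPOSTFIELDS is set.
        curl.setopt(pycurl.POSTFIELDSIZE, len(body))
        # libcurl makes its own copy, so `body` may be freed afterwards.
        curl.setopt(pycurl.COPYPOSTFIELDS, body)
        curl.setopt(pycurl.WRITEFUNCTION, buffer.write)
        curl.perform()
        return curl.getinfo(pycurl.RESPONSE_CODE), buffer.getvalue()
    finally:
        curl.close()
```

With this, `post_json(target_url, data)` should behave the same whether or not the caller keeps the serialized string around, because libcurl owns its copy of the body.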