Python Requests: emulate a curl POST sending a multipart request with 1 or more files AND a JSON body

I have been working on this for two days without success!

Working curl request:

curl -X POST -v "http://:8080/controller/endpoint" -H "Cache-Control: no-cache" -H "Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW" -F "message={ \"id\": \"b3562c86-6ff4-4bf7-9c4a-4c64fff4d0ea\", \"stuff\": [
{
\"id\": \"1ca2d9b1-1d73-432a-b483-be404afff8da\",
.......
\"endTime\": \"\"
}]}};type=application/json" -F "files=@file.zip"

It returns the following output:

 ./rest.sh http://127.0.0.1/anything
* Hostname was NOT found in DNS cache
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> POST /anything HTTP/1.1
> User-Agent: curl/7.35.0
> Host: 127.0.0.1
> Accept: */*
> Cache-Control: no-cache
> Content-Length: 493
> Expect: 100-continue
> Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW; boundary=------------------------52912a6946761b42
>
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
* Server gunicorn/19.9.0 is not blacklisted
< Server: gunicorn/19.9.0
< Date: Tue, 12 Feb 2019 18:18:56 GMT
< Connection: keep-alive
< Content-Type: application/json
< Content-Length: 725
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
<
{
  "args": {},
  "data": "",
  "files": {
    "files": "ZIP-CONTENT-GOES-HERE"
  },
  "form": {
    "message": "{ \"runId\": \"1ca2d9b1-1d73-432a-b483-be404a13e8da\", \"reports\": [\n{\n\"executionId\": \"1ca2d9b1-1d73-432a-b483-be404a13e8da\",\n\"endTime\": \"\"\n}]}}"
  },
  "headers": {
    "Accept": "*/*",
    "Cache-Control": "no-cache",
    "Content-Length": "493",
    "Content-Type": "multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW; boundary=------------------------52912a6946761b42",
    "Expect": "100-continue",
    "Host": "127.0.0.1",
    "User-Agent": "curl/7.35.0"
  },
  "json": null,
  "method": "POST",
  "origin": "172.17.42.1",
  "url": "http://127.0.0.1/anything"
}
* Connection #0 to host 127.0.0.1 left intact

Now, if I append ,scrub2.zip to the curl command (sending 2 zip files plus the JSON data), I get the following output:

 ./rest.sh http://127.0.0.1/anything
* Hostname was NOT found in DNS cache
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
> POST /anything HTTP/1.1
> User-Agent: curl/7.35.0
> Host: 127.0.0.1
> Accept: */*
> Cache-Control: no-cache
> Content-Length: 878
> Expect: 100-continue
> Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW; boundary=------------------------27d684afce904423
>
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
* Server gunicorn/19.9.0 is not blacklisted
< Server: gunicorn/19.9.0
< Date: Tue, 12 Feb 2019 18:20:36 GMT
< Connection: keep-alive
< Content-Type: application/json
< Content-Length: 1117
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
<
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "files": "--------------------------fd702594c1765b85\r\nContent-Disposition: attachment; filename=\"scrubbed.zip\"\r\nContent-Type: application/octet-stream\r\n\r\nZIP-CONTENT-GOES-HERE\r\n--------------------------fd702594c1765b85\r\nContent-Disposition: attachment; filename=\"scrubbed2.zip\"\r\nContent-Type: application/octet-stream\r\n\r\nZIP-CONTENT-GOES-HERE222222222\n\r\n--------------------------fd702594c1765b85--",
    "message": "{ \"runId\": \"1ca2d9b1-1d73-432a-b483-be404a13e8da\", \"reports\": [\n{\n\"executionId\": \"1ca2d9b1-1d73-432a-b483-be404a13e8da\",\n\"endTime\": \"\"\n}]}}"
  },
  "headers": {
    "Accept": "*/*",
    "Cache-Control": "no-cache",
    "Content-Length": "878",
    "Content-Type": "multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW; boundary=------------------------27d684afce904423",
    "Expect": "100-continue",
    "Host": "127.0.0.1",
    "User-Agent": "curl/7.35.0"
  },
  "json": null,
  "method": "POST",
  "origin": "172.17.42.1",
  "url": "http://127.0.0.1/anything"
}
* Connection #0 to host 127.0.0.1 left intact

Do you see the difference? The 2 files are now embedded in form/files, instead of showing up under files separately from form/message!
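To make that difference concrete, here is a minimal sketch using only the standard library. It assumes the second request wraps both archives in a nested multipart/mixed part, which is what the echoed form/files value suggests. `list_parts`, `FLAT`, `NESTED`, and the boundaries `B1`/`B2` are hypothetical names and hand-written sample bodies, not captured traffic.

```python
from email import message_from_bytes


def list_parts(body, content_type):
    """Return (field name, filename, depth) for each leaf part of a multipart body."""
    # The stdlib email parser expects the Content-Type header ahead of the body.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    found = []

    def visit(part, depth, field):
        # Nested parts carry no name of their own, so inherit the parent's field name.
        name = part.get_param("name", header="content-disposition") or field
        if part.is_multipart():
            for child in part.get_payload():
                visit(child, depth + 1, name)
        else:
            found.append((name, part.get_filename(), depth))

    visit(message_from_bytes(raw), 0, None)
    return found


# Hypothetical body: each file is its own top-level "files" part.
FLAT = (
    b'--B1\r\n'
    b'Content-Disposition: form-data; name="files"; filename="scrubbed.zip"\r\n'
    b'Content-Type: application/octet-stream\r\n\r\n'
    b'ZIP-ONE\r\n'
    b'--B1\r\n'
    b'Content-Disposition: form-data; name="files"; filename="scrubbed2.zip"\r\n'
    b'Content-Type: application/octet-stream\r\n\r\n'
    b'ZIP-TWO\r\n'
    b'--B1--\r\n'
)

# Hypothetical body: one "files" part that itself contains both files.
NESTED = (
    b'--B1\r\n'
    b'Content-Disposition: form-data; name="files"\r\n'
    b'Content-Type: multipart/mixed; boundary=B2\r\n\r\n'
    b'--B2\r\n'
    b'Content-Disposition: attachment; filename="scrubbed.zip"\r\n'
    b'Content-Type: application/octet-stream\r\n\r\n'
    b'ZIP-ONE\r\n'
    b'--B2\r\n'
    b'Content-Disposition: attachment; filename="scrubbed2.zip"\r\n'
    b'Content-Type: application/octet-stream\r\n\r\n'
    b'ZIP-TWO\r\n'
    b'--B2--\r\n'
    b'--B1--\r\n'
)

flat_parts = list_parts(FLAT, "multipart/form-data; boundary=B1")
nested_parts = list_parts(NESTED, "multipart/form-data; boundary=B1")
print(flat_parts)
print(nested_parts)
```

In the flat body both files sit directly under the form, while in the nested body they sit one level deeper inside a single files part. That would explain why the echo service reports them as one form value rather than as files.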

The Java API endpoint accepts this curl request, as seen in the debugger.

But all my attempts in Python, such as:

multipart_form_data_object = {
    'scrubbed.zip': (args.files[0], open(args.files[0], 'rb'), "application/json"),
    'files': (args.files[1], open(args.files[1], 'rb'), "application/json"),
    'message': (None, open(args.message, 'rb'), 'application/json')
}
response = requests.post(args.url + ':' + str(args.port) + '/' + args.endpoint, files=multipart_form_data_object,
                         proxies=proxies)

(which is the closest I have come to getting it to work) produce this output:

{'Content-Length': '664', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'User-Agent': 'python-requests/2.21.0', 'Connection': 'keep-alive', 'Content-Type': 'multipart/form-data; boundary=227d4ef5a41db8a690e5cebadf336851'}
{
  "args": {},
  "data": "",
  "files": {
    "files": "ZIP-CONTENT-GOES-HERE",
    "scrubbed.zip": "ZIP-CONTENT-GOES-HERE22222"
  },
  "form": {
    "message": "{\r\n  \"runId\": \"9c4a-4c64f6d4d0ea\",\r\n  \"reports\": [\r\n    {\r\n      \"executionId\": \"d73-432a-b483-be404a13e8da\",\r\n      \"endTime\": \"\"\r\n    }\r\n  ]\r\n}"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Connection": "keep-alive",
    "Content-Length": "664",
    "Content-Type": "multipart/form-data; boundary=227d4ef5a41db8a690e5cebadf336851",
    "Host": "java.api.host.com",
    "User-Agent": "python-requests/2.21.0"
  },
  "json": null,
  "method": "POST",
  "origin": "10.0.0.2",
  "url": "http://java.api.host.com/anything"
}

Now, trying to tweak it to send an array of files (otherwise, if I rename scrubbed.zip to files, it gets overwritten), so that it looks like:

multipart_form_data_object = {
    'files': [(args.files[0], open(args.files[0], 'rb'), "application/json"),
     (args.files[1], open(args.files[1], 'rb'), "application/json")],
    'message': (None, open(args.message, 'rb'), 'application/json')
}

results in this error:

Traceback (most recent call last):
  File ".\load_stress_test_endpoint.py", line 84, in <module>
    post()
  File ".\load_stress_test_endpoint.py", line 76, in post
    proxies=proxies)
  File "C:\Python\Python27\lib\site-packages\requests\api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "C:\Python\Python27\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python\Python27\lib\site-packages\requests\sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "C:\Python\Python27\lib\site-packages\requests\sessions.py", line 462, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Python\Python27\lib\site-packages\requests\models.py", line 316, in prepare
    self.prepare_body(data, files, json)
  File "C:\Python\Python27\lib\site-packages\requests\models.py", line 504, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "C:\Python\Python27\lib\site-packages\requests\models.py", line 169, in _encode_files
    body, content_type = encode_multipart_formdata(new_fields)
  File "C:\Python\Python27\lib\site-packages\urllib3\filepost.py", line 90, in encode_multipart_formdata
    body.write(data)
TypeError: 'tuple' does not have the buffer interface

My last attempt used a different data structure (a list), as follows:

multiple_files_list = [
    ('files', (args.files[0], open(args.files[0], 'rb'), "application/json")),
    ('files', (args.files[1], open(args.files[1], 'rb'), "application/json")),
    ('message', None, open(args.message, 'rb'), 'application/json')
]

which results in this error:

Traceback (most recent call last):
  File ".\load_stress_test_endpoint.py", line 84, in <module>
    post()
  File ".\load_stress_test_endpoint.py", line 76, in post
    proxies=proxies)
  File "C:\Python\Python27\lib\site-packages\requests\api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "C:\Python\Python27\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python\Python27\lib\site-packages\requests\sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "C:\Python\Python27\lib\site-packages\requests\sessions.py", line 462, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Python\Python27\lib\site-packages\requests\models.py", line 316, in prepare
    self.prepare_body(data, files, json)
  File "C:\Python\Python27\lib\site-packages\requests\models.py", line 504, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "C:\Python\Python27\lib\site-packages\requests\models.py", line 141, in _encode_files
    for (k, v) in files:
ValueError: too many values to unpack

Can someone please advise how to make the Python requests package behave like the curl request?

Here is how the Java endpoint is set up:

public Response index(@RequestPart("message") @Valid
                          final Message message,
                          @ApiParam(value = "Multipart File array of compressed archives (zip) ", required = true) @RequestPart("files") @Valid
                          final MultipartFile[] files)

Your script should look like this:

Note: there is a dependency on requests_toolbelt.

send.py

import argparse
import requests
from requests_toolbelt import MultipartEncoder

parser = argparse.ArgumentParser()
parser.add_argument('message')
parser.add_argument('--files', nargs='+', required=True)
args = parser.parse_args()

# One ('files', ...) entry per archive, so every file is sent under the same field name.
fields = [('files', (name, open(name, 'rb'), "application/json")) for name in args.files]
# The JSON message is sent as its own part with an application/json content type.
fields.append(('message', ('message', open(args.message, 'rb'), 'application/json')))

multipart_form_data_object = MultipartEncoder(fields=fields)

# The encoder generates the boundary, so take the Content-Type header from it.
res = requests.post('http://localhost:8000', data=multipart_form_data_object,
                    headers={'Content-Type': multipart_form_data_object.content_type})
print(res.content)

I tested it with Django:

urls.py

from django.urls import path
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def dump(request):
    data = {name: [o.read().decode('utf8') for o in request.FILES.getlist(name)] for name in request.FILES.keys()}
    return JsonResponse(data)

urlpatterns = [
    path('', dump),
]

Calling it with curl:

curl -s http://127.0.0.1:8000/ -F "message=@$(pwd)/file1" -F "files=@$(pwd)/file2" -F "files=@$(pwd)/file3"

and with Python:

python send.py file1 --files file2 file3

gives the same output:

{"files": ["{\"message\": \"hello world\"}\n", "something else\n"], "message": ["hello world\n"]}
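As a possible alternative without requests_toolbelt: the list-based attempt in the question appears to fail only because its message entry is a 4-element tuple, while requests expects every entry to be a 2-tuple of (field name, file tuple). The sketch below builds the request with plain requests and inspects the prepared body without sending it. The file contents, the JSON string, and the URL are hypothetical placeholders.

```python
import requests

# Hypothetical in-memory stand-ins for the zip archives and the JSON message.
files = [
    ("files", ("scrubbed.zip", b"ZIP-ONE", "application/zip")),
    ("files", ("scrubbed2.zip", b"ZIP-TWO", "application/zip")),
    # A filename of None yields a plain form part that still carries a Content-Type.
    ("message", (None, '{"id": "example"}', "application/json")),
]

# Prepare the request without sending it so the generated body can be inspected.
prepared = requests.Request("POST", "http://localhost:8000/", files=files).prepare()
body = prepared.body

print(prepared.headers["Content-Type"])
print(body.count(b'name="files"'))
```

Passing the same list to requests.post(url, files=files) should send two separate files parts plus a message part typed as application/json. Whether the Java endpoint accepts that exact layout is an assumption to verify against the real server.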