Uploading Office Open XML files using API POST method in Python

Question

I am trying to write a script that automates some of my work with our CAT tool (Memsource). To do that, I need to upload some files through the API.

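I am relying on the Memsource API documentation available here: https://cloud.memsource.com/web/docs/api#operation/createJob

I wrote a short piece of code to test the file upload before making it asynchronous, but I ran into a serious problem: text files upload correctly, except that the body of the text contains some additions after the upload: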

--4002a5507da490554ad71ce8591ccf69
Content-Disposition: form-data; name="file"; filename="test.txt"

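I also tried uploading a DOCX file, but it cannot even be opened in the Memsource online editor. I guess the content gets modified somewhere along the way, but I can't find where...

The code responsible for the upload is as follows: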

def test_upload(self):
    # Assemble "Memsource" header as mentioned in the API docs
    Memsource_header = {
        "targetLangs": ["pl"],
    }

    # Open the file to be uploaded and extract file name
    f = open("Own/TMS_CAT/test.txt", "rb")
    f_name = os.path.basename(f.name)

    # Assemble the request header
    header = {
        "Memsource": json.dumps(Memsource_header),
        "Content-Disposition": f'attachment; filename="{f_name}"',
        "Authorization": f"ApiToken {self.authToken}",
        "Content-Type": "application/octet-stream; charset=utf-8",
    }

    # Make POST request and catch results
    file = {"file": f}

    req = requests.post(
        "https://cloud.memsource.com/web/api2/v1/projects/{project-id}/jobs",
        headers=header,
        files=file,
    )
    print(req.request.headers)
    print(req.json())

Request headers:

{
   "User-Agent":"python-requests/2.27.1",
   "Accept-Encoding":"gzip, deflate",
   "Accept":"*/*",
   "Connection":"keep-alive",
   "Memsource":"{\"targetLangs\": [\"pl\"]}",
   "Content-Disposition":"attachment; filename=\"test.txt\"",
   "Authorization":"ApiToken {secret}",
   "Content-Type":"application/octet-stream; charset=utf-8",
   "Content-Length":"2902"
}

Response from Memsource:

{
   "asyncRequest":{
      "action":"IMPORT_JOB",
      "dateCreated":"2022-02-22T18:36:30+0000",
      "id":"{id}"
   },
   "jobs":[
      {
         "workflowLevel":1,
         "workflowStep":{
            "uid":"{uid}",
            "order":2,
            "id":"{id}",
            "name":"Tra"
         },
         "imported":false,
         "dateCreated":"2022-02-22T18:36:30+0000",
         "notificationIntervalInMinutes":-1,
         "updateSourceDate":"None",
         "dateDue":"2022-10-10T12:00:00+0000",
         "targetLang":"pl",
         "continuous":false,
         "jobAssignedEmailTemplate":"None",
         "uid":"{id}",
         "status":"NEW",
         "filename":"test.txt",
         "sourceFileUid":"{id}",
         "providers":[
            
         ]
      }
   ],
   "unsupportedFiles":[
      
   ]
}

It all looks fine to me...

I would appreciate any suggestions on how to get this working! :-)

Answer 1

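I managed to solve the problem. When a file is passed through the files parameter, requests encodes the body as multipart/form-data, so it wraps the file content in a boundary line and part headers (the additions visible in the uploaded text). Because my Content-Type header announced application/octet-stream, the server stored that whole wrapper as the file.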
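The extra content that requests adds when a file goes through files can be checked offline. The sketch below is only an illustration: it builds two requests with requests.Request(...).prepare() without sending them, and compares the bytes that would go on the wire. The URL, the helper name prepared_body and the sample payload are made up for the demonstration; nothing in it is specific to Memsource.

```python
import io

import requests

# Hypothetical endpoint; the request is only prepared, never sent.
URL = "https://example.invalid/upload"
HEADERS = {"Content-Type": "application/octet-stream"}


def prepared_body(**kwargs) -> bytes:
    """Prepare a POST request and return the body bytes it would send."""
    prepared = requests.Request("POST", URL, headers=HEADERS, **kwargs).prepare()
    body = prepared.body
    # With data=<file object> the body is the file object itself.
    return body.read() if hasattr(body, "read") else body


# Made-up stand-in for the content of a DOCX file.
payload = b"PK\x03\x04 pretend this is a DOCX"

# files= wraps the payload in a multipart/form-data envelope.
multipart = prepared_body(files={"file": ("test.docx", io.BytesIO(payload))})

# data= sends the payload unchanged.
raw = prepared_body(data=io.BytesIO(payload))

print("multipart body starts with a boundary:", multipart.startswith(b"--"))
print("multipart body equals the file:", multipart == payload)
print("raw body equals the file:", raw == payload)
```

Only the data= variant hands the server exactly the bytes of the file, which is what an application/octet-stream upload is expected to contain.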

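I simply dropped the files parameter and passed the open file through data instead, so the file is sent as the raw request body: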

        # Open the file to be uploaded and extract file name
        with open("/file.ext", "rb") as f:
            f_name = os.path.basename(f.name)

            # Assemble the request header
            header = {
                "Memsource": json.dumps(Memsource_header),
                "Content-Disposition": f'attachment; filename="{f_name}"',
                "Authorization": f"ApiToken {self.authToken}",
                "Content-Type": "application/octet-stream; charset=utf-8",
            }

            # data=f streams the file as the raw body (no multipart wrapper);
            # the request must be made while the file is still open
            req = requests.post(
                "https://cloud.memsource.com/web/api2/v1/projects/{project-id}/jobs",
                headers=header,
                data=f,
            )
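For reuse, the same upload could be wrapped in a small standalone function. This is a rough sketch rather than a tested Memsource client: the function name upload_job, its parameters, the 60-second timeout and the injectable post argument (there only so the function can be exercised without network access) are my own assumptions; the URL pattern and the headers are the ones from the code above.

```python
import json
import os

import requests


def upload_job(project_uid, path, token, target_langs, post=requests.post):
    """Send one file as the raw request body and return the decoded JSON reply."""
    headers = {
        # Job metadata goes into the "Memsource" header as JSON.
        "Memsource": json.dumps({"targetLangs": target_langs}),
        "Content-Disposition": f'attachment; filename="{os.path.basename(path)}"',
        "Authorization": f"ApiToken {token}",
        "Content-Type": "application/octet-stream; charset=utf-8",
    }
    url = f"https://cloud.memsource.com/web/api2/v1/projects/{project_uid}/jobs"

    # Keep the file open for the whole request: data=f streams it unmodified.
    with open(path, "rb") as f:
        response = post(url, headers=headers, data=f, timeout=60)

    # Turn HTTP error statuses into exceptions instead of parsing an error body.
    response.raise_for_status()
    return response.json()
```

It would be called along the lines of upload_job(project_uid, "Own/TMS_CAT/test.txt", token, ["pl"]), where project_uid and token are whatever values your own project uses.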

rest

post

docx

python-3.x