How should I decode the contents of a file to include it in a multipart/form-data POST request?

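I am in a situation where I have to build the body of a multipart/form-data POST request manually. I understand the structure well, and I can successfully upload forms that do not contain files. I have a file as a File object, and I need to turn its contents into a string so that I can include them in the request body. Every example of multipart form data with files that I have come across just has a placeholder like "file contents go here" where the file belongs, and never discusses how to get that string from the file. The top answer to this question comes close to what I am looking for, but I would rather avoid the extra overhead of base64, since my form will handle many files. I found that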

`
--${boundary}
Content-Disposition: form-data; name="file"; filename="${file.name}"
Content-Type: ${file.type}

${await file.text()}`

works for simple PDFs but fails for JPEGs (here "fails" means that my server cannot parse the image correctly).

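I have a working example that uses a FormData instance and Fetch (I cannot use FormData in production). In Chrome DevTools I can get the raw body of the request to see what the file looks like. Here is the beginning of the file: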

Content-Disposition: form-data; name="file"; filename="test.jpg"
Content-Type: image/jpeg

ÿØÿî!AdobedÀ    E¿d„¾¤ÿÛ„           
$$''$335;;;;;;;;;;

With file.text(), the same part of the message looks like this:

����!Adobed�    E�d������           
$$''$33
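
The replacement characters above are consistent with a lossy decode. A minimal sketch of what appears to happen, assuming file.text() decodes the bytes as UTF-8 (as Blob.text() is specified to do) and using TextDecoder as a stand-in for it. The four sample bytes are the Latin-1 values of ÿØÿî and are used for illustration only:

```javascript
// Illustrative bytes: the Latin-1 values of "ÿØÿî" seen at the start of the JPEG.
const jpegHeader = new Uint8Array([0xff, 0xd8, 0xff, 0xee]);

// Stand-in for file.text(): decode the raw bytes as UTF-8.
// None of these bytes forms a valid UTF-8 sequence, so each one
// is replaced by U+FFFD (the replacement character).
const decoded = new TextDecoder("utf-8").decode(jpegHeader);

// Re-encoding the string shows the bytes that would be sent instead.
// The original values are gone; each U+FFFD becomes EF BF BD.
const roundTripped = new TextEncoder().encode(decoded);

console.log([...decoded].map((c) => c.codePointAt(0).toString(16)));
console.log(jpegHeader.length, roundTripped.length);
```

If this is what happens, the original byte values cannot be recovered from the string, which would explain why the server cannot parse the image.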

When the file is decoded like this:

`
--${boundary}
Content-Disposition: form-data; name="file"; filename="${file.name}"
Content-Type: ${file.type}

${String.fromCharCode.apply(null, new Uint8Array(await file.arrayBuffer()))}`
    }
    result += `

the beginning of the file looks correct, but comparing the full strings shows some differences.
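
One possible explanation for the differences, offered as a sketch rather than a confirmed diagnosis: String.fromCharCode maps each byte to one UTF-16 code unit, but a string passed as a fetch body is encoded as UTF-8 before it is sent, so every byte of 0x80 or above turns into two bytes. The sample bytes are illustrative, and TextEncoder stands in for the encoding step that fetch performs on string bodies:

```javascript
// Illustrative bytes: four non-ASCII values followed by three ASCII values.
const original = new Uint8Array([0xff, 0xd8, 0xff, 0xee, 0x00, 0x0e, 0x41]);

// Same conversion as in the snippet above: one character per byte.
const asString = String.fromCharCode.apply(null, original);

// Stand-in for what happens to a string body: it is encoded as UTF-8.
const onTheWire = new TextEncoder().encode(asString);

// The string has as many characters as there were bytes...
console.log(asString.length === original.length);

// ...but the encoded body is longer, because 0xFF becomes C3 BF,
// 0xD8 becomes C3 98, and so on. ASCII bytes pass through unchanged.
console.log(original.length, onTheWire.length);
console.log([...onTheWire].map((b) => b.toString(16).padStart(2, "0")).join(" "));
```

The string looks right when printed, which matches the beginning of the file appearing correct, yet the bytes that leave the browser no longer match the file.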

I found this

4.3 Encoding

   While the HTTP protocol can transport arbitrary binary data, the
   default for mail transport is the 7BIT encoding.  The value supplied
   for a part may need to be encoded and the "content-transfer-encoding"
   header supplied if the value does not conform to the default
   encoding.  [See section 5 of RFC 2046 for more details.]

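in RFC 2388, but I think it refers to how the request body is sent over the network, not to how the body is constructed. I feel like I am missing some core concept here. Any help would be greatly appreciated.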

Edit: Here is how the form data is sent to my server:

        const response = await fetch(url, {
            method: 'POST', // *GET, POST, PUT, DELETE, etc.
            mode: 'cors', // no-cors, *cors, same-origin
            cache: 'no-cache', // *default, no-cache, reload, force-cache, only-if-cached
            credentials: 'same-origin', // include, *same-origin, omit
            redirect: 'follow', // manual, *follow, error
            referrer: 'no-referrer', // no-referrer, *client
            body: serializedData, // body data type must match "Content-Type" header
            headers: {
                'Content-Type': 'multipart/form-data; boundary=' + boundary,
            },
        })

4.10.21.7 Multipart form data

The multipart/form-data encoding algorithm, given an entry list and encoding, is as follows:

Let result be the empty string.

For each entry in entry list:

For each character in the entry's name and value that cannot be expressed using the selected character encoding, replace the character by a string consisting of a U+0026 AMPERSAND character (&), a U+0023 NUMBER SIGN character (#), one or more ASCII digits representing the code point of the character in base ten, and finally a U+003B (;).

Encode the (now mutated) entry list using the rules described by RFC 7578, Returning Values from Forms: multipart/form-data, and return the resulting byte stream. [RFC7578]

Each entry in entry list is a field, the name of the entry is the field name and the value of the entry is the field value.

The order of parts must be the same as the order of fields in entry list. Multiple entries with the same name must be treated as distinct fields.

The parts of the generated multipart/form-data resource that correspond to non-file fields must not have a Content-Type header specified. Their names and values must be encoded using the character encoding selected above.

File names included in the generated multipart/form-data resource (as part of file fields) must use the character encoding selected above, though the precise name may be approximated if necessary (e.g. newlines could be removed from file names, quotes could be changed to "%22", and characters not expressible in the selected character encoding could be replaced by other characters).

The boundary used by the user agent in generating the return value of this algorithm is the multipart/form-data boundary string. (This value is used to generate the MIME type of the form submission payload generated by this algorithm.)

For details on how to interpret multipart/form-data payloads, see RFC 7578. [RFC7578] -- HTML: The Living Standard
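
The algorithm quoted above returns a byte stream, not a string, which suggests assembling the body from bytes. A minimal sketch for a single file field, assuming a Blob is an acceptable fetch body in the target environment. buildMultipartBody and its parameters are hypothetical names, not part of any library, and the quote and newline handling follows the approximation the quoted text allows for file names:

```javascript
// Hypothetical helper: builds a multipart/form-data body with one file part.
// fileData may be a File, Blob, ArrayBuffer, or typed array.
function buildMultipartBody(boundary, fieldName, fileName, fileType, fileData) {
  // Approximate the file name as the quoted text permits:
  // drop newlines and replace double quotes with %22.
  const safeName = fileName.replace(/[\r\n]/g, "").replace(/"/g, "%22");

  // Part headers, separated by CRLF, followed by an empty line.
  const head =
    `--${boundary}\r\n` +
    `Content-Disposition: form-data; name="${fieldName}"; filename="${safeName}"\r\n` +
    `Content-Type: ${fileType || "application/octet-stream"}\r\n` +
    `\r\n`;

  // Closing delimiter after the file bytes.
  const tail = `\r\n--${boundary}--\r\n`;

  // Blob encodes the string parts as UTF-8 and copies the binary part
  // byte for byte, so the file contents are never decoded into a string.
  return new Blob([head, fileData, tail]);
}

// Illustrative call with four sample bytes in place of a real file.
const sampleBytes = new Uint8Array([0xff, 0xd8, 0xff, 0xee]);
const body = buildMultipartBody("XYZ", "file", "test.jpg", "image/jpeg", sampleBytes);
console.log(body.size);
```

With the fetch call from the question, this would be used as `body: buildMultipartBody(boundary, "file", file.name, file.type, file)`, keeping the Content-Type header with the boundary as it is.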

This definitely answers my question, but I am still a bit confused about the implementation.