HTTP 范围请求 multipart/byteranges - 末尾是否有 CRLF?
HTTP Range request multipart/byteranges - is there a CRLF at the end?
RFC7233 很好很清楚,除了行结尾。
我对 multipart/byteranges
响应的 HTTP 响应 body 特别感兴趣。我假设每一行都像 HTTP headers 一样由 CRLF 终止,但本文档并未明确说明。我完全困惑的是最后一行:--THIS_SEPARATOR_SEPARATES--
。它后面是 CRLF 吗?
完整块:
HTTP/1.1 206 Partial Content
Date: Wed, 15 Nov 1995 06:25:24 GMT
Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT
Content-Length: 1741
Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES
--THIS_STRING_SEPARATES
Content-Type: application/pdf
Content-Range: bytes 500-999/8000
...the first range...
--THIS_STRING_SEPARATES
Content-Type: application/pdf
Content-Range: bytes 7000-7999/8000
...the second range
--THIS_STRING_SEPARATES--
抱歉,我实在找不到,如有帮助,将不胜感激。
注意:请不要有直觉,只有 RFC 参考。
该规范引用:
https://www.rfc-editor.org/rfc/rfc2046#section-5.1.1
其中明确指出:
--gc0pJq0M:08jU534c0p
The boundary delimiter MUST occur at the beginning of a line, i.e.,
following a CRLF, and the initial CRLF is considered to be attached
to the boundary delimiter line rather than part of the preceding
part. The boundary may be followed by zero or more characters of
linear whitespace. It is then terminated by either another CRLF and
the header fields for the next part, or by two CRLFs, in which case
there are no header fields for the next part. If no Content-Type
field is present it is assumed to be "message/rfc822" in a
"multipart/digest" and "text/plain" otherwise.
如果您更仔细地阅读 RFC 7233,Appendix A refers to RFC 2046 Section 5.1 了解 HTTP body:
中 MIME 数据的实际格式
When a 206 (Partial Content) response message includes the content of
multiple ranges, they are transmitted as body parts in a multipart
message body ([RFC2046], Section 5.1) with the media type of
"multipart/byteranges".
RFC 2046 第 5.1 节定义了“多部分”媒体类型的正式定义及其边界的格式和解析方式。
为了回答您的问题,这里是来自 RFC 2046 的正式语法:
The boundary delimiter MUST occur at the beginning of a line, i.e.,
following a CRLF, and the initial CRLF is considered to be attached
to the boundary delimiter line rather than part of the preceding
part. The boundary may be followed by zero or more characters of
linear whitespace. It is then terminated by either another CRLF and
the header fields for the next part, or by two CRLFs, in which case
there are no header fields for the next part. If no Content-Type
field is present it is assumed to be "message/rfc822" in a
"multipart/digest" and "text/plain" otherwise.
NOTE: The CRLF preceding the boundary delimiter line is conceptually
attached to the boundary so that it is possible to have a part that
does not end with a CRLF (line break). Body parts that must be
considered to end with line breaks, therefore, must have two CRLFs
preceding the boundary delimiter line, the first of which is part of
the preceding body part, and the second of which is part of the
encapsulation boundary.
...
The boundary delimiter line following the last body part is a
distinguished delimiter that indicates that no further body parts
will follow. Such a delimiter line is identical to the previous
delimiter lines, with the addition of two more hyphens after the
boundary parameter value.
--gc0pJq0M:08jU534c0p--
NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the
boundary value with the beginning of each candidate line. An exact
match of the entire candidate line is not required; it is sufficient
that the boundary appear in its entirety following the CRLF.
...
The only mandatory global parameter for the "multipart" media type is
the boundary parameter, which consists of 1 to 70 characters from a
set of characters known to be very robust through mail gateways, and
NOT ending with white space. (If a boundary delimiter line appears to
end with white space, the white space must be presumed to have been
added by a gateway, and must be deleted.) It is formally specified
by the following BNF:
boundary := 0*69 bcharsnospace
bchars := bcharsnospace / " "
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
"+" / "_" / "," / "-" / "." /
"/" / ":" / "=" / "?"
Overall, the body of a "multipart" entity may be specified as
follows:
dash-boundary := "--" boundary
; boundary taken from the value of
; boundary parameter of the
; Content-Type field.
multipart-body := [preamble CRLF]
dash-boundary transport-padding CRLF
body-part *encapsulation
close-delimiter transport-padding
[CRLF epilogue]
transport-padding := *LWSP-char
; Composers MUST NOT generate
; non-zero length transport
; padding, but receivers MUST
; be able to handle padding
; added by message transports.
encapsulation := delimiter transport-padding
CRLF body-part
delimiter := CRLF dash-boundary
close-delimiter := delimiter "--"
preamble := discard-text
epilogue := discard-text
discard-text := *(*text CRLF) *text
; May be ignored or discarded.
body-part := MIME-part-headers [CRLF *OCTET]
; Lines in a body-part must not start
; with the specified dash-boundary and
; the delimiter must not appear anywhere
; in the body part. Note that the
; semantics of a body-part differ from
; the semantics of a message, as
; described in the text.
OCTET := <any 0-255 octet value>
新部分开头的每个定界符都以 CRLF 终止,紧接定界符之前的任何 CRLF 都被解析为边界的一部分,而不是前一部分的数据。但是,在最终关闭边界的末尾没有 CRLF,除非存在尾声(这在电子邮件中很少使用,而且我从未见过它在 HTTP 中使用,因为无法确定尾声何时结束除非存在有效的 Content-Length
header,它不应该与 self-terminating 内容类型(如 MIME)一起使用。
RFC7233 很好很清楚,除了行结尾。
我对 multipart/byteranges
响应的 HTTP 响应 body 特别感兴趣。我假设每一行都像 HTTP headers 一样由 CRLF 终止,但本文档并未明确说明。我完全困惑的是最后一行:--THIS_SEPARATOR_SEPARATES--
。它后面是 CRLF 吗?
完整块:
HTTP/1.1 206 Partial Content
Date: Wed, 15 Nov 1995 06:25:24 GMT
Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT
Content-Length: 1741
Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES
--THIS_STRING_SEPARATES
Content-Type: application/pdf
Content-Range: bytes 500-999/8000
...the first range...
--THIS_STRING_SEPARATES
Content-Type: application/pdf
Content-Range: bytes 7000-7999/8000
...the second range
--THIS_STRING_SEPARATES--
抱歉,我实在找不到,如有帮助,将不胜感激。 注意:请不要有直觉,只有 RFC 参考。
该规范引用:
https://www.rfc-editor.org/rfc/rfc2046#section-5.1.1
其中明确指出:
--gc0pJq0M:08jU534c0p
The boundary delimiter MUST occur at the beginning of a line, i.e., following a CRLF, and the initial CRLF is considered to be attached
to the boundary delimiter line rather than part of the preceding
part. The boundary may be followed by zero or more characters of
linear whitespace. It is then terminated by either another CRLF and
the header fields for the next part, or by two CRLFs, in which case
there are no header fields for the next part. If no Content-Type
field is present it is assumed to be "message/rfc822" in a
"multipart/digest" and "text/plain" otherwise.
如果您更仔细地阅读 RFC 7233,Appendix A refers to RFC 2046 Section 5.1 了解 HTTP body:
中 MIME 数据的实际格式When a 206 (Partial Content) response message includes the content of multiple ranges, they are transmitted as body parts in a multipart message body ([RFC2046], Section 5.1) with the media type of "multipart/byteranges".
RFC 2046 第 5.1 节定义了“多部分”媒体类型的正式定义及其边界的格式和解析方式。
为了回答您的问题,这里是来自 RFC 2046 的正式语法:
The boundary delimiter MUST occur at the beginning of a line, i.e., following a CRLF, and the initial CRLF is considered to be attached to the boundary delimiter line rather than part of the preceding part. The boundary may be followed by zero or more characters of linear whitespace. It is then terminated by either another CRLF and the header fields for the next part, or by two CRLFs, in which case there are no header fields for the next part. If no Content-Type field is present it is assumed to be "message/rfc822" in a "multipart/digest" and "text/plain" otherwise. NOTE: The CRLF preceding the boundary delimiter line is conceptually attached to the boundary so that it is possible to have a part that does not end with a CRLF (line break). Body parts that must be considered to end with line breaks, therefore, must have two CRLFs preceding the boundary delimiter line, the first of which is part of the preceding body part, and the second of which is part of the encapsulation boundary.
...
The boundary delimiter line following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter line is identical to the previous delimiter lines, with the addition of two more hyphens after the boundary parameter value. --gc0pJq0M:08jU534c0p-- NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the boundary value with the beginning of each candidate line. An exact match of the entire candidate line is not required; it is sufficient that the boundary appear in its entirety following the CRLF.
...
The only mandatory global parameter for the "multipart" media type is the boundary parameter, which consists of 1 to 70 characters from a set of characters known to be very robust through mail gateways, and NOT ending with white space. (If a boundary delimiter line appears to end with white space, the white space must be presumed to have been added by a gateway, and must be deleted.) It is formally specified by the following BNF: boundary := 0*69 bcharsnospace bchars := bcharsnospace / " " bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" / "," / "-" / "." / "/" / ":" / "=" / "?" Overall, the body of a "multipart" entity may be specified as follows: dash-boundary := "--" boundary ; boundary taken from the value of ; boundary parameter of the ; Content-Type field. multipart-body := [preamble CRLF] dash-boundary transport-padding CRLF body-part *encapsulation close-delimiter transport-padding [CRLF epilogue] transport-padding := *LWSP-char ; Composers MUST NOT generate ; non-zero length transport ; padding, but receivers MUST ; be able to handle padding ; added by message transports. encapsulation := delimiter transport-padding CRLF body-part delimiter := CRLF dash-boundary close-delimiter := delimiter "--" preamble := discard-text epilogue := discard-text discard-text := *(*text CRLF) *text ; May be ignored or discarded. body-part := MIME-part-headers [CRLF *OCTET] ; Lines in a body-part must not start ; with the specified dash-boundary and ; the delimiter must not appear anywhere ; in the body part. Note that the ; semantics of a body-part differ from ; the semantics of a message, as ; described in the text. OCTET := <any 0-255 octet value>
新部分开头的每个定界符都以 CRLF 终止,紧接定界符之前的任何 CRLF 都被解析为边界的一部分,而不是前一部分的数据。但是,在最终关闭边界的末尾没有 CRLF,除非存在尾声(这在电子邮件中很少使用,而且我从未见过它在 HTTP 中使用,因为无法确定尾声何时结束除非存在有效的 Content-Length
header,它不应该与 self-terminating 内容类型(如 MIME)一起使用。