Why is my RFC 2822 date not parsed by chrono?
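I'm writing some code to parse RSS feeds, but I'm having trouble with the Abstruse Goose RSS feed.
If you look at that feed, the dates are encoded as Mon, 06 Aug 2018 00:00:00 UTC. To me, that looks like RFC 2822.
I tried parsing it with chrono's DateTime::parse_from_rfc2822, but I get ParseError(NotEnough).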
let pub_date = entry.pub_date().unwrap().to_owned();
return rfc822_sanitizer::parse_from_rfc2822_with_fallback(&pub_date)
    .unwrap_or_else(|e| {
        panic!(
            "pub_date for item {:?} (value is {:?}) can't be parsed due to error {:?}",
            &entry, pub_date, e
        )
    })
    .naive_utc();
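Am I doing something wrong? Do I have to hack around it somehow?
I use rfc822_sanitizer, which does a good job of fixing badly written dates (most of the time). I don't think it affects the parsing... but who knows?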
The RFC 2822 date/time format is codified precisely in the RFC as follows:
date-time = [ day-of-week "," ] date FWS time [CFWS]
day-of-week = ([FWS] day-name) / obs-day-of-week
day-name = "Mon" / "Tue" / "Wed" / "Thu" /
"Fri" / "Sat" / "Sun"
date = day month year
year = 4*DIGIT / obs-year
month = (FWS month-name FWS) / obs-month
month-name = "Jan" / "Feb" / "Mar" / "Apr" /
"May" / "Jun" / "Jul" / "Aug" /
"Sep" / "Oct" / "Nov" / "Dec"
day = ([FWS] 1*2DIGIT) / obs-day
time = time-of-day FWS zone
time-of-day = hour ":" minute [ ":" second ]
hour = 2DIGIT / obs-hour
minute = 2DIGIT / obs-minute
second = 2DIGIT / obs-second
zone = (( "+" / "-" ) 4DIGIT) / obs-zone
where obs-zone is defined as follows:
obs-zone = "UT" / "GMT" / ; Universal Time
; North American UT
; offsets
"EST" / "EDT" / ; Eastern: - 5/ - 4
"CST" / "CDT" / ; Central: - 6/ - 5
"MST" / "MDT" / ; Mountain: - 7/ - 6
"PST" / "PDT" / ; Pacific: - 8/ - 7
%d65-73 / ; Military zones - "A"
%d75-90 / ; through "I" and "K"
%d97-105 / ; through "Z", both
%d107-122 ; upper and lower case
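To make the grammar concrete, here is a minimal Rust sketch that checks a zone token against the zone and obs-zone rules quoted above. The Zone enum and the classify_zone function are names made up for illustration; they are not part of chrono or any other crate. The sketch does not range-check the digits of a numeric offset, and it deliberately assigns no offset to the single-letter military zones.

```rust
/// How a token relates to the RFC 2822 `zone` rule (hypothetical type).
#[derive(Debug, PartialEq)]
enum Zone {
    /// ("+" / "-") 4DIGIT, stored as a signed offset in minutes.
    Numeric(i32),
    /// A name listed in `obs-zone`, with its offset in hours where the
    /// grammar comments give one.
    Obsolete(Option<i32>),
    /// Anything the grammar does not list.
    NotInGrammar,
}

/// Classifies a single zone token (hypothetical helper).
fn classify_zone(token: &str) -> Zone {
    let bytes = token.as_bytes();

    // Numeric form: a sign followed by exactly four digits (HHMM).
    let is_numeric = bytes.len() == 5
        && (bytes[0] == b'+' || bytes[0] == b'-')
        && bytes[1..].iter().all(|b| b.is_ascii_digit());
    if is_numeric {
        let digit = |i: usize| i32::from(bytes[i] - b'0');
        let minutes = (digit(1) * 10 + digit(2)) * 60 + digit(3) * 10 + digit(4);
        return Zone::Numeric(if bytes[0] == b'-' { -minutes } else { minutes });
    }

    // Quoted ABNF strings are case-insensitive, so compare in upper case.
    let upper = token.to_ascii_uppercase();
    match upper.as_str() {
        "UT" | "GMT" => Zone::Obsolete(Some(0)),
        "EDT" => Zone::Obsolete(Some(-4)),
        "EST" | "CDT" => Zone::Obsolete(Some(-5)),
        "CST" | "MDT" => Zone::Obsolete(Some(-6)),
        "MST" | "PDT" => Zone::Obsolete(Some(-7)),
        "PST" => Zone::Obsolete(Some(-8)),
        // Military zones: "A" through "I" and "K" through "Z" ("J" is excluded).
        s if s.len() == 1 && matches!(s.as_bytes()[0], b'A'..=b'I' | b'K'..=b'Z') => {
            Zone::Obsolete(None)
        }
        _ => Zone::NotInGrammar,
    }
}
```

With this sketch, classify_zone("UTC") yields Zone::NotInGrammar, while "UT", "GMT" and "+0000" are all accepted.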
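Note that "UTC" appears in neither zone nor obs-zone, so the date in the feed is not valid RFC 2822. This is exactly what many people get wrong when rolling their own timestamp-generation libraries: how to label the RFC 2822 TZ offset correctly. The grammar uses UT because UTC and UT are not quite the same (one has extra seconds, the other has... four variants! And the RFC does not define which one is used; they all differ slightly).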
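If you cannot change the feed, one possible workaround is to rewrite a trailing UTC token to a numeric offset before handing the string to the parser. Here is a minimal sketch, assuming the zone is always the last whitespace-separated token and that treating UTC as +0000 is acceptable for your purposes. normalize_utc_zone is a hypothetical helper, not something provided by chrono or rfc822_sanitizer.

```rust
/// Rewrites a trailing "UTC" zone token to "+0000" (hypothetical helper).
/// Assumption: treating "UTC" as a zero offset is good enough for the caller.
fn normalize_utc_zone(input: &str) -> String {
    // Ignore trailing whitespace when looking for the zone token.
    let trimmed = input.trim_end();
    match trimmed.strip_suffix("UTC") {
        // Only rewrite when "UTC" is a separate token, i.e. preceded by whitespace.
        Some(head) if head.ends_with(char::is_whitespace) => format!("{}+0000", head),
        // Everything else is returned unchanged.
        _ => input.to_string(),
    }
}
```

The result can then be passed to chrono's DateTime::parse_from_rfc2822, which accepts numeric offsets such as +0000.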