How to parse matched separators with nom?
I want to parse a YMD date in four forms ("20190919", "2019.09.19", "2019-09-19", and "2019/09/19") with the nom library.
I started from the iso8601 parser, which parses only the "YYYY-MM-DD" form. I then tried to match the separator and reuse it for the next match, as in the regex (\d{4})([.-/]?)(\d{2})\2(\d{2}).
It turned out that this code works:
    fn parse_ymd(i: &[u8]) -> IResult<&[u8], DateType> {
        let (i, y) = year(i)?;
        // Match the separator if it exists.
        let (i, sep) = opt(one_of(".-/"))(i)?;
        let (i, m) = month(i)?;
        // If the first separator was matched, require the same one again.
        let (i, _) = if let Some(sep) = sep {
            tag(&[sep as u8])(i)?
        } else {
            // Dummy value so this branch has the same type as the previous one.
            (i, &[' ' as u8][..])
        };
        let (i, d) = day(i)?;
        Ok((
            i,
            DateType::YMD {
                year: y,
                month: m,
                day: d,
            },
        ))
    }
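But it obviously looks weird.
Are there nom tools that do this in a more appropriate way?
(This question is about nom's functionality and how to do things the right way, not just about this particular example.)
Your solution is decent enough. There is really only one suggestion I can offer: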
    fn parse_ymd(i: &[u8]) -> IResult<&[u8], DateType> {
        ...
        // If first separator was matched then try to find next one.
        let i = match sep {
            Some(sep) => tag(&[sep as u8])(i)?.0,
            _ => i,
        };
        ...
    }
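You may not be familiar with the syntax for accessing a tuple element directly. From the Rust book:
In addition to destructuring through pattern matching, we can access a tuple element directly by using a period (.) followed by the index of the value we want to access.
In this case, it saves you the awkwardness of making the two arms produce the same type.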
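For readers who want to try the pattern without the iso8601 helpers, here is a minimal std-only sketch of the same idea. It does not use nom: `PResult`, `digits`, `opt_separator`, and `byte` are hypothetical stand-ins I made up for `IResult`, `year`/`month`/`day`, `opt(one_of(...))`, and `tag(...)`, and the result is a plain tuple rather than `DateType`. It only illustrates the "remember the first separator, then require it again and keep `.0`" step.

```rust
// Std-only stand-in for nom's IResult (an assumption for this sketch):
// on success, the remaining input plus the parsed value.
type PResult<'a, T> = Result<(&'a [u8], T), &'static str>;

// Hypothetical helper: parse exactly `n` ASCII digits into a number.
fn digits(i: &[u8], n: usize) -> PResult<'_, u32> {
    if i.len() < n || !i[..n].iter().all(|b| b.is_ascii_digit()) {
        return Err("expected digits");
    }
    let value = i[..n]
        .iter()
        .fold(0u32, |acc, b| acc * 10 + u32::from(b - b'0'));
    Ok((&i[n..], value))
}

// Hypothetical helper: optionally consume one separator byte ('.', '-' or '/').
fn opt_separator(i: &[u8]) -> PResult<'_, Option<u8>> {
    match i.first() {
        Some(&b) if b".-/".contains(&b) => Ok((&i[1..], Some(b))),
        _ => Ok((i, None)),
    }
}

// Hypothetical helper: require one specific byte.
fn byte(i: &[u8], expected: u8) -> PResult<'_, u8> {
    match i.first() {
        Some(&b) if b == expected => Ok((&i[1..], b)),
        _ => Err("expected matching separator"),
    }
}

// Parse YYYY[sep]MM[sep]DD, where both separators must be identical
// (or both absent).
fn parse_ymd(i: &[u8]) -> PResult<'_, (u32, u32, u32)> {
    let (i, y) = digits(i, 4)?;
    let (i, sep) = opt_separator(i)?;
    let (i, m) = digits(i, 2)?;
    // Same shape as the suggestion above: both arms yield just the
    // remaining input, so no dummy value is needed.
    let i = match sep {
        Some(sep) => byte(i, sep)?.0,
        None => i,
    };
    let (i, d) = digits(i, 2)?;
    Ok((i, (y, m, d)))
}
```

With these definitions, `parse_ymd(b"2019-09-19")` and `parse_ymd(b"20190919")` both succeed with the values 2019, 9, and 19, while mixed input such as `b"2019-09/19"` is rejected because the second separator does not match the first.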