Custom literals via Rust macros?
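Is it possible in Rust to define a macro that can parse custom literals, e.g. something along the lines of
vector!(3x + 15y)
To clarify, I would like to get as close to the above syntax as possible (within the realm of what is possible, of course).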
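I'm going to assume that by "custom literal" you mean "a regular Rust literal (excluding raw literals), immediately followed by a custom identifier". This includes:
"str"x, a string literal "str" with the custom suffix x
123x, a numeric literal 123 with the custom suffix x
b"bytes"x, a byte literal b"bytes" with the custom suffix x
If the above is a sufficient definition for you, then you're in luck: according to the Rust reference, all of the above are valid literal tokens in Rust: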
A suffix is a non-raw identifier immediately (without whitespace) following the primary part of a literal.
Any kind of literal (string, integer, etc) with any suffix is valid as a token, and can be passed to a macro without producing an error. The macro itself will decide how to interpret such a token and whether to produce an error or not.
However, suffixes on literal tokens parsed as Rust code are restricted. Any suffixes are rejected on non-numeric literal tokens, and numeric literal tokens are accepted only with suffixes from the list below.
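So Rust explicitly allows macros to support custom literals of this kind: the suffix restrictions apply only when a literal is parsed as Rust code, not when it is passed to a macro as a token.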
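To make the quoted rule concrete, here is a minimal sketch showing that suffixed literals are accepted as macro input. The macro name token_text is made up for this illustration and is not part of any library.

```rust
// Hypothetical helper macro: accepts any single token and returns its source text.
macro_rules! token_text {
    ($t:tt) => {
        stringify!($t)
    };
}

fn main() {
    // Each suffixed literal is one token, so it matches a single `tt`.
    println!("{}", token_text!(123x));
    println!("{}", token_text!("str"x));
    println!("{}", token_text!(b"bytes"x));

    // Outside a macro, the same token is parsed as Rust code and rejected:
    // let n = 123x; // does not compile: `x` is not an accepted numeric suffix
}
```

Nothing here interprets the suffix; it only shows that the tokens reach the macro without an error.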
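Now, how would you go about writing such a macro? You can't write a declarative macro with macro_rules!, because its simple pattern matching cannot detect and manipulate custom literal suffixes. It is, however, possible to write a procedural macro that does this.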
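As a rough illustration of that limitation, the sketch below shows about the most a declarative macro can do with such a token: forward its text and leave the splitting to ordinary runtime code. The names split_suffix and split_literal are hypothetical, and this is a workaround for comparison, not the approach recommended in this answer.

```rust
// Hypothetical helper: splits the text of a decimal integer literal such as
// "15y" into its numeric value and its suffix. Returns None without digits.
fn split_suffix(text: &str) -> Option<(i64, &str)> {
    let digits_end = text
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(text.len());
    let (digits, suffix) = text.split_at(digits_end);
    let value = digits.parse().ok()?;
    Some((value, suffix))
}

// The macro cannot look inside the token; it can only pass its text along.
macro_rules! split_literal {
    ($lit:tt) => {
        split_suffix(stringify!($lit))
    };
}

fn main() {
    println!("{:?}", split_literal!(3x));
    println!("{:?}", split_literal!(15y));
}
```

The suffix is only discovered when the program runs, so an unknown suffix cannot be reported as a compile error this way. A procedural macro avoids that.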
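I won't go into much detail about how to write procedural macros, since that would be too much for a single Stack Overflow answer. As a starting point, though, here is an example of a procedural macro that does something along the lines of what you asked for. It takes any custom integer literals 123x or 123y in the given expression and transforms them into the function calls x_literal(123) and y_literal(123):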
// Requires a crate with `proc-macro = true` and `syn` built with the
// `full` and `visit-mut` features.
extern crate proc_macro;

use proc_macro::TokenStream;
use quote::ToTokens;
use syn::{
    parse_macro_input, parse_quote,
    visit_mut::{self, VisitMut},
    Expr, ExprLit, Lit, LitInt,
};

// actual procedural macro
#[proc_macro]
pub fn vector(input: TokenStream) -> TokenStream {
    let mut input = parse_macro_input!(input as Expr);
    LiteralReplacer.visit_expr_mut(&mut input);
    input.into_token_stream().into()
}

// "visitor" that visits every node in the syntax tree
// we add our own behavior to replace custom literals with proper Rust code
struct LiteralReplacer;

impl VisitMut for LiteralReplacer {
    fn visit_expr_mut(&mut self, i: &mut Expr) {
        if let Expr::Lit(ExprLit { lit, .. }) = i {
            match lit {
                Lit::Int(lit) => {
                    // get literal suffix
                    let suffix = lit.suffix();
                    // get literal without suffix
                    let lit_nosuffix = LitInt::new(lit.base10_digits(), lit.span());

                    match suffix {
                        // replace literal expression with new expression
                        "x" => *i = parse_quote! { x_literal(#lit_nosuffix) },
                        "y" => *i = parse_quote! { y_literal(#lit_nosuffix) },
                        _ => (), // other literal suffix we won't modify
                    }
                }
                _ => (), // other literal type we won't modify
            }
        } else {
            // not a literal, use default visitor method
            visit_mut::visit_expr_mut(self, i)
        }
    }
}
For example, the macro would transform vector!(3x + 4y) into x_literal(3) + y_literal(4).
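The expansion only compiles if x_literal and y_literal are in scope at the call site and their results can be added. The answer does not define them, so the following is just one possible sketch: the Vector type, its fields, and both function bodies are assumptions made for illustration.

```rust
use std::ops::Add;

// Hypothetical 2D vector type; not part of the original answer.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector {
    x: i64,
    y: i64,
}

impl Add for Vector {
    type Output = Vector;

    fn add(self, rhs: Vector) -> Vector {
        Vector {
            x: self.x + rhs.x,
            y: self.y + rhs.y,
        }
    }
}

// Assumed meaning of the `x` suffix: a vector along the x axis.
fn x_literal(value: i64) -> Vector {
    Vector { x: value, y: 0 }
}

// Assumed meaning of the `y` suffix: a vector along the y axis.
fn y_literal(value: i64) -> Vector {
    Vector { x: 0, y: value }
}

fn main() {
    // This is the expression that `vector!(3x + 4y)` would expand to.
    let v = x_literal(3) + y_literal(4);
    println!("{:?}", v);
}
```

Because the macro emits the literal without a suffix, the integer type is inferred from the parameter type of x_literal and y_literal.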