Share static lazy-initialized object containing `Rc` refs among multiple threads with `thread_local!` and `OnceCell`

Question

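I have split my tests into several similar sections. In each section, the results are compared against a static test string, written in a dedicated test language (here called dum) and parsed with pest.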

Here is the global structure of my MWE:

$ tree
.
├── Cargo.lock
├── Cargo.toml
├── src
│   └── main.rs
└── tests
    ├── dum.pest
    ├── section_1.rs
    └ .. imagine more similar sections here.

Cargo.toml

[package]
...
edition = "2018"

[dev-dependencies]
pest = "*"
pest_derive = "*"
once_cell = "*"
lazy_static = "*"

main.rs only contains fn main() {}.
dum.pest is a dummy any = { ANY* }.
The section_1.rs prelude is:

use pest_derive::Parser;
use pest::{iterators::Pairs, Parser};

// Compile dedicated grammar.
#[derive(Parser)]
#[grammar = "../tests/dum.pest"]
pub struct DumParser;

// Here is the static test string to run section 1 against.
static SECTION_1: &'static str = "Content to parse for section 1.";

// Type of the result expected to be globally available in the whole test section.
type ParseResult = Pairs<'static, Rule>;

Now, my first naive attempt to make the parsed result available to all test functions was:

// Naive lazy_static! attempt:
use lazy_static::lazy_static;
lazy_static! {
    static ref PARSED: ParseResult = {
        DumParser::parse(Rule::any, &*SECTION_1).expect("Parse failed.")
    };
}
#[test]
fn first() {
    println!("1: {:?} parsed to {:?}", &*SECTION_1, *PARSED);
}
#[test]
fn second() {
    println!("2: {:?} parsed to {:?}", &*SECTION_1, *PARSED);
}

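This does not compile. According to pest, this is because they use internal Rc references, which cannot be safely shared among threads, and I assume cargo test does create a new thread for each #[test] function.

The suggested solution involves using thread_local! and OnceCell, but I cannot figure it out. Here are my two attempts: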

// Naive thread_local! attempt:
thread_local! {
    static PARSED: ParseResult = {
        println!(" + + + + + + + PARSING! + + + + + + + "); // /!\ SHOULD APPEAR ONLY ONCE!
        DumParser::parse(Rule::any, &*SECTION_1).expect("Parse failed.")
    };
}
#[test]
fn first() {
    PARSED.with(|p| println!("1: {:?} parsed to {:?}", &*SECTION_1, p));
}
#[test]
fn second() {
    PARSED.with(|p| println!("2: {:?} parsed to {:?}", &*SECTION_1, p));
}

and

// Naive OnceCell attempt:
use once_cell::sync::OnceCell;
thread_local! {
    static PARSED: OnceCell<ParseResult> = {
        println!(" + + + + + + + PARSING! + + + + + + + "); // /!\ SHOULD APPEAR ONLY ONCE!
        let once = OnceCell::new();
        once.set(DumParser::parse(Rule::any, &*SECTION_1).expect("Parse failed."))
            .expect("Already set.");
        once
    };
}
#[test]
fn first() {
    PARSED.with(|p| println!("1: {:?} parsed_to {:?}", &*SECTION_1, p.get().unwrap()));
}
#[test]
fn second() {
    PARSED.with(|p| println!("2: {:?} parsed_to {:?}", &*SECTION_1, p.get().unwrap()));
}

Both compile and run fine. But the output of cargo test -- --nocapture suggests that the parsing is actually done once per test function:

running 2 tests
 + + + + + + + PARSING! + + + + + + +
 + + + + + + + PARSING! + + + + + + +
1: "Content to parse for section 1." parsed_to [Pair { rule: any, span: Span { str: "Content to parse for section 1.", start: 0, end: 31 }, inner: [] }]
2: "Content to parse for section 1." parsed_to [Pair { rule: any, span: Span { str: "Content to parse for section 1.", start: 0, end: 31 }, inner: [] }]
test first ... ok
test second ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

This suggests that both my attempts fail.

What is wrong with these approaches?
How do I make the parsing happen only once per section?

Answer 1

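Why is `lazy_static!` not suitable?

Whether cargo test starts a new thread for each test is actually irrelevant.

A static variable is global, and therefore potentially shared between threads, so its type must be Sync even if no thread is ever spawned.

And since Rc is not Sync (it cannot be shared between threads), this cannot work.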
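To make the Sync requirement concrete, here is a minimal sketch using only the standard library. The helper `assert_sync` and the names `SHARED_OK` and `SHARED_BAD` are made up for illustration; the commented-out lines are the ones the compiler is expected to reject.

```rust
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical helper: a call compiles only when `T` may be shared between threads.
fn assert_sync<T: Sync>() {}

// A `static` must have a `Sync` type; `&'static str` qualifies.
static SHARED_OK: &str = "Content to parse for section 1.";

// Uncommenting the next line should be rejected, because `Rc` is not `Sync`,
// even though this program never spawns a thread:
// static SHARED_BAD: Option<Rc<String>> = None;

fn main() {
    assert_sync::<Arc<String>>(); // atomic reference count: Sync
    assert_sync::<&'static str>();
    // assert_sync::<Rc<String>>(); // should not compile: Rc is !Sync

    // Inside a single thread, `Rc` is perfectly usable.
    let local = Rc::new(String::from(SHARED_OK));
    println!("{} (strong count = {})", local, Rc::strong_count(&local));
}
```

The check happens on the type of the static, at compile time, which is why the number of threads actually running does not matter.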

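Why is `thread_local!` not suitable?

As the name implies, there is one thread_local! variable per thread.

The code in thread_local! is not actually run right at thread creation; the variable is lazily instantiated on first access. Every thread that accesses it therefore runs the initializer once for itself, which is why PARSING! shows up once per test in your output.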
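The per-thread, on-first-access behaviour can be observed with a small standard-library sketch. The `String` here is only a stand-in for the pest result, and `INIT_RUNS` and `init_runs_for` are names invented for this illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Process-wide count of how many times the thread-local initializer has run.
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    // Stand-in for the expensive parse; initialized lazily, once per thread.
    static PARSED: String = {
        INIT_RUNS.fetch_add(1, Ordering::SeqCst);
        String::from("parsed")
    };
}

// Hypothetical helper: spawns `n` threads, lets each access PARSED
// `accesses_per_thread` times, and returns how many initializer runs that caused.
fn init_runs_for(n: usize, accesses_per_thread: usize) -> usize {
    let before = INIT_RUNS.load(Ordering::SeqCst);
    let handles: Vec<_> = (0..n)
        .map(|_| {
            thread::spawn(move || {
                for _ in 0..accesses_per_thread {
                    PARSED.with(|p| assert_eq!(p.as_str(), "parsed"));
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    INIT_RUNS.load(Ordering::SeqCst) - before
}

fn main() {
    // Three threads, two accesses each: one initialization per thread.
    println!("3 threads x 2 accesses: {} initializations", init_runs_for(3, 2));
    // Threads that never touch PARSED never run the initializer.
    println!("2 threads x 0 accesses: {} initializations", init_runs_for(2, 0));
}
```

Wrapping the value in a OnceCell inside thread_local! does not change this, since the cell itself is still created once per thread.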

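How to make the parsing happen only once per section?

Do not use the output of pest directly.

If you post-process the output of pest and build a structure from it that is Sync, then you can store it with lazy_static and it will be parsed only once.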
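A rough sketch of that idea, under a few assumptions: `RcToken` and `parse` are hypothetical stand-ins for a parser whose result holds Rc internally (this is not pest's API), `Token` is an invented owned model, and `std::sync::OnceLock` (Rust 1.70+) is swapped in for lazy_static so the example needs no external crate. The same shape should carry over to lazy_static! or once_cell.

```rust
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

static SECTION_1: &str = "Content to parse for section 1.";

// Counts how often the stand-in parser actually runs.
static PARSE_RUNS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical parser output that holds an `Rc`, hence neither Send nor Sync.
struct RcToken {
    text: Rc<str>,
}

// Hypothetical parser: splits on whitespace (an assumption, not pest's behaviour).
fn parse(input: &str) -> Vec<RcToken> {
    PARSE_RUNS.fetch_add(1, Ordering::SeqCst);
    input
        .split_whitespace()
        .map(|word| RcToken { text: Rc::from(word) })
        .collect()
}

// Owned model built from the parser output: only `String`s, so it is Sync.
#[derive(Debug)]
struct Token {
    text: String,
}

// Parses on first call, from whichever thread gets there first, then reuses the result.
fn parsed() -> &'static [Token] {
    static PARSED: OnceLock<Vec<Token>> = OnceLock::new();
    PARSED.get_or_init(|| {
        parse(SECTION_1)
            .into_iter()
            .map(|t| Token { text: t.text.to_string() })
            .collect()
    })
}

fn main() {
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(|| parsed().len())).collect();
    for handle in handles {
        assert_eq!(handle.join().unwrap(), 6);
    }
    println!("parse ran {} time(s)", PARSE_RUNS.load(Ordering::SeqCst));
}
```

The Rc values only ever live inside the initializer closure, on a single thread; what ends up in the static is the owned copy.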

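In fact, you could go further and avoid lazy_static entirely. If you can express the structure in a purely const manner, then you could use a build.rs script or a procedural macro to transform the string into the model at compile time. For tests, though, it is probably not worth the effort.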
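As a toy illustration of the "purely const" direction only (not of build.rs or procedural macros), a `const fn` can compute a small model from the string during compilation. `Model` and `count_words` are invented for this sketch, and counting space-separated words merely stands in for real parsing.

```rust
// Hypothetical compile-time model of the test string: no Rc, no lazy initialization.
#[derive(Debug, PartialEq)]
struct Model {
    source: &'static str,
    words: usize,
}

// Counts space-separated words; being a `const fn`, it can be evaluated at compile time.
const fn count_words(s: &str) -> usize {
    let bytes = s.as_bytes();
    let mut i = 0;
    let mut words = 0;
    let mut in_word = false;
    while i < bytes.len() {
        let is_space = bytes[i] == b' ';
        if !is_space && !in_word {
            words += 1;
        }
        in_word = !is_space;
        i += 1;
    }
    words
}

const SECTION_1: &str = "Content to parse for section 1.";

// Evaluated by the compiler; every test thread can read it without synchronization.
static PARSED: Model = Model {
    source: SECTION_1,
    words: count_words(SECTION_1),
};

fn main() {
    println!("{:?}", PARSED);
}
```

Anything beyond such simple structures is where a build.rs script or procedural macro would come in.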

Tags: concurrency, static, multithreading, lazy-initialization, rust
