如何创建 BTreeSet<(String, String, String)> 的子范围?如何将边界元组变成元组的边界?

How to create a subrange of a BTreeSet<(String, String, String)>? How to turn a tuple of bounds into a bound of a tuple?

我正在尝试使用 BTreeSet<(String, String, String)> 作为创建简单内存中“triple store”的方法。

准确地说:

type Entity = String;
type Attribute = String;
type Value = String;
type EAV = (Entity, Attribute, Value);
type EAVSet = BTreeSet<EAV>;


pub fn example_db() -> EAVSet {
    let mut example: EAVSet = BTreeSet::new();
    insert_strtup(&mut example, ("1", "type", "user"));
    insert_strtup(&mut example, ("1", "user/name", "Arthur Dent"));
    insert_strtup(&mut example, ("1", "user/age", "33"));

    insert_strtup(&mut example, ("2", "type", "user"));
    insert_strtup(&mut example, ("2", "user/name", "Ford Prefect"));
    insert_strtup(&mut example, ("2", "user/age", "42"));

    return example;
}


fn insert_strtup(db: &mut EAVSet, val: (&str, &str, &str)) -> () {
    db.insert((val.0.to_string(), val.1.to_string(), val.2.to_string()));
}

pub fn example()  {
    let db = example_db();

    // How to customize this?
    let range: (Bound<EAV>, Bound<EAV>) = (Bound::Unbounded, Bound::Unbounded);

    for elem in eavt.range(range) {
        println!("{:?}", elem);
    }
}

我面临的问题是,我希望人们能够迭代集合中的一个子范围值。但是,std::ops::Bound 的简单用法是不可能的,因为我们存储了一个包含多个字段的元组。

我希望能够为以下所有内容构建范围查询:

到目前为止,我想到的唯一想法是使用我们知道的字符串键,因为事实比较低。高于我们在 'placeholder' 字段中寻找的值。但这感觉非常hackish/error-prone,就像重新发明轮子一样。

有没有可能把 (Bound<String>, Bound<String>, Bound<String>) 变成 Bound<(String, String, String)> 的方法? 或者这里有另一种方法吗?

编辑: 通过将所有值包装在有序枚举 (Min, Exact(String), Max) 中显示了一种解决方案,但此解决方案需要更改存储在 BTreeSet 中的值类型。这也感觉像是增加了内存开销,因为我们实际上从不在内部存储 Exact(some_string) 以外的任何内容。是否有另一种方法不需要更改存储在 BTreeSet?

中的值的类型

I'd like to be able to build range-queries for all of the following:

all entities;
all entities with an ID in range x..y;
all fields of entity 1;
the current value of entity 1's "user/age" field).

Is there an[other] approach that does not require altering the type of values stored in the BTreeSet?

鉴于上述限制,以下工作。由于一切都是字符串,范围使用字符串比较,意思是“a”..“b”代表所有以“a”开头的字符串。空字符串是一个自然的最小字符串值,但没有现成的最大字符串值,因此我们使用一个大的静态字符串。这当然一点都不好。这也许可以通过使用 Option 而不是 String 来改进,None 值代表最大值,而 Some("") 将是最小值。然后您还必须实施自己的比较...

use std::collections::BTreeSet;
use std::ops::Bound;

type Entity = String;
type Attribute = String;
type Value = String;
type EAV = (Entity, Attribute, Value);
type EAVSet = BTreeSet<EAV>;

pub fn example_db() -> EAVSet {
    let mut example: EAVSet = BTreeSet::new();
    insert_strtup(&mut example, ("1", "type", "user"));
    insert_strtup(&mut example, ("1", "user/name", "Arthur Dent"));
    insert_strtup(&mut example, ("1", "user/age", "33"));

    insert_strtup(&mut example, ("2", "type", "user"));
    insert_strtup(&mut example, ("2", "user/name", "Ford Prefect"));
    insert_strtup(&mut example, ("2", "user/age", "42"));

    insert_strtup(&mut example, ("11", "type", "user"));
    insert_strtup(&mut example, ("11", "user/name", "Arthur Dent"));
    insert_strtup(&mut example, ("11", "user/age", "33"));

    insert_strtup(&mut example, ("12", "type", "user"));
    insert_strtup(&mut example, ("12", "user/name", "Ford Prefect"));
    insert_strtup(&mut example, ("12", "user/age", "42"));
    return example;
}

fn insert_strtup(db: &mut EAVSet, val: (&str, &str, &str)) -> () {
    db.insert((val.0.to_string(), val.1.to_string(), val.2.to_string()));
}

static MAX_STRING: &str = "ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ";

pub fn main() {
    let db = example_db();

    // How to customize this?
    let range: (Bound<EAV>, Bound<EAV>) = (Bound::Unbounded, Bound::Unbounded);

    for elem in db.range(range) {
        println!("{:?}", elem);
    }

    println!("\tall entities with an ID in range \"11\"..=\"12\":");

    let range = (
        Bound::Included(("11".to_string(), "".to_string(), "".to_string())),
        Bound::Excluded(("120".to_string(), "".to_string(), "".to_string())),
    );

    for elem in db.range(range) {
        println!("{:?}", elem);
    }

    println!("\tall fields of entity 1:");

    let range = (
        Bound::Included(("1".to_string(), "".to_string(), "".to_string())),
        Bound::Excluded(("10".to_string(), "".to_string(), "".to_string())),
    );
    
    for elem in db.range(range) {
        println!("{:?}", elem);
    }

    println!("\tthe current value of entity 1's \"user/age\" field:");

    let range = (
        Bound::Included(("1".to_string(), "user/age".to_string(), "".to_string())),
        Bound::Excluded(("1".to_string(), "user/age".to_string(), MAX_STRING.to_string())),
    );
    
    for elem in db.range(range) {
        println!("{:?}", elem);
    }
}

因为 Borrow 总是 returns 一个引用 (grrrrrrrr),而 Borrowed 不一定是 Copy,你可以依赖一个哨兵内存地址?

请注意,由于不允许关联 static 项目,因此您可能需要为要使用的每种类型复制此代码。

use std::borrow::Borrow;
use std::cmp::Ordering;

#[repr(transparent)]
pub struct StringWithMinMaxSentinel(String);

// must be static, not const, to ensure a constant memory address
pub static STRING_MIN_SENTINEL: StringWithMinMaxSentinel = StringWithMinMaxSentinel(String::new());
pub static STRING_MAX_SENTINEL: StringWithMinMaxSentinel = StringWithMinMaxSentinel(String::new());

impl Borrow<StringWithMinMaxSentinel> for String {
    fn borrow(self: &String) -> &StringWithMinMaxSentinel {
        unsafe { &*(self as *const String as *const StringWithMinMaxSentinel) }
    }
}

impl PartialEq for StringWithMinMaxSentinel {
    fn eq(&self, other: &Self) -> bool {
        std::ptr::eq(self, other) || (!std::ptr::eq(self, &STRING_MIN_SENTINEL) && !std::ptr::eq(other, &STRING_MAX_SENTINEL) && !std::ptr::eq(other, &STRING_MIN_SENTINEL) && !std::ptr::eq(self, &STRING_MAX_SENTINEL) && self.0.eq(&other.0))
    }
}

impl Eq for StringWithMinMaxSentinel {}

impl PartialOrd for StringWithMinMaxSentinel {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for StringWithMinMaxSentinel {
    fn cmp(&self, other: &Self) -> Ordering {
        if std::ptr::eq(self, other) {
            Ordering::Equal
        } else if std::ptr::eq(self, &STRING_MIN_SENTINEL) || std::ptr::eq(other, &STRING_MAX_SENTINEL) {
            Ordering::Less
        } else if std::ptr::eq(self, &STRING_MAX_SENTINEL) || std::ptr::eq(other, &STRING_MIN_SENTINEL) {
            Ordering::Greater
        } else {
            self.0.cmp(&other.0)
        }
    }
}