Rust - 在递归函数中收集 Vec 的切片

Rust - Collecting slices of a Vec in a recursive function

我目前正在尝试构建霍夫曼编码程序,并且正在努力解决我在遍历生成的霍夫曼树以创建查找时遇到的问题 table。我决定用递归函数实现上述遍历。在实际实现中,我使用 bitvec crate 来保存位序列,但为了简单起见,我将在这个 post.

中使用 Vec<bool>

我的想法是保存 Vec codewords 中所有代码字的集合,然后只保存该向量中的一部分用于实际查找 table,因为我用了 HashMap.

问题是我将如何解决为左右遍历添加 0 或 1。我的想法是保存当前序列的一个片段的克隆,将 0 附加到 codewords,然后在向左遍历后将该克隆附加到 codewords 的末尾,这样我就可以推送一个1 并向右移动。我想出的函数如下所示:

use std::collections::HashMap;

// ignore everything being public, I use getters in the real code
pub struct HufTreeNode {
    pub val: u8,
    pub freq: usize,
    pub left: i16,
    pub right: i16,
}

fn traverse_tree<'a>(
    cur_index: usize,
    height: i16,
    codewords: &'a mut Vec<bool>,
    lookup_table: &mut HashMap<u8, &'a [bool]>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf
    if cur_node.left == -1 {
        // the last `height` bits in codewords
        let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];
        lookup_table.insert(cur_node.val, cur_sequence);
        return;
    }

    // save the current sequence so we can traverse to the right afterwards
    let mut cur_sequence = codewords[(codewords.len() - 1 - height as usize)..].to_vec();
    codewords.push(false);
    traverse_tree(
        cur_node.left as usize,
        height + 1,
        codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
        lookup_table,
        huffman_tree,
    );

    // append the previously saved current sequence
    codewords.append(&mut cur_sequence); // second mutable borrow occurs here
    codewords.push(true); // third mutable borrow occurs here
    traverse_tree(
        cur_node.right as usize,
        height + 1,
        codewords, // fourth mutable borrow occurs here
        lookup_table,
        huffman_tree,
    );
}

fn main() {
    // ...
}

显然那段代码中的生命周期和借用存在问题,我有点明白问题所在了。据我了解,当我在递归调用中将 codewords 作为参数时,只要我将切片保存在 lookup_table 中,它就必须借用向量,这显然是不可能的,从而导致错误.我该如何解决?

这就是 cargo check 给我的:

error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:43:5
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
43 |       codewords.append(&mut cur_sequence); // second mutable borrow occurs here
   |       ^^^^^^^^^ second mutable borrow occurs here

error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:44:5
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
44 |       codewords.push(true); // third mutable borrow occurs here
   |       ^^^^^^^^^ second mutable borrow occurs here

error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:48:9
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
48 |           codewords, // fourth mutable borrow occurs here
   |           ^^^^^^^^^ second mutable borrow occurs here

我在这里错过了什么?向量 API 中是否有我遗漏的一些神奇功能,为什么这首先会造成生命周期问题?据我所知,我的所有生命周期都是正确的,因为 codewords 总是活得足够长 lookup_table 来保存所有这些切片,而且我永远不会同时借用两次。如果我的生命周期有问题,编译器会在 if cur_node.left == -1 块内抱怨,而我在它之后采用的 cur_sequence 是一个拥有的 Vec,所以不能有任何借用与此有关的问题。因此,问题实际上在于具有以 mutable 引用作为参数的递归函数的核心思想。

我有什么办法可以解决这个问题吗?我尝试让 codewords 拥有并返回它,但是编译器无法确保我在 lookup_table 中保存的位序列存在足够长的时间。我唯一的想法是将拥有的向量保存在 lookup_table 中,但那时 codewords 向量首先已经过时,我可以通过使用 cur_sequence 向量来简单地实现它作为我在每次调用中克隆的参数,但我选择了我的方法以在之后的实际编码过程中获得更好的缓存性能,然后我会丢失。

问题是,当您像在 let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..]; 中那样从 codewords 创建切片 cur_sequence 时,编译器会将对 codewords 的引用的生命周期延长到至少与 cur_sequence 相同(原因:编译器希望确保切片 cur_sequence 始终有效,但如果您更改 codewords (例如,清除它)则可能 cur_sequence 无效。通过保持对 codewords 的不可变引用,然后借用规则将禁止在切片仍然存在时修改 codewords。不幸的是,您将 cur_sequence 保存在 lookup_table 中,从而在整个函数中保持对 codewords 的引用,因此您不能再可变地借用 codewords

解决方法是自己维护slice的索引:创建一个struct:

struct Range {
    start: usize,
    end: usize
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        Range{ start, end}
    }
}

然后用它代替切片:

let cur_range = Range::new(
    codewords.len() - 1 - height as usize,
    codewords.len() - 1
);
lookup_table.insert(cur_node.val, cur_range);

这样,保持范围有效的责任就落在了你身上。

完整代码:

use std::collections::HashMap;

// ignore everything being public, I use getters in the real code
pub struct HufTreeNode {
    pub val: u8,
    pub freq: usize,
    pub left: i16,
    pub right: i16,
}

struct Range {
    start: usize,
    end: usize
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        Range{ start, end}
    }
}

fn traverse_tree(
    cur_index: usize,
    height: i16,
    codewords: &mut Vec<bool>,
    lookup_table: &mut HashMap<u8, Range>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf
    if cur_node.left == -1 {
        // the last `height` bits in codewords
        // let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];
        let cur_range = Range::new(
            codewords.len() - 1 - height as usize,
            codewords.len() - 1
        );
        lookup_table.insert(cur_node.val, cur_range);
        return;
    }

    // save the current sequence so we can traverse to the right afterwards
    let mut cur_sequence = codewords[(codewords.len() - 1 - height as usize)..].to_vec();
    codewords.push(false);
    traverse_tree(
        cur_node.left as usize,
        height + 1,
        codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
        lookup_table,
        huffman_tree,
    );

    // append the previously saved current sequence
    codewords.append(&mut cur_sequence); // second mutable borrow occurs here
    codewords.push(true); // third mutable borrow occurs here
    traverse_tree(
        cur_node.right as usize,
        height + 1,
        codewords, // fourth mutable borrow occurs here
        lookup_table,
        huffman_tree,
    );
}

fn main() {
    // ...
}