Lifetime issue iterating over regex captures

I'm trying to use a regex to get all the non-whitespace runs of characters from a string, but I keep running into the same error.

extern crate regex; // 1.0.2

use regex::Regex;
use std::vec::Vec;

pub fn string_split<'a>(s: &'a String) -> Vec<&'a str> {
    let mut returnVec = Vec::new();
    let re = Regex::new(r"\S+").unwrap();

    for cap in re.captures_iter(s) {
        returnVec.push(&cap[0]);
    }

    returnVec
}

pub fn word_n(s: &String, n: i32) -> &str {
    let bytes = s.as_bytes();

    let mut num = 0;
    let mut word_start = 0;
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' || item == b'\n' {
            num += 1;
            if num == n {
                return &s[word_start..i].trim();
            }
            word_start = i;
            continue;
        }
    }

    &s[..]
}

The error:

error[E0597]: `cap` does not live long enough
  --> src/main.rs:11:25
   |
11 |         returnVec.push(&cap[0]);
   |                         ^^^ borrowed value does not live long enough
12 |     }
   |     - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the function body at 6:1...
  --> src/main.rs:6:1
   |
6  | pub fn string_split<'a>(s: &'a String) -> Vec<&'a str> {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

And there is more information:

$ rustc --explain E0597

This error occurs because a borrow was made inside a variable which has a
greater lifetime than the borrowed one.

Example of erroneous code:

```
struct Foo<'a> {
    x: Option<&'a u32>,
}

let mut x = Foo { x: None };
let y = 0;
x.x = Some(&y); // error: `y` does not live long enough
```
In here, `x` is created before `y` and therefore has a greater lifetime. Always
keep in mind that values in a scope are dropped in the opposite order they are
created. So to fix the previous example, just make the `y` lifetime greater than
the `x`'s one:

```
struct Foo<'a> {
    x: Option<&'a u32>,
}

let y = 0;
let mut x = Foo { x: None };
x.x = Some(&y);
```

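At this point I have tried several ways of extending the lifetime of the cap variable, but even after reading the borrowing and lifetimes sections of the Rust book, I can't get anything to work.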

The documentation of impl<'t> Index<usize> for Captures<'t> (this is the cap[0] in your code) says:

The text can't outlive the Captures object if this method is used, because of how Index is defined (normally a[i] is part of a and can't outlive it); to do that, use get() instead.
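A rough way to see why: the Index trait fixes the signature of index as fn index(&self, idx) -> &Self::Output, so the returned reference borrows from &self (here, cap), while an inherent method is free to return a reference tied to the lifetime of the underlying text. The sketch below uses a hypothetical stand-in type, Caps, and a hypothetical function, words, to show the difference; it is not the regex crate's actual implementation.

```rust
use std::ops::Index;

// Hypothetical stand-in for regex::Captures<'t>: it only holds a
// borrowed slice of the original text.
struct Caps<'t> {
    text: &'t str,
}

impl<'t> Caps<'t> {
    // An inherent method may return the longer lifetime 't, so the
    // result can outlive the `Caps` value itself.
    fn get(&self) -> &'t str {
        self.text
    }
}

impl<'t> Index<usize> for Caps<'t> {
    type Output = str;

    // The trait dictates this signature: the returned reference borrows
    // from `&self`, so it cannot outlive the `Caps` value.
    fn index(&self, _i: usize) -> &str {
        self.text
    }
}

fn words<'a>(s: &'a str) -> Vec<&'a str> {
    let mut out = Vec::new();
    for w in s.split(' ') {
        let cap = Caps { text: w }; // dropped at the end of each iteration
        // out.push(&cap[0]);       // error[E0597]: `cap` does not live long enough
        out.push(cap.get());        // fine: borrows from `s`, not from `cap`
    }
    out
}
```

Uncommenting the &cap[0] line should reproduce the same E0597 error as in the question, because that borrow is limited to the loop-local cap.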

So get works (note that I have replaced the &'a String parameter with &'a str):

use regex::Regex;

pub fn string_split<'a>(s: &'a str) -> Vec<&'a str> {
    let mut return_vec = Vec::new();
    let re = Regex::new(r"\S+").unwrap();

    for cap in re.captures_iter(s) {
        return_vec.push(cap.get(0).unwrap().as_str());
    }

    return_vec
}

fn main() {
    println!("{:?}", string_split("Hello, world!"));
}
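
If the pattern is only ever \S+, a regex may not be needed at all. As a sketch of an alternative (not something the question asked for), the standard library's str::split_whitespace yields sub-slices of the input that carry the input's lifetime, so the same signature works with no capture objects involved. The name string_split_std is made up here for illustration.

```rust
// Hypothetical std-only variant of `string_split`.
// `split_whitespace` yields `&'a str` slices borrowed directly from `s`,
// so the returned Vec is tied only to the lifetime of the input.
pub fn string_split_std<'a>(s: &'a str) -> Vec<&'a str> {
    s.split_whitespace().collect()
}
```

If you want to stay with the regex crate, find_iter is probably a closer fit than captures_iter when there are no capture groups, since each match's as_str() returns a slice of the original text.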