Replace some characters in a string with the next Unicode character

Question

I have an input text as follows:

 inputtext = "This is a test";

I need to replace certain characters (based on some condition) with the next Unicode character:

 let i = 0;
 for c in inputtext.chars() {
   if (somecondition){
     // Replace char here
     inputtext.replace_range(i..i+1, newchar);
     // println!("{}", c);
   }
 }

What is the best way to do this?

Answer 1

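You can't easily update the string in place, because a Rust string is not simply an array of characters; it is an array of bytes (encoded as UTF-8), and different characters may take different numbers of bytes. For example, the character ߿ (U+07FF "Nko Taman Sign") takes two bytes, whereas the next Unicode character ࠀ (U+0800 "Samaritan Letter Alaf") takes three bytes.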
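To make the size difference concrete, here is a minimal sketch that reports how many UTF-8 bytes a character and its successor occupy. The helper name `utf8_lengths` is made up for this illustration; `char::len_utf8` and `char::from_u32` are standard library methods.

```rust
/// Returns the UTF-8 byte lengths of `ch` and of the next code point,
/// or `None` if the next code point is not a valid `char`.
/// (`utf8_lengths` is a hypothetical helper, not a std function.)
fn utf8_lengths(ch: char) -> Option<(usize, usize)> {
    // `ch as u32 + 1` cannot overflow: char::MAX is 0x10FFFF.
    let next = char::from_u32(ch as u32 + 1)?;
    Some((ch.len_utf8(), next.len_utf8()))
}
```

Calling `utf8_lengths('\u{07FF}')` should give `Some((2, 3))`, which is why overwriting a single character in place can require shifting every byte that follows it.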

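So the simplest approach is to turn the string into an iterator of characters (using .chars()), manipulate that iterator as needed, and then build a new string with .collect().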

For example:

let old = "abcdef";

let new = old.chars()
    // Note: char::from_u32 returns None when ch + 1 is not a valid char,
    //       i.e. when ch == char::MAX or ch == '\u{D7FF}' (the next code
    //       point is a surrogate). Here I chose to leave the character
    //       unchanged, but this may be different from what you need.
    .map(|ch| {
        // `somecondition` is a placeholder for your own test on `ch`
        if somecondition {
            char::from_u32(ch as u32 + 1).unwrap_or(ch)
        } else {
            ch
        }
    })
    .collect::<String>();
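As a self-contained variant of the snippet above, the condition can be passed in as a closure. This is only a sketch: the function name `bump_chars` and the predicate parameter `pred` are made up for this example, and leaving unmappable characters unchanged is the same assumption as before.

```rust
/// Returns a new string in which every character matching `pred`
/// is replaced by the next Unicode scalar value.
/// (`bump_chars` and `pred` are hypothetical names for this sketch.)
fn bump_chars<F>(input: &str, pred: F) -> String
where
    F: Fn(char) -> bool,
{
    input
        .chars()
        .map(|ch| {
            if pred(ch) {
                // from_u32 yields None after char::MAX and after U+D7FF
                // (the next code point is a surrogate); keep `ch` then.
                char::from_u32(ch as u32 + 1).unwrap_or(ch)
            } else {
                ch
            }
        })
        .collect()
}
```

For the input from the question, `bump_chars("This is a test", |c| c == 't')` should produce `"This is a uesu"`. The original string is left untouched, so assign the result back if you want to replace it.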


string

replace

rust