How to avoid double '\' escape when using serde_json in Rust?

Is there some way to create a string value from bytes without serde doubling the backslash characters?

Playground:

use serde_json::json;
use serde_json::{Value};
use std::str;

fn main() {
    let bytes = [79, 66, 88, 90, 70, 65, 68, 54, 80, 54, 76, 65, 92,
                117, 48, 48, 49, 102, 50, 50, 50, 50, 71, 66, 54, 87,
                65, 65, 85, 52, 54, 87, 87, 86, 92, 117, 48, 48, 49, 102,
                123, 92, 34, 36, 116, 122, 92, 34, 58, 92, 34, 69, 117, 114,
                111, 112, 101, 47, 66, 101, 114, 108, 105, 110, 92, 34, 125];
    let string = str::from_utf8(&bytes).unwrap();
    let json_string = json!(&string);
    let json_string2 = Value::String(string.to_string());
    println!("string: {}",string);
    println!("json 1: {}",json_string);
    println!("json 2: {}",json_string2);
}

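The following characters are reserved in JSON and must be properly escaped to be used in strings:

  • Backspace is replaced with \b
  • Form feed is replaced with \f
  • Newline is replaced with \n
  • Carriage return is replaced with \r
  • Tab is replaced with \t
  • Double quote is replaced with \"
  • Backslash is replaced with \\

The answer is "no", because if the backslash were not escaped, serde would produce invalid JSON.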
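To make the escaping rules above concrete, here is a minimal hand-written sketch that uses only the standard library. The helper name `escape_json_string` is hypothetical and is not part of serde_json; it only illustrates why a literal backslash in the input ends up doubled in the serialized text.

```rust
/// Hand-written illustration of the JSON string escaping rules listed above.
/// `escape_json_string` is a hypothetical helper, not serde_json's implementation.
fn escape_json_string(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 2);
    out.push('"');
    for c in input.chars() {
        match c {
            '\u{08}' => out.push_str("\\b"),
            '\u{0C}' => out.push_str("\\f"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            // Any other control character is written in the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

fn main() {
    // The input holds a literal backslash followed by `u001f` (six characters),
    // not the single control character U+001F, so the backslash gets escaped.
    let input = r#"A\u001f{\"k\"}"#;
    println!("{}", escape_json_string(input));
}
```

The doubling only exists in the serialized text. A JSON parser reading that text back recovers the original string with single backslashes.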

But how is it possible to build a serde_json::Value::String from a &[u8]?

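You first have to create a regular string and then escape the reserved characters. Fortunately, serde_json gives us the json!() macro to take care of the latter: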

use serde_json::json;

fn main() {
    // If the bytes are not UTF-8 encoded, use an external crate to decode them first.
    let slice: &[u8] = b"hel\"lo"; // some UTF-8 encoded slice
    let string = String::from_utf8(slice.to_vec()).unwrap();
    // The value stores the string as-is; escaping happens when it is serialized.
    let v = json!(string);
    println!("{:?}", v);
}

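You have a string that already contains escapes. To avoid the backslashes themselves being escaped, you can interpret the escapes yourself before passing the string to serde. For example, using the unescape crate to interpret the escapes, the code looks like this: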

use serde_json::json;
use std::str;
use unescape::unescape;

fn main() {
    let bytes = [
        79, 66, 88, 90, 70, 65, 68, 54, 80, 54, 76, 65, 92, 117, 48, 48, 49, 102, 50, 50, 50, 50,
        71, 66, 54, 87, 65, 65, 85, 52, 54, 87, 87, 86, 92, 117, 48, 48, 49, 102, 123, 92, 34, 36,
        116, 122, 92, 34, 58, 92, 34, 69, 117, 114, 111, 112, 101, 47, 66, 101, 114, 108, 105, 110,
        92, 34, 125,
    ];
    let string_with_escapes = str::from_utf8(&bytes).unwrap();
    let unescaped_string = unescape(string_with_escapes).unwrap();
    let json_string = json!(&unescaped_string);
    println!("string with escapes: {}", string_with_escapes);
    println!("string without escapes: {}", unescaped_string);
    println!("json: {}", json_string);
}

Output (note that the string without escapes contains some unprintable characters that are not rendered here):

string with escapes: OBXZFAD6P6LA\u001f2222GB6WAAU46WWV\u001f{\"$tz\":\"Europe/Berlin\"}
string without escapes: OBXZFAD6P6LA2222GB6WAAU46WWV{"$tz":"Europe/Berlin"}
json: "OBXZFAD6P6LA\u001f2222GB6WAAU46WWV\u001f{\"$tz\":\"Europe/Berlin\"}"

If you would rather avoid the dependency on unescape (which has not been updated since it was created in 2016), you can even let serde_json do the unescaping:

fn unescape(s: &str) -> serde_json::Result<String> {
    serde_json::from_str(&format!("\"{}\"", s))
}


You can create a custom Formatter:

use serde::ser::Serialize;
use serde_json::ser::{CharEscape, Serializer};
use std::{io, str};

struct NoEscape;

impl serde_json::ser::Formatter for NoEscape {
    fn write_char_escape<W: ?Sized>(
        &mut self,
        writer: &mut W,
        char_escape: CharEscape,
    ) -> io::Result<()>
    where
        W: io::Write,
    {
        use self::CharEscape::*;

        let c = match char_escape {
            Quote => '"',
            ReverseSolidus => '\\',
            Solidus => '/',
            Backspace => 'b',
            FormFeed => 'f',
            LineFeed => 'n',
            CarriageReturn => 'r',
            Tab => 't',
            AsciiControl(_) => todo!(),
        };

        write!(writer, "{}", c)
    }
}

fn main() {
    let raw = [
        79, 66, 88, 90, 70, 65, 68, 54, 80, 54, 76, 65, 92, 117, 48, 48, 49, 102, 50, 50, 50, 50,
        71, 66, 54, 87, 65, 65, 85, 52, 54, 87, 87, 86, 92, 117, 48, 48, 49, 102, 123, 92, 34, 36,
        116, 122, 92, 34, 58, 92, 34, 69, 117, 114, 111, 112, 101, 47, 66, 101, 114, 108, 105, 110,
        92, 34, 125,
    ];
    let foo = str::from_utf8(&raw).unwrap();
    let mut ser = Serializer::with_formatter(Vec::new(), NoEscape {});
    foo.serialize(&mut ser).unwrap();
    let writer = ser.into_inner();
    let result = str::from_utf8(&writer).unwrap();

    assert_eq!(
        result,
        r#""OBXZFAD6P6LA\u001f2222GB6WAAU46WWV\u001f{\"$tz\":\"Europe/Berlin\"}""#
    )
}