System.Text.Json: Deserialize an object with multiple child objects into the same instance

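This may be a slightly odd use case, but I have a large amount of JSON that I need to load into a database. The problem is that the same author appears several times within a JSON object, and I need every occurrence to reference the same Author object.

For example: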

public class Book
{
  public string Title { get; set; }
  public string Genre { get; set; }
  public Author Author { get; set; }
  public List<Author> Reviewers { get; set; }
}

public class Author
{
  public string Id { get; set; }
  public string Name { get; set; }
}

// data.json
{
  "title": "...",
  "genre": "...",
  "author": { // Deserializes into Author class
    "id": "1",
    "name": "..."
  },
  "reviewers": [ // Deserializes into List<Author>
    {
      "id": "1", // Needs to point to same object as "author"
      "name": "..."
    },
    {
      "id": "2",
      "name": "..."
    }
  ]
}

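In this example, the authors with the same Id are deserialized into two different objects, but I need them to reference the same object so that it can be tracked by its id, because Entity Framework Core requires that no two tracked instances share the same key. Otherwise it throws: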

System.InvalidOperationException:
  'The instance of entity type 'Author' cannot be tracked because another instance with the key
  value '{Id: 1}' is already being tracked. When attaching existing entities, ensure that only
  one entity instance with a given key value is attached.'

You can fix up the author references after deserialization, like this:

// Maps each author Id to the single instance that every book should share.
Dictionary<string, Author> authors = new();

foreach(var book in books)
{
  book.Author = GetAuthor(book.Author);
  book.Reviewers = book.Reviewers.Select(r => GetAuthor(r)).ToList();
}

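// Returns the first Author instance seen for this Id, registering it if it is new.
Author GetAuthor(Author author)
{
  if(!authors.TryGetValue(author.Id, out var existing))
  {
    authors[author.Id] = existing = author;
  }
  return existing;
}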
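
To check that the fix-up above really collapses duplicates, here is a minimal end-to-end sketch. The `AuthorFixup` class and its `Load` method are hypothetical names of mine, and I am assuming the file holds a JSON array of books without comments. Since the JSON keys are lowercase while the C# properties are PascalCase, the sketch sets `PropertyNameCaseInsensitive`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public class Author
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public class Book
{
    public string Title { get; set; } = "";
    public string Genre { get; set; } = "";
    public Author Author { get; set; } = new();
    public List<Author> Reviewers { get; set; } = new();
}

// Hypothetical helper, not part of System.Text.Json.
public static class AuthorFixup
{
    public static List<Book> Load(string json)
    {
        // Lowercase JSON keys ("title") must match PascalCase properties (Title).
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        var books = JsonSerializer.Deserialize<List<Book>>(json, options) ?? new List<Book>();

        // First instance seen for each Id becomes the shared one.
        var authors = new Dictionary<string, Author>();
        Author GetAuthor(Author author)
        {
            if (!authors.TryGetValue(author.Id, out var existing))
            {
                authors[author.Id] = existing = author;
            }
            return existing;
        }

        foreach (var book in books)
        {
            book.Author = GetAuthor(book.Author);
            book.Reviewers = book.Reviewers.Select(r => GetAuthor(r)).ToList();
        }
        return books;
    }
}
```

After `Load` returns, `ReferenceEquals(book.Author, book.Reviewers[0])` should be true for the sample data, because both entries have id "1". Note that the first occurrence wins, so if two entries with the same id carry different names, the later name is discarded.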

Depending on the number of books and processors, it may be worth doing this in parallel:

// Thread-safe map from author Id to the shared Author instance.
ConcurrentDictionary<string, Author> authors = new();

Parallel.ForEach(books, book =>
{
  book.Author = GetAuthor(book.Author);
  book.Reviewers = book.Reviewers.Select(r => GetAuthor(r)).ToList();
});

Author GetAuthor(Author author)
{
  return authors.GetOrAdd(author.Id, _ => author);
}
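
If you would rather not do a second pass at all, one possible alternative is to deduplicate while deserializing, using a custom `JsonConverter<Author>` that remembers the instances it has already produced. This is only a sketch under some assumptions: `CachingAuthorConverter` and `BookImport` are hypothetical names of mine, the JSON keys are assumed to be exactly `id` and `name` as in data.json, and the converter is stateful, so it should be created per import and not shared across threads.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class Author
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public class Book
{
    public string Title { get; set; } = "";
    public string Genre { get; set; } = "";
    public Author Author { get; set; } = new();
    public List<Author> Reviewers { get; set; } = new();
}

// Hypothetical converter: returns the same Author instance for a repeated id.
public sealed class CachingAuthorConverter : JsonConverter<Author>
{
    private readonly Dictionary<string, Author> _seen = new();

    public override Author? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Parse the author object manually so we do not re-enter this converter.
        using JsonDocument doc = JsonDocument.ParseValue(ref reader);
        JsonElement root = doc.RootElement;

        // Assumes the key is spelled "id" exactly as in data.json.
        string id = root.GetProperty("id").GetString()
                    ?? throw new JsonException("Author id is null.");

        if (_seen.TryGetValue(id, out var existing))
        {
            return existing;
        }

        string name = root.TryGetProperty("name", out var n) ? (n.GetString() ?? "") : "";
        var author = new Author { Id = id, Name = name };
        _seen[id] = author;
        return author;
    }

    public override void Write(Utf8JsonWriter writer, Author value, JsonSerializerOptions options)
    {
        writer.WriteStartObject();
        writer.WriteString("id", value.Id);
        writer.WriteString("name", value.Name);
        writer.WriteEndObject();
    }
}

// Hypothetical entry point: fresh options and cache for every import.
public static class BookImport
{
    public static List<Book> Deserialize(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        options.Converters.Add(new CachingAuthorConverter());
        return JsonSerializer.Deserialize<List<Book>>(json, options) ?? new List<Book>();
    }
}
```

The trade-off is that the cache lives as long as the converter does, so reusing the same options object for unrelated imports would hand back authors from an earlier run. For a one-off bulk load that is usually what you want; otherwise the post-processing loop above is simpler to reason about.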