Data co-occurrence in a JSON dataset

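I'm having trouble extracting information from JSON. My JSON file contains the 100 chapters of a novel. Each chapter lists some of the characters that appear in it.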

For example:

{"ONE": ["PERSON A", "PERSON B", "PERSON C", "PERSON D", "PERSON A"],
"TWO": ["PERSON A", "PERSON D", "PERSON F", "PERSON G", "PERSON H"],
"THREE": ["PERSON F", "PERSON D", "PERSON A", "PERSON A", "PERSON A"]
... "ONE HUNDRED": ["PERSON B", "PERSON A"]
}

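My goal is to design a method that extracts how many times two characters co-occur across the whole book, where two characters can co-occur at most once per chapter. For example, across the 100 chapters, I want to know how many times PERSON A and PERSON B appear together.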

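I have thought of two approaches:

A. Use JSON Path to filter the dataset down to the chapters where PERSON A and PERSON B both appear, then count those chapters. (I also don't know what to query :P )

B. Use JavaScript, although I'm not very good at it. My idea is to define an integer counter and then run a for loop over every chapter in the JSON file.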

I'd appreciate it if you could share what you know about this. Thanks!

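Here is a function where you can specify whether you want the chapter count or the array of matching chapters.

Here is the function broken down: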

const cooccur = (people, rettype) => {
  let result = Object.keys(
  // the final result will be an array of object keys (the chapter names)
     Object.fromEntries(Object.entries(chapters)
     // to iterate over the object, first convert it into an array of [key, value] pairs with Object.entries
     // then, after filtering, convert the result back into an object with Object.fromEntries
        .filter(c => people.filter(r => c[1].indexOf(r) > -1).length === people.length)));
         // the outer filter runs through each chapter and keeps it based on the inner filter's result
         // the inner filter takes our people array and keeps only the people who appear in the given chapter
         // if the number of people kept equals the number of people we're searching for, it's a match
  return rettype === 'count' ? result.length : result;
}
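The same filtering idea can be written without the round trip through Object.fromEntries. This is only a sketch, assuming the same chapter-to-names shape as in the question; `chaptersWithAll`, `chapterMap`, and `demo` are hypothetical names used for illustration, and the chapter object is passed in as a parameter rather than read from a global.

```javascript
// Hypothetical variant: returns the names of the chapters in which
// every person in `people` appears at least once.
const chaptersWithAll = (chapterMap, people) =>
  Object.keys(chapterMap).filter(name =>
    // `every` stops at the first missing person; `includes` ignores repeats
    people.every(person => chapterMap[name].includes(person)));

// Small made-up sample in the same shape as the question's data
const demo = {
  "ONE": ["PERSON A", "PERSON B", "PERSON A"],
  "TWO": ["PERSON A", "PERSON D"],
  "THREE": ["PERSON B", "PERSON A"]
};

console.log(chaptersWithAll(demo, ["PERSON A", "PERSON B"])); // logs the array ["ONE", "THREE"]
```

Taking the length of the returned array gives the count, so the rettype switch is optional in this form.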

let chapters = {
  "ONE": ["PERSON A", "PERSON B", "PERSON C", "PERSON D", "PERSON A"],
  "TWO": ["PERSON A", "PERSON D", "PERSON F", "PERSON G", "PERSON H"],
  "THREE": ["PERSON F", "PERSON D", "PERSON A", "PERSON A", "PERSON A"],
  "ONE HUNDRED": ["PERSON B", "PERSON A"]
}

const cooccur = (people, rettype) => {
  let result = Object.keys(Object.fromEntries(Object.entries(chapters).filter(c => people.filter(r => c[1].indexOf(r) > -1).length === people.length)));
  return rettype === 'count' ? result.length : result;
}

console.log('number of occurrences:', cooccur(["PERSON A", "PERSON B"], 'count'));
console.log('occurrence chapters:', cooccur(["PERSON A", "PERSON B"], 'chapters'));
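For comparison, approach B from the question (an integer counter plus a for loop over the chapters) might look roughly like the sketch below. `countCooccurrences` and `sampleChapters` are hypothetical names, and the sample data is just the four chapters shown in the question, not the full book.

```javascript
// Hypothetical helper: counts the chapters in which every listed person appears.
function countCooccurrences(chapterMap, people) {
  let count = 0;
  for (const names of Object.values(chapterMap)) {
    // A Set collapses repeated names, so a chapter can match at most once
    const present = new Set(names);
    if (people.every(person => present.has(person))) {
      count += 1;
    }
  }
  return count;
}

const sampleChapters = {
  "ONE": ["PERSON A", "PERSON B", "PERSON C", "PERSON D", "PERSON A"],
  "TWO": ["PERSON A", "PERSON D", "PERSON F", "PERSON G", "PERSON H"],
  "THREE": ["PERSON F", "PERSON D", "PERSON A", "PERSON A", "PERSON A"],
  "ONE HUNDRED": ["PERSON B", "PERSON A"]
};

console.log(countCooccurrences(sampleChapters, ["PERSON A", "PERSON B"])); // 2
```

Because each chapter is tested once and only adds 1 to the counter, repeated names inside a chapter (such as PERSON A in chapter THREE) never inflate the result.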

This probably overlaps with @Kinglish's answer, but I want to add it for completeness.

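JSON Path proper doesn't have syntax for this yet, but we are building an official specification, so now is the best time to propose it. In fact, we have recently been working on which expression syntax to support. I referenced this question in a comment to explain a proposal.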