How to efficiently find the exact individual count of an array of strings in an array of sentences?

Question

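How can I efficiently find the exact number of times each string from an array of names occurs in an array of sentences?

Example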

var names= ["jhon", "parker"];
var sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"];

Answer: jhon -> 1 time (do not count jhonny), parker -> 3 times.

What I am doing:

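var lenObj = {};
for (let i = 0; i < sentences.length; i++) {
    // Split the sentence into words once per sentence.
    var words = sentences[i].split(" ");
    for (let j = 0; j < names.length; j++) {
        // Compare each word with the current name and update the count in lenObj.
        var hits = words.filter(w => w === names[j]).length;
        lenObj[names[j]] = (lenObj[names[j]] || 0) + hits;
    }
}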

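Using a regular expression: I use \b as the word boundary. The problem is that the pattern is dynamic and I cannot get the value into it, so "/\b+sentences[i]+"\b/gi" does not work.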

for(let i=0; i< sentences.length; i++){
    for(let j=0; j<name.length; j++){
        var count = (str.match("/\b+sentences[i]+"\b/gi") || []).length; // is not working
        // if I hardcode it then it is working (str.match(/\bjhon\b/gi));
    }
}

But I feel the approach above is not efficient. Is there a more efficient, better-optimized way to do this?

Answer 1

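Build a regular expression by surrounding each name with \b, joining the results with |, and passing that string to new RegExp. You can then iterate over each sentence and over each match of the pattern, tallying every match on an object that counts the matches for each name: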

var names= ["jhon", "parker"];
var sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"];
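// The backslash must be doubled: inside a template literal, `\b` alone is a backspace character.
const pattern = new RegExp(names.map(name => `\\b${name}\\b`).join('|'), 'gi');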

const counts = {};
for (const sentence of sentences) {
  for (const match of (sentence.match(pattern) || [])) {
    counts[match] = (counts[match] || 0) + 1;
  }
}
console.log(counts);
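
A slightly more defensive variant of the same idea can be sketched as follows. The helper names `escapeRegExp` and `countNames` are hypothetical and not part of the answer above. The sketch assumes every name begins and ends with a word character (otherwise \b will not behave as expected), escapes regex metacharacters in the names, and lower-cases each match so that the `i` flag does not spread one name over several keys:

```javascript
// Hypothetical helper: escape regex metacharacters so a name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: count whole-word, case-insensitive occurrences of each name.
function countNames(names, sentences) {
  // An empty alternation would match the empty string everywhere, so bail out early.
  if (names.length === 0) return {};

  const pattern = new RegExp(
    names.map(n => `\\b${escapeRegExp(n)}\\b`).join('|'),
    'gi'
  );

  // Start every name at zero so names that never occur still appear in the result.
  const counts = Object.fromEntries(names.map(n => [n.toLowerCase(), 0]));

  for (const sentence of sentences) {
    for (const match of sentence.match(pattern) || []) {
      counts[match.toLowerCase()] += 1;
    }
  }
  return counts;
}

console.log(countNames(
  ["jhon", "parker"],
  ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"]
));
```

Keys are stored in lower case, so a caller who needs the original spelling would have to map them back.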

Answer 2

You can split the strings, filter the parts by name, and take the length of the resulting array.

var names = ["jhon", "parker"],
    sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"],
    parts = sentences.join(' ').split(/\s+/),
    result = names.map(name => parts
        .filter(s => s === name)
        .length
    );

console.log(result);

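Linear time complexity:

Create an object with the wanted names as keys and zero as the initial count,
join sentences into a single string,
split this string,
iterate over the parts and, if a part is a key of counts, increment its count.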

var names = ["jhon", "parker"],
    sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"],
    counts = names.reduce((o, n) => (o[n] = 0, o), {});

sentences.join(' ').split(/\s+/).forEach(s => {
    if (s in counts) counts[s]++;
});

console.log(counts);
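
The same single-pass idea can also be sketched with a Map instead of a plain object. The function name `countWords` is hypothetical. Two assumptions differ from the snippet above: words are split on runs of non-word characters (`\W+`, which only treats ASCII letters, digits and underscore as word characters) so that trailing punctuation does not hide a name, and a Map is used so that words such as "constructor" are not mistaken for keys inherited from Object.prototype:

```javascript
// Hypothetical helper: single pass over all words, counting only the wanted names.
function countWords(names, sentences) {
  // A Map has no inherited keys, unlike a plain object checked with `in`.
  const counts = new Map(names.map(n => [n, 0]));

  for (const sentence of sentences) {
    // Split on anything that is not a letter, digit or underscore.
    for (const word of sentence.split(/\W+/)) {
      if (counts.has(word)) {
        counts.set(word, counts.get(word) + 1);
      }
    }
  }
  return counts;
}

const result = countWords(
  ["jhon", "parker"],
  ["hello jhon,", "parker. parker!", "jhonny constructor parker"]
);
console.log(Object.fromEntries(result));
```

Each word costs one Map lookup, so the running time grows with the total length of the sentences plus the number of names.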

Answer 3

You can use the RegExp object for dynamic expressions, together with the functions map and reduce for counting.

let names= ["jhon", "parker"],
    sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"],
    // `\\b` yields the regex word boundary; a single `\b` in a template literal is a backspace.
    result = names.map(n => sentences.reduce((a, s) => a + (s.match(new RegExp(`\\b${n}\\b`, "g")) || []).length, 0));

console.log(result);

Linear complexity approach

let names= ["jhon", "parker"],
    sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"],
    words = sentences.join(" "),
    // `\\b` yields the regex word boundary; a single `\b` in a template literal is a backspace.
    result = names.map(n => (words.match(new RegExp(`\\b${n}\\b`, "g")) || []).length);

console.log(result);
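
One detail worth checking when a pattern is built from a string: in a JavaScript string or template literal, `\b` is the backspace character (U+0008), not the regex word boundary. The short sketch below compares the three spellings; the variable names are only illustrative:

```javascript
const name = "jhon";
const text = "jhon jhonny jhon";

// `\b` is turned into a backspace before RegExp ever sees it, so nothing matches.
const broken = new RegExp(`\b${name}\b`, "g");

// Doubling the backslash passes a literal `\b` to RegExp.
const working = new RegExp(`\\b${name}\\b`, "g");

// String.raw keeps backslashes as written, which avoids the doubling.
const viaRaw = new RegExp(String.raw`\b${name}\b`, "g");

const brokenCount = (text.match(broken) || []).length;
const workingCount = (text.match(working) || []).length;
const rawCount = (text.match(viaRaw) || []).length;

console.log(brokenCount, workingCount, rawCount); // 0 2 2
```

This is also why the hardcoded literal /\bjhon\b/gi from the question works while a pattern assembled from strings may silently match nothing.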

Tags: javascript, arrays, algorithm, trie