How can I efficiently count the exact occurrences of each string from an array within an array of sentences?
Example:
var names= ["jhon", "parker"];
var sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"];
Expected answer: jhon -> 1 time (jhonny must not be counted), parker -> 3 times.
What I am doing:
var lenObj ={};
for(let i=0; i< sentences.length; i++){
for(let j=0; j<names.length; j++){
// split the sentences element and compare with each word in names array. And update the count in lenObj;
}
}
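Using a regular expression: I use \b as the word boundary.
But the name is dynamic, so I cannot put its value into a regex literal; "/\b+sentences[i]+"\b/gi" does not work.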
for(let i=0; i< sentences.length; i++){
for(let j=0; j<names.length; j++){
var count = (str.match("/\b+sentences[i]+"\b/gi") || []).length; // is not working
// if I hardcode it then it is working (str.match(/\bjhon\b/gi));
}
}
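But I feel the above solution is not efficient. Is there a more efficient, optimized way to do this?
Build a regular expression by surrounding each name with \b, joining the pieces with |, and passing the result to new RegExp. You can then iterate over each sentence and over each match of the pattern, and tally every match in an object that keeps the number of matches per name: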
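var names = ["jhon", "parker"];
var sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"];
// In a template literal "\b" is a backspace character, so the backslash is doubled
// to hand the regex engine a real word-boundary token.
const pattern = new RegExp(names.map(name => `\\b${name}\\b`).join('|'), 'gi');
const counts = {};
for (const sentence of sentences) {
  for (const match of (sentence.match(pattern) || [])) {
    // The i flag lets "Jhon" match too, so normalize the key before counting.
    const key = match.toLowerCase();
    counts[key] = (counts[key] || 0) + 1;
  }
}
console.log(counts); // { jhon: 1, parker: 3 }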
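If a name can contain characters that are special in a regular expression (a dot, a plus sign, parentheses), the pattern above would treat them as regex syntax. A minimal sketch of one way to guard against that follows; escapeRegExp and countNames are hypothetical helper names introduced here for illustration, not part of the original answer, and the sketch assumes each name starts and ends with a word character so that \b still makes sense.

```javascript
// Hypothetical helper: backslash-escape every character that is special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: count whole-word, case-insensitive occurrences of each name.
function countNames(names, sentences) {
  // An empty alternation would match the empty string everywhere, so bail out early.
  if (names.length === 0) return {};

  const pattern = new RegExp(
    names.map(name => `\\b${escapeRegExp(name)}\\b`).join('|'),
    'gi'
  );

  // Start every name at zero so names that never occur still appear in the result.
  const counts = Object.fromEntries(names.map(name => [name.toLowerCase(), 0]));

  for (const sentence of sentences) {
    for (const match of sentence.match(pattern) || []) {
      counts[match.toLowerCase()] += 1;
    }
  }
  return counts;
}

console.log(countNames(
  ["jhon", "parker"],
  ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"]
));
```

Pre-filling the counts with zeros is a design choice: it keeps the result shape stable even when a name does not appear in any sentence.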
You can split the joined sentences into words, filter the words by name, and take the length of the resulting array.
var names = ["jhon", "parker"],
sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"],
parts = sentences.join(' ').split(/\s+/),
result = names.map(name => parts
.filter(s => s === name)
.length
);
console.log(result);
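Linear time complexity:
- create an object with the wanted names as keys and zero as the initial count,
- join sentences into a single string,
- split this string into parts,
- iterate over the parts, check whether each part is a key of counts, and if so increment that count.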
var names = ["jhon", "parker"],
sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"],
counts = names.reduce((o, n) => (o[n] = 0, o), {});
sentences.join(' ').split(/\s+/).forEach(s => {
if (s in counts) counts[s]++;
});
console.log(counts);
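The same single-pass idea can be sketched with a Map, which avoids one pitfall of the in operator on a plain object: a word such as "constructor" would be found on the prototype chain. This is only an illustration under stated assumptions: countWords is a hypothetical name, matching is case-sensitive as in the code above, and splitting on /\W+/ assumes ASCII words, so trailing punctuation like "jhon," is stripped but non-ASCII letters would also act as separators.

```javascript
// Hypothetical helper: one pass over all words, with constant-time lookups in a Map.
function countWords(names, sentences) {
  // Every wanted name starts with a count of zero.
  const counts = new Map(names.map(name => [name, 0]));

  for (const sentence of sentences) {
    // Split on runs of non-word characters so punctuation does not stick to a name.
    for (const word of sentence.split(/\W+/)) {
      if (counts.has(word)) {
        counts.set(word, counts.get(word) + 1);
      }
    }
  }
  return counts;
}

const counts = countWords(
  ["jhon", "parker"],
  ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"]
);
console.log(Object.fromEntries(counts));
```

The work is proportional to the total number of words plus the number of names, since each word costs one Map lookup.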
You can use the RegExp object for the dynamic expression, together with the functions map and reduce for counting.
let names= ["jhon", "parker"],
sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"],
// "\\b" keeps the backslash so the regex receives a word boundary, not a backspace.
result = names.map(n => sentences.reduce((a, s) => a + (s.match(new RegExp(`\\b${n}\\b`, "g")) || []).length, 0));
console.log(result);
Linear-complexity approach (one regex pass over the joined text per name):
let names= ["jhon", "parker"],
sentences = ["hello jhon", "hello parker and parker", "jhonny jhonny yes parker"],
words = sentences.join(" "),
// "\\b" keeps the backslash so the regex receives a word boundary, not a backspace.
result = names.map(n => (words.match(new RegExp(`\\b${n}\\b`, "g")) || []).length);
console.log(result);