How can I create a regular expression that checks, for each sentence, that several sentence-specific keywords are present?
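The text I want to scan is: "Why did you come here today?" (the keywords being "why", "you", "come", "here", "today").
I want to build a regular expression that can also scan for similar questions, such as:
- "What's brought you here today?" (the keywords being "what's", "brought", "you", "today")
- "What is the reason for coming in today?" (the keywords being "what", "reason", "coming", "today")
- "Is there a reason you are here this morning?" (the keywords being "is", "reason", "here", "morning")
- "Why are you here today?" (the keywords being "why", "you", "here", "today")
So it would be something like this: /keywords from first sentence | keywords from second sentence | keywords from third sentence | etc... /ig
If the above regular expression matches (using `.test`), then the user would be alerted with something.
I have tried the following:
/(?=.*\bwhat's\b)(?=.*\bbrought\b)(?=.*\byou\b)(?=.*\btoday\b)|(?=.*\bwhy\b)(?=.*\byou\b)(?=.*\bhere\b)(?=.*\btoday\b)/ig
It does not matter which of the questions is asked, because the same function is executed either way (I only need the keywords to match!).
I want to build a regular expression that scans for the keywords from all of these phrases and, if the keywords match, executes a function. If anyone has suggestions on topics/readings for creating such a regular expression, that would be great!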
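The lookahead pattern attempted above can also be generated from the keyword lists rather than written by hand. The following is only a minimal sketch under a few assumptions: each input is a single-line sentence, and the names `escapeRegExp`, `toLookaheadGroup`, `anyQuestionPattern`, and `matchesAnyQuestion` are hypothetical helpers, not part of the original question. The pattern is anchored with `^` and omits the `g` flag, because `g` makes `.test` stateful through `lastIndex` and is not needed for a plain yes/no check.

```javascript
// Hypothetical helper: escape regex metacharacters inside a keyword.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Keyword lists taken from the example questions.
const keywordLists = [
  ["what's", "brought", "you", "today"],
  ["what", "reason", "coming", "today"],
  ["is", "reason", "here", "morning"],
  ["why", "you", "here", "today"],
];

// Build one run of lookaheads per keyword list,
// e.g. (?=.*\bwhy\b)(?=.*\byou\b)(?=.*\bhere\b)(?=.*\btoday\b)
const toLookaheadGroup = (keywords) =>
  keywords.map((keyword) => `(?=.*\\b${escapeRegExp(keyword)}\\b)`).join('');

// Alternation of all groups, anchored at the start, case insensitive.
const anyQuestionPattern = new RegExp(
  `^(?:${keywordLists.map(toLookaheadGroup).join('|')})`,
  'i'
);

// True when the sentence contains every keyword of at least one list.
const matchesAnyQuestion = (sentence) => anyQuestionPattern.test(sentence);

console.log(matchesAnyQuestion("Why are you here today?"));
console.log(matchesAnyQuestion("Nice weather today."));
```

A caller could then run whatever function is needed, for example `if (matchesAnyQuestion(input)) { alert("..."); }` in a browser.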
@PeterSeliger would you have any advice on where I can look to get started on creating such a solution? – sb2021
1/3 A working solution needs a regex for `split`ting a multiline string into an array of (line) strings. One also needs an array of arrays where each nested array holds the keywords of a line/sentence.
2/3 ... Then one wants to check for the array of new lines whether `every` single line matches the keyword criteria. The callback function, which has to return a boolean value, then needs to check whether `every` line related keyword is part of the currently processed line. And the callback function of this task will utilize a dynamically created regex ...
3/3 ... which uses the `i` flag in order to `test` the existence of a keyword in a case insensitive way (ignore case) ... because of one of the OP's use cases where one runs into comparing "What's" versus "what's".
const sampleMultilineData =
`What's brought you here today?
What is the reason for coming in today?
Is there a reason you are here this morning?
Why are you here today?`;

const listOfKeywordLists = [
  ["what's", "brought", "you", "today"],
  ["what", "reason", "coming", "today"],
  ["is", "reason", "here", "morning"],
  ["why", "you", "here", "today"],
];

const isEveryLineContainsAllOfItsKeywords = sampleMultilineData
  // create an array of separate lines
  // from the multiline string data.
  .split(/\n/)
  // check whether every single line matches the keyword criteria.
  .every((line, lineIndex) =>
    // access the line related keyword list via `lineIndex`.
    listOfKeywordLists[lineIndex]
      // check whether a line contains every keyword,
      // with word boundaries (`\b`) around the keyword,
      // in a case insensitive (`i` flag) way.
      // note: inside a template literal the backslash must be
      // doubled (`\\b`), otherwise `\b` becomes a backspace character.
      .every(keyword =>
        RegExp(`\\b${ keyword }\\b`, 'i').test(line)
      )
      // a plain `includes` check is case sensitive,
      // thus the result would be `false`.
      //.every(keyword => line.includes(keyword))
  );

console.log({ isEveryLineContainsAllOfItsKeywords });
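The snippet above checks each line against its own keyword list. The use case in the question is slightly different: a single input sentence should trigger a function if it satisfies any one of the lists. A possible sketch, assuming the same keyword lists and using hypothetical names (`escapeRegExp`, `containsKeyword`, `findMatchingListIndex`, `exampleKeywordLists`), combines `findIndex` with `every` so the caller also learns which list matched:

```javascript
// Hypothetical helper: escape regex metacharacters inside a keyword.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Case insensitive whole-word check for a single keyword.
const containsKeyword = (sentence, keyword) =>
  new RegExp(`\\b${escapeRegExp(keyword)}\\b`, 'i').test(sentence);

// Index of the first keyword list fully contained in the sentence, or -1.
const findMatchingListIndex = (sentence, keywordLists) =>
  keywordLists.findIndex((keywords) =>
    keywords.every((keyword) => containsKeyword(sentence, keyword))
  );

const exampleKeywordLists = [
  ["what's", "brought", "you", "today"],
  ["what", "reason", "coming", "today"],
  ["is", "reason", "here", "morning"],
  ["why", "you", "here", "today"],
];

const userInput = "Is there a reason you are here this morning?";
const matchedIndex = findMatchingListIndex(userInput, exampleKeywordLists);

if (matchedIndex !== -1) {
  // Placeholder for the function the question wants to execute.
  console.log(`Matched keyword list #${matchedIndex}`);
}
```

Keeping the keywords as data instead of one large hand-written pattern makes it easier to add further question variants later.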