How to search the text between HTML tags

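I am using mongoJS to handle my database queries. I have run into a problem with strings that contain HTML tags: I am searching for my string in the collection with a regular expression. How can I search the text while ignoring the HTML tags?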

var userInput = $scope.userInput; // value from user input
db.collections.find({'obj': {$regex: new RegExp(userInput) } }).toArray(function(err, result){ 
  return res.json(result); 
});

Collections

[{_id:"34aw34d343s4", obj:"How are you?"},
{_id:"34asdfwer343s4", obj:"Are you okay?"},
{_id:"3sDaweqr43s4", obj:"Goodbye, my friend!"},
{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]

User input

these are
these
these are important

Output

[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
[]

Expected

[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]
[{_id:"34aw3sdfgds3s4", obj:"Do you know these are <strong>important</strong> items"}]

If you want to remove the HTML tags, you can use one of the following methods:

  1. jQuery(html).text();
  2. yourStr.replace(/<(?:.|\n)*?>/gm, '');
  3. yourStr.replace(/<[^>]+>/g, '');
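
To illustrate how stripping tags helps with the search, here is a minimal sketch that applies method 3 to the stored string and then checks for the user's text. The helper names `stripTags` and `containsIgnoringTags` are hypothetical, and the regex is a rough tag matcher, not a full HTML parser.

```javascript
// Remove anything that looks like an HTML tag (regex-based, not a real parser).
function stripTags(html) {
  return html.replace(/<[^>]+>/g, '');
}

// Hypothetical helper: true when `text` occurs in `html` once tags are removed.
function containsIgnoringTags(html, text) {
  return stripTags(html).includes(text);
}

var stored = 'Do you know these are <strong>important</strong> items';

console.log(stripTags(stored));
// Do you know these are important items
console.log(containsIgnoringTags(stored, 'these are important'));
// true
```

This runs in application code, after the documents have been fetched. One possible design choice, assuming you control how documents are written, is to save the stripped text in a separate field at insert time and run the `$regex` query against that field.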

More on this: Strip HTML from Text JavaScript

You can use the RegExp test method: /these|are/.test(stringToCheckAgainst);

var testCases = ["these are", "these", "these are <strong>item</strong>"];

testCases.forEach(function(value) {
  document.write(/these|are/.test(value) + "\n");
});
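
Building on the RegExp idea, one way to get the expected result for "these are important" is to generate the pattern from the user input so that the gaps between words may contain HTML tags as well as whitespace. This is only a sketch: `escapeRegExp` and `buildTagTolerantPattern` are hypothetical helper names, and it assumes tags appear between words, not inside them.

```javascript
// Escape regex metacharacters so the user input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: join the words with a group that accepts
// one or more whitespace characters and/or HTML tags.
function buildTagTolerantPattern(userInput) {
  var words = userInput.trim().split(/\s+/).map(escapeRegExp);
  return words.join('(?:\\s|<[^>]+>)+');
}

var pattern = buildTagTolerantPattern('these are important');
var doc = 'Do you know these are <strong>important</strong> items';

console.log(new RegExp(pattern).test(doc));
// true
```

The resulting pattern string should be usable in the query from the question, for example `{'obj': {$regex: pattern}}`, since it only relies on non-capturing groups and character classes. Escaping the input also keeps characters such as `?` or `(` from being interpreted as regex syntax.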

You should sanitize the user input before it goes into the database. From my understanding of your system, there is a high probability that user input is not sanitized before being inserted into the database, which leaves your site vulnerable to an XSS attack.

I suggest you use a library such as sanitize-html to secure your site against cross-site scripting, and see the answers to this question.