如何过滤 parent child 数组的多个属性,这些属性可能有几个树级别深

How to filter multiple properties of a parent child array which could be several tree level deep

TL;DR; 为简单起见,我如何过滤 parent child 数组的多个属性,这些属性可能有几个树级别深。这是供数百名用户使用的开源数据网格库。

所以我有一个 parent/children 引用数组,children 也可以有 children 自己等等,树级深度没有限制。此外,我需要能够不仅从具有树结构的 属性 过滤,而且能够过滤该数组的任何属性,即网格中的列。

例如,我有这个数组,它代表一个文件资源管理器列表

const myFiles = [
    {id: 11, file: "Music", parentId: null },
    {id: 12, file: "mp3", parentId: 11 },
    {id: 14, file: "pop", parentId: 12 },
    {id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85, parentId: 14, },
    {id: 16, file: "rock", parentId: 12 },
    {id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16, },
    {id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null, },
    {id: 21, file: "Documents", parentId: null, },
    {id: 2, file: "txt", parentId: 21 },
    {id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2, },
    {id: 4, file: "pdf", parentId: 21 },
    {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
    {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, },
    {id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4, },
    {id: 7, file: "xls", parentId: 21 },
    {id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7, },
    {id: 9, file: "misc", parentId: 21 },
    {id: 10, file: "something.txt", dateModified: "2015-02-26", size: 0.4, parentId: 9, },
]

数组看起来很扁平,但实际上它是一个树视图结构,在数据网格中表示,如下所示。

我发现部分有效的方法是遍历整个数组并添加每个项目可以包含的完整文件列表,包括它自己,例如,如果文档有一个 child PDF,它本身有a child Map.pdf,那么树映射可以用 ["Documents", "PDF", "map.pdf"] 表示,我们将其存储在 parent object,然后在下一个 child 我们存储 ["PDF","map.pdf"] 最后在最后一个 child 我们存储 ["map.pdf"] 就像所以

    {id: 21, file: "Documents", parentId: null, treeMap: ["Documents", "PDF", "map.pdf"] }
    {id: 4, file: "pdf", parentId: 21, treeMap: ["PDF", "map.pdf"] }
    {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, treeMap: ["map.pdf"] }

这是允许我这样做的方法

export function modifyDatasetToAddTreeMapping(items: any[], treeViewColumn: Column, dataView: any) {
  for (let i = 0; i < items.length; i++) {
    items[i]['treeMap'] = [items[i][treeViewColumn.id]];
    let item = items[i];

    if (item['parentId'] !== null) {
      let parent = dataView.getItemById(item['parentId']);

      while (parent) {
        parent['treeMap'] = dedupePrimitiveArray(parent['treeMap'].concat(item['treeMap']));
        item = parent;
        parent = dataView.getItemById(item['parentId']);
      }
    }
  }
}

export function dedupePrimitiveArray(inputArray: Array<number | string>): Array<number | string> {
  const seen = {};
  const out = [];
  const len = inputArray.length;
  let j = 0;
  for (let i = 0; i < len; i++) {
    const item = inputArray[i];
    if (seen[item] !== 1) {
      seen[item] = 1;
      out[j++] = item;
    }
  }
  return out;
}

然后 datagrid 库使用我可以这样使用的 Filter 方法,其中 columnFilters 是一个包含 1 个或多个过滤器的 object,例如 const columnFilters = { file: 'map', size: '>3' }

datagrid 是一个库 (SlickGrid),它使用像这样的过滤方法 dataView.setFilter(treeFilter);

function treeFilter(dataView: any, item: any) {
    const columnFilters = { file: this.searchString.toLowerCase(), size: 2 };
    let filterCount = 0;

    if (item[parentPropName] !== null) {
      let parent = dataView.getItemById(item['parentId']);
      while (parent) {
        if (parent.__collapsed) {
          return false;
        }
        parent = dataView.getItemById(parent['parentId']);
      }
    }

    for (const columnId in columnFilters) {
      if (columnId !== undefined && columnFilters[columnId] !== '') {
        filterCount++;

        if (item.treeMap === undefined || !item.treeMap.find((itm: string) => itm.endsWith(columnFilters[columnId]))) {
          return false;
        }
      }
    }
    return true;
  }

通过调用 modifyDatasetToAddTreeMapping(),如果我想在文件列上进行过滤,它可以正常工作,但如果我添加更多的列过滤器,它就不会按预期工作。例如,如果你看一下第二个打印屏幕,你会看到我输入了 "map" 并且会显示 "Documents > PDF > map.pdf" 这很好但是如果添加一个小于 3Mb 的文件大小它不应该显示 "map.pdf" 并且因为该文件没有显示并且 "Documents > PDF" 不包含单词 "map" 那么什么都不应该显示,所以你可以看到过滤器没有正常工作。

所以对于当前的实现,我有 2 个问题 1. 不显示树节点时行为不正确,不应该显示它的 parent 2. 必须调用 modifyDatasetToAddTreeMapping() 是一个可能不需要的额外调用 3. 它还修改了源数组,我可以深度克隆该数组,但那将是性能上的另一项开销

在转换为层次结构(树)之后,这可能可以通过递归来实现,但是如果使用递归,我想不出最好的算法来做到这一点,总是向下钻取不是很昂贵吗树找物品?

最后,目的是将它与可能有 10k 甚至 50k 行的 SlickGrid 一起使用,因此它必须很快。你可以看到这个SlickGrid demo but their implementation of the filtering is not correct, also I found the method add the mapping in this other SO Answer

注意:我还想指出,此问题的解决方案可能会使数百(或数千)用户受益,因为它将在 Angular-Slickgrid and Aurelia-Slickgrid 中实现,它们都是开源库和至少有 300 多个用户使用。

使用单词 "map" 进行过滤不应 return 任何内容,因为 nodes/children 的 none 具有该文本。

编辑

最好的代码是将完成这项工作的任何代码插入常规 JS filter,因此这意味着最终解决方案将是一种方法 myFilter,即 filter 回调方法。我坚持这个的原因是因为我使用外部库 SlickGrid 并且我必须使用该库作为公开的 public 方法可用的内容。

function myFilter(item, args) {
  const columnFilters = args.columnFilters;

  // iterate through each items of the dataset
  // return true/false on each item
}

// to be used as a drop in
dataView.setFilterArgs({ columnFilters: this._columnFilters });
dataView.setFilter(myFilter.bind(this));

如果我有 const columnFilters = { file: "map", size: "<3.2" }; ,数组的预期结果将是 4 行

// result
[
  {id: 21, file: "Documents", parentId: null },
  {id: 4, file: "pdf", parentId: 21, },
  {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, }
]

如果我有 const columnFilters = { file: "map", size: "<3" }; ,数组的预期结果将是 3 行

// result
[
  {id: 21, file: "Documents", parentId: null },
  {id: 4, file: "pdf", parentId: 21, },
  {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
]

最后,如果我有 const columnFilters = { file: "map", size: ">3" };,那么预期结果将是一个空数组,因为文件的 none 具有该字符和文件大小条件。

编辑 2

从@AlexL 的回答来看,它开始起作用了。只是一些调整,但它看起来已经很有前途了

编辑 3

感谢 Alex 的出色工作,他的回答帮助我将其合并到我的开源库中。我现在有 2 个带有 Parent/Child ref (flat dataset) and with a Hierarchical Dataset(树数据集)的现场演示。我希望我能不止一次投票 :)

我有办法。它应该是相当高效的,但我们可能还想换掉 map 和 reduce 等,以换成旧的 for-loops 以进一步优化速度(我看过各种博客和文章将 forEach、map 等的速度与 for-loop 和 for-loops 似乎赢了)

这是一个演示(也在这里:https://codepen.io/Alexander9111/pen/abvojzN):

const myFiles = [
  { id: 11, file: "Music", parentId: null },
  { id: 12, file: "mp3", parentId: 11 },
  { id: 14, file: "pop", parentId: 12 },
  { id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85,  parentId: 14 },
  { id: 16, file: "rock", parentId: 12 },
  { id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16 },
  { id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null },
  { id: 21, file: "Documents", parentId: null },
  { id: 2, file: "txt", parentId: 21 },
  { id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2 },
  { id: 4, file: "pdf", parentId: 21 },
  { id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  { id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4 },
  { id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4 },
  { id: 7, file: "xls", parentId: 21 },
  { id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7 },
  { id: 9, file: "misc", parentId: 21 },
  { id: 10,  file: "something.txt", dateModified: "2015-02-26", size: 0.4,  parentId: 9 }
];

//example how to use the "<3" string - better way than using eval():
const columnFilters = { file: "map", size: "<3.2" }; //, size: "<3" 
const isSizeValid = Function("return " + myFiles[11].size + "<3")();
//console.log(isSizeValid);

const myObj = myFiles.reduce((aggObj, child) => {
  aggObj[child.id] = child;
  //the filtered data is used again as each subsequent letter is typed
  //we need to delete the ._used property, otherwise the logic below
  //in the while loop (which checks for parents) doesn't work:
  delete aggObj[child.id]._used;
  return aggObj;
}, {});

function filterMyFiles(myArray, columnFilters){
  const filteredChildren = myArray.filter(a => {
    for (let key in columnFilters){
      //console.log(key)      
      if (a.hasOwnProperty(key)){
        const strContains =  String(a[key]).includes(columnFilters[key]);
        const re = /(?:(?:^|[-+<>=_*/])(?:\s*-?\d+(\.\d+)?(?:[eE][+-<>=]?\d+)?\s*))+$/;
        const comparison = re.test(columnFilters[key]) && Function("return " + a[key] + columnFilters[key])();
        if (strContains || comparison){
          //don't return true as need to check other keys in columnFilters
        }else{
          //console.log('false', a)
          return false;
        }
      } else{
        return false;
      }           
    }
    //console.log('true', a)
    return true;
  })
  return filteredChildren;
}

const initFiltered = filterMyFiles(myFiles, columnFilters);

const finalWithParents = initFiltered.map(child => {
  const childWithParents = [child];
  let parent = myObj[child.parentId];
  while (parent){
    //console.log('parent', parent)
    parent._used || childWithParents.unshift(parent)
    myObj[parent.id]._used = true;
    parent = myObj[parent.parentId] || false;    
  }
  return childWithParents;
}).flat();

console.log(finalWithParents)
.as-console-wrapper { max-height: 100% !important; top: 0; }

基本上设置一个 object 以供以后查找所有 parents。

然后进行一次过滤(即数组的一次迭代)并过滤那些与 columnFilters object.

中的条件匹配的条件

然后映射(即一次迭代)这个过滤后的数组,并使用在开始时创建的 object 找到每个 parents(因此嵌套迭代达到 N 深度)。

用 .flat() 展平数组(假设最后一次迭代)然后我们就完成了。

有任何问题请告诉我。

更新 - For-loop 方法加上尝试减少对数组的迭代

删除几个迭代:) (https://codepen.io/Alexander9111/pen/MWagdVz):

const myFiles = [
  { id: 11, file: "Music", parentId: null },
  { id: 12, file: "mp3", parentId: 11 },
  { id: 14, file: "pop", parentId: 12 },
  { id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85,  parentId: 14 },
  { id: 16, file: "rock", parentId: 12 },
  { id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16 },
  { id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null },
  { id: 21, file: "Documents", parentId: null },
  { id: 2, file: "txt", parentId: 21 },
  { id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2 },
  { id: 4, file: "pdf", parentId: 21 },
  { id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  { id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4 },
  { id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4 },
  { id: 7, file: "xls", parentId: 21 },
  { id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7 },
  { id: 9, file: "misc", parentId: 21 },
  { id: 10,  file: "something.txt", dateModified: "2015-02-26", size: 0.4,  parentId: 9 }
];

const columnFilters = { file: "map", size: "<3.2" };
console.log(customLocalFilter(myFiles, columnFilters));

function customLocalFilter(array, filters){  
  const myObj = {};
  for (let i = 0; i < myFiles.length; i++) {
    myObj[myFiles[i].id] = myFiles[i];
    //the filtered data is used again as each subsequent letter is typed
    //we need to delete the ._used property, otherwise the logic below
    //in the while loop (which checks for parents) doesn't work:
    delete myObj[myFiles[i].id]._used;
  }

  const filteredChildrenAndParents = [];
  for (let i = 0; i < myFiles.length; i++) {
    const a = myFiles[i];
    let matchFilter = true;
    for (let key in columnFilters) {
      if (a.hasOwnProperty(key)) {
        const strContains = String(a[key]).includes(columnFilters[key]);
        const re = /(?:(?:^|[-+<>!=_*/])(?:\s*-?\d+(\.\d+)?(?:[eE][+-<>!=]?\d+)?\s*))+$/;
        const comparison =
          re.test(columnFilters[key]) &&
          Function("return " + a[key] + columnFilters[key])();
        if (strContains || comparison) {
          //don't return true as need to check other keys in columnFilters
        } else {
          matchFilter = false;
          continue;
        }
      } else {
        matchFilter = false;
        continue;
      }
    }
    if (matchFilter) {
      const len = filteredChildrenAndParents.length;
      filteredChildrenAndParents.splice(len, 0, a);
      let parent = myObj[a.parentId] || false;
      while (parent) {
        //only add parent if not already added:
        parent._used || filteredChildrenAndParents.splice(len, 0, parent);
        //mark each parent as used so not used again:
        myObj[parent.id]._used = true;
        //try to find parent of the current parent, if exists:
        parent = myObj[parent.parentId] || false;
      }
    }
  }
  return filteredChildrenAndParents;
}
.as-console-wrapper { max-height: 100% !important; top: 0; }