按组保留数组的前 N ​​个元素

Keep top N elements of an array by group

我在 Blaze Advisor (rule enginge) 中使用专有语言。我正在寻找一种算法,如何按特定属性形成的组只保留数组中的前 N ​​个项目。例如有两个数组:

parrent[0].id = 1
parrent[1].id = 2

第二个数组:

child[0].parrentId = 1
child[0].result = 3.0
child[1].parrentId = 1
child[1].result = 2.0
child[2].parrentId = 1
child[2].result = 4.0
child[3].parrentId = 1
child[3].result = 6.0
child[4].parrentId = 1
child[4].result = -1.0
child[5].parrentId = 2
child[5].result = 1.0
child[6].parrentId = 2
child[6].result = 16.0
child[7].parrentId = 2
child[7].result = 2.0
child[8].parrentId = 2
child[8].result = -10.0
child[9].parrentId = 2
child[9].result = 5.0

我想只保留 child 数组中每个 parrentId 的前三个元素,如 result 属性所示。在我的语言中,我可以完成所有基本操作 - 我可以使用 if/else,while,for,for each constructs,并创建新变量。我可以对数组 asc/desc 进行排序并获取已排序元素的索引。我可以删除数组的元素。

对于我的数据,我需要以下结果:

child[0].parrentId = 1
child[0].result = 3.0
child[1].parrentId = 1
child[2].result = 4.0
child[3].parrentId = 1
child[3].result = 6.0
child[6].parrentId = 2
child[6].result = 16.0
child[7].parrentId = 2
child[7].result = 2.0
child[9].parrentId = 2
child[9].result = 5.0

可以使用 selection algorithm 来保留数组的前 N ​​个元素。将数据分组这一事实不应增加问题的复杂性,只需忽略不在组中的元素即可。

例如,如果您使用 partial sorting,您可以只对组中的元素进行部分排序。您不会像在标准部分排序中那样提前知道第 N 个条目的索引,但是您可以修改外循环以遍历所有条目(而不仅仅是前 N 个条目)但跟踪您有多少已经部分排序并在完成后中断。

顺便说一下,既然你在问题中声明你可以用你的专有语言对数组进行排序,那么如果它不是太低效,你只需对整个数组进行冷排序,然后遍历整个数组直到找到所有兴趣组的前 N ​​个项目。这是蛮力,但如果您没有性能问题,那么它可能是最简单的答案。

辅助class:

以及函数:

有代码:

len is an integer initially top.children.count - 1;
idx is an integer initially len;
while idx > atIdx do {
    top.children[idx] = top.children[idx-1];
    decrement idx;
}
top.children[atIdx] = child;

此代码可以满足您的要求:

child is an fixed array of 10 Child;

counter is an integer initially 0;
while counter < 10 do { child[counter] = a Child; increment counter }

child[0].parrentId = 1;
child[0].result = 3.0;
child[1].parrentId = 1;
child[1].result = 2.0;
child[2].parrentId = 1;
child[2].result = 4.0;
child[3].parrentId = 1;
child[3].result = 6.0;
child[4].parrentId = 1;
child[4].result = -1.0;
child[5].parrentId = 2;
child[5].result = 1.0;
child[6].parrentId = 2;
child[6].result = 16.0;
child[7].parrentId = 2;
child[7].result = 2.0;
child[8].parrentId = 2;
child[8].result = -10.0;
child[9].parrentId = 2;
child[9].result = 5.0;

groups is an array of real;

topN is an integer initially 4;

//Init the hashmap of [group] -> [array of 'N' top Child]
top3fromGroup is an association from real to TopChildren;
for each Child in child do if not groups.contains(it.parrentId) then { 
    top3fromGroup[it.parrentId] = a TopChildren;
    initCounter is an integer initially 0;
    while initCounter < topN do {
        top3fromGroup[it.parrentId].children[initCounter] = a Child initially { it.result = Double.MIN_VALUE;} 
        increment initCounter;
    }
    groups.append(it.parrentId);
}

//Filling the groups at the hashmap with the Child elements ordered inside its groups
for each real in groups do { 
    group is a real initially it;
    for each Child in child do {
        localChild is some Child initially it;
        if it.parrentId = group then {
            top is some TopChildren initially top3fromGroup[group]; 
            topValuesIdx is an integer initially 0;
            while topValuesIdx < top.children.count do {
                topChild is some Child initially top.children[topValuesIdx];
                if localChild.result > topChild.result then { 
                    insertAt(topValuesIdx, localChild, top);
                    topValuesIdx = top.children.count;
                } 
                increment topValuesIdx;
            }
        }
    }
}

//Printing the hashmap
for each real in groups do {
    group is a real initially it;
    print("Group: " group);
    childIdx is an integer initially 0;
    for each Child in top3fromGroup[it].children do {
        print("\tchild["childIdx"].parrentId = " it.parrentId); 
        print("\tchild["childIdx"].result = " it.result);
        increment childIdx;
    }
}

Eclipse/Blaze 控制台上的输出为:

Group: 1.0
    child[0].parrentId = 1.0
    child[0].result = 6.0
    child[1].parrentId = 1.0
    child[1].result = 4.0
    child[2].parrentId = 1.0
    child[2].result = 3.0
    child[3].parrentId = 1.0
    child[3].result = 2.0
Group: 2.0
    child[0].parrentId = 2.0
    child[0].result = 16.0
    child[1].parrentId = 2.0
    child[1].result = 5.0
    child[2].parrentId = 2.0
    child[2].result = 2.0
    child[3].parrentId = 2.0
    child[3].result = 1.0

Execution complete.

我知道这是一个非常简单的解决方案,而不是最佳解决方案。