了解 Phreak 算法的性能

Question

我试图了解是什么导致 Drools 在我们的用例中表现不佳。我在这里阅读了 Phreak 文档：https://access.redhat.com/documentation/en-us/red_hat_decision_manager/7.4/html/decision_engine_in_red_hat_decision_manager/phreak-algorithm-con_decision-engine

它似乎没有提到任何关于如何完成节点到节点跳转的内容。这是一个例子来解释我的意思：

假设我有一个包含三个字段的 Person 对象：name、lastName、SSN

我这样定义了大量规则：当 lastName = 'Smith' 和 SSN = 'XXXXXXXXXX' 时，name = "Jane"。假设我有大量姓氏为 "Smith" 的人，比如说 10,000，那么在给定姓氏和 SSN 的情况下获得单个姓名的复杂性是多少？是否需要 10,000 次比较，或者 "Smith" 节点是否保留某种形式的哈希映射与所有底层 SSN？

如果我使用一个范围（例如日期范围）而不是带有相等运算符的 SSN，会怎样，定义如下规则："when lastName = 'Smith' and startedShool >= 2005 and finishedSchool <= 2010"。节点是否保留一些奇特的数据结构来加速范围运算符的查询？

编辑：根据请求，我添加了一个示例来解释规则的设置方式

我们有一个名为 Evidence 的 class 作为输入和输出。我们在不同的激活组中构建每个证据实例并将其添加到一个集合中。我们通常会定义一些包罗万象的、低显着性的广泛规则，并添加具有更具体条件和更高显着性的规则。

这是两个显着性越来越高的规则的示例。我们定义了其中的 ~10K。

可以想象一种树状结构，在每个级别上显着性增加并且条件变得更加严格。 Evidence class 用作一种键值对，因此它们中的许多将具有相同的共同节点（例如 name = location）。

要执行规则，我们将添加两个证据（例如 [name= location, stringValue='BBB'] 和 [name = code, stringValue = theCode]）并触发规则。

rule "RULE 1::product"
    salience 0
    activation-group "product"
when
    $outputs : EvidenceSet.Builder(  )  

    Evidence( name == "location" && stringValue == "BBB" )   
then
    $outputs.addEvidences(Evidence.newBuilder().setName("Product").setStringValue("999"));


end

rule "RULE 2::product"
    salience 1
    activation-group "product"
when
    $outputs : EvidenceSet.Builder(  )  

    Evidence( name == "location" && stringValue == "BBB" )   

    Evidence( name == "code" && stringValue == "thecode" )   
then
    $outputs.addEvidences(Evidence.newBuilder().setName("Product").setStringValue("777"));


end

Answer 1

这是规则模板

rule "RULE %s::product"
when
    $outputs : EvidenceSet.Builder(  )  
    Evidence( name == "location" && stringValue == "BBB" ) 
    Evidence( name == "code" && stringValue matches ".*code.*" )
then
    $outputs.addEvidences(Evidence.newBuilder().setName("Product").setStringValue("777"));
end

这是从模板中构建 10000 条规则 drl 的代码

public static void main(String[] args) throws IOException {
    String template = Resources.toString(Resources.getResource("template.drl"), Charset.defaultCharset());
    
    try (PrintStream file = new PrintStream(new File("test.drl"))) {
        for (int i = 0; i < 10_000; i++)
            file.println(String.format(template, i));
    }
}

这是基于您的 drl 语法的域 classes，不包括 equals、hashCode 和 toString，我觉得很奇怪...

public class Evidence {
    public String name;
    public String stringValue;
    
    public Evidence() {
    }
    
    public Evidence(String name, String stringValue) {
        this.name = name;
        this.stringValue = stringValue;
    }
    
    public static Builder newBuilder() {
        return new Builder();
    }
}

public class EvidenceSet {
    public static Set<Evidence> evidenceSet = new HashSet<>();
    
    public static class Builder {
        public Evidence evidence = new Evidence();
        
        public Builder setName(String name) {
            evidence.name = name;
            return this;
        }
        
        public Builder setStringValue(String stringValue) {
            evidence.stringValue = stringValue;
            return this;
        }
        
        public void addEvidences(Builder builder) {
            evidenceSet.add(builder.evidence);
        }
    }
}

这里是执行10k规则文件的测试

@DroolsSession("classpath:/test.drl")
public class PlaygroundTest {
    
    private PerfStat firstStat = new PerfStat("first");
    private PerfStat othersStat = new PerfStat("others");
    @Rule
    public DroolsAssert drools = new DroolsAssert();
    
    @Test
    public void testIt() {
        firstStat.start();
        drools.insertAndFire(new EvidenceSet.Builder(), new Evidence("location", "BBB"), new Evidence("code", "thecode"));
        firstStat.stop();
        
        for (int i = 0; i < 500; i++) {
            othersStat.start();
            drools.insertAndFire(new Evidence("code", "code" + i));
            othersStat.stop();
        }
        
        System.out.printf("first, ms: %s%n", firstStat.getStat().getAvgTimeMs());
        System.out.printf("others, ms: %s%n", othersStat.getStat().getAvgTimeMs());
        System.out.println(EvidenceSet.evidenceSet);
    }
}

规则和测试应符合您的要求，以匹配规则中的所有决策节点。 Agency group 和 salience 与性能无关，此处略过。

没有 'for' 部分的测试，给出以下分析结果

'winner' 是 org.drools.core.rule.JavaDialectRuntimeData.PackageClassLoader.internalDefineClass 它创建并注册自动生成的 java class for drools 然后阻止，这是重量级操作。

与'for'部分的测试，给出了完全不同的画面

注意internalDefineClass保持不变，测试代码执行变得值得注意。因为加载了 classes 的 10k 规则并且在我的 6 核（12 个逻辑）CPU 机器上花费了 12.5 秒。之后，所有其他插入都在 avg 中执行。 700 毫秒。让我们分析一下那个时间。

然后规则块在平均 0.01 毫秒内执行。

RULE 9::product - min: 0.00 avg: 0.01 max: 1.98 activations: 501

10000 RHS * 0.01 ms = 100 ms是10k条规则执行下面的耗时

$outputs.addEvidences(Evidence.newBuilder().setName("Product").setStringValue("777"));

虚拟对象的创建和操作变得引人注目，因为您将它乘以 10k。根据第二次性能分析结果，大部分时间它都将测试运行ner 输出打印到控制台。总计 100 毫秒。考虑到您粗鲁的对象创建服务于 'the builder' 范例，RHS 执行速度非常快。

为了回答你的问题，我会猜测并建议以下问题：

您不能重复使用已编译的规则会话。
它可以被低估然后阻止执行时间，如果你在 RHS 中的某处有虚拟 String.format 它将对整体执行时间值得注意，因为格式相对较慢并且你正在处理执行 10k 次。
可能存在笛卡尔规则匹配。仅仅因为你使用 set 作为结果，你只能猜测有多少 'exactly the same results' 被插入到那个 set 中，这意味着巨大的执行浪费。
如果您使用的是有状态会话，我看不到任何内存清理和对象收回。这可能会在 OOME 之前导致性能问题。这是运行...
的内存占用

第 2 天...笛卡尔匹配

我删除了日志记录并对笛卡尔匹配进行了测试

@Test
public void testIt() {
    firstStat.start();
    drools.insertAndFire(new EvidenceSet.Builder(), new Evidence("location", "BBB"), new Evidence("code", "thecode"));
    firstStat.stop();
    
    for (int i = 0; i < 40; i++) {
        othersStat.start();
        drools.insertAndFire(new Evidence("location", "BBB"), new Evidence("code", "code" + i));
        othersStat.stop();
    }
    
    drools.printPerformanceStatistic();
    System.out.printf("first, ms: %s%n", firstStat.getStat().getAvgTimeMs());
    System.out.printf("others, ms: %s%n", othersStat.getStat().getAvgTimeMs());
    System.out.println(EvidenceSet.evidenceSet);
}

注意我们在每次迭代中插入 new Evidence("location", "BBB")。我只能通过 40 次迭代运行测试，否则我最终会遇到 OOME（对我来说是新事物）。这是 CPU 和 40 次迭代的内存消耗：

每条规则触发1681次！右手平均时间不明显（优化并从执行中删除？），但是执行块时...

RULE 999::product - min: 0.00 avg: 0.00 max: 1.07 activations: 1681
RULE 99::product - min: 0.00 avg: 0.00 max: 1.65 activations: 1681
RULE 9::product - min: 0.00 avg: 0.00 max: 1.50 activations: 1681
first, ms: 10271.322
others, ms: 1614.294

删除笛卡尔匹配后，相同的测试执行得更快，并且没有内存和 GC 问题

RULE 999::product - min: 0.00 avg: 0.04 max: 1.27 activations: 41
RULE 99::product - min: 0.00 avg: 0.04 max: 1.28 activations: 41
RULE 9::product - min: 0.00 avg: 0.04 max: 1.52 activations: 41
first, ms: 10847.358
others, ms: 102.015

现在我可以将迭代次数增加到 1000 次，并且迭代的平均执行时间更少

RULE 999::product - min: 0.00 avg: 0.00 max: 1.06 activations: 1001
RULE 99::product - min: 0.00 avg: 0.00 max: 1.20 activations: 1001
RULE 9::product - min: 0.00 avg: 0.01 max: 1.79 activations: 1001
first, ms: 10986.619
others, ms: 71.748

插入事实而不撤回会限制它

总结：你需要确保你没有得到笛卡尔匹配。

省略任何笛卡尔匹配的快速解决方案是保持唯一的 Evidence。以下不是 'common approach' 执行此操作，但不需要更改和增加规则条件的复杂性。

添加方法到Evidenceclass

public boolean isDuplicate(Evidence evidence) {
    return this != evidence && hashCode() == evidence.hashCode() && equals(evidence);
}

添加显着性最高的规则

rule "unique known evidence"
salience 10000
when
    e: Evidence() 
    Evidence(this.isDuplicate(e))
then
    retract(e);
end

这已经过测试，并且似乎可以通过先前的重现笛卡尔问题的 junit 测试运行。有趣的是，对于这个测试（1000 次迭代，1000 条规则），isDuplicate 方法被调用了 2008004 次，所有调用的总时间为 172.543 毫秒。在我的机器上。规则的 RHS（事件撤回）花费的时间至少比其他规则（证据收集）长 3 倍。

RULE 999::product - min: 0.00 avg: 0.00 max: 1.07 activations: 1001
RULE 99::product - min: 0.00 avg: 0.01 max: 1.81 activations: 1001
RULE 9::product - min: 0.00 avg: 0.00 max: 1.25 activations: 1001
unique known evidence - min: 0.01 avg: 0.03 max: 1.88 activations: 1000
first, ms: 8974.7
others, ms: 69.197

了解 Phreak 算法的性能

Understanding Phreak algorithm's performance

drools