How can I refactor this recursive policy expression Strategy to parameterize its length?

Question

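Context

First of all, thank you for Hypothesis. It is both extremely powerful and extremely useful!

I have written a Hypothesis strategy to produce monotonic (ANDs and ORs) policy expressions of the form: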

(A and (B or C))

This can be thought of as a tree structure, where A, B and C are the attributes at the leaf nodes, and 'and' and 'or' are the non-leaf nodes.
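To make the tree view concrete, here is a minimal sketch. The nested-tuple representation and the `render` and `leaves` helpers are assumptions made up for illustration; they are not part of Charm or Hypothesis.

```python
def render(node):
    """Turn a tree into a policy string; a leaf is a plain str attribute."""
    if isinstance(node, str):
        return node
    gate, left, right = node  # a non-leaf node is (gate, left, right)
    return "(" + " ".join((render(left), gate, render(right))) + ")"


def leaves(node):
    """Collect the leaf attributes of a tree, left to right."""
    if isinstance(node, str):
        return [node]
    _gate, left, right = node
    return leaves(left) + leaves(right)


tree = ("and", "A", ("or", "B", "C"))
print(render(tree))  # prints "(A and (B or C))"
print(leaves(tree))  # prints ['A', 'B', 'C']
```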

The strategy seems to generate expressions as desired.

>>> find(policy_expressions(), lambda x: len(x.split()) > 3)
'(A or (A or A))'

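(The statistical diversity of the samples could perhaps be improved, but that is not the essence of this question.)

Inequalities are also valid. For example: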

(N or (WlIorO and (nX <= 55516 and e)))

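I want to constrain or filter the examples so that I can generate policy expressions with a specified number of leaf nodes (i.e. attributes).

For a performance test, I have tried using data.draw() together with filter, like this: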

@given(data=data())
def test_keygen_encrypt_proxy_decrypt_decrypt_execution_time(data, n):
    """
    :param n: the input size n. Number of attributes or leaf nodes in policy tree.
    """

    policy_str = data.draw(strategy=policy_expressions().filter(lambda x: len(extract_attributes(group, x)) == n),
                           label="policy string")

where extract_attributes() counts the number of leaf nodes in the expression and n is the desired number of leaf nodes.
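For readers without Charm installed, the counting step can be approximated with a small stand-in. `count_leaves` is a hypothetical helper, not Charm's extract_attributes() (which also takes a group argument and may count differently). It assumes the format this strategy produces: every gate is wrapped in its own pair of parentheses, and attributes never contain parentheses.

```python
def count_leaves(policy_str):
    """Count leaf nodes in a fully parenthesised policy string.

    Each gate contributes exactly one "(", and a binary tree with g gates
    has g + 1 leaves. An inequality such as "nX <= 55516" counts as one leaf.
    """
    return policy_str.count("(") + 1


print(count_leaves("A"))                                        # prints 1
print(count_leaves("(A and (B or C))"))                         # prints 3
print(count_leaves("(N or (WlIorO and (nX <= 55516 and e)))"))  # prints 4
```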

The problem with this solution is that when n > 16, Hypothesis throws:

hypothesis.errors.Unsatisfiable: Unable to satisfy assumptions of hypothesis test_keygen_encrypt_proxy_decrypt_decrypt_execution_time.

I want to generate valid policy expressions with 100 leaf nodes.

Another drawback of this approach is that Hypothesis reports HealthCheck.filter_too_much and HealthCheck.too_slow, and the @settings get ugly.
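A rough way to see why the filter starves as n grows is to work out how likely an unconstrained tree is to have exactly n leaves. The sketch below assumes a simplified model: each node is independently a leaf with probability 2/3 (two of the three one_of alternatives are leaves) and a gate otherwise. Hypothesis does not actually pick alternatives this way, so the numbers are only indicative of the trend. `exact_leaf_probabilities` is a hypothetical helper.

```python
def exact_leaf_probabilities(max_leaves, p_leaf=2 / 3):
    """Return a list p where p[n] is the chance of a tree with exactly n leaves.

    Model (an assumption, not Hypothesis's real distribution): every node is
    independently a leaf with probability p_leaf, else a gate with two subtrees.
    Requires max_leaves >= 1.
    """
    p = [0.0] * (max_leaves + 1)
    p[1] = p_leaf
    for n in range(2, max_leaves + 1):
        # A gate splits n leaves into k on the left and n - k on the right.
        splits = sum(p[k] * p[n - k] for k in range(1, n))
        p[n] = (1 - p_leaf) * splits
    return p


probs = exact_leaf_probabilities(100)
for n in (1, 4, 16, 100):
    print(f"P(exactly {n} leaves) = {probs[n]:.3g}")
```

Under this model the probability of hitting an exact leaf count falls off quickly with n, which is consistent with the filter rejecting almost every draw for larger sizes.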

I would rather have a parameter, say policy_expressions(leaf_nodes=4), to get an example like this:

(N or (WlIorO and (nX <= 55516 and e)))

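I avoided doing this at first because I could not see how to do it with the recursive strategy code.

Question

Can you suggest a way to refactor this strategy so that it can be parameterized by the number of leaf nodes?

Here is the strategy code (it is open source in Charm Crypto anyway):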

from hypothesis.strategies import text, composite, sampled_from, characters, one_of, integers


def policy_expressions():
    return one_of(attributes(), inequalities(), policy_expression())


@composite
def policy_expression(draw):
    left = draw(policy_expressions())
    right = draw(policy_expressions())
    gate = draw(gates())
    return u'(' + u' '.join((left, gate, right)) + u')'


def attributes():
    return text(min_size=1, alphabet=characters(whitelist_categories='L', max_codepoint=0x7e))


@composite
def inequalities(draw):
    attr = draw(attributes())
    oper = draw(inequality_operators())
    numb = draw(integers(min_value=1))
    return u' '.join((attr, oper, str(numb)))


def inequality_operators():
    return sampled_from((u'<', u'>', u'<=', u'>='))


def gates():
    return sampled_from((u'or', u'and'))


def assert_valid(policy_expression):
    assert policy_expression  # not empty
    assert policy_expression.count(u'(') == policy_expression.count(u')')

https://github.com/JHUISI/charm/blob/dev/charm/toolbox/policy_expression_spec.py

Answer 1

I suggest explicitly building the leaf count into how the data is constructed, and then passing in the number of leaves you want:

from hypothesis.strategies import text, composite, sampled_from, characters, one_of, integers


def policy_expressions_of_size(num_leaves):
    if num_leaves == 1:
        return attributes()
    elif num_leaves == 2:
        return one_of(inequalities(), policy_expression(num_leaves))
    else:
        return policy_expression(num_leaves)


policy_expressions = integers(min_value=1, max_value=500).flatmap(policy_expressions_of_size)


@composite
def policy_expression(draw, num_leaves):
    left_leaves = draw(integers(min_value=1, max_value=num_leaves - 1))
    right_leaves = num_leaves - left_leaves
    left = draw(policy_expressions_of_size(left_leaves))
    right = draw(policy_expressions_of_size(right_leaves))
    gate = draw(gates())
    return u'(' + u' '.join((left, gate, right)) + u')'


def attributes():
    return text(min_size=1, alphabet=characters(whitelist_categories='L', max_codepoint=0x7e))


@composite
def inequalities(draw):
    attr = draw(attributes())
    oper = draw(inequality_operators())
    numb = draw(integers(min_value=1))
    return u' '.join((attr, oper, str(numb)))


def inequality_operators():
    return sampled_from((u'<', u'>', u'<=', u'>='))


def gates():
    return sampled_from((u'or', u'and'))

You can then choose the exact size you want your policy expressions to be:

>>> policy_expressions.example()
'((((((oOjFo or (((cH and (Q or (uO > 18 and byy))) and kS) or pqKUUZ > 74)) and (gi or mwsrU <= 4115)) and qLkVSTqXZxgScTj) and (vNJ > 969 and (Drwvh or (((xhmsWhHpc or hQSMnfgyiYnblLFJ) or sesfHbQ) and jt)))) or xS) and ((V and (mArqYR or qY)) or (((uVf and bbtKUCnecMKjRJD > 18944) and nerVkPSs < 29292) and (UlOJebfbgcJz or (bxfVfjgmfulSB > 71 or (jqGLlr or (zQqj and zqUGwc < 24845)))))))'
>>> 
>>> policy_expressions_of_size(1).example()
'Eo'
>>> 
>>> policy_expressions_of_size(2).example()
'KJAitOKC > 18179'
>>> policy_expressions_of_size(10).example()
'(((htjdVy or (((XTfZil or (rqZw and DEOeER)) and xGVsdeQJLTJxLsC < 388312303) or LxLfUPljUTH)) or (Kb or EoipoYzjncAGKTE)) or bc)'
>>> policy_expressions_of_size(100).example()
'(((((CxySeUrNW or bZG) or (gzSUGgTG and (((V or n) or wqA) or veuTEnjGKwIpkDDDBiQkMwsNbxrBv))) or (((SKgQSXtAg or ChCHcEsVavy) and (((Yxj and xcCX) or QrILGAWxVKXWRb > 98817811688973569232860005374239659122) or JD <= 28510)) and KhrGfZciz > 4057857855522854443)) and (ZMIzFELKAKDMrH and (((MOmAZ and J <= 22052) or (Scy >= 17563 and (VCS and ((FFLa and EtZvqwNymnZNnjlREM) or pU)))) or A))) and ((((kaYzzIXIu and (lwos and (vp and GqG))) and ((Nh and lb) or ((TbNZWYOpYmj and (AQs or w)) or NjFYLBr > 228431293))) or ((((FTSXkXGZyKXD or zXeVEqNgkyXI) or mNGI) or ((cGOGK or gjcI) and DQzYonXszfSrZMB)) and JI > 3802)) or (((jIREd and IVzFB >= 28149) and (UdCBg < 20 or (VSGxr or XBuiS <= 1615))) and (rE > 10511139808015932 and ((((((((W and u) or yslVZ) or (eVGlz < 7033 or UiE)) and ((trOmArBc and Zx) or mPKva)) or ((qqDmKUpAnW or yvSkhTgqXQaLnxL) or Z)) or snXcMDhhf) and ((Wu or XSjbKdsZqEiXXvOb) and (DNZg and qv >= 7503))) and ((rnffxTLThwvw >= 24460 and ((oO or y <= 24926) and (NjM and vEHukii))) or ((((BTdpW and rP) or (rjUylCZwJzGobXZR or MNoBdEEIuLbTRvZHMb < 7958346708112664935)) and ((YU or gY >= 15498) and (s and GnOydthO > 103))) or ((caumKPjp < 27 and OQoFXscbD) or ((qaxYwfnelmetYqHKnatQ or P) and (ixzsvX and mYROpqoHAqeEy))))))))))'
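As a sketch of how the sized strategy might replace the filter in the original test, the block below restates the strategies compactly so that it runs on its own. Several things here are assumptions for illustration: the `expression_size` helper, the test name, the choice of n = 20, and the use of `string.ascii_letters` as the attribute alphabet in place of the original characters() call. Note that, as the code above is written, an inequality is only ever drawn as a size-2 unit, so `expression_size` counts it as two; an expression of size n can therefore contain fewer than n attributes.

```python
import string

from hypothesis import given, settings, strategies as st


def attributes():
    # ASCII letters only; a simplification of the original alphabet.
    return st.text(alphabet=string.ascii_letters, min_size=1)


@st.composite
def inequalities(draw):
    attr = draw(attributes())
    oper = draw(st.sampled_from(("<", ">", "<=", ">=")))
    numb = draw(st.integers(min_value=1))
    return " ".join((attr, oper, str(numb)))


def policy_expressions_of_size(num_leaves):
    if num_leaves == 1:
        return attributes()
    if num_leaves == 2:
        return st.one_of(inequalities(), policy_expression(num_leaves))
    return policy_expression(num_leaves)


@st.composite
def policy_expression(draw, num_leaves):
    # Split the size budget between the two subtrees, then recurse.
    left_leaves = draw(st.integers(min_value=1, max_value=num_leaves - 1))
    left = draw(policy_expressions_of_size(left_leaves))
    right = draw(policy_expressions_of_size(num_leaves - left_leaves))
    gate = draw(st.sampled_from(("or", "and")))
    return "(" + " ".join((left, gate, right)) + ")"


def expression_size(expr):
    """Hypothetical checker: size as the strategy counts it.

    Each gate adds one "("; each inequality operator adds one extra unit.
    """
    tokens = expr.replace("(", " ").replace(")", " ").split()
    comparisons = sum(tok in ("<", ">", "<=", ">=") for tok in tokens)
    return expr.count("(") + 1 + comparisons


@settings(max_examples=20, deadline=None)
@given(data=st.data())
def test_policy_has_exact_size(data):
    n = 20  # hypothetical input size
    policy_str = data.draw(policy_expressions_of_size(n), label="policy string")
    assert expression_size(policy_str) == n
```

Because every draw already has the requested size, nothing is filtered out, so the filter_too_much health check should no longer be triggered by this step.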


python

python-hypothesis