在 jqwik 属性 测试框架中寻找更好的方法来生成图的边列表
Looking for better ways to generate a list of edges for a graph in jqwik property testing framework
目前我正在使用:
@Provide
Arbitrary<List<Tuple.Tuple3<Integer,Integer,Integer>>> edgeLists (
TypeUsage type, ArbitraryProvider.SubtypeProvider subtype) {
int vertices = 10;
int degree_min = 1;
int degree_max = 4;
int min_edge_flow = 1;
int max_edge_flow = 10;
for (Annotation a : type.getAnnotations()) {
if (a instanceof MaxFlowParameters) {
MaxFlowParameters params = (MaxFlowParameters) a;
vertices = Math.max(1, params.vertices());
degree_min = Math.max(1, params.degree_min());
degree_max = Math.min(vertices, Math.max(degree_min, params.degree_max()));
min_edge_flow = Math.min(vertices, Math.max(0, params.min_edge_flow()));
max_edge_flow = Math.min(vertices, Math.max(min_edge_flow, params.max_edge_flow()));
}
}
Function<List<Integer>,List<Integer>> expand = new Function<List<Integer>,List<Integer>> () {
@Override
public List<Integer> apply (List<Integer> t) {
List<Integer> result = new ArrayList<>();
ListUtils.enumerate(t, (idx,copies) -> {
while (copies-- > 0) result.add(idx+1);
return true;
});
return result;
}
};
int num_vertices = vertices;
int the_min_edge_flow = min_edge_flow;
int the_max_edge_flow = max_edge_flow;
return Arbitraries.integers().between(degree_min, degree_max).list().ofSize(vertices).map(expand)
.flatMap(sources -> Arbitraries.integers().between(1, num_vertices).list().ofSize(sources.size())
.flatMap(targets -> Arbitraries.integers().between(the_min_edge_flow, the_max_edge_flow).list().ofSize(sources.size())
.map(flows -> {
int limit = sources.size();
List<Tuple3<Integer,Integer,Integer>> result = new ArrayList<>(limit);
for (int i = 0; i < limit; i++) {
result.add(Tuple.of(sources.get(i), targets.get(i), flows.get(i)));
}
return result;
})));
}
@Provide
Arbitrary<Graph<String,IntegerFlow>> graphs (TypeUsage type, ArbitraryProvider.SubtypeProvider subtype) {
return Combinators.withBuilder(() -> new GraphBuilder())
.use(this.edgeLists(type, subtype)).in((builder, edges) -> builder.withEdges(edges))
.build(builder -> builder.build());
}
@Property
void searchOrdersEqual (
@ForAll @From("edgeLists") List<Tuple.Tuple3<Integer,Integer,Integer>> edgeList,
@ForAll Random random) {
// for current in memory graph impl the search order in which augmenting paths are found will change
// if the order the edges are declared in changes. so if we see that one search order does not
// yield the same result as another, that the algo can not always be finding the max flow. if search
// orders return the same result, its still not guaranteed its finding max-flow, that will
// require additional tests. if this test fails, however, we definitely know that the algo is not
// always finding max flow.
int last = -1;
for (int i = 0; i < 10; i++) {
Collections.shuffle(edgeList, random);
Graph<String,IntegerFlow> graph = new GraphBuilder().withEdges(edgeList).build();
int next = new FordFulkerson<>(graph, graph.get(0), graph.get(graph.vertexCount()-1)).maxflow();
if (last < 0) last = next;
Assertions.assertThat(next).isEqualTo(last);
}
}
@Property
void validMinCutCandidate (@ForAll @From("graphs") Graph<String,IntegerFlow> graph) {
// given the testing constraints we are not going to find the actual min-cut, as that would involve
// re-implementation in some form. however we can check if its possible that there is a valid min-cut
// very easily. if we find that its not even possible that a valid min-cut is specified by a solution
// we know that the algorithm can not be finding the true max-flow.
Vertex<String,IntegerFlow> source = graph.get(0);
Vertex<String,IntegerFlow> sink = graph.get(graph.vertexCount() - 1);
MaxIntegerFlow<String,IntegerFlow> algorithm = new FordFulkerson<>(graph, source, sink);
int flow = algorithm.maxflow();
int possibleCut = 0;
for (Vertex<String,IntegerFlow> vertex : graph.vertices()) {
if (vertex == sink) continue;
for (Traverser<Edge<String,IntegerFlow>> trav = vertex.outgoing(); trav.moveNext();) {
if (trav.get().label().available() == 0) {
possibleCut += trav.get().label().flow();
}
}
}
Assertions.assertThat(possibleCut).isGreaterThanOrEqualTo(flow);
}
在这种情况下,我只是通过 id 指定 source/target 个顶点并添加一个流组件(它可以是一个权重或任何数量的其他关联值)。该方案是从 [degree_min,degree_max] 中为每个顶点制作一个度值列表,然后将该列表扩展为一个列表,其中每个源重复度数次。一旦我有了这个列表,我就可以生成目标和标签的序列并组合起来形成边缘。
这足以保证我有一个完整的顶点列表,并且每个顶点都有适当数量的出边。但是我没有看到这种方法可以很好地扩展以添加更多 realistic/useful 约束。特别是考虑到可能需要的额外过滤和映射步骤,而且目前可能已经太多了......
例如,我认为能够为每个节点的边创建一个任意节点,然后加入任意节点以生成整个边列表可能会有所帮助,但我看不到在框架内执行此操作的任何方法(例如,Combine 旨在组合取自多个列表中的每一个的值,而不是加入列表)。
寻找任何改进建议。
由于您的示例无法轻易复制(缺少一些依赖项),我试图通过阅读代码来了解您正在创建什么样的图表。我可能错过了什么。
这是我想出的一个简单方法:
@Provide
Arbitrary<Set<Tuple2<String, Set<Tuple2<String, Integer>>>>> nodes() {
int maxVertices = 20;
int degreeMax = 4;
int minEdgeFlow = 1;
int maxEdgeFlow = 10;
Arbitrary<String> anyVertix = Arbitraries.strings().withCharRange('a', 'z').ofLength(3);
SetArbitrary<String> anyVertices = anyVertix.set().ofMinSize(1).ofMaxSize(maxVertices);
return anyVertices.flatMapEach((vertices, vertix) -> {
// Single vertix is a special case
if (vertices.size() <= 1) {
return Arbitraries.just(Tuple.of(vertix, Collections.emptySet()));
}
Set<String> possibleTargetVertices = new HashSet<>(vertices);
possibleTargetVertices.remove(vertix);
Arbitrary<Integer> anyEdgeFlow = Arbitraries.integers().between(minEdgeFlow, maxEdgeFlow);
Arbitrary<Tuple2<String, Integer>> anyConnection =
Combinators.combine(Arbitraries.of(possibleTargetVertices), anyEdgeFlow).as(Tuple::of);
SetArbitrary<Tuple2<String, Integer>> anyConnections = anyConnection.set().ofMaxSize(degreeMax);
return anyConnections.map(connections -> Tuple.of(vertix, connections));
});
}
@Property(tries = 100)
@Report(Reporting.GENERATED)
@StatisticsReport(label = "count nodes", format = NumberRangeHistogram.class)
@StatisticsReport(label = "max degree", format = Histogram.class)
void checkNodes(@ForAll("nodes") Set<Tuple2<String, Set<Tuple2<String, Integer>>>> nodes) {
Statistics.label("count nodes").collect(nodes.size());
int maxDegree = nodes.stream().mapToInt(node -> node.get2().size()).max().orElse(0);
Statistics.label("max degree").collect(maxDegree);
}
此提供程序方法不会生成边列表,而是生成一组顶点及其连接和每个连接的边流。一个当然可以变成另一个。
我在生成器中努力的事情是尽量减少平面映射的数量,因为平面映射通常会使收缩变得更难。
就是说,图形生成是一个您可以获得博士学位的主题,而且有些人拥有。有几种生成图形的标准方法(例如 https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model)和许多更复杂的方法。您可能想查看其他一些 SO 答案和资源:
- How to generate random graphs?
- https://blog.finxter.com/how-to-generate-random-graphs-with-python/
将这些方法转化为 jqwik 代码有时很容易,有时更复杂。
P.S。 TypeUsage type, ArbitraryProvider.SubtypeProvider subtype
参数是可选的。仅添加您在方法中使用的那些。
目前我正在使用:
@Provide
Arbitrary<List<Tuple.Tuple3<Integer,Integer,Integer>>> edgeLists (
TypeUsage type, ArbitraryProvider.SubtypeProvider subtype) {
int vertices = 10;
int degree_min = 1;
int degree_max = 4;
int min_edge_flow = 1;
int max_edge_flow = 10;
for (Annotation a : type.getAnnotations()) {
if (a instanceof MaxFlowParameters) {
MaxFlowParameters params = (MaxFlowParameters) a;
vertices = Math.max(1, params.vertices());
degree_min = Math.max(1, params.degree_min());
degree_max = Math.min(vertices, Math.max(degree_min, params.degree_max()));
min_edge_flow = Math.min(vertices, Math.max(0, params.min_edge_flow()));
max_edge_flow = Math.min(vertices, Math.max(min_edge_flow, params.max_edge_flow()));
}
}
Function<List<Integer>,List<Integer>> expand = new Function<List<Integer>,List<Integer>> () {
@Override
public List<Integer> apply (List<Integer> t) {
List<Integer> result = new ArrayList<>();
ListUtils.enumerate(t, (idx,copies) -> {
while (copies-- > 0) result.add(idx+1);
return true;
});
return result;
}
};
int num_vertices = vertices;
int the_min_edge_flow = min_edge_flow;
int the_max_edge_flow = max_edge_flow;
return Arbitraries.integers().between(degree_min, degree_max).list().ofSize(vertices).map(expand)
.flatMap(sources -> Arbitraries.integers().between(1, num_vertices).list().ofSize(sources.size())
.flatMap(targets -> Arbitraries.integers().between(the_min_edge_flow, the_max_edge_flow).list().ofSize(sources.size())
.map(flows -> {
int limit = sources.size();
List<Tuple3<Integer,Integer,Integer>> result = new ArrayList<>(limit);
for (int i = 0; i < limit; i++) {
result.add(Tuple.of(sources.get(i), targets.get(i), flows.get(i)));
}
return result;
})));
}
@Provide
Arbitrary<Graph<String,IntegerFlow>> graphs (TypeUsage type, ArbitraryProvider.SubtypeProvider subtype) {
return Combinators.withBuilder(() -> new GraphBuilder())
.use(this.edgeLists(type, subtype)).in((builder, edges) -> builder.withEdges(edges))
.build(builder -> builder.build());
}
@Property
void searchOrdersEqual (
@ForAll @From("edgeLists") List<Tuple.Tuple3<Integer,Integer,Integer>> edgeList,
@ForAll Random random) {
// for current in memory graph impl the search order in which augmenting paths are found will change
// if the order the edges are declared in changes. so if we see that one search order does not
// yield the same result as another, that the algo can not always be finding the max flow. if search
// orders return the same result, its still not guaranteed its finding max-flow, that will
// require additional tests. if this test fails, however, we definitely know that the algo is not
// always finding max flow.
int last = -1;
for (int i = 0; i < 10; i++) {
Collections.shuffle(edgeList, random);
Graph<String,IntegerFlow> graph = new GraphBuilder().withEdges(edgeList).build();
int next = new FordFulkerson<>(graph, graph.get(0), graph.get(graph.vertexCount()-1)).maxflow();
if (last < 0) last = next;
Assertions.assertThat(next).isEqualTo(last);
}
}
@Property
void validMinCutCandidate (@ForAll @From("graphs") Graph<String,IntegerFlow> graph) {
// given the testing constraints we are not going to find the actual min-cut, as that would involve
// re-implementation in some form. however we can check if its possible that there is a valid min-cut
// very easily. if we find that its not even possible that a valid min-cut is specified by a solution
// we know that the algorithm can not be finding the true max-flow.
Vertex<String,IntegerFlow> source = graph.get(0);
Vertex<String,IntegerFlow> sink = graph.get(graph.vertexCount() - 1);
MaxIntegerFlow<String,IntegerFlow> algorithm = new FordFulkerson<>(graph, source, sink);
int flow = algorithm.maxflow();
int possibleCut = 0;
for (Vertex<String,IntegerFlow> vertex : graph.vertices()) {
if (vertex == sink) continue;
for (Traverser<Edge<String,IntegerFlow>> trav = vertex.outgoing(); trav.moveNext();) {
if (trav.get().label().available() == 0) {
possibleCut += trav.get().label().flow();
}
}
}
Assertions.assertThat(possibleCut).isGreaterThanOrEqualTo(flow);
}
在这种情况下,我只是通过 id 指定 source/target 个顶点并添加一个流组件(它可以是一个权重或任何数量的其他关联值)。该方案是从 [degree_min,degree_max] 中为每个顶点制作一个度值列表,然后将该列表扩展为一个列表,其中每个源重复度数次。一旦我有了这个列表,我就可以生成目标和标签的序列并组合起来形成边缘。
这足以保证我有一个完整的顶点列表,并且每个顶点都有适当数量的出边。但是我没有看到这种方法可以很好地扩展以添加更多 realistic/useful 约束。特别是考虑到可能需要的额外过滤和映射步骤,而且目前可能已经太多了......
例如,我认为能够为每个节点的边创建一个任意节点,然后加入任意节点以生成整个边列表可能会有所帮助,但我看不到在框架内执行此操作的任何方法(例如,Combine 旨在组合取自多个列表中的每一个的值,而不是加入列表)。
寻找任何改进建议。
由于您的示例无法轻易复制(缺少一些依赖项),我试图通过阅读代码来了解您正在创建什么样的图表。我可能错过了什么。
这是我想出的一个简单方法:
@Provide
Arbitrary<Set<Tuple2<String, Set<Tuple2<String, Integer>>>>> nodes() {
int maxVertices = 20;
int degreeMax = 4;
int minEdgeFlow = 1;
int maxEdgeFlow = 10;
Arbitrary<String> anyVertix = Arbitraries.strings().withCharRange('a', 'z').ofLength(3);
SetArbitrary<String> anyVertices = anyVertix.set().ofMinSize(1).ofMaxSize(maxVertices);
return anyVertices.flatMapEach((vertices, vertix) -> {
// Single vertix is a special case
if (vertices.size() <= 1) {
return Arbitraries.just(Tuple.of(vertix, Collections.emptySet()));
}
Set<String> possibleTargetVertices = new HashSet<>(vertices);
possibleTargetVertices.remove(vertix);
Arbitrary<Integer> anyEdgeFlow = Arbitraries.integers().between(minEdgeFlow, maxEdgeFlow);
Arbitrary<Tuple2<String, Integer>> anyConnection =
Combinators.combine(Arbitraries.of(possibleTargetVertices), anyEdgeFlow).as(Tuple::of);
SetArbitrary<Tuple2<String, Integer>> anyConnections = anyConnection.set().ofMaxSize(degreeMax);
return anyConnections.map(connections -> Tuple.of(vertix, connections));
});
}
@Property(tries = 100)
@Report(Reporting.GENERATED)
@StatisticsReport(label = "count nodes", format = NumberRangeHistogram.class)
@StatisticsReport(label = "max degree", format = Histogram.class)
void checkNodes(@ForAll("nodes") Set<Tuple2<String, Set<Tuple2<String, Integer>>>> nodes) {
Statistics.label("count nodes").collect(nodes.size());
int maxDegree = nodes.stream().mapToInt(node -> node.get2().size()).max().orElse(0);
Statistics.label("max degree").collect(maxDegree);
}
此提供程序方法不会生成边列表,而是生成一组顶点及其连接和每个连接的边流。一个当然可以变成另一个。
我在生成器中努力的事情是尽量减少平面映射的数量,因为平面映射通常会使收缩变得更难。
就是说,图形生成是一个您可以获得博士学位的主题,而且有些人拥有。有几种生成图形的标准方法(例如 https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model)和许多更复杂的方法。您可能想查看其他一些 SO 答案和资源:
- How to generate random graphs?
- https://blog.finxter.com/how-to-generate-random-graphs-with-python/
将这些方法转化为 jqwik 代码有时很容易,有时更复杂。
P.S。 TypeUsage type, ArbitraryProvider.SubtypeProvider subtype
参数是可选的。仅添加您在方法中使用的那些。