I have a complex spec for my data - how do I generate samples?

My Clojure spec looks like this:

(spec/def ::global-id string?)
(spec/def ::part-of string?)
(spec/def ::type string?)
(spec/def ::value string?)
(spec/def ::name string?)
(spec/def ::text string?)
(spec/def ::date (spec/nilable (spec/and string? #(re-matches #"^\d{4}-\d{2}-\d{2}$" %))))
(spec/def ::interaction-name string?)
(spec/def ::center (spec/coll-of string? :kind vector? :count 2))
(spec/def ::context- (spec/keys :req [::global-id ::type]
                                :opt [::part-of ::center]))
(spec/def ::contexts (spec/coll-of ::context-))
(spec/def ::datasource string?)
(spec/def ::datasource- (spec/nilable (spec/keys :req [::global-id ::name])))
(spec/def ::datasources (spec/coll-of ::datasource-))
(spec/def ::location string?)
(spec/def ::location-meaning- (spec/keys :req [::global-id ::location ::contexts ::type]))
(spec/def ::location-meanings (spec/coll-of ::location-meaning-))
(spec/def ::context string?)
(spec/def ::context-association-type string?)
(spec/def ::context-association-name string?)
(spec/def ::priority string?)
(spec/def ::has-context- (spec/keys :req [::context ::context-association-type ::context-association-name ::priority]))
(spec/def ::has-contexts (spec/coll-of ::has-context-))
(spec/def ::fact- (spec/keys :req [::global-id ::type ::name ::value]))
(spec/def ::facts (spec/coll-of ::fact-))
(spec/def ::attribute- (spec/keys :req [::name ::type ::value]))
(spec/def ::attributes (spec/coll-of ::attribute-))
(spec/def ::fulltext (spec/keys :req [::global-id ::text]))
(spec/def ::feature- (spec/keys :req [::global-id ::date ::location-meanings ::has-contexts ::facts ::attributes ::interaction-name]
                                :opt [::fulltext]))
(spec/def ::features (spec/coll-of ::feature-))
(spec/def ::attribute- (spec/keys :req [::name ::type ::value]))
(spec/def ::attributes (spec/coll-of ::attribute-))
(spec/def ::ioi-slice string?)
(spec/def ::ioi- (spec/keys :req [::global-id ::type ::datasource ::features ::attributes ::ioi-slice]))
(spec/def ::iois (spec/coll-of ::ioi-))
(spec/def ::data (spec/keys :req [::contexts ::datasources ::iois]))
(spec/def ::data- ::data)

But it fails to generate samples:

(spec/fdef data->graph
  :args (spec/cat :data ::xml-spec/data-))

(println (stest/check `data->graph))

This fails with the exception: Couldn't satisfy such-that predicate after 100 tries.

It is very convenient that stest/check generates data from the spec automatically, but how can I also provide generators alongside the spec?

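When you see the error Couldn't satisfy such-that predicate after 100 tries. while generating data from specs, a common cause is an s/and spec, because spec builds generators for s/and specs based solely on the first inner spec.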

This spec seems the most likely culprit, because the first inner spec/predicate in the s/and is string?, and the predicate that follows is a regex match:

(s/def ::date (s/nilable (s/and string? #(re-matches #"^\d{4}-\d{2}-\d{2}$" %))))

If you sample the string? generator, you'll see that what it produces is unlikely to ever match your regex:

(gen/sample (s/gen string?))
=> ("" "" "X" "" "" "hT9" "7x97" "S" "9" "1Z")

test.check will try (100 times by default) to get a value that satisfies the such-that condition, and throws the exception you saw if it never does.
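To see the failure in isolation, here is a minimal sketch that reproduces it with only the s/and part of the spec. The name ::date-str is hypothetical, and the snippet assumes test.check is on the classpath and that gen is clojure.spec.gen.alpha.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

;; Hypothetical spec name, mirroring the s/and part of ::date from the question.
(s/def ::date-str
  (s/and string? #(re-matches #"^\d{4}-\d{2}-\d{2}$" %)))

;; s/and generates values from string? and then filters them with the
;; regex predicate, so the filter is expected to give up and throw.
(def failure-message
  (try
    (gen/generate (s/gen ::date-str))
    nil ; reached only if a matching string was generated
    (catch Exception e
      (.getMessage e))))

(println failure-message)
```

If the message mentions such-that, the generator derived from string? is the thing to replace.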

Generating Dates

You can implement a custom generator for this spec in several ways. Here's a test.check generator that creates ISO local date strings:

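(import '(java.time LocalDate)
        '(java.time.temporal ChronoField))

(def gen-local-date-str
  (let [day-range (.range ChronoField/EPOCH_DAY) ; valid epoch-day bounds
        day-min (.getMinimum day-range)
        day-max (.getMaximum day-range)]
    ;; map each generated epoch day to an ISO-8601 date string
    (gen/fmap #(str (LocalDate/ofEpochDay %))
              (gen/large-integer* {:min day-min :max day-max}))))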

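This approach gets the range of valid epoch days, uses it to bound the large-integer* generator, then fmaps LocalDate/ofEpochDay over the generated integers. An alternative implementation: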

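;; Assumes gen is clojure.test.check.generators, where large-integer is a generator value.
(import '(java.time Instant LocalDateTime ZoneOffset))

(def gen-local-date-str
  (gen/fmap #(-> (Instant/ofEpochMilli %) ; epoch millis -> Instant
                 (LocalDateTime/ofInstant ZoneOffset/UTC)
                 (.toLocalDate)
                 (str))
            gen/large-integer))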

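This starts with the default large-integer generator and uses fmap to apply a function that creates a java.time.Instant from the generated integer, converts it to a java.time.LocalDate, and converts that to a string, which happens to match your date string format. (This is slightly simpler on Java 9 and later with java.time.LocalDate/ofInstant.)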

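Another approach might use test.chuck's regex-based string generator, or different date classes/formatters. Note that both of my examples can generate years before -9999 or after +9999, which won't match your \d{4} year regex, but the generator should produce satisfying values often enough that it probably won't matter for your use case. There are many ways to generate date values!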

(gen/sample gen-local-date-str)
=>
("1969-12-31"
 "1970-01-01"
 "1970-01-01"
 ...)
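If the year caveat does matter, one option is to bound the epoch-day range so that every generated year has exactly four digits. The following is only a sketch under that assumption: the name gen-four-digit-year-date-str is hypothetical, and gen is assumed to be clojure.test.check.generators.

```clojure
(require '[clojure.test.check.generators :as gen])
(import '(java.time LocalDate))

;; Hypothetical name: restricts generated dates to years 0000 through 9999
;; so that every string matches \d{4}-\d{2}-\d{2}.
(def gen-four-digit-year-date-str
  (let [day-min (.toEpochDay (LocalDate/of 0 1 1))
        day-max (.toEpochDay (LocalDate/of 9999 12 31))]
    (gen/fmap #(str (LocalDate/ofEpochDay %))
              (gen/large-integer* {:min day-min :max day-max}))))

(def four-digit-samples (gen/sample gen-four-digit-year-date-str 50))
```

Because every value already satisfies the regex, the filtering step never has to discard anything.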

Using Custom Generators with Specs

Then you can associate this generator with your spec using s/with-gen:
(s/def ::date
  (s/nilable
   (s/with-gen
    (s/and string? #(re-matches #"^\d{4}-\d{2}-\d{2}$" %))
    (constantly gen-local-date-str))))

(gen/sample (s/gen ::date))
=>
("1969-12-31"
 nil ;; note that it also makes nils b/c it's wrapped in s/nilable
 "1970-01-01"
 ...)

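If you don't want to tie the custom generator directly to the spec definition, you can pass it in an overrides map instead: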

(gen/sample (s/gen ::data {::date (constantly gen-local-date-str)}))
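The same kind of override can also be handed to stest/check through its :gen option, which is closer to what the question asks for. This is a minimal sketch rather than the original code: date->year and gen-date-str are hypothetical stand-ins for data->graph and the date generator, and it assumes test.check is on the classpath.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest]
         '[clojure.test.check.generators :as gen])

(s/def ::date
  (s/nilable (s/and string? #(re-matches #"^\d{4}-\d{2}-\d{2}$" %))))

;; Hypothetical function under test, standing in for data->graph.
(defn date->year [date]
  (when date (subs date 0 4)))

(s/fdef date->year
  :args (s/cat :date ::date)
  :ret (s/nilable string?))

;; Hypothetical generator: builds a date string from year, month and day parts.
(def gen-date-str
  (gen/fmap (fn [[y m d]] (format "%04d-%02d-%02d" y m d))
            (gen/tuple (gen/choose 0 9999) (gen/choose 1 12) (gen/choose 1 28))))

;; :gen maps spec names to no-arg functions that return generators.
(def check-results
  (doall
   (stest/check `date->year
                {:gen {::date (constantly gen-date-str)}
                 :clojure.spec.test.check/opts {:num-tests 25}})))
```

The spec definition stays free of generator code, and the override applies only to this check run.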

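Using this spec and generator, I was able to generate your larger ::data spec, although the outputs were very large because of some of the collection specs. You can also control the size of those collections during generation with the :gen-max option in the specs.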
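As a small illustration of :gen-max, here is a sketch using a hypothetical ::names spec instead of the specs from the question. It assumes gen is clojure.spec.gen.alpha.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

(s/def ::name string?)

;; Hypothetical collection spec. :gen-max caps the size of generated
;; collections only; validation still accepts longer ones.
(s/def ::names (s/coll-of ::name :gen-max 3))

(def sampled-names (gen/sample (s/gen ::names) 20))
```

The same option could be added to collection specs such as ::features or ::iois to keep generated ::data values manageable.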