Filter matching strings between two text files in Clojure

Two text files each contain a list of paths, and the paths in each file have a different prefix.

Suppose before.txt looks like this:

before/pictures/img1.jpeg
before/pictures/img2.jpeg
before/pictures/img3.jpeg

and after.txt looks like this:

after/pictures/img1.jpeg
after/pictures/img3.jpeg

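The function deleted-files should strip the differing prefixes (before, after) and compare the two files in order to print the list of entries missing from after.txt.

Code so far: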

(ns dirdiff.core
  (:gen-class))

(defn deleted-files [prefix-file1 prefix-file2 file1 file2]
  (let [before (slurp "resources/davor.txt")
        after  (slurp "resources/danach.txt")]
    ;; TODO: strip the prefixes and compare the two lists
    ))

Expected output: the file that was deleted

/pictures/img2.jpeg

How can I filter the lists in Clojure so that only the missing entries are shown?

Here is how I would approach it, starting from this template project:

(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require
    [clojure.set :as set]
    [tupelo.string :as str]
    ))

(defn file-dump->names
  [file-dump-str prefix ]
  (it-> file-dump-str
    (str/whitespace-collapse it)
    (str/split it #" ")
    (mapv #(str/replace % prefix "") it)))

(defn delta-files
  [before-files-in after-files-in
   before-prefix after-prefix]
  (let-spy [before-files     (file-dump->names before-files-in before-prefix)
            after-files      (file-dump->names after-files-in after-prefix)
            before-files-set (set before-files)
            after-files-set  (set after-files)
            delta-sorted     (vec (sort (set/difference before-files-set after-files-set)))]
    delta-sorted))

and a unit test to show it in action:

(dotest
  (let [before-files  "before/pictures/img1.jpeg
                       before/pictures/img2.jpeg
                       before/pictures/img3.jpeg "

        after-files   "after/pictures/img1.jpeg
                       after/pictures/img3.jpeg "
        before-prefix "before"
        after-prefix  "after"]
    (is= (delta-files before-files after-files before-prefix after-prefix)
      ["/pictures/img2.jpeg"])
    ))
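
For readers who do not want the tupelo dependency, the same set-difference idea can be sketched with only clojure.string and clojure.set. This is a minimal sketch, not the answer's own code: the names strip-prefix, dump->names and delta-files-plain are made up for illustration, and stripping the prefix only at the start of each path is an assumption about the intended behavior (str/replace above removes every occurrence of the prefix).

```clojure
(require '[clojure.set :as set]
         '[clojure.string :as string])

;; Hypothetical helper: removes prefix from the start of s only.
;; Paths that do not start with prefix are returned unchanged.
(defn strip-prefix [s prefix]
  (if (string/starts-with? s prefix)
    (subs s (count prefix))
    s))

;; Hypothetical helper: splits a whitespace-separated dump into paths
;; and strips the prefix from each one.
(defn dump->names [dump prefix]
  (->> (string/split (string/trim dump) #"\s+")
       (mapv #(strip-prefix % prefix))))

;; Paths present in the "before" dump but absent from the "after" dump,
;; returned as a sorted vector.
(defn delta-files-plain
  [before-dump after-dump before-prefix after-prefix]
  (vec (sort (set/difference
               (set (dump->names before-dump before-prefix))
               (set (dump->names after-dump after-prefix))))))

(delta-files-plain "before/pictures/img1.jpeg
                    before/pictures/img2.jpeg
                    before/pictures/img3.jpeg "
                   "after/pictures/img1.jpeg
                    after/pictures/img3.jpeg "
                   "before" "after")
;; => ["/pictures/img2.jpeg"]
```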

Be sure to study these documentation sources, including books like Getting Clojure and the Clojure CheatSheet.


Note:

I like to use let-spy and let-spy-pretty to illustrate how the code progresses. It produces output like the following:

-------------------------------
   Clojure 1.10.2    Java 15
-------------------------------

Testing tst.demo.core
before-files => ["/pictures/img1.jpeg" "/pictures/img2.jpeg" "/pictures/img3.jpeg"]
after-files => ["/pictures/img1.jpeg" "/pictures/img3.jpeg"]
before-files-set => #{"/pictures/img3.jpeg" "/pictures/img2.jpeg" "/pictures/img1.jpeg"}
after-files-set => #{"/pictures/img3.jpeg" "/pictures/img1.jpeg"}
delta-sorted => ["/pictures/img2.jpeg"]

Ran 2 tests containing 1 assertions.
0 failures, 0 errors.

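The spyx macro is also very useful for debugging. See the README and the API docs.

You probably want to compute the set difference between the two sets of file names after removing the prefixes: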

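;; Load the namespaces used below by their fully qualified names.
(require 'clojure.set 'clojure.string)

;; Transducer: keep only lines that start with prefix, then drop the prefix.
(defn deprefixing [prefix]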
  (comp (filter #(clojure.string/starts-with? % prefix))
        (map #(subs % (count prefix)))))

(defn load-string-set [xf filename]
  (->> filename
       slurp
       clojure.string/split-lines
       (into #{} xf)))

(defn deleted-files [prefix-file1 prefix-file2 file1 file2]
  (clojure.set/difference (load-string-set (deprefixing prefix-file1) file1)
                          (load-string-set (deprefixing prefix-file2) file2)))

(deleted-files "before" "after"
               "/tmp/before.txt" "/tmp/after.txt")
;; => #{"/pictures/img2.jpeg"}
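
Since the question asks for the missing paths to be printed, here is a small sketch that runs the same transducer on in-memory lines instead of files and prints the result one path per line. The helper lines->set and the vars before-lines, after-lines and deleted are hypothetical names introduced for illustration; deprefixing is repeated from above only so that the snippet runs on its own.

```clojure
(require '[clojure.set :as set]
         '[clojure.string :as string])

;; Same transducer as in the answer above, repeated for self-containment.
(defn deprefixing [prefix]
  (comp (filter #(string/starts-with? % prefix))
        (map #(subs % (count prefix)))))

;; Hypothetical helper: builds the set from a seq of lines rather than a file.
(defn lines->set [xf lines]
  (into #{} xf lines))

(def before-lines ["before/pictures/img1.jpeg"
                   "before/pictures/img2.jpeg"
                   "before/pictures/img3.jpeg"])

(def after-lines ["after/pictures/img1.jpeg"
                  "after/pictures/img3.jpeg"])

(def deleted
  (set/difference (lines->set (deprefixing "before") before-lines)
                  (lines->set (deprefixing "after") after-lines)))

;; Sets are unordered, so sort before printing to get stable output.
(run! println (sort deleted))
;; prints:
;; /pictures/img2.jpeg
```

Note that the filter step silently drops any line that does not start with the given prefix, which is convenient for blank lines but would also hide a mistyped prefix.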