Clojure - Split a string without losing the separator

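Does Clojure have a split function that breaks a string into substrings and keeps the separators in the result? For example, given "a=b" and the separator "=", it would return "a", "=", "b". Thanks!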

As far as I know, you can do this with interpose:
user=> (def mystring "a=b=cde=fg=hij")
#'user/mystring
user=> (interpose "=" (clojure.string/split mystring #"="))
("a" "=" "b" "=" "cde" "=" "fg" "=" "hij")

split-with mostly does this, although it takes a little work on your part.

(split-with #(not= \= %) "a=b")

yields

[(\a) (\= \b)] 

The most idiomatic remedy I can think of is:

(->> "a=b=c=d"                 ; thread the string in as the last argument of...
     (split-with #(not= \= %)) ; split-with, which splits once, at the first =
     (flatten)                 ; flatten the two character seqs into one
     (map str))                ; turn each character into a one-character string

("a" "=" "b" "=" "c" "=" "d")

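Because of flatten, this is probably not efficient, so it would be impractical if called repeatedly on long inputs. Note also that the result contains one string per character, so it only matches the desired output when every token is a single character.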
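For tokens longer than one character, one possible alternative is partition-by, which groups consecutive characters by whether they are the separator. This is only a sketch; the name split-keeping-sep is made up for illustration and is not from the answers above.

```clojure
;; Hypothetical helper (not part of clojure.core or clojure.string).
(defn split-keeping-sep
  "Splits s on the character sep, returning the pieces and the
  separators in their original order."
  [s sep]
  (->> s
       (partition-by #(= sep %)) ; group runs of separator / non-separator chars
       (map #(apply str %))))    ; join each run of characters back into a string

(split-keeping-sep "a=b=cde=fg" \=)
;; => ("a" "=" "b" "=" "cde" "=" "fg")
```

One consequence of grouping by runs is that consecutive separators, as in "a==b", come back as a single "==" token.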

I find a regular expression to be the simplest variant:

user> (re-seq #"[^=]+|=" "asd=dfg=hgf=jjj")
;;=> ("asd" "=" "dfg" "=" "hgf" "=" "jjj")

user> (re-seq #"[^=]+|=" "asd=dfg=hgf=")
;;=> ("asd" "=" "dfg" "=" "hgf" "=")

user> (re-seq #"[^=]+|=" "=dfg=hgf=dffff")
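;;=> ("=" "dfg" "=" "hgf" "=" "dffff")

(defn split-but-keep
  "Splits s into tokens, keeping each separator as its own token.
  sep is spliced into a regex, so it must be regex-escaped (double-escaped
  in a string literal). Separate multiple seps with `|`, e.g. the regex
  \\(|\\) passed as a string for an open or close paren. Because sep is also
  placed inside a character class, this suits single-character separators."
  [s sep]
  (let [re (re-pattern (str "[^" sep "]+|" sep))]
    (re-seq re s)))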

This is just @leetwinski's answer, wrapped in a function.

Use clojure.string/split, but put a group around the entire split regex.

cljs.user=> (def my-regex-with-group #"(=)")
#'cljs.user/my-regex-with-group
cljs.user=> (require '[clojure.string :as s])
nil
cljs.user=> (s/split "ab=dd=cb" my-regex-with-group)
["ab" "=" "dd" "=" "cb"]

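As you can see, this works in ClojureScript, where split relies on JavaScript's String.prototype.split, which includes captured groups in the result. On JVM Clojure, Java's regex split drops the matches even when they are grouped, so the same call returns only the pieces between separators. The regex can be any valid regular expression.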

cljs.user=> (s/split "ab=dd=cb" #"(dd=)")
["ab=" "dd=" "cb"]

Without the group, the match is omitted:

cljs.user=> (s/split "ab=dd=cb" #"dd=")
["ab=" "cb"]
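On JVM Clojure, Java's split does not return captured groups, so the group trick does not carry over as-is. One possible workaround is to split on zero-width lookarounds, which match the positions just before and just after each separator without consuming it. The sketch below assumes JVM Clojure; the name around-equals is made up for illustration.

```clojure
(require '[clojure.string :as string])

;; Hypothetical name. Matches the empty position right before an "="
;; (lookahead) or right after an "=" (lookbehind), so nothing is consumed.
(def around-equals #"(?==)|(?<==)")

(string/split "ab=dd=cb" around-equals)
;; => ["ab" "=" "dd" "=" "cb"]
```

Whether the same pattern works in ClojureScript depends on the JavaScript engine's support for lookbehind, so the capturing group shown above is probably the safer choice there.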