Clojure - Split String without losing the separator
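Does Clojure have a split function that splits a String into substrings and keeps the separator among them?
For example, given "a=b" and the separator "=", it should return "a", "=", "b".
Thanks!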
As far as I know, you can do this with interpose:
user=> (def mystring "a=b=cde=fg=hij")
#'user/mystring
user=> (interpose "=" (clojure.string/split mystring #"="))
("a" "=" "b" "=" "cde" "=" "fg" "=" "hij")
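One thing to watch with the interpose approach: clojure.string/split drops trailing empty strings by default, so a trailing separator in the input disappears. Below is a minimal sketch of a workaround, assuming JVM Clojure and a literal (non-regex) separator. The name split-keep-literal is made up for illustration; passing -1 as the limit keeps the trailing empty strings so interpose can put the separator back, and the empty strings are removed afterwards.

```clojure
(require '[clojure.string :as string])
(import 'java.util.regex.Pattern)

(defn split-keep-literal
  "Hypothetical helper: splits s on the literal string sep and
  re-inserts sep between the pieces, including leading and
  trailing occurrences."
  [s sep]
  (->> (string/split s (re-pattern (Pattern/quote sep)) -1) ; -1 keeps trailing ""
       (interpose sep)                                      ; put sep back between pieces
       (remove empty?)))                                    ; drop the "" placeholders

(split-keep-literal "a=b=cde=" "=")
;;=> ("a" "=" "b" "=" "cde" "=")
```

Because the separator is re-inserted rather than captured, this only suits a fixed separator string, not a pattern that can match different texts.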
split-with mostly does this, although it requires a bit of work on your part.
(split-with #(not= \= %) "a=b")
yields
[(\a) (\= \b)]
The most idiomatic fix I can think of is:
(->> "a=b=c=d"                  ; Thread the string through the last argument of...
     (split-with #(not= \= %))  ; Splitting at the first =
     (flatten)                  ; Then flattening
     (map str))                 ; And turning the characters into strings
("a" "=" "b" "=" "c" "=" "d")
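Because of flatten, this is probably not efficient, so it would be impractical if called repeatedly on long inputs. Note also that it produces one string per character, so it only gives the desired result when every field is a single character.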
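For fields longer than one character, a sequence-based alternative is partition-by, which groups consecutive characters by whether they are the separator. This is only a sketch: split-keep-char is a hypothetical name, and it assumes a single-character separator.

```clojure
(defn split-keep-char
  "Hypothetical helper: splits s around the character sep-char,
  keeping the separator runs as their own strings."
  [s sep-char]
  (->> s
       (partition-by #(= sep-char %)) ; runs of separator vs. non-separator chars
       (map #(apply str %))))         ; join each run back into a string

(split-keep-char "a=b=cde=fg" \=)
;;=> ("a" "=" "b" "=" "cde" "=" "fg")
```

One caveat of this design: adjacent separators end up in a single run, so "a==b" gives ("a" "==" "b") rather than two separate "=" entries.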
I find a regular expression to be the simplest variant:
user> (re-seq #"[^=]+|=" "asd=dfg=hgf=jjj")
;;=> ("asd" "=" "dfg" "=" "hgf" "=" "jjj")
user> (re-seq #"[^=]+|=" "asd=dfg=hgf=")
;;=> ("asd" "=" "dfg" "=" "hgf" "=")
user> (re-seq #"[^=]+|=" "=dfg=hgf=dffff")
;;=> ("=" "dfg" "=" "hgf" "=" "dffff")
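(defn split-but-keep
  "Splits s on sep and keeps the separators in the result.
  sep is spliced into a regex, so it must be a regex-escaped string
  (double-escaped inside a Clojure string literal).
  Use `|` between separators for several single-character ones,
  e.g. \"\\\\(|\\\\)\" for an open or close paren."
  [s sep]
  (let [re (re-pattern (str "[^" sep "]+|" sep))]
    (re-seq re s)))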
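Since split-but-keep places sep inside a character class, it is really only suited to single-character separators. For a literal separator of any length, a possible variant quotes the separator and uses a negative lookahead instead of a character class. This is a sketch under assumptions: JVM Clojure (it relies on java.util.regex.Pattern/quote), a non-empty separator, and the made-up name split-keep-sep.

```clojure
(import 'java.util.regex.Pattern)

(defn split-keep-sep
  "Hypothetical helper: returns the pieces of s and the literal
  separator sep, in order. sep is quoted, so no escaping is needed."
  [s sep]
  (let [q  (Pattern/quote sep)
        ;; Either the separator itself, or the longest run of characters
        ;; in which no position starts the separator. (?s) lets . match newlines.
        re (re-pattern (str "(?s)" q "|(?:(?!" q ").)+"))]
    (re-seq re s)))

(split-keep-sep "a::b::cd" "::")
;;=> ("a" "::" "b" "::" "cd")
```

The pattern has no capturing groups, so re-seq returns plain strings rather than vectors of groups.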
This is just @leetwinski's answer.
Use clojure.string/split, but put a group around the entire split regex.
cljs.user=> (def my-regex-with-group #"(=)")
#'cljs.user/my-regex-with-group
cljs.user=> (require '[clojure.string :as s])
nil
cljs.user=> (s/split "ab=dd=cb" my-regex-with-group)
["ab" "=" "dd" "=" "cb"]
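As the cljs.user prompt shows, this was run in ClojureScript, where clojure.string/split delegates to JavaScript's String.prototype.split, which includes captured groups in the result. On the JVM, clojure.string/split uses java.util.regex.Pattern, which discards the separators even when they are grouped. The regex can be any valid regular expression.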
cljs.user=> (s/split "ab=dd=cb" #"(dd=)")
["ab=" "dd=" "cb"]
Without the group, the match is omitted:
cljs.user=> (s/split "ab=dd=cb" #"dd=")
["ab=" "cb"]
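On JVM Clojure the capture-group trick has no effect, because Java's Pattern.split never returns the delimiters. One way to get a similar result there is to split on zero-width lookaround positions, so nothing is consumed by the split. A minimal sketch, assuming JVM Clojure on Java 8 or later; around-equals is just an illustrative name:

```clojure
(require '[clojure.string :as string])

;; Zero-width split points: right after a "=" or right before one.
(def around-equals #"(?<==)|(?==)")

(string/split "ab=dd=cb" around-equals)
;;=> ["ab" "=" "dd" "=" "cb"]
```

The lookbehind and lookahead match positions rather than characters, which is why the "=" pieces survive as separate elements.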