How do you read a file and then split it into multiple files with Clojure?
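I have a text file that needs to be split as follows: the first line is an identifier that must appear at the top of every file created, and the following lines are pairs of numbers separated by a comma:
Example: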
12345
12000,2
13000,5
12501,1
etc...
When the script runs, it leaves 3 files as follows:
p1.dat 12345 12000,2 etc...
p2.dat 12345 13000,5 etc...
p3.dat 12345 12501,1 etc...
I have already written a script in Ruby, but I'm rusty in Clojure and/or may never have learned it well, so please guide me. How can this be done in a 'functional' way?
Ruby code:
#!/usr/bin/env ruby
if ARGV.length < 1
  puts <<~USAGE
    To use this script give the path to the file and the number of parts to divide,
    if you don't give this number 2 is assumed
  USAGE
  exit
end
file_path = ARGV[0].to_s
if ARGV.length == 1
  n_parts = 2
else
  n_parts = ARGV[1].to_i
end
particiones = Array.new(n_parts)
j = Array(1..n_parts)
File.readlines(file_path).each_with_index do |linea, i|
  # first line is the id, so it must be in every divided part
  if i == 0 # repeat first line in every file
    j.each do |particion|
      particiones[particion-1] = linea
    end
    next # after processing first line
  end
  particiones[(i-1) % n_parts] += linea # lines are distributed round-robin
end
j.each do |particion| # files are written
  # create the files p1, p2, ... etc
  File.open("p"+particion.to_s+".dat", 'w'){|f| f << particiones[particion-1]}
  puts "File p"+particion.to_s+".dat was created"
end
I hope I understood your task correctly; try this:
Code:
(require '[clojure.string])

(defn split-file
  "First argument is the path to the file,
  second argument is the number of files to create, default is 2."
  ([path] (split-file path 2))
  ([path parts]
   (let [;; turn line breaks into spaces, then split into tokens
         numbers        (-> (slurp path)
                            (clojure.string/replace #"\n|\r\n" " ")
                            (clojure.string/split #" "))
         identifier     (first numbers)
         pairs          (rest numbers)
         file-names     (for [i (range 1 (inc parts))]
                          (str "p" i ".dat"))
         ;; file k gets pairs k, k+parts, k+2*parts, ... with the identifier on top
         data-for-files (->> (iterate rest pairs)
                             (take parts)
                             (map #(take-nth parts %))
                             (map #(conj % identifier))
                             (map #(clojure.string/join "\n" %)))]
     ;; doall forces the lazy map so that every file is actually written
     (doall (map (fn [file-name text]
                   (spit file-name text))
                 file-names
                 data-for-files)))))
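To see how the distribution step in the middle of that function works, it may help to try it on its own at the REPL. The sketch below pulls the `iterate` / `take` / `take-nth` pipeline into a standalone function; the name `round-robin` is hypothetical and does not appear in the answer's code.

```clojure
;; Hypothetical helper for illustration only: it isolates the round-robin
;; step of split-file so it can be tried without touching any files.
(defn round-robin
  "Splits coll into n sequences; the element at index i goes to sequence (mod i n)."
  [n coll]
  (->> (iterate rest coll)       ; coll, coll without its 1st item, without its first 2, ...
       (take n)                  ; keep the first n of those shifted views
       (map #(take-nth n %))))   ; from each view, keep every n-th item

(round-robin 3 ["a" "b" "c" "d" "e" "f" "g"])
;; => (("a" "d" "g") ("b" "e") ("c" "f"))
```

Each shifted view starts one element later than the previous one, so taking every n-th item from view k yields exactly the items that belong to part k.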
Call:
(split-file "data.txt" 3)
(returns (nil nil nil), but that doesn't matter)
File data.txt:
12345
12000,2
13000,5
12501,1
12000,2
13000,5
12501,1
12000,2
13000,5
12501,1
Output files:
p1.dat:
12345
12000,2
12000,2
12000,2
p2.dat:
12345
13000,5
13000,5
13000,5
p3.dat:
12345
12501,1
12501,1
12501,1
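As a possible variant, the same result can be produced with `clojure.string/split-lines` for reading and `doseq` for writing, which returns nil instead of a sequence of nils. This is only a sketch: the function name `split-file*` and its `out-dir` parameter are assumptions added here so the output location can be chosen, and are not part of the code above.

```clojure
(require '[clojure.string :as string]
         '[clojure.java.io :as io])

;; Sketch only: split-file* and out-dir are hypothetical names.
(defn split-file*
  "Writes p1.dat .. pN.dat into out-dir. Each file starts with the first line
  of the input, followed by every parts-th remaining line."
  [path parts out-dir]
  (let [[identifier & pairs] (string/split-lines (slurp path))]
    (doseq [i (range parts)]
      ;; drop i lines, then keep every parts-th line: lines i, i+parts, ...
      (let [lines (cons identifier (take-nth parts (drop i pairs)))]
        (spit (io/file out-dir (str "p" (inc i) ".dat"))
              (string/join "\n" lines))))))

;; Usage, writing into the current directory:
;; (split-file* "data.txt" 3 ".")
```

Because `doseq` exists for side effects, there is no lazy sequence to force with `doall`.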