How do I iterate through two arrays individually?
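My own small project is to merge two logs by timestamp; both logs use the same timestamp format. Some rows have no timestamp and should be printed together with the timestamped row they follow.
So if I have a log like this: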
2015-06-25 09:20:25,654 file1 text2
2015-06-25 09:20:23,654 file1 text1
test text1 belongs to the row above
2015-06-25 09:20:27,654 file1 text3
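The other file looks the same, but with different timestamps.
Since I'm new to Ruby, I figured this project might be a good way to get started.
So far I've found enough help to know that I should use Enumerators, I think, with something like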
loop do
code
end
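But how do I decide when to advance file1 and when to advance file2?
How do I find out when one iterator is at the end of its file so that I can print the rest of the other file?
Should I read each file into an array first, or just use two streams, one per input file, plus one stream for the output file?
Summary: I want to iterate through two files until one of them reaches its end, then print the remaining rows of the other file, while controlling when each of the two files advances.
Thanks for your time and input!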
**Edit:**
But I want to merge them together by timestamp. Like:
2015-06-25 09:20:24,123 file1 text1
2015-06-25 09:20:23,123 file2 text1
2015-06-25 09:20:26,123 file2 text2
Output:
2015-06-25 09:20:23,123 file2 text1
2015-06-25 09:20:24,123 file1 text1
2015-06-25 09:20:26,123 file2 text2
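Basically, if I had two arrays, I would walk them with iterators x and y.
If x > y, put y into the output file and do something like y++, then keep checking the two against each other until the end of the file. If x is at EOF, just append the rest of y to the output file.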
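For reference, the x/y idea described above could be sketched in Ruby roughly as follows. This is only a sketch: `merge_sorted` is a made-up name, and it assumes both arrays are already sorted in ascending order. Ruby has no `++`, so each index is advanced with `+= 1`.

```ruby
# Merge two arrays that are each already sorted in ascending order.
# `merge_sorted` is a hypothetical helper name used for illustration.
def merge_sorted(xs, ys)
  merged = []
  x = 0
  y = 0

  # Compare the current elements until one array is exhausted.
  while x < xs.length && y < ys.length
    if xs[x] > ys[y]
      merged << ys[y]
      y += 1
    else
      merged << xs[x]
      x += 1
    end
  end

  # At most one of these slices is non-empty: the rest of the other array.
  merged.concat(xs[x..-1]).concat(ys[y..-1])
end

file1 = ['2015-06-25 09:20:24,123 file1 text1']
file2 = ['2015-06-25 09:20:23,123 file2 text1',
         '2015-06-25 09:20:26,123 file2 text2']
puts merge_sorted(file1, file2)
```

Comparing the lines as plain strings only works here because the timestamps are fixed-width and zero-padded, so alphabetical order matches chronological order.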
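OK, this is what I ended up with. I believe it should work for you. It behaves as follows:
- If the current line of each file starts with a parsable date, the earliest one wins (change the `<` to `>` in the `left_date < right_date` comparison to flip this).
- When a line is written to the output, the file that line came from is advanced and its next line is grabbed. The line from the other file stays where it is until it is its turn to be written.
- If a line does not start with a parsable date, it is written straight away. As above, that file is advanced and the next line pulled in, repeating until a parsable date shows up again or the end of the file is reached.
- If one of the files reaches its end, the other file is streamed to the output until it also reaches its end.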
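Since the question mentions Enumerators, the same rules can also be sketched with `Enumerator#peek`, which looks at the next line without consuming it and raises `StopIteration` at the end of the file. Treat this as an illustration under some assumptions, not as the script itself: `merge_logs`, `peek_or_nil`, and `TIMESTAMP` are names made up here, and timestamps are compared as strings, which assumes the fixed-width `YYYY-MM-DD HH:MM:SS,mmm` layout shown in the question.

```ruby
require 'stringio'

# Assumed timestamp layout at the start of a line (hypothetical constant).
TIMESTAMP = /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}/

# Return the enumerator's next line without consuming it, or nil at the end.
def peek_or_nil(enum)
  enum.peek
rescue StopIteration
  nil
end

# Merge two line-based IO objects into `out`, following the rules above.
def merge_logs(left_io, right_io, out)
  left = left_io.each_line
  right = right_io.each_line

  loop do
    l = peek_or_nil(left)
    r = peek_or_nil(right)
    break if l.nil? && r.nil?

    source =
      if r.nil? || (l && l !~ TIMESTAMP)
        left   # right is finished, or left holds a continuation line
      elsif l.nil? || r !~ TIMESTAMP
        right  # left is finished, or right holds a continuation line
      elsif l[TIMESTAMP] < r[TIMESTAMP]
        left   # both lines are timestamped and left is older
      else
        right  # right is older, or the timestamps are equal
      end

    # Only the chosen enumerator advances; the other keeps its peeked line.
    out << source.next
  end

  out
end

a = StringIO.new("2015-06-25 09:20:24,123 file1 text1\n")
b = StringIO.new("2015-06-25 09:20:23,123 file2 text1\n" \
                 "2015-06-25 09:20:26,123 file2 text2\n")
puts merge_logs(a, b, StringIO.new).string
```

Because `peek` does not advance the enumerator, this answers the "which file do I advance" part directly: only the enumerator whose line was written gets `next` called on it. `File.open(path).each_line` could be passed in place of the `StringIO` objects.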
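Note that you will need to change `'file1'`, `'file2'`, and `'output'` to file paths, either relative or absolute. You could use `ARGV` or `OptionParser` to pass this data into the program as command line arguments.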
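As a rough idea of the `OptionParser` route, something like the following could collect the three paths. The `-o`/`--output` flag, the `parse_paths` name, and the choice to take the two input files as positional arguments are all assumptions made for this sketch; nothing in the script requires them.

```ruby
require 'optparse'

# Hypothetical helper: turn an argument list into the three paths.
# The two input logs are positional; the output path is an optional flag.
def parse_paths(argv)
  options = { output: 'output' }

  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: compile_files.rb [options] FILE1 FILE2'
    opts.on('-o', '--output PATH', 'Path of the merged log') do |path|
      options[:output] = path
    end
  end

  # `parse` leaves argv untouched and returns the non-option arguments.
  inputs = parser.parse(argv)
  raise ArgumentError, parser.banner unless inputs.length == 2

  options.merge(file1: inputs[0], file2: inputs[1])
end

# In the script this would be called as: paths = parse_paths(ARGV)
paths = parse_paths(%w[-o merged.log a.log b.log])
puts paths.inspect
```

The returned hash values would then replace the hard-coded `'file1'`, `'file2'`, and `'output'` strings in the `File.open` calls.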
Input files:
# file1
2014-06-21 07:20:25,654 file1 text2
2015-01-13 14:24:23,654 file1 text1
test text1 belongs to the row above
2015-06-21 08:57:27,654 file1 text3
# file2
2013-01-05 19:27:25,654 file1 text2
2015-04-01 10:13:23,654 file1 text1
test text5 belongs to the row above
2015-06-23 09:49:27,654 file1 text3
# output
2013-01-05 19:27:25,654 file1 text2
2014-06-21 07:20:25,654 file1 text2
2015-01-13 14:24:23,654 file1 text1
test text1 belongs to the row above
2015-04-01 10:13:23,654 file1 text1
test text5 belongs to the row above
2015-06-21 08:57:27,654 file1 text3
2015-06-23 09:49:27,654 file1 text3
# compile_files.rb
require 'date'

# Attempt to read a line from the supplied file.
# If this fails, we are at the end of the file and return nil.
def read_line_from_file(file)
  file.readline
rescue EOFError
  nil
end

# Parse the date which is at the beginning of the supplied text.
# If this fails, it doesn't start with a date so we return nil.
def parse_date(text)
  DateTime.parse(text)
rescue ArgumentError
  nil
end

begin
  # Open the files to sort
  input_file_1 = File.open('file1', 'r')
  input_file_2 = File.open('file2', 'r')

  # Open the file that will be written. Here it is named "output"
  File.open('output', 'w+') do |of|
    # Read the first line from each file
    left = read_line_from_file(input_file_1)
    right = read_line_from_file(input_file_2)

    # Loop until BOTH files have reached their end
    until left.nil? && right.nil?
      # If the first file was successfully read,
      # attempt to parse the date at the beginning of the line
      left_date = parse_date(left) if left

      # If the second file was successfully read,
      # attempt to parse the date at the beginning of the line
      right_date = parse_date(right) if right

      # If the first file was successfully read,
      # but the date was not successfully parsed,
      # the line is a stack trace and needs to be printed
      # because it will be following the related
      # timestamped line.
      if left && left_date.nil?
        of << left
        # Now that we have printed that line,
        # grab the next one from the same file.
        left = read_line_from_file(input_file_1)
        # Skip straight to the next iteration of the `until` loop.
        next
      # If the second file was successfully read,
      # but the date was not successfully parsed,
      # the line is a stack trace and needs to be printed
      # because it will be following the related
      # timestamped line.
      elsif right && right_date.nil?
        of << right
        # Now that we have printed that line,
        # grab the next one from the same file.
        right = read_line_from_file(input_file_2)
        # Skip straight to the next iteration of the `until` loop.
        next
      end

      # If we got this far, neither of the lines were stack trace
      # lines. If the second file has reached its end, we need
      # to print the line we grabbed from the first file.
      if right.nil?
        of << left
        # Now that we have printed that line,
        # grab the next one from the same file.
        left = read_line_from_file(input_file_1)
        # Skip straight to the next iteration of the `until` loop.
        next
      end

      # If we got this far, the second file has not
      # reached its end. If the first file has reached
      # its end, we need to print the line we grabbed
      # from the second file.
      if left.nil?
        of << right
        # Now that we have printed that line,
        # grab the next one from the same file.
        right = read_line_from_file(input_file_2)
        # Skip straight to the next iteration of the `until` loop.
        next
      end

      # If we got this far, neither file has reached its
      # end and both start with timestamps. If the first file's
      # timestamp is less, it is older.
      if left_date < right_date
        of << left
        # Now that we have printed that line,
        # grab the next one from the same file.
        left = read_line_from_file(input_file_1)
        # Skip straight to the next iteration of the `until` loop.
        next
      # Either the timestamps were the same or the second one is
      # older.
      else
        of << right
        # Now that we have printed that line,
        # grab the next one from the same file.
        right = read_line_from_file(input_file_2)
        # Skip straight to the next iteration of the `until` loop.
        next
      end
    end
  end
ensure
  # Make sure that the file descriptors are closed.
  # Either variable is still nil if its File.open call failed.
  input_file_1.close if input_file_1
  input_file_2.close if input_file_2
end
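One caveat worth keeping in mind: `DateTime.parse` is fairly lenient and can accept text that merely contains a date-like fragment, such as a weekday name, so a continuation line could in principle be mistaken for a timestamped one. If that ever bites you, a stricter check along these lines might help. It is only a sketch: `parse_timestamp` and the two constants are names invented here, and the pattern assumes the exact `YYYY-MM-DD HH:MM:SS,mmm` layout from the sample logs.

```ruby
require 'date'

# Assumed layout of the timestamp at the very start of a line.
TIMESTAMP_PATTERN = /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}/
TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S,%L'

# Return a DateTime if the line starts with a timestamp in the assumed
# layout, otherwise nil. Only the matched prefix is handed to strptime.
def parse_timestamp(text)
  match = TIMESTAMP_PATTERN.match(text)
  return nil unless match

  DateTime.strptime(match[0], TIMESTAMP_FORMAT)
rescue ArgumentError
  # The prefix had the right shape but did not form a valid date.
  nil
end

p parse_timestamp('2015-06-25 09:20:25,654 file1 text2')
p parse_timestamp('test text1 belongs to the row above')
```

It could be swapped in for `parse_date` without touching the rest of the loop, since both return either a comparable date object or `nil`.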