Remove linebreaks from file
I have a text file in the following format:
Run#1 Step#1 > Connecting to server
Run#1 Step#2 > Connected OK
Run#1 Step#3 > Sending request: {
"path": "/testpage",
"time": "2015-06-07T00:00:00.000Z"
}
Run#1 Step#4 > Request sent OK
I need to process this file. It would be easier if each step were printed on a single line:
Run#1 Step#1 > Connecting to server
Run#1 Step#2 > Connected OK
Run#1 Step#3 > Sending request: { "path": "/testpage", "time": "2015-06-07T00:00:00.000Z" }
Run#1 Step#4 > Request sent OK
How can I do this (in bash or a ruby/python/... script)?
1) Remove every newline ("\n")
2) Replace "Run#" with "\nRun#"
3) Delete the first line, which is now empty
GNU sed solution:
cat file | sed ':a; N; $! ba; s/\n//g; s/Run#/\nRun#/g;' | sed '1d;' > outputfile
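For readers who prefer Python, the same three steps can be sketched as below. This is only an illustration of the approach: the function name `join_steps` is made up for the example, and it assumes the whole file fits in memory as one string.

```python
def join_steps(text):
    """Put each "Run#" step on its own line (hypothetical helper)."""
    # Step 1: remove every newline so the file becomes one long line.
    flat = text.replace("\n", "")
    # Step 2: start a new line in front of every "Run#" marker.
    separated = flat.replace("Run#", "\nRun#")
    # Step 3: drop the first line, which is empty when the text starts with "Run#".
    _, _, rest = separated.partition("\n")
    return rest


if __name__ == "__main__":
    sample = (
        "Run#1 Step#1 > Connecting to server\n"
        "Run#1 Step#3 > Sending request: {\n"
        '"path": "/testpage"\n'
        "}\n"
        "Run#1 Step#4 > Request sent OK\n"
    )
    print(join_steps(sample))
```

Like the sed version, this splits on every occurrence of "Run#", so it would also break a line if that text ever appeared inside a request body.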
If all of your files look exactly like this one, this code will solve your problem:
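# Assumes `filename` holds the path of the log file.
with open(filename) as f:
    for line in f:
        # Complete step lines and the closing "}" keep their newline;
        # every other line is joined with the line that follows it.
        if (line.startswith("Run") and "{" not in line) or "}" in line:
            print(line, end="")
        else:
            print(line.rstrip("\n"), end="")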
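Using Python, group the lines by whether they start with Run#, and join any lines that do not start with Run# onto the preceding Run# line, regardless of their content. This also replaces the original file, and you do not need to read the whole file into memory: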
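from itertools import groupby
from shutil import move
from tempfile import NamedTemporaryFile

with open("file.txt") as f, NamedTemporaryFile("w", dir=".", delete=False) as out:
    # Key is True for continuation lines, False for lines starting with "Run#".
    grouped = groupby(f, key=lambda x: not x.startswith("Run#"))
    for k, v in grouped:
        if not k:
            # Join the run of "Run#" lines first, then pull the following
            # group of continuation lines (empty at end of file).
            v, nxt = "".join(v), next(grouped, (True, ()))[1]
            out.write("{}{}\n".format(v.rstrip(), "".join(map(str.strip, nxt))))
        else:
            # Only reached when the file starts with non-"Run#" lines.
            out.writelines(v)
# Replace the original once the temporary file is closed.
move(out.name, "file.txt")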
Output:
Run#1 Step#1 > Connecting to server
Run#1 Step#2 > Connected OK
Run#1 Step#3 > Sending request: {"path": "/testpage","time": "2015-06-07T00:00:00.000Z"}
Run#1 Step#4 > Request sent OK
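If the groupby bookkeeping is hard to follow, the same idea (append every non-Run# line to the most recent Run# line, one line at a time) can be written as a plain generator. This is a sketch rather than a drop-in replacement: the name `merge_continuations` and its `prefix` parameter are invented for the example, and it yields the merged lines instead of rewriting the file.

```python
def merge_continuations(lines, prefix="Run#"):
    """Yield one string per step, with continuation lines appended (hypothetical helper)."""
    current = None
    for line in lines:
        if line.startswith(prefix):
            # A new step begins: emit the previous one, if any.
            if current is not None:
                yield current
            current = line.rstrip()
        elif current is None:
            # Lines before the first step are passed through unchanged.
            yield line.rstrip("\n")
        else:
            # Continuation line: strip it and glue it onto the current step.
            current += line.strip()
    if current is not None:
        yield current


if __name__ == "__main__":
    sample = [
        "Run#1 Step#2 > Connected OK\n",
        "Run#1 Step#3 > Sending request: {\n",
        '"path": "/testpage",\n',
        '"time": "2015-06-07T00:00:00.000Z"\n',
        "}\n",
        "Run#1 Step#4 > Request sent OK\n",
    ]
    for step in merge_continuations(sample):
        print(step)
```

Because it accepts any iterable of lines, an open file object can be passed in directly, so only one step is held in memory at a time.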