Automated text file editing
I have a text file that looks like this:
+PhoneNumber 3/5/15 7:16 PM us Text is here
+PhoneNumber 3/5/15 7:16 PM us Text is here
+PhoneNumber 3/5/15 7:16 PM us Text is here
+PhoneNumber 3/5/15 7:16 PM us Text is here
The problem is that some lines look like this:
+PhoneNumber 3/5/15 7:16 PM us Text is here
but runs down to here
+PhoneNumber 3/5/15 7:16 PM us Text is here
+PhoneNumber 3/5/15 7:16 PM us Text is here
but runs down to here
+PhoneNumber 3/5/15 7:16 PM us Text is here
but runs down to here or even
longer like this
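So I have lines of varying length that wrap like the examples above. My goal is for every line to look like the first example, i.e. every line should start with "+PhoneNumber" rather than with text. Any continuation text should be pulled back up onto the line before it so that it completes the sentence. The result would look more like this: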
+PhoneNumber 3/5/15 7:16 PM us Text is here but runs down to here
+PhoneNumber 3/5/15 7:16 PM us Text is here
+PhoneNumber 3/5/15 7:16 PM us Text is here but runs down to here
+PhoneNumber 3/5/15 7:16 PM us Text is here but runs down to here or even longer like this
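I have no idea how to get a script or anything else to do this for me, so I'm asking for help. I tried Googling it, but nothing helped. Right now I'm editing each line by hand, but there are over 30,000 lines of text, and editing them all manually would take forever. Any help would be greatly appreciated. Thanks, everyone!
TL;DR: I need a script that moves the text back up to the previous line whenever a line doesn't start with +.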
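Assuming you have access to awk:
~ $ cat test.awk
# A line starting with "+" begins a new record: emit a newline, then the line.
/^\+/ { printf "\n%s", $0; }
# Any other non-empty line is a continuation: append it after a space.
/^[^+]/ { printf " %s", $0; }
# Terminate the last record with a newline.
END { print ""; }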
~ $ cat test.input
+PhoneNumber 3/5/15 7:16 PM us Text is here
but runs down to here
+PhoneNumber 3/5/15 7:16 PM us Text is here
+PhoneNumber 3/5/15 7:16 PM us Text is here
but runs down to here
+PhoneNumber 3/5/15 7:16 PM us Text is here
but runs down to here or even
longer like this
~ $ awk -f test.awk <test.input | tail -n +2
+PhoneNumber 3/5/15 7:16 PM us Text is here but runs down to here
+PhoneNumber 3/5/15 7:16 PM us Text is here
+PhoneNumber 3/5/15 7:16 PM us Text is here but runs down to here
+PhoneNumber 3/5/15 7:16 PM us Text is here but runs down to here or even longer like this
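For anyone without awk, the same idea can be sketched in Python. This is only an illustration, not part of the awk answer: the function name join_continuations and its marker parameter are made up here, and it assumes that every real record starts with "+" and that blank lines can be dropped (the awk script skips them as well).

```python
def join_continuations(lines, marker="+"):
    """Append every line that does not start with `marker` to the record before it."""
    records = []
    for raw in lines:
        line = raw.rstrip("\r\n")  # drop Unix or Windows line endings
        if not line:
            continue  # blank lines are skipped (assumption, mirrors the awk script)
        if line.startswith(marker) or not records:
            # A "+" line starts a new record; so does stray text before any record.
            records.append(line)
        else:
            # Continuation text: glue it onto the previous record with one space.
            records[-1] += " " + line
    return records


sample = [
    "+PhoneNumber 3/5/15 7:16 PM us Text is here\n",
    "but runs down to here\n",
    "+PhoneNumber 3/5/15 7:16 PM us Text is here\n",
    "+PhoneNumber 3/5/15 7:16 PM us Text is here\n",
    "but runs down to here or even\n",
    "longer like this\n",
]

for record in join_continuations(sample):
    print(record)
```

Because the function accepts any iterable of lines, an open file handle (for example the test.input file above) can be passed in directly, so a 30,000-line file is processed in a single pass.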
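I suggest using two regular-expression replacements: first replace \r\n with a space, then replace (.*?)\+ with $1\r\n+ (the plus sign has to be escaped so that it matches a literal +). Note that this also splits on any + that appears inside the message text.
This is quick to do in Notepad++.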
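The two replacements can also be folded into one by replacing only those line breaks that are not followed by a "+". The Python sketch below illustrates that variant; it is a swapped-in single-pass pattern rather than the exact two-step recipe, and the names CONTINUATION_BREAK and join_with_regex are made up for the example. It assumes "+" appears at the start of a line only when a new record begins.

```python
import re

# A line break (Windows or Unix) that is NOT followed by "+" or by the end of the text.
CONTINUATION_BREAK = re.compile(r"\r?\n(?!\+|\Z)")


def join_with_regex(text):
    """Replace every continuation line break with a single space."""
    return CONTINUATION_BREAK.sub(" ", text)


text = (
    "+PhoneNumber 3/5/15 7:16 PM us Text is here\r\n"
    "but runs down to here\r\n"
    "+PhoneNumber 3/5/15 7:16 PM us Text is here\r\n"
)

print(join_with_regex(text))
```

The negative lookahead leaves the line breaks in front of each "+" record and the final one untouched, so no second replacement is needed. The same pattern will probably work in an editor whose search supports lookahead, but that is untested here.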