Splitting a file in a shell script adds unwanted newlines
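I need to process a long text file and split it into many smaller files. I have a single-pass while - read - done loop, and a matching line signals the start of a new output file. In the input file, a matching line is always preceded by an empty line.
My problem is that every output file except the last one ends up with an extra trailing newline. I reproduced the problem in this short example.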
#!/bin/zsh
rm inputfile outputfile1 outputfile2
IFS=''
printf "section1\nsection1end\n\nsection2\nsection2end\n" >inputfile
echo " open outputfile1"
exec 3<> outputfile1
counter=1
IFS=$'\n'
while IFS= read line; do
    if [[ "$line" == "section2" ]]; then
        echo " Matched start of section2. Close outputfile1 and open outputfile2"
        exec 3>&-
        exec 3<> outputfile2
    fi
    echo "$line" >&3
    echo $counter $line
    let "counter = $counter + 1"
done <inputfile
echo " Close outputfile2"
exec 3>&-
echo
unset IFS
echo `wc -l inputfile`
echo `wc -l outputfile1`
echo `wc -l outputfile2`
echo " The above should show 5, 2, 2 as desired number of newlines in these files."
Output:
open outputfile1
1 section1
2 section1end
3
Matched start of section2. Close outputfile1 and open outputfile2
4 section2
5 section2end
Close outputfile2
5 inputfile
3 outputfile1
2 outputfile2
The above should show 5, 2, 2 as desired number of newlines in these files.
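To see where the third line in outputfile1 comes from, it may help to look at the last line that gets copied before the match. The following is only a minimal sketch under the same sample input; the file names demo_input and demo_out1 are hypothetical, chosen so the original files are not overwritten, and the loop stops at the marker instead of switching files.

```shell
# Recreate the sample input under a hypothetical name.
printf 'section1\nsection1end\n\nsection2\nsection2end\n' > demo_input

# Copy lines until the marker, the same way the loop above fills outputfile1.
: > demo_out1
while IFS= read -r line; do
    [ "$line" = "section2" ] && break
    printf '%s\n' "$line" >> demo_out1
done < demo_input

# The last line copied is the empty separator line that precedes the marker.
last=$(tail -n 1 demo_out1)
if [ -z "$last" ]; then
    echo "demo_out1 ends with an empty line"
fi
wc -l < demo_out1
```

The empty line is read and written before the loop ever sees "section2", so by the time the match happens it is already in the first file.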
Option 1
Drop all empty lines. This only works if you do not need to keep any empty lines in the middle of a section.
Change:
echo "$line" >&3
To:
[[ -n "$line" ]] && echo "$line" >&3
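For context, here is a rough sketch of that change applied to the whole loop. The file names opt1_input, opt1_out1 and opt1_out2 are hypothetical. It also swaps echo for printf '%s\n', which is my own substitution and not part of the change above: zsh's echo interprets backslash escapes in the data by default, and printf avoids that. The test uses [ ... ] so the same lines also run under bash or sh.

```shell
# Sample input under a hypothetical name.
printf 'section1\nsection1end\n\nsection2\nsection2end\n' > opt1_input

# Open the first output file on descriptor 3 (truncating it).
exec 3> opt1_out1
while IFS= read -r line; do
    if [ "$line" = "section2" ]; then
        # Close the first output file and open the second one.
        exec 3>&-
        exec 3> opt1_out2
    fi
    # Skip empty lines entirely; everything else goes to the current file.
    if [ -n "$line" ]; then
        printf '%s\n' "$line" >&3
    fi
done < opt1_input
exec 3>&-

wc -l < opt1_out1
wc -l < opt1_out2
```

With the sample input both counts should come out as 2, but any empty line inside a section would be lost as well.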
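Option 2
Rewrite each file through a command substitution, which trims any trailing newlines. Best suited to short files, because the whole file is held in a variable. Change: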
exec 3>&-
exec 3<> outputfile2
To:
exec 3>&-
data=$(<outputfile1)
echo "$data" >outputfile1
exec 3<> outputfile2
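The trimming can be checked in isolation. This is just a small sketch with a hypothetical file opt2_out1 that ends in two unwanted empty lines. It uses $(cat file) so it also runs under plain sh; in zsh or bash the $(<file) form shown above does the same job without starting cat.

```shell
# Two content lines followed by two unwanted empty lines.
printf 'section1\nsection1end\n\n\n' > opt2_out1
wc -l < opt2_out1

# Command substitution removes every trailing newline from the captured text.
data=$(cat opt2_out1)

# printf puts exactly one newline back when the file is rewritten.
printf '%s\n' "$data" > opt2_out1
wc -l < opt2_out1
```

Note that this strips all trailing empty lines, not only one, which is fine here because a section is never supposed to end with one.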
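Option 3
Have the loop write the line from the previous iteration, and when a new file starts, skip writing the held-back line, which is the empty last line of the previous file: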
#!/bin/zsh
rm inputfile outputfile1 outputfile2
IFS=''
printf "section1\nsection1end\n\nsection2\nsection2end\n" >inputfile
echo " open outputfile1"
exec 3<> outputfile1
counter=1
IFS=$'\n'
priorLine=MARKER
while IFS= read line; do
    if [[ "$line" == "section2" ]]; then
        echo " Matched start of section2. Close outputfile1 and open outputfile2"
        exec 3>&-
        exec 3<> outputfile2
    elif [[ "$priorLine" != MARKER ]]; then
        echo "$priorLine" >&3
    fi
    echo $counter $line
    let "counter = $counter + 1"
    priorLine="$line"
done <inputfile
echo "$priorLine" >&3
echo " Close outputfile2"
exec 3>&-
echo
unset IFS
echo `wc -l inputfile`
echo `wc -l outputfile1`
echo `wc -l outputfile2`
echo " The above should show 5, 2, 2 as desired number of newlines in these files."
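Since the real input has many sections, the same delayed-write idea might be generalized along these lines. Treat it as a sketch under assumptions, not a drop-in solution: the is_start function and the "section plus digits" naming rule are hypothetical stand-ins for whatever your real match is, the opt3_* file names are made up, and the input is assumed to be non-empty and to begin with a start line. A have_prior flag replaces the MARKER sentinel so that a data line that happens to read "MARKER" is not mistaken for it.

```shell
# Sample input with three sections, each separated by an empty line.
printf 'section1\nsection1end\n\nsection2\nsection2end\n\nsection3\nsection3end\n' > opt3_input

# Hypothetical matcher: a start line is "section" followed only by digits.
is_start() {
    case "$1" in
        section*[!0-9]*) return 1 ;;  # e.g. "section1end"
        section[0-9]*)   return 0 ;;  # e.g. "section2"
        *)               return 1 ;;
    esac
}

n=0
have_prior=0
prior=
while IFS= read -r line; do
    if is_start "$line"; then
        # New section: drop the held-back separator line and switch files.
        # Re-opening descriptor 3 also closes the previous output file.
        n=$((n + 1))
        exec 3> "opt3_out$n"
    elif [ "$have_prior" -eq 1 ]; then
        # Otherwise write the line held back from the previous iteration.
        printf '%s\n' "$prior" >&3
    fi
    prior=$line
    have_prior=1
done < opt3_input

# Flush the last held-back line, then close the descriptor.
printf '%s\n' "$prior" >&3
exec 3>&-

wc -l < opt3_out1
wc -l < opt3_out2
wc -l < opt3_out3
```

The held-back line is dropped unconditionally at each section start, so this relies on the guarantee from the question that a matching line is always preceded by an empty line.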