How to use awk to select text from a file, starting at a line number and ending at a certain string
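I have this file and I want to read it starting from a given line number until a certain string. I have already used
awk "NR>=$LINE && NR<=$((LINE + 121)) {print}" db_000022_model1.dlg
to read from a specific line up to a fixed line offset, but now I need it to stop by itself at a certain string so that I can use it on other files.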
DOCKED: ENDBRANCH 7 22
DOCKED: TORSDOF 3
DOCKED: TER
DOCKED: ENDMDL
I want it to stop after reaching
DOCKED: ENDMDL
#!/bin/bash
# This script is for extracting the pdb files from a sorted list of scored
# ligands
mkdir top_poses
for d in $(head -20 summary_2.0.sort | cut -d, -f1 | cut -d/ -f1)
do
    cd "$d" || continue
    # find the cluster with the highest population within the dlg
    RUN=$(grep '###*' "$d.dlg" | sort -k10 -r | head -1 | cut -d\| -f3 | sed 's/ //g')
    LINE=$(grep -ni "BEGINNING GENETIC ALGORITHM DOCKING $RUN of 100" "$d.dlg" | cut -d: -f1)
    echo "$LINE"
    # extract the best pose and correct the format
    awk -v line="$((LINE + 14))" "NR>=line; /DOCKED: ENDMDL/{exit}" "$d.dlg" | sed 's/^........//' > "$d.pdbqt"
    # convert the pdbqt file into pdb
    #obabel -ipdbqt $d.pdbqt -opdb -O../top_poses/$d.pdb
    cd ..
done
When I try
awk -v line="$((LINE + 14))" "NR>=line; /DOCKED: ENDMDL/{exit}" "$d.dlg" | sed 's/^........//' > "$d.pdbqt"
on its own in the shell terminal, it works. But in the script it outputs an empty file.
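An unconditional exit stops at the first DOCKED: ENDMDL anywhere in the file, so it prints nothing when that string occurs before the target line:
awk -v line="$LINE" 'NR>=line; /DOCKED: ENDMDL/{exit}' db_000022_model1.dlg
Or, per your requirement to handle DOCKED: ENDMDL occurring before the target line, test for the string only once the target line has been reached:
awk -v line="$LINE" 'NR>=line{print; if (/DOCKED: ENDMDL/) exit}' db_000022_model1.dlg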
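To see the difference between the two commands, here is a small sketch run against a made-up sample. The temporary file, its contents, and the value of LINE are assumptions for illustration only and are not taken from a real .dlg file. It shows that the unconditional exit yields empty output when DOCKED: ENDMDL appears before the target line, which is a likely (though unconfirmed) cause of the empty file in the script.

```shell
#!/bin/bash
# Hypothetical sample data: three short "models", each closed by DOCKED: ENDMDL.
sample=$(mktemp)
cat > "$sample" <<'EOF'
DOCKED: MODEL 1
DOCKED: ATOM a
DOCKED: ENDMDL
DOCKED: MODEL 2
DOCKED: ATOM b
DOCKED: ENDMDL
DOCKED: MODEL 3
EOF

LINE=4   # assumed start line: "DOCKED: MODEL 2"

# Unconditional exit: awk quits at line 3, before line 4 is ever reached,
# so nothing is printed.
first=$(awk -v line="$LINE" 'NR>=line; /DOCKED: ENDMDL/{exit}' "$sample")

# Exit tested only from the target line onward: prints lines 4 through 6.
second=$(awk -v line="$LINE" 'NR>=line{print; if (/DOCKED: ENDMDL/) exit}' "$sample")

printf 'first: [%s]\n' "$first"
printf 'second:\n%s\n' "$second"

rm -f "$sample"
```

In the script from the question, the same idea would mean keeping line="$((LINE + 14))" and swapping only the awk program for the second form. The rest of the pipeline (the sed that strips the first eight characters) would stay unchanged.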