Remove lines from a file corresponding to blank lines of another file

Question

I have two files with the same number of rows and columns, delimited by ;. Example:

file_a:

1;1;1;1;1
2;2;2;2;2
3;3;3;3;3
4;4;4;4;4

file_b:

A;A;A;A;A
B;B;;;B
;;;;
D;D;D;D;D

Ignoring the delimiters, line 3 of file_b is empty. So I want to delete line 3 from file_a as well, before running the command:

paste -d ';' file_a file_b

in order to get this output:

1;1;1;1;1;A;A;A;A;A
2;2;2;2;2;B;B;;;B
4;4;4;4;4;D;D;D;D;D

Edit: every line in both files has the same number of columns, 93, so the two files have exactly the same row-and-column matrix.

Answer 1

Could you please try the following, written and tested with the shown samples in GNU awk.

awk '
BEGIN{
  FS=OFS=";"
}
FNR==NR{
  arr[FNR]=$0
  next
}
!/^;+$/{
  print arr[FNR],$0
}
' file_a file_b

Explanation: a detailed explanation of the above.

awk '                 ##Start of the awk program.
BEGIN{                ##Start of the BEGIN section.
  FS=OFS=";"          ##Set both the field separator and the output field separator to ;.
}
FNR==NR{              ##FNR==NR is TRUE only while the first file (file_a) is being read.
  arr[FNR]=$0         ##Store the current line in arr, indexed by its line number FNR.
  next                ##Skip the remaining statements for this line.
}
!/^;+$/{              ##For file_b: if the line does NOT consist solely of ; characters, do the following.
  print arr[FNR],$0   ##Print the stored file_a line with the same line number, then the current line.
}
' file_a file_b       ##Input file names.
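One detail worth noting: `^;+$` needs at least one `;`, so a line of file_b with no characters at all would not be treated as empty. That cannot happen if both files really share the same column matrix, as the question states, but if it might, testing for any non-`;` character covers both cases. A minimal sketch under that assumption; the five-row sample data, the scratch directory and the variable name `out` are made up for illustration:

```shell
# Scratch copies of the inputs; row 4 of file_b is a completely blank line,
# a hypothetical case that is not part of the data in the question.
tmp=$(mktemp -d)
printf '%s\n' '1;1;1;1;1' '2;2;2;2;2' '3;3;3;3;3' '4;4;4;4;4' '5;5;5;5;5' > "$tmp/file_a"
printf '%s\n' 'A;A;A;A;A' 'B;B;;;B' ';;;;' '' 'E;E;E;E;E' > "$tmp/file_b"

# Buffer file_a by line number, then print a pair only when the file_b line
# contains at least one character other than ";".
out=$(awk '
BEGIN { FS = OFS = ";" }
FNR == NR { arr[FNR] = $0; next }
/[^;]/    { print arr[FNR], $0 }
' "$tmp/file_a" "$tmp/file_b")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Rows 3 and 4 are both dropped here, whereas `!/^;+$/` would keep row 4.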

Answer 2

Since you mention that both files have the same number of lines, getline is a good fit here:

$ awk '(getline line < "f2")==1 && line ~ /[^;]/' f1
1;1;1;1;1
2;2;2;2;2
4;4;4;4;4

You can also do the paste inside awk:

$ awk '(getline line < "f2")==1 && line ~ /[^;]/{print $0 ";" line}' f1
1;1;1;1;1;A;A;A;A;A
2;2;2;2;2;B;B;;;B
4;4;4;4;4;D;D;D;D;D

getline returns 1 if a line was read successfully. line ~ /[^;]/ checks whether the line contains any character other than ;. If both conditions hold, the desired result is printed.
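The second file name does not have to be hard-coded inside the awk program; it can be passed in with -v. A minimal sketch using the sample data from the question, where the variable name `other`, the scratch directory and `out` are illustrative only:

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '1;1;1;1;1' '2;2;2;2;2' '3;3;3;3;3' '4;4;4;4;4' > "$tmp/file_a"
printf '%s\n' 'A;A;A;A;A' 'B;B;;;B' ';;;;' 'D;D;D;D;D' > "$tmp/file_b"

out=$(awk -v other="$tmp/file_b" '
# For each line of file_a, read the matching line of the other file.
# Keep the pair only if that read succeeded and the line contains
# at least one character other than ";".
(getline line < other) == 1 && line ~ /[^;]/ { print $0 ";" line }
' "$tmp/file_a")

printf '%s\n' "$out"
rm -rf "$tmp"
```

getline returns 0 at end of file and -1 on a read error, so comparing against 1 also stops pairs from being printed if the second file is shorter or unreadable.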

Answer 3

Basically a modification of @RavinderSingh13's solution, but I store only the NRs of the empty records:

$ awk '
NR==FNR {            # process the b file
    if($0~/^;+$/)    # when empty record met
        a[NR]        # hash the record number NR
    next
}
!(FNR in a)          # print non-empty matches of a file
' fileb filea

Output:

1;1;1;1;1
2;2;2;2;2
4;4;4;4;4

Answer 4

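Filtering after paste is easier. Assuming the input lines to exclude are formatted exactly as shown in the question, you can filter the output of paste with a grep pattern anchored to the end of the line. (Such a line ends with 5 empty fields, i.e. the ; inserted by paste followed by the 4 from file_b.)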

paste -d ';' file_a file_b | grep -v ';;;;;$'

With the input files shown in the question, this prints exactly the requested output.

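Edit:
To meet the additional requirement from the comments, the grep command can be modified to specify the number of semicolons that corresponds to the number of empty columns. For different input files, just change the number 5 accordingly.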

paste -d ';' file_a file_b | grep -v ';\{5\}$'

With the 93 columns specified in the question, the command is

paste -d ';' file_a file_b | grep -v ';\{93\}$'

Edit 2:
The required number of semicolons can also be obtained from the first line of file_b:

SEMICOLONS=$(head -1 file_b | sed 's/[^;]*//g')
paste -d ';' file_a file_b | grep -v ";$SEMICOLONS"'$'

or combined into

paste -d ';' file_a file_b | grep -v ';'$(head -1 file_b | sed 's/[^;]*//g')'$'
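As an alternative to building the semicolon string with sed, the column count can be read with awk and used as a repetition count in the pattern. A minimal sketch, assuming every line has the same number of columns as the first line of file_b; the variable names `n`, `pattern` and `out` and the scratch directory are illustrative only:

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '1;1;1;1;1' '2;2;2;2;2' '3;3;3;3;3' '4;4;4;4;4' > "$tmp/file_a"
printf '%s\n' 'A;A;A;A;A' 'B;B;;;B' ';;;;' 'D;D;D;D;D' > "$tmp/file_b"

# Number of columns in file_b, taken from its first line (5 for the sample).
n=$(awk -F';' 'NR == 1 { print NF; exit }' "$tmp/file_b")

# A pasted line whose file_b half is empty ends in n semicolons:
# the one inserted by paste plus the n-1 separators of file_b.
pattern=";{$n}\$"
out=$(paste -d ';' "$tmp/file_a" "$tmp/file_b" | grep -Ev "$pattern")

printf '%s\n' "$out"
rm -rf "$tmp"
```

grep -E is used here so the braces need no backslashes; the pattern is otherwise the same as in the commands above.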
