Deleting rows in a CSV via shell script based on a column's value

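I am very inexperienced with shell scripting, and I need to write a script that deletes an entire row whenever the column named Views contains the value 0. The "Views" column may not always be in the same position in the file, so I need some way to find the column's position beforehand. Is this feasible with sed or awk? Or is there something else I could use?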

Thanks!

With awk, this can be done like so:

awk -F, 'NR == 1 { for(i = 1; i <= NF; ++i) { col[$i] = i }; next } $col["Views"] != 0' filename.csv

-F, sets the field separator to a comma, since you mentioned a CSV file. The code is

NR == 1 {                    # in the first line
  for(i = 1; i <= NF; ++i) { # go through all fields
    col[$i] = i              # remember their index by name.
                             # ($i is the ith field)
  }
  next                       # and do nothing else
}

$col["Views"] != 0           # after that, select lines in which the field in
                             # the column that was titled "Views" is not zero,
                             # and do the default action on them (i.e., print)

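Note that this only filters out lines in which the Views column is exactly 0. If you also want to filter out lines in which the Views field is empty, use $col["Views"] instead of $col["Views"] != 0.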
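To see the difference between the two conditions, here is a small sketch run on made-up data. The temporary file and its contents are assumptions for illustration only. Unlike the one-liner above, both variants add a print in the NR == 1 block so that the header line survives.

```shell
# Hypothetical sample data; the Views column is deliberately not first.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Title,Views,Author
a,10,x
b,0,y
c,,z
EOF

# Variant 1: drop only the rows whose Views field is zero (the empty one stays).
not_zero=$(awk -F, 'NR == 1 { for (i = 1; i <= NF; ++i) col[$i] = i; print; next }
                    $col["Views"] != 0' "$sample")

# Variant 2: drop the rows whose Views field is zero or empty.
not_empty=$(awk -F, 'NR == 1 { for (i = 1; i <= NF; ++i) col[$i] = i; print; next }
                     $col["Views"]' "$sample")

printf '%s\n---\n%s\n' "$not_zero" "$not_empty"
rm -f "$sample"
```

With this data, the first variant keeps the row c,,z and the second one drops it; the row b,0,y is removed by both.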

awk -F ',' 'NR==1{print;for(i=1;i<=NF;++i){if($i=="Views"){y=i}}};NR>1{if($y!=0){print}}' file > new_file

Code breakdown

NR==1{                    # for the first line:
print                     # print it (this keeps the header)
for(i=1;i<=NF;++i){       # loop over all the columns and look for the
    if($i=="Views"){      # name "Views" in the first row
        y=i               # save the column number in a variable named y
    }
}
}

NR>1{                     # from line 2 onward, look at
     if($y!=0){           # the Views column:
       print              # if it does not contain 0, print the line
     }
}
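One case the breakdown above does not cover: if the header has no column named Views, y stays unset, awk reads $y as $0 (the whole line), and every row passes through unfiltered. A minimal sketch of a guard follows. The function name filter_views, the error message, and the exit status 2 are assumptions for illustration, not part of the answer.

```shell
# filter_views FILE: print FILE without the rows whose Views field is 0.
# Hypothetical helper; fails with status 2 if the header has no "Views" column.
filter_views() {
    awk -F, '
        NR == 1 {
            for (i = 1; i <= NF; ++i)
                if ($i == "Views") y = i      # remember the column number
            if (!y) {                         # header has no Views column
                print "no Views column in header" > "/dev/stderr"
                exit 2
            }
            print                             # keep the header line
            next
        }
        $y != 0                               # default action: print the row
    ' "$1"
}

# Made-up sample data to try it on.
demo=$(mktemp)
printf 'Author,Title,Views\nx,a,10\ny,b,0\n' > "$demo"
filtered=$(filter_views "$demo")
printf '%s\n' "$filtered"

# A file without the column makes the helper fail instead of passing rows through.
printf 'Author,Title\nx,a\n' > "$demo"
status=0
filter_views "$demo" 2>/dev/null || status=$?
rm -f "$demo"
```

Failing loudly is a design choice here; a script that silently copies the input when the column is missing is harder to debug.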
awk '($1 == "badString") && !($1 ~ /[.]/) { next } 1' inputfile > outputfile

# if the first column equals badString and does not contain a . (dot), do not include the line in the output file