Add a column to a dataset from the quotient of two other columns

Question

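I have the following dataset on Ubuntu, and I want to iterate over it in bash (with while or for) to generate a new column containing the quotient of the failed and passed subjects.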

id, name, country, Continent, grade, passed, failed
1, Louise Smith, UK, Europe, 7, 5, 1
2, Okio Kiomoto, Japan, Asia, 9, 5, 0
3, Ralph Watson, USA, Northern America, 5.6, 5, 2
4, Mary Mcaann, South Africa, Africa, 4.7, 5, 3
5, Jack Thomson, Australia, Oceania, 10, 5, 0
6, N'dongo Mbaye, Senegal, Africa, 7.9, 5, 1

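To do this, I tried the following code in a script. However, I can't get any result, because I can't find a way to add the newly generated column to the current dataset. Any ideas?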

while IFS=, read _ _ _ _ _ passed failed; do
newcolumn=$($passed/$failed |bc)

done

For guidance, the desired output is as follows.

id, name, country, Continent, grade, passed, failed, new
1, Louise Smith, UK, Europe, 7, 5, 1, 0.2
2, Okio Kiomoto, Japan, Asia, 9, 5, 0, 0
3, Ralph Watson, USA, Northern America, 5.6, 5, 2, 0.4
4, Mary Mcaann, South Africa, Africa, 4.7, 5, 3, 0.6
5, Jack Thomson, Australia, Oceania, 10, 5, 0, 0
6, N'dongo Mbaye, Senegal, Africa, 7.9, 5, 1, 0.2

Thanks

Answer 1

I refactored your code slightly and came up with the following:

#!/bin/bash

# create new header 
header=$(awk 'NR==1 {print}' s.dat)
printf "%s, new\n" "${header}"

# read data file data rows
while IFS=, read a b c d e passed failed; do
    newcolumn=0

    # avoid divide-by-zero
    if [[ "${passed}" -ne "0" ]] ; then
        newcolumn=$(bc <<<"scale=2; ${failed} / ${passed}")
    fi

    # output data with new generated column
    printf "%s %3.2f\n" "${a}, ${b}, ${c}, ${d}, ${e}, ${passed}, ${failed}, " "${newcolumn}"
done < <(awk 'NR!=1 {print}' s.dat)

Contents of s.dat:

id, name, country, Continent, grade, passed, failed
1, Louise Smith, UK, Europe, 7, 5, 1
2, Okio Kiomoto, Japan, Asia, 9, 5, 0
3, Ralph Watson, USA, Northern America, 5.6, 5, 2
4, Mary Mcaann, South Africa, Africa, 4.7, 5, 3
5, Jack Thomson, Australia, Oceania, 10, 5, 0
6, N'dongo Mbaye, Senegal, Africa, 7.9, 5, 1

Output when the script is run:

id, name, country, Continent, grade, passed, failed, new
1,  Louise Smith,  UK,  Europe,  7,  5,  1,  0.20
2,  Okio Kiomoto,  Japan,  Asia,  9,  5,  0,  0.00
3,  Ralph Watson,  USA,  Northern America,  5.6,  5,  2,  0.40
4,  Mary Mcaann,  South Africa,  Africa,  4.7,  5,  3,  0.60
5,  Jack Thomson,  Australia,  Oceania,  10,  5,  0,  0.00
6,  N'dongo Mbaye,  Senegal,  Africa,  7.9,  5,  1,  0.20
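
The doubled spaces in this output come from `IFS=,` splitting on the comma only, so every field after the first keeps its leading space. A minimal sketch of one way to trim it, assuming the separator is always a comma followed by a single space and that `passed` and `failed` are non-negative integers. The function name `add_ratio` is hypothetical, and integer fixed-point arithmetic stands in for `bc` here (like `bc` with `scale=2`, it truncates rather than rounds):

```shell
#!/bin/bash

# Hypothetical helper: reads the dataset on stdin and writes it to stdout
# with an extra "new" column (failed / passed, two decimals).
add_ratio() {
    local header a b c d e passed failed scaled

    # The first line is the header; append the new column name.
    IFS= read -r header
    printf '%s, new\n' "$header"

    while IFS=, read -r a b c d e passed failed; do
        # Strip the single leading space left over after splitting on ",".
        b=${b# } c=${c# } d=${d# } e=${e# } passed=${passed# } failed=${failed# }

        # Quotient scaled by 100; stays 0 when passed is 0 (avoids divide-by-zero).
        scaled=0
        if (( passed != 0 )); then
            scaled=$(( failed * 100 / passed ))
        fi

        printf '%s, %s, %s, %s, %s, %s, %s, %d.%02d\n' \
            "$a" "$b" "$c" "$d" "$e" "$passed" "$failed" \
            $(( scaled / 100 )) $(( scaled % 100 ))
    done
}

add_ratio <<'EOF'
id, name, country, Continent, grade, passed, failed
1, Louise Smith, UK, Europe, 7, 5, 1
3, Ralph Watson, USA, Northern America, 5.6, 5, 2
EOF
# id, name, country, Continent, grade, passed, failed, new
# 1, Louise Smith, UK, Europe, 7, 5, 1, 0.20
# 3, Ralph Watson, USA, Northern America, 5.6, 5, 2, 0.40
```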

Update, following the OP's question in the comments:

Getting the header row without awk:

header=$(head -n 1 s.dat)

Processing the data rows without awk:

{
     # extra read to skip first row
     read 

     # read data file data rows
     while IFS=, read a b c d e passed failed; do
        newcolumn=0

        # avoid divide-by-zero
        if [[ "${passed}" -ne "0" ]] ; then
            newcolumn=$(bc <<<"scale=2; ${failed} / ${passed}")
        fi

        # output data with new generated column
        printf "%s %3.2f\n" "${a}, ${b}, ${c}, ${d}, ${e}, ${passed}, ${failed}, " "${newcolumn}"
    done

} < s.dat
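
The loop prints the new dataset to standard output and leaves `s.dat` untouched. To store the result back in the file, a common pattern is to write to a temporary file and rename it over the original only if the command succeeded. A minimal sketch, where `save_output_to` is a hypothetical helper name and the demo data and `sed` command are made up for illustration:

```shell
#!/bin/bash

# Hypothetical helper: run a command and, only if it succeeds,
# replace TARGET with whatever the command printed on stdout.
# Usage: save_output_to TARGET command [args...]
save_output_to() {
    local target=$1 tmp
    shift

    # Create the temporary file next to the target so mv is a plain rename.
    tmp=$(mktemp "${target}.XXXXXX") || return 1

    if "$@" > "$tmp"; then
        mv -- "$tmp" "$target"
    else
        rm -f -- "$tmp"
        return 1
    fi
}

# Demo with made-up data: append ", new" to the header line in place.
demo=$(mktemp)
printf '%s\n' 'id, passed, failed' '1, 5, 1' > "$demo"
save_output_to "$demo" sed '1s/$/, new/' "$demo"
cat "$demo"
rm -f -- "$demo"
```

With the loop saved as a script, the call would look like `save_output_to s.dat ./script.sh` (the script name is an assumption). The replaced file takes on the permissions of the temporary file, so adjust them afterwards if that matters.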

Answer 2

Using awk

$ awk 'BEGIN { FS=OFS=", " } NR == 1 { $8="new" } NR > 1 { $8=$NF/$(NF-1) }1' input_file
id, name, country, Continent, grade, passed, failed, new
1, Louise Smith, UK, Europe, 7, 5, 1, 0.2
2, Okio Kiomoto, Japan, Asia, 9, 5, 0, 0
3, Ralph Watson, USA, Northern America, 5.6, 5, 2, 0.4
4, Mary Mcaann, South Africa, Africa, 4.7, 5, 3, 0.6
5, Jack Thomson, Australia, Oceania, 10, 5, 0, 0
6, N'dongo Mbaye, Senegal, Africa, 7.9, 5, 1, 0.2
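
The one-liner divides by `passed` unconditionally, and awk implementations typically abort with a division-by-zero error if that column is ever 0. A sketch with a guard, assuming the quotient should be 0 in that case (the same choice Answer 1 makes). The function name `add_ratio_awk` and the last demo row are hypothetical:

```shell
#!/bin/bash

# Hypothetical helper: appends failed/passed as a new last column.
# Reads the files given as arguments, or stdin when there are none.
add_ratio_awk() {
    awk 'BEGIN { FS = OFS = ", " }
         NR == 1 { print $0, "new"; next }   # header row
         {
             passed = $(NF - 1)
             failed = $NF
             # Guard the division; fall back to 0 when passed is 0.
             print $0, (passed + 0 != 0 ? failed / passed : 0)
         }' "$@"
}

add_ratio_awk <<'EOF'
id, name, country, Continent, grade, passed, failed
1, Louise Smith, UK, Europe, 7, 5, 1
7, Made Up, Nowhere, None, 0, 0, 0
EOF
```

Saving the original fields in variables and printing `$0` followed by the quotient avoids modifying `NF` in the middle of the expression.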

Answer 3

Tested and confirmed on gawk 5.1.1, mawk 1.3.4, mawk 1.9.9.6, and macOS nawk

                ______ # this pair of empty double quotes
              /        # is *** essential ***, since it forces string
             /         # compare, allowing rows getting "0" value in new  
            /          # column to print out properly
            \ 
{m,n,g}awk '"" ($(_=NF += !+FS) = !/[0-9]/ ? "new"  : \
                         ($--_) / (+$--_ ? $_ : --_^--_^_))' FS = '[,][ \t]*' 
                                                            OFS = ', '

——————————————————————————————————

id, name, country, Continent, grade, passed, failed, new
1, Louise Smith, UK, Europe, 7, 5, 1, 0.2
2, Okio Kiomoto, Japan, Asia, 9, 5, 0, 0
3, Ralph Watson, USA, Northern America, 5.6, 5, 2, 0.4
4, Mary Mcaann, South Africa, Africa, 4.7, 5, 3, 0.6
5, Jack Thomson, Australia, Oceania, 10, 5, 0, 0
6, N'dongo Mbaye, Senegal, Africa, 7.9, 5, 1, 0.2

——————————————————————————————————

(condensed version :::) 

mawk '""($(_=NF+=!+FS)=!/[0-9]/?"new":($--_)/(+$--_?$_:--_^--_^_) )' FS='[,][ \t]*' OFS=', '
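
For readers who find the condensed form hard to follow, here is a more conventional sketch that should produce the same output on the sample data. It is a rewrite under assumptions rather than a line-by-line translation: it keeps the field separator (a comma followed by optional blanks) and the header test (the only line without a digit), but it returns 0 when `passed` is 0, which may not match what the condensed expression does in that case. The function name `add_ratio_readable` is hypothetical:

```shell
#!/bin/bash

# Hypothetical helper: readable variant of the one-liner.
# Reads the files given as arguments, or stdin when there are none.
add_ratio_readable() {
    awk -F',[ \t]*' -v OFS=', ' '
        # Header: the only line that contains no digit.
        !/[0-9]/ { $(NF + 1) = "new"; print; next }
        {
            failed = $NF
            passed = $(NF - 1)
            # Assigning a new field rebuilds the record with OFS.
            $(NF + 1) = (passed + 0 != 0 ? failed / passed : 0)
            print
        }' "$@"
}

# Irregular spacing after the commas is normalized to ", " on output.
add_ratio_readable <<'EOF'
id, name, country, Continent, grade, passed, failed
4,Mary Mcaann,   South Africa, Africa, 4.7, 5, 3
EOF
```

Because each rule ends in an explicit `print`, a computed value of 0 is printed like any other row, so the string-concatenation trick is not needed here.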

bash

ubuntu