Change numbers to specific names in two different columns using a bash script (awk, sed, etc.)

Question

My input (a small part of my file; I also have to run this program on 100 files):

86834 SOL4504
86955 SOL5240
86963 SOL4251
SOL15 38222
SOL17 35642
SOL110 41053

My desired output:

MGD674 SOL4504
MGD675 SOL5240
MGD675 SOL4251
SOL15 MGD297
SOL17 MGD277
SOL110 MGD319

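In my program I want to replace numbers with specific names. Every number from 1 to 129 becomes the name MGD1 (for example, number 1 becomes MGD1, number 92 becomes MGD1, number 12905 becomes MGD101, and so on). I have to do this in 100 files as well.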
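
The rule can be written as integer arithmetic. The sketch below assumes the groups are exactly 1 to 129, 130 to 258, and so on, as described above; the function name mgd_name is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an atom number to its MGD name,
# assuming blocks of 129 consecutive numbers starting at 1.
mgd_name() {
    # Shell arithmetic is integer-only, so the division truncates.
    echo "MGD$(( ($1 - 1) / 129 + 1 ))"
}

for n in 1 92 129 130 12905 86834; do
    printf '%s -> %s\n' "$n" "$(mgd_name "$n")"
done
```

This prints MGD1 for 1, 92 and 129, MGD2 for 130, MGD101 for 12905 and MGD674 for 86834.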

At first I wanted to do it this way, but you are welcome to write completely different code:

#!/bin/bash
MGD_atom_index=1
number=1
MGD_mol_index=MGD$number
for index in {1..100}   # I run this script on 100 files, hence the for loop
do
    # 900 passes per file, one per name (MGD1, MGD2, ...): the highest number
    # is 116100, so the highest name is MGD900 (116100/129 = 900).
    for MGD_index in {1..900}
    do
        # This line gets very long: the s/$(($MGD_atom_index+x))/$MGD_mol_index/g
        # part has to be written out for every x up to 128.
        sed -i "s/$MGD_atom_index/$MGD_mol_index/g;s/$(($MGD_atom_index+1))/$MGD_mol_index/g;s/$(($MGD_atom_index+2))/$MGD_mol_index/g; ..... ;s/$(($MGD_atom_index+128))/$MGD_mol_index/g" "new2_$index.ndx"
        # Move on to the next block of numbers: first 1 to 129 become MGD1,
        # then 130 to 258 become MGD2, and so on.
        MGD_atom_index=$(($MGD_atom_index+129))
        number=$(($number+1))
        MGD_mol_index=MGD$number   # the next name to substitute, e.g. MGD2
    done
    # Reset everything to 1 before moving on to the next file.
    MGD_atom_index=1
    number=1
    MGD_mol_index=MGD$number
done

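But there is a problem: this code gets extremely long, because I would have to write s/$(($MGD_atom_index+x))/$MGD_mol_index/g; 129 times (with x running from 0 to 128), and I also suspect my program would be slow. Is there a better way to do this?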

Answer 1

I think this awk is what you need.

awk '
    $1~/^[0-9]+$/{$1="MGD" int($1/129+1)}
    $2~/^[0-9]+$/{$2="MGD" int($2/129+1)}
    1
' file

Answer 2

$ cat tst.awk
BEGIN { grp = 129 }
{
    for (i=1; i<=NF; i++) {
        if ( $i == ($i+0) ) {
            $i = "MGD" (int($i/grp)+1)
        }
    }
    print
}

$ awk -f tst.awk file
MGD674 SOL4504
MGD675 SOL5240
MGD675 SOL4251
SOL15 MGD297
SOL17 MGD277
SOL110 MGD319
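
One edge case may be worth checking. The question maps 1 to 129 onto MGD1 and expects MGD900 as the highest name, but int($i/grp)+1 yields MGD2 for exactly 129 and MGD901 for 116100. If exact multiples of 129 can occur in the data, subtracting 1 before dividing keeps them in the lower group. A minimal sketch of that variant, with the program passed inline instead of via tst.awk (the function name to_mgd is hypothetical):

```shell
# Variant of the script above: (n-1)/grp keeps 129, 258, ... in the lower group.
to_mgd() {
    awk -v grp=129 '
    {
        for (i = 1; i <= NF; i++) {
            if ($i == ($i + 0)) {              # field looks numeric
                $i = "MGD" (int(($i - 1) / grp) + 1)
            }
        }
        print
    }'
}

printf '%s\n' '129 SOL1' 'SOL2 130' '116100 SOL3' '86834 SOL4504' | to_mgd
```

For every number that is not a multiple of 129, including all six sample lines from the question, both formulas give the same name.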

So, to do this from a shell script, using GNU awk for "inplace" editing:

#!/usr/bin/env bash
awk -i inplace '
BEGIN { grp = 129 }
{
    for (i=1; i<=NF; i++) {
        if ( $i == ($i+0) ) {
            $i = "MGD" (int($i/grp)+1)
        }
    }
    print
}
' 'new2_'{1..100}'.ndx'

or with any awk:

#!/usr/bin/env bash
tmp=$(mktemp) || exit 1
for index in {1..100}; do
    awk '
    BEGIN { grp = 129 }
    {
        for (i=1; i<=NF; i++) {
            if ( $i == ($i+0) ) {
                $i = "MGD" (int($i/grp)+1)
            }
        }
        print
    }
    ' "new2_$index.ndx" > "$tmp" && mv "$tmp" "new2_$index.ndx"
done
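
If the file names do not follow the new2_1.ndx to new2_100.ndx pattern exactly, the same loop can take the files as arguments instead. The following is only a sketch under that assumption: rename_in_files is a made-up name, and the awk body is unchanged from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: rewrite every file given as an argument, using any POSIX awk.
rename_in_files() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        if awk -v grp=129 '
            {
                for (i = 1; i <= NF; i++)
                    if ($i == ($i + 0))
                        $i = "MGD" (int($i / grp) + 1)
                print
            }' "$f" > "$tmp"
        then
            mv "$tmp" "$f"          # replace the original only if awk succeeded
        else
            rm -f "$tmp"            # leave the original untouched on failure
            return 1
        fi
    done
}

# Usage: rename_in_files new2_*.ndx
```

Note that mktemp creates its file readable and writable by the owner only, so the replaced files may end up with different permissions than before.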

Answer 3

This might work for you (GNU sed and bash):

sed -E 's#\b([0-9]+)\b#MGD$((\1/129+1))#g;s/.*/echo "&"/e' file

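This converts every group of digits to the required form by substituting a shell arithmetic expression prefixed with MGD, then evaluates that expression by running the line through an echo command.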
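
To see the two stages separately, the first substitution can be run on its own. This is only an illustration and assumes GNU sed (for \b and the e flag); the sample line is taken from the question.

```shell
# Stage 1 only: numbers become unevaluated shell arithmetic expressions.
printf '86834 SOL4504\n' | sed -E 's#\b([0-9]+)\b#MGD$((\1/129+1))#g'
# prints: MGD$((86834/129+1)) SOL4504

# Stage 2 added: each line is wrapped in echo "..." and executed by the shell,
# which expands the $((...)) expressions.
printf '86834 SOL4504\n' | sed -E 's#\b([0-9]+)\b#MGD$((\1/129+1))#g;s/.*/echo "&"/e'
# prints: MGD674 SOL4504
```

Because every line is handed to a shell, this approach is probably best kept to trusted input: characters such as backticks, $ or double quotes in the file would be interpreted by the shell as well.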
