Read columns from a file into variables and use for substitute values in another file

Question

I have the following file, input.txt:

b73_chr10   w22_chr9
w22_chr7    w22_chr10
w22_chr8    w22_chr8

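I wrote the code shown below to read the first and second columns and, in the file output.conf, replace each first-column value with the matching second-column value. For example, I want to change b73_chr10 to w22_chr9, w22_chr7 to w22_chr10, and w22_chr8 to w22_chr8, and keep going for every row until the end of the file.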

value1=$(echo $line| awk -F\  '{print $1}' input.txt)
value2=$(echo $line| awk -F\  '{print $2}' input.txt)
sed -i '.bak' 's/$value1/$value2/g' output.conf 
cat output.conf

output.conf

    <rules>
    <rule>
    condition =between(b73_chr10,w22_chr1)
    color = ylgn-9-seq-7
    flow=continue
    z=9
    </rule>
    <rule>
    condition =between(w22_chr7,w22_chr2)
    color = blue
    flow=continue
    z=10
    </rule>
    <rule>
    condition =between(w22_chr8,w22_chr3)
    color = vvdblue
    flow=continue
    z=11
    </rule>
    </rules>

I tried the commands above, but they leave me with an empty file. Can someone point out where I went wrong?

Answer 1

I suspect that sed by itself is the wrong tool for this. You can, however, do what you're asking in bash alone:

#!/usr/bin/env bash

# Declare an associative array (requires bash 4)
declare -A repl=()

# Step through our replacement file, recording it to an array.
while read -r this that; do
  repl["$this"]="$that"
done < inp1

# Read the input file, replacing the strings noted in the array.
# IFS= and -r keep leading whitespace and backslashes intact.
while IFS= read -r line; do
  for string in "${!repl[@]}"; do
    # Quote the pattern so it is matched literally, not as a glob.
    line=${line/"$string"/${repl[$string]}}
  done
  printf '%s\n' "$line"
done < circos.conf

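This approach is of course oversimplified, so it shouldn't be used verbatim: you'll want to make sure you only edit the lines you really want to edit (for example, by checking that they match /condition =between/). Note that because this solution uses an associative array (declare -A ...), it depends on bash version 4.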

If you were to solve this with awk, the same basic principle would apply:

#!/usr/bin/awk -f

# Collect the translations from the first file.
NR==FNR { repl[$1]=$2; next }

# Step through the input file, replacing as required.
{
  for ( string in repl ) {
    sub(string, repl[string])
  }
}

# And print.
1

You'd run this with the translations file as the first argument and the input file as the second:

$ ./thisscript translations.txt circos.conf
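
As a rough sketch of the same idea driven from the shell, the function below only touches lines that contain condition =between( and prints the result to standard output, since awk does not edit files in place. The name replace_conditions is made up for illustration and is not part of the answer. It assumes the mapping values contain no regex metacharacters and no &, because sub() treats its first argument as a regular expression.

```shell
#!/bin/sh
# Sketch only: replace_conditions is a hypothetical helper name.
# $1 = two-column mapping file, $2 = config file to rewrite.
replace_conditions() {
  awk '
    # First file: remember old value -> new value (skip incomplete rows).
    NR == FNR { if (NF >= 2) repl[$1] = $2; next }

    # Second file: apply the replacements only on the condition lines.
    /condition =between\(/ {
      for (string in repl) {
        sub(string, repl[string])
      }
    }

    # Print every line, changed or not.
    { print }
  ' "$1" "$2"
}
```

To overwrite the original, one option is to write to a temporary file first: replace_conditions input.txt circos.conf > circos.conf.new && mv circos.conf.new circos.conf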

Answer 2

Before you read on to the better solutions, here is an explanation of what you did wrong.
A fixed version of your script would be

while read -r line; do
   value1=$(echo "$line"| awk -F" "  '{print $1}')
   value2=$(echo "$line"| awk -F" "  '{print $2}')
   sed -i "s/$value1/$value2/g" circos.conf 
done < input.txt

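What has changed here?

Added while read -r line; do ... done < input.txt, because your $line was never initialized.
awk is called with -F" " instead of -F\ , with a space between the quotes.
awk is called without input.txt, because it should read from the pipe, not from the file.
sed uses double quotes, because the variables must be expanded by the shell.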
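
The quoting point can be checked in isolation. The sketch below runs one line through sed twice, once with single quotes and once with double quotes. The variable names single and double are only for illustration.

```shell
#!/bin/sh
# Sketch: the same substitution with single and with double quotes.
value1=b73_chr10
value2=w22_chr9
line='condition =between(b73_chr10,w22_chr1)'

# Single quotes: the shell does not expand the variables, so sed receives
# the characters $value1 and $value2 as written and finds nothing to replace.
single=$(printf '%s\n' "$line" | sed 's/$value1/$value2/g')

# Double quotes: the shell expands the variables before sed runs,
# so sed sees s/b73_chr10/w22_chr9/g.
double=$(printf '%s\n' "$line" | sed "s/$value1/$value2/g")

printf 'single: %s\n' "$single"
printf 'double: %s\n' "$double"
```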

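What is wrong with this solution?
First, you have to hope that the values in input.txt are sed-friendly (no slashes or other special characters). Second, on large files you keep looping: every line of input.txt starts two awk processes and one sed run that rewrites circos.conf. awk can handle the loop by itself, and you should avoid calling awk inside a loop.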
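
To illustrate the sed-friendly concern, here is a small sketch with made-up values that contain a slash (chr/10 and chr/9 are hypothetical, not taken from the question). Switching the delimiter only avoids the slash problem. Regex metacharacters and & in the values would still need escaping.

```shell
#!/bin/sh
# Sketch: a slash inside the values breaks a slash-delimited s command.
value1='chr/10'   # hypothetical value containing a slash
value2='chr/9'    # hypothetical value containing a slash
line='condition =between(chr/10,w22_chr1)'

# With "/" as the delimiter, the slashes in the values end the pattern early
# and sed rejects the command.
if printf '%s\n' "$line" | sed "s/$value1/$value2/g" >/dev/null 2>&1; then
  slash_ok=yes
else
  slash_ok=no
fi

# A delimiter that does not occur in the values avoids this particular problem.
fixed=$(printf '%s\n' "$line" | sed "s|$value1|$value2|g")

printf 'slash delimiter accepted: %s\n' "$slash_ok"
printf 'pipe delimiter result: %s\n' "$fixed"
```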

When input.txt is small, you might want

sed -i -e 's/b73_chr10/w22_chr9/g' \
       -e 's/w22_chr7/w22_chr10/g' \
       -e 's/w22_chr8/w22_chr8/g' circos.conf

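Now the comment by @alvits makes sense: put all these sed commands in a sed command file. If you cannot change the format of input.txt, you can convert it with a script, but using the array as in @Ghoti's solution is better.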
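
One possible way to build such a sed command file from input.txt is sketched below. The helper name make_sed_script and the file name replace.sed are invented for this example, and it assumes the values are sed-friendly as discussed above.

```shell
#!/bin/sh
# Sketch only: make_sed_script is a hypothetical helper name.
# $1 = two-column mapping file; prints one s/old/new/g command per row.
make_sed_script() {
  awk 'NF >= 2 { printf "s/%s/%s/g\n", $1, $2 }' "$1"
}

# Possible usage (file names are examples):
#   make_sed_script input.txt > replace.sed
#   sed -f replace.sed circos.conf > circos.conf.new
```

sed applies the commands in file order to every line, so a value produced by one row can be picked up by a later row. With the three rows in the question that does not happen.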
