Comparing two files and adding the missing values to a file

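I have a large file (file_new.txt) in which a set of attributes and their values appears multiple times. In some of these sets, certain attributes and their values are missing compared with the attributes in a sample file (sample.txt).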

Sample.txt

apple = 0
black = 0
cat = 0
dog = 0
elephant = 0

file_next.txt

apple = 6
black = 7
elephant = 8
==============
apple=9
cat = 10
elephant =11

I am looking for output like the following (attributes from sample.txt that are missing in file_new.txt should be added to it with a value of zero):

file_output.txt

apple = 6
black = 7
cat = 0
dog = 0
elephant = 8
=============
apple = 9
black = 0
cat = 10
dog = 0
elephant = 11

Note: the first and last attributes are always present (here, apple and elephant).

Thanks

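awk -F '[[:blank:]]*=[[:blank:]]*' '
   # Print every sample line whose attribute was not seen in the current set,
   # then reset the flags for the next set.
   function Feed() {
      for( Key in ToAdd){
         if( ToAdd[ Key] == 1) print Sample[ Key]
          else ToAdd[ Key] = 1
         }
      return
      }
   # First file: remember each sample line and flag it as "to add".
   FNR == NR { Sample[$1]=$0;ToAdd[$1]=1}
   # Data line: the attribute is present, so clear its flag and print the line.
   FNR != NR && $0 !~ /^=====/ { ToAdd[$1]=0; print }
   # Separator: add what was missing from the set that just ended.
   $0 ~ /^=====/ { Feed(); print }
   END { Feed() }
   ' Sample.txt file_new.txt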

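This uses:

  • associative arrays: one holding the sample lines, and one acting as a per-attribute flag that says whether the sample line still has to be printed
  • a function, to avoid writing the same code twice (at each ===== separator and at the end of the input)

The order of the files is mandatory: Sample.txt must be given first.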
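The reason the file order matters is the FNR == NR test, which is only true while awk is reading the first file on the command line. A minimal sketch of that idiom follows; the helper name tag_lines and the throwaway file names are assumptions made for illustration and are not part of the answer.

```shell
#!/bin/sh
# tag_lines FIRST SECOND  (hypothetical helper, for illustration only)
# FNR restarts at 1 for each input file while NR keeps counting, so
# FNR == NR is true only while the first file on the command line is read.
tag_lines() {
    awk 'FNR == NR { print "sample:", $0; next }
                   { print "data:", $0 }' "$1" "$2"
}

# Demo on throwaway files (names are assumptions, not from the question).
tmp=$(mktemp -d)
printf 'apple = 0\ncat = 0\n' > "$tmp/sample.txt"
printf 'apple = 6\n'          > "$tmp/data.txt"

tag_lines "$tmp/sample.txt" "$tmp/data.txt"   # sample first: roles are right
tag_lines "$tmp/data.txt" "$tmp/sample.txt"   # swapped: roles are reversed
rm -rf "$tmp"
```

In the second call the data line is tagged as the sample, which is what would happen to the script above if file_new.txt were listed before Sample.txt.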

$ cat tst.awk
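BEGIN   { FS="[[:space:]]*=[[:space:]]*"; OFS=" = " }
NR==FNR { names[++numNames] = $1; dflt[$1] = $2; next }   # sample: keep order and defaults
/^=+$/  { prtRec(); print; next }                          # separator ends a record
{ curr[$1] = $2 }                                          # value seen in this record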
END { prtRec() }

function prtRec() {
    for (nameNr=1; nameNr<=numNames; nameNr++) {
        name = names[nameNr]
        print name, (name in curr ? curr[name] : dflt[name])
    }
    delete curr
}

$ awk -f tst.awk sample.txt file_next.txt
apple = 6
black = 7
cat = 0
dog = 0
elephant = 8
==============
apple = 9
black = 0
cat = 10
dog = 0
elephant = 11

Or, if you don't care about the order of the lines within each output record, it is even simpler:

$ cat tst2.awk
BEGIN   { FS="[[:space:]]*=[[:space:]]*"; OFS=" = " }
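NR==FNR { dflt[$1] = $2; next }      # sample: default value per attribute
/^=+$/  { prtRec(); print; next }    # separator ends a record
{ curr[$1] = $2 }                    # value seen in this record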
END { prtRec() }

function prtRec() {
    for (name in dflt) {
        print name, (name in curr ? curr[name] : dflt[name])
    }
    delete curr
}

$ awk -f tst2.awk sample.txt file_next.txt
apple = 6
elephant = 8
cat = 0
black = 7
dog = 0
==============
apple = 9
elephant = 11
cat = 10
black = 0
dog = 0
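
The order in which `for (name in dflt)` visits the array is unspecified in awk, so the listing above may come out in a different order with another awk. A minimal sketch for checking a result regardless of line order is shown below. The helper name check_records is hypothetical, and it assumes the same "name = value" lines and "=====" separators as the files above.

```shell
#!/bin/sh
# check_records SAMPLE OUTPUT  (hypothetical helper, not part of the answer)
# Prints one line per attribute that is missing from a record of OUTPUT and
# returns non-zero if anything was missing. Order within a record is ignored.
check_records() {
    awk '
    BEGIN   { FS = "[[:space:]]*=[[:space:]]*"; rec = 1 }
    NR==FNR { want[$1]; next }           # attribute names from the sample
    /^=+$/  { check(); rec++; next }     # separator closes a record
            { seen[$1] }                 # attribute present in this record
    END     { check(); exit bad + 0 }
    function check(   name) {
        for (name in want)
            if (!(name in seen)) {
                print "record " rec ": missing " name
                bad = 1
            }
        split("", seen)                  # portable way to empty an array
    }' "$1" "$2"
}
```

It could be run as `check_records sample.txt file_output.txt`; it prints nothing and returns 0 when every record contains all the sample attributes.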