Getting unique values from the first N characters

Question

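I can't quite figure out how to solve my problem, even though I have used grep, uniq and sort before. I would really appreciate help with this :)

I want to keep only the lines of my input file that are unique by their first 6 characters, producing the output shown below. I am not sure whether I need uniq, grep, or awk; maybe someone can help me.

My file looks like this: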

Field1     Field2    Field3
value1   some_stuff  something
value2   another     fake  
value1   fake        value    
value3   blah        blah
value2   blah        fake 


Preferred output:

Field1    Field2    Field3
value1   some_stuff something
value2   another    fake
value3   blah       blah

Answer 1

Could you please try the following:

awk 'FNR==1{print;next} !a[substr($0,1,6)]++' Input_file

Explanation: the same code as above, with comments added.

awk '
FNR==1{                     ##Check whether this is the first line of the file; if so, do the following.
  print                     ##Print the current line, which is the header line.
  next                      ##next skips the remaining statements for this line and moves on to the next line.
}                           ##Close the condition block here.
!a[substr($0,1,6)]++        ##Use the first 6 characters of the line as an index into array a; the test is true only the first time that index is seen, and the count is then incremented.
                            ##awk works on condition/pattern and action pairs; no action is given here, so the default action (printing the line) runs.
'  Input_file               ##Input_file is the name of the input file.
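To see the command at work, here is a small sketch that recreates the sample input from the question in a temporary file and runs the one-liner against it. The variable names `input_file`, `result` and `result_n`, and the awk variable `n`, are only illustrative; the "Filed2" typo in the header is assumed to be meant as "Field2".

```shell
# Recreate the sample input from the question in a temporary file.
input_file=$(mktemp)
cat > "$input_file" <<'EOF'
Field1     Field2    Field3
value1   some_stuff  something
value2   another     fake
value1   fake        value
value3   blah        blah
value2   blah        fake
EOF

# Keep the header, then print only the first line seen for each 6-character prefix.
result=$(awk 'FNR==1{print;next} !a[substr($0,1,6)]++' "$input_file")

# Same idea with the prefix length passed in as an awk variable (n is an illustrative name).
result_n=$(awk -v n=6 'FNR==1{print;next} !a[substr($0,1,n)]++' "$input_file")

printf '%s\n' "$result"

rm -f "$input_file"
```

This should print the header followed by the first value1, value2 and value3 lines, with the original spacing of each line left untouched. Passing the length with `-v n=6` is one way to reuse the command for a different N.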

If your first field is always exactly 6 characters long, you can use the following instead.

awk '!a[$1]++' Input_file

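About the !a[$1]++ part: it checks whether the first-column value of the current line was already stored in an array (named a here) while an earlier line was being parsed.

If it was (a[$1] != 0), the line is not printed. Otherwise the line is printed and the value is stored for the lines that follow (a[$1]++, so a[$1] = a[$1] + 1 and a[$1] becomes 1). See this Unix answer.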
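The idiom may be easier to follow when written out long-hand. The sketch below runs both forms on a few made-up lines (the data and the names `data`, `short`, `long` and `seen` are hypothetical, not from the question) and should give identical results for each.

```shell
# Hypothetical sample data: the first column is the key.
data='value1 a
value2 b
value1 c
value3 d
value2 e'

# Short idiom: the pattern is true only the first time a first field is seen,
# so the default action (print) runs once per distinct key.
short=$(printf '%s\n' "$data" | awk '!seen[$1]++')

# Long-hand equivalent: test the counter, print if it is still zero, then increment it.
long=$(printf '%s\n' "$data" | awk '{ if (seen[$1] == 0) print; seen[$1]++ }')

printf '%s\n' "$short"
```

In both forms an array element that has never been set compares equal to 0, which is why the first occurrence of each key passes the test.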
