Search for a key-value pair and append the value to other keys in Unix

Question

I need to search for a key and append its value to every key:value pair in a Unix file.

Input file data:

1A:trans_ref_id|10:account_no|20:cust_name|30:trans_amt|40:addr
1A:trans_ref_id|10A:ccard_no|20:cust_name|30:trans_amt|40:addr

The output I want:

account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|10A:ccard_no
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|40:addr

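Basically, I need the value of 10 or 10A appended to every key:value pair, with each pair split onto a new line. To be clear, this will not always be the second field.

I am new to sed, awk and perl. I started with awk:

To extract the value: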

awk -v FS="|" -v key="59" '$1 == key {print $2}' target.txt

Answer 1

# Looks for 10 or 10A
perl -F'\|' -lane'my ($id) = map /^10A?:(.*)/s, @F; print "$id|$_" for @F'

# Looks for 10 or 10<non-digit><maybe more>
perl -F'\|' -lane'my ($id) = map /^10(?:\D[^:]*)?:(.*)/s, @F; print "$id|$_" for @F'

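-n runs the program once for each line of input.
-l strips the trailing LF on read and adds it back on print.
-a splits the line on | (the pattern given by -F) into @F.
The first statement extracts whatever follows the : in the field whose ID is 10 or 10-plus-something.
The second statement prints one line per field.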

Answer 2

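If you are still unsure where to start: set the field separator and the output field separator (FS and OFS) to '|', which splits each record into fields at every '|'. Your fields are then available as $1, $2, ... $NF. You care about getting, e.g., account_no from field two ($2), so you split() field two on the ':' separator, saving the pieces in an array (a is used below). You want the second part of field two, held in the second array element a[2], to become the new field 1 of the output.

All that remains is to loop over each field and output a[2], a separator, and then the current field. You can do that with: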

awk  'BEGIN{FS=OFS="|"} {split ($2,a,":"); for(i=1;i<=NF;i++) print a[2],$i}' file

Example Use/Output

With your sample input in file, the result is:

account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|10A:ccard_no
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|40:addr

This seems to be what you are after. Let me know if you have further questions.
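
If it is unclear what split() leaves in the array, a small check like the following may help. It is a minimal sketch: show_split is a hypothetical helper name, and the input is a single field from the sample data rather than a whole record.

```shell
# show_split is a hypothetical helper: it splits one "key:value" string on ":"
# and prints the number of pieces followed by the first two array elements.
show_split() {
    awk '{ n = split($0, a, ":"); print n, a[1], a[2] }'
}

printf '%s\n' '10:account_no' | show_split
```

Here a[1] holds the key and a[2] the value, which is the element the command above prints in front of every field.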

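"10" or "10A" in an unknown field

You can handle the field containing "10" or "10A" in any position. Just add a loop that walks the fields, determines which one holds "10" or "10A", and saves the second element of the array produced by split() on that field. The rest is the same, e.g.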

awk  '
    BEGIN { FS=OFS="|" } 
    {   for (i=1;i<=NF;i++){ 
            split ($i,a,":")
            if (a[1]=="10"||a[1]=="10A"){ 
                key=a[2]
                break
            }
        }
        for (i=1;i<=NF;i++)
            print key, $i
    }
' file1

Example Input

1A:trans_ref_id|10:account_no|20:cust_name|30:trans_amt|40:addr
1A:trans_ref_id|20:cust_name|30:trans_amt|10A:ccard_no|40:addr

Example Use/Output

awk  '
>     BEGIN { FS=OFS="|" }
>     {   for (i=1;i<=NF;i++){
>             split ($i,a,":")
>             if (a[1]=="10"||a[1]=="10A"){
>                 key=a[2]
>                 break
>             }
>         }
>         for (i=1;i<=NF;i++)
>             print key, $i
>     }
> ' file1
account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|10A:ccard_no
ccard_no|40:addr

It picks the correct new field 1 for the output from the 4th field, which holds "10A", in the second line above.

Let me know if this is what you need.
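
One thing to watch with the loop above: key is never cleared, so a line with no "10" or "10A" field would reuse the key from the previous line. The sketch below resets it for every record. The append_key wrapper name is hypothetical, and skipping lines that lack the key is an assumption about the desired behavior, not something the question specifies.

```shell
# append_key is a hypothetical wrapper around the same field loop as above.
append_key() {
    awk '
        BEGIN { FS = OFS = "|" }
        {
            key = ""                          # forget the key from the previous line
            for (i = 1; i <= NF; i++) {
                split($i, a, ":")
                if (a[1] == "10" || a[1] == "10A") {
                    key = a[2]
                    break
                }
            }
            if (key == "") next               # assumption: skip lines without 10/10A
            for (i = 1; i <= NF; i++)
                print key, $i
        }
    ' "$@"
}

# The second line has no 10/10A field, so nothing is printed for it.
printf '%s\n' '1A:x|10:account_no' '1A:y|20:cust_name' | append_key
```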

Answer 3

EDIT: To look for the 10 or 10A value anywhere in the line and then print accordingly, try the following.

awk '
BEGIN{
  FS=OFS="|"
}
match($0,/(10|10A):[^|]*/){
  split(substr($0,RSTART,RLENGTH),arr,":")
}
{
  for(i=1;i<=NF;i++){
    print arr[2],$i
  }
}'  Input_file

Explanation: Adding a detailed explanation for the above.

awk '                        ##Starting awk program from here.
BEGIN{                       ##Starting BEGIN section of this program.
  FS=OFS="|"                 ##Setting FS and OFS to | here.
}
match($0,/(10|10A):[^|]*/){  ##Using match function to match either 10: till | OR 10A: till | here.
  split(substr($0,RSTART,RLENGTH),arr,":") ##Splitting matched substring into array arr with delimiter of : here.
}
{
  for(i=1;i<=NF;i++){        ##Running for loop over each field of each line.
    print arr[2],$i          ##Printing 2nd element of arr, along with current field.
  }
}'  Input_file               ##Mentioning Input_file name here.

With your shown samples, please try the following.

awk '
BEGIN{
  FS=OFS="|"
}
{
  split($2,arr,":")
  print arr[2],$1
  for(i=2;i<=NF;i++){
    print arr[2],$i
  }
}
' Input_file

Answer 4

I need the value of 10 or 10A appended to every key:value pair

Following these requirements, you can try this awk:

awk '
BEGIN{FS=OFS="|"}
match($0, /\|10A?:[^|]+/) {
   s = substr($0, RSTART, RLENGTH)
   sub(/.*:/, "", s)
}
{
   for (i=1; i<=NF; ++i)
      print s, $i
}' file

account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|10A:ccard_no
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|40:addr
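
Note that the pattern above begins with \|, so it only matches when the 10 or 10A field is not the first field on the line. A possible variant that also accepts the start of the line is sketched below. The find_key wrapper name is hypothetical, and clearing s on every record (so that a line without the key prints an empty first column) is an assumption, not part of the answer above.

```shell
# find_key is a hypothetical wrapper; (^|\|) accepts either line start or a pipe.
find_key() {
    awk '
        BEGIN { FS = OFS = "|" }
        {
            s = ""                                 # assumption: no carry-over between lines
            if (match($0, /(^|\|)10A?:[^|]+/)) {
                s = substr($0, RSTART, RLENGTH)
                sub(/^[^:]*:/, "", s)              # drop the optional "|" and the "10:" or "10A:" key
            }
            for (i = 1; i <= NF; ++i)
                print s, $i
        }
    ' "$@"
}

# The key field comes first here, which the \|-only pattern would miss.
printf '%s\n' '10:account_no|1A:trans_ref_id' | find_key
```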

Answer 5

Perl script implementation

use strict;
use warnings;
use feature 'say';

my $fname = shift || die "run as 'script.pl input_file key0 key1 ... key#'";

# 'or' is used instead of '||' so that die applies to open, not to $fname
open my $fh, '<', $fname or die "Cannot open $fname: $!";

while( <$fh> ) {
    chomp;
    my %data = split(/[:\|]/, $_);
    for my $key (@ARGV) {
        if( $data{$key} ) {
            say "$data{$key}|$_" for split(/\|/,$_);
        }
    }
}

close $fh;

Run as script.pl input_file 10 10A

Output

account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|10A:ccard_no
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|40:addr

Answer 6

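Here is another perl solution:

perl -pe '($id) = /(?<![^|])10A?:([^|]+)/; s/([^|]+)[|\n]/$id|$1\n/g'

($id) = /(?<![^|])10A?:([^|]+)/ captures the string after 10: or 10A: and saves it in the $id variable. The lookbehind (?<![^|]) requires the key to sit at the start of the line or right after a |. The first such match in the line is captured.
s/([^|]+)[|\n]/$id|$1\n/g then prefixes every field with $id and a | character and puts each field on its own line.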