Find and modify a pattern between start and end patterns and update the file

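I want to find and modify a pattern that lies between a start pattern and an end pattern, and update multiple files that way. In case this can be done with awk/sed, here are the steps broken down.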

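  1. Find the strings that occur between 'startpat' and 'endpat' (capture the lines between start and end).
  2. Modify the strings in that copy, for example: change 'sss: ccc' to 'sss: ddd' and 'brr: mmm' to 'brr: rel/ccc'.
  3. Build a new set of lines from 'startpat' to 'endpat' using the updated strings from step 2.
  4. Insert that set at the top of the file, right after the "---" line.
  5. If the strings 'sss: aaa' and 'brr: rel/aaa' match, delete the last occurrence of the set of lines from 'startpat' to 'endpat'.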

Note: the most important thing is to preserve indentation, because I am working with JSON/YAML files.
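
On the indentation point, a minimal sketch of the behaviour that matters in awk: editing the whole record with sub() leaves leading whitespace alone, while assigning to a field makes awk rebuild the line from its fields and drop the indentation. The sample line and the shell variable names `line`, `kept` and `lost` are made up for illustration.

```sh
line="    sss: ccc"

# sub() edits $0 in place, so the four leading spaces survive
kept=$(printf '%s\n' "$line" | awk '{ sub(/sss: ccc/, "sss: ddd"); print }')

# assigning to a field rebuilds $0 from the fields joined by OFS,
# which discards the leading whitespace
lost=$(printf '%s\n' "$line" | awk '{ $2 = "ddd"; print }')

printf '[%s]\n[%s]\n' "$kept" "$lost"
# prints:
# [    sss: ddd]
# [sss: ddd]
```

For YAML this suggests sticking to sub()/gsub() on the full line rather than touching individual fields.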

Input file format (PS: ignore the // comment lines when parsing the file):

---
 - startpat:        // Startpat - make this line inclusive
    ...
    sss: ccc        // pattern to be modified
    ppp: 'vvv'
    pname: 'vvv'
    brr: 'mmm'      // pattern to be modified
    jdk: jdk8
    jdks:
      - jdk8
      - jdk7
    file:
      - test:
          exec: 'input'
    ...

 - startpat:        // Endpat - make this line exclusive

Expected output after processing:

---
 - startpat:
    sss: ddd
    ppp: 'vvv'
    pname: 'vvv'
    brr: 'mmm'
    jdk: jdk8
    jdks:
      - jdk8
      - jdk7
    file:
      - test:
          exec: 'input'

 - startpat:        // Startpat
    ....
    sss: ccc
    ppp: 'vvv'
    pname: 'vvv'
    brr: 'mmm'
    jdk: jdk8
    jdks:
      - jdk8
      - jdk7
    file:
      - test:
          exec: 'input'
    ...

 - startpat:        // Endpat

I think the simplest approach is to save every line in an array. Here is a starting point:

$ cat f.awk
BEGIN {
    # build regular expressions to match "start pattern" and
    # "end pattern" (in the question they are the same)

    ws = "[\t ]*"            # white-spaces
    sp = "^" ws "- startpat:" # [s]tart [p]attern
    ep = sp                   # [e]nd   [p]attern

    # a regular expression to match "---"
    # possibly surrounded by whitespace
    op = "^" ws "---" ws "$" # where to start appending
}

{ f[NR] = $0 } # save every line to an array

END {
    n = NR # number of lines in the file

    find_blocks() # set `nb` (number of blocks), `ss` `ee`

    for (ib = 1; ib <= nb; ib++)
        process_block(ss[ib], ee[ib]) # pass start and end of each block
                                      # sets `nex` (number of extra lines) and `eex`
    write()
}

function find_blocks(   i, l, is, ie) {
    for (i = 1; i <= n; i++) {
        l = f[i]
        if (is > ie && l ~ ep) ee[++ie] = i # end
        if (           l ~ sp) ss[++is] = i # start
    }
    nb = ie
}

function process_block(is, ie,   i, l) {
    for (i = is + 1; i <= ie - 1; i++) {
        l = f[i]
        # modify a line (an example)
        if (l ~ /brr:/) sub(/'mmm'/, "'rel/cc'", l)

        eex[++nex] = l # push the line to another array
    }
}

function write(   i, j, l) {
    i = 1
    while (i <= n) { # print everything before "---"
        print l = f[i++]
        if (l ~ op) break
    }

    for (j = 1; j <= nex; j++) # add an extra part
        print eex[j]

    while (i <= n)            # print the part after "---"
        print f[i++]
}
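
f.awk copies only the lines strictly between two markers, so the "- startpat:" line itself is not part of the inserted text. A hedged sketch of one way to keep that line and apply both substitutions from step 2 is given here. It handles only the first block, and the sample input, the shell variables `input` and `result`, and the decision to rewrite `brr` as step 2 describes are assumptions for illustration.

```sh
input=$(cat <<'EOF'
---
 - startpat:
    sss: ccc
    brr: 'mmm'
 - startpat:
    sss: zzz
EOF
)

result=$(printf '%s\n' "$input" | awk '
    BEGIN {
        ws = "[\t ]*"                 # optional whitespace
        sp = "^" ws "- startpat:"     # start marker (also ends the previous block)
        op = "^" ws "---" ws "$"      # line after which the copy is inserted
    }
    { f[NR] = $0 }                    # keep every line verbatim, indentation included
    END {
        # s: first marker line (inclusive), e: next marker line (exclusive)
        for (i = 1; i <= NR; i++)
            if (f[i] ~ sp) {
                if (!s) s = i
                else if (!e) e = i
            }
        if (!e) e = NR + 1            # no closing marker: block runs to end of file

        for (i = 1; i <= NR; i++) {
            print f[i]
            if (s && !done && f[i] ~ op) {
                for (j = s; j < e; j++) {   # emit the modified copy once
                    l = f[j]
                    if (l ~ /sss:/) sub(/ccc/, "ddd", l)
                    if (l ~ /brr:/) sub(/mmm/, "rel/ccc", l)
                    print l
                }
                done = 1
            }
        }
    }')

printf '%s\n' "$result"
```

With this sample, the modified copy (marker line, "sss: ddd", "brr: 'rel/ccc'") appears right after "---", followed by the original lines unchanged.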

Input file:

$ cat input
---
 - startpat:
    XXXXX
    brr: 'mmm'
 - startpat:
    YYYYY        
    brr: 'mmm'
 - startpat:

Usage:

awk -f f.awk input

Output:

---
    XXXXX
    brr: 'rel/cc'
    YYYYY        
    brr: 'rel/cc'
 - startpat:
    XXXXX
    brr: 'mmm'
 - startpat:
    YYYYY        
    brr: 'mmm'
 - startpat:
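
The script prints to standard output, while the question asks to update several files. One common way to do that, sketched here under assumptions, is to write each result to a temporary file and copy it back only if awk succeeded. The scratch directory, the two sample files and the stand-in program in `prog` are hypothetical; for real use the awk call would be `awk -f f.awk "$file"`.

```sh
# Hypothetical setup: a scratch directory with two small sample files
dir=$(mktemp -d) || exit 1
printf '%s\n' '---' "    brr: 'mmm'" > "$dir/a.yml"
printf '%s\n' '---' "    brr: 'mmm'" '    jdk: jdk8' > "$dir/b.yml"

# Stand-in awk program; sub() on the whole line keeps the indentation
prog='/brr:/ { sub(/mmm/, "rel/ccc") } { print }'

for file in "$dir"/*.yml; do
    tmp=$(mktemp) || exit 1
    if awk "$prog" "$file" > "$tmp"; then
        # copy back instead of mv so the original permissions are kept
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
done

cat "$dir/a.yml" "$dir/b.yml"
```

Reading and writing the same file in a single redirection (awk ... file > file) would truncate it before awk reads it, which is why the temporary file is needed.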