AWK FS: use the first instance only and ignore the rest

Question

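I have a command in which I identify variables and then replace them with other values in a file ($ZEPPELIN_HOME/conf/shiro.ini). I'm using a combination of printenv, grep and awk. My question is about awk and its FS variable. I have set FS to "=", but some of the values contain more than one "=", and I only want the first "=" to count as the field separator, with the others treated as part of the string rather than as further separators. Basically, I collect the environment variables with printenv, use grep to pick out the ones I care about, then use awk and sed to go through them, format them, find them in the file and replace them there. My command: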

printenv | grep "SHIRO_" | awk 'BEGIN{FS="=";a=""}NR > 1 { a=a" && " }{b=substr(gensub(/_/, ".", "g", $1),7);a=a"sed -ri \"s|^"b" =.+$|"b" ="$2"|g\" $ZEPPELIN_HOME/conf/shiro.ini"}END{print a}' | bash

Text in the shiro.ini file:

ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
## search base for ldap groups (only relevant for LdapGroupRealm):
ldapRealm.contextFactory.environment[ldap.searchBase] = dc=COMPANY,dc=COM
ldapRealm.contextFactory.url = ldap://ldap.test.com:389
ldapRealm.userDnTemplate = uid={0},ou=Users,dc=COMPANY,dc=COM
ldapRealm.contextFactory.authenticationMechanism = simple

ENV variables:

SHIRO_ldapRealm_contextFactory_environment_ldap_searchBase="dc=othertypesofDNS" 
SHIRO_ldapRealm_userDnTemplate="cn={0},dc=othertypesofDNS" 
SHIRO_ldapRealm_contextFactory_url="ldap://test1.com:339 ldap://test2.com:339"

Desired output in the shiro.ini file:

ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
## search base for ldap groups (only relevant for LdapGroupRealm):
ldapRealm.contextFactory.environment.ldap.searchBase=dc=othertypesofDNS
ldapRealm.userDnTemplate=cn={0},dc=othertypesofDNS
ldapRealm.contextFactory.url=ldap://test1.com:339 ldap://test2.com:339
ldapRealm.contextFactory.authenticationMechanism = simple

Current output in the shiro.ini file:

ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
## search base for ldap groups (only relevant for LdapGroupRealm):
ldapRealm.contextFactory.environment.ldap.searchBase=dc
ldapRealm.userDnTemplate=cn
ldapRealm.contextFactory.url=ldap://test1.com:339 ldap://test2.com:339
ldapRealm.contextFactory.authenticationMechanism = simple

So, how do I make my command use only the first "=" as the field separator and ignore the rest?

While working on this I checked the following (among others): "Awk field separator" and "awk - split only by first occurrence".

Answer 1

It looks like you only want to strip the "SHIRO_" prefix from the string, so why not do something like this:

    a='SHIRO_ldapRealm_contextFactory_environment_ldap_searchBase="dc=othertypesofDNS"'
    echo "${a#*_}"

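A loop like this does it for every variable (it reads line by line, so values that contain spaces are not split into separate words):

printenv | grep "SHIRO_" |
while IFS= read -r i
do
    echo "${i#*_}" | ...
done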
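
If the entry also has to be split at the first "=", the same family of parameter expansions can do that without awk. A minimal sketch, assuming a POSIX-like shell; the variable names entry, name and value are illustrative only and not part of the original command:

```shell
entry='SHIRO_ldapRealm_userDnTemplate=cn={0},dc=othertypesofDNS'

name=${entry%%=*}    # drop the longest suffix matching "=*", i.e. from the first "=" on
value=${entry#*=}    # drop the shortest prefix matching "*=", i.e. up to the first "="
name=${name#SHIRO_}  # strip the literal SHIRO_ prefix

printf '%s\n' "$name" "$value"
```

This prints ldapRealm_userDnTemplate and cn={0},dc=othertypesofDNS on separate lines; any later "=" stays inside the value.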

Answer 2

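You shouldn't try to use FS for this, because you don't actually want your record split into multiple fields at every FS. Here's how to separate the tag/name from the value as you want (using cat file instead of printenv for demonstration purposes only; you don't need grep when you're using awk):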

$ cat file |
    awk '/SHIRO_/ {
        tag=val=$0
        sub(/=.*/,"",tag)
        sub(/^[^=]+=/,"",val)
        print "tag="tag ORS "val="val ORS
    }'
tag=SHIRO_ldapRealm_contextFactory_environment_ldap_searchBase
val="dc=othertypesofDNS"

tag=SHIRO_ldapRealm_userDnTemplate
val="cn={0},dc=othertypesofDNS"

tag=SHIRO_ldapRealm_contextFactory_url
val="ldap://test1.com:339 ldap://test2.com:339"
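
To see why FS gets in the way here, and as an alternative to the two sub() calls, the record can be cut at the position of the first "=" as reported by index(). This is only a sketch using standard awk string functions; the helper name split_first is made up for this example:

```shell
# With FS="=", every "=" starts a new field, so $2 stops at the second "=".
printf '%s\n' 'SHIRO_ldapRealm_userDnTemplate="cn={0},dc=othertypesofDNS"' |
    awk -F= '{ print $2 }'

# Cutting at the first "=" keeps the rest of the value intact.
split_first() {
    awk '{
        p = index($0, "=")          # position of the first "=", 0 if there is none
        if (p == 0) next            # skip lines without "="
        tag = substr($0, 1, p - 1)  # text before the first "="
        val = substr($0, p + 1)     # everything after it
        print "tag=" tag ORS "val=" val
    }'
}

printf '%s\n' 'SHIRO_ldapRealm_userDnTemplate="cn={0},dc=othertypesofDNS"' | split_first
```

The first command prints only "cn (the truncated value seen in the question), while split_first prints the full tag and value.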

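You also shouldn't be doing all of that convoluted work of building sed commands and piping them to bash for execution. Just do whatever you want in the same awk command in which you separate the tag from the value, e.g.: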

$ cat file |
    awk 'sub(/^SHIRO_/,"") {
        tag=val=$0
        sub(/=.*/,"",tag)
        sub(/^[^=]+=/,"",val)
        gsub(/_/,".",tag)
        gsub(/"/,"",val)
        print tag"="val
    }'
ldapRealm.contextFactory.environment.ldap.searchBase=dc=othertypesofDNS
ldapRealm.userDnTemplate=cn={0},dc=othertypesofDNS
ldapRealm.contextFactory.url=ldap://test1.com:339 ldap://test2.com:339

EDIT: given your updated requirements:

$ cat tst.awk
{
    tag=val=$0
    sub(/ *=.*/,"",tag)
    sub(/^[^=]+= */,"",val)
    gsub(/[[_]/,".",tag)
    gsub(/]/,"",tag)
    gsub(/"/,"",val)
}
NR==FNR {
    if ( sub(/^SHIRO\./,"",tag) ) {
        tag2val[tag] = val
    }
    next
}
tag in tag2val {
    $0 = tag "=" tag2val[tag]
}
{ print }

$ cat envvars | awk -f tst.awk - shiro.ini
ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
## search base for ldap groups (only relevant for LdapGroupRealm):
ldapRealm.contextFactory.environment.ldap.searchBase=dc=othertypesofDNS
ldapRealm.contextFactory.url=ldap://test1.com:339 ldap://test2.com:339
ldapRealm.userDnTemplate=cn={0},dc=othertypesofDNS
ldapRealm.contextFactory.authenticationMechanism = simple
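
To run this against the real environment and update the file in place, one option is to feed printenv to the script and write the result through a temporary file. The following is a hedged sketch rather than part of the answer above: the wrapper name apply_shiro_env and its two arguments (path to tst.awk, path to shiro.ini) are assumptions made for illustration.

```shell
# Hypothetical wrapper: apply_shiro_env SCRIPT INI
# Rewrites INI using the SHIRO_* variables in the current environment.
apply_shiro_env() {
    script=$1
    ini=$2
    tmp=$(mktemp) || return 1
    # "-" makes awk read printenv's output first, then the ini file.
    printenv | awk -f "$script" - "$ini" > "$tmp" &&
        cat "$tmp" > "$ini"     # overwrite the contents, keeping owner and mode
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

It could then be called as apply_shiro_env tst.awk "$ZEPPELIN_HOME/conf/shiro.ini". Unlike the envvars file shown above, printenv prints values without surrounding double quotes; the script accepts both forms because it simply removes any quotes it finds.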
