Remove string after match along with the word/string after that
I have a file with lines in the following pattern.
date=2020-02-22 time=13:32:41 type=text subtype=text ip=1.2.3.4 country="China" service="foo" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 type=text subtype=anothertext ip=1.2.3.5 country="Russian Federation" service="bar" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 type=text subtype=someothertext ip=1.2.3.6 country="Korea, Republic of" service="grault, garply" id=47448 msg="foo: bar.baz," value=60
I want to remove type, subtype and service, along with the values of those fields (the value after the =).
Expected output:
date=2020-02-22 time=13:32:41 ip=1.2.3.4 country="China" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 ip=1.2.3.5 country="Russian Federation" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 ip=1.2.3.6 country="Korea, Republic of" id=47448 msg="foo: bar.baz," value=60
I have been trying with cut, awk, and sed, but I am still nowhere near a solution. I have searched online for hours, but that was in vain too. Can anyone help?
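You can try something like this:
awk -F " " '{ $3=""; $4=""; $7=""; print }' file
You are basically setting those columns to an empty string. Note that this relies on fixed positions: type, subtype and service are fields 3, 4 and 7 only when no earlier value contains a space, so it works for the first sample line but not for country="Russian Federation". It also leaves extra spaces where the blanked fields used to be.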
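Blanking columns by number assumes each key sits in the same whitespace-separated column on every line. A quick check on the sample data suggests that is not the case here. The sketch below (service_position is a hypothetical helper name, not part of the answer) prints the field count and the index at which the service= field starts for each line:

```shell
# service_position: hypothetical helper. For each input line, print the number
# of whitespace-separated fields and the index of the field starting "service=".
service_position() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^service=/) print NF, i }'
}

# Feed it the three sample lines from the question.
service_position <<'EOF'
date=2020-02-22 time=13:32:41 type=text subtype=text ip=1.2.3.4 country="China" service="foo" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 type=text subtype=anothertext ip=1.2.3.5 country="Russian Federation" service="bar" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 type=text subtype=someothertext ip=1.2.3.6 country="Korea, Republic of" service="grault, garply" id=47448 msg="foo: bar.baz," value=60
EOF
# prints:
# 11 7
# 12 8
# 14 9
```

Because the service field moves from column 7 to 8 to 9, matching on the key name rather than the column number is the safer route for this input.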
You can use this sed. The value part matches either a double-quoted string or a run of non-blank characters, so quoted values that contain spaces, such as service="grault, garply", are removed whole:
sed -E 's/(^|[[:blank:]]+)(subtype|type|service)=("[^"]*"|[^[:blank:]]+)//g' file
date=2020-02-22 time=13:32:41 ip=1.2.3.4 country="China" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 ip=1.2.3.5 country="Russian Federation" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 ip=1.2.3.6 country="Korea, Republic of" id=47448 msg="foo: bar.baz," value=60
Something you may want to reuse or build on later:
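$ cat tst.awk
BEGIN {
    # s holds the space-separated tag names to drop (passed with -v)
    split(s,tmp)
    for (i in tmp) {
        skip[tmp[i]]
    }
    # A field is a run of non-spaces, optionally ending in ="quoted value"
    FPAT = "[^ ]+(=\"[^\"]+\")?"
}
{
    c=0
    for (i=1; i<=NF; i++) {
        # The tag is everything before the first "="
        tag = gensub(/=.*/,"",1,$i)
        if ( !(tag in skip) ) {
            printf "%s%s", (c++ ? OFS : ""), $i
        }
    }
    print ""
}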
$ awk -v s='type subtype service' -f tst.awk file
date=2020-02-22 time=13:32:41 ip=1.2.3.4 country="China" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 ip=1.2.3.5 country="Russian Federation" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 ip=1.2.3.6 country="Korea, Republic of" id=47448 msg="foo: bar.baz," value=60
The above uses GNU awk for FPAT and gensub().
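FPAT and gensub() are GNU extensions, so on a system without gawk the same idea needs a different tokenizer. As a rough sketch only, the function below peels key=value tokens off the line with match() and substr(), which POSIX awk provides. strip_fields is a hypothetical name, and the sketch assumes every token has the form key=value or key="quoted value" with no escaped quotes inside the value; tokens without an = are dropped.

```shell
# strip_fields: hypothetical helper. $1 is a space-separated list of keys to
# drop; input lines are read from stdin.
strip_fields() {
    awk -v s="$1" '
    BEGIN {
        n = split(s, tmp, " ")
        for (i = 1; i <= n; i++) skip[tmp[i]] = 1
    }
    {
        rest = $0
        out = ""
        # Peel off one token at a time: key="quoted value" or key=bare
        while (match(rest, /[^ =]+=("[^"]*"|[^ ]*)/)) {
            tok = substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)
            key = tok
            sub(/=.*/, "", key)          # keep only the part before "="
            if (!(key in skip))
                out = out (out == "" ? "" : " ") tok
        }
        print out
    }'
}

# Usage, with the same key list as the gawk version:
# strip_fields 'type subtype service' < file
```

The quoted alternative is listed first in the pattern so that a value such as "Korea, Republic of" is taken whole rather than cut at its first space.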