Group_by and group_concat in shell script
My goal is to identify duplicate jars on the classpath, so I did some preprocessing with the following command.
mvn -o dependency:list | grep ":.*:.*:.*" | cut -d] -f2- | sed 's/:[a-z]*$//g' | sort -u -t: -k2
The resulting file has the format
group_id:artifact_id:type:version
So, as an example, I have the following two lines in the file
com.sun.jersey:jersey-client:jar:1.19.1
org.glassfish.jersey.core:jersey-client:jar:2.26
I want to generate a file with the following content.
jersey-client | com.sun.jersey:1.19.1,org.glassfish.jersey.core:2.26
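The contents of this file vary. There can be multiple libraries with different versions.
Any idea how to do this with a shell script? I would like to avoid a database query.
Adding a snapshot of the sample file here...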
org.glassfish.jaxb:jaxb-runtime:jar:2.4.0-b180725.0644
org.jboss.spec.javax.annotation:jboss-annotations-api_1.2_spec:jar:1.0.2.Final
org.jboss.logging:jboss-logging:jar:3.3.2.Final
org.jboss.spec.javax.transaction:jboss-transaction-api_1.2_spec:jar:1.0.1.Final
org.jboss.spec.javax.websocket:jboss-websocket-api_1.1_spec:jar:1.1.3.Final
com.github.stephenc.jcip:jcip-annotations:jar:1.0-1
com.beust:jcommander:jar:1.72
com.sun.jersey.contribs:jersey-apache-client4:jar:1.19.1
org.glassfish.jersey.ext:jersey-bean-validation:jar:2.26
com.sun.jersey:jersey-client:jar:1.19.1
org.glassfish.jersey.core:jersey-client:jar:2.26
org.glassfish.jersey.core:jersey-common:jar:2.26
org.glassfish.jersey.containers:jersey-container-servlet:jar:2.26
org.glassfish.jersey.containers:jersey-container-servlet-core:jar:2.26
com.sun.jersey:jersey-core:jar:1.19.1
org.glassfish.jersey.ext:jersey-entity-filtering:jar:2.26
org.glassfish.jersey.inject:jersey-hk2:jar:2.31
org.glassfish.jersey.media:jersey-media-jaxb:jar:2.26
org.glassfish.jersey.media:jersey-media-json-jackson:jar:2.26
org.glassfish.jersey.media:jersey-media-multipart:jar:2.26
org.glassfish.jersey.core:jersey-server:jar:2.26
org.glassfish.jersey.ext:jersey-spring4:jar:2.26
net.minidev:json-smart:jar:2.3
com.google.code.findbugs:jsr305:jar:3.0.1
javax.ws.rs:jsr311-api:jar:1.1.1
org.slf4j:jul-to-slf4j:jar:1.7.25
junit:junit:jar:4.12
org.latencyutils:LatencyUtils:jar:2.0.3
org.liquibase:liquibase-core:jar:3.5.5
log4j:log4j:jar:1.2.16
org.apache.logging.log4j:log4j-api:jar:2.10.0
com.googlecode.log4jdbc:log4jdbc:jar:1.2
org.apache.logging.log4j:log4j-to-slf4j:jar:2.10.0
ch.qos.logback:logback-classic:jar:1.2.3
ch.qos.logback:logback-core:jar:1.2.3
io.dropwizard.metrics:metrics-core:jar:4.1.6
io.dropwizard.metrics:metrics-healthchecks:jar:4.1.6
io.dropwizard.metrics:metrics-jmx:jar:4.1.6
io.micrometer:micrometer-core:jar:1.0.6
org.jvnet.mimepull:mimepull:jar:1.9.6
com.microsoft.sqlserver:mssql-jdbc:jar:6.2.2.jre8
com.netflix.netflix-commons:netflix-commons-util:jar:0.3.0
com.netflix.netflix-commons:netflix-statistics:jar:0.1.1
io.netty:netty-buffer:jar:4.1.27.Final
io.netty:netty-codec:jar:4.1.27.Final
io.netty:netty-codec-http:jar:4.1.27.Final
io.netty:netty-common:jar:4.1.27.Final
io.netty:netty-resolver:jar:4.1.27.Final
io.netty:netty-transport:jar:4.1.27.Final
io.netty:netty-transport-native-epoll:jar:4.1.27.Final
io.netty:netty-transport-native-unix-common:jar:4.1.27.Final
com.nimbusds:nimbus-jose-jwt:jar:8.3
There may be an easier way, but this is what I can do right now... it could probably be narrowed down to a single line with some tweaks
[07:38 am alex ~]$ date; cat a
Wed 4 Nov 07:38:21 GMT 2020
com.sun.jersey:jersey-client:jar:1.19.1
org.glassfish.jersey.core:jersey-client:jar:2.26
[07:38 am alex ~]$ FIRST=`cat a | awk -F'[:]' '{print $2}' | uniq`
[07:38 am alex ~]$ SECOND=`cat a | awk -F'[:]' '{print $1":"$4}' | xargs | sed 's/ /,/g'`
[07:38 am alex ~]$ echo "$FIRST | $SECOND"
jersey-client | com.sun.jersey:1.19.1,org.glassfish.jersey.core:2.26
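The "single line" idea mentioned above could look roughly like the sketch below. It is only an illustration: the function name join_single_artifact is made up here, and it assumes, like the two commands above, that every input line belongs to the same artifact_id.

```shell
# Hypothetical helper: joins every input line into ONE output row,
# so it assumes all lines share the same artifact_id (as in file "a").
join_single_artifact() {
  awk -F: '
    NR == 1 { printf "%s | ", $2 }          # artifact_id, printed once
    NR > 1  { printf "," }                  # comma between entries
            { printf "%s:%s", $1, $NF }     # group_id:version
    END     { if (NR) print "" }            # final newline
  '
}

# Same two lines as in file "a" above
printf '%s\n' \
  'com.sun.jersey:jersey-client:jar:1.19.1' \
  'org.glassfish.jersey.core:jersey-client:jar:2.26' | join_single_artifact
```

With the real file this would be called as join_single_artifact < a. It does not group by artifact_id, so it is not suitable for the full dependency list.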
Could you please try the following; this can be done within a single awk itself. It is written entirely based on your shown samples.
awk '
BEGIN{
  FS=":"
  OFS=" | "
}
FNR==1{
  first=$1
  second=$2
  version=$NF
  next
}
FNR==2{
  print second,first":"version","$1":"$NF
}
' Input_file
Explanation: Adding a detailed explanation of the above.
awk '                                      ##Starting awk program from here.
BEGIN{                                     ##Starting BEGIN section of this program from here.
  FS=":"                                   ##Setting field separator as colon here.
  OFS=" | "                                ##Setting output field separator as space | space here.
}
FNR==1{                                    ##Checking condition: if this is the first line then do following.
  first=$1                                 ##Creating first with 1st field value (group_id).
  second=$2                                ##Creating second with 2nd field value (artifact_id).
  version=$NF                              ##Creating version with last field value of the first line.
  next                                     ##next will skip all further statements from here.
}
FNR==2{                                    ##Checking condition: if this is the 2nd line then do following.
  print second,first":"version","$1":"$NF  ##Printing second, then first:version, a comma, and 1st field:last field of current line.
}
' Input_file                               ##Mentioning Input_file name here.
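The awk program above only handles the two sample lines. For the full dependency list, a group-by over artifact_id with awk associative arrays is one possible approach. The following is a minimal sketch under a few assumptions: the function name group_by_artifact and the optional minimum-count argument are invented for illustration, and the version is taken to be the last colon-separated field.

```shell
# Hypothetical helper: groups "group_id:artifact_id:type:version" lines
# by artifact_id and comma-joins the "group_id:version" pairs.
# Optional argument: minimum number of entries a group needs to be printed.
group_by_artifact() {
  awk -F: -v min="${1:-1}" '
    NF >= 4 {
      key = $2                              # artifact_id is the group key
      if (!(key in joined)) {
        order[++n] = key                    # remember first-seen order
        joined[key] = $1 ":" $NF
      } else {
        joined[key] = joined[key] "," $1 ":" $NF
      }
      count[key]++
    }
    END {
      for (i = 1; i <= n; i++) {
        key = order[i]
        if (count[key] >= min) print key " | " joined[key]
      }
    }
  '
}

# Small demonstration with three lines taken from the sample file
printf '%s\n' \
  'com.sun.jersey:jersey-client:jar:1.19.1' \
  'junit:junit:jar:4.12' \
  'org.glassfish.jersey.core:jersey-client:jar:2.26' | group_by_artifact
```

Calling it as group_by_artifact 2 < Input_file should list only the artifacts that appear under more than one group_id or version, which is closer to the stated goal of finding duplicate jars. Groups are printed in the order their artifact_id first appears, so the input does not need to be sorted.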