How can I transform top level dependencies from mvn dependency:tree into a list of Maven coordinates using bash?
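To build a spark-submit command for my application without creating uber-jars, I would like my build process to produce a comma-separated list of the Maven coordinates of the application's top-level dependencies, which I can then pass to spark-submit via --packages= (or spark.jars.packages=).
This list can be retrieved with `mvn dependency:tree`, which prints output in the following format: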
[INFO] com.myorg:my-project:jar:1.0-SNAPSHOT
[INFO] +- org.scala-lang:scala-library:jar:2.11.12:compile
[INFO] +- org.scala-lang:scala-compiler:jar:2.11.12:compile
[INFO] |  \- org.scala-lang.modules:scala-parser-combinators_2.11:jar:1.0.4:compile
[INFO] +- com.fasterxml.jackson.core:jackson-annotations:jar:2.9.10:compile
[INFO] +- io.circe:circe-config_2.11:jar:0.6.1:compile
[INFO] |  +- com.typesafe:config:jar:1.3.3:compile
[INFO] |  +- io.circe:circe-core_2.11:jar:0.11.1:compile
[INFO] |  |  +- io.circe:circe-numbers_2.11:jar:0.11.1:compile
[INFO] |  |  \- org.typelevel:cats-core_2.11:jar:1.5.0:compile
[INFO] |  |     +- org.typelevel:cats-kernel_2.11:jar:1.5.0:compile
[INFO] |  |     \- org.typelevel:machinist_2.11:jar:0.6.6:compile
[INFO] +- org.scalatest:scalatest_2.11:jar:3.0.8:test
[INFO] |  \- org.scalactic:scalactic_2.11:jar:3.0.8:test
[INFO] \- org.mock-server:mockserver-netty:jar:5.6.1:test
[INFO]    +- org.mock-server:mockserver-client-java:jar:5.6.1:test
[INFO]    +- org.mock-server:mockserver-core:jar:5.6.1:test
[INFO]    |  +- io.netty:netty-codec-socks:jar:4.1.35.Final:test
[INFO]    |  +- com.github.java-json-tools:json-schema-validator:jar:2.2.10:test
[INFO]    |  |  +- javax.mail:mailapi:jar:1.4.3:test
[INFO]    |  |  +- com.googlecode.libphonenumber:libphonenumber:jar:8.0.0:test
[INFO]    |  |  \- net.sf.jopt-simple:jopt-simple:jar:5.0.3:test
[INFO]    |  +- com.jayway.jsonpath:json-path:jar:2.4.0:test
[INFO]    |  |  \- net.minidev:json-smart:jar:2.3:test
[INFO]    |  |     \- net.minidev:accessors-smart:jar:1.2:test
[INFO]    |  |        \- org.ow2.asm:asm:jar:5.0.4:test
[INFO]    |  +- org.apache.commons:commons-text:jar:1.3:test
[INFO]    |  \- org.apache.commons:commons-collections4:jar:4.2:test
[INFO]    +- io.netty:netty-buffer:jar:4.1.35.Final:test
[INFO]    +- io.netty:netty-handler:jar:4.1.35.Final:test
[INFO]    \- io.netty:netty-transport:jar:4.1.35.Final:test
[INFO]       \- io.netty:netty-resolver:jar:4.1.35.Final:test
Note that top-level dependencies are preceded by "[INFO] +- " (with a single space after the '-').
Only the ":jar:" dependencies are relevant, and of those, only the ":compile" ones.
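I only want to output lines that meet all of the following conditions:
- start with "[INFO] +- "
- contain ":jar:"
- contain ":compile"
and extract organization:package:version from them, like this:
org.scala-lang:scala-library:jar:2.11.12:compile
==> org.scala-lang:scala-library:2.11.12
Then join the results into a single comma-separated (,) list.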
The following solution worked for me:
mvn dependency:tree | grep -e '^\[.*INFO.*\][[:space:]]+-[[:space:]]' | grep -e ':jar:' | grep -e ':compile$' | cut -d ' ' -f3 | sed 's/:jar:/:/g' | sed 's/:compile//g' | paste -sd ',' -
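This takes care of escaping the special characters that would otherwise interfere with grep, such as spaces and brackets.
The grep commands filter the lines, the cut command tokenizes each line and selects a column, the sed commands perform the string substitutions, and the paste command joins the results.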
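As a rough alternative sketch, the same filter, extract, and join steps can be done in a single awk pass. The function name top_level_coordinates and the jar name in the usage comment are made up for illustration and are not part of the original solution. The sketch assumes the five-part groupId:artifactId:type:version:scope form shown above (no classifier). Unlike the pipeline above, it also accepts "[INFO] \- ", the marker Maven uses for the last top-level entry.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustrative name): reads `mvn dependency:tree` output
# on stdin and prints groupId:artifactId:version for each top-level,
# compile-scope jar dependency as one comma-separated line.
top_level_coordinates() {
  awk '
    # Top-level entries have exactly one space between "[INFO]" and the
    # "+-" marker (or "\-" for the last entry); nested ones are indented further.
    /^\[INFO\] [+\\]- / {
      n = split($3, part, ":")
      # Assumes groupId:artifactId:type:version:scope, with no classifier.
      if (n == 5 && part[3] == "jar" && part[5] == "compile") {
        printf "%s%s:%s:%s", sep, part[1], part[2], part[4]
        sep = ","
      }
    }
    END { if (sep != "") print "" }
  '
}

# Possible usage (my-app.jar is a placeholder):
#   packages=$(mvn dependency:tree | top_level_coordinates)
#   spark-submit --packages "$packages" my-app.jar
```

Matching on the exact "[INFO] " prefix rather than on whitespace-split fields matters here, because the children of the last top-level dependency are indented with spaces and would otherwise look like top-level entries.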