Generate a Dataflow template from a multi-module project in Java
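I don't have much experience with Java, especially with multi-module projects, so I haven't been able to create a Dataflow template from a multi-module project.
To generate a Dataflow template, you have to run something like this: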
mvn compile exec:java \
-Dexec.mainClass=com.example.myclass \
-Dexec.args="--runner=DataflowRunner \
--project=YOUR_PROJECT_ID \
--stagingLocation=gs://YOUR_BUCKET_NAME/staging \
--templateLocation=gs://YOUR_BUCKET_NAME/templates/YOUR_TEMPLATE_NAME"
This works well for me in a simple Java project, but now I need to do the same in a project with the following simplified structure:
C:.
| pom.xml
|
+---configuration
| | dependency-reduced-pom.xml
| | pom.xml
| |
| +---src
| | \---main
| | \---java
| | \---com
| | \---xxx
| | \---gcp
| | \---dataflow
| | \---yyy
| | +---package
| | | | java files
| |
+---pipeline
| | dependency-reduced-pom.xml
| | pom.xml
| |
| +---src
| | \---main
| | \---java
| | \---com
| | \---xxx
| | \---gcp
| | \---dataflow
| | \---yyy
| | \---package
| | MAINJAVACLASS.java
| |
\---transform
| | dependency-reduced-pom.xml
| | pom.xml
| |
| +---src
| | \---main
| | +---java
| | | +---com
| | | | \---xxx
| | | | \---gcp
| | | | \---dataflow
| | | | \---yyy
| | | | +---package
| | | | | java files
I ran mvn package without any errors; the output was:
[INFO] Reactor Build Order:
[INFO]
[INFO] pipeline-framework [pom]
[INFO] configuration [jar]
[INFO] transform [jar]
[INFO] pipeline [jar]
<...>
[INFO] Reactor Summary for pipeline-framework 0.1:
[INFO]
[INFO] pipeline-framework ................................. SUCCESS [ 19.076 s]
[INFO] configuration ...................................... SUCCESS [ 25.070 s]
[INFO] transform .......................................... SUCCESS [ 21.625 s]
[INFO] pipeline ........................................... SUCCESS [ 19.365 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
But when I try to run:
mvn compile exec:java -Dexec.mainClass=com.xxx.gcp.dataflow.yyy.pipeline.MAINJAVACLASS -Dexec.args=...
I get the following errors.
If I run it from the root directory:
[INFO] Reactor Summary for pipeline-framework 0.1:
[INFO]
[INFO] pipeline-framework ................................. FAILURE [ 5.287 s]
[INFO] configuration ...................................... SKIPPED
[INFO] transform .......................................... SKIPPED
[INFO] pipeline ........................................... SKIPPED
<...>
Caused by: java.lang.ClassNotFoundException: com.xxx.gcp.dataflow.yyy.pipeline.MAINJAVACLASS
I also tried:
mvn compile exec:java -pl pipeline <...>
If I run it from the pipeline directory:
Could not resolve dependencies for project com.xxx.gcp.dataflow:pipeline:jar:0.1: The following artifacts could not be resolved: com.xxx.gcp.dataflow:transform:jar:0.1, com.xxx.gcp.dataflow:configuration:jar:0.1: Failure to find com.xxx.gcp.dataflow:transform:jar:0.1 in https://repo.maven.apache.org/maven2
Which command should I run to build the template?
Main pom.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.xxx.gcp.dataflow</groupId>
<artifactId>pipeline-framework</artifactId>
<version>0.1</version>
<packaging>pom</packaging>
<modules>
<module>configuration</module>
<module>transform</module>
<module>pipeline</module>
</modules>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<beam.version>2.16.0</beam.version>
<maven-compiler-plugin.version>3.7.0</maven-compiler-plugin.version>
<maven-exec-plugin.version>1.6.0</maven-exec-plugin.version>
<maven-jar-plugin.version>3.1.2</maven-jar-plugin.version>
<slf4j.version>1.7.25</slf4j.version>
<autovalue.annotations.version>1.6</autovalue.annotations.version>
<autovalue.version>1.6.2</autovalue.version>
</properties>
<repositories>
<repository>
<id>apache.snapshots</id>
<name>Apache Development Snapshot Repository</name>
<url>https://repository.apache.org/content/repositories/snapshots/</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>configuration</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>transform</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>pipeline</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</dependencyManagement>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>${maven-compiler-plugin.version}</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.22.1</version>
<configuration>
<useSystemClassLoader>false</useSystemClassLoader>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>${maven-jar-plugin.version}</version>
<configuration>
<archive>
<manifest>
<mainClass>com.xxx.gcp.dataflow.yyy.pipeline.TerraformPipeline</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.0.0</version>
<executions>
<execution>
<id>bundle-and-repackage</id>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<artifactSet>
<includes>
<include>*:*</include>
</includes>
</artifactSet>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>${maven-exec-plugin.version}</version>
<configuration>
<cleanupDaemonThreads>false</cleanupDaemonThreads>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
<dependencies>
<...>
</dependencies>
</project>
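I think this is a common problem with multi-module Maven projects and is not specific to Dataflow. Maybe this other thread can help: Maven exec:java goal on a multi-module project
That person mentions the MAINJAVACLASS not found problem you are running into. I'm less sure about the other half. I think the jars are missing because the package lifecycle phase has not yet run on the modules whose .jar you need. As far as I know, the exec plugin is not bound to any particular phase of the build lifecycle, so based on your output I'd guess it simply runs after the compile phase, which does not produce any jars (that happens in package).
Information about the build lifecycle: http://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html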
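The phase ordering behind that guess can be sketched in a few lines of shell. This is only an illustration, not Maven's own logic: the phase list is an abbreviated version of the default lifecycle from the page linked above, and `phases_up_to` is a hypothetical helper name. It shows that requesting compile stops before package, which is the phase where jars are built, and before install, which is the phase that copies them to the local repository.

```shell
#!/usr/bin/env bash
# Abbreviated list of the main phases of Maven's default lifecycle, in order.
PHASES="validate compile test package verify install deploy"

# Print every phase that runs when the given phase is requested.
# (Hypothetical helper for illustration; it does not call Maven.)
phases_up_to() {
  local target="$1" out="" phase
  for phase in $PHASES; do
    out="${out:+$out }$phase"
    if [ "$phase" = "$target" ]; then
      echo "$out"
      return 0
    fi
  done
  echo "unknown phase: $target" >&2
  return 1
}

phases_up_to compile   # prints: validate compile
phases_up_to install   # prints: validate compile test package verify install
```

So `mvn compile exec:java` never reaches package or install for the sibling modules, which is consistent with the "could not resolve dependencies" error in the question.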
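In the end, to solve this I had to run mvn clean install before compiling. With that, all the dependencies were installed on my machine (in the local Maven repository), and I then used a command like the following: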
mvn compile exec:java \
-Dexec.mainClass=com.example.myclass \
-Dexec.args="--runner=DataflowRunner \
--project=YOUR_PROJECT_ID \
--stagingLocation=gs://YOUR_BUCKET_NAME/staging \
--templateLocation=gs://YOUR_BUCKET_NAME/templates/YOUR_TEMPLATE_NAME"
The template was created and uploaded to GCS.
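Put together for the project in the question, the two steps might look like the sketch below. Several things here are assumptions rather than something stated in this thread: that the commands run from the root directory, that `-pl pipeline` is used to limit exec:java to the module containing the main class, and the `run` / `build_template` helper names and `DRY_RUN` variable, which exist only so the script prints the commands by default instead of executing them.

```shell
#!/usr/bin/env bash
# Placeholders copied from the question; replace them with real values.
MAIN_CLASS="com.xxx.gcp.dataflow.yyy.pipeline.MAINJAVACLASS"
MODULE="pipeline"   # assumption: the module that contains the main class
EXEC_ARGS="--runner=DataflowRunner --project=YOUR_PROJECT_ID --stagingLocation=gs://YOUR_BUCKET_NAME/staging --templateLocation=gs://YOUR_BUCKET_NAME/templates/YOUR_TEMPLATE_NAME"

# Print each command; execute it only when DRY_RUN=0 (hypothetical helper).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

build_template() {
  # Step 1: build every module and install its jar into the local repository,
  # so that pipeline can resolve configuration and transform.
  run mvn clean install || return 1
  # Step 2: run the main class in the selected module only.
  run mvn compile exec:java -pl "$MODULE" \
    "-Dexec.mainClass=$MAIN_CLASS" \
    "-Dexec.args=$EXEC_ARGS"
}

build_template
```

Running the script as-is only prints the two mvn commands; setting `DRY_RUN=0` in the environment executes them. Running the second command from inside the pipeline directory without `-pl` should be equivalent once the install step has completed.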
If you want to create the template with Cloud Build, you can follow these steps.