Gobblin: java.lang.ClassNotFoundException: org.apache.gobblin.source.extractor.extract.jdbc.MysqlSource
I am trying to load data from MySQL into HDFS using Gobblin. I run mysql-to-gobblin.pull with the following steps:
1) Start Hadoop:
sbin\start-all.cmd
2) Start the MySQL service:
sudo service mysql start
3) Set GOBBLIN_WORK_DIR:
export GOBBLIN_WORK_DIR=/mnt/c/users/name/incubator-gobblin/GOBBLIN_WORK_DIR
4) Set GOBBLIN_JOB_CONFIG_DIR:
export GOBBLIN_JOB_CONFIG_DIR=/mnt/c/users/name/incubator-gobblin/GOBBLIN_JOB_CONFIG_DIR
5) Start Gobblin in standalone mode:
bin/gobblin.sh service standalone start --jars /mnt/C/Users/name/incubator-gobblin/build/gobblin-sql/libs/gobblin-sql-0.15.0.jar
This gives the following error:
ERROR [JobScheduler-0] org.apache.gobblin.scheduler.JobScheduler$NonScheduledJobRunner 637 - Failed to run job GobblinMySql
org.apache.gobblin.runtime.JobException: Failed to run job GobblinMySql
Caused by: java.lang.ClassNotFoundException: org.apache.gobblin.source.extractor.extract.jdbc.MysqlSource
Below is the mysql-to-gobblin.pull file:
# Job properties
job.name=GobblinMySql
job.group=MySql
job.description=Data pull from MySql
# Extract properties
extract.table.type=snapshot_only
extract.table.name=user
# Property to consider the extract as full dump
extract.is.full=true
# Source properties
# Source properties - source class to extract data from Mysql Source
source.class=org.apache.gobblin.source.extractor.extract.jdbc.MysqlSource
# Source properties
source.max.number.of.partitions=1
source.querybased.partition.interval=1
source.querybased.is.compression=true
source.querybased.watermark.type=timestamp
# Converter properties - Record from mysql source will be processed by the below series of converters
converter.classes=gobblin.converter.avro.JsonIntermediateToAvroConverter
# date columns format
converter.avro.timestamp.format=yyyy-MM-dd HH:mm:ss'.0'
converter.avro.date.format=yyyy-MM-dd
converter.avro.time.format=HH:mm:ss
# Qualitychecker properties
qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
# Publisher properties
data.publisher.type=gobblin.publisher.BaseDataPublisher
source.querybased.schema=praveen_schema
source.entity=user
source.querybased.extract.type=snapshot
writer.builder.class=org.apache.gobblin.writer.SimpleDataWriterBuilder
writer.file.path.type=tablename
writer.destination.type=HDFS
writer.output.format=txt
data.publisher.type=org.apache.gobblin.publisher.BaseDataPublisher
mr.job.max.mappers=1
metrics.reporting.file.enabled=true
metrics.log.dir=/gobblin-kafka/metrics
metrics.reporting.file.suffix=txt
bootstrap.with.offset=earliest
fs.uri=hdfs://localhost:9000
writer.fs.uri=hdfs://localhost:9000
state.store.fs.uri=hdfs://localhost:9000
mr.job.root.dir=/gobblin-kafka/working
state.store.dir=/gobblin-kafka/state-store
task.data.root.dir=/jobs/kafkaetl/gobblin/gobblin-kafka/task-data
data.publisher.final.dir=/gobblintest/job-output
I am running this command from the /mnt/c/users/name/incubator-gobblin/build/gobblin-distribution/distributions/gobblin-dist directory.
What changes do I need to make here? How can I fix this?
The solution is to add this jar or dependency, which gets rid of the Caused by: java.lang.ClassNotFoundException: org.apache.gobblin.source.extractor.extract.jdbc.MysqlSource
<dependency>
<groupId>com.linkedin.gobblin</groupId>
<artifactId>gobblin-core</artifactId>
<version>0.8.0</version>
</dependency>
Download the jar from the Maven repository website.
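Before adding any jar, it may be worth confirming that it really contains the class under the exact package name in source.class. The job file above mixes `gobblin.` and `org.apache.gobblin.` prefixes, so a jar can ship the class under one prefix and not the other. Below is a minimal sketch in Python (not part of Gobblin; the helper names `class_entry_name` and `jar_contains_class` are made up for illustration) that relies only on the fact that a jar is a zip archive:

```python
import sys
import zipfile


def class_entry_name(class_name: str) -> str:
    """Map a fully qualified class name to its entry path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar at jar_path has an entry for class_name."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry_name(class_name) in jar.namelist()


if __name__ == "__main__":
    # Usage: python check_jar.py <path-to-jar> <fully.qualified.ClassName>
    jar_path, class_name = sys.argv[1], sys.argv[2]
    found = jar_contains_class(jar_path, class_name)
    print(f"{class_name}: {'found' if found else 'NOT found'} in {jar_path}")
```

Run it against each jar you pass with --jars. If the class is reported as not found, that jar will not resolve the ClassNotFoundException no matter how it is put on the classpath.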
Hope this helps.