How to connect to an Oracle origin with StreamSets

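I want to create an origin that reads from Oracle, so I chose Oracle CDC as the origin. I then configured every parameter, but the pipeline fails with this error: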

2017-08-22 11:07:22,447 test/testb156f588-dbd7-4e4c-8896-caf658d14d77   ERROR   Error while connecting to DB

com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: java.lang.RuntimeException: Unable to get driver instance for jdbcUrl=jdbc:oracle:thin:@(DESCRIPTION =
(ENABLE=BROKEN)
(ADDRESS = (PROTOCOL = TCP)(HOST = myhost)(PORT = myport))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl.WORLD)))
    at com.streamsets.pipeline.lib.jdbc.JdbcUtil.createDataSourceForRead(JdbcUtil.java:638)
    at com.streamsets.pipeline.stage.origin.jdbc.cdc.oracle.OracleCDCSource.init(OracleCDCSource.java:643)
    at com.streamsets.pipeline.api.base.BaseStage.init(BaseStage.java:52)
    at com.streamsets.pipeline.configurablestage.DStage.init(DStage.java:40)
    at com.streamsets.datacollector.runner.StageRuntime.init(StageRuntime.java:156)
    at com.streamsets.datacollector.runner.StagePipe.init(StagePipe.java:105)
    at com.streamsets.datacollector.runner.StagePipe.init(StagePipe.java:53)
    at com.streamsets.datacollector.runner.Pipeline.initPipe(Pipeline.java:299)
    at com.streamsets.datacollector.runner.Pipeline.init(Pipeline.java:214)
    at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:96)
    at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:79)
    at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:646)
    at com.streamsets.datacollector.execution.runner.common.AsyncRunner.lambda$start(AsyncRunner.java:143)
    at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:233)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access1(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Unable to get driver instance for jdbcUrl=jdbc:oracle:thin:@(DESCRIPTION =
(ENABLE=BROKEN)
(ADDRESS = (PROTOCOL = TCP)(HOST = myhost)(PORT = myport))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl.WORLD)))
    at com.zaxxer.hikari.util.DriverDataSource.<init>(DriverDataSource.java:88)
    at com.zaxxer.hikari.pool.PoolElf.initializeDataSource(PoolElf.java:157)
    at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:113)
    at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:73)
    at com.streamsets.pipeline.lib.jdbc.JdbcUtil.createDataSourceForRead(JdbcUtil.java:630)
    ... 19 more
Caused by: java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    at com.zaxxer.hikari.util.DriverDataSource.<init>(DriverDataSource.java:81)
    ... 23 more

Do you have any idea what is wrong?

On the Legacy configuration tab, try specifying oracle.jdbc.driver.OracleDriver as the JDBC Driver Class Name.

I had to add the ojdbc jar under this directory:

/opt/streamsets-extra/streamsets-datacollector-jdbc-lib/lib
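As a rough sketch of that step, the copy can be wrapped in a small shell function. The function name `install_jdbc_driver` and the example paths are hypothetical; only the `streamsets-datacollector-jdbc-lib/lib` layout comes from the path above.

```shell
# Hypothetical helper: copy a JDBC driver jar into the JDBC stage library's
# lib directory under the extras directory.
install_jdbc_driver() {
  jar_path="$1"      # e.g. /tmp/ojdbc8.jar (assumed download location)
  extras_dir="$2"    # e.g. /opt/streamsets-extra
  target="$extras_dir/streamsets-datacollector-jdbc-lib/lib"

  # Refuse to continue if the jar is missing.
  [ -f "$jar_path" ] || { echo "jar not found: $jar_path" >&2; return 1; }

  # Create the directory if needed, copy the jar, and print where it landed.
  mkdir -p "$target" && cp "$jar_path" "$target/" &&
    echo "$target/$(basename "$jar_path")"
}
```

For example, `install_jdbc_driver /tmp/ojdbc8.jar /opt/streamsets-extra` would place the jar in the directory shown above, provided the user running it can write there.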

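It looks like the environment variable STREAMSETS_LIBRARIES_EXTRA_DIR has not been set. Setting it is part of the procedure for installing external libraries.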

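There are three installation types, depending on how Data Collector is started:

  • Manual start
  • Service start with SysV init (supported on CentOS 6, Oracle Linux 6, Red Hat Enterprise Linux 6, Ubuntu 14.04 LTS)
  • Service start with systemd (supported on CentOS 7, Oracle Linux 7, Red Hat Enterprise Linux 7, Ubuntu 16.04 LTS)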

If your SDC installation is

  • started manually, the variable STREAMSETS_LIBRARIES_EXTRA_DIR is expected to be set from the command line:

    export STREAMSETS_LIBRARIES_EXTRA_DIR="/opt/streamsets-data-collector/streamsets-data-collector-3.15.0/streamsets-libs-extras/"

  • started as a service, the parameter already exists in the $SDC_DIST/libexec/_sdc file as

    STREAMSETS_LIBRARIES_EXTRA_DIR="${STREAMSETS_LIBRARIES_EXTRA_DIR:=${SDC_DIST}/streamsets-libs-extras}"

where the $SDC_DIST variable is the directory into which the SDC installation file (tarball or RPM) was extracted.
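To see what value the service will end up with, the default-assignment pattern quoted from `_sdc` can be reproduced in isolation. This is only a sketch: the function name `resolve_extras_dir` is hypothetical, and it assumes `SDC_DIST` is already set in the calling shell.

```shell
# Hypothetical helper that mirrors the ${VAR:=default} line from libexec/_sdc.
resolve_extras_dir() {
  # ${VAR:=default} assigns the default only when VAR is unset or empty,
  # so an exported STREAMSETS_LIBRARIES_EXTRA_DIR always wins.
  : "${STREAMSETS_LIBRARIES_EXTRA_DIR:=${SDC_DIST}/streamsets-libs-extras}"
  echo "$STREAMSETS_LIBRARIES_EXTRA_DIR"
}
```

If the variable was exported beforehand, the function prints that value unchanged; otherwise it falls back to the `streamsets-libs-extras` directory under `$SDC_DIST`.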

Also add the same path to the file $SDC_CONF/sdc-security.policy:

grant codebase "file:///opt/streamsets-data-collector/streamsets-data-collector-3.15.0/streamsets-libs-extras/-" {
  permission java.security.AllPermission;
};

where the $SDC_CONF variable usually points to /etc/sdc.
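A quick way to confirm the policy entry is present is a fixed-string grep. The sketch below uses a hypothetical function name, `policy_grants_dir`, and assumes the grant line is written in the same form as the one above.

```shell
# Hypothetical check: does the policy file contain a grant for the extras dir?
policy_grants_dir() {
  policy_file="$1"      # e.g. /etc/sdc/sdc-security.policy
  extras_dir="${2%/}"   # strip one trailing slash, if present

  # -F matches the string literally, -q suppresses output; the exit status
  # is 0 when the grant line is found and non-zero otherwise.
  grep -qF "grant codebase \"file://${extras_dir}/-\"" "$policy_file"
}
```

It can be used as `policy_grants_dir "$SDC_CONF/sdc-security.policy" "$STREAMSETS_LIBRARIES_EXTRA_DIR" || echo "grant missing"`.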

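Now we are ready to log in to the Data Collector console and add the JDBC external library with the following steps:

  1. In the toolbar in the upper right corner of Data Collector, click the Package Manager icon.

  2. In the navigation panel, click External Libraries.

     Data Collector lists all currently installed external libraries.

  3. Immediately below the toolbar in the upper right corner, click the Install External Libraries icon.

  4. In the Install External Libraries dialog box, select JDBC as the stage library (assuming you have already registered and downloaded streamsets-datacollector-jdbc-lib), then choose the .jar file from the file system, for example ojdbc8.jar, which can be downloaded from the JDBC and UCP Downloads page (in my case I chose the link named Oracle Database 12c Release 2 (12.2.0.1) drivers because of my remote database version).

  5. As the last step, don't forget to click Cancel in the Install External Libraries window, and then run the matching command:

    service sdc restart ( for SysV Init )

    systemctl restart sdc ( for Systemd Init )

( If you started Data Collector manually from the command line, you can instead click Restart Data Collector in the Install External Libraries window. )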
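If you are not sure which of the two restart commands applies, a small sketch like the following can pick one. The function name `sdc_restart_command` is hypothetical, and the test for the `/run/systemd/system` directory is the usual convention for detecting a systemd-booted host rather than anything specific to StreamSets.

```shell
# Hypothetical helper: print the restart command that fits the init system.
sdc_restart_command() {
  # The marker directory exists only when the host was booted with systemd.
  # It is a parameter so the check can be pointed elsewhere for testing.
  systemd_marker="${1:-/run/systemd/system}"

  if [ -d "$systemd_marker" ]; then
    echo "systemctl restart sdc"
  else
    echo "service sdc restart"
  fi
}
```

The function only prints the command; run the printed command yourself (typically as root) once you have confirmed it is the right one.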