How to import data using the Sqoop Java API?
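I want to import data with Sqoop without using shell commands. How can I do this with the Java API? The Sqoop version is 1.4.6, and I am working with Scala and SBT. Also, which dependencies do I need?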
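I needed to use Sqoop from Scala to import data from MySQL into Hive on a Cloudera CDH 5.7 cluster, so I started by following this answer.
The problem was that it did not pick up the correct configuration when executed on the server.
Running Sqoop manually looks like this: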
sqoop import --hive-import --connect "jdbc:mysql://host/db" \
--username "username" --password "password" --table "viewName" \
--hive-table "outputTable" -m 1 --check-column "dateColumnName" \
--last-value "lastMinDate" --incremental append
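So in the end I chose to run Sqoop as an external process using Scala's sys.process.ProcessBuilder. Running it this way does not need any SBT dependency; it only requires the sqoop command to be available on the machine that executes the code. The runner ended up being implemented like this: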
import sys.process._

// `log` is assumed to be a logger already in scope that offers debug/info/warning/error
// (for example an Akka LoggingAdapter); it is not defined in this snippet.
def executeSqoop(connectionString: String, username: String, password: String,
                 viewName: String, outputTable: String,
                 dateColumnName: String, lastMinDate: String): Unit = {
  // Log every line the process writes to stdout and stderr respectively
  val sqoopLogger = ProcessLogger(
    normalLine => log.debug(normalLine),
    errorLine => errorLine match {
      case line if line.contains("ERROR") => log.error(line)
      case line if line.contains("WARN") => log.warning(line)
      case line if line.contains("INFO") => log.info(line)
      case line => log.debug(line)
    }
  )

  // Build the Sqoop command; every parameter and every value must be a separate String in the Seq
  val command = Seq("sqoop", "import", "--hive-import",
    "--connect", connectionString,
    "--username", username,
    "--password", password,
    "--table", viewName,
    "--hive-table", outputTable,
    "-m", "1",
    "--check-column", dateColumnName,
    "--last-value", lastMinDate,
    "--incremental", "append")

  // result holds the exit code of the command; `!` blocks until the process ends
  val result = command ! sqoopLogger
  if (result != 0) {
    log.error("The Sqoop process did not finish successfully")
  } else {
    log.info("The Sqoop process finished successfully")
  }
}
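To try the pattern without a cluster or a logger in scope, the runner can be reduced to a small self-contained sketch that uses only the Scala standard library. This is an illustration, not the original implementation: the names ProcessRunnerSketch, levelOf, RunResult and run are hypothetical, and the captured lines are returned to the caller instead of being sent to a logger.

```scala
import scala.collection.mutable.ListBuffer
import scala.sys.process._

// Hypothetical helper names, chosen for this sketch only.
object ProcessRunnerSketch {

  // Map an output line to a log level name, using the same keyword checks as the runner above.
  def levelOf(line: String): String =
    if (line.contains("ERROR")) "ERROR"
    else if (line.contains("WARN")) "WARN"
    else if (line.contains("INFO")) "INFO"
    else "DEBUG"

  // Exit code plus captured output; each stderr entry is a (level, line) pair.
  final case class RunResult(exitCode: Int,
                             stdout: List[String],
                             stderr: List[(String, String)])

  // Run a command given as separate tokens (no shell is involved) and block until it exits.
  def run(command: Seq[String]): RunResult = {
    val out = ListBuffer.empty[String]
    val err = ListBuffer.empty[(String, String)]
    val logger = ProcessLogger(
      line => { out += line; () },
      line => { err += (levelOf(line) -> line); () }
    )
    val exitCode = command ! logger
    RunResult(exitCode, out.toList, err.toList)
  }
}
```

On a Unix-like machine, ProcessRunnerSketch.run(Seq("echo", "hello")) is a harmless way to check the wiring before passing the real Sqoop command Seq. If the executable cannot be found, `!` throws a java.io.IOException rather than returning a non-zero exit code, so a caller may want to handle that case separately.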