SQOOP - Imported Failed: Can not create a Path from a null string
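I am using Sqoop incremental imports to load tables from SQL Server into an HBase table, but NULL values in the SQL table are not imported into HBase. I understand that HBase does not store nulls, so a field whose value is NULL simply does not appear in HBase. My concern is that when a particular column is NULL for most records, it is skipped even for the records where it does have a value. Here is the SQL table structure: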
CREATE TABLE [dbo].[user_test](
[user_id] [nvarchar](20) NOT NULL,
[user_name] [nvarchar](100) NULL,
[password] [varchar](128) NULL,
[created_date] [datetime2](7) NULL,
[modified_date] [datetime2](7) NULL,
[last_login_date] [datetime2](7) NULL,
[email_id] [nvarchar](100) NULL,
[security_question_id] [int] NULL,
[answered_count] [int] NULL,
[skip_count] [int] NULL,
[role_id] [smallint] NULL,
[use_yn] [char](1) NULL,
[first_login] [char](1) NULL,
[score] [int] NULL,
[secret_answer] [nvarchar](100) NULL,
PRIMARY KEY CLUSTERED
(
[user_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
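In the table above, email_id is NULL for most records. But even for the records where email_id has a value, that value is not imported into the HBase table. The sqoop command does successfully fetch the appended records from SQL. The Sqoop command is as follows: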
sqoop import \
--connect "jdbc:sqlserver://107.108.32.198:1433;database=ETL_interim_DB;" \
--username "hadoop" --password "Semco123" \
--query "SELECT CAST(user_id AS Integer) as user_id,user_name,password,modified_date,last_login_date,email_id,security_question_id,answered_count,skip_count,role_id,use_yn,first_login,score,secret_answer from ETL_interim_DB.dbo.user_test WHERE \$CONDITIONS" \
--hbase-table test2 \
--column-family cf \
--hbase-row-key user_id \
--split-by user_id -m 1 \
--incremental append \
--check-column user_id \
--last-value 10
But it shows the following error:
Note: Recompile with -Xlint:deprecation for details.
0 [main] ERROR org.apache.sqoop.tool.ImportTool - Imported Failed: Can not create a Path from a null string
Can anyone suggest how to import all of the values present in SQL Server into HBase, and what happens to NULL values in SQL when they are imported into HBase tables?
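You can try this. For the HBase columns that have null values, you can update the NULL values (empty cells) in the SQL database to hold something like '0' or the text "NULL". Below is the query.
UPDATE [Table Name] SET [Column Name]='Null' WHERE [Column Name] IS NULL;
Or,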
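-- SQL Server syntax; ALTER COLUMN ... NOT NULL fails while NULLs remain, so run the UPDATE first
ALTER TABLE [Table Name] ADD DEFAULT '' FOR [Column Name];
ALTER TABLE [Table Name] ALTER COLUMN [Column Name] VARCHAR(50) NOT NULL;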
Then try importing from SQL into HBase. Hope this helps!
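If several nullable columns need the same treatment, the UPDATE statements can be generated instead of typed by hand. The following is only a sketch: `gen_null_updates` is a hypothetical helper name, the placeholder value and column list are illustrative, and the function just prints T-SQL for review without running anything against the database.

```shell
# Hypothetical helper: print one UPDATE per column that replaces NULL
# with a placeholder string. It only prints SQL; nothing is executed.
gen_null_updates() {
  local table="$1" placeholder="$2"
  shift 2
  local col
  for col in "$@"; do
    printf "UPDATE %s SET [%s] = '%s' WHERE [%s] IS NULL;\n" \
      "$table" "$col" "$placeholder" "$col"
  done
}

# Example: string columns from the table in the question.
gen_null_updates "[dbo].[user_test]" "0" email_id secret_answer
```

The placeholder is written as a quoted string, so this form suits the character columns; numeric and datetime2 columns would need a type-appropriate default.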
Using COALESCE let me import the NULL fields from SQL into HBase by supplying a default value for each. Here is the sqoop command for the same:
sqoop import \
--connect "jdbc:sqlserver://107.108.32.198:1433;database=ETL_interim_DB;" \
--username "hadoop" --password "Semco123" \
--query "SELECT CAST(user_id AS Integer) as user_id, \
COALESCE(user_name,'xyz') as user_name, \
COALESCE(password,'123') as password, \
COALESCE(created_date, '9999-12-31 00:00:00.0000000') as created_date, \
COALESCE(modified_date,'9999-12-31 00:00:00.0000000') as modified_date, \
COALESCE(last_login_date,'9999-12-31 00:00:00.0000000') as lastlogin, \
COALESCE(email_id,'0') as email_id, \
COALESCE(security_question_id,-1) as security_question_id, \
COALESCE(answered_count,-1) as answered_count, \
COALESCE(skip_count,-1) as skip_count, \
COALESCE(secret_answer, '0') as secret_answer, \
COALESCE(role_id,0) as role_id, \
COALESCE(use_yn,'0') as use_yn, \
COALESCE(first_login,'0') as firstlogin, \
COALESCE(score,-1) as score from ETL_interim_DB.dbo.ms_user_detail_test WHERE \$CONDITIONS" \
--hbase-table test2 \
--column-family cf \
--hbase-row-key user_id \
--split-by user_id -m 1 \
--incremental append \
--check-column user_id \
--last-value 10
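The COALESCE list above is repetitive, so it may be easier to build it from column/default pairs. A minimal sketch, assuming a hypothetical helper `build_coalesce_list` that takes `column=default` arguments; the defaults are inserted verbatim, so string defaults must carry their own single quotes. The table name `dbo.user_test` is taken from the question and is an assumption here.

```shell
# Hypothetical helper: turn "column=default" pairs into a COALESCE select list.
# Defaults are inserted verbatim, so string defaults must include their quotes.
build_coalesce_list() {
  local list="" spec col def
  for spec in "$@"; do
    col="${spec%%=*}"   # text before the first '='
    def="${spec#*=}"    # text after the first '='
    list="${list:+$list, }COALESCE($col, $def) AS $col"
  done
  printf '%s\n' "$list"
}

cols="$(build_coalesce_list "email_id='0'" "score=-1")"

# \$CONDITIONS is escaped so the shell passes the literal token through to Sqoop.
query="SELECT CAST(user_id AS Integer) AS user_id, $cols FROM ETL_interim_DB.dbo.user_test WHERE \$CONDITIONS"
printf '%s\n' "$query"
```

The resulting `$query` can then be passed as `--query "$query"` in place of the hand-written string.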