Sqoop: select specific columns
Is there any provision in a sqoop statement for selecting only specific columns from the Oracle side?
1: Works
sqoop import --target-dir /tmp/customers --query "SELECT * FROM schema1.customers where item>=1234 and \$CONDITIONS" --connect jdbc:oracle:thin:@server1.companyxyz.com:4567/prod --username xyz --password xyz --hive-drop-import-delims -m 8 --fields-terminated-by , --escaped-by \\ --split-by cust_id
2: Fails
sqoop import --target-dir /tmp/customers --query "SELECT cust_id, name, address, date, history, occupation FROM schema1.customers where item>=1234 and \$CONDITIONS" --connect jdbc:oracle:thin:@server1.companyxyz.com:4567/prod --username xyz --password xyz --hive-drop-import-delims -m 8 --fields-terminated-by , --escaped-by \\ --split-by cust_id
You can do this with the --table, --columns and --where arguments. For example:
sqoop import \
--connect jdbc:oracle:thin:@server1.companyxyz.com:4567/prod/DATABASE=schema1 \
--username xyz \
--password xyz \
--table customers \
--columns "cust_id,name,address,date,history,occupation" \
--where "item>=1234" \
--target-dir /tmp/customers \
-m 8 \
--split-by cust_id \
--fields-terminated-by , \
--escaped-by '\' \
--hive-drop-import-delims \
--map-column-java cust_id=String,name=String,address=String,date=String,history=String,occupation=String
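Note that the value of --columns has to reach Sqoop as a single argument. If the list is typed with a space after each comma and no quotes, the shell splits it into several words before Sqoop ever sees it. A minimal sketch of building the list safely in plain shell; join_columns is a hypothetical helper name, not part of Sqoop:

```shell
# Hypothetical helper (not a Sqoop command): join all arguments with commas
# and no spaces. The body runs in a subshell so the IFS change does not leak.
join_columns() (
  IFS=,
  printf '%s\n' "$*"
)

columns=$(join_columns cust_id name address date history occupation)

# Show the argument pair that would be passed to sqoop import.
printf '%s\n' "--columns \"$columns\""
```

The same approach should work for the --map-column-java mapping, which is also a comma-separated list without spaces.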
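I suspect that the query SELECT cust_id, name, address, date, history, occupation FROM schema1.customers where item>=1234 is itself not valid. I tried all the scenarios I could on my side. Try running it directly in your database. Also, did you delete the /tmp/customers directory before running the second statement? You should paste the error as well.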
sqoop import \
--connect "jdbc:mysql://sandbox.hortonworks.com:3306/retail_db" \
--username=retail_dba \
--password=hadoop \
--query "select department_id, department_name from departments where \$CONDITIONS" \
--target-dir /user/root//testing \
--split-by department_id \
--outdir java_files \
--hive-drop-import-delims \
-m 8 \
--fields-terminated-by , \
--escaped-by '\'
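One thing worth checking in any --query import is how the shell treats $CONDITIONS. Sqoop needs to receive that token literally so it can substitute its own split condition for each mapper, but inside double quotes the shell expands it first unless it is escaped. A small sketch in plain shell (no Sqoop required; the query text is only illustrative):

```shell
# CONDITIONS is normally not set in the shell, so make that explicit here.
unset CONDITIONS

# Double quotes, unescaped: the shell replaces $CONDITIONS with an empty string.
unescaped="select department_id from departments where $CONDITIONS"

# Double quotes with a backslash: the literal token is preserved.
escaped="select department_id from departments where \$CONDITIONS"

# Single quotes: the shell performs no expansion at all.
single='select department_id from departments where $CONDITIONS'

printf '%s\n' "$unescaped" "$escaped" "$single"
```

Only the second and third forms hand Sqoop a query that still contains the $CONDITIONS placeholder.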