Sqoop import gives wrong result for a correct SQL query

I run the following query in MySQL, and it returns the result I want.

select TABLE_NAME,count(column_name) as no_of_columns from information_schema.columns where TABLE_SCHEMA = 'testing' and TABLE_NAME NOT REGEXP 'temp|bkup|RemoveMe|test' group by TABLE_NAME

When I use the same query in a sqoop import statement, the results are different.

The sqoop import statement is as follows.

sqoop import --connect jdbc:mysql://xxxxxx:3306/information_schema --username xxxxx --password-file /user/xxxxx/passwds/mysql.file --query "select TABLE_NAME,count(column_name) as no_of_columns from information_schema.columns where TABLE_SCHEMA = 'testing' and TABLE_NAME NOT REGEXP 'temp|bkup|RemoveMe|test' group by TABLE_NAME and $CONDITIONS" -m 1 --target-dir /user/hive/warehouse/xxxx.db/testing_columns --outdir /home/xxxxx/logs/outdir

Why is this happening, and what should I do to get the desired result?

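The $CONDITIONS token must be in the WHERE clause, before GROUP BY. In your query it comes after GROUP BY, so MySQL groups by the boolean expression (TABLE_NAME and the substituted condition) instead of by TABLE_NAME, which changes the result. Because the query is wrapped in double quotes, the token also has to be written as \$CONDITIONS so the shell does not expand it as a variable:

sqoop import --connect jdbc:mysql://xxxxxx:3306/information_schema \
    --username xxxxx --password-file /user/xxxxx/passwds/mysql.file \
    --query "select TABLE_NAME,count(column_name) as no_of_columns \
               from information_schema.columns \
               where TABLE_SCHEMA = 'testing' \
                 and TABLE_NAME NOT REGEXP 'temp|bkup|RemoveMe|test' \
                 and \$CONDITIONS \
               group by TABLE_NAME" \
    -m 1 --target-dir /user/hive/warehouse/xxxx.db/testing_columns \
    --outdir /home/xxxxx/logs/outdir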
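
A side note on quoting: when the --query string is wrapped in double quotes, the shell expands $CONDITIONS before sqoop ever sees it, so the token normally has to be written as \$CONDITIONS. The snippet below is only a minimal sketch of that shell behavior; it does not call sqoop, the variable names are made up for illustration, and it assumes no shell variable named CONDITIONS is set.

```shell
# Assumption for this sketch: no shell variable called CONDITIONS exists.
unset CONDITIONS

# Unescaped: the shell replaces $CONDITIONS with an empty string.
unescaped="where TABLE_SCHEMA = 'testing' and $CONDITIONS group by TABLE_NAME"

# Escaped: the literal token $CONDITIONS survives and would reach sqoop intact.
escaped="where TABLE_SCHEMA = 'testing' and \$CONDITIONS group by TABLE_NAME"

echo "unescaped: $unescaped"
echo "escaped:   $escaped"
```

Wrapping the query in single quotes would also keep the token literal, but then the single-quoted string literals inside the SQL would need different quoting, so escaping the dollar sign is usually the simpler change here.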

Also keep in mind this note from the Sqoop User Guide:

The facility of using free-form query in the current version of Sqoop is limited to simple queries where there are no ambiguous projections and no OR conditions in the WHERE clause. Use of complex queries such as queries that have sub-queries or joins leading to ambiguous projections can lead to unexpected results.