How to use a UDF value or column value in a Hive insert partition statement, rather than a constant value
I have a data table created as follows:
CREATE EXTERNAL TABLE `DailyData`(
`entity_id` string,
`payload` string)
PARTITIONED BY
(`date_of_data` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\u0010'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION
'/home/data/dailydata'
I have a job that runs every day and inserts data into this daily table.
It works with the following statement:
INSERT INTO TABLE DailyData partition(date_of_data="20181126")
SELECT id as entity_id, simpledata as payload from log_data;
My expectation is that it can automatically use the current day instead of a hard-coded date string such as "20181126".
I tried:
INSERT INTO TABLE DailyData partition(date_of_data=from_unixtime(unix_timestamp(),'yyyyMMdd'))
SELECT id as entity_id, simpledata as payload from log_data;
and got the following exception:
NoViableAltException(26@[244:1: constant : ( Number | dateLiteral | timestampLiteral | intervalLiteral | StringLiteral | stringLiteralSequence | BigintLiteral | SmallintLiteral | TinyintLiteral | DecimalLiteral | charSetStringLiteral | booleanValue );])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:116)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.constant(HiveParser_IdentifiersParser.java:4928)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.partitionVal(HiveParser_IdentifiersParser.java:10726)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.partitionSpec(HiveParser_IdentifiersParser.java:10560)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.tableOrPartition(HiveParser_IdentifiersParser.java:10438)
at org.apache.hadoop.hive.ql.parse.HiveParser.tableOrPartition(HiveParser.java:49929)
at org.apache.hadoop.hive.ql.parse.HiveParser.insertClause(HiveParser.java:46629)
at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:43233)
at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:42451)
at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:42321)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1681)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1152)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:211)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:447)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:330)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1233)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1274)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1160)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
FAILED: ParseException line 1:65 cannot recognize input near 'from_unixtime' '(' 'unix_timestamp' in constant
I also tried:
INSERT INTO TABLE DailyData partition(date_of_data=data_of_date)
SELECT id as entity_id, simpledata as payload, from_unixtime(unix_timestamp(),'yyyyMMdd') as data_of_date from log_data;
and ran into a similar exception:
NoViableAltException(26@[244:1: constant : ( Number | dateLiteral | timestampLiteral | intervalLiteral | StringLiteral | stringLiteralSequence | BigintLiteral | SmallintLiteral | TinyintLiteral | DecimalLiteral | charSetStringLiteral | booleanValue );])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:116)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.constant(HiveParser_IdentifiersParser.java:4928)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.partitionVal(HiveParser_IdentifiersParser.java:10726)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.partitionSpec(HiveParser_IdentifiersParser.java:10560)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.tableOrPartition(HiveParser_IdentifiersParser.java:10438)
at org.apache.hadoop.hive.ql.parse.HiveParser.tableOrPartition(HiveParser.java:49929)
at org.apache.hadoop.hive.ql.parse.HiveParser.insertClause(HiveParser.java:46629)
at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:43233)
at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:42451)
at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:42321)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1681)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1152)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:211)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:447)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:330)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1233)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1274)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1160)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
FAILED: ParseException line 1:51 cannot recognize input near 'date_of_data' ')' 'SELECT' in constant
Is it possible to use a value rather than a constant in the partition clause?
This can be done by setting hive.exec.dynamic.partition.mode=nonstrict; and supplying the partition value as part of the SELECT statement.
Note that the last column in the SELECT clause is date_of_data, and because the partition is specified as partition(date_of_data), that column is used as the partition value.
Warning: if the column contains more than one value, records will be routed to the corresponding partitions. Use with caution.
set hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE DailyData partition(date_of_data)
SELECT id as entity_id, simpledata as payload, from_unixtime(unix_timestamp(),'yyyyMMdd') as date_of_data from log_data;
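Depending on the Hive version, dynamic partitioning itself may also need to be enabled (hive.exec.dynamic.partition=true), and on Hive 1.2+ the date can be derived with date_format(current_date, ...) instead of from_unixtime(unix_timestamp(), ...). A minimal sketch, assuming those settings and functions are available in your installation:
-- enable dynamic partitioning; the partition mode defaults to strict
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
-- date_format/current_date assume Hive 1.2 or later; otherwise keep
-- the from_unixtime(unix_timestamp(), 'yyyyMMdd') expression shown above
INSERT INTO TABLE DailyData partition(date_of_data)
SELECT id as entity_id,
       simpledata as payload,
       date_format(current_date, 'yyyyMMdd') as date_of_data
FROM log_data;
If you prefer to keep a static partition instead, another option is to compute the date in the calling script and pass it in with hive --hivevar, then reference it inside the partition spec as ${hivevar:yourVariableName} (the variable name here is just an illustration).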