无法使用字符串数据类型中的 unixtimestamp 列类型

Unable to work with unixtimestamp column type in string data type

我有一个配置单元 table 来加载 JSON 数据。我的 JSON 中有两个值。两者的数据类型都是字符串。如果我将它们保留为 bigint,则此 table 上的 select 会给出以下错误:

Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
 at [Source: java.io.ByteArrayInputStream@3b6c740b; line: 1, column: 21]

如果我把它改成两个字符串,就可以了。

现在,因为这些列是字符串,所以我无法对这些列使用 from_unixtime 方法。

如果我尝试将这些列的数据类型从字符串更改为 bigint,则会出现以下错误:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : uploadtimestamp

下面是我的创建 table 语句:

create table ABC
(
    uploadTimeStamp bigint
   ,PDID            string

   ,data            array
                    <
                        struct
                        <
                            Data:struct
                            <
                                unit:string
                               ,value:string
                               ,heading:string
                               ,loc:string
                               ,loc1:string
                               ,loc2:string
                               ,loc3:string
                               ,speed:string
                               ,xvalue:string
                               ,yvalue:string
                               ,zvalue:string
                            >
                           ,Event:string
                           ,PDID:string
                           ,`Timestamp`:string
                           ,Timezone:string
                           ,Version:string
                           ,pii:struct<dummy:string>
                        >
                    >
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
stored as textfile;

我的JSON:

{"uploadTimeStamp":"1488793268598","PDID":"123","data":[{"Data":{"unit":"rpm","value":"100"},"EventID":"E1","PDID":"123","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"Data":{"heading":"N","loc":"false","loc1":"16.032425","loc2":"80.770587","loc3":"false","speed":"10"},"EventID":"Location","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.1","pii":{}},{"Data":{"xvalue":"1.1","yvalue":"1.2","zvalue":"2.2"},"EventID":"AccelerometerInfo","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"EventID":"FuelLevel","Data":{"value":"50","unit":"percentage"},"Version":"1.0","Timestamp":1488793268598,"PDID":"skga06031430gedvcl1pdid2367","Timezone":330},{"Data":{"unit":"kmph","value":"70"},"EventID":"VehicleSpeed","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}}]}

有什么方法可以将此字符串 unixtimestamp 转换为标准时间,或者我可以为这些列使用 bigint?

  1. 如果您在谈论 时间戳时区 那么您 可以 将它们定义为 int/big int 类型。
    如果您查看它们的定义,您会发现值周围没有限定符 ("),因此它们在 JSON 文档中属于数字类型:

    "Timestamp":1488793268598,"Timezone":330


create external table myjson
(
    uploadTimeStamp string
   ,PDID            string

   ,data            array
                    <
                        struct
                        <
                            Data:struct
                            <
                                unit:string
                               ,value:string
                               ,heading:string
                               ,loc3:string
                               ,loc:string
                               ,loc1:string
                               ,loc4:string
                               ,speed:string
                               ,x:string
                               ,y:string
                               ,z:string
                            >
                           ,EventID:string
                           ,PDID:string
                           ,`Timestamp`:bigint
                           ,Timezone:smallint
                           ,Version:string
                           ,pii:struct<dummy:string>
                        >
                    >
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe' 
stored as textfile
location '/tmp/myjson'
;

+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| myjson.uploadtimestamp | myjson.pdid |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         myjson.data                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|          1486631318873 |         123 | [{"data":{"unit":"rpm","value":"0","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E1","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":"N","loc3":"false","loc":"14.022425","loc1":"78.760587","loc4":"false","speed":"10","x":null,"y":null,"z":null},"eventid":"E2","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.1","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":"1.1","y":"1.2","z":"2.2"},"eventid":"E3","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":"percentage","value":"50","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E4","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":null},{"data":{"unit":"kmph","value":"70","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E5","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}}] |
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

  1. 即使您已将 Timestamp 定义为字符串,您仍然可以在需要 bigint 的函数中使用它之前将其转换为 bigint。

    cast(`Timestamp` 作为 bigint)


hive> with t as (select '0' as `timestamp`) select from_unixtime(`timestamp`) from t;

FAILED: SemanticException [Error 10014]: Line 1:45 Wrong arguments 'timestamp': No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string). Possible choices: FUNC(bigint) FUNC(bigint, string) FUNC(int) FUNC(int, string)

hive> with t as (select '0' as `timestamp`) select from_unixtime(cast(`timestamp` as bigint)) from t;
OK
1970-01-01 00:00:00