Amazon Athena : HIVE_METASTORE_ERROR: name expected at the position 22 of [...] but ' ' is found
I am using a Serverless Framework file with CloudFormation to create a table in AWS Athena.
My serverless.yml:
...
CardBulkWorkgroup:
  Type: AWS::Athena::WorkGroup
  Properties:
    Name: ${opt:stage}-${opt:client}-CardBulk
    WorkGroupConfiguration:
      ResultConfiguration:
        OutputLocation: s3://${lower:${opt:stage}}-${lower:${opt:client}}-card-bulk-athena-result
CardBulkDatabase:
  Type: AWS::Glue::Database
  Properties:
    CatalogId: !Ref AWS::AccountId
    DatabaseInput:
      Name: ${lower:${opt:stage}}_${lower:${opt:client}}_bulkcard
CardBulkTable:
  Type: AWS::Glue::Table
  Properties:
    CatalogId: !Ref AWS::AccountId
    DatabaseName: !Ref CardBulkDatabase
    TableInput:
      Name: card
      StorageDescriptor:
        Columns:
          - Name: cardId
            Type: int
          - Name: metadata
            Type: struct<orderId:string, convertVirtualToPhysicalErrors:string>
          - Name: orderId
            Type: string
          - Name: errors
            Type: string
        Location: s3://${lower:${opt:stage}}_${lower:${opt:client}}-files/cards
        InputFormat: org.apache.hadoop.mapred.TextInputFormat
        OutputFormat: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
        SerdeInfo:
          SerializationLibrary: org.openx.data.jsonserde.JsonSerDe
          Parameters:
            "serialization.format": "1"
CardBulkAthenaBucketResult:
  Type: AWS::S3::Bucket
  Properties:
    BucketName: ${lower:${opt:stage}}-${lower:${opt:client}}-card-bulk-athena-result
...
When I deploy the stack, the database dev_connect_bulkcard and my table card are created correctly.
Problem:
When I use my API to retrieve data from the card table in my dev_connect_bulkcard database, I get this error:
HIVE_METASTORE_ERROR: Error: name expected at the position 22 of 'struct<orderId:string, convertVirtualToPhysicalErrors:string>' but ' ' is found.
However, if I delete the card table directly from the AWS console (in the Athena service) and recreate it with this query:
CREATE EXTERNAL TABLE `card`(
`cardid` int COMMENT 'from deserializer',
`orderid` string COMMENT 'from deserializer',
`metadata` struct<orderid:string,convertvirtualtophysicalerrors:string> COMMENT 'from deserializer',
`errors` array<string> COMMENT 'from deserializer')
ROW FORMAT SERDE
'org.openx.data.jsonserde.JsonSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION
's3://dev-connect-files/cards'
TBLPROPERTIES (
'has_encrypted_data'='false',
'transient_lastDdlTime'='1627378097')
it works, and my API returns the data from the card table.
Do you know why I have to delete my table manually and recreate it to get results back?
Thanks in advance.
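The error is caused by the space in the type of the metadata column. Position 22 of the type string is the space right after the comma that follows orderId:string; the parser expects the next field name there and finds ' ' instead. Remove the space between orderId:string, and convertVirtualToPhysicalErrors. Athena does not accept a space as a special character in column names. This is also why recreating the table manually works: the struct type in your CREATE EXTERNAL TABLE statement contains no space. See the linked reference for details.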
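As a minimal sketch of the fix, the Columns list from the question could look like the fragment below. The only change is that the space after the comma in the metadata struct type is removed; everything else is copied from the question. Note that the manually created table declares errors as array<string>, so that column type may also need adjusting if the data holds an array, but that is an assumption about the data rather than part of this fix.

```yaml
# Fragment of TableInput.StorageDescriptor from the question's serverless.yml.
# Assumption: only the metadata type string changes; all other values are as in the question.
Columns:
  - Name: cardId
    Type: int
  - Name: metadata
    # No whitespace inside the struct type string
    Type: struct<orderId:string,convertVirtualToPhysicalErrors:string>
  - Name: orderId
    Type: string
  - Name: errors
    Type: string
```

After redeploying the stack with this change, the table definition stored in the Glue catalog should carry the same struct type shape as the one produced by the manual CREATE EXTERNAL TABLE statement.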