Having Issues Uploading Table to BigQuery
I am trying to upload a .csv file to BigQuery for a project in the Google Data Analytics Certificate program, but I keep running into this error:
Unexpected error
Tracking number: c1642301846583819
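Here is the table creation menu with the fields I filled in (linked because I am a new user).
I know there is currently a bug that produces a similar error message, where the table is still created once the page is refreshed, but that is not what happens with this file. After refreshing the page the table is still unavailable, so it was definitely not created. This file is larger than the previous files that did work when uploaded this way. I also tried uploading the file to Google Cloud and loading it into BigQuery from there, which returns this error message: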
Failed to create table: Error while reading data, error message: Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15 with message 'Invalid time zone: AM'
When I go into Google Cloud Logging, I see the following error entry:
{
"protoPayload": {
"@type": "type.googleapis.com/google.cloud.audit.AuditLog",
"status": {
"code": 3,
"message": "Error while reading data, error message: Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15 with message 'Invalid time zone: AM'"
},
"authenticationInfo": {
"principalEmail": "REDACTED"
},
"requestMetadata": {
"callerIp": "2600:8801:21c:3f00:306d:ff1a:ed67:fb92",
"callerSuppliedUserAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36,gzip(gfe),gzip(gfe)"
},
"serviceName": "bigquery.googleapis.com",
"methodName": "google.cloud.bigquery.v2.JobService.InsertJob",
"authorizationInfo": [
{
"resource": "projects/analytics-capstone-project",
"permission": "bigquery.jobs.create",
"granted": true
}
],
"resourceName": "projects/analytics-capstone-project/jobs/bquxjob_42de9715_180df3c3ba2",
"metadata": {
"jobChange": {
"after": "DONE",
"job": {
"jobName": "projects/analytics-capstone-project/jobs/bquxjob_42de9715_180df3c3ba2",
"jobConfig": {
"type": "IMPORT",
"loadConfig": {
"sourceUris": [
"gs://exampleforml/heartrate_seconds_merged.csv"
],
"schemaJson": "{\n}",
"destinationTable": "projects/analytics-capstone-project/datasets/FitBit_Fitness_Tracker_Data/tables/hr_seconds",
"createDisposition": "CREATE_IF_NEEDED",
"writeDisposition": "WRITE_EMPTY"
}
},
"jobStatus": {
"jobState": "DONE",
"errorResult": {
"code": 3,
"message": "Error while reading data, error message: Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15 with message 'Invalid time zone: AM'"
},
"errors": [
{
"code": 3,
"message": "Error while reading data, error message: Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15 with message 'Invalid time zone: AM'"
},
{
"code": 3,
"message": "Error while reading data, error message: CSV processing encountered too many errors, giving up. Rows: 1; errors: 1; max bad: 0; error percent: 0"
},
{
"code": 3,
"message": "Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15 with message 'Invalid time zone: AM'"
}
]
},
"jobStats": {
"createTime": "2022-05-20T02:11:47.618Z",
"startTime": "2022-05-20T02:11:47.759Z",
"endTime": "2022-05-20T02:11:48.821Z",
"loadStats": {},
"totalSlotMs": "138"
}
}
},
"@type": "type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"
}
},
"insertId": "-fsqzz0e2lplj",
"resource": {
"type": "bigquery_project",
"labels": {
"project_id": "analytics-capstone-project",
"location": "US"
}
},
"timestamp": "2022-05-20T02:11:48.832533Z",
"severity": "ERROR",
"logName": "projects/analytics-capstone-project/logs/cloudaudit.googleapis.com%2Fdata_access",
"operation": {
"id": "1653012707618-analytics-capstone-project:bquxjob_42de9715_180df3c3ba2",
"producer": "bigquery.googleapis.com",
"last": true
},
"receiveTimestamp": "2022-05-20T02:11:49.112933352Z"
}
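The original file I am trying to upload is in this kaggle dataset, and is named heartrate_seconds_merged.
I am not sure how to fix this and would appreciate any help! Thanks!
Edit: I tried to carry on uploading more data for the project, but now I can no longer create tables at all. Is there something wrong with the way I am trying to create them?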
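Google is quite picky about how its various date and time fields are formatted during ingestion. They mention this, all too briefly, in the documentation here.
Per your logs, the error is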
"Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP".
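Per Google's documentation:
- The date portion can be formatted as YYYY-MM-DD or YYYY/MM/DD.
- The timestamp portion must be formatted as HH:MM[:SS[.SSSSSS]] (seconds and fractions of seconds are optional).
- The date and time must be separated by a space or 'T'.
- Optionally, the date and time can be followed by a UTC offset or the UTC zone designator (Z). You can learn more about timezones here.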
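To check a value against the rules listed above before attempting a load, something like the following Python sketch might help. The regular expression is only an approximation written from those four bullet points; it is not BigQuery's actual parser, and the exact offset syntax it allows (for example +08:00 or -8) and the rule that both date separators must match are assumptions. The function name `looks_loadable` is made up for illustration.

```python
import re

# Approximation of the documented CSV TIMESTAMP rules (not BigQuery's own parser):
#   date:      YYYY-MM-DD or YYYY/MM/DD (same separator twice; an assumption)
#   separator: a space or 'T'
#   time:      HH:MM[:SS[.SSSSSS]]
#   optional:  'Z' or a UTC offset such as +08:00 (offset syntax is an assumption)
TIMESTAMP_PATTERN = re.compile(
    r"\d{4}([-/])\d{2}\1\d{2}"         # date portion
    r"[ T]"                            # separator
    r"\d{2}:\d{2}(:\d{2}(\.\d{1,6})?)?"  # time portion
    r"( ?(Z|[+-]\d{1,2}(:\d{2})?))?"   # optional zone designator or offset
)


def looks_loadable(value: str) -> bool:
    """Return True if value matches the approximate pattern above."""
    return TIMESTAMP_PATTERN.fullmatch(value) is not None
```

With this sketch, the value from the error message, 4/12/2016 7:21:00 AM, is rejected, which is consistent with the load failure in the log.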
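So where does that leave you? Unfortunately, the DATETIME type requires a similar YYYY-[M]M-[D]D[( |T)[H]H:[M]M:[S]S[.F]] format. Assuming you want to stick with the TIMESTAMP type in your schema and your timestamps are in UTC, this means you will have to preprocess each file to convert your values from 4/12/2016 7:21:00 AM to 2016/04/12 07:21:00 (note that the leading zeros in each segment are not optional, although in this case you could choose to drop the seconds).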
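One way to do that preprocessing is a short Python script, sketched below. It assumes the timestamp column is named Time (the field name in the error message), that every value follows the month/day/year 12-hour pattern seen in the log, and that the script runs under an English or C locale so that AM/PM is recognized. The function names and file paths are hypothetical. It writes the date with hyphens (2016-04-12), which is the other date form the documentation lists.

```python
import csv
from datetime import datetime

# Input looks like '4/12/2016 7:21:00 AM'; output looks like '2016-04-12 07:21:00'.
SOURCE_FORMAT = "%m/%d/%Y %I:%M:%S %p"
TARGET_FORMAT = "%Y-%m-%d %H:%M:%S"


def convert_timestamp(value: str) -> str:
    """Reformat one 12-hour US-style timestamp as a zero-padded 24-hour one."""
    return datetime.strptime(value.strip(), SOURCE_FORMAT).strftime(TARGET_FORMAT)


def convert_csv(src_path: str, dst_path: str, column: str = "Time") -> None:
    """Copy a CSV file, rewriting the given timestamp column row by row."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            row[column] = convert_timestamp(row[column])
            writer.writerow(row)


# Example call (file names are placeholders):
# convert_csv("heartrate_seconds_merged.csv", "heartrate_seconds_fixed.csv")
```

Processing the file row by row keeps memory use flat, which matters for a file as large as the per-second heart rate data. The converted file can then be uploaded in place of the original.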