Having Issues Uploading Table to BigQuery

I'm trying to upload a .csv file to BigQuery for a project in the Google Data Analytics Certificate program, but I keep running into this error:

Unexpected error
Tracking number: c1642301846583819

Here is the table creation menu with the fields I filled in (linked because I'm a new user).

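I know there is currently a bug that produces a similar error message even though the table is still created once you refresh the page, but that is not what happens with this file. After refreshing the page the table is still unavailable, so it was definitely not created. This file is larger than the earlier files that did work this way. I also tried uploading the file to Google Cloud Storage and loading it into BigQuery from there, which returns this error message: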

Failed to create table: Error while reading data, error message: Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15 with message 'Invalid time zone: AM'

When I look in Google Cloud Logging, I see the following error entry:

{
  "protoPayload": {
    "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
    "status": {
      "code": 3,
      "message": "Error while reading data, error message: Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15  with message 'Invalid time zone: AM'"
    },
    "authenticationInfo": {
      "principalEmail": "REDACTED"
    },
    "requestMetadata": {
      "callerIp": "2600:8801:21c:3f00:306d:ff1a:ed67:fb92",
      "callerSuppliedUserAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36,gzip(gfe),gzip(gfe)"
    },
    "serviceName": "bigquery.googleapis.com",
    "methodName": "google.cloud.bigquery.v2.JobService.InsertJob",
    "authorizationInfo": [
      {
        "resource": "projects/analytics-capstone-project",
        "permission": "bigquery.jobs.create",
        "granted": true
      }
    ],
    "resourceName": "projects/analytics-capstone-project/jobs/bquxjob_42de9715_180df3c3ba2",
    "metadata": {
      "jobChange": {
        "after": "DONE",
        "job": {
          "jobName": "projects/analytics-capstone-project/jobs/bquxjob_42de9715_180df3c3ba2",
          "jobConfig": {
            "type": "IMPORT",
            "loadConfig": {
              "sourceUris": [
                "gs://exampleforml/heartrate_seconds_merged.csv"
              ],
              "schemaJson": "{\n}",
              "destinationTable": "projects/analytics-capstone-project/datasets/FitBit_Fitness_Tracker_Data/tables/hr_seconds",
              "createDisposition": "CREATE_IF_NEEDED",
              "writeDisposition": "WRITE_EMPTY"
            }
          },
          "jobStatus": {
            "jobState": "DONE",
            "errorResult": {
              "code": 3,
              "message": "Error while reading data, error message: Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15  with message 'Invalid time zone: AM'"
            },
            "errors": [
              {
                "code": 3,
                "message": "Error while reading data, error message: Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15  with message 'Invalid time zone: AM'"
              },
              {
                "code": 3,
                "message": "Error while reading data, error message: CSV processing encountered too many errors, giving up. Rows: 1; errors: 1; max bad: 0; error percent: 0"
              },
              {
                "code": 3,
                "message": "Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP for field Time (position 1) starting at location 15  with message 'Invalid time zone: AM'"
              }
            ]
          },
          "jobStats": {
            "createTime": "2022-05-20T02:11:47.618Z",
            "startTime": "2022-05-20T02:11:47.759Z",
            "endTime": "2022-05-20T02:11:48.821Z",
            "loadStats": {},
            "totalSlotMs": "138"
          }
        }
      },
      "@type": "type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"
    }
  },
  "insertId": "-fsqzz0e2lplj",
  "resource": {
    "type": "bigquery_project",
    "labels": {
      "project_id": "analytics-capstone-project",
      "location": "US"
    }
  },
  "timestamp": "2022-05-20T02:11:48.832533Z",
  "severity": "ERROR",
  "logName": "projects/analytics-capstone-project/logs/cloudaudit.googleapis.com%2Fdata_access",
  "operation": {
    "id": "1653012707618-analytics-capstone-project:bquxjob_42de9715_180df3c3ba2",
    "producer": "bigquery.googleapis.com",
    "last": true
  },
  "receiveTimestamp": "2022-05-20T02:11:49.112933352Z"
}

The original file I'm trying to upload is in this Kaggle dataset and is named heartrate_seconds_merged.

I'm not sure how to fix this and would appreciate any help. Thanks!

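Edit: I tried to continue uploading more data for the project, but now I can no longer create tables at all. Is there something wrong with the way I'm trying to create them?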

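Google is very picky about how its various date and time fields are formatted during ingestion. This is mentioned, all too briefly, in the documentation here.

According to your log, the error is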

"Could not parse '4/12/2016 7:21:00 AM' as TIMESTAMP".

According to Google's documentation:

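  • The date portion can be formatted as YYYY-MM-DD or YYYY/MM/DD.
  • The time portion must be formatted as HH:MM[:SS[.SSSSSS]] (seconds and fractional seconds are optional).
  • The date and time must be separated by a space or 'T'.
  • Optionally, the date and time can be followed by a UTC offset or the UTC zone designator (Z). You can learn more about time zones here.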
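
To see quickly whether a value from your file would pass these rules, something like the following might help. It is only a rough approximation of the rules as listed above, not BigQuery's actual parser, and the name looks_like_bq_timestamp is made up for this example.

```python
import re

# Rough approximation of the documented rules; NOT BigQuery's real parser.
_TIMESTAMP_RE = re.compile(
    r"^\d{4}([-/])\d{2}\1\d{2}"          # date: YYYY-MM-DD or YYYY/MM/DD
    r"[ T]"                               # separator: a space or 'T'
    r"\d{2}:\d{2}(:\d{2}(\.\d{1,6})?)?"   # time: HH:MM[:SS[.SSSSSS]]
    r"( ?(Z|[+-]\d{1,2}(:\d{2})?))?$"     # optional UTC designator or offset
)


def looks_like_bq_timestamp(value: str) -> bool:
    """Return True if value roughly follows the format described above."""
    return _TIMESTAMP_RE.match(value) is not None
```

The value from your log, '4/12/2016 7:21:00 AM', fails this check at the very first step because it does not start with a four-digit year.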

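So where does that leave you? Unfortunately, the DATETIME type requires a similar YYYY-[M]M-[D]D[( |T)[H]H:[M]M:[S]S[.F]] format. Assuming you want to stick with the TIMESTAMP type for your schema and your timestamps are in UTC, this means you will have to preprocess each file to convert your values from 4/12/2016 7:21:00 AM to 2016/04/12 07:21:00 (note that the leading zeros in each segment are not optional, although you could choose to drop the seconds in this case).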
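
A minimal sketch of that preprocessing step, using only the Python standard library. It assumes the values are in month/day/year order with a 12-hour clock (as '4/12/2016 7:21:00 AM' suggests), that the column is called Time as in your error message, and that the file has a header row. The function names are made up for this example.

```python
import csv
from datetime import datetime

# Assumed input layout: month/day/year, 12-hour clock with AM/PM.
# Note: %p depends on the locale; this assumes an English/C locale.
SOURCE_FORMAT = "%m/%d/%Y %I:%M:%S %p"
# Target layout from the answer above: zero-padded, 24-hour clock.
TARGET_FORMAT = "%Y/%m/%d %H:%M:%S"


def convert_timestamp(value: str) -> str:
    """Convert e.g. '4/12/2016 7:21:00 AM' to '2016/04/12 07:21:00'."""
    return datetime.strptime(value.strip(), SOURCE_FORMAT).strftime(TARGET_FORMAT)


def convert_file(src_path: str, dst_path: str, column: str = "Time") -> None:
    """Copy a CSV file, rewriting one timestamp column (assumed name: Time)."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            row[column] = convert_timestamp(row[column])
            writer.writerow(row)
```

You would call convert_file with the path to heartrate_seconds_merged.csv and an output path of your choosing, then load the converted file instead of the original. strptime raises a ValueError on the first value that does not match the assumed input layout, so unexpected rows surface right away rather than being rewritten incorrectly.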