BigQuery load fails with Bad character (ASCII 0) while importing Datastore backup

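This may look like a scenario that has already been discussed. I am trying to load a Google App Engine Datastore backup into BigQuery using the Talend tBigQueryBulkExec component, which works the same way as the bq shell CLI. It connects to BigQuery, tries to read the file from GCS, and loads it into the Dataset.Tablename defined in the component settings.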

Error message:

location":"File: 0 / Line:8 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: ","reason":"invalid"}

Entire message:

{"configuration":{"load":{"createDisposition":"CREATE_NEVER","destinationTable":{"datasetId":"sample_red","projectId":" test","tableId":"bqload1"},"schema":{"fields":[{"name":"file","type":"STRING"}]},"skipLeadingRows":1,"sourceUris":["gs:// test.appspot.com/bucket/ahFzfnZpcmdpbi1yZWQtdGVzdHJBCxIcX0FFX0RhdGFzdG9yZUFkbWluX09wZXJhdGlvbhiB64MBDAsSFl9BRV9CYWNrdXBfSW5mb3JtYXRpb24YAQw.Challenge.backup_info"],"writeDisposition":"WRITE_TRUNCATE"}},"etag":"\"AJDc2PKvhXhnNlIwTi02BO3aoe8/1ZnlNbMA0eEnHxZQC_gKepG8Mio\"","id":" test:job_yFJa_JVN0E05GZQZNvtlZR6Bgjo","jobReference":{"jobId":"job_yFJa_JVN0E05GZQZNvtlZR6Bgjo","projectId":"test"},"kind":"bigquery#job","selfLink":"https://www.googleapis.com/bigquery/v2/projects/buckett/jobs/job_yFJa_JVN0E05GZQZNvtlZR6Bgjo","statistics":{"endTime":"1427358416307","startTime":"1427358414687","creationTime":"1427358397621","load":{"inputFiles":"1","inputFileBytes":"565","outputRows":"0","outputBytes":"0"}},"status":{"errorResult":{"location":"File: 0 / Line:11 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\u000Bcontent>","reason":"invalid"},"errors":[{"location":"File: 0 / Line:5 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\u0006status\u0012>","reason":"invalid"},{"location":"File: 0 / Line:6 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\tstartDa>","reason":"invalid"},{"location":"File: 0 / Line:8 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: ","reason":"invalid"},{"location":"File: 0 / Line:10 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: ","reason":"invalid"},{"location":"File: 0 / Line:11 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\u000Bcontent>","reason":"invalid"}],"state":"DONE"},"user_email":"xx@gmail.com"}
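The response is hard to read as a single line. As a rough sketch (not part of Talend or the bq CLI), the per-line errors can be pulled out of the job resource with Python's standard `json` module. The helper name `summarize_load_errors` is hypothetical, and `sample` is a trimmed excerpt of the message above rather than the full response.

```python
import json


def summarize_load_errors(job_json):
    """Return (location, message) pairs from the job's status.errors list."""
    job = json.loads(job_json)
    errors = job.get("status", {}).get("errors", [])
    return [(e.get("location", ""), e.get("message", "")) for e in errors]


# Trimmed excerpt of the job response above (two of the five errors).
sample = r'''{"status":{"errors":[
{"location":"File: 0 / Line:5 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\u0006status\u0012>","reason":"invalid"},
{"location":"File: 0 / Line:6 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\tstartDa>","reason":"invalid"}
],"state":"DONE"}}'''

for location, message in summarize_load_errors(sample):
    # repr() makes the control characters in each field visible.
    print(location, "->", repr(message))
```

Printing with `repr()` shows that the reported fields begin with control bytes (for example `\x06` and `\t`) rather than printable text.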

I read in other posts that the Bad character (ASCII 0) error is a bug that would be fixed in the next release. Has that not been done yet?

It looks like you have a Unicode tab character that Talend cannot parse correctly, because it expects ASCII text.

If you go to the advanced settings of the tBigQueryBulkExec component, there should be an encoding option. If you set it to "utf-8", it should work.
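To check whether the file really contains the zero bytes the error complains about, a small script can scan a local copy. This is a minimal sketch under assumptions: the helper `find_nul_lines` is hypothetical, the sample bytes are made up for illustration, and splitting on `\n` only approximates how BigQuery counts lines.

```python
def find_nul_lines(data):
    """Return 1-based numbers of the lines that contain a NUL (0x00) byte."""
    lines = data.split(b"\n")
    return [i for i, line in enumerate(lines, start=1) if b"\x00" in line]


# Made-up sample: the third line carries a NUL byte.
sample = b"header\nplain text\n\x06status\x12\x00\n"
print(find_nul_lines(sample))

# Plain text with a tab has no NUL bytes when encoded as UTF-8,
# but every ASCII character gains a zero byte when encoded as UTF-16.
print(find_nul_lines("a\tb".encode("utf-8")))
print(find_nul_lines("a\tb".encode("utf-16-le")))
```

For a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the helper. If NUL bytes show up on the lines named in the error, the input is probably not plain delimited text in the encoding the loader expects.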