Querying Bigtable data with BigQuery: how to write a correct table definition file

I am trying to set up external tables in BigQuery with Bigtable as the source. I can create the table, but when I run a SQL statement I get this error:

Error while reading data, error message: Error detected while parsing row starting at position: 6. Error: Data between close double quote (") and field separator.

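Most likely the table definition file is wrong. I spent the last day tinkering with it, but I could not find the problem. Can someone point out the mistake I made?

Thanks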

My workflow:

My table definition file:

I have also tried different variants of it.

{
"sourceFormat": "BIGTABLE",
"sourceUris": [
    "https://googleapis.com/bigtable/projects/ostabprj/instances/cryptorealtime/tables/cryptorealtime"
],
"bigtableOptions": {
    "columnFamilies" : [
        {
            "familyId": "market",
            "type": "STRING",
            "encoding": "TEXT"
        }
    ]
}

}

The data in Bigtable looks like this:

(from this tutorial: https://www.cloudskillsboost.google/focuses/5570?locale=en&parent=catalog)

----------------------------------------
XRP/USD#bitfinex#1641740466459#3848397969954
  market:delta                             @ 2022/01/09-15:01:06.459000
    "459"
  market:exchangeTime                      @ 2022/01/09-15:01:06.459000
    "1641740466000"
  market:market                            @ 2022/01/09-15:01:06.459000
    "bitfinex"
  market:orderType                         @ 2022/01/09-15:01:06.459000
    "BID"
  market:price                             @ 2022/01/09-15:01:06.459000
    "0.74557"
  market:volume                            @ 2022/01/09-15:01:06.459000
    "50"

The guides I used:

https://cloud.google.com/bigquery/external-data-bigtable#permanent-tables

https://cloud.google.com/bigquery/external-table-definition#tabledef-bigtable

(the links have changed)

In the meantime, a colleague found a solution to the problem. He also found a YouTube video that covers most of my issues.

https://www.youtube.com/watch?v=tW4h6-cQz9s

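I made 2 basic mistakes:

  • Bigtable and BigQuery were not in the same location (one in the US, the other in the EU)
  • The table definition file has to be uploaded through the console, not uploaded into a storage bucket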

His solution for the table definition file looks like this:

  • 'readRowkeyAsString' should be true

    {
        "sourceFormat": "BIGTABLE",
        "sourceUris": [
            "https://googleapis.com/bigtable/projects/..."
        ],
        "bigtableOptions": {
            "readRowkeyAsString": "true",
            "columnFamilies": [
                {
                    "familyId": "market",
                    "columns":[
                        {
                            "qualifierString": "market",
                            "type":"STRING"
                        },
                        {
                            "qualifierString": "exchangeTime",
                            "type":"STRING"
                        },
                        {
                            "qualifierString": "delta",
                            "type":"STRING"
                        },
                        {
                            "qualifierString": "orderType",
                            "type":"STRING"
                        },
                        {
                            "qualifierString": "volume",
                            "type":"STRING"
                        },
                        {
                            "qualifierString": "price",
                            "type":"STRING"
                        }
                    ]
                } 
            ]
        }
    }
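
Writing one entry per column qualifier by hand is easy to get wrong, so the same definition can also be generated. The following is only a minimal Python sketch under my own assumptions: the function name `build_bigtable_definition` and the `QUALIFIERS` list are hypothetical helpers, not part of any Google tooling, and the source URI is left elided exactly as in the file above.

```python
import json


def build_bigtable_definition(source_uri, family_id, qualifiers):
    """Build a table definition dict for one Bigtable column family.

    Hypothetical helper: it only mirrors the structure of the JSON file
    shown above and does not call any Google API.
    """
    return {
        "sourceFormat": "BIGTABLE",
        "sourceUris": [source_uri],
        "bigtableOptions": {
            # Kept as the string "true", as in the definition file above.
            "readRowkeyAsString": "true",
            "columnFamilies": [
                {
                    "familyId": family_id,
                    # One STRING column per qualifier in the family.
                    "columns": [
                        {"qualifierString": q, "type": "STRING"}
                        for q in qualifiers
                    ],
                }
            ],
        },
    }


# Qualifiers taken from the sample row in the question.
QUALIFIERS = ["market", "exchangeTime", "delta", "orderType", "volume", "price"]

definition = build_bigtable_definition(
    "https://googleapis.com/bigtable/projects/...",  # elided, as in the post
    "market",
    QUALIFIERS,
)
definition_json = json.dumps(definition, indent=4)
print(definition_json)
```

The printed JSON has the same shape as the file above and can be saved and used in its place.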

To use the whole thing in BigQuery, a data view is needed:

CREATE VIEW Table.Dataview AS
SELECT
    rowkey,
    market.delta as ts,
    ARRAY_TO_STRING(ARRAY(SELECT value FROM UNNEST(market.delta.cell)), "") AS delta,
    ARRAY_TO_STRING(ARRAY(SELECT value FROM UNNEST(market.market.cell)), "") AS market,
    ARRAY_TO_STRING(ARRAY(SELECT value FROM UNNEST(market.exchangeTime.cell)), "") AS exchangeTime,
    ARRAY_TO_STRING(ARRAY(SELECT value FROM UNNEST(market.orderType.cell)), "") AS orderType,
    ARRAY_TO_STRING(ARRAY(SELECT value FROM UNNEST(market.price.cell)), "") AS price,
    ARRAY_TO_STRING(ARRAY(SELECT value FROM UNNEST(market.volume.cell)), "") AS volume
FROM Table.Data
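
The view repeats the same ARRAY_TO_STRING expression for every column, so the statement can be generated from the list of qualifiers. This is just a sketch under my own assumptions: `build_view_sql` is a hypothetical helper, it only builds the SQL text (it does not run it), and it leaves out the `market.delta as ts` line from the view above.

```python
def build_view_sql(view_name, table_name, family, qualifiers):
    """Return a CREATE VIEW statement that flattens each column's cells.

    Hypothetical helper: it only formats SQL text in the same shape as
    the view above, one ARRAY_TO_STRING expression per qualifier.
    """
    selects = [
        f"    ARRAY_TO_STRING(ARRAY(SELECT value FROM UNNEST({family}.{q}.cell)), \"\") AS {q}"
        for q in qualifiers
    ]
    lines = [
        f"CREATE VIEW {view_name} AS",
        "SELECT",
        "    rowkey,",
        ",\n".join(selects),
        f"FROM {table_name}",
    ]
    return "\n".join(lines)


# View and table names are the ones used in the post.
view_sql = build_view_sql(
    "Table.Dataview",
    "Table.Data",
    "market",
    ["delta", "market", "exchangeTime", "orderType", "price", "volume"],
)
print(view_sql)
```

The printed statement can then be pasted into the BigQuery query editor.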