Google Apps Script - How to stream JSON data into BigQuery?

In this reference (https://developers.google.com/apps-script/advanced/bigquery), the following is used to load CSV data into BigQuery:

var file = DriveApp.getFileById(csvFileId);
var data = file.getBlob().setContentType('application/octet-stream');

// Create the data upload job.
var job = {
  configuration: {
    load: {
      destinationTable: {
        projectId: projectId,
        datasetId: datasetId,
        tableId: tableId
      },
      skipLeadingRows: 1
    }
  }
};
job = BigQuery.Jobs.insert(job, projectId, data);

As I understand it, they send BigQuery an opaque, untyped blob via file.getBlob().setContentType('application/octet-stream');

How can I send JSON to BigQuery from Apps Script?

With the @google-cloud/bigquery library (in a project outside of Apps Script, following https://cloud.google.com/bigquery/streaming-data-into-bigquery#streaminginsertexamples), I can do this:

// Import the Google Cloud client library
const { BigQuery } = require('@google-cloud/bigquery')
const moment = require('moment')

exports.insertUsageLog = async (userId) => {
  const datasetId = 'usage'
  const tableId = 'logs'
  const rows = [
    // The JSON data is collected here
    {
      timestamp: moment.utc().toISOString(),
      userId,
      // Something else ...
    },
  ]

  // Create a client
  const bigqueryClient = new BigQuery()

  // Insert data into a table
  await bigqueryClient
    .dataset(datasetId)
    .table(tableId)
    .insert(rows)
  console.log(`Inserted ${rows.length} rows`)
}

The data payload for BigQuery.Jobs.insert() must be a blob.

You can create that blob from either CSV content or newline-delimited JSON. Newline-delimited JSON is a distinct form of JSON that BigQuery requires; it is not natively supported by Apps Script. However, you can produce it from ordinary JavaScript objects by calling JSON.stringify() on each record and joining the results with newline characters, as sketched below. Alternatively, you can lean on an existing JavaScript library (you can find options through NPM, or just search GitHub).
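For example, a minimal sketch of the conversion (the helper name toNewlineDelimitedJson and the sample rows are my own, for illustration):

// Convert an array of plain objects to newline-delimited JSON:
// one JSON.stringify()-ed object per line, joined by '\n'.
function toNewlineDelimitedJson(rows) {
  return rows.map(function(row) {
    return JSON.stringify(row);
  }).join('\n');
}

// Example: two log records become two lines of NDJSON.
var ndjson = toNewlineDelimitedJson([
  { timestamp: new Date().toISOString(), userId: 'u-123' },
  { timestamp: new Date().toISOString(), userId: 'u-456' }
]);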

Once you have generated the newline-delimited JSON (as a string or byte array), you will need to convert it to a blob with Utilities.newBlob() and pass that to the BigQuery.Jobs.insert() method.
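Putting it together, here is a sketch of the full flow. The function name loadJsonRows is hypothetical; sourceFormat: 'NEWLINE_DELIMITED_JSON' is the load-job setting that tells BigQuery to parse the blob as newline-delimited JSON. This assumes the BigQuery advanced service is enabled and that projectId / datasetId / tableId identify an existing table:

// Sketch: load an array of row objects into BigQuery from Apps Script.
function loadJsonRows(projectId, datasetId, tableId, rows) {
  // Build newline-delimited JSON, as shown above.
  var ndjson = rows.map(function(row) {
    return JSON.stringify(row);
  }).join('\n');

  // Wrap the string in a blob, as BigQuery.Jobs.insert() expects.
  var data = Utilities.newBlob(ndjson, 'application/octet-stream');

  var job = {
    configuration: {
      load: {
        destinationTable: {
          projectId: projectId,
          datasetId: datasetId,
          tableId: tableId
        },
        sourceFormat: 'NEWLINE_DELIMITED_JSON'
      }
    }
  };
  job = BigQuery.Jobs.insert(job, projectId, data);
  Logger.log('Load job started: %s', job.jobReference.jobId);
}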