BigQuery: create a table from Google Sheets files using Terraform
I am trying to use Terraform to create a BQ table that ingests data from Google Sheets. Here is my resource with its external_data_configuration block:
resource "google_bigquery_table" "sheet" {
  dataset_id = google_bigquery_dataset.bq-dataset.dataset_id
  table_id   = "sheet"

  external_data_configuration {
    autodetect    = true
    source_format = "GOOGLE_SHEETS"

    google_sheets_options {
      # Skip the header row of the sheet.
      skip_leading_rows = 1
    }

    source_uris = [
      "https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxxxxx",
    ]
  }
}
I made the file public, but when I try to create the table I get this error:
Error: googleapi: Error 400: Error while reading table: sheet, error
message: Failed to read the spreadsheet. Errors: No OAuth token with
Google Drive scope was found., invalid
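I read the Terraform documentation, and it seems I need to specify access_token and scopes in my provider.tf file. I just don't know how to do that, because I think it will conflict with my current authentication method (a service account).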
Solution
Add the scopes argument to your provider.tf:
provider "google" {
  credentials = file("${var.path}/secret.json")
  # The Drive scope lets BigQuery read the spreadsheet; the BigQuery scope covers the table itself.
  scopes = [
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/bigquery",
  ]
  project = var.project_id
  region  = var.gcp_region
}
You need to add scopes for both Google Drive and BigQuery.
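The provider block above references three input variables that are not shown in the question. A minimal sketch of how they might be declared, assuming plain string types (the descriptions are my own wording, not taken from the original configuration):

```hcl
# Assumed declarations for the variables used in the provider block.
variable "path" {
  type        = string
  description = "Directory that contains the service account key file secret.json"
}

variable "project_id" {
  type        = string
  description = "ID of the GCP project the provider operates on"
}

variable "gcp_region" {
  type        = string
  description = "Default region for regional resources"
}
```

Values can then be supplied through a .tfvars file or -var flags as usual.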
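I suspect you only need to provide the scopes while keeping your existing service account credentials. The service account credentials file does not specify scopes. According to the Terraform documentation, the following scopes are used by default: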
> https://www.googleapis.com/auth/compute
> https://www.googleapis.com/auth/cloud-platform
> https://www.googleapis.com/auth/ndev.clouddns.readwrite
> https://www.googleapis.com/auth/devstorage.full_control
> https://www.googleapis.com/auth/userinfo.email
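By default, most GCP services accept and use the cloud-platform scope. However, Google Drive does not accept or use the cloud-platform scope, so this particular BigQuery feature requires additional scopes to be specified. To make this work, you should augment the default Terraform scope list with the Google Drive scope https://www.googleapis.com/auth/drive (relevant BQ documentation). For a more exhaustive list of documented scopes, see https://developers.google.com/identity/protocols/oauth2/scopes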
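As a rough sketch of what augmenting the default list could look like: the provider block below keeps two of the default scopes quoted above and adds the Drive scope. It assumes the same variables and key file as the provider block shown earlier, and which defaults you keep is a judgment call on my part, not something the documentation prescribes.

```hcl
provider "google" {
  credentials = file("${var.path}/secret.json")
  project     = var.project_id
  region      = var.gcp_region

  scopes = [
    # Retained from the default list (assumption: these are the ones you still need).
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/userinfo.email",
    # Added so BigQuery can read the spreadsheet through Google Drive.
    "https://www.googleapis.com/auth/drive",
  ]
}
```

Setting scopes replaces the default list rather than appending to it, so any default scope you still rely on has to be listed explicitly.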
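An access token implies that you have already gone through an authentication flow and supplied the necessary scopes, so it does not make sense to provide both scopes and a token. You either generate a token that carries the scopes, or use the service account with the additional scopes.
Hope this helps.
Example: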
# Dedicated service account used only for the Drive-backed table.
resource "google_service_account" "gdrive-connector" {
  project      = "test-project"
  account_id   = "gdrive-connector"
  display_name = "Service account Google Drive transfers"
}

# Short-lived token for that account, carrying the Drive and BigQuery scopes.
data "google_service_account_access_token" "gdrive-connector" {
  target_service_account = google_service_account.gdrive-connector.email
  scopes = [
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/bigquery",
  ]
  lifetime = "300s"
}

# Aliased provider that authenticates with the token instead of a key file.
provider "google" {
  alias        = "gdrive-connector"
  access_token = data.google_service_account_access_token.gdrive-connector.access_token
}

resource "google_bigquery_dataset_iam_member" "gdrive-connector" {
  project    = "test-project"
  dataset_id = "test-dataset"
  role       = "roles/bigquery.dataOwner"
  member     = "serviceAccount:${google_service_account.gdrive-connector.email}"
}

resource "google_project_iam_member" "gdrive-connector" {
  project = "test-project"
  role    = "roles/bigquery.jobUser"
  member  = "serviceAccount:${google_service_account.gdrive-connector.email}"
}

# Only this table uses the aliased provider, so only it gets the Drive scope.
resource "google_bigquery_table" "sheets_table" {
  provider   = google.gdrive-connector
  project    = "test-project"
  dataset_id = "test-dataset"
  table_id   = "sheets_table"

  external_data_configuration {
    autodetect    = true
    source_format = "GOOGLE_SHEETS"

    google_sheets_options {
      skip_leading_rows = 1
    }

    source_uris = [
      "https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxxxx/edit?usp=sharing",
    ]
  }
}
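One thing the example leaves implicit: the google_service_account_access_token data source impersonates the target account, so the identity that runs Terraform needs the Service Account Token Creator role on it. A minimal sketch of that grant follows; the variable terraform_runner_member is a hypothetical name I introduce here for whatever identity your default provider authenticates as.

```hcl
# Hypothetical input: the IAM member that runs Terraform with the default provider.
variable "terraform_runner_member" {
  type        = string
  description = "IAM member running Terraform, e.g. serviceAccount:NAME@PROJECT.iam.gserviceaccount.com"
}

# Allow that identity to mint access tokens for the gdrive-connector account.
resource "google_service_account_iam_member" "gdrive-connector-token-creator" {
  service_account_id = google_service_account.gdrive-connector.name
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = var.terraform_runner_member
}
```

If the grant and the token are created in the same run, IAM changes may take a moment to propagate, so a first apply can fail and succeed on retry.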