Error: Databricks API requires you to set `host` property
Related question: Terraform Databricks AWS instance profile - "authentication is not configured for provider"
After fixing the error from that question and continuing, I started getting the following error in several different operations (creating a Databricks instance profile, querying Terraform Databricks data sources such as databricks_current_user or databricks_spark_version, etc.):
Error: cannot create instance profile: Databricks API (/api/2.0/instance-profiles/add) requires you to set `host` property (or DATABRICKS_HOST env variable) to result of `databricks_mws_workspaces.this.workspace_url`. This error may happen if you're using provider in both normal and multiworkspace mode. Please refactor your code into different modules. Runnable example that we use for integration testing can be found in this repository at https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs/guides/aws-workspace
I am able to create the instance profile manually in the Databricks workspace admin console, and within the workspace I can create clusters and run notebooks.
Relevant code:
main.tf:
module "create-workspace" {
  source                      = "./modules/create-workspace"
  env                         = var.env
  region                      = var.region
  databricks_host             = var.databricks_host
  databricks_account_username = var.databricks_account_username
  databricks_account_password = var.databricks_account_password
  databricks_account_id       = var.databricks_account_id
}
providers-main.tf:
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.4.4"
    }
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.49.0"
    }
  }
}

provider "aws" {
  region  = var.region
  profile = var.aws_profile
}

provider "databricks" {
  host  = var.databricks_host
  token = var.databricks_manually_created_workspace_token
}
modules/create-workspace/providers.tf:
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.4.4"
    }
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.49.0"
    }
  }
}

provider "aws" {
  region  = var.region
  profile = var.aws_profile
}

provider "databricks" {
  host = var.databricks_host
  # token = var.databricks_manually_created_workspace_token - doesn't make a difference switching from username/password to token
  username   = var.databricks_account_username
  password   = var.databricks_account_password
  account_id = var.databricks_account_id
}

provider "databricks" {
  alias = "mws"
  # host =
  username   = var.databricks_account_username
  password   = var.databricks_account_password
  account_id = var.databricks_account_id
}
modules/create-workspace/databricks-workspace.tf:
resource "databricks_mws_credentials" "this" {
  provider         = databricks.mws
  account_id       = var.databricks_account_id
  role_arn         = aws_iam_role.cross_account_role.arn
  credentials_name = "${local.prefix}-creds"
  depends_on       = [aws_iam_role_policy.this]
}

resource "databricks_mws_workspaces" "this" {
  provider                 = databricks.mws
  account_id               = var.databricks_account_id
  aws_region               = var.region
  workspace_name           = local.prefix
  deployment_name          = local.prefix
  credentials_id           = databricks_mws_credentials.this.credentials_id
  storage_configuration_id = databricks_mws_storage_configurations.this.storage_configuration_id
  network_id               = databricks_mws_networks.this.network_id
}
modules/create-workspace/IAM.tf:
data "databricks_aws_assume_role_policy" "this" {
  external_id = var.databricks_account_id
}

resource "aws_iam_role" "cross_account_role" {
  name               = "${local.prefix}-crossaccount"
  assume_role_policy = data.databricks_aws_assume_role_policy.this.json
}

resource "time_sleep" "wait" {
  depends_on      = [aws_iam_role.cross_account_role]
  create_duration = "10s"
}

data "databricks_aws_crossaccount_policy" "this" {}

resource "aws_iam_role_policy" "this" {
  name   = "${local.prefix}-policy"
  role   = aws_iam_role.cross_account_role.id
  policy = data.databricks_aws_crossaccount_policy.this.json
}

data "aws_iam_policy_document" "pass_role_for_s3_access" {
  statement {
    effect    = "Allow"
    actions   = ["iam:PassRole"]
    resources = [aws_iam_role.cross_account_role.arn]
  }
}

resource "aws_iam_policy" "pass_role_for_s3_access" {
  name   = "databricks-shared-pass-role-for-s3-access"
  path   = "/"
  policy = data.aws_iam_policy_document.pass_role_for_s3_access.json
}

resource "aws_iam_role_policy_attachment" "cross_account" {
  policy_arn = aws_iam_policy.pass_role_for_s3_access.arn
  role       = aws_iam_role.cross_account_role.name
}

resource "aws_iam_instance_profile" "shared" {
  name = "databricks-shared-instance-profile"
  role = aws_iam_role.cross_account_role.name
}

resource "databricks_instance_profile" "shared" {
  instance_profile_arn = aws_iam_instance_profile.shared.arn
  depends_on           = [databricks_mws_workspaces.this]
}
In this case, the problem is that you need two Databricks providers:
- one to provision the Databricks workspace itself - it uses the account ID, username, and password
- one to provision resources inside the Databricks workspace - it uses the host and a token
One of the providers needs to be declared with an alias so that Terraform can tell them apart. The Databricks provider's documentation shows how to do that. The problem, however, is that Terraform tries to apply all changes in parallel wherever possible, because it doesn't know about dependencies between resources until you use depends_on explicitly, so it tries to create Databricks resources before it knows the host value of the Databricks workspace (even if the workspace has already been created).
Unfortunately, you can't put depends_on into a provider block. So the current recommendation to avoid this kind of problem is to split the code into multiple modules:
- a module that creates the Databricks workspace and returns the host and token
- a module that creates Databricks objects, with its provider initialized from the received host/token
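The first module's outputs might look like the sketch below. The output names are made up for illustration; the `token` attribute assumes a `token {}` block is declared on the `databricks_mws_workspaces` resource, as in the provider's aws-workspace guide.

```hcl
# Hypothetical outputs.tf of the workspace-creating module: expose the
# values the second module's provider needs.
output "databricks_host" {
  value = databricks_mws_workspaces.this.workspace_url
}

output "databricks_token" {
  # Assumes a token {} block on the databricks_mws_workspaces resource
  value     = databricks_mws_workspaces.this.token[0].token_value
  sensitive = true
}
```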
In addition, the Terraform docs recommend that provider initialization should not happen inside modules - it's better to declare all providers and their aliases in the top-level template and pass the providers to modules explicitly (see the example below). In that case the modules should only contain declarations of the required providers, not their configurations.
For example, the top-level template could look like this:
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.4.5"
    }
  }
}

provider "databricks" {
  host  = var.databricks_host
  token = var.token
}

provider "databricks" {
  alias      = "mws"
  host       = "https://accounts.cloud.databricks.com"
  username   = var.databricks_account_username
  password   = var.databricks_account_password
  account_id = var.databricks_account_id
}
module "workspace" {
  source = "./workspace"
  providers = {
    # The workspace-creating module needs the account-level (aliased) provider
    databricks = databricks.mws
  }
}

module "databricks" {
  depends_on = [module.workspace]
  source     = "./databricks"
  # No providers block required as we're using the default provider
}
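If you don't want to keep a separately managed host/token variable pair, the default provider can instead read the workspace module's outputs directly. This is a sketch that assumes the workspace module exports `databricks_host` and `databricks_token` outputs (names invented for illustration):

```hcl
# Variant of the default provider above, fed from the workspace module's
# (assumed) outputs instead of standalone variables.
provider "databricks" {
  host  = module.workspace.databricks_host
  token = module.workspace.databricks_token
}
```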
The modules themselves look like this:
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = ">= 0.4.4"
    }
  }
}

resource "databricks_cluster" "this" {
  ...
}