Terraform Azure Databricks Provider Error
I need some help understanding the various ways of logging in to Databricks. I am provisioning Azure Databricks with Terraform.
I would like to know the difference between the two pieces of code below.
When I use Option 1, I get the error shown below.
Option 1:
terraform {
  required_providers {
    azuread     = "~> 1.0"
    azurerm     = "~> 2.0"
    azuredevops = { source = "registry.terraform.io/microsoft/azuredevops", version = "~> 0.0" }
    databricks  = { source = "registry.terraform.io/databrickslabs/databricks", version = "~> 0.0" }
  }
}
provider "random" {}
provider "azuread" {
tenant_id = var.project.arm.tenant.id
client_id = var.project.arm.client.id
client_secret = var.secret.arm.client.secret
}
provider "databricks" {
host = azurerm_databricks_workspace.db-workspace.workspace_url
azure_use_msi = true
}
resource "azurerm_databricks_workspace" "db-workspace" {
name = module.names-db-workspace.environment.databricks_workspace.name_unique
resource_group_name = module.resourcegroup.resource_group.name
location = module.resourcegroup.resource_group.location
sku = "premium"
public_network_access_enabled = true
custom_parameters {
no_public_ip = true
virtual_network_id = module.virtualnetwork["centralus"].virtual_network.self.id
public_subnet_name = module.virtualnetwork["centralus"].virtual_network.subnets["db-sub-1-public"].name
private_subnet_name = module.virtualnetwork["centralus"].virtual_network.subnets["db-sub-2-private"].name
public_subnet_network_security_group_association_id = module.virtualnetwork["centralus"].virtual_network.nsgs.associations.subnets["databricks-public-nsg-db-sub-1-public"].id
private_subnet_network_security_group_association_id = module.virtualnetwork["centralus"].virtual_network.nsgs.associations.subnets["databricks-private-nsg-db-sub-2-private"].id
}
tags = local.tags
}
Databricks cluster creation:
resource "databricks_cluster" "dbcselfservice" {
  cluster_name            = format("adb-cluster-%s-%s", var.project.name, var.project.environment.name)
  spark_version           = var.spark_version
  node_type_id            = var.node_type_id
  autotermination_minutes = 20
  autoscale {
    min_workers = 1
    max_workers = 7
  }
  azure_attributes {
    availability       = "SPOT_AZURE"
    first_on_demand    = 1
    spot_bid_max_price = 100
  }
  depends_on = [
    azurerm_databricks_workspace.db-workspace
  ]
}
Databricks workspace RBAC permissions:
resource "databricks_group" "db-group" {
  display_name               = format("adb-users-%s", var.project.name)
  allow_cluster_create       = true
  allow_instance_pool_create = true
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

resource "databricks_user" "dbuser" {
  count            = length(local.display_name)
  display_name     = local.display_name[count.index]
  user_name        = local.user_name[count.index]
  workspace_access = true
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}
Adding members to the Databricks admins group:
resource "databricks_group_member" "i-am-admin" {
  for_each  = toset(local.email_address)
  group_id  = data.databricks_group.admins.id
  member_id = databricks_user.dbuser[index(local.email_address, each.key)].id
  depends_on = [
    resource.azurerm_databricks_workspace.db-workspace
  ]
}

data "databricks_group" "admins" {
  display_name = "admins"
  depends_on = [
    # resource.databricks_cluster.dbcselfservice,
    resource.azurerm_databricks_workspace.db-workspace
  ]
}
The errors I get when applying the Terraform are below:
Error: User not authorized
with databricks_user.dbuser[1],
on resources.adb.tf line 80, in resource "databricks_user" "dbuser":
80: resource "databricks_user" "dbuser"{
Error: User not authorized
with databricks_user.dbuser[0],
on resources.adb.tf line 80, in resource "databricks_user" "dbuser":
80: resource "databricks_user" "dbuser"{
Error: cannot refresh AAD token: adal:Refresh request failed. Status Code = '500'. Response body: {"error":"server_error", "error_description":"Internal server error"} Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.core.windows.net%2F
with databricks_group.db-group,
on resources.adb.tf line 80, in resource "databricks_group" "db-group":
71: resource "databricks_group" "db-group"{
Is the error caused by the block below?
provider "databricks" {
host = azurerm_databricks_workspace.db-workspace.workspace_url
azure_use_msi = true
}
I just need to be logged in automatically when I click the workspace URL in the portal. So what should I use? And why do we need to declare the databricks provider twice, once under required_providers and again in provider "databricks"?
I have seen that if I don't supply the second provider block I get the error:
"authentication is not configured for provider"
The azure_use_msi option is primarily intended for CI/CD pipelines that run on machines with a managed identity assigned to them. All possible authentication options are described in the documentation, but the simplest way is to authenticate via the Azure CLI, so you just need to leave only the host parameter in the provider block. If you don't have the Azure CLI on that machine, you can use a combination of host + a personal access token.
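For reference, a host + personal access token setup would be a minimal sketch along these lines; var.databricks_pat is a hypothetical variable holding a token you generate yourself in the workspace UI:
provider "databricks" {
  # Sketch only: authenticate with the workspace URL plus a personal access token.
  # var.databricks_pat is an assumed variable containing a PAT from the workspace user settings.
  host  = azurerm_databricks_workspace.db-workspace.workspace_url
  token = var.databricks_pat
}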
If you run the code from a machine that has a managed identity assigned to it, then you need to make sure this identity is added to the workspace, or that it has Contributor access to the workspace - see the Azure Databricks documentation for more details.
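If you go the managed-identity route, the Contributor grant itself can also be expressed in Terraform; a minimal sketch, assuming the identity's object ID is exposed as var.msi_principal_id (a hypothetical variable):
resource "azurerm_role_assignment" "msi_workspace_contributor" {
  # Sketch only: grant the managed identity Contributor access scoped to the workspace.
  # var.msi_principal_id is an assumed variable holding the identity's Azure AD object ID.
  scope                = azurerm_databricks_workspace.db-workspace.id
  role_definition_name = "Contributor"
  principal_id         = var.msi_principal_id
}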
As mentioned in the comments, if you are using Azure CLI authentication, i.e. az login with your username and password, then you can use the following code:
terraform {
  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.3.11"
    }
  }
}

provider "azurerm" {
  features {}
}

provider "databricks" {
  host = azurerm_databricks_workspace.example.workspace_url
}
resource "azurerm_databricks_workspace" "example" {
name = "DBW-ansuman"
resource_group_name = azurerm_resource_group.example.name
location = azurerm_resource_group.example.location
sku = "premium"
managed_resource_group_name = "ansuman-DBW-managed-without-lb"
public_network_access_enabled = true
custom_parameters {
no_public_ip = true
public_subnet_name = azurerm_subnet.public.name
private_subnet_name = azurerm_subnet.private.name
virtual_network_id = azurerm_virtual_network.example.id
public_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.public.id
private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
}
tags = {
Environment = "Production"
Pricing = "Standard"
}
}
data "databricks_node_type" "smallest" {
local_disk = true
depends_on = [
azurerm_databricks_workspace.example
]
}
data "databricks_spark_version" "latest_lts" {
long_term_support = true
depends_on = [
azurerm_databricks_workspace.example
]
}
resource "databricks_cluster" "dbcselfservice" {
cluster_name = "Shared Autoscaling"
spark_version = data.databricks_spark_version.latest_lts.id
node_type_id = data.databricks_node_type.smallest.id
autotermination_minutes = 20
autoscale {
min_workers = 1
max_workers = 7
}
azure_attributes {
availability = "SPOT_AZURE"
first_on_demand = 1
spot_bid_max_price = 100
}
depends_on = [
azurerm_databricks_workspace.example
]
}
resource "databricks_group" "db-group" {
display_name = "adb-users-admin"
allow_cluster_create = true
allow_instance_pool_create = true
depends_on = [
resource.azurerm_databricks_workspace.example
]
}
resource "databricks_user" "dbuser" {
display_name = "Rahul Sharma"
user_name = "example@contoso.com"
workspace_access = true
depends_on = [
resource.azurerm_databricks_workspace.example
]
}
resource "databricks_group_member" "i-am-admin" {
group_id = databricks_group.db-group.id
member_id = databricks_user.dbuser.id
depends_on = [
resource.azurerm_databricks_workspace.example
]
}
Output:
If you are using a Service Principal for authentication, then you can use something like the following:
terraform {
  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.3.11"
    }
  }
}

provider "azurerm" {
  subscription_id = "948d4068-xxxx-xxxx-xxxx-e00a844e059b"
  tenant_id       = "72f988bf-xxxx-xxxx-xxxx-2d7cd011db47"
  client_id       = "f6a2f33d-xxxx-xxxx-xxxx-d713a1bb37c0"
  client_secret   = "inl7Q~Gvdxxxx-xxxx-xxxxyaGPF3uSoL"
  features {}
}

provider "databricks" {
  host                = azurerm_databricks_workspace.example.workspace_url
  azure_client_id     = "f6a2f33d-xxxx-xxxx-xxxx-d713a1bb37c0"
  azure_client_secret = "inl7Q~xxxx-xxxx-xxxxg6ntiyaGPF3uSoL"
  azure_tenant_id     = "72f988bf-xxxx-xxxx-xxxx-2d7cd011db47"
}
resource "azurerm_databricks_workspace" "example" {
name = "DBW-ansuman"
resource_group_name = azurerm_resource_group.example.name
location = azurerm_resource_group.example.location
sku = "premium"
managed_resource_group_name = "ansuman-DBW-managed-without-lb"
public_network_access_enabled = true
custom_parameters {
no_public_ip = true
public_subnet_name = azurerm_subnet.public.name
private_subnet_name = azurerm_subnet.private.name
virtual_network_id = azurerm_virtual_network.example.id
public_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.public.id
private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
}
tags = {
Environment = "Production"
Pricing = "Standard"
}
}
data "databricks_node_type" "smallest" {
local_disk = true
depends_on = [
azurerm_databricks_workspace.example
]
}
data "databricks_spark_version" "latest_lts" {
long_term_support = true
depends_on = [
azurerm_databricks_workspace.example
]
}
resource "databricks_cluster" "dbcselfservice" {
cluster_name = "Shared Autoscaling"
spark_version = data.databricks_spark_version.latest_lts.id
node_type_id = data.databricks_node_type.smallest.id
autotermination_minutes = 20
autoscale {
min_workers = 1
max_workers = 7
}
azure_attributes {
availability = "SPOT_AZURE"
first_on_demand = 1
spot_bid_max_price = 100
}
depends_on = [
azurerm_databricks_workspace.example
]
}
resource "databricks_group" "db-group" {
display_name = "adb-users-admin"
allow_cluster_create = true
allow_instance_pool_create = true
depends_on = [
resource.azurerm_databricks_workspace.example
]
}
resource "databricks_user" "dbuser" {
display_name = "Rahul Sharma"
user_name = "example@contoso.com"
workspace_access = true
depends_on = [
resource.azurerm_databricks_workspace.example
]
}
resource "databricks_group_member" "i-am-admin" {
group_id = databricks_group.db-group.id
member_id = databricks_user.dbuser.id
depends_on = [
resource.azurerm_databricks_workspace.example
]
}
And why do we need to provide the databricks provider twice, once under required_providers and again in provider "databricks"?
required_providers is used to download and initialize the required provider from the specified source, i.e. the Terraform Registry. The provider block, on the other hand, is used to further configure the downloaded provider, such as setting the client_id, the features block, and so on, which are used for authentication or other configuration.
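To make that split concrete, a minimal sketch (the version pin is only illustrative):
# required_providers tells Terraform where to download the plugin from and which versions are acceptable.
terraform {
  required_providers {
    databricks = {
      source  = "databrickslabs/databricks"
      version = "0.3.11"
    }
  }
}

# The provider block configures the plugin that was downloaded, e.g. which workspace to talk to and how to authenticate.
provider "databricks" {
  host = azurerm_databricks_workspace.example.workspace_url
}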