Databricks command not found in Azure DevOps pipeline

I am trying to copy a file to Azure Databricks DBFS through an Azure DevOps pipeline. Below is a snippet of the yml file I am using:

stages:
- stage: MYBuild
  displayName: "My Build"
  jobs:
    - job: BuildwhlAndRunPytest
      pool:
        vmImage: 'ubuntu-16.04'

      steps:
      - task: UsePythonVersion@0
        displayName: 'Use Python 3.7'
        inputs:
          versionSpec: '3.7'
          addToPath: true
          architecture: 'x64'

      - script: |
          pip install pytest requests setuptools wheel pytest-cov
          pip install -U databricks-connect==7.3.*
        displayName: 'Load Python Dependencies'

      - checkout: self
        persistCredentials: true
        clean: true

      - script: |
          echo "y
          $(databricks-host)
          $(databricks-token)
          $(databricks-cluster)
          $(databricks-org-id)
          8787" | databricks-connect configure
          databricks-connect test
        env:
          databricks-token: $(databricks-token)
        displayName: 'Configure DBConnect'

      - script: |
          databricks fs cp test-proj/pyspark-lib/configs/config.ini dbfs:/configs/test-proj/config.ini

I get the following error at the step that calls the databricks fs cp command:

/home/vsts/work/_temp/2278f7d5-1d96-4c4e-a501-77c07419773b.sh: line 7: databricks: command not found

However, databricks-connect test runs successfully. If I have missed a step somewhere, please help.

The databricks command is part of the databricks-cli package, not databricks-connect, so you need to change your pip install command to install databricks-cli.
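To confirm which of the two executables is actually available on the build agent, a script step can check PATH before using them. This is a minimal sketch, not part of either package: the helper name have_cmd is hypothetical, and it relies only on the standard command -v shell builtin.

```shell
# have_cmd is a hypothetical helper: succeeds if the named executable is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report which of the two entry points this environment provides.
for cmd in databricks databricks-connect; do
  if have_cmd "$cmd"; then
    echo "$cmd: found at $(command -v "$cmd")"
  else
    echo "$cmd: not found on PATH"
  fi
done
```

With only databricks-connect installed, as in the question, the loop would be expected to report databricks as not found, which matches the error above.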

Also, for the databricks command you only need to set the environment variables DATABRICKS_HOST and DATABRICKS_TOKEN, like this:

- script: |
    pip install pytest requests setuptools wheel
    pip install -U databricks-cli
  displayName: 'Load Python Dependencies'

- script: |
    databricks fs cp ... dbfs:/...
  env:
    DATABRICKS_HOST: $(DATABRICKS_HOST)
    DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)
  displayName: 'Copy artifacts'
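Because the copy step depends entirely on those two environment variables, it can help to fail early with a clear message when one of them did not reach the script. The following is a hedged sketch of such a guard: require_env is a hypothetical helper name, and it uses printenv so that it sees exactly what a child process such as databricks would see.

```shell
# require_env is a hypothetical helper: succeeds only if every named
# variable is exported to the environment and non-empty.
require_env() {
  missing=0
  for name in "$@"; do
    # printenv exits non-zero for unset variables; treat that as empty.
    value=$(printenv "$name" || true)
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}

# Only proceed to the copy when both variables are present.
if require_env DATABRICKS_HOST DATABRICKS_TOKEN; then
  echo "Databricks CLI environment looks complete"
fi
```

In the pipeline, this check would go in the same script step as the databricks fs cp call, just before it. Secret pipeline variables are not exposed to scripts automatically, which is why the env: mapping in the step above matters.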

P.S. Here is an example of how to do CI/CD on Databricks + notebooks. You may also be interested in the cicd-templates project.