Databricks Connect: Automatically Accept License Prompt

I am trying to write a Dockerfile that builds a container which uses Databricks Connect. To do that, I need to install and configure Databricks Connect from a Docker RUN instruction. I have the following:

FROM python:3.8
COPY requirements.txt /tmp/
RUN apt-get update\
    && apt-get install software-properties-common -y\
    && apt-get update\
    && apt-add-repository "deb http://security.debian.org/debian-security stretch/updates main"\
    && apt-get update\
    && apt-get install openjdk-8-jdk -y
RUN pip install --requirement /tmp/requirements.txt\
    && databricks-connect configure\
    && databricks-connect test

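This is a simplified example that reproduces my problem. The databricks-connect configure step prompts me to accept a license agreement, and the default answer is N. The build cannot answer the prompt, so it fails with the following error: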

...
#14 1.345 Do you accept the above agreement? [y/N] Traceback (most recent call last):
#14 1.346   File "/usr/local/bin/databricks-connect", line 8, in <module>
#14 1.346     sys.exit(main())
#14 1.346   File "/usr/local/lib/python3.8/site-packages/pyspark/databricks_connect.py", line 281, in main
#14 1.346     configure()
#14 1.346   File "/usr/local/lib/python3.8/site-packages/pyspark/databricks_connect.py", line 119, in configure
#14 1.346     accept = input().strip()
#14 1.346 EOFError: EOF when reading a line
------
executor failed running [/bin/sh -c databricks-connect configure]: exit code: 1

How can I accept this prompt automatically as part of the Docker build?

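You need to use something like this (borrowed from this demo), because besides accepting the license terms you also have to supply the other parameters. Each $(name) below is a placeholder for your own value; in a plain shell, $(name) would be run as a command, so substitute real values or shell variables: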

echo "y
$(databricks_host)
$(databricks_token)
$(cluster_id)
$(org_id)
15001" | databricks-connect configure
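
For a plain shell or a Docker RUN step, the same idea can be sketched with printf, which emits one answer per line in the order the prompts appear above. The variable names DBC_HOST, DBC_TOKEN, DBC_CLUSTER_ID, DBC_ORG_ID and DBC_PORT are hypothetical names chosen for this sketch, not something databricks-connect defines; supply them however suits your build.

```shell
#!/bin/sh
# Sketch: build the answers that databricks-connect configure reads from stdin.
# All DBC_* variable names are assumptions made for this example.

configure_answers() {
  # Order follows the prompts: license, host, token, cluster ID, org ID, port.
  printf '%s\n' \
    "y" \
    "$DBC_HOST" \
    "$DBC_TOKEN" \
    "$DBC_CLUSTER_ID" \
    "$DBC_ORG_ID" \
    "${DBC_PORT:-15001}"
}

# Usage, once the DBC_* variables are set:
#   configure_answers | databricks-connect configure
```

Keep in mind that values passed into a Docker build as build arguments can be recovered from the image history, so the token may be better supplied when the container starts rather than at build time.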

Alternatively, you can generate the ~/.databricks-connect file directly; it is plain JSON:

{
  "host": "https://host",
  "token": "token",
  "cluster_id": "cluster",
  "org_id": "org_id",
  "port": "15001"
}
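
A minimal sketch of generating that file from a script, assuming it holds the same five values that the configure prompts ask for (host, token, cluster ID, org ID, port). The function name and the DBC_* variables are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Sketch: write the Databricks Connect config file without the interactive prompt.
# write_databricks_connect_config and the DBC_* variables are assumptions
# made for this example.

write_databricks_connect_config() {
  # Optional first argument overrides the target path.
  target="${1:-$HOME/.databricks-connect}"
  cat > "$target" <<EOF
{
  "host": "${DBC_HOST}",
  "token": "${DBC_TOKEN}",
  "cluster_id": "${DBC_CLUSTER_ID}",
  "org_id": "${DBC_ORG_ID}",
  "port": "${DBC_PORT:-15001}"
}
EOF
  # The file contains a token, so restrict it to the owner.
  chmod 600 "$target"
}

# Usage, once the DBC_* variables are set:
#   write_databricks_connect_config
```

Writing the file at container start (for example from an entrypoint script) rather than in a RUN step keeps the token out of the image layers. Note that this sketch does not escape quotes or backslashes, so it only suits values that are already JSON-safe.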