Error Mounting Azure Data Lake in Apache Spark using Databricks
I am trying to mount our Azure Data Lake in Apache Spark using the following Python code:
def check(mntPoint):
    a = []
    for test in dbutils.fs.mounts():
        a.append(test.mountPoint)
    result = a.count(mntPoint)
    return result

mount = "/mnt/lake"

if check(mount) == 1:
    resultMsg = "<div>%s is already mounted. </div>" % mount
else:
    dbutils.fs.mount(
        source = "wasbs://root@adlsprexxxxxdlsdev.blob.core.windows.net",
        mount_point = mount,
        extra_configs = {"fs.azure.account.key.adlspretxxxxdlsdev.blob.core.windows.net": ""})
    resultMsg = "<div>%s was mounted. </div>" % mount

displayHTML(resultMsg)
But I keep getting the following error:
shaded.databricks.org.apache.hadoop.fs.azure.AzureException: java.lang.IllegalArgumentException: Storage Key is not a valid base64 encoded string.
The full error is:
ExecutionError Traceback (most recent call last)
<command-3313750897057283> in <module>
4 resultMsg = "<div>%s is already mounted. </div>" % mount
5 else:
----> 6 dbutils.fs.mount(
7 source = "wasbs://root@adlsprexxxxxxxkadlsdev.blob.core.windows.net",
8 mount_point = mount,
/local_disk0/tmp/1619799109257-0/dbutils.py in f_with_exception_handling(*args, **kwargs)
322 exc.__context__ = None
323 exc.__cause__ = None
--> 324 raise exc
325 return f_with_exception_handling
326
Can someone tell me how to resolve this issue?
You need to provide the storage key; right now you are passing an empty string. Typically, people put the storage key into Azure Key Vault (and mount it as a secret scope) or use a Databricks-backed secret scope, and then access the storage key via dbutils.secrets.get (as shown in the documentation):
dbutils.fs.mount(
    source = "wasbs://root@adlsprexxxxxdlsdev.blob.core.windows.net",
    mount_point = mount,
    extra_configs = {"fs.azure.account.key.adlspretxxxxdlsdev.blob.core.windows.net":
                         dbutils.secrets.get(scope_name, secret_name)})
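As a side note, the `check` helper in the question can be expressed more directly with `any()`. Below is a minimal sketch of that logic; since `dbutils` only exists inside a Databricks runtime, a hypothetical `MountInfo` namedtuple stands in for the objects returned by `dbutils.fs.mounts()`:

```python
from collections import namedtuple

# Hypothetical stand-in for the mount entries dbutils.fs.mounts() returns;
# on Databricks each entry exposes a .mountPoint attribute like this.
MountInfo = namedtuple("MountInfo", ["mountPoint", "source"])

def is_mounted(mounts, mnt_point):
    """Return True if mnt_point already appears among the given mounts."""
    return any(m.mountPoint == mnt_point for m in mounts)

# Example: inside Databricks you would pass dbutils.fs.mounts() instead.
mounts = [
    MountInfo("/mnt/lake", "wasbs://root@example.blob.core.windows.net"),
    MountInfo("/databricks-datasets", "dbfs:/databricks-datasets"),
]
```

On Databricks this becomes `if is_mounted(dbutils.fs.mounts(), mount): ...`, which avoids building an intermediate list and counting occurrences.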