Error using pandas.to_json() on Microsoft Databricks
Does anyone know of another way to save a pandas DataFrame as a JSON file on Microsoft Databricks?
I am trying this:
dataframe.to_json('wasbs://<container>@<storage_account>.blob.core.windows.net/<file_name.json>', orient='records')
but it returns "FileNotFoundError: [Errno 2] No such file or directory:".
I also tried saving locally, but it returns the same error.
If you are using Python, you can do the following. pandas treats the wasbs:// URL as a local path (which is why you get the FileNotFoundError), so instead build the JSON string in memory and upload it with the Azure Blob Storage SDK:
# Requires the legacy azure-storage-blob 2.x SDK
from azure.storage.blob import BlockBlobService
import pandas as pd

# Build a small sample DataFrame
head = ["col1", "col2", "col3"]
rows = [[1, 2, 3], [4, 5, 6], [8, 7, 9]]
df = pd.DataFrame(rows, columns=head)
print(df)

# With no path argument, to_json() returns the JSON as a string
output = df.to_json(orient='records')
print(output)

accountName = "***"
accountKey = "***"
containerName = "test1"
blobName = "test3.json"

# Upload the JSON string straight to Blob Storage
blobService = BlockBlobService(account_name=accountName, account_key=accountKey)
blobService.create_blob_from_text(containerName, blobName, output)
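Note that BlockBlobService comes from the legacy azure-storage-blob 2.x package. If your environment has the current v12 SDK instead, a minimal equivalent sketch (reusing the same variables as above) would be:

from azure.storage.blob import BlobServiceClient  # azure-storage-blob >= 12

# Same accountName, accountKey, containerName, blobName and output as above
service = BlobServiceClient(
    account_url=f"https://{accountName}.blob.core.windows.net",
    credential=accountKey)
blob = service.get_blob_client(container=containerName, blob=blobName)
blob.upload_blob(output, overwrite=True)  # uploads the JSON string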
Alternatively, you can mount the Blob Storage container and work with it from there.
To mount in Python:
dbutils.fs.mount(
source = "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net",
mount_point = "/mnt/<mount-name>",
extra_configs = {"<conf-key>":dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>")})
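For account-key authentication, <conf-key> is fs.azure.account.key.<your-storage-account-name>.blob.core.windows.net (for a SAS token it is fs.azure.sas.<container-name>.<storage-account-name>.blob.core.windows.net). A filled-in sketch, where mycontainer, mystorageacct, my-scope and storage-key are all placeholder names:

# All names below are placeholders; substitute your own
dbutils.fs.mount(
  source = "wasbs://mycontainer@mystorageacct.blob.core.windows.net",
  mount_point = "/mnt/mydata",
  extra_configs = {"fs.azure.account.key.mystorageacct.blob.core.windows.net":
                   dbutils.secrets.get(scope = "my-scope", key = "storage-key")})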
In Scala:
dbutils.fs.mount(
source = "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>",
mountPoint = "/mnt/<mount-name>",
extraConfigs = Map("<conf-key>" -> dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>")))
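The mount only needs to be created once per workspace; if you need to remount later (for example after rotating the key), unmount first:

dbutils.fs.unmount("/mnt/<mount-name>")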
Once mounted, you can access the folder through the local /dbfs path, for example:
data = pd.read_csv('/dbfs/mnt/mnt-name/opportunity.csv')
and similarly for writing:
df.to_json('/dbfs/mnt/mnt-name/opportunity.json',orient='records')
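If you would rather not create a mount at all, another workaround (a sketch; the /tmp path and the assumption that the cluster already has credentials for the storage account are mine) is to let pandas write to the driver's local disk and then copy the file out with dbutils.fs.cp:

# pandas can always write to the driver's local filesystem
dataframe.to_json('/tmp/file_name.json', orient='records')

# Copy from the driver (note the file:/ prefix) to Blob Storage; this needs the
# account key/SAS configured on the cluster, or copy into the mount instead
dbutils.fs.cp('file:/tmp/file_name.json',
              'wasbs://<container>@<storage_account>.blob.core.windows.net/file_name.json')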
Hope this helps.