How to import a text file in Databricks



# write a file to DBFS using Python I/O APIs
with open("/dbfs/FileStore/tables/test_dbfs.txt", "w") as f:
  f.write("Apache Spark is awesome!\n")
  f.write("End of example!")

# read the file back from the same path
with open("/dbfs/FileStore/tables/test_dbfs.txt", "r") as f_read:
  for line in f_read:
    print(line)

Error: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/test_dbfs.txt'

The /dbfs mount doesn't work on Community Edition with DBR >= 7.x. This is a known limitation.

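To work around this limitation, you need to work with files on the driver node and upload or download them using the dbutils.fs.cp command (docs). So your write will look like this: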

# write a file to the driver's local filesystem using Python I/O APIs
# (open() takes a plain local path, without the file: scheme)
with open("/tmp/local-path", "w") as f:
  f.write("Apache Spark is awesome!\n")
  f.write("End of example!")
# upload the file to DBFS; here the file: scheme marks the source as local
dbutils.fs.cp('file:/tmp/local-path', 'dbfs:/FileStore/tables/test_dbfs.txt')
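
If you do this often, the write-then-upload step can be wrapped in a small helper. This is only a sketch: the function name write_text_to_dbfs and its copy parameter are made up for illustration, and it assumes you pass dbutils.fs.cp (or anything with the same (source, target) call shape) as copy.

```python
import os
import tempfile


def write_text_to_dbfs(text, dbfs_uri, copy):
    """Write text to a driver-local temp file, then copy it to dbfs_uri.

    `copy` is assumed to be a callable taking (source_uri, target_uri);
    in a Databricks notebook that would be dbutils.fs.cp.
    """
    # Create the temp file on the driver's local disk, not under /dbfs.
    fd, local_path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
        # The file: scheme tells the copy function that the source is local.
        copy("file:" + local_path, dbfs_uri)
    finally:
        # Remove the temporary local copy whether or not the upload worked.
        os.remove(local_path)
```

In a notebook the call would look like write_text_to_dbfs("Apache Spark is awesome!\n", "dbfs:/FileStore/tables/test_dbfs.txt", dbutils.fs.cp). Passing the copy function in, rather than referring to dbutils directly, also lets you try the helper outside Databricks with a stand-in.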

Reading from DBFS will look like this:

# copy the file from DBFS to the driver's local filesystem
dbutils.fs.cp('dbfs:/FileStore/tables/test_dbfs.txt', 'file:/tmp/local-path')
# read the file locally
with open("/tmp/local-path", "r") as f_read:
  for line in f_read:
    print(line)
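
The download-then-read direction can be wrapped the same way. Again, a rough sketch rather than an official API: read_lines_from_dbfs and its copy parameter are hypothetical names, and copy is assumed to be dbutils.fs.cp or a callable with the same (source, target) arguments.

```python
import os
import tempfile


def read_lines_from_dbfs(dbfs_uri, copy):
    """Copy dbfs_uri to a driver-local temp file and return its lines.

    `copy` is assumed to be a callable taking (source_uri, target_uri);
    in a Databricks notebook that would be dbutils.fs.cp.
    """
    # Use a fresh local directory so the download cannot clash with other files.
    local_dir = tempfile.mkdtemp()
    local_path = os.path.join(local_dir, "downloaded.txt")
    try:
        # The file: scheme marks the target as the driver's local disk.
        copy(dbfs_uri, "file:" + local_path)
        with open(local_path, "r") as f:
            # Strip the trailing newline from each line.
            return [line.rstrip("\n") for line in f]
    finally:
        # Clean up the local copy and its directory.
        if os.path.exists(local_path):
            os.remove(local_path)
        os.rmdir(local_dir)
```

In a notebook you would call it as read_lines_from_dbfs("dbfs:/FileStore/tables/test_dbfs.txt", dbutils.fs.cp) and loop over the returned list.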