Python function to read a file from a subdirectory
I am trying to write this function so that I can pass it a file or a folder and read from it using pandas.
import pandas as pd
import os

path = os.getcwd()
path = '..'  # this would be root
revenue_folder = '../Data/Revenue'
random_file = '2017-08-01_Aug.csv'

def csv_reader(csv_file):
    for root, dirs, files in os.walk(path):
        for f in files:
            with open(os.path.join(root, csv_file)) as f1:
                pd.read_csv(f1, sep = ';')
                print(f1)

csv_reader(random_file)
FileNotFoundError: [Errno 2] No such file or directory: '../2017-08-01_Aug.csv'
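After that I tried making some changes, but now the problem is that it goes into a different subdirectory. What I want is to walk through all my files and folders, find the required file, and then read it. To be clear, the file I want is in revenue_folder.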
def csv_reader(csv_file):
    for root, dirs, files in os.walk(path):
        for f in files:
            base, ext = os.path.splitext(f)
            if ('csv' in ext):
                print (root)
                with open(os.path.join(root, csv_file)) as f1:
                    pd.read_excel(f1, sep = ':')
                    print(f1)

csv_reader(random_file)
FileNotFoundError: [Errno 2] No such file or directory: './Data/Backlog/2017-08-01_Aug.csv'
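After the edit, the whole scenario of the question changed. The code below searches recursively through files and folders to find the files that match the criteria.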
import os
import pandas as pd

def get_all_matching_files(root_path, matching_criteria):
    """
    Gets all files whose names end with the given criteria.
    :param root_path: the root directory path from where searching needs to begin
    :param matching_criteria: a string or a tuple of strings to match against the end of the file name
    :return: a list of all matching file paths
    """
    return [os.path.join(root, name)
            for root, dirs, files in os.walk(root_path)
            for name in files
            if name.endswith(matching_criteria)]

def main(root_path):
    """
    The main method to start finding the file.
    :param root_path: The root dir where the search needs to be started.
    :return: None
    """
    if len(root_path) < 2:
        raise ValueError('The root path must be at least 2 characters')
    all_matching_files = get_all_matching_files(root_path, '2017-08-01_Aug.csv')
    if not all_matching_files:
        print('no files were found matching that criteria.')
        return
    for matched_file in all_matching_files:
        # pass sep=';' here if the file is semicolon-delimited, as in the question
        data_frame = pd.read_csv(matched_file)
        # your code here on what to do with the dataframe
    print('Completed search!')

if __name__ == '__main__':
    root_dir_path = os.getcwd()
    main(root_dir_path)
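Note that I used endswith() to match the files, which gives you the flexibility to pass a file extension (.csv) and get every file with that extension. endswith() also accepts a tuple, so you can create a tuple containing all the file names or extensions you want and the method will still work.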
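To make the tuple behaviour concrete, here is a minimal sketch. The temporary directory tree and the file names notes.txt and old.xlsx are made up for illustration so that the snippet runs anywhere; only get_all_matching_files comes from the answer.

```python
import os
import tempfile

def get_all_matching_files(root_path, matching_criteria):
    # str.endswith accepts either a single string or a tuple of strings
    return [os.path.join(root, name)
            for root, dirs, files in os.walk(root_path)
            for name in files
            if name.endswith(matching_criteria)]

# Build a small throwaway tree (hypothetical layout, not the asker's real data)
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, 'Data', 'Revenue'))
os.makedirs(os.path.join(demo_root, 'Data', 'Backlog'))
for rel in ('Data/Revenue/2017-08-01_Aug.csv',
            'Data/Backlog/notes.txt',
            'Data/Backlog/old.xlsx'):
    open(os.path.join(demo_root, *rel.split('/')), 'w').close()

# A single extension
csv_only = get_all_matching_files(demo_root, '.csv')
# Several extensions at once, passed as a tuple
csv_or_excel = get_all_matching_files(demo_root, ('.csv', '.xlsx'))

print(csv_only)
print(sorted(csv_or_excel))
```

Passing a full file name such as '2017-08-01_Aug.csv' works the same way, because a name always ends with itself. Keep in mind that it would also match a longer name that merely ends in that string.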
Other suggestions:
When reading a file with pandas, you do not need this code:
with open(os.path.join(root, csv_file)) as f1:
    pd.read_csv(f1, sep = ';')
    print(f1)
Instead, you need to do:
# set the file path into a variable to make code readable
filepath = os.path.join(revenue_folder, random_file)
# read the data and store it into a variable of type DataFrame
my_dataframe_from_file = pd.read_csv(filepath, sep=';')
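As far as I can tell, both FileNotFoundError messages in the question come from joining csv_file onto every directory that os.walk yields, whether or not that directory actually contains the file. If you do want to search rather than hard-code revenue_folder, a minimal sketch could look like the following. The helper name find_file and the temporary directory tree are my own illustration, not part of the original code.

```python
import os
import tempfile

def find_file(root_path, file_name):
    """Return the full path of the first file called file_name under root_path, or None."""
    for root, dirs, files in os.walk(root_path):
        # only build the path when this directory really contains the file
        if file_name in files:
            return os.path.join(root, file_name)
    return None

# Throwaway tree that mimics the Data/Revenue and Data/Backlog layout (hypothetical)
demo_root = tempfile.mkdtemp()
revenue_dir = os.path.join(demo_root, 'Data', 'Revenue')
os.makedirs(revenue_dir)
os.makedirs(os.path.join(demo_root, 'Data', 'Backlog'))
with open(os.path.join(revenue_dir, '2017-08-01_Aug.csv'), 'w') as fh:
    fh.write('a;b\n1;2\n')

found = find_file(demo_root, '2017-08-01_Aug.csv')
missing = find_file(demo_root, 'does_not_exist.csv')

print(found)    # path inside Data/Revenue
print(missing)  # None, because no such file exists
```

The returned path can then be handed straight to pd.read_csv(found, sep=';'), after checking that it is not None.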