How to load many pickled files into a dataframe?
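I have many files that were created using "pickle".
I want to load them into a dataframe, compute the mean of each one (from the second row to the end), multiply it by 1000, and round it to two decimal places.
So far, I have achieved this with 1 pickle file.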
import pandas as pd
df = pd.read_pickle(r'C:\Users\file_inference_time')
df = pd.DataFrame(df)
df.rename(columns={0:'MobileNet'},inplace=True)
df_mean=(df.iloc[2::,:].mean()* 1000).round(decimals=2)
df_mean2=pd.DataFrame(df_mean)
df_mean2
This gives me the result for 1 file.
These are the ("pickle") files I need to read.
Edit
This is the error I get when running the second option:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-2-b72e45d8bcfc> in <module>
16
17
---> 18 df_mean_all = pd.concat(df_mean_list).reset_index(drop=True)
19
20 print(df_mean_all)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
253 verify_integrity=verify_integrity,
254 copy=copy,
--> 255 sort=sort,
256 )
257
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\reshape\concat.py in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy, sort)
302
303 if len(objs) == 0:
--> 304 raise ValueError("No objects to concatenate")
305
306 if keys is None:
ValueError: No objects to concatenate
Here is a plot of the results.
Get a dict of dataframes
- Saves the computed mean for each file to a dict
from pathlib import Path
import pandas as pd

dir_path = Path(r'C:\Users\path_to_files')
files = dir_path.glob('**/file_inference_time*')  # get all matching files in the main dir and subdirectories

df_mean_dict = dict()
for i, file in enumerate(files):
    df = pd.DataFrame(pd.read_pickle(file))
    df.rename(columns={0: 'MobileNet'}, inplace=True)
    df_mean_dict[i] = pd.DataFrame((df.iloc[2:, :].mean() * 1000).round(decimals=2))
    # if all the file names are unique, the dict key can be the file name (w/o the file extension)
    # df_mean_dict[file.stem] = pd.DataFrame((df.iloc[2:, :].mean() * 1000).round(decimals=2))
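As a rough sketch of what the dict looks like and how it could be combined afterwards, the example below replaces the pickle files with in-memory lists. The names `raw`, `file1`, and `file2` and all the timing values are made up for illustration; real data would come from `pd.read_pickle(file)`.

```python
import pandas as pd

# Hypothetical stand-ins for the unpickled objects (made-up timings in seconds)
raw = {
    "file1": [0.5, 0.4, 0.003, 0.004, 0.005],
    "file2": [0.6, 0.5, 0.002, 0.002, 0.002],
}

df_mean_dict = {}
for name, values in raw.items():
    df = pd.DataFrame(values).rename(columns={0: "MobileNet"})
    # Same calculation as above: skip the first two rows, average, scale, round
    df_mean_dict[name] = pd.DataFrame((df.iloc[2:, :].mean() * 1000).round(decimals=2))

# Each value is a one-row dataframe indexed by 'MobileNet'
print(df_mean_dict["file1"])

# pd.concat accepts a dict; the keys become the outer level of the index
combined = pd.concat(df_mean_dict)
print(combined)
```

Keying the dict by name rather than by position makes it easier to tell afterwards which mean belongs to which file.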
Get a single dataframe - this is what I would do
- The result, df_mean_all, will be a 2-column dataframe.
- Column 0 will be MobileNet
- Column 1 will be file
dir_path = Path(r'C:\Users\path_to_files')
files = dir_path.glob('**/file_inference_time*')  # get all matching files in the main dir and subdirectories

# to check if the files are found
# if an empty list prints, no files are found
files = list(files)
print(files[:5])

df_mean_list = list()
for file in files:
    df = pd.DataFrame(pd.read_pickle(file))
    df_mean = pd.DataFrame((df.iloc[2:, :].mean() * 1000).round(decimals=2)).reset_index(drop=True).rename(columns={0: 'MobileNet'})
    df_mean['file'] = file  # or file.stem for just the file name
    df_mean_list.append(df_mean)

# df_mean_list is a list of dataframes, pd.concat combines them all into one dataframe
df_mean_all = pd.concat(df_mean_list).reset_index(drop=True)
print(df_mean_all)
   MobileNet                                    file
0       3.24  C:\Users\file_inference_time\file1.pkl
1       2.34  C:\Users\file_inference_time\file2.pkl
2       4.23  C:\Users\file_inference_time\file3.pkl
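The "No objects to concatenate" error is what pd.concat raises when it is given an empty list, which most likely means the glob pattern matched no files. Below is a minimal sketch that checks for this before concatenating. The helper name `mean_per_file`, the temporary directory, and the timing values are assumptions made for the demo, not part of the original code.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd


def mean_per_file(dir_path, pattern="**/file_inference_time*"):
    """Return one row per matching pickle file: rounded mean (x1000) and file name."""
    files = sorted(Path(dir_path).glob(pattern))
    if not files:
        # pd.concat([]) would raise "ValueError: No objects to concatenate"
        raise FileNotFoundError(f"no files matching {pattern!r} under {dir_path}")
    rows = []
    for file in files:
        df = pd.DataFrame(pd.read_pickle(file))
        df_mean = (
            pd.DataFrame((df.iloc[2:, :].mean() * 1000).round(decimals=2))
            .reset_index(drop=True)
            .rename(columns={0: "MobileNet"})
        )
        df_mean["file"] = file.name
        rows.append(df_mean)
    return pd.concat(rows).reset_index(drop=True)


# Demo with made-up timings written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "file_inference_time_a": [0.9, 0.8, 0.003, 0.005],
        "file_inference_time_b": [0.9, 0.8, 0.002, 0.002],
    }
    for name, values in samples.items():
        with open(Path(tmp) / name, "wb") as fh:
            pickle.dump(values, fh)

    df_mean_all = mean_per_file(tmp)
    print(df_mean_all)

    # A pattern that matches nothing now fails with a clearer message
    try:
        mean_per_file(tmp, pattern="**/no_such_prefix*")
    except FileNotFoundError as exc:
        print(exc)
```

If the real call raises the FileNotFoundError, the first things to check are whether `dir_path` points at the folder that contains the files and whether the file names really start with `file_inference_time`.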