Finding the max of numbers in multiple DataFrames (Python)
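I have 1000+ .txt files containing stock dates and prices, which I have loaded into a dictionary (the file name, i.e. the stock ticker, is the key, and each file's data is a DataFrame). I use .rolling to calculate a moving average, then find the percentage difference between the moving average and the price. The percentage difference is therefore its own column in each DataFrame. The code for all of this is below: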
import os
import pandas as pd

filepath = r'Insert File Path'
filelist = os.listdir(filepath)
dic1 = {}
for file in filelist:
    # os.path.join avoids a missing separator between the folder and the file name
    df = pd.read_csv(os.path.join(filepath, file), sep='\t')
    dic1[file] = df
for value in dic1.values():
    value.rename(columns={value.columns[0]: 'Dates', value.columns[1]: 'Prices'}, inplace=True)
for value in dic1.values():
    value['ma'] = value['Prices'].rolling(window=50).mean()
for value in dic1.values():
    value['diff'] = value['Prices'] - value['ma']
for value in dic1.values():
    value['pctdiff'] = value['diff'] / value['Prices']
My question is: how do I find the 5 largest values of the pctdiff column (and the 5 smallest, since they can be negative)?
I tried:
for df in dic1.values():
    for num in df['pctdiff'].max():
        print(num.max())
But I get the following error: "'float' object is not iterable"
Is this what you mean?
list_result = []
for key, value in dic1.items():
    value.rename(columns={value.columns[0]: 'Dates', value.columns[1]: 'Prices'}, inplace=True)
    value['ma'] = value['Prices'].rolling(window=50).mean()
    value['diff'] = value['Prices'] - value['ma']
    value['pctdiff'] = value['diff'] / value['Prices']
    # Record each stock's largest pctdiff alongside its file name
    list_result.append([key, value['pctdiff'].max()])
# Sort ascending by the per-stock maximum
list_result.sort(key=lambda x: x[1])
highest_list = list_result[-5:]  # the five stocks with the largest maxima
smallest_list = list_result[:5]  # the five stocks with the smallest maxima
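The "'float' object is not iterable" error comes from the fact that Series.max() returns a single float, so there is nothing to loop over. If what is wanted is the five largest and five smallest pctdiff values across all stocks, rather than a ranking of each stock's maximum, one possible sketch is to stack the pctdiff columns into a single Series and use nlargest and nsmallest. The tiny dic1 below and its file names are made up for illustration; the real one would come from the loading code in the question.

```python
import pandas as pd

# Hypothetical stand-in for dic1: two tiny "stocks" whose pctdiff is already computed.
# The NaN mimics the leading rows that a rolling window leaves empty.
dic1 = {
    "AAA.txt": pd.DataFrame({"pctdiff": [0.10, -0.30, 0.05, float("nan")]}),
    "BBB.txt": pd.DataFrame({"pctdiff": [0.25, -0.05, 0.40, -0.50]}),
}

# Stack every pctdiff column into one Series indexed by (file name, row label),
# dropping the NaN rows explicitly.
all_pct = pd.concat({name: df["pctdiff"] for name, df in dic1.items()}).dropna()

top5 = all_pct.nlargest(5)      # five largest values overall
bottom5 = all_pct.nsmallest(5)  # five smallest (most negative) values overall

print(top5)
print(bottom5)
```

Because the index keeps the file name, each value can be traced back to its stock and row.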
Just to make the code a bit cleaner, compute all the columns in a single for loop instead of four:
import os
import pandas as pd

filepath = r'Insert File Path'
filelist = os.listdir(filepath)
dic1 = {}
for file in filelist:
    # os.path.join avoids a missing separator between the folder and the file name
    df = pd.read_csv(os.path.join(filepath, file), sep='\t')
    dic1[file] = df
for value in dic1.values():
    # One pass per DataFrame: rename, then derive each column from the previous one
    value.rename(columns={value.columns[0]: 'Dates', value.columns[1]: 'Prices'}, inplace=True)
    value['ma'] = value['Prices'].rolling(window=50).mean()
    value['diff'] = value['Prices'] - value['ma']
    value['pctdiff'] = value['diff'] / value['Prices']
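To see what these columns contain, here is a small sketch on made-up prices with a window of 3 instead of 50 (the data and the window size are assumptions chosen only to keep the example short). It shows that the first window - 1 rows of ma, diff and pctdiff are NaN, which matters when looking for maxima and minima later.

```python
import pandas as pd

# Made-up prices; a window of 3 stands in for the 50-row window used above.
df = pd.DataFrame({
    "Dates": pd.date_range("2020-01-01", periods=5),
    "Prices": [10.0, 11.0, 12.0, 13.0, 14.0],
})

df["ma"] = df["Prices"].rolling(window=3).mean()  # NaN until 3 prices are available
df["diff"] = df["Prices"] - df["ma"]
df["pctdiff"] = df["diff"] / df["Prices"]

print(df)
```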
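Then use @Edchum's answer to sort pctdiff by absolute value (convert it to a pandas Series first if it is some other kind of object). Something like this, if you want to store it sorted: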
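...
for key, value in dic1.items():
    ...
    value['pctdiff'] = value['diff'] / value['Prices']
    # Assigning a reordered Series back to a column realigns it on the index,
    # so reorder the whole DataFrame by |pctdiff| and store it back instead.
    dic1[key] = value.reindex(value['pctdiff'].abs().sort_values().index)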
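A small sketch of sorting by absolute value on made-up data follows. The rows are reordered as a whole, because pandas aligns a Series on its index when it is assigned to a column, so a reordered Series written back into the original frame would end up in the original row order. The column values here are invented for illustration, and descending order is an assumption made so that the largest moves come first.

```python
import pandas as pd

# Made-up frame with the same column names as in the question.
df = pd.DataFrame({
    "Prices": [10.0, 12.0, 9.0, 11.0],
    "ma": [10.5, 10.0, 10.0, 10.0],
})
df["diff"] = df["Prices"] - df["ma"]
df["pctdiff"] = df["diff"] / df["Prices"]

# Row labels ordered from the largest to the smallest absolute pctdiff.
order = df["pctdiff"].abs().sort_values(ascending=False).index

# Reorder the whole DataFrame so every column moves together.
df_sorted = df.reindex(order)

print(df_sorted.head(2))  # the two rows with the largest moves in either direction
```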