How to resolve IndexError and how to save 3 values computed in a for loop to an array/.csv?
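I imported a file using pandas. The data looks like this:
I wrote code that takes the 'open' value from the first day of each year as start_open and from the last day of each year as end_open, for 27 years. My code is as follows: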
import pandas as pd
df = pd.read_csv(r'C:\Users\Shivank Chadda\Desktop\Data Analysis\BATS_SPY, 1D.csv')
df['time'] = pd.to_datetime(df['time'],unit='s').dt.normalize()
df['year'] = pd.DatetimeIndex(df['time']).year
sub_df=df[['year','open']]
n=1993
for i in sub_df['year']:
    sub_93 = sub_df[(sub_df['year']==n) & (sub_df['year']<2022)]
    start_open=sub_93.iloc[0]['open']
    end_open=sub_93.iloc[-1]['open']
    per= ((end_open-start_open)/start_open)*100
    print('The value at the start of the year',n,'is:',start_open,'\nThe value at the end of year',n,' is:',end_open)
    n+=1
    i+=1
The code prints the following:
The value at the start of the year 1993 is: 43.9688
The value at the end of year 1993 is: 46.9375
The value at the start of the year 1994 is: 46.59375
The value at the end of year 1994 is: 46.20312
The value at the start of the year 1995 is: 45.70312
The value at the end of year 1995 is: 61.46875
The value at the start of the year 1996 is: 61.40625
The value at the end of year 1996 is: 75.28125
The value at the start of the year 1997 is: 74.375
The value at the end of year 1997 is: 96.875
(continues until 2021)
Then the following error occurs:
File "C:\Users\Shivank Chadda\Desktop\Data Analysis\untitled7.py", line 16, in <module>
start_open=sub_93.iloc[0]['open']
File "C:\Users\Shivank Chadda\anaconda3\lib\site-packages\pandas\core\indexing.py", line 879, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\Users\Shivank Chadda\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1496, in _getitem_axis
self._validate_integer(key, axis)
File "C:\Users\Shivank Chadda\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1437, in _validate_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
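I have two questions:
(1) How do I resolve this error?
(2) I want to get an array containing the year, start_open, end_open and the percentage, instead of printing them in sentences. If possible, I would like to make a .csv of the collected data.
Please let me know what I should do next.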
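I can't test it, but the traceback shows that the IndexError is raised on this line:
start_open = sub_93.iloc[0]['open']
So you probably get an empty sub_93, which has no [0] (and no [-1]). The loop runs once per row of sub_df['year'] and n increases on every pass, so once n goes past the last year in the data the filter matches no rows.
You should check for that and skip the calculation: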
sub_93 = sub_df[(sub_df['year'] == n) & (sub_df['year'] < 2022)]
if len(sub_93) == 0:
    print('No data for year', n)
else:
    start_open = sub_93.iloc[0]['open']
    end_open = sub_93.iloc[-1]['open']
    per = ((end_open-start_open)/start_open)*100
    print('The value at the start of the year', n, 'is:', start_open, '\nThe value at the end of year', n,'is:', end_open)
n += 1
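Another way to avoid the empty slice is to loop over the years that actually occur in the data rather than over every row. The following is only a minimal sketch: the small DataFrame stands in for sub_df with made-up values, the helper name yearly_open_change is hypothetical, and it assumes the rows are already in date order within each year, as the question's code does.

```python
import pandas as pd

# Made-up sample standing in for sub_df (columns 'year' and 'open');
# the numbers are illustrative, not taken from the real file.
sub_df = pd.DataFrame({
    "year": [1993, 1993, 1993, 1994, 1994],
    "open": [40.0, 42.0, 50.0, 50.0, 75.0],
})


def yearly_open_change(frame):
    """Return [year, start_open, end_open, percent] for each year present.

    Hypothetical helper; assumes rows are in date order within each year.
    """
    rows = []
    # unique() only yields years that exist in the data,
    # so the filtered slice below is never empty
    for year in frame["year"].unique():
        one_year = frame[frame["year"] == year]
        start_open = one_year["open"].iloc[0]
        end_open = one_year["open"].iloc[-1]
        per = (end_open - start_open) / start_open * 100
        rows.append([int(year), float(start_open), float(end_open), float(per)])
    return rows


results = yearly_open_change(sub_df)
for year, start_open, end_open, per in results:
    print(year, start_open, end_open, per)
```

With this shape the loop body runs once per year, so the separate counter n and the empty check are no longer needed.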
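EDIT:
The second question (creating the list) looked simple, so I didn't even think about it at first.
Before the for-loop, create a list: results = []
Inside the for-loop, append the values: results.append([year, start_open, end_open, percentage])
You will get a list of sublists.
You can convert it to a pandas.DataFrame and save it as CSV: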
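# - before `for`-loop -
results = []
# - `for`-loop -
for i in sub_df['year']:
    # ... code ...
    # `year` and `percentage` correspond to `n` and `per` in the question's code
    results.append([year, start_open, end_open, percentage])
# - after `for`-loop -
# column names are passed via `columns=`
df_results = pd.DataFrame(results, columns=["Year", "Start", "End", "Percentage"])
# use index=False to leave the row index out of the file
#df_results.to_csv("output.csv", index=False)
df_results.to_csv("output.csv")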
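The same table can probably also be built without an explicit loop by grouping on the year. This is a sketch under assumptions, not a tested solution for the real file: the sample DataFrame is made up, the rows are assumed to be in date order, and the CSV is written to an in-memory buffer so the example does not touch the disk (passing a path such as "output.csv" to to_csv would write a file instead).

```python
import io

import pandas as pd

# Made-up sample standing in for sub_df; values are illustrative only.
sub_df = pd.DataFrame({
    "year": [1993, 1993, 1993, 1994, 1994],
    "open": [40.0, 42.0, 50.0, 50.0, 75.0],
})

# First and last 'open' per year (assumes rows are already in date order)
grouped = sub_df.groupby("year")["open"].agg(["first", "last"])
df_results = (
    grouped.rename(columns={"first": "Start", "last": "End"})
    .reset_index()
    .rename(columns={"year": "Year"})
)

# Percentage change from the first to the last open of each year
df_results["Percentage"] = (
    (df_results["End"] - df_results["Start"]) / df_results["Start"] * 100
)

# Write CSV text to a buffer; index=False leaves out the row index column
buffer = io.StringIO()
df_results.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Since groupby only produces groups for years that exist in the data, no year can come back empty, which sidesteps the IndexError from the question.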