ValueError: attempt to get argmax of an empty sequence when trying to pull index of max value in columns

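I'm trying to get the index of the maximum value in the Hi column so that I can look up the corresponding value in the Time column.

NumPy and pandas are up to date.

Output of the pandas DataFrame: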

        Date      Time     Hi       Lo      Open   Close    Volume
241 2021-12-10   9:28  175.190  175.100  175.120  175.15    10780
242 2021-12-10   9:29  175.270  175.150  175.150  175.23    12863
243 2021-12-10   9:30  176.030  175.140  175.250  175.71  1370478
244 2021-12-10   9:31  175.900  175.460  175.710  175.90   435577
245 2021-12-10   9:32  176.100  175.680  175.880  175.73   485381
246 2021-12-10   9:33  175.870  175.370  175.740  175.62   450575
247 2021-12-10   9:34  176.100  175.520  175.609  176.05   485467
248 2021-12-10   9:35  176.110  175.540  176.060  175.64   484336
249 2021-12-10   9:36  176.150  175.510  175.650  176.01   462430
250 2021-12-10   9:37  176.320  175.870  175.992  176.17   502685
251 2021-12-10   9:38  176.530  176.140  176.165  176.47   668669
252 2021-12-10   9:39  176.556  176.345  176.480  176.37   577773
253 2021-12-10   9:40  176.420  176.005  176.350  176.01   388618
254 2021-12-10   9:41  176.050  175.660  176.010  176.01   511461
255 2021-12-10   9:42  176.030  175.810  176.011  175.89   277475
256 2021-12-10   9:43  176.215  175.880  175.908  176.19   315341
257 2021-12-10   9:44  176.450  176.010  176.180  176.03   426582
258 2021-12-10   9:45  176.360  175.880  176.020  175.94   513756
259 2021-12-10   9:46  176.030  175.760  175.940  175.80   367906
260 2021-12-10   9:47  175.775  175.450  175.775  175.56   481068
261 2021-12-10   9:48  175.760  175.450  175.550  175.74   369607
262 2021-12-10   9:49  175.890  175.560  175.730  175.66   290529
263 2021-12-10   9:50  175.860  175.550  175.660  175.83   310516
264 2021-12-10   9:51  176.120  175.810  175.840  176.01   428011
265 2021-12-10   9:52  176.060  175.721  176.015  175.83   275272
266 2021-12-10   9:53  176.010  175.745  175.830  175.78   291982
267 2021-12-10   9:54  175.895  175.670  175.790  175.70   188332
268 2021-12-10   9:55  175.705  175.240  175.685  175.38   448620
269 2021-12-10   9:56  175.380  175.050  175.380  175.16   430128
270 2021-12-10   9:57  175.400  174.925  175.150  174.93   453117
271 2021-12-10   9:58  175.001  174.690  174.920  174.78   422128
272 2021-12-10   9:59  175.210  174.750  174.775  175.18   380997
273 2021-12-10  10:00  175.510  175.090  175.180  175.45   361698
274 2021-12-10  10:01  175.630  175.360  175.455  175.42   260332

The code I'm currently using is below:

import csv
import pandas as pd
import os
import numpy

Symbol = "AAPL"

with open(Symbol +'.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)

    df = pd.DataFrame(csv_reader, columns=['Date','Hi', 'Lo','Open','Close','Volume'])

    #df.set_index('Volume', inplace=True)

    new_df_no_expand = df['Date'].str.split(' ') #splitting the date from the time and adding the time to a new column
    New_df = df['Date'].str.split(' ', expand=True).rename(columns={0:'Date',1:'Time'}) #splitting the date from the time and adding the time to a new column, needs tweaking

    df[['Date', 'Time']] = df['Date'].str.split(' ', n=1, expand=True)
   
    df = pd.DataFrame(df, columns=['Date', 'Time', 'Hi', 'Lo','Open','Close','Volume'])

    df['Open'] = pd.to_numeric(df['Open'])
    df['Hi'] = pd.to_numeric(df['Hi'])
    df['Lo'] = pd.to_numeric(df['Lo'])
    df['Close'] = pd.to_numeric(df['Close'])
    df['Volume'] = pd.to_numeric(df['Volume'])

    df['Date'] = pd.to_datetime(df['Date'],infer_datetime_format=True)

    hodtime = df.loc[df[(df['Time'] >= "9:30") & (df['Time'] <= "10:00")]['Hi'].idxmax()]

    print(hodtime)

When I run it, I get this output:

ValueError: attempt to get argmax of an empty sequence

The output I want (the maximum value in the Hi column is at index 252):

High of Day Time is 9:39

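I suspect your problem is in the part that subsets the time interval; it probably returns an empty data structure, which is what this error is telling you. idxmax() cannot find the argmax of a Series that has nothing in it.

The part below is probably the problem. What does it look like if you print it? What is the dtype of the 'Time' column?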

df[(df['Time'] >= "9:30") & (df['Time'] <= "10:00")]
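
A quick check along those lines can be sketched as follows. It uses three hypothetical Time strings in the same format as the table in the question (the names `times` and `mask` are illustrative), and it only shows what the filter evaluates to when the column still holds strings:

```python
import pandas as pd

# Hypothetical sample: Time values as plain strings, as produced by str.split
times = pd.Series(["9:28", "9:39", "10:00"], name="Time")

# Strings report an object (or string) dtype, not a time-like dtype
print(times.dtype)

# String comparison is character by character, so "9:39" <= "10:00" is False
# ('9' sorts after '1') and "10:00" >= "9:30" is False as well
mask = (times >= "9:30") & (times <= "10:00")
print(mask.tolist())

# Nothing passes both conditions, so the subset is empty
print(times[mask].empty)
```

If the last line prints True on the real data as well, the call to idxmax() has nothing to work with.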

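You create the Date and Time columns with split and then convert the Date column with to_datetime, but the Time column is still of type object. Its values are strings, so they are compared character by character: "9:30" sorts after "10:00", which means no value can satisfy both conditions and the filter returns nothing. You need to convert the Time column to timedelta before performing the comparison.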

...
...
df['Date'] = pd.to_datetime(df['Date'],infer_datetime_format=True)

df['Time'] = pd.to_timedelta(df.Time+":00")

lower_time = pd.to_timedelta("9:30:00")
higher_time = pd.to_timedelta("10:00:00")

hodtime = df.loc[df[(df['Time'] >= lower_time) & (df['Time'] <= higher_time)]['Hi'].idxmax()]
print(hodtime)

Output:

Date           2021-12-10
Time      0 days 09:39:00
Hi                176.556
Lo                176.345
Open               176.48
Close              176.37
Volume             577773
Name: 252, dtype: object
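
To get from the matching row to the one-line result the question asks for, one option is to leave Time as the original string and do the comparison on a separate timedelta Series. This is a minimal sketch, assuming a five-row sample copied from the table in the question; the names `elapsed`, `in_window` and `hod_index` are illustrative, and `Series.between` (inclusive on both ends by default) stands in for the two explicit comparisons:

```python
import pandas as pd

# Five rows copied from the question's table; the index matches the original
df = pd.DataFrame(
    {
        "Time": ["9:29", "9:30", "9:39", "10:00", "10:01"],
        "Hi": [175.270, 176.030, 176.556, 175.510, 175.630],
    },
    index=[242, 243, 252, 273, 274],
)

# Build a timedelta view of Time without overwriting the string column
elapsed = pd.to_timedelta(df["Time"] + ":00")

# Keep rows from 9:30 to 10:00, both ends included
in_window = elapsed.between(pd.Timedelta(hours=9, minutes=30), pd.Timedelta(hours=10))
window = df.loc[in_window, "Hi"]

# Guard against an empty window so idxmax() is never called on nothing
hod_index = None if window.empty else window.idxmax()

if hod_index is None:
    print("No rows between 9:30 and 10:00")
else:
    print(f"High of Day Time is {df.loc[hod_index, 'Time']}")
```

Keeping the string column untouched means the printed time stays in the form 9:39 rather than 0 days 09:39:00.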