使用分钟条数据(多个股票,一个 df 中的多个会话)SettingWithCopyWarning

Get stock Low of Day (LOD) price for incomplete daily bar using minute bar data (multiple stocks, multiple sessions in one df) SettingWithCopyWarning

我有多个股票的 分钟 数据的数据框,每个股票都有多个会话。请参阅下面的示例

         Symbol            Time     Open    High    Low  Close  Volume  LOD
2724312   AEHR 2019-09-23 09:31:00   1.42   1.42   1.42   1.42     200  NaN
2724313   AEHR 2019-09-23 09:43:00   1.35   1.35   1.34   1.34    6062  NaN
2724314   AEHR 2019-09-23 09:58:00   1.35   1.35   1.29   1.30    8665  NaN
2724315   AEHR 2019-09-23 09:59:00   1.32   1.32   1.32   1.32     100  NaN
2724316   AEHR 2019-09-23 10:00:00   1.35   1.35   1.35   1.35     400  NaN
...        ...                 ...    ...    ...    ...    ...     ...  ...
4266341     ZI 2021-09-10 15:56:00  63.08  63.16  63.08  63.15   18205  NaN
4266342     ZI 2021-09-10 15:57:00  63.14  63.14  63.07  63.07   19355  NaN
4266343     ZI 2021-09-10 15:58:00  63.07  63.12  63.07  63.10   16650  NaN
4266344     ZI 2021-09-10 15:59:00  63.09  63.12  63.06  63.11   25775  NaN
4266345     ZI 2021-09-10 16:00:00  63.11  63.17  63.11  63.17   28578  NaN

我需要 (LOD) 的低点(LOD),直至每一行中的时间(9:30-4pm)。

完成的 df 应该是这样的

         Symbol            Time     Open    High    Low  Close  Volume  LOD
2724312   AEHR 2019-09-23 09:31:00   1.42   1.42   1.42   1.42     200  1.42   
2724313   AEHR 2019-09-23 09:43:00   1.35   1.35   1.34   1.34    6062  1.34   
2724314   AEHR 2019-09-23 09:58:00   1.35   1.35   1.29   1.30    8665  1.29   
2724315   AEHR 2019-09-23 09:59:00   1.32   1.32   1.32   1.32     100  1.29   
2724316   AEHR 2019-09-23 10:00:00   1.35   1.35   1.35   1.35     400  1.29   
...        ...                 ...    ...    ...    ...    ...     ...  ...
4266341     ZI 2021-09-10 15:56:00  63.08  63.16  63.08  63.15   18205  63.08  
4266342     ZI 2021-09-10 15:57:00  63.14  63.14  63.07  63.07   19355  63.07  
4266343     ZI 2021-09-10 15:58:00  63.07  63.12  63.07  63.10   16650  63.07  
4266344     ZI 2021-09-10 15:59:00  63.09  63.12  63.06  63.11   25775  63.06  
4266345     ZI 2021-09-10 16:00:00  63.11  63.17  63.11  63.17   28578  63.06 

我目前的解决方案

prev_symbol = "WXYZ"
prev_low = 10000000
prev_session = datetime.date(1920, 1, 1)
session_start = 1

for i, row in df.iterrows():
    current_session = (df['Time'].iloc[i]).time()
    current_symbol = df['Symbol'].iloc[i]
    if current_symbol == prev_symbol:
        if current_session == prev_session:
            sesh_low = df.iloc[session_start:i, 'Low'].min()
            df.at[i, 'LOD'] = sesh_low
        else:
            df.at[i, 'LOD'] = df.at[i, 'Low']
            prev_session = current_session
            session_start = i
    else:
        df.at[i, 'LOD'] = df.at[i, 'Low']
        prev_symbol = current_symbol
        prev_session = current_session
        session_start = i

这 returns 一个 SettingWithCopyWarning 错误。请帮助

你可以试试 .groupby() + .expanding():

# if you have values already converted/sorted, skip:
# df["Time"] = pd.to_datetime(df["Time"])
# df = df.sort_values(by=["Symbol", "Time"])

df["LOD"] = df.groupby("Symbol")["Low"].expanding().min().values
print(df)

打印:

        Symbol                 Time   Open   High    Low  Close  Volume    LOD
2724312   AEHR  2019-09-23 09:31:00   1.42   1.42   1.42   1.42     200   1.42
2724313   AEHR  2019-09-23 09:43:00   1.35   1.35   1.34   1.34    6062   1.34
2724314   AEHR  2019-09-23 09:58:00   1.35   1.35   1.29   1.30    8665   1.29
2724315   AEHR  2019-09-23 09:59:00   1.32   1.32   1.32   1.32     100   1.29
2724316   AEHR  2019-09-23 10:00:00   1.35   1.35   1.35   1.35     400   1.29
4266341     ZI  2021-09-10 15:56:00  63.08  63.16  63.08  63.15   18205  63.08
4266342     ZI  2021-09-10 15:57:00  63.14  63.14  63.07  63.07   19355  63.07
4266343     ZI  2021-09-10 15:58:00  63.07  63.12  63.07  63.10   16650  63.07
4266344     ZI  2021-09-10 15:59:00  63.09  63.12  63.06  63.11   25775  63.06
4266345     ZI  2021-09-10 16:00:00  63.11  63.17  63.11  63.17   28578  63.06