Check if a value in a dataframe will increase or decrease by a certain percentage

I have an OHLC dataframe, for example:

index open close high low
2021-03-23 10:00:00+00:00 1421.100 1424.500 1427.720 1422.650
2021-03-23 11:00:00+00:00 1424.500 1421.480 1422.400 1411.890
2021-03-23 12:00:00+00:00 1421.480 1435.170 1443.980 1433.780
2021-03-23 13:00:00+00:00 1435.170 1440.860 1443.190 1437.590
2021-03-23 14:00:00+00:00 1440.860 1438.920 1443.570 1435.200
2021-03-23 15:00:00+00:00 1438.920 1435.990 1444.840 1435.060
2021-03-23 16:00:00+00:00 1435.990 1441.920 1446.610 1441.450

Now I want to know whether the price rises by 1% or falls by 1% first. What I have so far is the following working code:

def check(x):

    check = ohlc[ohlc.index > x.name]
    price = ohlc.at[x.name, 'close']

    high_thr = price * 1.01
    low_thr = price * 0.99

    high_indexes = check[check['high'] > high_thr]
    low_indexes = check[check['low'] < low_thr]

    if high_indexes.shape[0] > 0 and low_indexes.shape[0] > 0:

        high = high_indexes.index[0]
        low = low_indexes.index[0]

        if high < low:
            return 1
        elif high > low:
            return -1
        else:
            return 0
    else:
        return 0


ohlc['check'] = ohlc.apply(check, axis=1)

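This is very slow on larger datasets. Is there a better way than iterating over every row, slicing, and finding all the matching indexes just to get the nearest one?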

I think the best approach is not very different from what you are already doing:

from datetime import timedelta

def check(x, change=0.01):
    time = x.name
    price = ohlc.loc[time, 'close']
    while True:
        if time not in ohlc.index:          # If we reach the end
            return 0
        high = ohlc.loc[time, 'high']
        low = ohlc.loc[time, 'low']
        if high > (1.0 + change) * price:   # Upper thresh broken
            return 1
        elif low < (1.0 - change) * price:  # Lower thresh broken
            return -1
        time = time + timedelta(hours=1)    # Time update

ohlc['check'] = ohlc.apply(check, axis=1)

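If efficiency is your concern, applying it this way should be somewhat more efficient, because each row only looks ahead as far as it takes for a threshold to be broken. Alternatively, you can limit the number of checks per row to 100 by modifying the while loop: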

    endtime = time + timedelta(hours=100)
    while time < endtime:
        # etc
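
For reference, here is a minimal self-contained sketch of the same early-exit idea with a capped look-ahead. It is not the code above. It steps through rows by position over NumPy arrays instead of adding `timedelta(hours=1)`, so it does not depend on the index being gap-free, and it avoids a `.loc` lookup per step. The function name `first_break` and the parameter `max_lookahead` are made up for illustration. Two assumptions to be aware of: the scan starts at the row after the current one (as in the question's `ohlc.index > x.name`), and if a single bar breaks both thresholds, the upper one wins because it is checked first.

```python
import numpy as np
import pandas as pd


def first_break(ohlc, change=0.01, max_lookahead=100):
    """Return a Series with 1 if the upper threshold is broken first,
    -1 if the lower one is, and 0 if neither is broken within
    max_lookahead later rows (hypothetical helper, not from the answer)."""
    close = ohlc["close"].to_numpy()
    high = ohlc["high"].to_numpy()
    low = ohlc["low"].to_numpy()
    n = len(ohlc)
    result = np.zeros(n, dtype=int)

    for i in range(n):
        upper = close[i] * (1.0 + change)
        lower = close[i] * (1.0 - change)
        # Only look at later rows, and at most max_lookahead of them
        stop = min(n, i + 1 + max_lookahead)
        for j in range(i + 1, stop):
            if high[j] > upper:      # upper threshold broken first
                result[i] = 1
                break
            if low[j] < lower:       # lower threshold broken first
                result[i] = -1
                break
    return pd.Series(result, index=ohlc.index)
```

It would be used as `ohlc['check'] = first_break(ohlc)`. On the seven sample rows in the question this gives 1 for the first two rows and 0 for the rest, since the 12:00 high of 1443.980 is more than 1% above both earlier closes and no later bar moves 1% away from its reference close.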