Ignore non-trading days (holidays / remove gaps) in Plotly candlesticks for minute / hourly data

But I don't know where to use that parameter (I'm fresh from matplotlib and only using Plotly for candlesticks).

Following some links, I found that running the code below on daily data removes the gaps easily:


import pandas as pd
import plotly.graph_objects as go

# `stocks` (the OHLC DataFrame) and `freq` (bar size in minutes) are defined elsewhere
dt_all = pd.date_range(start=stocks.iloc[0, 0], end=stocks.iloc[-1, 0], freq=f'{freq}min')
dt_obs = [d.strftime("%Y-%m-%d %H:%M:%S") for d in pd.to_datetime(stocks['DATE'])]
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d %H:%M:%S").tolist() if d not in dt_obs]

range_selector = dict(buttons=[
    dict(count=5, label='5Min', step='minute', stepmode='backward'),
    dict(count=15, label='15Min', step='minute', stepmode='backward'),
    dict(count=75, label='75M', step='minute', stepmode='backward'),
    dict(count=1, label='1D', step='day', stepmode='backward'),
    dict(step='all')])

# column names follow the table shown below (DATE, OPEN, HIGH, LOW, CLOSE)
candle = go.Figure(data=[go.Candlestick(
    opacity=0.9, x=stocks['DATE'], name='X',
    open=stocks['OPEN'],
    high=stocks['HIGH'],
    low=stocks['LOW'],
    close=stocks['CLOSE'])])

candle.update_xaxes(
    title_text='Date',
    rangeslider_visible=True,
    rangebreaks=[dict(values=dt_breaks)],
    rangeselector=range_selector)

But I have 5-minute data:

    DATE    OPEN    HIGH    LOW CLOSE   52W H   52W L   SYMBOL
374 2022-01-14 15:25:00+05:30   720.25  722.35  720.25  721.55  NaN NaN BHARTIARTL
373 2022-01-14 15:20:00+05:30   720.30  720.45  719.45  720.25  NaN NaN BHARTIARTL
372 2022-01-14 15:15:00+05:30   720.75  720.90  720.15  720.30  NaN NaN BHARTIARTL
371 2022-01-14 15:10:00+05:30   720.35  720.90  720.20  720.70  NaN NaN BHARTIARTL
370 2022-01-14 15:05:00+05:30   720.70  720.90  720.05  720.20  NaN NaN BHARTIARTL
... ... ... ... ... ... ... ... ...
4   2022-01-10 09:35:00+05:30   706.05  707.15  705.65  706.55  NaN NaN BHARTIARTL
3   2022-01-10 09:30:00+05:30   705.90  706.40  705.05  706.05  NaN NaN BHARTIARTL
2   2022-01-10 09:25:00+05:30   707.10  707.95  705.60  705.60  NaN NaN BHARTIARTL
1   2022-01-10 09:20:00+05:30   709.00  709.40  706.15  707.10  NaN NaN BHARTIARTL
0   2022-01-10 09:15:00+05:30   705.40  709.00  705.40  708.55  NaN NaN BHARTIARTL

Using the code above, the result is:

What can be done in this case?

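There are other answers that may suit your use case better. This can be a bit tricky for 5-minute intervals. Just keep the format of the timestamps in mind and work carefully through the following steps: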

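  • Find all time intervals from the first observation to the last
  • Find which of your observations occur on that complete timeline
  • Isolate the remaining timestamps and include them in the rangebreaks attribute of the x-axis
  • Adjust the dvalue attribute of rangebreaks to match your time interval in milliseconds: fig.update_xaxes(rangebreaks=[dict(dvalue = 5*60*1000, values=dt_breaks)])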
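
The millisecond value in the last step can also be derived from the frequency string rather than typed by hand. A minimal sketch, assuming pandas is available; `freq_to_dvalue_ms` is a hypothetical helper name and not part of Plotly or pandas:

```python
import pandas as pd

def freq_to_dvalue_ms(freq: str) -> int:
    """Convert a fixed frequency string such as '5min' to milliseconds."""
    # Dividing two Timedeltas yields a plain float ratio
    return int(pd.Timedelta(freq) / pd.Timedelta(milliseconds=1))

dvalue = freq_to_dvalue_ms("5min")
print(dvalue)  # same value as 5*60*1000
```

The result can then be passed as `dvalue` next to `values=dt_breaks`, which keeps the gap size and the `freq` used for the continuous range in sync.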

Essential code elements:

# grab the first and last observations from df['Date'] and build a continuous date range from them
dt_all = pd.date_range(start=df['Date'].iloc[0],end=df['Date'].iloc[-1], freq = '5min')

# format the observed dates as strings so they can be compared with the continuous range
dt_obs = [d.strftime("%Y-%m-%d %H:%M:%S") for d in df['Date']]

# isolate missing timestamps
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d %H:%M:%S").tolist() if d not in dt_obs]

# adjust xaxis for rangebreaks
fig.update_xaxes(rangebreaks=[dict(dvalue = 5*60*1000, values=dt_breaks)])
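
The string round-trip above is one way to compare timestamps. As an alternative sketch (a swapped-in technique, not the approach used in this answer), the missing timestamps can be isolated with `DatetimeIndex.difference`. The helper name `missing_timestamps` and the three sample timestamps are made up for illustration:

```python
import pandas as pd

def missing_timestamps(dates, freq="5min"):
    """Return the timestamps on a regular grid that have no observation."""
    observed = pd.DatetimeIndex(pd.to_datetime(dates))
    # Continuous grid from the earliest to the latest observation
    full = pd.date_range(start=observed.min(), end=observed.max(), freq=freq)
    # Set difference: grid points that were never observed
    return full.difference(observed)

# Hypothetical observations with a 15-minute hole after 00:05
dates = ["2018-01-01 00:00", "2018-01-01 00:05", "2018-01-01 00:20"]
dt_breaks = missing_timestamps(dates)
print(list(dt_breaks.strftime("%H:%M")))  # ['00:10', '00:15']
```

Because `min()` and `max()` are used instead of `iloc[0]` and `iloc[-1]`, this variant does not depend on the row order of the source data.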

Plot 1: missing timestamps shown

Plot 2: missing timestamps hidden

Complete code:

import plotly.graph_objects as go
from plotly.subplots import make_subplots
import pandas as pd
import numpy as np

# sample data
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv').tail(90)
df = df[df.columns[:6]]
df['Date'] = pd.date_range("2018-01-01", periods=len(df), freq="5min")
df.columns = ['Date', 'Open', 'High', 'Low', 'Close', 'Volume']
df = df.tail(10)

# remove some data
np.random.seed(0)
remove_n = 4
drop_indices = np.random.choice(df.index, remove_n, replace=False)
df = df.drop(drop_indices)

# plotly candlestick figure
fig = go.Figure(data=[go.Candlestick(
    x=df['Date'],
    open=df['Open'], high=df['High'],
    low=df['Low'], close=df['Close'],
)])

# grab the first and last observations from df['Date'] and build a continuous date range from them
dt_all = pd.date_range(start=df['Date'].iloc[0],end=df['Date'].iloc[-1], freq = '5min')

# format the observed dates as strings so they can be compared with the continuous range
dt_obs = [d.strftime("%Y-%m-%d %H:%M:%S") for d in df['Date']]

# isolate missing timestamps
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d %H:%M:%S").tolist() if d not in dt_obs]
dt_breaks = pd.to_datetime(dt_breaks)

# plot 1: gaps for the missing timestamps are still visible
fig.show()

# plot 2: hide the missing timestamps; dvalue is the 5-minute interval in milliseconds
fig.update_xaxes(rangebreaks=[dict(dvalue = 5*60*1000, values=dt_breaks)] )
print(fig.layout.xaxis.rangebreaks)
fig.show()
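
One detail worth checking against the question's data: the table there is sorted newest-first and carries a +05:30 offset. With that ordering, `iloc[0]` is the latest timestamp, so a date range built from `iloc[0]` to `iloc[-1]` comes out empty and no breaks are found. This is a possible cause, not a confirmed diagnosis. The sketch below uses three rows copied from the question, with the 09:20 bar left out on purpose to create a gap:

```python
import pandas as pd

# Rows shaped like the question's table: newest first, +05:30 offset.
# The 09:20 bar is deliberately omitted for illustration.
stocks = pd.DataFrame({
    "DATE": ["2022-01-10 09:30:00+05:30",
             "2022-01-10 09:25:00+05:30",
             "2022-01-10 09:15:00+05:30"],
    "OPEN": [705.90, 707.10, 705.40],
    "HIGH": [706.40, 707.95, 709.00],
    "LOW": [705.05, 705.60, 705.40],
    "CLOSE": [706.05, 705.60, 708.55],
})
stocks["DATE"] = pd.to_datetime(stocks["DATE"])

# Newest-first order: start is later than end, so the range is empty
reversed_range = pd.date_range(start=stocks["DATE"].iloc[0],
                               end=stocks["DATE"].iloc[-1], freq="5min")

# Sort ascending first, then build the continuous range and isolate the gaps
stocks = stocks.sort_values("DATE").reset_index(drop=True)
dt_all = pd.date_range(start=stocks["DATE"].iloc[0],
                       end=stocks["DATE"].iloc[-1], freq="5min")
dt_breaks = dt_all.difference(pd.DatetimeIndex(stocks["DATE"]))

print(len(reversed_range), list(dt_breaks))
```

The resulting `dt_breaks` would then go into `rangebreaks` together with `dvalue = 5*60*1000`, as in the complete code above.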