Altair line chart adds unwanted extra day at the beginning

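I am creating a line chart from the data in my DataFrame. When I do, Altair plots some of the data on the day before the actual start date. How can I avoid this?

Here is the code that generates the chart: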

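import altair as alt
import streamlit as st

# df is the DataFrame that holds the creationTime and total columns
data = df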
x_val = 'monthdate(creationTime):T'
y_val = 'count(total):Q'

def line_chart(data, title_chart, x_val, y_val, title_x, title_y):

    lines = (
        alt.Chart(
            data,
            title=title_chart,
        )
        .mark_line(point=alt.OverlayMarkDef())
        .encode(
            x=alt.X(
                x_val,
                title=title_x,
                axis=alt.Axis(grid=False),
            ),
            y=alt.Y(
                y_val,
                title=title_y,
                axis=alt.Axis(grid=False)
            ),
        )
    )

    hover = alt.selection_single(
        fields=[x_val],
        nearest=True,
        on="mouseover",
        empty="none",
    )

    points = lines.transform_filter(hover).mark_circle(size=65)

    tooltips = (
        alt.Chart(data)
        .mark_rule()
        .encode(
            x=x_val,
            y=y_val,
            opacity=alt.condition(hover, alt.value(0.3), alt.value(0)),
            tooltip=[
                alt.Tooltip(x_val, title=title_x),
                alt.Tooltip(y_val, title=title_y),
            ],
        )
        .add_selection(hover)
    )

    final = (lines + points + tooltips).configure_view(strokeWidth=0).interactive()
    return st.altair_chart(final, use_container_width=True)

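It sounds like you are running into the issue reported at https://github.com/altair-viz/altair/issues/2540. There are more details there, but this workaround should fix it for now:

Original, shifted plot: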

import pandas as pd
import altair as alt

df = pd.DataFrame({
    'reportday': [
        '2021-11-08', '2021-11-09', '2021-11-10', '2021-11-11', '2021-11-12',
        '2021-11-15', '2021-11-16', '2021-11-17', '2021-11-18', '2021-11-19',
    ],
    'price': [328.0, 310.0, 301.0, 3330.0, 3278.0, 3200.0, 2189.0, 1701.0, 1698.0, 1703.0],
    'production': [24.75, 16.30, 14.77, 14.10, 27.70, 26.70, 29.05, 19.58, 24.88, 17.35]
})

alt.Chart(df).mark_bar().encode(
    x=alt.X('monthdate(reportday):O', axis=alt.Axis(labelAngle=325)),
    y='production'
)

Workaround:

df['reportday'] = pd.to_datetime(df['reportday']).dt.strftime('%b %d')

base = alt.Chart(df).encode(x=alt.X('reportday:O', axis=alt.Axis(labelAngle=325)))
line = base.mark_line(color='red').encode(y=alt.Y('price:Q', axis=alt.Axis(grid=True)))
bar = base.mark_bar().encode(y='production:Q')

(bar + line).resolve_scale(y='independent').properties(width=600)

I managed to solve this by calling .dt.tz_localize() on the datetime column of df. Apparently Altair treats dates as UTC by default.
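
For anyone who wants to try the same thing, here is a minimal sketch of the localization step. The column names come from the question; the sample rows and the time zone (Europe/Berlin) are assumptions of mine, so substitute whatever zone your timestamps were actually recorded in. The UTC conversion at the end is only there to illustrate how a timestamp close to midnight can end up on the previous calendar day; it is not part of the fix.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "creationTime": pd.to_datetime(["2021-11-08 00:30:00", "2021-11-09 12:00:00"]),
    "total": [1, 2],
})

# Assumption: the timestamps were recorded in this time zone.
local_tz = "Europe/Berlin"

# Attach the zone to the naive timestamps; the wall-clock values stay the same.
df["creationTime"] = df["creationTime"].dt.tz_localize(local_tz)

# Viewing the same instants in UTC shows how a reading near midnight
# can land on the previous calendar day.
as_utc = df["creationTime"].dt.tz_convert("UTC")
print(as_utc.dt.strftime("%Y-%m-%d %H:%M").tolist())
```

After localizing, pass df to the chart function as before and check whether the extra day is gone.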