Empty scatter plot in Altair

I'm confused about why this code doesn't work:

#@title Visualize trends in accelerators

x_axis = 'Release Year'
y_axis = 'FP32 (single precision) Performance [FLOP/s]'

# Libraries
import pandas as pd
import numpy as np
import altair as alt

# Download data
key = '1AAIebjNsnJj_uKALHbXNfn3_YsT6sHXtCU0q7OIPuc4'
sheet_name = 'HARDWARE_DATA'
url = f'https://docs.google.com/spreadsheets/d/{key}/gviz/tq?tqx=out:csv&sheet={sheet_name}'
df = pd.read_csv(url)

# Filter NaN datapoints
df = df[~df[x_axis].isna()]
df = df[~df[y_axis].isna()]

# Plot the dataset
alt.themes.enable('fivethirtyeight')
alt.Chart(df).mark_circle(size=60).encode(
  x=alt.X(f'{x_axis}:Q',
          scale=alt.Scale(
                          domain=(df[x_axis].min(), df[x_axis].max())
                          ),
          axis=alt.Axis(format=".0f")
          ),
  y=alt.Y(f'{y_axis}:Q',
          scale=alt.Scale(
                          domain=(df[y_axis].min(), df[y_axis].max())
                          ),
          axis=alt.Axis(format=".1e")
          ),
)

It works when I plot the same data with seaborn:

import seaborn as sns
sns.set_theme()
sns.regplot(x=df[x_axis], y=df[y_axis]);

No error message is shown, just an empty plot. The browser console shows this warning:

DevTools failed to load source map: Could not load content for https://cdn.jsdelivr.net/npm/vega-embed.min.js.map: HTTP error: status code 404, net::ERR_HTTP_RESPONSE_CODE_FAILURE

What is going on?

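The problem is that your column name contains special characters, which need to be escaped in Altair (see, for example, the documentation of the field parameter at https://altair-viz.github.io/user_guide/generated/core/altair.Color.html?highlight=escape).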

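Why is that? In Vega-Lite, the characters ., [ and ] are used to access nested attributes of a column, so a name like 'Performance [FLOP/s]' is read as a path into a nested object rather than as a single column name.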
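To make the nested-access behavior concrete, here is a deliberately simplified model in plain Python. It is not Vega-Lite's actual parser: the functions split_field_path and lookup, the sample row, and its value are all hypothetical and exist only to illustrate why an unescaped name finds no data while an escaped one does.

```python
def split_field_path(field):
    """Split a field string on '.', '[' and ']', honouring backslash escapes.

    Simplified illustration only; the real Vega-Lite rules are richer.
    """
    segments, current = [], ""
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            current += field[i + 1]  # an escaped character is taken literally
            i += 2
            continue
        if ch in ".[]":
            if current:
                segments.append(current)  # a special character ends a segment
            current = ""
        else:
            current += ch
        i += 1
    if current:
        segments.append(current)
    return segments


def lookup(record, field):
    """Walk the path segments through nested dicts; None if any key is missing."""
    value = record
    for key in split_field_path(field):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


# Hypothetical one-row dataset with a made-up value.
row = {"Performance [FLOP/s]": 1.5e12}

unescaped = lookup(row, "Performance [FLOP/s]")    # treated as a nested path
escaped = lookup(row, r"Performance \[FLOP/s\]")   # treated as one column name
```

In this model the unescaped name is split into two segments, 'Performance ' and 'FLOP/s', neither of which exists as a key, so every row yields no value. That matches the symptom in the question: no error, just an empty chart.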

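The simplest fix is to avoid such special characters in dataframe column names. Alternatively, you can escape the special characters with a backslash (\), but be careful about how Python treats backslashes in string literals (use the r prefix to write a raw string). For example: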

x_axis = 'Release Year'
y_axis = 'FP32 (single precision) Performance [FLOP/s]'
y_axis_escaped = r'FP32 (single precision) Performance \[FLOP/s\]'

# Libraries
import pandas as pd
import numpy as np
import altair as alt

# Download data
key = '1AAIebjNsnJj_uKALHbXNfn3_YsT6sHXtCU0q7OIPuc4'
sheet_name = 'HARDWARE_DATA'
url = f'https://docs.google.com/spreadsheets/d/{key}/gviz/tq?tqx=out:csv&sheet={sheet_name}'
df = pd.read_csv(url)

# Filter NaN datapoints
df = df[~df[x_axis].isna()]
df = df[~df[y_axis].isna()]

# Plot the dataset
alt.themes.enable('fivethirtyeight')
alt.Chart(df).mark_circle(size=60).encode(
  x=alt.X(f'{x_axis}:Q',
          scale=alt.Scale(
                          domain=(df[x_axis].min(), df[x_axis].max())
                          ),
          axis=alt.Axis(format=".0f")
          ),
  y=alt.Y(f'{y_axis_escaped}:Q',
          scale=alt.Scale(
                          domain=(df[y_axis].min(), df[y_axis].max())
                          ),
          axis=alt.Axis(format=".1e")
          ),
)
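If several columns are affected, writing the escaped names by hand gets tedious. The sketch below shows both options from this answer as small helpers. The names escape_field and strip_special are hypothetical (they are not part of Altair or pandas), and the sketch assumes the only characters that need handling are ., [ and ].

```python
import re

# Characters that Vega-Lite treats as nested-access syntax (per the answer above).
SPECIAL = r"[.\[\]]"


def escape_field(name):
    """Return the name with '.', '[' and ']' prefixed by a backslash."""
    return re.sub(f"({SPECIAL})", r"\\\1", name)


def strip_special(columns):
    """Map each column name to a copy with the special characters removed."""
    mapping = {}
    for name in columns:
        cleaned = re.sub(SPECIAL, "", name)
        cleaned = re.sub(r"\s+", " ", cleaned).strip()  # tidy leftover spaces
        mapping[name] = cleaned
    return mapping


columns = ['Release Year', 'FP32 (single precision) Performance [FLOP/s]']

# Option 1: keep the column names and escape them only in the encoding.
y_axis_escaped = escape_field(columns[1])

# Option 2: build a rename mapping so no escaping is needed afterwards.
rename_map = strip_special(columns)
```

With option 1, y_axis_escaped can be passed to alt.Y exactly as in the code above. With option 2, the mapping can be applied with df = df.rename(columns=rename_map), after which the plain new name works in the encoding. Note that stripping characters can make two different column names collide, so check the mapping before applying it.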