.resample('W') picks only one column although the subset includes two columns

Question

I am trying to resample a pandas DataFrame after subsetting it to two columns. Below is the head of the DataFrame. Each of the two columns is a pandas Series.

temp_2011_clean[['visibility', 'dry_bulb_faren']].head()
                    visibility  dry_bulb_faren
2011-01-01 00:53:00     10.00   51.0
2011-01-01 01:53:00     10.00   51.0
2011-01-01 02:53:00     10.00   51.0
2011-01-01 03:53:00     10.00   50.0
2011-01-01 04:53:00     10.00   50.0

type(temp_2011_clean['visibility'])
pandas.core.series.Series

type(temp_2011_clean['dry_bulb_faren'])
pandas.core.series.Series

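Calling .resample('W') successfully creates the resampler object, but when I chain .mean() onto it, the result contains only one column instead of the expected two. Can someone suggest what the problem might be? Why is one column dropped?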

temp_2011_clean[['visibility', 'dry_bulb_faren']].resample('W')
<pandas.core.resample.DatetimeIndexResampler object at 0x0000016F4B943288>

temp_2011_clean[['visibility', 'dry_bulb_faren']].resample('W').mean().head()
            dry_bulb_faren
2011-01-02  44.791667
2011-01-09  50.246637
2011-01-16  41.103774
2011-01-23  47.194313
2011-01-30  53.486188

Answer 1

I think the problem is that the visibility column is not numeric, so .mean() excludes it as a non-numeric column.

print(temp_2011_clean.dtypes)
visibility         object
dry_bulb_faren    float64
dtype: object

df = temp_2011_clean[['visibility', 'dry_bulb_faren']].resample('W').mean()
print(df)
            dry_bulb_faren
2011-01-02            50.6
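
To check which columns would survive a numeric aggregation before resampling, something like the following sketch may help. The frame `sample` and its values are made up to mimic the five rows shown in the question, with visibility stored as strings. The silent drop shown above is the behaviour in the question's output; as far as I know, pandas 2.0 and later raise a TypeError instead unless numeric_only=True is passed, so the check below avoids calling .mean() on the unconverted frame.

```python
import pandas as pd

# Hypothetical data mimicking the question: 'visibility' holds strings,
# 'dry_bulb_faren' holds floats.
idx = pd.date_range("2011-01-01 00:53", periods=5, freq=pd.Timedelta(hours=1))
sample = pd.DataFrame(
    {
        "visibility": ["10.00", "10.00", "10.00", "10.00", "10.00"],
        "dry_bulb_faren": [51.0, 51.0, 51.0, 50.0, 50.0],
    },
    index=idx,
)

# Columns with a numeric dtype; only these can be averaged.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
# Everything else is a candidate for being dropped or raising an error.
non_numeric_cols = sample.columns.difference(numeric_cols).tolist()

print(numeric_cols)
print(non_numeric_cols)
```

Any column listed in `non_numeric_cols` needs converting before .resample('W').mean() can include it.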

The fix is to convert the column to numeric with to_numeric, using errors='coerce' to turn non-numeric values into NaN:

temp_2011_clean['visibility'] = pd.to_numeric(temp_2011_clean['visibility'], errors='coerce')

print(temp_2011_clean.dtypes)
visibility        float64
dry_bulb_faren    float64
dtype: object

df = temp_2011_clean[['visibility', 'dry_bulb_faren']].resample('W').mean()
print(df)
            visibility  dry_bulb_faren
2011-01-02        10.0            50.6
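
Because errors='coerce' silently replaces anything unparseable with NaN, it can be worth looking at the offending raw values first. Below is a minimal, self-contained sketch of the whole fix on made-up data; the frame `sample` and the stray entry "M" are assumptions for illustration, not values from the original dataset.

```python
import pandas as pd

# Hypothetical data: one 'visibility' entry ("M") is not a number.
idx = pd.date_range("2011-01-01 00:53", periods=5, freq=pd.Timedelta(hours=1))
sample = pd.DataFrame(
    {
        "visibility": ["10.00", "10.00", "M", "10.00", "10.00"],
        "dry_bulb_faren": [51.0, 51.0, 51.0, 50.0, 50.0],
    },
    index=idx,
)

# Parse what can be parsed; unparseable entries become NaN.
converted = pd.to_numeric(sample["visibility"], errors="coerce")

# Raw values that were present but could not be converted.
bad_values = sample.loc[converted.isna() & sample["visibility"].notna(), "visibility"]
print(bad_values.unique())

# Replace the column with its numeric version, then resample weekly.
sample["visibility"] = converted
weekly = sample.resample("W").mean()
print(weekly)
```

With both columns numeric, the weekly mean keeps visibility and dry_bulb_faren; NaN entries are skipped when the mean is computed.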
