.resample('W') picks only one column although the subset includes two columns
I am trying to resample a Pandas DataFrame after subsetting it to 2 columns. Below is the head of the DataFrame. Both columns are Pandas Series.
temp_2011_clean[['visibility', 'dry_bulb_faren']].head()
                    visibility  dry_bulb_faren
2011-01-01 00:53:00      10.00            51.0
2011-01-01 01:53:00      10.00            51.0
2011-01-01 02:53:00      10.00            51.0
2011-01-01 03:53:00      10.00            50.0
2011-01-01 04:53:00      10.00            50.0
type(temp_2011_clean['visibility'])
pandas.core.series.Series
type(temp_2011_clean['dry_bulb_faren'])
pandas.core.series.Series
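The .resample('W') method successfully creates the resampler object, but when I chain .mean() onto that same object it picks up only one column instead of the expected two. Can someone suggest what the issue might be? Why is one column dropped?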
temp_2011_clean[['visibility', 'dry_bulb_faren']].resample('W')
<pandas.core.resample.DatetimeIndexResampler object at 0x0000016F4B943288>
temp_2011_clean[['visibility', 'dry_bulb_faren']].resample('W').mean().head()
            dry_bulb_faren
2011-01-02       44.791667
2011-01-09       50.246637
2011-01-16       41.103774
2011-01-23       47.194313
2011-01-30       53.486188
I think the problem is that the visibility column is not numeric, so .mean() excludes it as a non-numeric column.
print(temp_2011_clean.dtypes)
visibility         object
dry_bulb_faren    float64
dtype: object
df = temp_2011_clean[['visibility', 'dry_bulb_faren']].resample('W').mean()
print(df)
            dry_bulb_faren
2011-01-02            50.6
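Before converting, it can help to see which entries are actually keeping the column from being numeric. The following is only a sketch: the Series below is made-up sample data (the values "M" and "unknown" are hypothetical placeholders, not taken from the real dataset), and it uses pd.to_numeric with errors='coerce' purely to flag the values that fail to parse.

```python
import pandas as pd

# Hypothetical stand-in for temp_2011_clean['visibility']:
# numbers stored as strings, plus two made-up non-numeric markers.
visibility = pd.Series(["10.00", "9.00", "M", "10.00", "unknown"])

# Values that cannot be parsed as numbers become NaN.
converted = pd.to_numeric(visibility, errors="coerce")

# Keep the original entries that failed to parse (and were not already missing).
bad_values = visibility[converted.isna() & visibility.notna()]

print(bad_values.unique())
```

If this prints nothing but an empty array on the real column, the values are numeric strings and the conversion loses no information.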
So convert the column to numeric with to_numeric, using errors='coerce' to turn any non-numeric values into NaNs:
temp_2011_clean['visibility'] = pd.to_numeric(temp_2011_clean['visibility'], errors='coerce')
print(temp_2011_clean.dtypes)
visibility        float64
dry_bulb_faren    float64
dtype: object
df = temp_2011_clean[['visibility', 'dry_bulb_faren']].resample('W').mean()
print(df)
            visibility  dry_bulb_faren
2011-01-02        10.0            50.6
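For anyone who wants to reproduce this without the original data, here is a minimal self-contained sketch of the same fix. The DataFrame name sample and the bad value "M" are assumptions made for illustration; the timestamps and temperatures mirror the head shown in the question. Handling of non-numeric columns in .mean() has varied across pandas versions (older releases silently drop them, as in the question, while newer ones may raise an error instead), so the sketch converts the column before aggregating.

```python
import pandas as pd

# Timestamps copied from the head of the DataFrame in the question.
idx = pd.to_datetime([
    "2011-01-01 00:53:00",
    "2011-01-01 01:53:00",
    "2011-01-01 02:53:00",
    "2011-01-01 03:53:00",
    "2011-01-01 04:53:00",
])

# Hypothetical sample: visibility is stored as strings, with one
# made-up non-numeric entry ("M") to show what errors='coerce' does.
sample = pd.DataFrame(
    {
        "visibility": ["10.00", "10.00", "M", "10.00", "10.00"],
        "dry_bulb_faren": [51.0, 51.0, 51.0, 50.0, 50.0],
    },
    index=idx,
)

# Convert to float; "M" becomes NaN and is skipped by mean().
sample["visibility"] = pd.to_numeric(sample["visibility"], errors="coerce")

# Both columns are numeric now, so both appear in the weekly mean.
weekly = sample[["visibility", "dry_bulb_faren"]].resample("W").mean()
print(weekly)
```

Note that errors='coerce' silently discards whatever the non-numeric entries meant, so it is worth checking how many NaNs the conversion introduces before relying on the averages.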