Apply a timedelta conversion to columns containing a given string
I have the following DataFrame in Python pandas:
date | time_SEL | time_02_SEL_01 | time_03_SEL_05 | other |
---|---|---|---|---|
2022-01-01 | 34756 | 233232 | 3432423 | 756 |
2022-01-03 | 23322 | 4343 | 3334 | 343 |
2022-02-01 | 123232 | 3242 | 23423 | 434 |
2022-03-01 | 7323232 | 32423 | 323423 | 34324 |
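All columns except date hold an amount of time in seconds. My idea is to convert these values to timedelta, keeping in mind that I only want to apply the change to the columns whose names contain the string "_SEL".
Of course I want to select them by that string, because in the original dataset more than 3 columns contain it. If there were only 3, I would know how to do it by hand.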
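Use DataFrame.filter to get all columns whose names end in _SEL, convert them to timedeltas with to_timedelta, and replace the originals with DataFrame.update. (The pattern '_SEL$' matches only names that end in _SEL; for names such as time_02_SEL_01, use regex='_SEL' or like='_SEL' instead.)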
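import pandas as pd
# Select the columns ending in _SEL, convert seconds to timedelta, write back in place
df.update(df.filter(regex='_SEL$').apply(lambda x: pd.to_timedelta(x, unit='s')))
print(df)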
         date          time_SEL      time_02_SEL       time_03_SEL  other
0  2022-01-01   0 days 09:39:16  2 days 16:47:12  39 days 17:27:03    756
1  2022-01-03   0 days 06:28:42  0 days 01:12:23   0 days 00:55:34    343
2  2022-02-01   1 days 10:13:52  0 days 00:54:02   0 days 06:30:23    434
3  2022-03-01  84 days 18:13:52  0 days 09:00:23   3 days 17:50:23  34324
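If you want to reproduce this end to end, here is a minimal sketch. It rebuilds the sample frame with the shorter column names shown in the output above (an assumption; the question's table uses longer names), and it assigns the converted columns back by name rather than calling update. That swap is my own choice to keep the resulting dtype a real timedelta; sel_cols is just an illustrative name.

```python
import pandas as pd

# Sample data from the question, with column names as shown in the answer's output
df = pd.DataFrame({
    "date": ["2022-01-01", "2022-01-03", "2022-02-01", "2022-03-01"],
    "time_SEL": [34756, 23322, 123232, 7323232],
    "time_02_SEL": [233232, 4343, 3242, 32423],
    "time_03_SEL": [3432423, 3334, 23423, 323423],
    "other": [756, 343, 434, 34324],
})

# Names of the columns that end in _SEL
sel_cols = list(df.filter(regex="_SEL$").columns)

# Convert seconds to timedelta and replace those columns by name
df[sel_cols] = df[sel_cols].apply(pd.to_timedelta, unit="s")

print(df.dtypes)
print(df)
```

The date and other columns are left untouched, since they are not in sel_cols.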
Another idea is to filter the columns with Series.str.endswith:
m = df.columns.str.endswith('_SEL')
df.loc[:, m] = df.loc[:, m].apply(lambda x: pd.to_timedelta(x, unit='s'))
print(df)
         date          time_SEL      time_02_SEL       time_03_SEL  other
0  2022-01-01   0 days 09:39:16  2 days 16:47:12  39 days 17:27:03    756
1  2022-01-03   0 days 06:28:42  0 days 01:12:23   0 days 00:55:34    343
2  2022-02-01   1 days 10:13:52  0 days 00:54:02   0 days 06:30:23    434
3  2022-03-01  84 days 18:13:52  0 days 09:00:23   3 days 17:50:23  34324
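It is worth checking whether you need "ends with" or "contains" for your real column names. A quick sketch using the names from the question's table (the variable names here are only illustrative):

```python
import pandas as pd

# Column names as they appear in the question's table
cols = pd.Index(["date", "time_SEL", "time_02_SEL_01", "time_03_SEL_05", "other"])

# True only where the name ends in _SEL
ends_mask = cols.str.endswith("_SEL")

# True wherever _SEL appears anywhere in the name (plain substring, no regex)
contains_mask = cols.str.contains("_SEL", regex=False)

print(list(cols[ends_mask]))
print(list(cols[contains_mask]))
```

With these names, the "ends with" mask picks up only time_SEL, while the "contains" mask picks up all three time columns.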
Edit: to convert the column values to integers first, use .astype(int):
df.update(df.filter(regex='_SEL$').astype(int).apply(lambda x: pd.to_timedelta(x, unit='s')))
If that fails because some values are non-numeric, use:
df.update(df.filter(regex='_SEL$').apply(lambda x: pd.to_timedelta(pd.to_numeric(x, errors='coerce'), unit='s')))
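To see what errors='coerce' does here, a small sketch on a single made-up Series (the values are invented for illustration): anything that cannot be parsed as a number becomes NaN, and to_timedelta then turns that into NaT.

```python
import pandas as pd

# Seconds stored as strings, with one value that is not a number
s = pd.Series(["34756", "n/a", "123232"])

# Unparseable entries become NaN instead of raising
numeric = pd.to_numeric(s, errors="coerce")

# NaN is carried through as NaT in the timedelta result
td = pd.to_timedelta(numeric, unit="s")
print(td)
```

So bad cells end up as missing values rather than stopping the conversion; whether that is acceptable depends on your data.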
You can apply pandas.to_timedelta to all columns selected by filter and update the original DataFrame:
df.update(df.filter(like='_SEL').apply(pd.to_timedelta, unit='s'))
Note: there is no output, because the DataFrame is modified in place.
Updated DataFrame:
         date          time_SEL      time_02_SEL       time_03_SEL  other
0  2022-01-01   0 days 09:39:16  2 days 16:47:12  39 days 17:27:03    756
1  2022-01-03   0 days 06:28:42  0 days 01:12:23   0 days 00:55:34    343
2  2022-02-01   1 days 10:13:52  0 days 00:54:02   0 days 06:30:23    434
3  2022-03-01  84 days 18:13:52  0 days 09:00:23   3 days 17:50:23  34324
Update for "TypeError: invalid type promotion"
Make sure you have numbers:
df.update(
    df.filter(like='_SEL')
      .apply(lambda c: pd.to_timedelta(pd.to_numeric(c, errors='coerce'), unit='s'))
)
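If you would rather not modify the frame in place, the pieces above can be wrapped in a small helper. This is only a sketch: seconds_to_timedelta is a hypothetical name, not a pandas function, and it assumes plain substring matching on the column names and that unparseable values should become NaT.

```python
import pandas as pd

def seconds_to_timedelta(df, substring):
    """Return a copy of df where every column whose name contains
    `substring` is converted from seconds to timedelta."""
    out = df.copy()
    for col in out.columns:
        if substring in str(col):
            # Coerce to numbers first so stray strings become NaT instead of raising
            out[col] = pd.to_timedelta(pd.to_numeric(out[col], errors="coerce"), unit="s")
    return out

# Made-up sample with one non-numeric cell
raw = pd.DataFrame({
    "date": ["2022-01-01", "2022-01-03"],
    "time_SEL": [34756, 23322],
    "time_02_SEL_01": ["233232", "bad"],
    "other": [756, 343],
})

converted = seconds_to_timedelta(raw, "_SEL")
print(converted.dtypes)
```

Because the helper works on a copy, raw keeps its original dtypes and you can compare the two frames afterwards.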