Apply a timedelta conversion to columns containing a given string

I have the following DataFrame in Python pandas:

date time_SEL time_02_SEL_01 time_03_SEL_05 other
2022-01-01 34756 233232 3432423 756
2022-01-03 23322 4343 3334 343
2022-02-01 123232 3242 23423 434
2022-03-01 7323232 32423 323423 34324

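All columns other than date hold a duration in seconds. My idea is to convert these values to Timedelta, keeping in mind that I only want to apply the change to the columns whose names contain the string "_SEL".

I want to select them by that string because the original dataset has many more than 3 such columns. If there were only 3, I would know how to do it by hand.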

Use DataFrame.filter to get all columns ending with _SEL, convert them to timedeltas with to_timedelta, and replace the originals with DataFrame.update:

df.update(df.filter(regex='_SEL$').apply(lambda x: pd.to_timedelta(x, unit='s')))
print (df)
         date         time_SEL     time_02_SEL      time_03_SEL  other
0  2022-01-01  0 days 09:39:16 2 days 16:47:12 39 days 17:27:03    756
1  2022-01-03  0 days 06:28:42 0 days 01:12:23  0 days 00:55:34    343
2  2022-02-01  1 days 10:13:52 0 days 00:54:02  0 days 06:30:23    434
3  2022-03-01 84 days 18:13:52 0 days 09:00:23  3 days 17:50:23  34324
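
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the column names shown in the printed output above (time_SEL, time_02_SEL, time_03_SEL) and, as a variation on the one-liner, assigns the converted columns back by name instead of going through DataFrame.update, so the resulting dtype does not depend on how update treats the original integer columns.

```python
import pandas as pd

# Sample data from the question; durations are stored as integer seconds.
df = pd.DataFrame({
    "date": ["2022-01-01", "2022-01-03", "2022-02-01", "2022-03-01"],
    "time_SEL": [34756, 23322, 123232, 7323232],
    "time_02_SEL": [233232, 4343, 3242, 32423],
    "time_03_SEL": [3432423, 3334, 23423, 323423],
    "other": [756, 343, 434, 34324],
})

# Select the columns whose names end in "_SEL" and convert seconds to timedeltas.
converted = df.filter(regex="_SEL$").apply(pd.to_timedelta, unit="s")

# Assign back by column name so each selected column is replaced as a whole.
df[list(converted.columns)] = converted

print(df.dtypes)
```

Printing df.dtypes is a quick way to confirm that only the _SEL columns changed type.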

Another idea is to filter the columns with Series.str.endswith:

m = df.columns.str.endswith('_SEL')
df.loc[:, m] = df.loc[:, m].apply(lambda x: pd.to_timedelta(x, unit='s'))
print (df)
         date         time_SEL     time_02_SEL      time_03_SEL  other
0  2022-01-01  0 days 09:39:16 2 days 16:47:12 39 days 17:27:03    756
1  2022-01-03  0 days 06:28:42 0 days 01:12:23  0 days 00:55:34    343
2  2022-02-01  1 days 10:13:52 0 days 00:54:02  0 days 06:30:23    434
3  2022-03-01 84 days 18:13:52 0 days 09:00:23  3 days 17:50:23  34324
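
The same mask can also be used to collect the matching column names and convert them one by one. This is only a sketch on a shortened version of the sample data; the names mask and sel_cols are mine, not part of the original answer.

```python
import pandas as pd

# Shortened sample: two "_SEL" columns plus two columns that must stay untouched.
df = pd.DataFrame({
    "date": ["2022-01-01", "2022-01-03"],
    "time_SEL": [34756, 23322],
    "time_02_SEL": [233232, 4343],
    "other": [756, 343],
})

# Boolean mask over the column labels: True where the name ends with "_SEL".
mask = df.columns.str.endswith("_SEL")
sel_cols = df.columns[mask]

# Convert each selected column and assign it back under the same name.
for col in sel_cols:
    df[col] = pd.to_timedelta(df[col], unit="s")

print(df)
```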

Edit: To convert the column values to integers first, use .astype(int):

df.update(df.filter(regex='_SEL$').astype(int).apply(lambda x: pd.to_timedelta(x, unit='s')))

If that fails because some values are not numeric, use:

df.update(df.filter(regex='_SEL$').apply(lambda x: pd.to_timedelta(pd.to_numeric(x, errors='coerce'), unit='s')))
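
To see what errors='coerce' does here, a small sketch on a single hypothetical column (the "n/a" entry is made up for illustration): values that cannot be parsed become NaN, and to_timedelta then turns them into NaT.

```python
import pandas as pd

# Hypothetical column in which one entry is not a number.
raw = pd.Series(["34756", "n/a", "123232"], name="time_SEL")

# Unparseable entries become NaN instead of raising an error.
seconds = pd.to_numeric(raw, errors="coerce")

# NaN is carried over as NaT (missing timedelta).
durations = pd.to_timedelta(seconds, unit="s")

print(durations)
```

Keep in mind that coercing hides bad values silently, so it may be worth checking durations.isna().sum() afterwards.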

You can apply pandas.to_timedelta to all columns selected by filter and update the original DataFrame:

df.update(df.filter(like='_SEL').apply(pd.to_timedelta, unit='s'))

Note: there is no output; the modification is done in place.

Updated DataFrame:

         date         time_SEL     time_02_SEL      time_03_SEL  other
0  2022-01-01  0 days 09:39:16 2 days 16:47:12 39 days 17:27:03    756
1  2022-01-03  0 days 06:28:42 0 days 01:12:23  0 days 00:55:34    343
2  2022-02-01  1 days 10:13:52 0 days 00:54:02  0 days 06:30:23    434
3  2022-03-01 84 days 18:13:52 0 days 09:00:23  3 days 17:50:23  34324
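
One thing worth checking is which columns each selector actually picks up. The sketch below uses the column names from the question's header (time_02_SEL_01, time_03_SEL_05), which differ from the names in the printed output; with those names, regex='_SEL$' only matches names that end in _SEL, while like='_SEL' matches any name containing it.

```python
import pandas as pd

# Empty frame with the column names as they appear in the question's header.
df = pd.DataFrame(
    columns=["date", "time_SEL", "time_02_SEL_01", "time_03_SEL_05", "other"]
)

# regex="_SEL$" keeps only names that END with "_SEL".
ends_with = list(df.filter(regex="_SEL$").columns)

# like="_SEL" keeps every name that CONTAINS "_SEL".
contains = list(df.filter(like="_SEL").columns)

print(ends_with)
print(contains)
```

Since the question asks for columns that contain the string, like='_SEL' is probably the closer match if the real column names carry a suffix after _SEL.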

Update for "TypeError: invalid type promotion"

Make sure the values are numeric:

# Coerce each selected column to numbers first; unparseable values become NaT.
df.update(df.filter(like='_SEL')
            .apply(lambda c: pd.to_timedelta(pd.to_numeric(c, errors='coerce'),
                                             unit='s'))
)