Convert class 'pandas.tslib.Timedelta' to string when exporting to Excel
Initial DataFrame:
arrivalTime
0 2016-01-12 06:35:42
2 2016-01-12 06:54:02
3 2016-01-12 07:01:43
4 2016-01-12 07:02:28
5 2016-01-12 07:12:29
6 2016-01-12 07:18:41
I applied this function to the data:
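import pandas as pd

def function(df):
    # Parse the strings into datetimes so consecutive rows can be subtracted.
    df['arrivalTime_cal'] = pd.to_datetime(df['arrivalTime'], format='%Y-%m-%d %H:%M:%S')
    # The first row has no predecessor; a zero Timedelta keeps the timedelta dtype.
    df['diff_time'] = df['arrivalTime_cal'].diff().fillna(pd.Timedelta(0))
    del df['arrivalTime_cal']
    return df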
I get these results (displayed correctly in IPython):
diff_time
0 00:00:00
1 00:04:37
2 00:13:43
3 00:07:41
4 00:00:45
When I export to Excel, the format of the results changes:
arrivalTime diff_time
0 2016-01-12 06:35:42 0
1 2016-01-12 06:40:19 0,003206019
2 2016-01-12 06:54:02 0,009525463
3 2016-01-12 07:01:43 0,005335648
4 2016-01-12 07:02:28 0,000520833
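For context, the exported values look like fractions of a 24-hour day, which is how Excel stores durations; a cell without a time format then shows the raw number (here with a decimal comma). The sketch below reproduces the numbers in the table by dividing each duration by one day. The durations are copied from the diff_time output above, and the variable names are illustrative only.

```python
import pandas as pd

# Durations copied from the diff_time column shown in IPython.
diffs = pd.to_timedelta(["00:00:00", "00:04:37", "00:13:43", "00:07:41", "00:00:45"])

# Dividing a Timedelta by one day gives the fraction of a day as a float.
day_fractions = [td / pd.Timedelta(days=1) for td in diffs]

for td, fraction in zip(diffs, day_fractions):
    print(td, round(fraction, 9))
```

For example, 00:04:37 is 277 seconds, and 277 / 86400 is roughly 0.003206019, which matches the second row of the exported sheet.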
How can I keep the string format in Excel?
Thanks in advance.
IIUC, you can convert the dtype to str and then split the string:
In [53]:
df['diff_time'].astype(str).str.split().str[-1].str.rsplit('.').str[0]
Out[53]:
index
0 00:00:00
2 00:18:20
3 00:07:41
4 00:00:45
5 00:10:01
6 00:06:12
dtype: object
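As a self-contained sketch of the same idea, the snippet below rebuilds the sample data from the question, computes the differences, and stores the string version in a new column. The column name diff_time_str and the output file name are assumptions made for illustration, not part of the original post.

```python
import pandas as pd

# Sample data copied from the question (note the non-contiguous index).
df = pd.DataFrame(
    {
        "arrivalTime": [
            "2016-01-12 06:35:42",
            "2016-01-12 06:54:02",
            "2016-01-12 07:01:43",
            "2016-01-12 07:02:28",
            "2016-01-12 07:12:29",
            "2016-01-12 07:18:41",
        ]
    },
    index=[0, 2, 3, 4, 5, 6],
)

arrival = pd.to_datetime(df["arrivalTime"], format="%Y-%m-%d %H:%M:%S")
df["diff_time"] = arrival.diff().fillna(pd.Timedelta(0))

# Text form of each Timedelta -> last whitespace-separated token (the clock
# part) -> everything before the decimal point, if there is one.
df["diff_time_str"] = (
    df["diff_time"].astype(str).str.split().str[-1].str.rsplit(".").str[0]
)

# Hypothetical export step; needs an Excel writer such as openpyxl installed.
# df.drop(columns="diff_time").to_excel("output.xlsx")
```

Because only the last token is kept, any "N days" part is discarded, so this sketch assumes every difference is shorter than one day.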
Breaking the above down into steps, first use astype to convert to str:
In [54]:
df['diff_time'].astype(str)
Out[54]:
index
0 0 days 00:00:00.000000000
2 0 days 00:18:20.000000000
3 0 days 00:07:41.000000000
4 0 days 00:00:45.000000000
5 0 days 00:10:01.000000000
6 0 days 00:06:12.000000000
Name: diff_time, dtype: object
Now split (the default separator is whitespace) and take only the last element, which is the time component:
In [55]:
df['diff_time'].astype(str).str.split().str[-1]
Out[55]:
index
0 00:00:00.000000000
2 00:18:20.000000000
3 00:07:41.000000000
4 00:00:45.000000000
5 00:10:01.000000000
6 00:06:12.000000000
dtype: object
Now rsplit on '.' to separate off the fractional seconds:
In [56]:
df['diff_time'].astype(str).str.split().str[-1].str.rsplit('.')
Out[56]:
index
0 [00:00:00, 000000000]
2 [00:18:20, 000000000]
3 [00:07:41, 000000000]
4 [00:00:45, 000000000]
5 [00:10:01, 000000000]
6 [00:06:12, 000000000]
dtype: object
You can see that the converted values are indeed str:
In [57]:
df['diff_time'].astype(str).str.split().str[-1].str.rsplit('.').str[0][0]
Out[57]:
'00:00:00'
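If some differences could exceed 24 hours, the string split above would silently drop the days. One possible alternative, which is a different technique from the one in this answer, is to build the string from the total number of seconds. The helper name format_timedelta is hypothetical, and the sketch assumes non-negative durations.

```python
import pandas as pd

def format_timedelta(td):
    """Format a non-negative Timedelta as HH:MM:SS, folding days into hours."""
    total_seconds = int(td.total_seconds())  # fractional seconds are truncated
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

# Illustrative values: one short duration and one longer than a day.
diffs = pd.Series(pd.to_timedelta(["00:04:37", "1 days 02:03:04"]))
formatted = diffs.apply(format_timedelta)
print(formatted.tolist())
```

With this approach a duration of 1 day, 2 hours comes out as 26 hours rather than losing the day.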