Rpy2 base.as_Date conversion of character dataframe column to date column
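I have an rpy2 data frame in which the dates are mapped to a character column, because I do not want POSIXt/ct columns. I thought I could convert that character column to a Date and have it live inside r_df, but what I get back looks like a float.
Setup: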
from rpy2.robjects.packages import importr
base = importr("base")
Short example:
> base.as_Date('2020-01-01')
R object with classes: ('Date',) mapped to:
[18262.000000]
> base.as_Date('2020-01-01', format='%Y-%m-%d')
R object with classes: ('Date',) mapped to:
[18262.000000]
My actual data frame:
> r_df
R object with classes: ('data.frame',) mapped to:
[IntSexpVe..., IntSexpVe..., IntSexpVe..., FloatSexp..., ..., StrSexpVe..., StrSexpVe..., StrSexpVe..., StrSexpVe...]
....
> r_df[i]
R object with classes: ('character',) mapped to:
['2016-11-..., '2020-02-..., '2020-07-..., '2019-01-..., ..., '2020-01-..., '2017-01-..., '2020-01-..., '2020-01-...]
> base.as_Date(r_df[i], format = "%Y-%m-%d")
R object with classes: ('Date',) mapped to:
[17106.000000, 18293.000000, 18444.000000, 17897.000000, ..., 18262.000000, 17167.000000, 18262.000000, 18262.000000]
Another attempt with the same data frame:
> r_df.rx2(col_name)
R object with classes: ('character',) mapped to:
['2016-11-..., '2020-02-..., '2020-07-..., '2019-01-..., ..., '2020-01-..., '2017-01-..., '2020-01-..., '2020-01-...]
> base.as_Date(r_df.rx2(col_name), '%Y-%m-%d')
R object with classes: ('Date',) mapped to:
[17106.000000, 18293.000000, 18444.000000, 17897.000000, ..., 18262.000000, 17167.000000, 18262.000000, 18262.000000]
In a last attempt I converted from POSIXt/ct to Date, thinking it might parse more accurately:
> r_df.rx2(col_name)
R object with classes: ('POSIXct', 'POSIXt') mapped to:
[2016-11-01, 2020-02-01, ..., 2020-01-01, 2020-01-01, 2017-01-01, 2020-01-01]
> base.as_Date(r_df.rx2(col_name), '%Y-%m-%d')
R object with classes: ('Date',) mapped to:
[17106.000000, 18293.000000, 18444.000000, 17897.000000, ..., 18262.000000, 17167.000000, 18262.000000, 18262.000000]
What I get when running this in RStudio, and what I expected, is:
> as.Date('2020-01-01')
[1] "2020-01-01"
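This does not look right to me. I use the rpy2 converter for the Python pandas df to R data frame conversion, and I am not running any code outside the default converter. Any idea how to fix this and convert the strings correctly?
Versions: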
pandas==1.0.1
rpy2~=3.3.5
R == 4.0.0
In R, Date objects are (arrays of) floats, counting days since 1970-01-01, that carry a label telling R they are dates.
>>> dt = base.as_Date('2020-01-01')
>>> dt
R object with classes: ('Date',) mapped to:
[18262.000000]
However, when R's own printing is used:
>>> print(dt)
[1] "2020-01-01"
At the level of R's C API, on the other hand, this is a float:
>>> dt.typeof
<RTYPES.REALSXP: 14>
An R class attribute tells R that this is a date:
>>> tuple(dt.rclass)
('Date',)
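To check that the floats in the question really are the expected dates, the day counts can be decoded by hand. The following is a minimal sketch in plain Python (no rpy2 or R needed); the helper name `r_date_to_python` is made up for illustration, and it assumes whole, non-negative day counts such as the ones shown above.

```python
from datetime import date, timedelta

# R's Date class stores the number of days since this origin.
R_DATE_ORIGIN = date(1970, 1, 1)


def r_date_to_python(days):
    """Decode the float held in an R Date vector into a datetime.date.

    Hypothetical helper for illustration; assumes a whole,
    non-negative number of days.
    """
    return R_DATE_ORIGIN + timedelta(days=int(days))


# Values copied from the rpy2 output in the question.
raw = [17106.0, 18293.0, 18444.0, 17897.0, 18262.0, 17167.0]
decoded = [r_date_to_python(d).isoformat() for d in raw]
print(decoded)
```

The decoded strings match the character values in the original column (for example 18262.0 is 2020-01-01), so the conversion with `base.as_Date` worked; only the Python-side representation shows the underlying float.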