Tidyr's gather() with NAs
I am using tidyr and lubridate to convert a wide table into a long table. The following works fine.
> (df <- data.frame(hh_id = 1:2,
bday_01 = ymd(20150309),
bday_02 = ymd(19850911),
bday_03 = ymd(19801231)))
hh_id bday_01 bday_02 bday_03
1 1 2015-03-09 1985-09-11 1980-12-31
2 2 2015-03-09 1985-09-11 1980-12-31
> gather(df, person_num, bday, starts_with("bday_0"))
hh_id person_num bday
1 1 bday_01 2015-03-09
2 2 bday_01 2015-03-09
3 1 bday_02 1985-09-11
4 2 bday_02 1985-09-11
5 1 bday_03 1980-12-31
6 2 bday_03 1980-12-31
However, when there is an NA in the mix, the dates are converted to strings.
> (df <- data.frame(hh_id = 1:2,
bday_01 = ymd(20150309),
bday_02 = ymd(19850911),
bday_03 = NA))
hh_id bday_01 bday_02 bday_03
1 1 2015-03-09 1985-09-11 NA
2 2 2015-03-09 1985-09-11 NA
> gather(df, person_num, bday, starts_with("bday_0"))
hh_id person_num bday
1 1 bday_01 1425859200
2 2 bday_01 1425859200
3 1 bday_02 495244800
4 2 bday_02 495244800
5 1 bday_03 NA
6 2 bday_03 NA
Warning message:
attributes are not identical across measure variables; they will be dropped
Note that the warning still appears when regular strings are mixed with NAs as well.
> (df <- data.frame(hh_id = 1:2,
bday_01 = '20150309',
bday_02 = '19850911',
bday_03 = NA))
hh_id bday_01 bday_02 bday_03
1 1 20150309 19850911 NA
2 2 20150309 19850911 NA
> gather(df, person_num, bday, starts_with("bday_0"))
hh_id person_num bday
1 1 bday_01 20150309
2 2 bday_01 20150309
3 1 bday_02 19850911
4 2 bday_02 19850911
5 1 bday_03 <NA>
6 2 bday_03 <NA>
Warning message:
attributes are not identical across measure variables; they will be dropped
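Is it possible to use tidyr with NAs while avoiding the warning and preserving the format?
The data is not being converted to a string. It is falling back to the underlying numeric representation, the number of seconds since January 1, 1970, which is how the original date-time values in df are stored: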
x <- df$bday_01
x
#[1] "2015-03-09 UTC" "2015-03-09 UTC"
attributes(x) <- NULL
x
#[1] 1425859200 1425859200
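To see that the class and time-zone attributes are all that separate the printed date from the raw number, here is a minimal base R sketch. It assumes, as the "2015-03-09 UTC" output above suggests, that the columns are POSIXct values in UTC; the names `stripped` and `restored` are only illustrative.

```r
# Assumption: the column holds POSIXct date-times in UTC, as the
# "2015-03-09 UTC" output above suggests.
x <- as.POSIXct("2015-03-09", tz = "UTC")

# The class and tzone attributes are what make the value print as a date.
names(attributes(x))

# as.numeric() drops the attributes and leaves seconds since 1970-01-01.
stripped <- as.numeric(x)

# Re-attaching the class (with an explicit origin and time zone)
# recovers the date.
restored <- as.POSIXct(stripped, origin = "1970-01-01", tz = "UTC")
format(restored, "%Y-%m-%d")
```

So nothing is lost when the attributes are dropped; the values can be converted back after the fact if needed.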
The warning message gives you a hint about how to fix it:
attributes are not identical across measure variables; they will be dropped
So, try:
attributes(df$bday_03) <- attributes(df$bday_02)
gather(df, person_num, bday, starts_with("bday_0"))
# hh_id person_num bday
#1 1 bday_01 2015-03-09
#2 2 bday_01 2015-03-09
#3 1 bday_02 1985-09-11
#4 2 bday_02 1985-09-11
#5 1 bday_03 <NA>
#6 2 bday_03 <NA>
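An alternative to copying attributes after the fact is to create the missing column with the right type from the start, so that all measure columns already share the same attributes. The sketch below is only one way to do that. It assumes the date columns are POSIXct in UTC (if your version of lubridate returns Date objects from ymd(), `as.Date(NA)` would be the analogous choice), builds the columns with base R instead of ymd(), and calls gather() only if tidyr is installed. The variable `same_attrs` is illustrative.

```r
# Assumption: the date columns are POSIXct in UTC, matching the output above.
df <- data.frame(hh_id   = 1:2,
                 bday_01 = as.POSIXct("2015-03-09", tz = "UTC"),
                 bday_02 = as.POSIXct("1985-09-11", tz = "UTC"),
                 # A typed NA: same class and time zone as the other columns.
                 bday_03 = as.POSIXct(NA, tz = "UTC"))

# Check that the NA column carries the same class and time zone.
same_attrs <- identical(class(df$bday_03), class(df$bday_02)) &&
  identical(attr(df$bday_03, "tzone"), attr(df$bday_02, "tzone"))
same_attrs

# Reshape only if tidyr is available.
if (requireNamespace("tidyr", quietly = TRUE)) {
  long <- tidyr::gather(df, person_num, bday, -hh_id)
  print(long)
}
```

Because the attributes already match, this should avoid the warning without the separate `attributes<-` step.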