Spreading a character column by time and id in R
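This question resembles many others, but there is a difference.
I often see people asking how to spread one column into several, but usually in a df where each name in the column has a measured value.
Like this: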
head(df)
  id time   fish weight
1  1    1 marlin      4
2  1    1    cod      1
3  1    2    cod      1
4  2    1 salmon      2
5  2    1    cod      2
6  2    2    cod      3
So I can use spread (or dcast, or similar) like this:
spread(df, fish, weight)
  id time cod marlin salmon
1  1    1   1      4     NA
2  1    2   1     NA     NA
3  2    1   2     NA      2
4  2    2   3     NA     NA
But what if you have no value variable (here, weight) and just want to spread the kinds of fish?
So that the output looks like this:
id time  Fish1 Fish2
 1    1 marlin   cod
 1    2    cod  <NA>
 2    1 salmon   cod
 2    2    cod  <NA>
How would you do that?
Thanks for any help. Much appreciated.
We need to create a sequence column by group (here id and time); its values become the new column names:
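library(dplyr)
library(tidyr)

df %>%
  select(-weight) %>%                               # drop the measure column
  group_by(id, time) %>%                            # one group per output row
  mutate(ind = paste0("Fish", row_number())) %>%    # Fish1, Fish2, ... within each group
  spread(ind, fish)                                 # one column per index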
# A tibble: 4 x 4
# Groups: id, time [4]
# id time Fish1 Fish2
# <int> <int> <chr> <chr>
#1 1 1 marlin cod
#2 1 2 cod NA
#3 2 1 salmon cod
#4 2 2 cod NA
Data
df <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L), time = c(1L, 1L,
2L, 1L, 1L, 2L), fish = c("marlin", "cod", "cod", "salmon", "cod",
"cod"), weight = c(4L, 1L, 1L, 2L, 2L, 3L)), .Names = c("id",
"time", "fish", "weight"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
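
As a side note, newer tidyr releases (1.0.0 and later) provide pivot_wider(), which the tidyr documentation recommends over spread(). The following is a minimal sketch of the same idea using that function, assuming a recent tidyr and dplyr are installed. The helper column name ind and the result name wide are illustrative choices, not part of the question.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  id     = c(1L, 1L, 1L, 2L, 2L, 2L),
  time   = c(1L, 1L, 2L, 1L, 1L, 2L),
  fish   = c("marlin", "cod", "cod", "salmon", "cod", "cod"),
  weight = c(4L, 1L, 1L, 2L, 2L, 3L),
  stringsAsFactors = FALSE
)

wide <- df %>%
  select(-weight) %>%                               # the measure is not needed
  group_by(id, time) %>%
  mutate(ind = paste0("Fish", row_number())) %>%    # per-group sequence: Fish1, Fish2, ...
  ungroup() %>%
  pivot_wider(names_from = ind, values_from = fish) # one column per sequence label

print(wide)
```

Groups with fewer fish than the largest group get NA in the remaining columns, as in the spread() version.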