How do I create a new categorical variable from multiple continuous observations?
Here is my data:
ID dist
1 23
1 10
2 12
2 20
3 14
3 33
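I want to go through each ID and create a new column ("state") that labels the larger value within each ID as "high" and the smaller value as "low".
What is the best way to do this?
We can build the condition from the group-wise max/min: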
library(dplyr)
df1 %>%
  group_by(ID) %>%
  mutate(state = case_when(dist == max(dist) ~ "high",
                           dist == min(dist) ~ "low",
                           TRUE ~ NA_character_))
Since each 'ID' has only two values, the second condition is not needed:
df1 %>%
  group_by(ID) %>%
  mutate(state = case_when(dist == max(dist) ~ "high",
                           TRUE ~ "low"))
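The title asks for a categorical variable, while the snippets above produce a character column. Here is a minimal sketch, assuming an ordered-by-level factor with levels "low" then "high" is what is wanted. The `res` name and the level order are my own assumptions, not something stated in the question.

```r
library(dplyr)

# Same data as in the question
df1 <- data.frame(ID   = c(1L, 1L, 2L, 2L, 3L, 3L),
                  dist = c(23L, 10L, 12L, 20L, 14L, 33L))

res <- df1 %>%
  group_by(ID) %>%
  # label the group-wise maximum "high", everything else "low"
  mutate(state = if_else(dist == max(dist), "high", "low")) %>%
  ungroup() %>%
  # convert to a factor so the column is categorical (assumed level order)
  mutate(state = factor(state, levels = c("low", "high")))

levels(res$state)
```

Converting after `ungroup()` keeps the factor levels identical across groups, even if some ID happened to contain only one of the two labels.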
Data
df1 <- structure(list(ID = c(1L, 1L, 2L, 2L, 3L, 3L), dist = c(23L,
10L, 12L, 20L, 14L, 33L)), class = "data.frame", row.names = c(NA,
-6L))
Using base R:
> transform(df1, state = ave(dist, ID, FUN= function(x)ifelse(x==max(x), "high", "low")))
ID dist state
1 1 23 high
2 1 10 low
3 2 12 low
4 2 20 high
5 3 14 low
6 3 33 high
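With data.table...
library(data.table)
DF <- as.data.table(df1)
# Sort by ID, then dist, so each ID's rows alternate smaller, larger.
# This relies on every ID having exactly two rows; rep() makes the
# recycling explicit, which newer data.table versions require.
DF[order(ID, dist), v := rep(c("lo", "hi"), length.out = .N)]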
ID dist v
1: 1 23 hi
2: 1 10 lo
3: 2 12 lo
4: 2 20 hi
5: 3 14 lo
6: 3 33 hi
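If an ID can have more or fewer than two rows, the sort-and-alternate trick above no longer applies. Here is a sketch of a grouped alternative, assuming the column should be called "state" as in the question. `DT` is just a hypothetical name, and `fifelse()` needs a reasonably recent data.table.

```r
library(data.table)

# Same data as in the question
DT <- data.table(ID   = c(1L, 1L, 2L, 2L, 3L, 3L),
                 dist = c(23L, 10L, 12L, 20L, 14L, 33L))

# Within each ID, mark the row(s) equal to the group maximum as "high"
# and every other row as "low"; tied maxima are all labelled "high"
DT[, state := fifelse(dist == max(dist), "high", "low"), by = ID]

print(DT)
```

Because the comparison is done per group with `by = ID`, the original row order is left untouched.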