Assigning values in a new column: efficiency, please
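I am looking for a more efficient way to do a very basic task: adding a new column that holds specified values for existing rows. The example data frame (called ess) has countries and (survey) rounds. I want to add a column "dem" with values taken from an external source. Here is a snippet: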
id cntry essround dem
1 AL 1
2 AT 1
3 BE 1
4 BG 1
5 HR 1
6 AL 2
7 AT 2
8 BE 2
9 BG 2
10 HR 2
The "long" way to do this is as follows:
ess$dem <- NA
ess$dem[ess$cntry=="AL" & ess$essround==1] <- 3.5
ess$dem[ess$cntry=="AT" & ess$essround==1] <- 1
ess$dem[ess$cntry=="BE" & ess$essround==1] <- 1.5
ess$dem[ess$cntry=="BG" & ess$essround==1] <- 2
ess$dem[ess$cntry=="HR" & ess$essround==1] <- 2
ess$dem[ess$cntry=="AL" & ess$essround==2] <- 3
ess$dem[ess$cntry=="AT" & ess$essround==2] <- 1
ess$dem[ess$cntry=="BE" & ess$essround==2] <- 1
ess$dem[ess$cntry=="BG" & ess$essround==2] <- 1.5
ess$dem[ess$cntry=="HR" & ess$essround==2] <- 2
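The problem is that this approach gets very long once I have 36 countries and 6 rounds: I end up with 216 lines of code. (It gets even worse when I want to create several new columns on the same pattern...)
Is there a way to condense an operation like this? Can it be done on a single line, where the code relies on the "position" in a corresponding list of values?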
Creating dummy data:
ess = data.frame(
cntry = sample(c("AL","AT","BE","BG","HR","AL","AT","BE","BG","HR"), 20, TRUE),
essround = sample(1:2, 20, TRUE))
Now the code:
ess$dem <- NA
values = c(3.5,1,1.5,2,2,3,1,1,1.5,2)
# countries in order of first appearance in the data
groups = unique(ess$cntry)
# note: this matches on country only, so both rounds of a country get the same value
for(i in seq_along(groups)){
ess[ess$cntry==groups[i],"dem"] <- values[i]
}
Output:
cntry essround dem
1 BE 1 3.5
2 HR 2 1.0
3 AT 2 1.5
4 BG 1 2.0
5 AT 1 1.5
6 AT 2 1.5
7 AT 2 1.5
8 AT 2 1.5
9 BG 2 2.0
10 BE 2 3.5
11 AT 1 1.5
12 AT 2 1.5
13 BE 1 3.5
14 AT 2 1.5
15 HR 1 1.0
16 BG 1 2.0
17 BE 1 3.5
18 BG 1 2.0
19 AT 2 1.5
20 AT 2 1.5
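The loop above assigns by country in order of first appearance, so it does not tell the rounds apart. Here is a minimal base R sketch that keys on both country and round, assuming the values are listed in the same order as in the question (all countries for round 1, then all countries for round 2). The names keys and lookup are just illustrative.

```r
# Values in the order used in the question: round 1 for AL..HR, then round 2.
countries <- c("AL", "AT", "BE", "BG", "HR")
rounds <- 1:2
values <- c(3.5, 1, 1.5, 2, 2, 3, 1, 1, 1.5, 2)

# One key per country-round pair; the rep() calls mirror the order of `values`.
keys <- paste(rep(countries, times = length(rounds)),
              rep(rounds, each = length(countries)), sep = "_")
lookup <- setNames(values, keys)

# Example data with the same shape as the snippet in the question.
ess <- data.frame(
  cntry = rep(countries, times = 2),
  essround = rep(rounds, each = 5),
  stringsAsFactors = FALSE
)

# A single vectorised assignment; a pair with no entry in `lookup` yields NA.
ess$dem <- unname(lookup[paste(ess$cntry, ess$essround, sep = "_")])
```

With 36 countries and 6 rounds only the three input vectors grow; the assignment itself stays one line.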
With the tidyverse, you first need to create a data.frame that holds the values: i.e. ess$dem[ess$cntry=="AL" & ess$essround==1] <- 3.5 should become one row of the conds data.frame:
library(dplyr)
## expand.grid creates all possible combinations of cntry and essround
## (stringsAsFactors=FALSE keeps cntry as character so it joins cleanly)
conds <- expand.grid(cntry=c("AL","AT","BE","BG","HR"), essround=1:2, stringsAsFactors=FALSE) %>%
  mutate(dem = c(3.5,1,1.5,2,2,3,1,1,1.5,2))
## first row will be "AL" 1 3.5 which is the first condition
conds
cntry essround dem
1 AL 1 3.5
2 AT 1 1.0
3 BE 1 1.5
4 BG 1 2.0
5 HR 1 2.0
6 AL 2 3.0
7 AT 2 1.0
8 BE 2 1.0
9 BG 2 1.5
10 HR 2 2.0
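The order of the dem values matters here, because expand.grid varies its first argument fastest. A small sketch with only two countries (a cut-down illustration, not the full data) shows the row order the values have to follow:

```r
# expand.grid cycles through the first argument before moving to the next
# value of the second, so rows come out round by round.
grid <- expand.grid(cntry = c("AL", "AT"), essround = 1:2,
                    stringsAsFactors = FALSE)

# Row order: AL 1, AT 1, AL 2, AT 2
print(grid)
```

So the vector passed to mutate() should list every country for round 1 first, then every country for round 2, and so on.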
ess %>% left_join(conds)
Joining, by = c("cntry", "essround")
cntry essround dem
1 AT 1 1.0
2 AT 2 1.0
3 HR 2 2.0
4 BG 2 1.5
5 HR 2 2.0
6 HR 1 2.0
7 BG 2 1.5
8 BG 1 2.0
9 HR 2 2.0
10 BG 1 2.0
11 AT 1 1.0
12 BG 2 1.5
13 AL 1 3.5
14 HR 1 2.0
15 BE 2 1.0
16 AL 2 3.0
17 AL 1 3.5
18 AL 1 3.5
19 AT 1 1.0
20 AT 1 1.0
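One thing to watch with the join: a country-round pair that is missing from conds gets NA without any warning. A hedged sketch of how that could be checked, assuming dplyr is installed; the three ess rows and the code "XX" are made up for illustration:

```r
library(dplyr)

conds <- expand.grid(cntry = c("AL", "AT", "BE", "BG", "HR"),
                     essround = 1:2,
                     stringsAsFactors = FALSE) %>%
  mutate(dem = c(3.5, 1, 1.5, 2, 2, 3, 1, 1, 1.5, 2))

# Hypothetical survey rows; "XX" is deliberately absent from conds.
ess <- data.frame(cntry = c("AL", "HR", "XX"),
                  essround = c(2L, 1L, 1L),
                  stringsAsFactors = FALSE)

# Naming the keys explicitly avoids the "Joining, by = ..." message.
joined <- left_join(ess, conds, by = c("cntry", "essround"))

# Rows of ess that found no match in conds (their dem is NA in `joined`).
unmatched <- anti_join(ess, conds, by = c("cntry", "essround"))
```

If unmatched has zero rows, every row in ess received a value.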