How to index multiple objects / variables within a for loop
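My data consists of three columns: household ID, product ID (H14aq2), and value.
I have about 7,000 rows (household IDs), which can be grouped into 12 districts and 160 products. A household ID can appear multiple times because households use multiple products. My goal is to sum the household values for each product so that I get the total product value for each district. I know how to do this by hand, but I want to use a loop because I will be doing this for multiple datasets.
Here is my current code. It runs without error and shows 156 iterations, but when I look at the total_values_05 object, only one additional vector, val_i, has been attached.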
for(i in 105:161){
  total_val_i <- cons_05 %>%
    filter(H14aq2 == i) %>%
    group_by(Districtn05) %>%
    summarise(val_i = sum(total_val_yr)) %>%
    ungroup()
  total_values_05 <- total_values_05 %>%
    left_join(total_val_i)
  rm(total_val_i)
}
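There are 161 products (indexed by the variable H14aq2, from 101 to 161). Before this loop, I created the object total_values_05, in which I handle products 101 to 104 for other reasons.
In each iteration, I want to filter on a single product, sum the total_val_yr variable that contains the values, and then attach the new vector val_i to the existing object total_values_05. Ultimately I want an object structured as follows: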
District | val_101 | val_102 | val_103 |
---|---|---|---|
First | row | row | row |
Second | row | row | row |
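(up to val_161 and the 12th district)
It seems to me that I am missing one small thing that would make this work, since the code runs and a variable called val_i does get attached. I think the problem is with using i to index several things.
This is my first attempt at a loop! Any help is much appreciated :)
Here is sample data (containing only the 4 variables needed for my question):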
structure(list(Hhid = structure(c("1033000301", "1033000301",
"1033000301", "1033000301", "1033000301", "1033000301"), label = "Unique hh identifier across panel waves", format.stata = "%-10s"),
Districtn05 = structure(c("Kiboga", "Kiboga", "Kiboga", "Kiboga",
"Kiboga", "Kiboga"), label = "District name as in 2005/06", format.stata = "%-13s"),
H14aq2 = structure(c(150, 135, 140, 136, 112, 103), label = "Consumption item code", format.stata = "%16.0g", labels = c(Matooke = 101,
Matooke = 102, Matooke = 103, Matooke = 104, `Sweet potatoes fresh` = 105,
`Sweet potatoes dry` = 106, `Cassava fresh` = 107, `Cassava dry/flour` = 108,
`Irish potatoes` = 109, Rice = 110, `Maize grains` = 111,
`Maize cobs` = 112, `Maize flour` = 113, Bread = 114, Millet = 115,
Sorghum = 116, Beef = 117, Pork = 118, `Goat meat` = 119,
`Other meat` = 120, Chicken = 121, `Fresh fish` = 122, `Dry/smoked fish` = 123,
Eggs = 124, `Fresh milk` = 125, `Infant formula foods` = 126,
`Cooking oil` = 127, Ghee = 128, `Margarine,butter` = 129,
`Passion fruits` = 130, `Sweet bananas` = 131, Mangoes = 132,
Oranges = 133, `Other fruits` = 134, Onions = 135, Tomatoes = 136,
Cabbages = 137, Dodo = 138, `Other vegetables` = 139, `Beans fresh` = 140,
`Beans dry` = 141, `Ground nuts in shell` = 142, `Ground nuts shelled` = 143,
`Ground nuts pounded` = 144, Peas = 145, Simsim = 146, Sugar = 147,
Coffee = 148, Tea = 149, Salt = 150, Soda = 151, Beer = 152,
`Other alcoholic drinks` = 153, `Other drinks` = 154, Cigarettes = 155,
`Other tobbaco` = 156, `Expenditure in restaurants on food` = 157,
`Expenditure in restaurants on soda` = 158, `Expenditure in restaurants on beer` = 159,
`Other juice` = 160, `Other foods` = 161), class = c("haven_labelled",
"vctrs_vctr", "double")), total_val_yr = c(3250, 10400, 156000,
10400, 260000, 312000)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
You can group by multiple columns and then pivot the summarised result to wide format, like this:
library(tidyverse)
data <- structure(list(
Hhid = structure(c(
"1033000301", "1033000301",
"1033000301", "1033000301", "1033000301", "1033000301"
), label = "Unique hh identifier across panel waves", format.stata = "%-10s"),
Districtn05 = structure(c(
"Kiboga", "Kiboga", "Kiboga", "Kiboga",
"Kiboga", "Kiboga"
), label = "District name as in 2005/06", format.stata = "%-13s"),
H14aq2 = structure(c(150, 135, 140, 136, 112, 103), label = "Consumption item code", format.stata = "%16.0g", labels = c(
Matooke = 101,
Matooke = 102, Matooke = 103, Matooke = 104, `Sweet potatoes fresh` = 105,
`Sweet potatoes dry` = 106, `Cassava fresh` = 107, `Cassava dry/flour` = 108,
`Irish potatoes` = 109, Rice = 110, `Maize grains` = 111,
`Maize cobs` = 112, `Maize flour` = 113, Bread = 114, Millet = 115,
Sorghum = 116, Beef = 117, Pork = 118, `Goat meat` = 119,
`Other meat` = 120, Chicken = 121, `Fresh fish` = 122, `Dry/smoked fish` = 123,
Eggs = 124, `Fresh milk` = 125, `Infant formula foods` = 126,
`Cooking oil` = 127, Ghee = 128, `Margarine,butter` = 129,
`Passion fruits` = 130, `Sweet bananas` = 131, Mangoes = 132,
Oranges = 133, `Other fruits` = 134, Onions = 135, Tomatoes = 136,
Cabbages = 137, Dodo = 138, `Other vegetables` = 139, `Beans fresh` = 140,
`Beans dry` = 141, `Ground nuts in shell` = 142, `Ground nuts shelled` = 143,
`Ground nuts pounded` = 144, Peas = 145, Simsim = 146, Sugar = 147,
Coffee = 148, Tea = 149, Salt = 150, Soda = 151, Beer = 152,
`Other alcoholic drinks` = 153, `Other drinks` = 154, Cigarettes = 155,
`Other tobbaco` = 156, `Expenditure in restaurants on food` = 157,
`Expenditure in restaurants on soda` = 158, `Expenditure in restaurants on beer` = 159,
`Other juice` = 160, `Other foods` = 161
), class = c(
"haven_labelled",
"vctrs_vctr", "double"
)), total_val_yr = c(
3250, 10400, 156000,
10400, 260000, 312000
)
), row.names = c(NA, -6L), class = c(
"tbl_df",
"tbl", "data.frame"
))
data %>%
group_by(Districtn05, H14aq2) %>%
summarise(total_val_yr = sum(total_val_yr)) %>%
select(total_val_yr, H14aq2) %>%
pivot_wider(names_from = H14aq2, values_from = total_val_yr, names_prefix = "val_")
#> `summarise()` has grouped output by 'Districtn05'. You can override using the
#> `.groups` argument.
#> Adding missing grouping variables: `Districtn05`
#> # A tibble: 1 × 7
#> # Groups: Districtn05 [1]
#> Districtn05 val_103 val_112 val_135 val_136 val_140 val_150
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Kiboga 312000 260000 10400 10400 156000 3250
Created on 2022-05-25 by the reprex package (v2.0.0)
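As for why the original loop attaches only one column: inside `summarise()`, `val_i` is a literal column name, so every iteration produces a column called `val_i`. After the first iteration, `left_join()` without `by` most likely joins on both `Districtn05` and `val_i`, so no new column is ever added. If a loop is still preferred, the sketch below shows one way to build the name from `i` using dplyr's glue-style names with `:=`, and checks that it gives the same table as the pivot approach. The toy `cons_05` data and the way `total_values_05` is initialised here are assumptions for illustration, not the real survey data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for cons_05 (same column names as the question).
cons_05 <- tibble(
  Districtn05  = c("Kiboga", "Kiboga", "Kiboga", "Masaka", "Masaka"),
  H14aq2       = c(105, 105, 106, 105, 107),
  total_val_yr = c(100, 50, 20, 30, 40)
)

# Assumed starting point: one row per district.
total_values_05 <- distinct(cons_05, Districtn05)

for (i in 105:107) {
  total_val_i <- cons_05 %>%
    filter(H14aq2 == i) %>%
    group_by(Districtn05) %>%
    # "val_{i}" := ... builds the column name from the current value of i;
    # a bare val_i would be the literal name "val_i" on every iteration.
    summarise("val_{i}" := sum(total_val_yr), .groups = "drop")

  # Join on the district only, so earlier val_* columns are never used as keys.
  total_values_05 <- total_values_05 %>%
    left_join(total_val_i, by = "Districtn05")
}

# The same table without a loop; .groups = "drop" avoids the grouping messages.
wide <- cons_05 %>%
  group_by(Districtn05, H14aq2) %>%
  summarise(total_val_yr = sum(total_val_yr), .groups = "drop") %>%
  pivot_wider(
    names_from = H14aq2,
    values_from = total_val_yr,
    names_prefix = "val_"
  )

print(total_values_05)
print(wide)
```

District and product combinations that never occur come out as `NA` in both versions; if zeros are preferred, `pivot_wider()` accepts `values_fill = 0`.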