Divide cases by population

Question

In the table2 dataset from the tidyr package, we have:

       country  year       type      count
         <chr> <int>      <chr>      <int>
 1 Afghanistan  1999      cases        745
 2 Afghanistan  1999 population   19987071
 3 Afghanistan  2000      cases       2666
 4 Afghanistan  2000 population   20595360
 5      Brazil  1999      cases      37737
 6      Brazil  1999 population  172006362
 7      Brazil  2000      cases      80488
 8      Brazil  2000 population  174504898
 9       China  1999      cases     212258
10       China  1999 population 1272915272
11       China  2000      cases     213766
12       China  2000 population 1280428583

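How can I write code that divides the count for type cases by the count for type population and then multiplies the result by 10000? (Yes, this is a question from Hadley Wickham's R for Data Science.)

I came up with: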

sum_1 <- vector()
for (i,j in 1:nrow(table2)) {
  if (i %% 2 != 0) {
    sum_1 <- (table2[i] / table2[j]) * 10000

Answer 1

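Assuming each 'country', 'year' combination has only 2 values of 'type', group by 'country', 'year', arrange by 'type' (in case the order differs, so that 'cases' sorts before 'population'), and divide the first value of 'count' by the last value of 'count' to create 'newcol'.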

library(dplyr)
library(tidyr)  # provides the table2 dataset
table2 %>%
    group_by(country, year) %>%
    arrange(country, year, type) %>% 
    mutate(newcol = 10000*first(count)/last(count))

If we only need the summarised output (one row per 'country', 'year'), replace mutate with summarise.
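As a minimal sketch of that summarise variant: the name `rates` is introduced here only for illustration, and `.groups = "drop"` assumes dplyr 1.0.0 or later (it can be omitted on older versions).

```r
library(dplyr)
library(tidyr)  # provides the table2 dataset

# One row per country/year instead of one row per original record.
# `rates` is a hypothetical name used only in this sketch.
rates <- table2 %>%
    group_by(country, year) %>%
    arrange(country, year, type) %>%   # 'cases' sorts before 'population'
    summarise(newcol = 10000 * first(count) / last(count),
              .groups = "drop")        # assumes dplyr >= 1.0.0

rates
```

With table2 as shown in the question, this should give six rows, one for each country and year.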

If type has values other than 'cases' and 'population', then we subset 'count' with a logical index:

table2 %>% 
   group_by(country, year) %>% 
   mutate(newcol = 10000*count[type=="cases"]/count[type=="population"])

This also assumes that each 'country', 'year' combination has exactly one 'cases' row and one 'population' row.
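That assumption can be checked before computing the rate. The following is a sketch only; `violations`, `n_cases`, and `n_population` are names made up for this example.

```r
library(dplyr)
library(tidyr)  # provides the table2 dataset

# Count 'cases' and 'population' rows per country/year and keep
# any group that does not have exactly one of each.
violations <- table2 %>%
    group_by(country, year) %>%
    summarise(n_cases = sum(type == "cases"),
              n_population = sum(type == "population"),
              .groups = "drop") %>%    # assumes dplyr >= 1.0.0
    filter(n_cases != 1 | n_population != 1)

nrow(violations)  # zero rows means the assumption holds
```

If `violations` is not empty, the logical-index version above would return more than one value (or none) for the affected groups, so those rows should be inspected first.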


r

tidyr