Duplicating rows n times, where n is a value of a string

Question

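I have a dataset listing states and their respective cities, in which some places have been aggregated (not by me) and labelled "Other ([count of places])" (e.g. Other (99)). Each entry in this list of locations has a numeric 'count' value attached. I would like to 1.) find the average count per location and 2.) duplicate these 'Other...' rows according to the number inside the parentheses, each carrying the average value. Example below: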

set.seed(5)
df <- data.frame(state = c('A','B'), city = c('Other (3)','Other (2)'), count = c('250','50'))

Desired output:

state	city	count
A	Other (3)	83.333
A	Other (3)	83.333
A	Other (3)	83.333
B	Other (2)	25.000
B	Other (2)	25.000

So far, I have only figured out how to extract the number from the parentheses and compute the average:

average = as.numeric(df$count)/as.numeric(gsub(".*\\((.*)\\).*", "\\1", df$city))

Answer 1

You can extend your example with the following code:

set.seed(5)
df <- data.frame(state = c('A','B'), city = c('Other (3)','Other (2)'), count = c('250','50'))
times <- as.numeric(gsub(".*\\((.*)\\).*", "\\1", df$city))
df$count <- as.numeric(df$count)/times
output <- df[rep(seq_along(times),times),]

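The key addition is the line that creates output: it indexes the input data frame with row indices that have been repeated as many times as each row is needed.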
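To make the indexing step concrete, here is a small sketch of what rep(seq_along(times), times) produces for the example data. The counts are hard-coded here rather than parsed, and the row-name reset at the end is an optional extra, not part of the answer above.

```r
# Repetition counts taken from the example labels "Other (3)" and "Other (2)"
times <- c(3, 2)

# seq_along(times) is 1, 2; rep() repeats each index by its matching count
idx <- rep(seq_along(times), times)
print(idx)  # 1 1 1 2 2

df <- data.frame(state = c("A", "B"),
                 city = c("Other (3)", "Other (2)"),
                 count = c(250, 50) / times,
                 stringsAsFactors = FALSE)

# Indexing rows with the repeated indices duplicates each row
output <- df[idx, ]

# Optional: replace row names such as 1, 1.1, 1.2 with a plain 1..n sequence
rownames(output) <- NULL
print(output)
```

Because the same index vector selects whole rows, the state, city, and averaged count all travel together.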

Answer 2

An option with uncount. Use parse_number to extract the numeric part of 'city', divide 'count' by 'n', and use uncount to duplicate the rows.

library(dplyr)
library(tidyr)
df %>%
    mutate(n = readr::parse_number(city), count = as.numeric(count)/n) %>%
    uncount(n)

Output:

state      city    count
1     A Other (3) 83.33333
2     A Other (3) 83.33333
3     A Other (3) 83.33333
4     B Other (2) 25.00000
5     B Other (2) 25.00000
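The question mentions that only some places are aggregated, so a real dataset would presumably also contain ordinary city rows. The following is a rough base R sketch of one way to handle that, assuming such rows should be kept once with their count unchanged. The extra "Springfield" row and the names df2, is_other, and output2 are hypothetical, made up for illustration.

```r
# Example data with one hypothetical non-aggregated city added
df2 <- data.frame(state = c("A", "A", "B"),
                  city = c("Other (3)", "Springfield", "Other (2)"),
                  count = c("250", "40", "50"),
                  stringsAsFactors = FALSE)

# TRUE only for labels of the form "Other (<digits>)"
is_other <- grepl("^Other \\([0-9]+\\)$", df2$city)

# Default to one repetition; use the parsed number for aggregated rows
times <- rep(1, nrow(df2))
times[is_other] <- as.numeric(
  sub("^Other \\(([0-9]+)\\)$", "\\1", df2$city[is_other])
)

# Average the count, then repeat each row by its repetition count
df2$count <- as.numeric(df2$count) / times
output2 <- df2[rep(seq_len(nrow(df2)), times), ]
rownames(output2) <- NULL
print(output2)
```

Restricting the pattern to the exact "Other (n)" shape avoids NA repetition counts for city names that contain no number, which would otherwise make rep() fail.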

r

repeat