User-defined function with dplyr - mutate columns based on combining arguments
I am developing a Shiny app using the following sample data:
library(tidyr)
library(dplyr)
df <- data.frame(Year = rep(2014:2017, each = 10),
                 ID = rep(1:10, times = 4),
                 Score1 = runif(40),
                 Score2 = runif(40),
                 Score3 = runif(40)) %>%
  gather(Score1, Score2, Score3, key = "Measure", value = "Value") %>%
  unite(Measure, Year, col = "Measure", sep = "_") %>%
  spread(Measure, Value)
which gives:
> glimpse(df)
Observations: 10
Variables: 13
$ ID <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
$ Score1_2014 <dbl> 0.03936843, 0.62027828, 0.56994489, 0.94410280, 0.98747476, 0.78021699, 0.5...
$ Score1_2015 <dbl> 0.456492381, 0.881373411, 0.601315132, 0.003073382, 0.436619197, 0.49193024...
$ Score1_2016 <dbl> 0.4937857, 0.4414206, 0.6716621, 0.2483740, 0.2376593, 0.4231311, 0.5250772...
$ Score1_2017 <dbl> 0.6824536, 0.1020127, 0.9973474, 0.4304465, 0.9194684, 0.8938086, 0.9133654...
$ Score2_2014 <dbl> 0.01550399, 0.03318784, 0.31463461, 0.99324685, 0.19417234, 0.10408623, 0.9...
$ Score2_2015 <dbl> 0.7631779, 0.4471922, 0.9119910, 0.5792838, 0.8458717, 0.9716529, 0.9580503...
$ Score2_2016 <dbl> 0.78565372, 0.20382477, 0.04103231, 0.33246223, 0.65301709, 0.03227641, 0.3...
$ Score2_2017 <dbl> 0.320235691, 0.211477745, 0.575208127, 0.290498894, 0.696220903, 0.94622610...
$ Score3_2014 <dbl> 0.93234031, 0.40570043, 0.07134056, 0.83916278, 0.57897129, 0.59457072, 0.3...
...
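I want to create a user-defined function that lets me choose a score type (e.g. Score1, Score2 or Score3), a start year (year_from) and an end year (year_to), and that computes the difference between the two years. For example, choosing Score1, 2015 and 2016 would give: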
ID Score1_2016 Score1_2015 Diff
1 1 0.4937857 0.456492381 0.03729332
2 2 0.4414206 0.881373411 -0.43995279
3 3 0.6716621 0.601315132 0.07034700
4 4 0.2483740 0.003073382 0.24530064
5 5 0.2376593 0.436619197 -0.19895987
6 6 0.4231311 0.491930246 -0.06879918
7 7 0.5250772 0.596241541 -0.07116431
8 8 0.1416265 0.019224651 0.12240182
9 9 0.7573208 0.073456457 0.68386434
10 10 0.3575724 0.566328136 -0.20875574
I have read the "Programming with dplyr" documentation, but I am not very confident with quosures. The following attempt fails:
selectr <- function(data, value, year_from, year_to){
  recent <- max(year_from, year_to) # later of the two years
  older <- min(year_from, year_to) # earlier of the two years
  recent.name <- paste(value, recent, sep = "_") # Create column names from original df
  older.name <- paste(value, older, sep = "_")
  recent.name <- enquo(recent.name)
  older.name <- enquo(older.name)
  data %>%
    select(ID, !!recent.name, !!older.name) %>%
    mutate(Diff = !!recent.name - !!older.name)
}
selectr(data = df,
        value = "Score1",
        year_from = 2015,
        year_to = 2016)
This produces the error: Error in !older.name : invalid argument type
If I leave out mutate(Diff = !!recent.name - !!older.name), the rest of the function works, but I do need the difference calculation inside the function.
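I think you need to change two things to make your function work:
- You want to convert recent.name and older.name from strings to symbols, which can be done with as.name(). enquo() is not the right tool here: it captures the expression a caller supplied for a function argument (a "promise") as a quosure, whereas recent.name and older.name are ordinary local variables holding strings.
- There seems to be an operator precedence problem in the mutate step. Because ! binds more loosely than -, R parses !!recent.name - !!older.name as !!(recent.name - !!older.name), which would explain the error raised in !older.name. Replacing !! with the equivalent function form UQ() solves the problem.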
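The precedence point can be checked in base R without dplyr. The sketch below only parses the expression with quote(); a and b are placeholder names that are never evaluated. It reflects plain R parsing, and newer rlang versions may handle !! differently inside dplyr verbs, so treat it as an illustration of the error above rather than a statement about every version.

```r
# Inspect how R parses the expression from the failing mutate() call.
# quote() only parses; a and b are placeholder names and are never evaluated.
e <- quote(!!a - !!b)

outer_op <- as.character(e[[1]])            # outermost call is "!"
inner_op <- as.character(e[[2]][[2]][[1]])  # the subtraction sits inside both "!"

# With explicit parentheses the subtraction becomes the outermost call
p <- quote((!!a) - (!!b))
paren_op <- as.character(p[[1]])

print(c(outer_op, inner_op, paren_op))
```

Wrapping each unquoted name in parentheses, as in the second expression, is another way to force the intended grouping.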
Here is the corrected version:
selectr <- function(data, value, year_from, year_to) {
  recent <- max(year_from, year_to)
  older <- min(year_from, year_to)
  recent.name <- paste(value, recent, sep = "_")
  older.name <- paste(value, older, sep = "_")
  recent.name <- as.name(recent.name)
  older.name <- as.name(older.name)
  data %>%
    select(ID, UQ(recent.name), UQ(older.name)) %>%
    mutate(Diff = UQ(recent.name) - UQ(older.name))
}
selectr(data = df,
        value = "Score1",
        year_from = 2015,
        year_to = 2016)
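As far as I know, UQ() was deprecated in later rlang releases, so on a current installation a variant that avoids unquoting may be easier to maintain. The sketch below is one possible alternative, not the method from the answer above: it keeps the column names as strings and uses the .data pronoun together with all_of(). It assumes dplyr 1.0.0 or later. The function name selectr_data and the toy data frame are hypothetical, introduced only to make the example reproducible.

```r
library(dplyr)

# Hypothetical variant of selectr(); assumes dplyr >= 1.0.0 for all_of().
selectr_data <- function(data, value, year_from, year_to) {
  # Build the two column names as plain strings
  recent.name <- paste(value, max(year_from, year_to), sep = "_")
  older.name  <- paste(value, min(year_from, year_to), sep = "_")

  data %>%
    # all_of() selects columns named in a character vector
    select(ID, all_of(c(recent.name, older.name))) %>%
    # .data[[name]] looks up a column by its string name
    mutate(Diff = .data[[recent.name]] - .data[[older.name]])
}

# Small fixed data set (made up for illustration) so the result is reproducible
toy <- data.frame(ID = 1:3,
                  Score1_2015 = c(0.2, 0.5, 0.9),
                  Score1_2016 = c(0.4, 0.1, 0.9),
                  Score2_2015 = c(1, 2, 3))

res <- selectr_data(toy, value = "Score1", year_from = 2015, year_to = 2016)
print(res)
```

Because the names stay strings throughout, no conversion with as.name() is needed, and the order of year_from and year_to does not matter.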