Building Sentences from a dataframe in R

I am trying to generate sentences from a data frame. Here is the data frame:

# Code
mycode <- c("AAABBB", "AAABBB", "AAACCC", "AAABBD")
mycode <- sample(mycode, 20, replace = TRUE)

# Date
mydate <- c("2016-10-17", "2016-10-18", "2016-10-19", "2016-10-20")
mydate <- sample(mydate, 20, replace = TRUE)

# Resort
myresort <- c("GB", "IE", "GR", "DK")
myresort <- sample(myresort, 20, replace = TRUE)

# Number of holidaymakers
HolidayMakers <- sample(1000, 20, replace = TRUE)

mydf <- data.frame(mycode,
                  mydate,
                  myresort,
                  HolidayMakers)

Taking mycode as an example, I would like to create a sentence like:

"For the code mycode, the biggest destinations are myresorts where the top days of visiting were mydate with a total of HolidayMakers"

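Assume there are multiple rows per code. What I want is one sentence per mycode, not one sentence per mydate/myresort combination. For example:

"For the code AAABBB, the biggest destinations are GB,GR,DK,IE where the top days of visiting were 2016-10-17,2016-10-18,2016-10-19 with a total of 650"

Here 650 is the sum of all holidaymakers across all of those countries on those days.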

Can anyone help?

Thanks for your time.

You could try:

library(dplyr)
res <- mydf %>%
  group_by(mycode) %>%
  summarise(d = toString(unique(mydate)), 
            r = toString(unique(myresort)), 
            h = sum(HolidayMakers)) %>%
  mutate(s = paste("For the code", mycode, 
                   "the biggest destinations are", r, 
                   "where the top days of visiting were", d, 
                   "with a total of", h))

Which gives:

> res$s

#[1] "For the code AAABBB the biggest destinations are GB, GR, IE, DK 
#     where the top days of visiting were 2016-10-17, 2016-10-18, 
#     2016-10-20, 2016-10-19 with a total of 6577"
#[2] "For the code AAABBD the biggest destinations are IE 
#     where the top days of visiting were 2016-10-17, 2016-10-18 
#     with a total of 1925"                                    
#[3] "For the code AAACCC the biggest destinations are IE, GR, DK 
#     where the top days of visiting were 2016-10-20, 2016-10-17, 
#     2016-10-19, 2016-10-18 with a total of 2878"    
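
The sentences produced this way differ slightly from the requested format: there is no comma after the code, and toString() separates items with ", " rather than ",". If the exact format matters, a minimal sketch along the following lines may help. The small data frame and the name res_fmt are made up for illustration; sprintf() and paste(collapse = ",") are swapped in for paste() and toString().

```r
library(dplyr)

# Small hypothetical data set, used only to illustrate the formatting
mydf <- data.frame(
  mycode        = c("AAABBB", "AAABBB", "AAACCC"),
  mydate        = c("2016-10-17", "2016-10-18", "2016-10-17"),
  myresort      = c("GB", "IE", "GR"),
  HolidayMakers = c(400L, 250L, 120L),
  stringsAsFactors = FALSE
)

# Sentence template with a comma after the code, as in the requested format
template <- paste0(
  "For the code %s, the biggest destinations are %s ",
  "where the top days of visiting were %s with a total of %s"
)

res_fmt <- mydf %>%
  group_by(mycode) %>%
  summarise(d = paste(unique(mydate), collapse = ","),    # no space after comma
            r = paste(unique(myresort), collapse = ","),
            h = sum(HolidayMakers)) %>%
  mutate(s = sprintf(template, mycode, r, d, h))          # vectorised over rows

res_fmt$s
```

sprintf() keeps the sentence template in one place, which makes punctuation changes easier than editing a long paste() call.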

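Note: since you did not give any guidance on how you intend to determine the "top visiting days", I simply included all of the days. You can easily edit the above to fit your actual situation.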
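
As one possible way to adapt it, here is a sketch that assumes "top days" means the n days with the highest total HolidayMakers within each code. That definition, the small fixed data set, and the names top_n_days, top_days and res_top are all assumptions made for illustration, not something stated in the question. It relies on slice_max() and the .groups argument, which need dplyr 1.0.0 or later.

```r
library(dplyr)

# Small fixed data set (hypothetical values) so the result is reproducible
mydf <- data.frame(
  mycode        = c("AAABBB", "AAABBB", "AAABBB", "AAABBB", "AAACCC", "AAACCC"),
  mydate        = c("2016-10-17", "2016-10-18", "2016-10-17", "2016-10-19",
                    "2016-10-20", "2016-10-20"),
  myresort      = c("GB", "IE", "GB", "GR", "DK", "IE"),
  HolidayMakers = c(100, 50, 200, 10, 30, 70),
  stringsAsFactors = FALSE
)

top_n_days <- 2  # assumption: keep the 2 busiest days per code

# Total holidaymakers per code and day, then keep the busiest days per code
top_days <- mydf %>%
  group_by(mycode, mydate) %>%
  summarise(day_total = sum(HolidayMakers), .groups = "drop") %>%
  group_by(mycode) %>%
  slice_max(day_total, n = top_n_days, with_ties = FALSE) %>%
  ungroup()

# Keep only rows that fall on a top day, then build one sentence per code
res_top <- mydf %>%
  semi_join(top_days, by = c("mycode", "mydate")) %>%
  group_by(mycode) %>%
  summarise(d = toString(unique(mydate)),
            r = toString(unique(myresort)),
            h = sum(HolidayMakers)) %>%
  mutate(s = paste("For the code", mycode,
                   "the biggest destinations are", r,
                   "where the top days of visiting were", d,
                   "with a total of", h))

res_top$s
```

In this sketch the resorts and the total are restricted to the selected days, which matches the description of the total as the sum over "those days". If the total should cover every day for the code, compute h from the unfiltered data instead.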