Collapsing every 4 sequential text rows of the same author in R

Question

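I want to combine every four posts by the same author into a single row of a wide data frame. If fewer than four posts remain, those should be combined as well (e.g., an author with 11 posts ends up with 2 rows of 4 posts and 1 row of 3 posts).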

Here is a sample of my data frame:

name  text
bee   _ so we know that right           
bee   said so           
alma  hello,            
alma  Good to hear back from you.           
bee   I've currently written an application         
alma  I'm happy about it            
bee   It was not the last.          
alma  Will this ever stop.          
alma  Yet another line.         
alma  so

I would like to turn it into this:

name  text
bee   _ so we know that right said so I've currently written an application It was not the last.
alma  hello, Good to hear back from you. I'm happy about it Will this ever stop.
alma  Yet another line. so

Here is the initial data frame:

df = structure(list(name = c("bee", "bee", "alma", "alma", "bee", "alma", "bee", "alma", "alma", "alma"), text = c( "_ so we know that right", "said so", "hello,", "Good to hear back from you.", "I've currently written an application", "I'm happy about it", "It was not the last.", "Will this ever stop.", "Yet another line.", "so")), .Names = c("name", "text"), row.names = c(NA, -10L), class = "data.frame")

Answer 1

One option using dplyr could be:

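library(dplyr)

df %>%
 group_by(name) %>%
 # number each author's posts in order and assign them to chunks of four
 mutate(ID = ceiling(row_number()/4)) %>%
 group_by(name, ID) %>%
 # paste the texts within each author/chunk pair into one string
 summarise_all(paste, collapse = " ")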

  name     ID text                                                                         
  <chr> <dbl> <chr>                                                                        
1 alma      1 hello, Good to hear back from you. I'm happy about it Will this ever stop.   
2 alma      2 Yet another line. so                                                         
3 bee       1 _ so we know that right said so I've currently written an application It was…
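
The grouping hinges on ceiling(row_number()/4), which maps an author's 1st to 4th posts to chunk 1, the 5th to 8th to chunk 2, and so on, so a short final chunk falls out naturally. The sketch below first shows that arithmetic for the 11-post case from the question, then reproduces the same collapse in base R as an alternative, not as the answerer's method. The names chunk_id and res are illustrative only, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Chunk index for an author with 11 posts: two chunks of 4 and one of 3
chunk_id <- ceiling(seq_len(11) / 4)
chunk_id
# 1 1 1 1 2 2 2 2 3 3 3

# Same sample data as in the question
df <- data.frame(
  name = c("bee", "bee", "alma", "alma", "bee",
           "alma", "bee", "alma", "alma", "alma"),
  text = c("_ so we know that right", "said so", "hello,",
           "Good to hear back from you.",
           "I've currently written an application",
           "I'm happy about it", "It was not the last.",
           "Will this ever stop.", "Yet another line.", "so"),
  stringsAsFactors = FALSE
)

# Per-author running row number (1, 2, 3, ... within each name),
# converted to a chunk id of at most four rows
df$ID <- ceiling(ave(seq_along(df$name), df$name, FUN = seq_along) / 4)

# Paste the texts within each name/ID pair into one string
res <- aggregate(text ~ name + ID, data = df, FUN = paste, collapse = " ")
res
```

As with the dplyr version, the result contains one row per author and chunk, but the rows are not returned in the original posting order.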
