Making combinations of row values based on another value in R

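I need to make a network visualization. I have the data, but it is not in the right format yet! The data is in a data frame in R and looks like this: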

Title       Name
Article1    Johnson
Article1    Hansson
Article1    Michaels
Article2    Nielsson
Article2    Madsen
Article2    Shannon
Article2    Paddington

I want to find the combinations of names within each title, i.e. the co-authors, so the output should be in this format:

Source     Target      Title
Johnson    Hansson     Article1
Johnson    Michaels    Article1
Hansson    Michaels    Article1
Nielsson   Madsen      Article2
Nielsson   Shannon     Article2
Nielsson   Paddington  Article2
Madsen     Shannon     Article2
Madsen     Paddington  Article2
Shannon    Paddington  Article2

The network is undirected, so Source/Target are just column names for illustration. How can I do this in R? I'm sure there is a simple way, but I can't find it.
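
To make the example reproducible, the data frame shown above can be rebuilt roughly as follows. This is only a sketch: the name `df` is an assumption taken from the code in the answers, and `stringsAsFactors = FALSE` is set so that the names stay plain character strings.

```r
# Rebuild the example data from the question.
# The object name `df` is assumed; it matches what the answers use.
df <- data.frame(
  Title = c(rep("Article1", 3), rep("Article2", 4)),
  Name  = c("Johnson", "Hansson", "Michaels",
            "Nielsson", "Madsen", "Shannon", "Paddington"),
  stringsAsFactors = FALSE  # keep names as character, not factor
)
```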

Try this, in base R:

 # all name pairs per Title (one two-column matrix each), then stack and label them
 combos <- tapply(df$Name, df$Title, function(x) t(combn(x, 2)))
 cbind(setNames(as.data.frame(do.call(rbind, combos)), c("Source", "Target")),
       Title = rep(names(combos), vapply(combos, nrow, 1L)))

#    Source     Target    Title
#1  Johnson    Hansson Article1
#2  Johnson   Michaels Article1
#3  Hansson   Michaels Article1
#4 Nielsson     Madsen Article2
#5 Nielsson    Shannon Article2
#6 Nielsson Paddington Article2
#7   Madsen    Shannon Article2
#8   Madsen Paddington Article2
#9  Shannon Paddington Article2
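
One thing to watch for: `combn(x, 2)` raises an error when a title has only one name. A minimal sketch of the same idea wrapped in a function that skips such titles might look like this. The helper name `pair_by_title` and the extra "Solo"/"Smith" row are made up for illustration and are not part of the answer above.

```r
# Hypothetical helper (not from the answer): all unordered name pairs per title.
pair_by_title <- function(df) {
  # split() returns one character vector of names per title, in sorted title order
  groups <- split(as.character(df$Name), as.character(df$Title))
  # combn() fails when a group has fewer than 2 names, so drop those groups
  groups <- groups[lengths(groups) >= 2]
  pieces <- lapply(names(groups), function(title) {
    pairs <- t(combn(groups[[title]], 2))  # one row per pair
    data.frame(Source = pairs[, 1], Target = pairs[, 2], Title = title,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Illustrative data; the single-author "Solo" title is invented for this example
df <- data.frame(
  Title = c("Article1", "Article1", "Article1", "Article2", "Article2", "Solo"),
  Name  = c("Johnson", "Hansson", "Michaels", "Nielsson", "Madsen", "Smith"),
  stringsAsFactors = FALSE
)
edges <- pair_by_title(df)
```

Here `edges` has four rows: three pairs for Article1, one for Article2, and none for the single-author title.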

Here is a possible solution, using data.table v >= 1.9.5 and the new tstrsplit function:

library(data.table) # v >= 1.9.5
setDT(df)[, setNames(tstrsplit(combn(Name, 2, toString, simplify = FALSE), ", "), 
                     c("Source", "Target")), 
          by = Title]
#       Title   Source     Target
# 1: Article1  Johnson    Hansson
# 2: Article1  Johnson   Michaels
# 3: Article1  Hansson   Michaels
# 4: Article2 Nielsson     Madsen
# 5: Article2 Nielsson    Shannon
# 6: Article2 Nielsson Paddington
# 7: Article2   Madsen    Shannon
# 8: Article2   Madsen Paddington
# 9: Article2  Shannon Paddington
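
The call above joins each pair into one string with `toString` and then splits it apart again on ", ". A rough base-R sketch of that round trip shows the assumption it relies on: no name may itself contain ", ". Here base `strsplit` stands in for `tstrsplit`, and the name "Smith, J." is invented for illustration.

```r
# Join each pair into one string, as combn(..., toString) does in the answer
joined <- combn(c("Johnson", "Hansson", "Michaels"), 2, toString, simplify = FALSE)
# joined[[1]] is "Johnson, Hansson"

# Split back on ", " (base strsplit standing in for data.table's tstrsplit)
parts <- strsplit(unlist(joined), ", ", fixed = TRUE)
pairs <- do.call(rbind, parts)  # 3 x 2 character matrix, one row per pair

# A name that already contains ", " splits into too many pieces
bad <- strsplit(toString(c("Smith, J.", "Madsen")), ", ", fixed = TRUE)[[1]]
length(bad)  # 3 pieces instead of 2

# Reading the pairs straight off the combn() matrix avoids the string round trip
direct <- t(combn(c("Smith, J.", "Madsen", "Shannon"), 2))
```

If the names may contain commas, taking the rows of the `combn()` matrix directly, as in the last line, is probably the safer route.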