Undirected combinations of actors in the same movie

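I'm not sure how to describe what I'm trying to do. I have a data frame with two columns (movie and actor). I want to create a list of unique two-actor combinations based on the movies they appear in together. Below is code that creates an example of the data frame I have, along with another data frame that shows the result I want.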


start_data <- tibble::tribble(
  ~movie, ~actor,
  "titanic", "john",
  "star wars", "john",
  "baby driver", "john",
  "shawshank", "billy",
  "titanic", "billy",
  "star wars", "sarah",
  "titanic", "sarah"
)

end_data <- tibble::tribble(
  ~movie, ~actor1, ~actor2,
  "titanic", "john", "billy",
  "titanic", "john", "sarah",
  "titanic", "billy", "sarah",
  "star wars", "john", "sarah"
)

Any help is appreciated, thanks! Bonus points if it's short ++

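You can use combn(..., 2) to find the two-actor combinations within each movie, convert the result to a two-column tibble, and store it in a list column with summarise; to get a flat data frame, use unnest: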

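library(tidyverse)

start_data %>% 
    group_by(movie) %>% 
    summarise(acts = list(
        # combn() errors when a movie has fewer than two actors, so guard against that;
        # t() turns combn()'s one-pair-per-column matrix into one pair per row
        if (length(actor) > 1) set_names(as_tibble(t(combn(actor, 2)), .name_repair = "minimal"), c('actor1', 'actor2')) 
        else tibble()
    )) %>% 
    # flatten the list column; movies with an empty tibble drop out
    unnest(acts)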

# A tibble: 4 x 3
#      movie actor1 actor2
#      <chr>  <chr>  <chr>
#1 star wars   john  sarah
#2   titanic   john  billy
#3   titanic   john  sarah
#4   titanic  billy  sarah
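
To make the combn() step concrete, here is a minimal base R sketch using the titanic cast from the question. The variable names (cast, pairs, pair_rows, single) are illustrative only and do not appear in the answer above.

```r
# Cast of "titanic" from start_data, in the order the rows appear
cast <- c("john", "billy", "sarah")

# combn() returns a 2 x 3 character matrix: one pair per column
pairs <- combn(cast, 2)

# Transposing gives one pair per row, ready to become two columns
pair_rows <- t(pairs)
print(pair_rows)

# With a single actor there is nothing to pair, and combn() raises an error,
# which is why the answer checks length(actor) > 1 first
single <- tryCatch(combn("john", 2), error = function(e) NULL)
```

Each pair appears once, because combn() enumerates unordered combinations rather than ordered permutations.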
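
An alternative is to join the data to itself on movie and keep each unordered pair once:

library(tidyverse)

inner_join(start_data, start_data, by = "movie") %>% 
  # drop rows that pair an actor with themselves
  filter(actor.x != actor.y) %>% 
  rowwise() %>% 
  # build an order-independent key so (john, billy) and (billy, john) collapse together
  mutate(combo = str_c(min(actor.x, actor.y), "_", max(actor.x, actor.y))) %>% 
  ungroup() %>%
  select(movie, combo) %>% 
  distinct() %>% 
  # split on the underscore only, so names containing spaces or punctuation stay intact
  separate(combo, c("actor1", "actor2"), sep = "_")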
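
The self-join idea can probably be shortened. The following is a sketch of a variant, not the answerer's code: it keeps only rows where the first name sorts before the second, so each unordered pair survives exactly once without building and splitting a key. The result name pairs is hypothetical, and recent dplyr versions (1.1.0 and later) may warn about a many-to-many join here, which is expected for a self-join of this kind.

```r
library(dplyr)

# Example data from the question
start_data <- tibble::tribble(
  ~movie, ~actor,
  "titanic", "john",
  "star wars", "john",
  "baby driver", "john",
  "shawshank", "billy",
  "titanic", "billy",
  "star wars", "sarah",
  "titanic", "sarah"
)

pairs <- start_data %>%
  # pair every actor with every actor in the same movie
  inner_join(start_data, by = "movie") %>%
  # keep one ordering per pair; this also drops self-pairs
  filter(actor.x < actor.y) %>%
  rename(actor1 = actor.x, actor2 = actor.y) %>%
  # guards against duplicate rows in the input
  distinct()

print(pairs)
```

Within each pair the names come out in alphabetical order (for example billy before john), which differs from end_data but describes the same undirected combinations.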