How do I create a network if I only have the edge names?

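I am trying to connect authors who are cited in the same process. My nodes are the authors and my edges are the processes, but I don't know how to create the edge list.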

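What I have now ('Doutrina' is the author, 'Numero' is the process number):

What I want is something like this (here 'N' is how many times the connection occurs, i.e. how many times the two authors were cited together):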


Sample data:

library(dplyr)

df <- tribble(
  ~Doutrina,           ~Numero,
  "MILARE, 2014",      "1009526-53.2015.8.26.0032",
  "SEGUIN, 2000",      "0054387-89.2011.8.26.0224",
  "SILVA, 2009",       "0054387-89.2011.8.26.0224",
  "MILARE, 2015",      "0000351-14.2013.8.26.0326",
  "SILVA, 2011",       "0000351-14.2013.8.26.0326",
  "MAXIMILIANO, 1961", "0000351-14.2013.8.26.0326",
  "SILVA, 2009",       "0000431-26.2013.8.26.0698",
  "SEGUIN, 2000",      "0000431-26.2013.8.26.0698",
  "SILVA, 2009",       "0054391-29.2011.8.26.0224",
  "SEGUIN, 2000",      "0054391-29.2011.8.26.0224",
  "MAXIMILIANO, 2015", "0012360-28.2010.8.26.0224",
  "MILARE, 2015",      "0012360-28.2010.8.26.0224"
)

df
#> # A tibble: 12 x 2
#>    Doutrina          Numero                   
#>    <chr>             <chr>                    
#>  1 MILARE, 2014      1009526-53.2015.8.26.0032
#>  2 SEGUIN, 2000      0054387-89.2011.8.26.0224
#>  3 SILVA, 2009       0054387-89.2011.8.26.0224
#>  4 MILARE, 2015      0000351-14.2013.8.26.0326
#>  5 SILVA, 2011       0000351-14.2013.8.26.0326
#>  6 MAXIMILIANO, 1961 0000351-14.2013.8.26.0326
#>  7 SILVA, 2009       0000431-26.2013.8.26.0698
#>  8 SEGUIN, 2000      0000431-26.2013.8.26.0698
#>  9 SILVA, 2009       0054391-29.2011.8.26.0224
#> 10 SEGUIN, 2000      0054391-29.2011.8.26.0224
#> 11 MAXIMILIANO, 2015 0012360-28.2010.8.26.0224
#> 12 MILARE, 2015      0012360-28.2010.8.26.0224

I modified your example data so that the results are more interesting.

library(dplyr)

df <- tribble(
  ~Doutrina,           ~Numero,
  "MILARE, 2014",      "1009526-53.2015.8.26.0032",
  "SEGUIN, 2000",      "0054387-89.2011.8.26.0224",
  "SILVA, 2009",       "0054387-89.2011.8.26.0224",
  "MILARE, 2015",      "0000351-14.2013.8.26.0326",
  "SILVA, 2011",       "0000351-14.2013.8.26.0326",
  "MAXIMILIANO, 1961", "0000351-14.2013.8.26.0326",
  "SILVA, 2009",       "0000431-26.2013.8.26.0698",
  "SEGUIN, 2000",      "0000431-26.2013.8.26.0698",
  "SILVA, 2009",       "0054391-29.2011.8.26.0224",
  "SEGUIN, 2000",      "0054391-29.2011.8.26.0224",
  "MAXIMILIANO, 2015", "0012360-28.2010.8.26.0224",
  "MILARE, 2015",      "0012360-28.2010.8.26.0224"
)

df %>% 
  mutate(Doutrina = sub(", [0-9]{4}", "", Doutrina)) %>%  # remove the year
  full_join(x = ., y = ., by = "Numero") %>%  # join data to itself by Numero
  select(Doutrina = Doutrina.x, Doutrina2 = Doutrina.y) %>%  # keep only name columns
  filter(Doutrina != Doutrina2) %>%  # remove self-reference rows
  filter(Doutrina < Doutrina2) %>%  # keep only one direction of each edge/link
  group_by(Doutrina, Doutrina2) %>% 
  summarise(N = n(), .groups = "drop")
#> # A tibble: 4 x 3
#>   Doutrina    Doutrina2     N
#>   <chr>       <chr>     <int>
#> 1 MAXIMILIANO MILARE        2
#> 2 MAXIMILIANO SILVA         1
#> 3 MILARE      SILVA         1
#> 4 SEGUIN      SILVA         3
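
As a cross-check on the self-join above, the same counts can be sketched in base R by building a process-by-author incidence matrix and multiplying it with itself. This is only an illustrative alternative, not part of the original answer; the names `author`, `inc`, `co`, `idx` and `edges` are made up for this sketch. It assumes that an author cited under several years in the same process should count once for that process, which is why the incidence matrix is reduced to 0/1 (the self-join would count such a case more than once).

```r
# Same sample data as above, rebuilt with base R so the sketch has no dependencies
df <- data.frame(
  Doutrina = c("MILARE, 2014", "SEGUIN, 2000", "SILVA, 2009", "MILARE, 2015",
               "SILVA, 2011", "MAXIMILIANO, 1961", "SILVA, 2009", "SEGUIN, 2000",
               "SILVA, 2009", "SEGUIN, 2000", "MAXIMILIANO, 2015", "MILARE, 2015"),
  Numero = c("1009526-53.2015.8.26.0032", "0054387-89.2011.8.26.0224",
             "0054387-89.2011.8.26.0224", "0000351-14.2013.8.26.0326",
             "0000351-14.2013.8.26.0326", "0000351-14.2013.8.26.0326",
             "0000431-26.2013.8.26.0698", "0000431-26.2013.8.26.0698",
             "0054391-29.2011.8.26.0224", "0054391-29.2011.8.26.0224",
             "0012360-28.2010.8.26.0224", "0012360-28.2010.8.26.0224"),
  stringsAsFactors = FALSE
)

# Remove the year so that each author is a single node
author <- sub(", [0-9]{4}", "", df$Doutrina)

# Incidence matrix: rows are processes, columns are authors,
# 1 if the author is cited at least once in that process (assumption: count once)
inc <- (unclass(table(df$Numero, author)) > 0) + 0L

# Author-by-author matrix: entry [i, j] is the number of processes citing both
co <- crossprod(inc)

# Keep each unordered pair once (upper triangle) and drop pairs never cited together
idx <- which(upper.tri(co) & co > 0, arr.ind = TRUE)
edges <- data.frame(
  Doutrina  = rownames(co)[idx[, 1]],
  Doutrina2 = colnames(co)[idx[, 2]],
  N         = co[idx],
  stringsAsFactors = FALSE
)
edges <- edges[order(edges$Doutrina, edges$Doutrina2), ]
rownames(edges) <- NULL

edges
```

On this sample data the result has the same four pairs and counts as the dplyr output above. Either edge list can then be handed to a graph package, for example `igraph::graph_from_data_frame(edges, directed = FALSE)` if igraph is installed, which keeps `N` as an edge attribute.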