Calculating the academic "g-index" (a variant of the h-index) in R?

This data frame shows two researchers and the number of citations for each of their papers:

   researcher citations
   <chr>          <dbl>
 1 Berger             8
 2 Berger            11
 3 Berger            26
 4 Berger            25
 5 Berger            10
 6 Meyer             45
 7 Meyer             12
 8 Meyer             12
 9 Meyer              8
10 Meyer             21

How can I calculate the "g-index" for each researcher in R?

Here is the Wikipedia definition of the g-index:

The index is calculated based on the distribution of citations received by a given researcher's publications, such that given a set of articles ranked in decreasing order of the number of citations that they received, the g-index is the unique largest number such that the top g articles received together at least g^2 citations. Hence, a g-index of 10 indicates that the top 10 publications of an author have been cited at least 100 times (10^2), a g-index of 20 indicates that the top 20 publications of an author have been cited 400 times (20^2).
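To make the definition concrete, here is a minimal base R sketch that applies it to a single vector of citation counts. The function name `g_index` is hypothetical (it is not from the original post), and this version assumes g cannot exceed the number of papers.

```r
# Hypothetical helper (not from the original post): g-index of one citation vector.
g_index <- function(citations) {
  sorted <- sort(citations, decreasing = TRUE)  # rank papers, most cited first
  ranks <- seq_along(sorted)
  ok <- cumsum(sorted) >= ranks^2               # top g papers together have >= g^2 citations
  if (any(ok)) max(ranks[ok]) else 0L           # 0 when no rank qualifies (or no papers)
}

# Berger's papers from the question: running totals 26, 51, 62, 72, 80
# are compared against the thresholds 1, 4, 9, 16, 25.
g_index(c(8, 11, 26, 25, 10))
```

Every running total for Berger clears its threshold, so the result is 5, the number of papers.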

The data frame:

structure(list(researcher = c("Berger", "Berger", "Berger", "Berger", 
"Berger", "Meyer", "Meyer", "Meyer", "Meyer", "Meyer"), citations = c(8, 
11, 26, 25, 10, 45, 12, 12, 8, 21)), row.names = c(NA, -10L), groups = structure(list(
    researcher = c("Berger", "Meyer"), .rows = structure(list(
        1:5, 6:10), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

data.table:

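library(data.table)
dt <- as.data.table(df)  # assumes the data frame above is stored as df
# Rank each researcher's papers from most to least cited
setorder(dt, researcher, -citations)
# g = largest rank g whose top-g cumulative citations are at least g^2
dtg <- dt[, .(gscore = max((1:.N) * (cumsum(citations) >= (1:.N)^2))), by = "researcher"]
dtg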
#>    researcher gscore
#> 1:     Berger      5
#> 2:      Meyer      5
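Both researchers in the sample reach the largest value five papers allow, so the sample does not show a case where the g^2 threshold cuts the index short. The sketch below uses made-up citation counts (the researchers "A" and "B" are purely illustrative, not from the question) and base R's `tapply` to show such a case.

```r
# Made-up citation counts, for illustration only (not from the question).
toy <- data.frame(
  researcher = c("A", "A", "A", "A", "A", "B", "B", "B"),
  citations  = c(10, 3, 1, 0, 0, 2, 1, 0)
)

# Per-researcher g-index: sort descending, compare the running total with rank^2.
g_by_researcher <- tapply(toy$citations, toy$researcher, function(x) {
  s <- sort(x, decreasing = TRUE)
  r <- seq_along(s)
  max(r * (cumsum(s) >= r^2))  # ranks that fail the test contribute 0
})
g_by_researcher
```

For "A" the running totals are 10, 13, 14, 14, 14 against thresholds 1, 4, 9, 16, 25, so the index stops at 3. For "B" only the first rank qualifies, giving 1.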