sapply over vector for each element in a list

I have a large list of terms extracted from a corpus.

    mylist <- list(c("flower"),
                   c("plant", "animal", "cats", "doggy"),
                   c("tree", "trees", "cat", "dog"))

The extracted terms come from a data frame that stores each term as a main word, a similar word, and a category:

    ref <- data.frame(id = c(1:5),
                      main = c("tree", "plant", "flower", "dog", "cat"),
                      similar = c("trees", "plantlike", "flowery", "doggy", "cats"),
                      category = c("plant", "plant", "plant", "animal", "animal"))

I need to change the list so that it contains the categories instead of the words, and ideally drop the duplicates, like this:

    needed <- list("plant",
                   c("plant", "animal", "animal", "animal"),
                   c("plant", "plant", "animal", "animal"))
    
    orbetter <- list("plant",
                     c("plant", "animal"),
                     c("plant", "animal"))

But I don't know how to apply this to each element of the list. Thanks for your help.

    mylist <- list(c("flower"),
                   c("plant", "animal", "cats", "doggy"),
                   c("tree", "trees", "cat", "dog"))

    ref <- data.frame(id = c(1:5),
                      main = c("tree", "plant", "flower", "dog", "cat"),
                      similar = c("trees", "plantlike", "flowery", "doggy", "cats"),
                      category = c("plant", "plant", "plant", "animal", "animal"))

    library(tidyr)

    # Stack `main` and `similar` into a single `value` column,
    # so that every term sits next to its category
    ref_long <- ref %>%
      pivot_longer(-c(id, category))

    # For each list element, look up the category of every word and drop duplicates
    lapply(mylist, function(x) unique(ref_long$category[match(x, table = ref_long$value)]))
    #> [[1]]
    #> [1] "plant"
    #> 
    #> [[2]]
    #> [1] "plant"  NA       "animal"
    #> 
    #> [[3]]
    #> [1] "plant"  "animal"

Created on 2022-01-14 by the reprex package (v2.0.1)
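
The `NA` in the second element appears because `"animal"` is not listed in `main` or `similar`; it is only a category name. A minimal base R sketch of one way to handle that is below. It assumes that a word which is already a category name should be kept as that category. The names `lookup` and `to_category` are illustrative and do not come from the question.

    mylist <- list(c("flower"),
                   c("plant", "animal", "cats", "doggy"),
                   c("tree", "trees", "cat", "dog"))

    ref <- data.frame(id = 1:5,
                      main = c("tree", "plant", "flower", "dog", "cat"),
                      similar = c("trees", "plantlike", "flowery", "doggy", "cats"),
                      category = c("plant", "plant", "plant", "animal", "animal"))

    # Named lookup vector: names are the terms (main and similar words),
    # values are the corresponding categories
    lookup <- setNames(rep(as.character(ref$category), times = 2),
                       c(as.character(ref$main), as.character(ref$similar)))

    # Hypothetical helper. Assumption: a word that is not a known term but is
    # itself a category name (here "animal") is kept as that category.
    to_category <- function(words) {
      cats <- unname(lookup[words])
      is_category <- is.na(cats) & words %in% ref$category
      cats[is_category] <- words[is_category]
      unique(cats)
    }

    result <- lapply(mylist, to_category)

With the sample data, `result` should match the `orbetter` list from the question. Words that are neither a known term nor a category name still come back as `NA`, which makes unmatched terms easy to spot.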