Making a loop in R for counting word frequencies from specific columns in different tables

Question

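I have 15 different tables, each with a "text" column containing long text (a series of answers to a poll question). I want to tidy the tables by creating one row for each word in "text", stored in a column called "word". Then I want to get the word frequencies for each table. I wrote this code: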

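library(dplyr)
library(tidytext)

# One row per word, with stop words removed
Table1.tidy <- Table1 %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word")

# Word frequencies, most common first
Table1.tidy %>%
  count(word, sort = TRUE)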

It works fine, but now I want to avoid repeating this code for every table. Does anyone know how to do that?

Answer 1

(1) Put all your data.frames into a list.

(2) Use purrr's map function to apply your workflow to each element:

library(dplyr)
library(purrr)
library(tidytext)  # provides unnest_tokens() and stop_words

my_list <- list(Table1, Table2, Table3)

my_tidy_list <- my_list %>%
  map(~ .x %>%
        unnest_tokens(word, text) %>%
        anti_join(stop_words, by = "word") %>%
        # Table1.tidy %>%  # I think this line is a mistake?
        count(word, sort = TRUE))

my_tidy_list[[1]] returns the word counts for Table1, my_tidy_list[[2]] returns the word counts for Table2, and so on.
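If it helps to keep track of which result belongs to which table, one possible variation is to name the list elements and stack the per-table counts into a single data frame. The sketch below is only an illustration: the two small tables are made-up stand-ins for the real data, and count_words and all_counts are hypothetical names that do not appear in the original question or answer.

```r
library(dplyr)
library(purrr)
library(tidytext)

# Made-up stand-ins for the real tables (assumption: each has a "text" column)
Table1 <- tibble(text = c("The budget vote matters", "Vote for the budget"))
Table2 <- tibble(text = c("Parks need funding", "Funding parks helps the city"))

# Naming the elements lets each result be traced back to its table
my_list <- list(Table1 = Table1, Table2 = Table2)

# Hypothetical helper wrapping the workflow from the question
count_words <- function(df) {
  df %>%
    unnest_tokens(word, text) %>%          # one row per lowercased word
    anti_join(stop_words, by = "word") %>% # drop common stop words
    count(word, sort = TRUE)               # frequency per word
}

# Apply the helper to every table, then stack the results;
# .id = "table" stores the list names in a column called "table"
all_counts <- my_list %>%
  map(count_words) %>%
  bind_rows(.id = "table")
```

With a named list, all_counts %>% filter(table == "Table1") gives the counts for a single table, and my_list %>% map(count_words) on its own still gives one data frame per table, as in the answer above.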
