Add ID column to a list of data frames

Question

I have a list file_content containing 142 data frames, and a list id_list created with id_list <- list(as.character(1:length(file_content))).

I am trying to add a new column, period, to each data frame in file_content.

All of the data frames look like 2021-03-16 below.

`2021-03-16` <- file_content[[1]] # take a look at 1/142 dataframes in file_content

head(`2021-03-16`)
     author_id                created_at           id                                                                                           tweet
1 3.304380e+09 2018-12-01 22:58:55+00:00 1.069003e+18                                          @Acosta I hope he didnâ€™t really say â€œmuckâ€\u009d.
2 5.291559e+08 2018-12-01 22:57:31+00:00 1.069003e+18      @Acosta I like Mattis, but why does he only speak this way when Individual-1 isn't around?
3 2.195313e+09 2018-12-01 22:56:41+00:00 1.069002e+18 @Acosta What did Mattis say about the informal conversation between Trump and Putin at the G20?
4 3.704188e+07 2018-12-01 22:56:41+00:00 1.069002e+18                                                           @Acosta Good! Tree huggers be damned!
5 1.068995e+18 2018-12-01 22:56:11+00:00 1.069002e+18                                                    @Acosta @NinerMBA_01
6 9.983321e+17 2018-12-01 22:55:13+00:00 1.069002e+18                                                                                 @Acosta Really?

I tried to add the period column with the following code, but it adds all 142 values from id_list to every row of every data frame in file_content.

for (id in length(id_list)) {
  file_content <- lapply(file_content, function(x) { x$period <- paste(id_list[id], sep = "_"); x }) 
}

Answer 1

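We can use imap from purrr. Inside the formula, .x is the data frame and .y is its name, or its position if the list is unnamed.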

library(purrr)
library(dplyr)
imap(file_content, ~ .x %>% 
      mutate(period = .y))

Or use Map from base R (this relies on file_content being a named list):

Map(cbind, file_content, period = names(file_content))
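For an unnamed list, names(file_content) returns NULL, so a positional index can stand in for the names. A minimal sketch under that assumption; the two toy data frames are made up for illustration and are not the OP's data:

```r
# Hypothetical unnamed list standing in for the 142 data frames
file_content <- list(data.frame(x = 1:2), data.frame(x = 3:5))

is.null(names(file_content))  # TRUE: there are no names to use as IDs

# Pair each data frame with its position instead of its name
result <- Map(cbind, file_content, period = seq_along(file_content))

result[[2]]$period  # the second data frame carries ID 2 on every row
```

Each scalar ID is recycled down the rows of its own data frame, which is the per-data-frame behaviour the question asks for.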

In the OP's code, id_list is created as a list with a single element, because the whole vector is wrapped in list, i.e.

list(1:5)

versus

as.list(1:5)

Here, we don't need to convert to a list at all, because a vector is enough

id_list <- seq_along(file_content)
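The difference between the two constructions is easy to check by length. A small sketch; the variable names wrapped and split_up are only illustrative:

```r
wrapped  <- list(1:5)     # one element that holds the whole vector
split_up <- as.list(1:5)  # five elements, one per value

length(wrapped)   # 1
length(split_up)  # 5

# The single element of `wrapped` is the full vector 1:5
identical(wrapped[[1]], 1:5)  # TRUE
```

This is why indexing the OP's id_list at position 1 returns every ID at once.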

Also, the for loop iterates over a single value, namely the last index, because length returns just one number:

for (id in length(id_list)) {
            ^^

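Instead, it should be 1:length(id_list), or more safely seq_along(id_list). Also, the assignment should be made to the individual list element file_content[[id]] rather than to the whole list: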

for (id in seq_along(id_list)) {
  # update only the id-th data frame
  file_content[[id]]$period <- id_list[id]
}
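To see both points on something small, here is a sketch with two made-up data frames in place of the 142 (the data and the helper variable visited are assumptions for illustration):

```r
# Hypothetical stand-in for the 142 data frames
file_content <- list(data.frame(x = 1:2), data.frame(x = 3:5))
id_list <- seq_along(file_content)

# length() yields a single number, so this loop body runs once, with id = 2
visited <- integer(0)
for (id in length(id_list)) visited <- c(visited, id)

# seq_along() yields 1, 2, so every data frame gets its own ID
for (id in seq_along(id_list)) {
  file_content[[id]]$period <- id_list[id]
}

# One distinct period value per data frame
periods <- sapply(file_content, function(df) unique(df$period))
```

Here visited ends up holding only the last index, while periods holds one ID per data frame.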

Answer 2

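You were close; the mistake is that you need double brackets, as in id_list[[id]]. With the sample data below, id_list holds a single character vector, so period is filled row by row with its values: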

for (id in length(id_list)) {
  file_content <- lapply(file_content, function(x) {
    x$period <- paste(id_list[[id]], sep = "_")
    x
  }) 
}
# $`1`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3
# 
# $`2`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3
# 
# $`3`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3

You could also try Map() and save a few lines.

Map(`[<-`, file_content, 'period', value=id_list)
# $`1`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3
# 
# $`2`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3
# 
# $`3`
#   X1 X2 X3 X4 period
# 1  1  4  7 10      1
# 2  2  5  8 11      2
# 3  3  6  9 12      3

Data:

file_content <- replicate(3, data.frame(matrix(1:12, 3, 4)), simplify = FALSE) |> setNames(1:3)
id_list <- list(as.character(1:length(file_content)))

r

list

lapply