How can I use map* and mutate to convert a list into a set of additional columns?

Question

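Over the course of a day I have probably tried hundreds of permutations of this code, trying to get a function that does what I want, and I have finally given up. It feels like it should be perfectly doable, and I'm so close!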

I have tried to get to the core of the problem with the reprex below.

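Basically I have a single-row data frame, ~~with a column containing~~ and a list of strings ("concepts"). I want to create an additional column for each of those strings using mutate, ideally with each column taking its name from the string, and then populate the column with the result of a function call. (It doesn't matter right now which function; I just need the infrastructure around the function to work.)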

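I feel, as usual, that I must be missing something obvious... perhaps just a syntax error. I also wonder whether I need purrr::map at all; perhaps a simpler vectorized mapping would do.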

I feel that the fact the new columns are named ..1 rather than the concept name is a clue as to what is going wrong.

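I can create the data frame I want by calling each concept manually (see the end of the reprex), but since the list of concepts differs from one data frame to the next, I want to do this with pipes and tidyverse techniques rather than by hand.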

I have read the following questions looking for help:

How to mutate multiple columns with dynamic variable using purrr:map function?
(R) Cleaner way to use map() with list-columns
Creating new variables with purrr (how does one go about that?)

but none of them has helped me solve the problem I'm having. [Edit: added the last question to that list; it may be the technique I need.]

<!-- language-all: lang-r -->


    # load packages -----------------------------------------------------------

    library(rlang)
    library(dplyr)
    library(tidyr)
    library(magrittr)
    library(purrr)
    library(nomisr)



    # set up initial list of tibbles ------------------------------------------

    df <- list(
      district_population = tibble(
        dataset_title = "Population estimates - local authority based by single year",
        dataset_id = "NM_2002_1"
      ),
      jsa_claimants = tibble(
        dataset_title = "Jobseeker\'s Allowance with rates and proportions",
        dataset_id = "NM_1_1"
      )
    )


    # just use the first tibble for now, for testing --------------------------
    # ideally I want to map across dfs through a list -------------------------

    df <- df[[1]]

    # nitty gritty functions --------------------------------------------------

    get_concept_list <- function(df) {
      dataset_id <- pluck(df, "dataset_id")
      nomis_overview(id = dataset_id,
                     select = c("dimensions", "codes")) %>%
        pluck("value", 1, "dimension") %>%
        filter(concept != "geography") %>%
        pull("concept")
    }

    # get_concept_list() returns the strings I need:
    get_concept_list(df)
    #> [1] "time"     "gender"   "c_age"    "measures"

    # Here is a list of examples of types of map* that do various things,
    # none of which is what I need it to do
    # I'm using toupper() here for simplicity - ultimately I will use
    # get_concept_info() to populate the new columns

    # this creates four new tibbles
    get_concept_list(df) %>% 
      map(~ mutate(df, {{.x}} := toupper(.x)))
    #> [[1]]
    #> # A tibble: 1 x 3
    #>   dataset_title                                               dataset_id ..1  
    #>   <chr>                                                       <chr>      <chr>
    #> 1 Population estimates - local authority based by single year NM_2002_1  TIME 
    #> 
    #> [[2]]
    #> # A tibble: 1 x 3
    #>   dataset_title                                               dataset_id ..1   
    #>   <chr>                                                       <chr>      <chr> 
    #> 1 Population estimates - local authority based by single year NM_2002_1  GENDER
    #> 
    #> [[3]]
    #> # A tibble: 1 x 3
    #>   dataset_title                                               dataset_id ..1  
    #>   <chr>                                                       <chr>      <chr>
    #> 1 Population estimates - local authority based by single year NM_2002_1  C_AGE
    #> 
    #> [[4]]
    #> # A tibble: 1 x 3
    #>   dataset_title                                               dataset_id ..1    
    #>   <chr>                                                       <chr>      <chr>  
    #> 1 Population estimates - local authority based by single year NM_2002_1  MEASUR~

    # this throws an error
    get_concept_list(df) %>% 
      map_chr(~ mutate(df, {{.x}} := toupper(.x)))
    #> Error: Result 1 must be a single string, not a vector of class `tbl_df/tbl/data.frame` and of length 3

    # this creates three extra rows in the tibble
    get_concept_list(df) %>% 
      map_df(~ mutate(df, {{.x}} := toupper(.x)))
    #> # A tibble: 4 x 3
    #>   dataset_title                                               dataset_id ..1    
    #>   <chr>                                                       <chr>      <chr>  
    #> 1 Population estimates - local authority based by single year NM_2002_1  TIME   
    #> 2 Population estimates - local authority based by single year NM_2002_1  GENDER 
    #> 3 Population estimates - local authority based by single year NM_2002_1  C_AGE  
    #> 4 Population estimates - local authority based by single year NM_2002_1  MEASUR~

    # this does the same as map_df
    get_concept_list(df) %>% 
      map_dfr(~ mutate(df, {{.x}} := toupper(.x)))
    #> # A tibble: 4 x 3
    #>   dataset_title                                               dataset_id ..1    
    #>   <chr>                                                       <chr>      <chr>  
    #> 1 Population estimates - local authority based by single year NM_2002_1  TIME   
    #> 2 Population estimates - local authority based by single year NM_2002_1  GENDER 
    #> 3 Population estimates - local authority based by single year NM_2002_1  C_AGE  
    #> 4 Population estimates - local authority based by single year NM_2002_1  MEASUR~

    # this creates a single tibble 12 columns wide
    get_concept_list(df) %>% 
      map_dfc(~ mutate(df, {{.x}} := toupper(.x)))
    #> # A tibble: 1 x 12
    #>   dataset_title dataset_id ..1   dataset_title1 dataset_id1 ..11  dataset_title2
    #>   <chr>         <chr>      <chr> <chr>          <chr>       <chr> <chr>         
    #> 1 Population e~ NM_2002_1  TIME  Population es~ NM_2002_1   GEND~ Population es~
    #> # ... with 5 more variables: dataset_id2 <chr>, ..12 <chr>,
    #> #   dataset_title3 <chr>, dataset_id3 <chr>, ..13 <chr>

    # function to get info on each concept (except geography) -----------------
    # this is the function I want to use eventually to populate my new columns

    get_concept_info <- function(df, concept_name) {
      dataset_id <- pluck(df, "dataset_id")
      nomis_overview(id = dataset_id) %>%
        filter(name == "dimensions") %>%
        pluck("value", 1, "dimension") %>%
        filter(concept == concept_name) %>%
        pluck("codes.code", 1) %>%
        select(name, value) %>%
        nest(data = everything()) %>%
        as.list() %>%
        pluck("data")
    }


    # individual mutate works, for comparison ---------------------------------
    # I can create the kind of table I want manually using a line like the one below

    # df %>% map(~ mutate(., measures = get_concept_info(., concept_name = "measures")))
    df %>% mutate(., measures = get_concept_info(df, "measures"))
    #> # A tibble: 1 x 3
    #>   dataset_title                                        dataset_id measures      
    #>   <chr>                                                <chr>      <list>        
    #> 1 Population estimates - local authority based by sin~ NM_2002_1  <tibble [2 x ~

<sup>Created on 2020-02-10 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup>

Answer 1

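Using !! and := lets you name the columns dynamically. We can then use reduce() to collapse the list output of map(), applying left_join() across all the data frames in the list and joining on the dataset title and id columns.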

    df_2 <- 
      map(get_concept_list(df),
          ~ mutate(df,
                   !!.x := get_concept_info(df, .x))) %>% 
      reduce(left_join, by = c("dataset_title", "dataset_id"))

    df_2
    #> # A tibble: 1 x 6
    #>   dataset_title                                               dataset_id           time         gender          c_age       measures
    #>   <chr>                                                       <chr>      <list<df[,2]>> <list<df[,2]>> <list<df[,2]>> <list<df[,2]>>
    #> 1 Population estimates - local authority based by single year NM_2002_1        [28 x 2]        [3 x 2]      [121 x 2]        [2 x 2]
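
For anyone who wants to try the pattern without calling the Nomis API, here is a minimal sketch that should run offline. It makes two assumptions that are not in the question: `concepts` is a hard-coded stand-in for `get_concept_list(df)`, and `toupper()` stands in for `get_concept_info()`, as in the question's own experiments. The second half, `wide_fold`, is my own variant rather than part of the approach above: it threads the tibble through `reduce()` so that no join is needed. As far as I can tell, `{{ }}` is meant for forwarding a function argument, so inside a purrr lambda it seems to pick up the lambda's internal `..1` placeholder instead of the string, which would explain the column names in the question.

```r
library(dplyr)
library(purrr)

# Stand-in for the question's single-row tibble (values copied from the reprex).
df <- tibble(
  dataset_title = "Population estimates - local authority based by single year",
  dataset_id = "NM_2002_1"
)

# Assumption: hard-coded replacement for get_concept_list(df), so that no
# network call is needed.
concepts <- c("time", "gender", "c_age", "measures")

# Same shape as the answer: one mutate() per concept, then join the pieces.
# `!!.x` unquotes the string so that its value becomes the column name.
wide_join <- concepts %>%
  map(~ mutate(df, !!.x := toupper(.x))) %>%
  reduce(left_join, by = c("dataset_title", "dataset_id"))

# Variant (an assumption, not from the answer): start from df and add one
# column per concept, carrying the growing tibble along in `acc`.
wide_fold <- reduce(
  concepts,
  function(acc, nm) mutate(acc, !!nm := toupper(nm)),
  .init = df
)

# Both results should have the two original columns plus one per concept.
print(names(wide_join))
print(names(wide_fold))
```

The join version keeps each intermediate tibble independent, which can be handy for debugging a single concept; the fold version avoids repeating the key columns and does not depend on them uniquely identifying a row.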

r

dplyr

purrr

tidyeval