R: Output columns as text

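Background: I have survey results in an SPSS .sav file. Some of the survey questions are open-ended, so respondents can type in their own answers. In SPSS, I can simply select one or more columns and the responses are output as a text file, grouped by question (that is, by column), with each response separated by a blank line. That text file can then be used for thematic analysis by assigning codes to phrases or sentences in the text.

I can't seem to find a simple way of doing the same thing in R. All the usual export formats output a table. Selecting columns in RStudio gives text output in which the responses are grouped by respondent rather than by column.

Toy example: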

library(labelled)
library(tidyverse)

comments<-tibble(
  shakey=as.character(c("To be or not to be", "", "Alas poor Yorick", "", "Is this a dagger that I see before me?", "A rose by any other name")), 
  versey=as.character(c("", "The boy stood on the burning deck", "", "Oft in the stilly night", "", "Lars Porsena of Clusium, by the nine gods he swore"))
  )

var_label(comments$shakey)<-"Can you quote some Shakespeare?"
var_label(comments$versey)<-"Can you quote some poetry?"

My desired output is:

*Can you quote some Shakespeare?*

To be or not to be

Alas poor Yorick

Is this a dagger that I see before me?

A rose by any other name

*Can you quote some poetry?*

The boy stood on the burning deck

Oft in the stilly night

Lars Porsena of Clusium, by the nine gods he swore

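That is, the column label as a heading, followed by the non-blank responses for that column listed one after another, separated by blank lines.

The closest I have got so far is: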

comlong<-pivot_longer(comments, everything(),
                     names_to="question",
                     values_to="response") %>%
  arrange(question) %>% 
  filter(response!="")

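While this gets all the responses into one column, it still needs some editing to reach the format required above, which is non-trivial for wider data.

Final result:

Akrun's additional summarise line is, I think, the most elegant. With some tweaking, the output is very similar to SPSS: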

comments %>%
  summarise(across(everything(), ~ c(paste0(sprintf('**%s**', cur_column()), "\n\n", sprintf('*%s*', var_label(.))), .))) %>%
  pivot_longer(cols = everything(), 
               names_to = 'question', 
               values_to = 'response') %>% 
  arrange(question) %>% 
  filter(response != '') %>% 
  select(response) %>% 
  write.table("comments.md",quote=FALSE, eol="\n\n", row.names=FALSE, col.names=FALSE)

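This inserts both the column name and the label (because I find the labels are sometimes not descriptive enough) and writes the result to a markdown file, which can be used in (e.g.) QualCoder. Running: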

pandoc comments.md -o comments.odt
will also produce word-processor output, if needed.

We could first use summarise with across to extract the var_label and prepend it to each column by concatenation, and then use pivot_longer

library(labelled)
library(dplyr)
library(tidyr)
comments %>% 
    summarise(across(everything(), ~ c(var_label(.), .))) %>% 
    pivot_longer(cols = everything(), names_to = 'question', 
        values_to = 'response') %>% 
    arrange(question) %>% 
    filter(response != '')

-output

# A tibble: 9 x 2
  question response                                          
  <chr>    <chr>                                             
1 shakey   Can you quote some Shakespeare?                   
2 shakey   To be or not to be                                
3 shakey   Alas poor Yorick                                  
4 shakey   Is this a dagger that I see before me?            
5 shakey   A rose by any other name                          
6 versey   Can you quote some poetry?                        
7 versey   The boy stood on the burning deck                 
8 versey   Oft in the stilly night                           
9 versey   Lars Porsena of Clusium, by the nine gods he swore

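Or, if we do the reshaping first, another option is to group by 'question', take the first value of 'question' to extract the var_label from the original data, and concatenate it with the responses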

pivot_longer(comments, everything(),
                     names_to="question",
                     values_to="response") %>%
  arrange(question) %>% 
  filter(response!="") %>% 
  group_by(question) %>% 
  summarise(response = c(var_label(comments[[first(question)]]), 
     response), .groups = 'drop')

-output

# A tibble: 9 x 2
  question response                                          
  <chr>    <chr>                                             
1 shakey   Can you quote some Shakespeare?                   
2 shakey   To be or not to be                                
3 shakey   Alas poor Yorick                                  
4 shakey   Is this a dagger that I see before me?            
5 shakey   A rose by any other name                          
6 versey   Can you quote some poetry?                        
7 versey   The boy stood on the burning deck                 
8 versey   Oft in the stilly night                           
9 versey   Lars Porsena of Clusium, by the nine gods he swore
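
To get from a two-column result like the one above to the blank-line-separated text the question asks for, one possibility is to mark the first row of each question as a heading and collapse the responses into a single string. This is only a minimal base R sketch: `res` is a hand-built stand-in for the tibble above (shortened), and `responses.txt` is a hypothetical file name.

```r
# Hand-built stand-in for the result above: the label row comes first per question.
res <- data.frame(
  question = c("shakey", "shakey", "shakey", "versey", "versey"),
  response = c("Can you quote some Shakespeare?", "To be or not to be",
               "Alas poor Yorick", "Can you quote some poetry?",
               "Oft in the stilly night"),
  stringsAsFactors = FALSE
)

# The first row of each question holds the label; wrap it in asterisks.
is_heading <- !duplicated(res$question)
res$response[is_heading] <- sprintf("*%s*", res$response[is_heading])

# Join all entries with a blank line between them and write the text out.
txt <- paste(res$response, collapse = "\n\n")
out_file <- file.path(tempdir(), "responses.txt")  # hypothetical output path
writeLines(txt, out_file)
```

This relies on the rows already being sorted by question with the label first, which is what the arrange(question) step produces here.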

Here is an approach using lapply that writes the output to a text file -

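# Writes one block per column; append = TRUE means re-running adds to an existing output.txt
invisible(lapply(comments, function(x) {
  y <- x[x != '']   # keep only the non-blank responses
  # heading from the column's label attribute, then responses separated by blank lines
  values <- paste0(sprintf('*%s*', attr(x, 'label')), "\n\n", 
            paste0(y, collapse = '\n\n'))
  cat(paste0(values, '\n\n'), file = 'output.txt', append = TRUE)
}))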
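
A variation on the same idea, in case appending is not wanted, is to build the text for every column in memory and write the file once, so that re-running overwrites rather than duplicates. This is only a sketch in base R: `format_column`, `comments2` and `output_once.txt` are hypothetical names, and it assumes the labels live in the "label" attribute, as the lapply code above already does.

```r
# Build the text for one column: heading from the "label" attribute,
# then the non-blank responses, each separated by a blank line.
format_column <- function(x) {
  y <- x[!is.na(x) & x != ""]
  paste(c(sprintf("*%s*", attr(x, "label")), y), collapse = "\n\n")
}

# Small stand-in data set with labels set directly on the attribute.
comments2 <- data.frame(
  shakey = c("To be or not to be", "", "Alas poor Yorick"),
  versey = c("", "Oft in the stilly night", ""),
  stringsAsFactors = FALSE
)
attr(comments2$shakey, "label") <- "Can you quote some Shakespeare?"
attr(comments2$versey, "label") <- "Can you quote some poetry?"

# One text block per column, then a single write that replaces the file.
blocks <- vapply(comments2, format_column, character(1))
out_file2 <- file.path(tempdir(), "output_once.txt")  # hypothetical output path
writeLines(paste(blocks, collapse = "\n\n"), out_file2)
```

Selecting a subset of columns first (for example `comments2["shakey"]`) would restrict the output to those questions.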