R: Output columns as text
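Background: I have survey results in an SPSS .sav file. Some of the survey questions are open-ended, where respondents can type in their own answers. In SPSS, I can select just one or more columns and the responses are output as a text file, grouped by question (column contents), with each response separated by a blank line. That text file can then be used for thematic analysis by assigning codes to phrases or sentences in the text.
I can't seem to find a simple way to do the same thing in R. All the usual export formats output a table. Selecting columns in RStudio gives text output in which the responses are grouped by respondent rather than by column.
Toy example: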
library(labelled)
library(tidyverse)
comments <- tibble(
  shakey = c("To be or not to be", "", "Alas poor Yorick", "",
             "Is this a dagger that I see before me?", "A rose by any other name"),
  versey = c("", "The boy stood on the burning deck", "", "Oft in the stilly night",
             "", "Lars Porsena of Clusium, by the nine gods he swore")
)
var_label(comments$shakey) <- "Can you quote some Shakespeare?"
var_label(comments$versey) <- "Can you quote some poetry?"
The output I want is:
*Can you quote some Shakespeare?*

To be or not to be

Alas poor Yorick

Is this a dagger that I see before me?

A rose by any other name

*Can you quote some poetry?*

The boy stood on the burning deck

Oft in the stilly night

Lars Porsena of Clusium, by the nine gods he swore
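That is, the column label as a heading, followed by the non-blank responses from that column listed one after another, separated by blank lines.
The closest I have got so far is: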
comlong <- pivot_longer(comments, everything(),
                        names_to = "question",
                        values_to = "response") %>%
  arrange(question) %>%
  filter(response != "")
This gets all the responses into one column, but it still takes some editing to get it into the format required above, which is non-trivial for wider data.
Final result:
Akrun's additional summarise line is, I think, the most elegant. Tweaked to give output very similar to SPSS:
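# Note: with dplyr >= 1.1.0, summarise() warns when it returns several rows
# per group; reframe() is the replacement for that use.
comments %>%
  # prepend "**column name**", a blank line, and "*label*" to each column
  summarise(across(everything(),
                   ~ c(paste0(sprintf('**%s**', cur_column()), "\n\n",
                              sprintf('*%s*', var_label(.))), .))) %>%
  pivot_longer(cols = everything(),
               names_to = 'question',
               values_to = 'response') %>%
  arrange(question) %>%
  filter(response != '') %>%
  select(response) %>%
  # eol = "\n\n" puts a blank line after every response
  write.table("comments.md", quote = FALSE, eol = "\n\n",
              row.names = FALSE, col.names = FALSE)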
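This inserts both the column name and the label (because I found that the label is sometimes not descriptive enough) and outputs a markdown file, which can be used in (e.g.) Qualcoder. Running: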
pandoc comments.md -o comments.odt
will also produce word-processor output if needed.
We can first use summarise with across to extract the var_label and prepend it to each column by concatenation, and then use pivot_longer
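library(labelled)
library(dplyr)
library(tidyr)
# Note: with dplyr >= 1.1.0, summarise() warns when it returns several rows
# per group; reframe() is the replacement for that use.
comments %>%
  # put each column's label in front of its values
  summarise(across(everything(), ~ c(var_label(.), .))) %>%
  pivot_longer(cols = everything(), names_to = 'question',
               values_to = 'response') %>%
  arrange(question) %>%
  filter(response != '')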
-output
# A tibble: 9 x 2
  question response
  <chr>    <chr>
1 shakey   Can you quote some Shakespeare?
2 shakey   To be or not to be
3 shakey   Alas poor Yorick
4 shakey   Is this a dagger that I see before me?
5 shakey   A rose by any other name
6 versey   Can you quote some poetry?
7 versey   The boy stood on the burning deck
8 versey   Oft in the stilly night
9 versey   Lars Porsena of Clusium, by the nine gods he swore
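Or, if we reshape first, another option is to group by 'question', use the first value of 'question' to subset the original data and extract the var_label from it, and then concatenate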
pivot_longer(comments, everything(),
             names_to = "question",
             values_to = "response") %>%
  arrange(question) %>%
  filter(response != "") %>%
  group_by(question) %>%
  # look up the label in the original data by column name and put it first
  summarise(response = c(var_label(comments[[first(question)]]),
                         response), .groups = 'drop')
-output
# A tibble: 9 x 2
  question response
  <chr>    <chr>
1 shakey   Can you quote some Shakespeare?
2 shakey   To be or not to be
3 shakey   Alas poor Yorick
4 shakey   Is this a dagger that I see before me?
5 shakey   A rose by any other name
6 versey   Can you quote some poetry?
7 versey   The boy stood on the burning deck
8 versey   Oft in the stilly night
9 versey   Lars Porsena of Clusium, by the nine gods he swore
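To get from a long result like this to the text layout asked for in the question, one possible sketch is to mark the first row of each question as the heading and join everything with blank lines. The `comlong` data frame below is a shortened, hand-typed stand-in for the output above, and `is_header`, `lines_out` and `out_file` are names made up for the illustration.

```r
# Assumption: a shortened copy of the long result, typed in by hand so the
# example runs with base R only.
comlong <- data.frame(
  question = c("shakey", "shakey", "shakey", "versey", "versey"),
  response = c("Can you quote some Shakespeare?", "To be or not to be",
               "Alas poor Yorick", "Can you quote some poetry?",
               "Oft in the stilly night"),
  stringsAsFactors = FALSE
)

# The label is the first row of each question, so it is the first
# occurrence of each value in `question`.
is_header <- !duplicated(comlong$question)

# Wrap headings in asterisks, leave the responses untouched.
lines_out <- ifelse(is_header, sprintf("*%s*", comlong$response), comlong$response)

# Join with a blank line between entries and write the file in one call.
out_file <- tempfile(fileext = ".txt")
writeLines(paste(lines_out, collapse = "\n\n"), out_file)
```

This relies on the rows already being sorted by question with the label first, which is what the pipelines above produce.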
Here is an option using lapply that writes the output to a file:
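lapply(comments, function(x) {
  # keep only the non-blank responses of this column
  y <- x[x != '']
  # heading from the 'label' attribute, then the responses, blank-line separated
  values <- paste0(sprintf('*%s*', attr(x, 'label')), "\n\n",
                   paste0(y, collapse = '\n\n'))
  # append = TRUE adds to output.txt, so delete the file before re-running
  cat(paste0(values, '\n\n'), file = 'output.txt', append = TRUE)
})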
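Because `append = TRUE` keeps adding to `output.txt` on every run, a variant that collects all the text first and writes the file once may be easier to re-run. The following is only a sketch in base R: `format_column`, `blocks` and `out_file` are hypothetical names, the toy data is shortened, and the labels are set through the `label` attribute directly (the same attribute the code above reads with `attr(x, 'label')`).

```r
# Shortened toy data; labels stored in the "label" attribute of each column.
comments <- data.frame(
  shakey = c("To be or not to be", "", "Alas poor Yorick"),
  versey = c("", "The boy stood on the burning deck", ""),
  stringsAsFactors = FALSE
)
attr(comments$shakey, "label") <- "Can you quote some Shakespeare?"
attr(comments$versey, "label") <- "Can you quote some poetry?"

# Hypothetical helper: one column -> its heading followed by the
# non-blank, non-missing responses.
format_column <- function(x) {
  c(sprintf("*%s*", attr(x, "label")), x[!is.na(x) & x != ""])
}

# Build the text for all columns, then write the file in a single call.
blocks <- unlist(lapply(comments, format_column), use.names = FALSE)
out_file <- tempfile(fileext = ".txt")
writeLines(paste(blocks, collapse = "\n\n"), out_file)
```

Writing once with `writeLines` overwrites the target file, so repeated runs do not accumulate duplicate blocks.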