How to separate multiple-choice phrases (Google Forms) into different columns?

I have seen a few threads about this problem (here and here), but in both cases the examples have several options separated by commas. My case is a bit different.

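  1. In my survey, respondents can select several phrases (and, to make things harder, some of the phrases contain commas).
  2. There is an "other reasons" option where respondents can write their own sentence.
  3. Every sentence starts with a capital letter (and has no other capital letters in the middle). The exception is the "other reasons" option, which may start with a lowercase letter depending on how the respondent wrote it.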

The list of predefined options is stored as follows:

Q1.list <- c("Phrase one without comma", "Phrase two also without comma", "Phrase three, with comma")

The dataset looks like this:

Q1
"Phrase one without comma, Phrase two also without comma"
"Phrase two also without comma, Phrase three, with comma"
"Phrase three, with comma, Phrase four, other reasons"
"Phrase one without comma, Phrase four, other reasons, Phrase five other reasons"

I would like to transform the dataset like this:

Q1.1          Q1.2          Q1.3          Others
1             1             0             0
0             1             1             0
0             0             1             "Phrase four, other reasons"
1             0             0             "Phrase four, other reasons, Phrase five other reasons [and everything else that is not on the Q1.list]"

Can anyone shed some light on how to solve this?

You can use dplyr & co. and proceed as follows.

library(dplyr)
library(stringr)

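data %>%
  transmute(
            # One 0/1 indicator per predefined option; unary + turns TRUE/FALSE into 1/0
            Q1.1 = +(str_detect(Q1, Q1.list[1])),
            Q1.2 = +(str_detect(Q1, Q1.list[2])),
            Q1.3 = +(str_detect(Q1, Q1.list[3])),
            # Remove every predefined phrase; what remains is the free-text answer
            Others = str_remove_all(Q1, str_c(Q1.list, collapse = '|')),
            # Drop the leading ", " separator left behind by the removal
            Others = if_else(str_sub(Others, 1, 2) == ', ',
                             str_sub(Others, 3),
                             Others),
            # Use '0' when only predefined options were selected
            Others = if_else(Others == '', '0', Others))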

#    Q1.1  Q1.2  Q1.3 Others                                               
#   <int> <int> <int> <chr>                                                
# 1     1     1     0 0                                                    
# 2     0     1     1 0                                                    
# 3     0     0     1 Phrase four, other reasons                           
# 4     1     0     0 Phrase four, other reasons, Phrase five other reasons
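
If the list of predefined options is longer than three, the same idea can be written without typing one column per option. The following is only a sketch under some assumptions: the vector `responses` is a made-up two-row stand-in for the `Q1` column, the free-text answer is assumed to come last in each response, and `fixed()` is used so that each phrase is matched literally rather than as a regular expression (this matters only if a phrase contains characters such as `.` or `(`).

```r
library(stringr)

# Predefined options, as in the question
Q1.list <- c("Phrase one without comma",
             "Phrase two also without comma",
             "Phrase three, with comma")

# Hypothetical toy input standing in for data$Q1
responses <- c("Phrase one without comma, Phrase two also without comma",
               "Phrase three, with comma, Phrase four, other reasons")

# One 0/1 vector per predefined option, matched literally via fixed()
flags <- lapply(Q1.list, function(p) as.integer(str_detect(responses, fixed(p))))
names(flags) <- paste0("Q1.", seq_along(Q1.list))
result <- as.data.frame(flags)

# Strip each predefined phrase in turn; the remainder is the free-text answer
others <- responses
for (p in Q1.list) {
  others <- str_remove_all(others, fixed(p))
}

# Drop the leading ", " separators left behind, then mark empty remainders as "0"
others <- str_remove(others, "^(, )+")
others[others == ""] <- "0"
result$Others <- others

print(result)
```

Note that this sketch, like the code above, would misbehave if one predefined phrase were a substring of another, because the shorter phrase would also be detected inside the longer one.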

Data

data <- structure(list(Q1 = c("Phrase one without comma, Phrase two also without comma", 
"Phrase two also without comma, Phrase three, with comma", "Phrase three, with comma, Phrase four, other reasons", 
"Phrase one without comma, Phrase four, other reasons, Phrase five other reasons"
)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"
))

Q1.list <- c("Phrase one without comma", "Phrase two also without comma", "Phrase three, with comma")
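
As a side note, the capitalization rule mentioned in the question suggests another way to look at a single response: split it only at a `", "` that is followed by an uppercase letter. This is a rough sketch, not a replacement for the code above. It assumes every phrase, including the free-text one, starts with a capital letter, so a free-text answer that begins in lowercase would stay glued to the phrase before it. The variable names `response`, `parts`, `selected` and `others` are made up for the illustration.

```r
library(stringr)

Q1.list <- c("Phrase one without comma",
             "Phrase two also without comma",
             "Phrase three, with comma")

# A single response taken from the example data
response <- "Phrase one without comma, Phrase four, other reasons, Phrase five other reasons"

# Split at ", " only when the next character is an uppercase letter (lookahead)
parts <- str_split(response, ", (?=[A-Z])")[[1]]

# 0/1 indicator for each predefined option
selected <- as.integer(Q1.list %in% parts)

# Everything that is not a predefined option
others <- setdiff(parts, Q1.list)

print(parts)
print(selected)
print(others)
```

With this split, commas inside a phrase are harmless as long as the text after them starts in lowercase.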