R - return string within first set of quotation marks

Question

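I have a data frame made up of thousands of records imported from a .csv. One of the variables in the data frame is a free-text field derived from a lexicon. The data rows are in the following format.

Note that the lines below are not vectors but rows of character data in the variable 'date' (they just happen to look exactly like vectors):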

c("9th november 2018", "27th october 2018"),

c("three months", "6 months"),

c("24th december ", "2th january 2019", "25th january 2019")

Basically, I want to take the string from the first set of quotation marks and drop the rest, so:

c("9th november 2018", "27th october 2018") 
9th november 2018

I am using the following code, but it takes the string from the last set of quotation marks:

LexiDate3$finaldat3 <- sub('.*,"*(.*?) *" *', '\\1', LexiDate3$Date_new)

Which returns:

27th october 2018")

Not ideal, and I can't for the life of me figure it out. Any help would be much appreciated.

Thanks.

Answer 1

How does this look? Note that the quotation marks around the output are put there by the print method; they are not embedded in the string.

library(stringr)
test <- 'c("9th november 2018", "27th october 2018"),'
str_extract(test,'(?<=")(.*?)(?=")')
#> [1] "9th november 2018"
Created on 2019-02-21 by the reprex package (v0.2.1)
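
As for why the original attempt grabs the last string: the leading `.*,` in that pattern is greedy, so it swallows everything up to the last comma before the capture group gets a turn. If you would rather stay in base R, something along these lines might work. This is only a sketch: `dates` is made-up sample data standing in for `LexiDate3$Date_new`, and it assumes every row contains at least one pair of double quotes.

```r
# Hypothetical sample data mimicking the rows shown in the question
dates <- c(
  'c("9th november 2018", "27th october 2018")',
  'c("three months", "6 months")',
  'c("24th december ", "2th january 2019", "25th january 2019")'
)

# [^"]* cannot cross a quotation mark, so the capture group is pinned
# to the text between the first pair of quotes; .*$ discards the rest.
first_quoted <- sub('^[^"]*"([^"]*)".*$', '\\1', dates)

print(first_quoted)
```

Both `sub()` and `str_extract()` are vectorised, so either can be pointed at the whole column in one go. Wrapping the result in `trimws()` would also drop stray whitespace such as the trailing space in "24th december ", if that matters for your data.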
