R - gsub replace and clean in same line of code
I would like to know whether it is possible to replace characters and remove whitespace in a single line of code.
string = c("av13 personal care", "-11", "av13 personal care", "av14 personal services",
"av15 meals", "av29 visit friends", "av17 free time travel",
"av27 pubs", "av28 restaurants", "av28 restaurants", "av29 visit friends",
"av37 conversation", "av14 personal services", "av13 personal care",
"av13 personal care", "av13 personal care", "-11", "av13 personal care",
"av13 personal care", "av15 meals", "av6 cook, wash up", "av40 other leisure",
"av37 conversation", "av21 walking", "av40 other leisure", "av15 meals",
"av6 cook, wash up", "av13 personal care", "av21 walking", "av17 free time travel",
"av15 meals", "av35 read papers, magazines", "av27 pubs", "av13 personal care",
"-11", "av13 personal care", "av2 paidwork at home", "av25 dances or parties",
"av1 paid work", "av1 paid work", "av1 paid work", "av1 paid work",
"av2 paidwork at home", "av2 paidwork at home", "av13 personal care",
"av17 free time travel", "av29 visit friends", "av17 free time travel",
"av13 personal care", "-11", "av13 personal care")
Instead of always doing this:
clean = gsub(pattern = "[A-z]", replacement = "", x = string)
clean = gsub(pattern = "[[:blank:]]", replacement = "", x = clean)
is it possible to put [[:blank:]] directly into the first line?
I also have a problem with "," (as in "35,"). How can I remove it in the first line as well?
Small update: I realised that in my (huge) dataset I also end up with a / after running gsub on my string. Could you help me remove that too?
Here it is in one line:
gsub(pattern = "[A-z ,/]", replacement = "", x = string) # added / to address the update
or
gsub(pattern = "[A-z]| |,|/", replacement = "", x = string)
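As a quick check, the two one-liners above can be compared on a small sample. This is only a sketch: the vector `sample_strings` is made up for illustration (the trailing `/` imitates the case described in the update) and is not taken from the original data.

```r
# Hypothetical sample; the last element imitates the "/" case from the update
sample_strings <- c("av13 personal care", "-11",
                    "av35 read papers, magazines", "av1 paid work/")

# Variant 1: one bracket expression listing everything to delete
one_class <- gsub(pattern = "[A-z ,/]", replacement = "", x = sample_strings)

# Variant 2: the same characters expressed as alternatives
alternation <- gsub(pattern = "[A-z]| |,|/", replacement = "", x = sample_strings)

print(one_class)
print(identical(one_class, alternation))
```

Both forms delete the same set of characters, so the choice between them is mostly a matter of readability.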
As hwnd points out, the range [A-z] actually includes several special characters that sit between A-Z and a-z in the ASCII table (relevant SO answer and the ASCII table). These special characters are: [, \, ], ^, _ and `
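The difference between the two ranges can be seen on a string that contains those six characters. This is a minimal sketch: the test string `x` is invented for the demonstration, and the behaviour of `[A-z]` shown in the comments assumes R's default regex engine orders the range by code point, as is usually the case.

```r
# Hypothetical test string containing _ [ \ ] ^ and ` between letters and digits
x <- "av_13[\\]^`care"

# [A-z] spans code points 65 to 122, so the six punctuation marks are removed too
wide_range <- gsub("[A-z]", "", x)

# [a-zA-Z] matches letters only, so the punctuation survives
letters_only <- gsub("[a-zA-Z]", "", x)

print(wide_range)
print(letters_only)
```

If the data could contain underscores or brackets that should be kept, the explicit `[a-zA-Z]` form is the safer choice.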
The character class [A-z] also matches other characters, so I would use:
gsub('[a-zA-Z\t ,]', '', string)
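Note: the POSIX bracket expression [:blank:] matches space and tab characters, which is why \t is included in the class above.
If you only need to remove spaces, then: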
gsub('[a-zA-Z, ]', '', string)
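Putting the pieces together, here is a short sketch on a subset of the question's data. Adding `/` to the class and converting the result with `as.integer` are assumptions about what the asker ultimately wants, not something stated in the thread; the names `subset_strings`, `clean` and `codes` are illustrative.

```r
# A few values taken from the question's vector
subset_strings <- c("av13 personal care", "-11", "av6 cook, wash up",
                    "av35 read papers, magazines", "av2 paidwork at home")

# Remove letters, commas, spaces and slashes in one call
clean <- gsub("[a-zA-Z, /]", "", subset_strings)
print(clean)

# Assumed follow-up step: turn the remaining codes into integers
codes <- as.integer(clean)
print(codes)
```

The minus sign in "-11" is left alone because `-` is not part of the character class.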