gsub different phrases of graphical characters

I have a data frame containing several rows of character strings, for example:

hello my name is sam <U+ab93>
hi i love fast cars <U+e>
my favourite colour is yellow <U+E><U+c><U+60>

How can I remove all of the meaningless <U+...> terms from this data frame?

I tried apply(document, 1, function(x) gsub("<[:graph:]>", "", x)) but it doesn't work.

For

document = c("hello my name is sam <U+ab93>", 
             "hi i love fast cars <U+e>", 
             "my favourite colour is yellow <U+E><U+c><U+60>")

it would be

gsub("<[[:graph:]]+>", "", document )


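[:graph:] on its own is not a valid POSIX character class: the class name has to sit inside a bracket expression, as in [[:graph:]]. The + is also needed so that more than one character between < and > can be matched.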
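
To make the difference concrete, here is a minimal sketch that runs both patterns on the document vector from above. The names wrong and fixed are only illustrative. It assumes base R's default regex engine, and the note on the single-bracket pattern reflects what the question reports (nothing gets removed) rather than a guarantee for every engine.

```r
document <- c("hello my name is sam <U+ab93>",
              "hi i love fast cars <U+e>",
              "my favourite colour is yellow <U+E><U+c><U+60>")

# Pattern from the question (single brackets, no quantifier).
# On this input it is expected to leave the strings as they are.
wrong <- gsub("<[:graph:]>", "", document)

# Class name inside a bracket expression, plus "+" for one or more characters.
fixed <- gsub("<[[:graph:]]+>", "", document)

print(fixed)
```

Note that the space in front of each removed term stays in the result, so the strings end with a trailing blank.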


Alternatively, you could also try

gsub("<[^>]*>", "", document)
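
Removing only the <...> part leaves a trailing space behind. One possible variation, offered as a sketch rather than as part of the original answer, is to let the pattern also consume any spaces directly in front of each term (the name cleaned is illustrative):

```r
document <- c("hello my name is sam <U+ab93>",
              "hi i love fast cars <U+e>",
              "my favourite colour is yellow <U+E><U+c><U+60>")

# " *" matches zero or more spaces before the tag;
# "[^>]*" matches everything up to the next closing ">".
cleaned <- gsub(" *<[^>]*>", "", document)

print(cleaned)
```

Calling trimws() on the result of the original pattern would be another way to drop the trailing blank.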

Using gsub

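text <- c("hello my name is sam <U+ab93>", "hi i love fast cars <U+e>",
          "my favourite colour is yellow <U+E><U+c><U+60>")

# Keep DOC as character rather than factor (only matters on R < 4.0)
df <- data.frame(DOC = text, stringsAsFactors = FALSE)

# Note: ".*" is greedy, so this removes everything from the first "<"
# to the last ">" in each string, including any text in between.
df$DOC <- gsub(pattern = "<.*>", replacement = "", x = df$DOC)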
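
One caveat with "<.*>" is that ".*" is greedy. On the sample data this makes no difference, because the <U+...> terms always sit together at the end of the string. The sketch below uses a made-up string (x is hypothetical, not from the question) to show what happens when ordinary text sits between two terms:

```r
# Hypothetical input with text between two <U+...> terms
x <- "red <U+E> green <U+c> blue"

# Greedy: matches from the first "<" to the last ">", so "green" is lost
greedy <- gsub("<.*>", "", x)

# Negated class: each term is matched separately, the text in between survives
per_tag <- gsub("<[^>]*>", "", x)

print(greedy)   # "red  blue"
print(per_tag)  # "red  green  blue"
```

If the terms can appear anywhere in the text, the negated character class is probably the safer choice.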