Read in a dataframe from .txt file with special characters in R
One column of my data frame holds speech transcriptions that contain a lot of special characters, like this:
">like I don't understand< sorry like how old's your mom¿"
"°ye[a:h]°"
"°I don't know°"
When I read the data frame in with read.table, I get the following output, in which several odd new characters have been wrongly inserted:
Output in R:
">like I don't understand< sorry like how old's your mom¿"
"°ye[a:h]°"
"°I don't know°"
How can I fix this?
You can specify the encoding either when you import the data or after it has been imported.
Option 1
df <- read.table('path/file.ext', encoding = "UTF-8", ...)
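To try the import-time approach without the original file, here is a minimal sketch. The sample rows, the column name `utterance`, and the temporary file are made up for illustration, and it assumes the real file is UTF-8 with tab-separated columns. Note that `encoding` only marks the strings that are read as UTF-8, whereas `fileEncoding = "UTF-8"` would instead ask R to re-encode the file into the session's native encoding, which may suit some setups better.

```r
# Hypothetical sample data: a header plus two transcription lines.
lines <- c("utterance", "\u00B0ye[a:h]\u00B0", "\u00B0I don't know\u00B0")

# Write the lines as raw UTF-8 bytes to a temporary file.
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "wb")
writeLines(enc2utf8(lines), con, useBytes = TRUE)
close(con)

# quote = "" keeps the apostrophe in "don't" from being read as a quote
# character; comment.char = "" disables comment handling.
df <- read.table(tmp, header = TRUE, sep = "\t", quote = "",
                 comment.char = "", encoding = "UTF-8",
                 stringsAsFactors = FALSE)

print(df$utterance)
print(Encoding(df$utterance))

unlink(tmp)
```

If the degree signs still come out garbled with this setting, the file is probably not UTF-8 at all, and the encoding name passed in would need to match whatever the file really uses.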
Option 2
x <- c(
">like I don't understand< sorry like how old's your mom¿",
"°ye[a:h]°",
"°I don't know°")
Encoding(x) <- 'UTF-8'
print(x)
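One caveat with the second option: `Encoding<-` only changes how R labels the bytes, it does not convert them. The sketch below illustrates this with byte sequences built by hand (the byte values are my own construction, not taken from the original data). It assumes that marking is the right fix only when the bytes already are UTF-8; for a file that is really Latin-1, `iconv` would be the tool to reach for.

```r
# UTF-8 bytes for "°ye[a:h]°", built from raw bytes so the string starts unmarked.
bytes <- as.raw(c(0xC2, 0xB0, 0x79, 0x65, 0x5B, 0x61, 0x3A, 0x68, 0x5D, 0xC2, 0xB0))
x <- rawToChar(bytes)
Encoding(x)                        # "unknown"

# Declare the bytes to be UTF-8; the bytes themselves are left untouched.
Encoding(x) <- "UTF-8"
Encoding(x)                        # "UTF-8"
identical(charToRaw(x), bytes)     # the underlying bytes did not change

# If the bytes were Latin-1 instead (0xB0 is the degree sign there),
# they would have to be converted rather than merely relabelled.
latin1 <- rawToChar(as.raw(c(0xB0, 0x79, 0x65)))
converted <- iconv(latin1, from = "latin1", to = "UTF-8")
print(converted)
```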