Read a data frame from a .txt file with special characters in R

One column of my data frame holds speech transcriptions that contain many special characters, like this:

">like I don't understand< sorry like how old's your mom¿"
"°ye[a:h]°"
"°I don't know°"

When I read the data frame in with read.table, I get the following output, in which several odd new characters have been wrongly inserted:

Output in R:

">like I don't understand< sorry like how old's your mom¿"
"°ye[a:h]°"
"°I don't know°"

How can I fix this?

You can declare the encoding either when you import the file or afterwards, on the data you have already imported.

Option 1

df <- read.table('path/file.ext', encoding = "UTF-8", ...)
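
A minimal round-trip sketch of Option 1, assuming the file is tab-separated and saved as UTF-8. The temporary file, the `id` and `text` column names, and the shortened sample strings are made up for illustration. Setting `quote = ""` keeps the apostrophes in the transcriptions from being treated as quote characters.

```r
# Sample transcriptions, written with \u escapes so the script does not
# depend on the encoding of the source file itself.
x <- c("how old's your mom\u00bf",
       "\u00b0ye[a:h]\u00b0",
       "\u00b0I don't know\u00b0")

# Write a small tab-separated file as UTF-8 bytes (hypothetical temp file).
path <- tempfile(fileext = ".txt")
lines <- c("id\ttext", paste(seq_along(x), x, sep = "\t"))
writeLines(enc2utf8(lines), path, useBytes = TRUE)

# encoding = "UTF-8" marks the strings that are read as UTF-8.
# quote = "" and comment.char = "" stop ' and # from being interpreted.
df <- read.table(path, header = TRUE, sep = "\t", quote = "",
                 comment.char = "", encoding = "UTF-8",
                 stringsAsFactors = FALSE)

print(df$text)
```

read.table also accepts a `fileEncoding` argument, which re-encodes the file while it is being read rather than only marking the resulting strings. Which of the two helps can depend on the platform and locale, so it may be worth trying both.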

Option 2

x <- c(
  ">like I don't understand< sorry like how old's your mom¿",
  "°ye[a:h]°",
  "°I don't know°")

Encoding(x) <- 'UTF-8'

print(x)
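
A small sketch of what Option 2 does, under the assumption that the bytes in memory are already valid UTF-8 and only the encoding mark is wrong. `Encoding<-` relabels the string without touching its bytes. The variable `s` and the deliberately mislabelled degree sign are illustrative only.

```r
# The UTF-8 bytes for the degree sign (0xC2 0xB0), mislabelled as Latin-1.
s <- rawToChar(as.raw(c(0xC2, 0xB0)))
Encoding(s) <- "latin1"
n_before <- nchar(s)   # 2: each byte is treated as its own character

# Declaring the correct encoding changes the interpretation, not the bytes.
Encoding(s) <- "UTF-8"
n_after <- nchar(s)    # 1: the two bytes now form a single character

print(c(before = n_before, after = n_after))
print(charToRaw(s))    # still c2 b0
```

If the bytes really are in another encoding, relabelling will not help. In that case `iconv(x, from = "latin1", to = "UTF-8")` converts the bytes themselves.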