Is it possible to directly escape the "\t" [sic] separator while importing a table?

Question

I recently received a .txt file in a very unusual format that I need to process:

"Pony ID"/t"colour"/t"location"/t"age"
"Pony A"/t"white;brown;black"/t"stable1"/t12
"Pony B"/t"pink"/t"stable2"/t13
"Pony C"/t"white"/t"stable3"/t9

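So when I try to import it with the classic reading functions from utils or readr (e.g. read_tsv, read.delim), I end up with a single column, presumably because sep="/t" is not interpreted as a literal separator. The code below solves it: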

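library(tidyverse)  # provides str_sub() via stringr

# Split on "/" only, so every field after the first keeps a leading "t"
a<-read.delim("ponies.txt",sep="/", header = FALSE)
# Drop that leading "t" from all columns except the first
a<-data.frame(cbind(a[,1],sapply(a[,-1], function(x) str_sub(x,2))))
# Promote the first row to column names, then remove it from the data
colnames(a)<-a[1,]
a<-a[-1,]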

  Pony ID            colour location age
2  Pony A white;brown;black  stable1  12
3  Pony B              pink  stable2  13
4  Pony C             white  stable3   9

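I hope this question isn't too obscure, but I'm curious: does anyone know whether there is a way to directly escape the literal "/t" delimiter during the import?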

Answer 1

This can be made more compact by reading the file with readLines, changing the delimiter with gsub, and then reading the result with read.csv/read.table:

read.csv(text = gsub("/t", ",", gsub('"', '', readLines("ponies.txt"))), 
       check.names = FALSE)

-output

  Pony ID            colour location age
1  Pony A white;brown;black  stable1  12
2  Pony B              pink  stable2  13
3  Pony C             white  stable3   9
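
A variation on the same idea, offered as a minimal sketch rather than a definitive method: instead of stripping the quotes and switching to commas, replace the literal two-character "/t" with a real tab and let read.delim handle the quoting. The temporary file below is only a stand-in for ponies.txt, and the names path, lines and ponies are illustrative. This assumes "/t" never occurs inside a field value, since gsub would replace it there as well.

```r
# Stand-in for ponies.txt: write the sample from the question to a temp file
path <- tempfile(fileext = ".txt")
writeLines(c(
  '"Pony ID"/t"colour"/t"location"/t"age"',
  '"Pony A"/t"white;brown;black"/t"stable1"/t12',
  '"Pony B"/t"pink"/t"stable2"/t13',
  '"Pony C"/t"white"/t"stable3"/t9'
), path)

# Replace the literal "/t" with a real tab; fixed = TRUE disables regex matching
lines <- gsub("/t", "\t", readLines(path), fixed = TRUE)

# read.delim uses sep = "\t" and quote = "\"" by default, so the quoted
# fields are unwrapped during parsing; check.names = FALSE keeps "Pony ID"
ponies <- read.delim(text = lines, check.names = FALSE)
print(ponies)
```

Because the quotes are kept until parsing, a field that happens to contain a comma would not be split, which is the main reason to prefer a tab here over the comma substitution.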


import

r

data-cleaning