read_tsv parsing data incorrectly into R
I'm currently working on Mac OS and trying to use read_tsv from the tidyverse to read in the txt file below:
igg oxygen
881 34.6
1290 45
2147 62.3
1909 58.9
1282 42.5
1530 44.3
2067 67.9
1982 58.5
1019 35.6
1651 49.6
752 33
1687 52
1782 61.4
1529 50.2
969 34.1
1660 52.5
2121 69.9
1382 38.8
1714 50.6
1959 69.4
1158 37.4
965 35.1
1456 43
1273 44.1
1418 49.8
1743 54.4
1997 68.5
2177 69.5
1965 63
1264 43.2
However, when I try to read in the file, I get the following problem:
exerimmun <- read_tsv(file = "./exerimmun.txt")
── Column specification ──────────────────────────────────────────────────────────────────────────────────────────────
cols(
i = col_logical(),
col_logical()
)
Warning: 124 parsing failures.
row col expected actual file
1 i 1/0/T/F/TRUE/FALSE './exerimmun.txt'
1 -- 2 columns 1 columns './exerimmun.txt'
2 i 1/0/T/F/TRUE/FALSE './exerimmun.txt'
2 1/0/T/F/TRUE/FALSE './exerimmun.txt'
3 i 1/0/T/F/TRUE/FALSE './exerimmun.txt'
... ... .................. ......... ...........................................
See problems(...) for more details.
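As far as I can tell, the data is laid out correctly in the txt file, so I'm not sure why I'm having trouble reading it into R. This is the result when I run problems(exerimmun):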
> problems(exerimmun)
# A tibble: 124 x 5
row col expected actual file
<int> <chr> <chr> <chr> <chr>
1 1 "i" 1/0/T/F/TRUE/FALSE "" './exerimmun.txt'
2 1 NA 2 columns "1 columns" './exerimmun.txt'
3 2 "i" 1/0/T/F/TRUE/FALSE "" './exerimmun.txt'
4 2 "" 1/0/T/F/TRUE/FALSE "" './exerimmun.txt'
5 3 "i" 1/0/T/F/TRUE/FALSE "" './exerimmun.txt'
6 3 NA 2 columns "1 columns" './exerimmun.txt'
7 4 "i" 1/0/T/F/TRUE/FALSE "" './exerimmun.txt'
8 4 "" 1/0/T/F/TRUE/FALSE "" './exerimmun.txt'
9 5 "i" 1/0/T/F/TRUE/FALSE "" './exerimmun.txt'
10 5 NA 2 columns "1 columns" './exerimmun.txt'
# … with 114 more rows
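To me, this should work fine, since the data has only two columns. After going through the documentation on how to read txt files, I'm not sure what I'm missing.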
Edit: I tried read.table("./exerimmun.txt") and got the following result:
Error in type.convert.default(data[[i]], as.is = as.is[i], dec = dec, :
invalid multibyte string at '<ff><fe>i'
In addition: Warning messages:
1: In read.table(file = "./exerimmun.txt") :
line 1 appears to contain embedded nulls
2: In read.table(file = "./exerimmun.txt") :
line 2 appears to contain embedded nulls
3: In read.table(file = "./exerimmun.txt") :
line 3 appears to contain embedded nulls
4: In read.table(file = "./exerimmun.txt") :
line 4 appears to contain embedded nulls
5: In read.table(file = "./exerimmun.txt") :
line 5 appears to contain embedded nulls
6: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
embedded nul(s) found in input
Thanks in advance.
Maybe the easiest approach for now is simply to avoid that function? You can always convert to a tibble afterwards.
Here I saved your data as /tmp/data.tsv and read it with plain base R:
> x <- read.table("/tmp/data.tsv", header=TRUE)
> str(x)
'data.frame': 30 obs. of 2 variables:
$ igg : int 881 1290 2147 1909 1282 1530 2067 1982 1019 1651 ...
$ oxygen: num 34.6 45 62.3 58.9 42.5 44.3 67.9 58.5 35.6 49.6 ...
> summary(x)
igg oxygen
Min. : 752 Min. :33.0
1st Qu.:1275 1st Qu.:42.6
Median :1590 Median :50.0
Mean :1558 Mean :50.6
3rd Qu.:1946 3rd Qu.:60.8
Max. :2177 Max. :69.9
>
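One more thing worth checking: the '<ff><fe>' bytes in the read.table error are the UTF-16LE byte order mark, and the "embedded nulls" warnings are consistent with UTF-16 text. So the file on disk may not be plain ASCII/UTF-8 like the copy I pasted into /tmp/data.tsv. Below is a minimal sketch, assuming the file really is UTF-16LE. The temporary file it builds is only a stand-in for exerimmun.txt (three lines copied from the question), and the fallback to "UTF-8" is my own assumption, not something the error output confirms.

```r
# Build a tiny stand-in for the problem file: a UTF-16LE byte order mark
# followed by tab-separated text encoded as UTF-16LE.
sample_text <- "igg\toxygen\n881\t34.6\n1290\t45\n"
path <- tempfile(fileext = ".txt")
bytes <- c(
  as.raw(c(0xff, 0xfe)),  # the BOM seen in the error message
  iconv(sample_text, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
)
writeBin(bytes, path)

# Step 1: peek at the first two bytes of the file.
bom <- readBin(path, what = "raw", n = 2)
is_utf16le <- identical(bom, as.raw(c(0xff, 0xfe)))

# Step 2: tell read.table which encoding to decode from.
# (Falling back to UTF-8 when no BOM is found is an assumption.)
x <- read.table(
  path,
  header = TRUE,
  sep = "\t",
  fileEncoding = if (is_utf16le) "UTF-16LE" else "UTF-8"
)
str(x)
```

If the same fileEncoding argument works on the real file, the resulting data frame can then be converted to a tibble as mentioned above.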