Why is fread not accepting the skip command?
I have a .txt dataset where the first 12 lines are text, followed by 2 blank lines, and then the data:
DATE HEIGHT INPUT OUTPUT TESTMEASURE
01/01/1933 NO RECORD NO RECORD MISSING MISSING
01/02/1933 NO RECORD NO RECORD MISSING MISSING
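But when I run
dat <- fread('data.txt')
it skips 15 lines and uses the first row of data as the column names of the imported dataset. The header row is ignored, and the column names become: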
01/01/1933 NO RECORD NO RECORD MISSING MISSING
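The skip argument has no effect at all on what gets imported. How can I specify the line number of the row that should be used for the column names? Alternatively, I could rename the columns afterwards, but the first row of data must not be lost.
Diagnostics: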
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.001319 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 after skip or autostart
This line is the autostart and not blank so searching up for the last non-blank ... line 1
Detecting sep ... '\t'
Detected 5 columns. Longest stretch was from line 15 to line 30
Starting data input on line 15 (either column names or first row of data). First 10 characters: 01/01/1933
The line before starting line 15 is non-empty and will be ignored (it has too few or too many items to be column names or data): DATE HEIGHT INPUT OUTPUT TESTMEASURE the fields on line 15 are character fields. Treating as the column names.
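You have 12 lines of text, 2 blank lines, and then your data. However, I noticed that there is extra whitespace between DATE and HEIGHT in the header row. To reproduce this, I made a text file like the one below, with tab-delimited data and 2 tabs between DATE and HEIGHT instead of 1: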
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
garbage
DATE HEIGHT INPUT OUTPUT TESTMEASURE
01/01/1933 NO RECORD NO RECORD MISSING MISSING
01/02/1933 NO RECORD NO RECORD MISSING MISSING
Running fread on this file gives me:
fread(data)
01/01/1933 NO RECORD NO RECORD MISSING MISSING
1: 01/02/1933 NO RECORD NO RECORD MISSING MISSING
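The mismatch behind this result can be checked directly by counting tab-separated fields per line. The following is a minimal base R sketch, assuming a file shaped like the sample above; the temporary file and the names `path`, `lines`, `n_fields`, and `hdr` are illustrative and not part of the original post.

```r
# Build a small file shaped like the sample: 12 text lines, 2 blank lines,
# a header with TWO tabs after DATE, then tab-delimited data rows.
path <- tempfile(fileext = ".txt")
writeLines(c(
  rep("garbage", 12),
  "", "",
  "DATE\t\tHEIGHT\tINPUT\tOUTPUT\tTESTMEASURE",
  "01/01/1933\tNO RECORD\tNO RECORD\tMISSING\tMISSING",
  "01/02/1933\tNO RECORD\tNO RECORD\tMISSING\tMISSING"
), path)

lines <- readLines(path)

# Number of tab-separated fields on each line of the file.
n_fields <- lengths(strsplit(lines, "\t", fixed = TRUE))

# Line number of the header (assumes it is the only line starting with DATE).
hdr <- grep("^DATE", lines)

print(n_fields[hdr])      # 6: the doubled tab creates an empty extra field
print(n_fields[hdr + 1])  # 5: the data rows have one field fewer
```

Because the header has one more field than the data rows, it does not line up with the 5-column block that fread detects, which is consistent with the "too few or too many items" message in the diagnostics.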
Removing the extra tab between DATE and HEIGHT gives:
DATE HEIGHT INPUT OUTPUT TESTMEASURE
1: 01/01/1933 NO RECORD NO RECORD MISSING MISSING
2: 01/02/1933 NO RECORD NO RECORD MISSING MISSING
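If editing the file by hand is not practical, the same repair can be done in R before parsing. This is only a sketch under some assumptions: the header is the first line starting with `DATE` followed by a tab, only the header contains doubled tabs, and the installed data.table is recent enough to support the `text` argument of `fread`. The variable names are hypothetical.

```r
library(data.table)

# Recreate a file shaped like the sample, with two tabs after DATE.
path <- tempfile(fileext = ".txt")
writeLines(c(
  rep("garbage", 12),
  "", "",
  "DATE\t\tHEIGHT\tINPUT\tOUTPUT\tTESTMEASURE",
  "01/01/1933\tNO RECORD\tNO RECORD\tMISSING\tMISSING",
  "01/02/1933\tNO RECORD\tNO RECORD\tMISSING\tMISSING"
), path)

lines <- readLines(path)

# Locate the header line (assumption: first line beginning with "DATE<tab>").
hdr <- grep("^DATE\t", lines)[1]

# Collapse runs of tabs in the header only, so that genuinely empty
# fields in the data rows are left untouched.
lines[hdr] <- gsub("\t+", "\t", lines[hdr])

# Parse from the repaired header onward, stating sep and header explicitly
# rather than relying on auto-detection.
dat <- fread(text = lines[hdr:length(lines)], sep = "\t", header = TRUE)

print(names(dat))
print(nrow(dat))
```

When the header cannot be repaired this way, another option is to skip past it and name the columns yourself, for example with `header = FALSE` together with `col.names`, so that the first data row is kept as data.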