Unable to get column names when using skip along with read.csv

Question

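Before reading my data frame in from a CSV file, I use the skip option of read.csv to skip a few lines. However, when I then call names(dataframe), the column names are lost and I get what look like random strings as column names instead. Why does this happen?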

> mydf = read.csv("mycsvfile.csv",skip=100)
> names(mydf)
[1] "X2297256" "X3"

Without the skip option, it works fine:

> mydf = read.csv("mycsvfile.csv")
> names(mydf)
[1] "col1" "col2"

Answer 1

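skip drops whole lines from the top of the file, so if your header is on the first line and you skip 100 lines, the header line is skipped too. read.csv then treats the first remaining line as the header, which is why data values prefixed with "X" show up as column names. If you want to skip part of the file and still keep the headers, you need to read them separately: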

headers <- names(read.csv("mycsvfile.csv", nrows = 1))
mydf <- read.csv("mycsvfile.csv", header = FALSE, col.names = headers, skip = 100)
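To see both the problem and the fix on a file small enough to inspect, here is a minimal sketch. The temporary file, its contents, and the skip count of 5 are made up for illustration; only the read.csv calls mirror the ones above.

```r
# Build a throwaway CSV: one header line plus 10 data rows (illustrative data).
path <- tempfile(fileext = ".csv")
write.csv(data.frame(col1 = 1:10, col2 = 101:110), path, row.names = FALSE)

# Problem: skip = 5 drops the header line and the first 4 data rows, so the
# next line ("5,105") is taken as the header and made into valid names.
bad <- read.csv(path, skip = 5)
names(bad)    # "X5" "X105"

# Fix: take the names from the top of the file, then skip and supply them.
headers <- names(read.csv(path, nrows = 1))
good <- read.csv(path, header = FALSE, col.names = headers, skip = 5)
names(good)   # "col1" "col2"
nrow(good)    # 6 (data rows 5 through 10)
```

Note that skip counts the header line too, so skip = 5 here removes the header and four data rows, not five data rows.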

Answer 2

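The headers do not need to be read in separately. You can do this in one line by using negative indexing on the data frame, where a negative index means "keep all lines except the negative index (range)".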

So if you want to keep the headers and then skip the first N rows, you only need to do this:

mydf <- read.csv("mycsvfile.csv", header = TRUE)[-(1:N), ]
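A small self-contained sketch of the negative-index approach follows. The temporary file and the value of N are hypothetical, and drop = FALSE is an addition of mine so that a single-column file would still come back as a data frame.

```r
# Build a throwaway CSV: one header line plus 10 data rows (illustrative data).
path <- tempfile(fileext = ".csv")
write.csv(data.frame(col1 = 1:10, col2 = 101:110), path, row.names = FALSE)

N <- 4  # number of leading data rows to drop (hypothetical value, must be >= 1)

# Read everything with the header, then drop rows 1..N by negative index.
mydf <- read.csv(path, header = TRUE)[-(1:N), , drop = FALSE]
names(mydf)   # "col1" "col2"
nrow(mydf)    # 6

# Optional: the kept rows retain their original row names (5, 6, ...);
# reset them if you want numbering to start at 1 again.
rownames(mydf) <- NULL
```

One trade-off to keep in mind: this reads the whole file before discarding rows, whereas the skip approach never parses the skipped lines, which may matter for large files. Also, N = 0 would not be a no-op here, because -(1:0) evaluates to c(-1, 0) and still drops the first row.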
