read_tsv returns a 1-column df, expected many columns

I am trying to read a tsv file into R. Viewed with RStudio's file viewer, my raw file looks like this:

                 nzid                 | converted  | logins_cnt | shootypes_cnt | galleries_cnt | photos_cnt | favorite_images_cnt | image_downloaded_cnt | gallery_visitors_cnt |    storage_used     | shared_gallery_cnt | password_set | site_created | site_published | pricelist_created | used_desktop | custom_domain | added_watermark | added_galley | added_logo | added_social_link 
--------------------------------------+------------+------------+---------------+---------------+------------+---------------------+----------------------+----------------------+---------------------+--------------------+--------------+--------------+----------------+-------------------+--------------+---------------+-----------------+--------------+------------+-------------------
 abc123 |            |          0 |             4 |             0 |         31 |            0.000000 |             0.000000 |             4.000000 |    278895839.000000 |                  0 |            1 |            0 |              0 |                 0 |            1 |             0 |               0 |            1 |          0 |                 0
 jhgfdfghj543454 |            |          1 |             9 |             0 |        140 |            2.000000 |          1127.000000 |           137.000000 |   1077768195.000000 |                  1 |            1 |            0 |              0 |                 0 |            0 |             0 |               0 |            1 |          0 |                 0
 ijhgfdrfgh765456 |            |          0 |             4 |             0 |         30 |                   0 |                    0 |                    0 |    278796703.000000 |                  0 |            1 |       

What I tried:

rawd <- read_tsv('training-data.tsv')

This runs, but:

rawd %>% glimpse
Rows: 10,173
Columns: 1
$ `nzid                 | converted  | logins_cnt | shootypes_cnt | galleries_cnt | photos_cnt | favorite_images_cnt | image_downloaded_cnt | gallery_visitors_cnt |    storage_used     | shared_gallery_cnt | password_set | site_created | site_published | pricelist_created | used_desktop | custom_domain | added_watermark | added_galley | added_logo | added_social_link` <chr> …

Everything ends up in a single column.

Looking at the raw tsv file, the fields appear to be separated by pipes (`|`), not tabs. I tried:

rawd <- read_tsv('training-data.tsv', delim = '|')
Error in read_tsv("training-data.tsv", delim = "|") : 
  unused argument (delim = "|")

This was unexpected, since delim is listed as an argument on the help page at ?read_tsv

How can I read this 'tsv' file into R, assuming it really is a tsv file?
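One way to check whether the file really is tab-separated is to count candidate delimiters in the first few lines. For context, `read_tsv` is the tab-delimited special case of `read_delim` and shares its help page, which is why `delim` is documented there but not accepted by `read_tsv`. The following is a minimal base R sketch; `count_char` is a hypothetical helper, and the sample lines are a shortened stand-in for the real file.

```r
# Hypothetical helper: count occurrences of a literal character in each line.
count_char <- function(lines, ch) {
  lengths(regmatches(lines, gregexpr(ch, lines, fixed = TRUE)))
}

# Two sample lines shaped like the file above (shortened for illustration).
sample_lines <- c(" nzid | converted | logins_cnt ",
                  " abc123 |  | 0")

tabs  <- count_char(sample_lines, "\t")  # tabs per line
pipes <- count_char(sample_lines, "|")   # pipes per line

print(tabs)
print(pipes)

# On the real file, the same check would be something like:
# count_char(readLines("training-data.tsv", n = 5), "\t")
```

If the tab counts are all zero and the pipe counts are consistent from line to line, the file is pipe-delimited despite its `.tsv` extension.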

Using the data in the Note at the end:

L <- readLines('training-data.tsv')
DF <- read.table(text = L[-2], sep = "|", strip.white = TRUE, 
  header = TRUE, fill = TRUE)
str(DF)

giving:

'data.frame':   3 obs. of  21 variables:
 $ nzid                : chr  "abc123" "jhgfdfghj543454" "ijhgfdrfgh765456"
 $ converted           : logi  NA NA NA
 $ logins_cnt          : int  0 1 0
 $ shootypes_cnt       : int  4 9 4
 $ galleries_cnt       : int  0 0 0
 $ photos_cnt          : int  31 140 30
 $ favorite_images_cnt : num  0 2 0
 $ image_downloaded_cnt: num  0 1127 0
 $ gallery_visitors_cnt: num  4 137 0
 $ storage_used        : num  2.79e+08 1.08e+09 2.79e+08
 $ shared_gallery_cnt  : int  0 1 0
 $ password_set        : int  1 1 1
 $ site_created        : int  0 0 NA
 $ site_published      : int  0 0 NA
 $ pricelist_created   : int  0 0 NA
 $ used_desktop        : int  1 0 NA
 $ custom_domain       : int  0 0 NA
 $ added_watermark     : int  0 0 NA
 $ added_galley        : int  1 1 NA
 $ added_logo          : int  0 0 NA
 $ added_social_link   : int  0 0 NA
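The approach above drops the dashed line by position (`L[-2]`). As a sketch of a slightly more general variant, the dashed line can be dropped by pattern instead, which also copes with blank lines. `read_pipe_table` is a hypothetical helper name. The handling of a trailing `(N rows)` footer is an assumption, based on the layout resembling a database client's text output; the question does not show one.

```r
# Hypothetical helper: read a pipe-delimited text table that has a dashed
# rule line under the header.
read_pipe_table <- function(lines) {
  trimmed <- trimws(lines)
  # Rule lines consist only of '-' and '+' characters.
  is_rule <- grepl("^[-+]+$", trimmed)
  # Assumption: a footer such as "(2 rows)" may follow the data.
  is_footer <- grepl("^\\([0-9]+ rows?\\)$", trimmed)
  is_blank <- !nzchar(trimmed)
  keep <- lines[!(is_rule | is_footer | is_blank)]
  # Same read.table call as in the answer above.
  read.table(text = keep, sep = "|", strip.white = TRUE,
             header = TRUE, fill = TRUE)
}

# Small made-up sample for illustration only (not the asker's data).
demo <- c(" id | n_photos | flag ",
          "----+----------+------",
          " a1 |       31 |    1",
          " b2 |      140 |    0",
          "(2 rows)")
DF2 <- read_pipe_table(demo)
str(DF2)

# With the real file this would be called as:
# DF <- read_pipe_table(readLines("training-data.tsv"))
```

As in the original call, `fill = TRUE` pads truncated rows with `NA`, so a cut-off last line is still read.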

Note

Lines <- "                 nzid                 | converted  | logins_cnt | shootypes_cnt | galleries_cnt | photos_cnt | favorite_images_cnt | image_downloaded_cnt | gallery_visitors_cnt |    storage_used     | shared_gallery_cnt | password_set | site_created | site_published | pricelist_created | used_desktop | custom_domain | added_watermark | added_galley | added_logo | added_social_link 
--------------------------------------+------------+------------+---------------+---------------+------------+---------------------+----------------------+----------------------+---------------------+--------------------+--------------+--------------+----------------+-------------------+--------------+---------------+-----------------+--------------+------------+-------------------
 abc123 |            |          0 |             4 |             0 |         31 |            0.000000 |             0.000000 |             4.000000 |    278895839.000000 |                  0 |            1 |            0 |              0 |                 0 |            1 |             0 |               0 |            1 |          0 |                 0
 jhgfdfghj543454 |            |          1 |             9 |             0 |        140 |            2.000000 |          1127.000000 |           137.000000 |   1077768195.000000 |                  1 |            1 |            0 |              0 |                 0 |            0 |             0 |               0 |            1 |          0 |                 0
 ijhgfdrfgh765456 |            |          0 |             4 |             0 |         30 |                   0 |                    0 |                    0 |    278796703.000000 |                  0 |            1 |       "
writeLines(Lines, "training-data.tsv")