Issues importing a csv in R

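I'm trying to teach myself R (just starting out). I decided to import 2 csv files to practice joining them.

One file imports fine, but the other gives the warnings shown below.

Here is the link to the csv file: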

https://data.world/jonathankkizer/occupation-computerization

I used the following statement:

occupationforjoin<-read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", header=TRUE, sep=",")

Warning messages:
1: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", :
  line 1 appears to contain embedded nulls
2: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", :
  line 2 appears to contain embedded nulls
3: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", :
  line 3 appears to contain embedded nulls
4: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", :
  line 4 appears to contain embedded nulls
5: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", :
  line 5 appears to contain embedded nulls
6: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  EOF within quoted string
7: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  embedded nul(s) found in input
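For what it's worth, "embedded nulls" warnings like these are typical when a UTF-16 file is read as if it were a single-byte encoding, because each ASCII character is stored as two bytes and one of them is 0x00. A rough way to check is to look at the first raw bytes of the file. The sketch below does that on a made-up stand-in file (the column names and values are hypothetical, not taken from OccComp.csv); for a real file you would point readBin() at your own path.

```r
# Build a small stand-in file: tab-separated text saved as UTF-16LE with a BOM.
# (Hypothetical sample data; the real OccComp.csv is not reproduced here.)
tmp <- tempfile(fileext = ".csv")
txt <- "occupation\tprobability\nClerk\t0.9\n"
payload <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), payload), tmp)

# Read the first bytes as raw values, without interpreting them as text.
head_bytes <- readBin(tmp, what = "raw", n = 16)
print(head_bytes)

# A UTF-16LE byte order mark is the two bytes FF FE at the start of the file.
has_utf16le_bom <- identical(head_bytes[1:2], as.raw(c(0xff, 0xfe)))

# Zero bytes between ordinary characters are what R reports as embedded nulls.
nul_count <- sum(head_bytes == as.raw(0))

cat("UTF-16LE BOM present:", has_utf16le_bom, "\n")
cat("NUL bytes in first 16 bytes:", nul_count, "\n")
```

If the file starts with FF FE and every other byte is 00, passing fileEncoding="UTF-16LE" is a reasonable thing to try.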

I found on Stack Overflow that this might be an encoding problem, so I used the suggested solution and ran the statement:

occupationforjoin<-read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", header=TRUE, sep=",", fileEncoding="UTF-16LE")

It gave me a different error message:

Error in read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", : more columns than column names

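I also tried the read.csv function, but that did not work either.

How can I fix this and import the dataset successfully? None of the solutions I found online (for example, using the skipNul = TRUE or comment.char = "" arguments) helped.

Update: if you don't want to download the csv file from data.world, here is a paste of the dataset: https://pastebin.com/SPEtWT6f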

Try the read_csv() function from the readr package.

Use dataframe = read.csv("name_of_file.csv")

or dataframe = read.csv(file.choose()).

Hope this works.

I finally found the solution! I was going crazy; even my teacher didn't know how to fix it!

This statement works:

o<-read.csv("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/Occ.txt", header=T, sep="\t", fileEncoding="UTF-16LE")

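As I said in the original question, I tried fileEncoding="UTF-16LE" on its own, but it didn't help. After asking the question I tried sep="\t" on its own, and that didn't help either. Using both together did the trick!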
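To illustrate why both arguments are needed together, here is a minimal sketch on a made-up file (the column names and rows are hypothetical, not the real dataset). It assumes the file is tab-separated text stored as UTF-16LE, which is what the working statement above implies. With only the encoding set, read.csv still splits on commas, so each line ends up as a single field; adding sep="\t" gives the expected columns.

```r
# Write a tiny tab-separated file encoded as UTF-16LE (hypothetical sample rows).
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "w", encoding = "UTF-16LE")
writeLines(c("occupation\tprobability",
             "Clerk\t0.9",
             "Therapist\t0.1"), con)
close(con)

# Encoding only: the text decodes correctly, but the default separator is a
# comma, so every line is read as one field.
enc_only <- read.csv(tmp, header = TRUE, fileEncoding = "UTF-16LE")

# Encoding and tab separator together: two proper columns.
both <- read.csv(tmp, header = TRUE, sep = "\t", fileEncoding = "UTF-16LE")

cat("columns with encoding only:", ncol(enc_only), "\n")
cat("columns with encoding and sep:", ncol(both), "\n")
str(both)
```

The same idea should apply to read.table, since read.csv is a wrapper around it with different defaults.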