Issues importing a csv in R
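I'm trying to teach myself R (just getting started).
I decided to import 2 csv files to practice joining them.
One file imported fine, but the other produced the warnings shown below.
Here is the link to the csv file:
https://data.world/jonathankkizer/occupation-computerization
I used the following statement: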
occupationforjoin<-read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", header=TRUE, sep=",")
Warning messages:
1: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv",
: line 1 appears to contain embedded nulls
2: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv",
: line 2 appears to contain embedded nulls
3: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv",
: line 3 appears to contain embedded nulls
4: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv",
: line 4 appears to contain embedded nulls
5: In read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv",
: line 5 appears to contain embedded nulls
6: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : EOF within quoted string
7: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : embedded nul(s) found in input
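Warnings about embedded nulls often mean the file is not in a single-byte encoding. One way to check is to look at the raw bytes before choosing import arguments. The following is a minimal sketch, not a definitive diagnosis: it builds a small stand-in file (the sample text and the variable names are assumptions for illustration, not the real dataset) and inspects its first bytes.

```r
# Hypothetical stand-in for the real file: ASCII text stored as UTF-16LE.
sample_text <- "Code\tOccupation\n11-1011\tChief Executives\n"
tmp <- tempfile(fileext = ".csv")
writeBin(iconv(sample_text, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], tmp)

# Read the first bytes without any decoding.
first_bytes <- readBin(tmp, what = "raw", n = 8)
print(first_bytes)

# In UTF-16LE, every ASCII character is followed by a 00 byte,
# which is what R reports as "embedded nulls".
has_nul_pattern <- all(first_bytes[c(FALSE, TRUE)] == as.raw(0))

# Real files often start with the UTF-16LE byte order mark ff fe;
# this stand-in file is written without one.
has_bom <- identical(first_bytes[1:2], as.raw(c(0xff, 0xfe)))

print(has_nul_pattern)
print(has_bom)
```

If every second byte is 00, passing a UTF-16 fileEncoding to the reader is a reasonable next thing to try.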
I found on Stack Overflow that it might be an encoding problem, so I used the suggested solution and ran this statement:
occupationforjoin<-read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", header=TRUE, sep=",", fileEncoding="UTF-16LE")
It gave me a different error message:
Error in read.table("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/jonathankkizer-occupation-computerization/OccComp.csv", :
more columns than column names
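I also tried the read.csv function, and that did not work either.
How can I fix this and import the dataset successfully? None of the solutions I found online (for example, using the skipNul = TRUE or comment.char = "" arguments) helped.
Update:
If you don't want to download the csv file from data.world, here is a paste of the dataset: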
https://pastebin.com/SPEtWT6f
Try using the read_csv() function from the readr package.
Use dataframe = read.csv("name_of_file.csv")
or
dataframe = read.csv(file.choose()).
Hope this works.
I finally found the solution!
I was going crazy; even my teacher didn't know how to fix it!
This statement worked:
o<-read.csv("C:/Users/Admin/Desktop/-=Data Science=-/11-27-2018/Occ.txt", header=T, sep="\t", fileEncoding="UTF-16LE")
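As I said in my original question, I tried fileEncoding="UTF-16LE" and it didn't help. After asking the question, I tried sep="\t", and that didn't help on its own either. But using both together worked!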
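A plausible reason both arguments were needed is that the file is UTF-16LE encoded and also tab-separated: the encoding alone decodes the text, but splitting on commas then produces inconsistent field counts. Here is a minimal sketch of that idea, assuming a small made-up sample in place of the real dataset; the sample rows and the helper count_fields_utf16 are hypothetical and only for illustration.

```r
# Hypothetical sample rows: tab-separated, with a comma inside one value.
rows <- c("Code\tOccupation\tProbability",
          "11-1011\tChief Executives\t0.015",
          "11-9199\tManagers, All Other\t0.25")
txt <- paste0(paste(rows, collapse = "\n"), "\n")

# Write the text as raw UTF-16LE bytes (no byte order mark in this sketch).
tmp <- tempfile(fileext = ".txt")
writeBin(iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], tmp)

# Hypothetical helper: count fields per line for a given separator,
# decoding the file as UTF-16LE first.
count_fields_utf16 <- function(path, sep) {
  con <- file(path, open = "rt", encoding = "UTF-16LE")
  on.exit(close(con))
  count.fields(con, sep = sep)
}

fields_comma <- count_fields_utf16(tmp, sep = ",")
fields_tab <- count_fields_utf16(tmp, sep = "\t")
print(fields_comma)  # counts differ between lines
print(fields_tab)    # same count on every line

# With both the encoding and the separator set, the import succeeds.
occ <- read.csv(tmp, header = TRUE, sep = "\t", fileEncoding = "UTF-16LE")
print(occ)
```

When a separator gives the same field count on every line, it is probably the right one for that file.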