How can I read a CSV whose "comments" variable contains lots of commas?

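I'm trying to read the detailed reviews .csv file for Barcelona from the Airbnb data page, http://insideairbnb.com/get-the-data.html.

The problem is that one variable holds people's comments, and it contains many commas, so when I try to read the .csv the result comes out completely garbled. I'd appreciate some help!

Thanks a lot!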

You can use fread() from the data.table package with the sep2 argument.

From the documentation:

sep2: The separator within columns.

I tried it on the Amsterdam data and it worked fine. It does issue a warning, but that is only due to the way the data.table developers wrote fread.

library(data.table)

DT <- fread(".../location/reviews.csv", sep2 = ",")
nrow(DT)              # 163351, which seems to be the correct number
head(DT$comments, 1)  # first review comment

The last line returns:

[1] "The room was small but comfortable. The place was quite clean but the bed sheets could have been cleaner. The apartment was nicely decorated and located just about 20 minutes (walking) from the city center so it was very convenient for us. However, we had a quite unpleasant experience in one of the nights because they decided to throw a party on a Thursday night which lasted until 5:30am. The walls were very thin and we could hear their music and their conversations all night long. People were also smoking all night so the smell of cigarettes was unbearable to us since we are not smokers and the smell in our room got so strong. Cedaria was very helpful in giving us tips before hand about things to see in the city, but if we had known that we would have had that kind of experience we would have stayed somewhere else."
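To check this behaviour without downloading the full file, here is a minimal sketch on a toy CSV. The file contents and column names are made up for illustration, and it assumes the comments field is wrapped in double quotes, as CSV writers normally do for fields containing commas. Under that assumption, fread should keep the embedded commas inside the field even without sep2.

```r
library(data.table)

# Hypothetical toy file standing in for reviews.csv (column names are assumptions).
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "listing_id,id,comments",
  "1,101,\"Small, clean, and quiet room\"",
  "2,102,\"Great host, great location, would return\""
), tmp)

# Commas inside double-quoted fields are treated as part of the field,
# so only the unquoted commas split columns.
DT <- fread(tmp, sep = ",")

nrow(DT)        # number of reviews read
ncol(DT)        # number of columns detected
DT$comments[1]  # first comment, embedded commas intact
```

If the row and column counts match what you expect, the same call should be worth trying on the real reviews.csv.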

FYI, fread and data.table are very fast. I love using that package.