How to convert a text file into a data frame in R?
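I am trying to export data points from MongoDB. Unfortunately, I could not connect it directly to RStudio. So I created a text file from the query result and tried to read it into R as a text file.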
"cityid", "count"
"102","2"
"55","31"
"119","7"
"206","1"
"18","2"
"15","1"
"32","3"
"14","1"
"54","2"
"23","85"
"158","3"
"266","1"
"9","1"
"34","1"
"159","1"
"31","1"
"22","2"
"209","2"
"121","4"
"73","12"
"350","2"
"311","2"
"377","2"
"230","7"
"290","1"
"49","2"
"379","2"
"75","1"
"59","6"
"165","3"
"19","8"
"13","40"
"126","13"
"243","12"
"325","1"
"17","1"
"null","235"
"144","2"
"334","1"
"40","12"
"7","34"
"181","40"
"349","4"
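So the format is basically as shown above, and I want to convert it into a data frame that I can use as a reference for calculations on other data sets.
This is how I tried to build the data frame...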
L <- readLines(file.choose())
L.df <- as.data.frame(L)
list <- strsplit(L.df, ",")
library("plyr")
df <- ldply(list)
colnames(df) <- c("city_id", "count")
str(df)
df$city_id <- suppressWarnings(as.numeric(as.character(df$city_id)))
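In the last line I tried to convert the character values to numeric, but it failed and coerced them to NA.
Does anyone have a better suggestion for getting these into a numeric table?
Or is there actually a better way to bring MongoDB data into R without copying and pasting it into a text file? I managed to connect to MongoDB using Rmongo, but the syntax was too complicated for me to understand. The query I used is: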
db.getCollection('logging_app_location_view_logs').aggregate([
{"$group": {"_id": "$city_id", "total": {"$sum":1}}}
]).forEach(function(l){
print('"' + l._id + '","' + l.total + '"');
});
Thanks in advance for your help!
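Because header = TRUE is already passed to read.table, there is no need to specify the column names again. The colClasses argument sets the class of each column; here both columns are read as character, "null" is treated as a missing value through na.strings, and the columns are then converted to numeric.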
df <- read.table(file.choose(), header = TRUE, sep = ",", colClasses = c('character', 'character'), na.strings = 'null')
# convert character to numeric format
char_cols <- which(sapply(df, class) == 'character') # identify character columns
df[char_cols] <- lapply(df[char_cols], as.numeric) # convert character to numeric column
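To try this without picking a file, the following is a minimal sketch that feeds a few rows from the question to read.table through its text argument. It also shows a likely reason the original attempt ended in NA: splitting the raw lines on "," leaves the literal quote characters in each value. The names sample_text, raw_lines and first_field are only illustrative and do not come from the question.

```r
# A few rows copied from the sample in the question, including the "null" row.
sample_text <- '"cityid", "count"
"102","2"
"55","31"
"null","235"
"7","34"'

# Splitting a raw line on "," keeps the quote characters inside each piece,
# so as.numeric() cannot parse the value and returns NA.
raw_lines <- strsplit(sample_text, "\n")[[1]]
first_field <- strsplit(raw_lines[2], ",")[[1]][1]
first_field                                  # still wrapped in quote characters
suppressWarnings(as.numeric(first_field))    # NA
as.numeric(gsub('"', "", first_field))       # parses once the quotes are removed

# read.table() removes the quotes itself when the columns are read as character.
df <- read.table(text = sample_text, header = TRUE, sep = ",",
                 colClasses = "character", na.strings = "null")

# Convert every column from character to numeric; the "null" entry stays NA.
df[] <- lapply(df, as.numeric)
str(df)
```

Reading the columns as character first matters here because every value is quoted in the file; the conversion to numeric is done afterwards, as in the answer above.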