Filter records from a table in R
I have a dataset, Movies.
head(Movies)
How do I get the row whose MovieID is "0000008"?
I tried:
t1 = subset(Movies, "MovieID" == "0000008")
t2 <- Movies[ which(Movies["MovieID"]=="0000008"), ]
head(t1)
head(t2)
Both return empty datasets, which is wrong, because I can see the row with ID "0000008".
Edit:
I tried removing the quotes around MovieID, but that raises an error:
Error in subset.matrix(Movies, MovieID == "0000008"): object 'MovieID' not found
Edit:
The Movies data is obtained as follows:
URL = "https://raw.githubusercontent.com/sidooms/MovieTweetings/master/latest/movies.dat"
MovieText = readLines( remote.file(URL) ) # HACK!!!
Movies = matrix( sapply( MovieText,
function(x) unlist(strsplit(sub(" [(]([0-9]+)[)]", "::\\1", x), "::"))[1:4] ),
nrow=length(MovieText), ncol=4, byrow=TRUE )
colnames(Movies) = c("MovieID", "MovieTitle", "Year", "Genres")
Your nrow should be length(MovieText)/4:
URL = "https://raw.githubusercontent.com/sidooms/MovieTweetings/master/latest/movies.dat"
MovieText = readLines( URL ) # HACK!!!
Movies = matrix( sapply( MovieText,
function(x) unlist(strsplit(sub(" [(]([0-9]+)[)]", "::\\1", x), "::"))[1:4] ),
nrow=length(MovieText)/4, ncol=4, byrow=TRUE )
colnames(Movies) = c("MovieID", "MovieTitle", "Year", "Genres")
#if you want to work with matrix, then use this
subset(Movies, Movies[,"MovieID"]=="0000008")
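To see why the two attempts in the question come back empty, here is a minimal sketch on a small toy matrix. The rows and titles below are made up for illustration and are not taken from movies.dat; only the column names follow the question.

```r
# Toy stand-in for Movies (hypothetical rows, not the real data).
toy <- matrix(c("0000008", "Movie A", "1894", "Documentary|Short",
                "0000010", "Movie B", "1895", "Documentary|Short",
                "0000012", "Movie C", "1896", "Documentary|Short"),
              ncol = 4, byrow = TRUE)
colnames(toy) <- c("MovieID", "MovieTitle", "Year", "Genres")

# Attempt 1: this compares two string literals, so the condition is a
# single FALSE and subset() keeps no rows.
cond <- "MovieID" == "0000008"
empty1 <- subset(toy, cond)

# Attempt 2: single-bracket indexing of a matrix with one string looks up
# element names, not column names. The matrix has none, so this is NA,
# and which() of the comparison is empty.
lookup <- toy["MovieID"]
empty2 <- toy[which(lookup == "0000008"), , drop = FALSE]

# Working form: the comma makes "MovieID" a column index.
hit <- toy[toy[, "MovieID"] == "0000008", , drop = FALSE]
hit
```

The unquoted form `subset(Movies, MovieID == "0000008")` fails for a related reason: for a matrix, subset() evaluates the condition as an ordinary expression, so `MovieID` has to exist as a variable, which it does not.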
Edit: subsetting with data.frame and data.table
library(data.table)
MoviesDF <- data.frame(Movies)
MoviesDT <- data.table(Movies)
MoviesDF[MoviesDF["MovieID"] == "0000008", ]
MoviesDT[MovieID == "0000008", ]
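As a rough illustration of what the conversion buys you, the sketch below uses base R only and a made-up two-row matrix (the titles and years are placeholders, not real data). Once the data is a data.frame, subset() can resolve column names, so the unquoted form from the question works, and columns can be given proper types.

```r
# Hypothetical toy matrix with the same column naming as in the question.
toy <- matrix(c("0000008", "Movie A", "1894",
                "0000010", "Movie B", "1895"),
              ncol = 3, byrow = TRUE,
              dimnames = list(NULL, c("MovieID", "MovieTitle", "Year")))

# Keep the IDs as character strings so the leading zeros survive.
toyDF <- as.data.frame(toy, stringsAsFactors = FALSE)

# subset() on a data.frame evaluates the condition inside the data,
# so the bare column name is found.
res <- subset(toyDF, MovieID == "0000008")

# A matrix holds a single type; a data.frame can hold a numeric Year.
toyDF$Year <- as.integer(toyDF$Year)
early <- toyDF[toyDF$Year < 1895, ]
```

Whether to convert Year to a number is a design choice. It is shown here only to point out that each data.frame column can have its own type, which a character matrix cannot offer.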
By the way: love the HACK!!! comment.