R and PostgreSQL table operations

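I have been working on this function, trying to get all the households with at least one Latin American (LA) member. When I use the data.table package, I can load the data, run the function, and get good results. Because of memory problems I am now using PostgreSQL instead, and that is where it breaks.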

    year  sample  serial      pernum  wtper  relate  birthyr  bplctry
    2005  8406    1244876000  3       75     4       NA       24040
    2005  8406    1244877000  1       62     1       NA       22010
    2005  8406    1244877000  2       67     2       NA       24040
    2005  8406    1244878000  1       137    1       NA       24040
    2005  8406    1244878000  2       130    2       NA       24040
    2005  8406    1244878000  3       149    3       NA       24040

    > paises
     [1] 21080 21100 21130 22020 22030 22040 22050 22060 22070 22080
    [11] 23010 23020 23030 23040 23050 23060 23100 23110 23130 23140

Then, reading the data (this works)...

Create a PostgreSQL driver instance and open one connection:

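    library(RPostgreSQL)  # provides dbDriver() and dbConnect()
    library(dplyr)        # provides src_postgres(), tbl() and sql()

    m <- dbDriver("PostgreSQL")
    con <- dbConnect(m, user = "postgres", password = "xxxx", dbname = "IPUMS", host = "localhost", port = 5432)
    mig_db <- src_postgres(dbname = "IPUMS", user = "postgres", password = "xxxx")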

Then I tried to obtain all the households with at least one LA member. (That part works if I read USA with the fread function from the data.table package; here is the code for the SQL version.)

    library(data.table)  # provides fread()

    USA <- tbl(mig_db, sql("SELECT * FROM namerica"))
    paises.n <- fread("paises.csv", header = TRUE, sep = ",", data.table = FALSE)
    paises <- paises.n$code  # vector of country codes

Here is the problem: the code below returns an empty logical vector (logical(0)) for USA$latino:

    USA$latino <- ifelse(USA$bplctry %in% paises, 'LA', 'otro')
    la <- USA[USA$latino == 'LA', ]
    id <- unique(la$serial)
    usa.new <- USA[USA$serial %in% id, ]
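A possible explanation for the logical(0), offered as a guess rather than a confirmed diagnosis: the object returned by tbl() is a lazy reference to the database table and not a data frame, so `USA$bplctry` most likely finds no element of that name and yields NULL. The base R sketch below reproduces the same symptom; the plain list is only a hypothetical stand-in for the lazy table, not the real dplyr object.

```r
# A plain list stands in for the lazy table returned by tbl().
# This is an assumption for illustration, not the real dplyr object.
USA <- list(src = "connection placeholder", query = "SELECT * FROM namerica")
paises <- c(21080, 21100, 21130)  # first few codes from the question

col <- USA$bplctry                          # no such element in the list, so NULL
in_la <- col %in% paises                    # matching NULL against a vector gives logical(0)
USA$latino <- ifelse(in_la, "LA", "otro")   # ifelse() on a zero-length test is zero-length

print(is.null(col))
print(USA$latino)
```

If this is indeed the cause, the rows have to be fetched into R, or the filter has to be expressed as a query, before `$` and `[` can be used this way.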

You should try the dbGetQuery function from the RPostgreSQL package:

    dbGetQuery(con, "SELECT * FROM namerica")
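Since memory is the concern, the household filter could also be pushed into the query itself so that only the matching rows come back. This is a minimal sketch, assuming the table and column names from the question (`namerica`, `serial`, `bplctry`). `build_household_query` is a hypothetical helper written for this example and is not part of RPostgreSQL.

```r
# Build SQL that returns every row of each household (serial) that has at
# least one member whose bplctry is in `codes`.
# Hypothetical helper; table and column names follow the question.
build_household_query <- function(codes, table = "namerica") {
  codes <- as.integer(codes)  # force integers so only numbers are pasted into the SQL
  stopifnot(length(codes) > 0, !anyNA(codes))
  code_list <- paste(codes, collapse = ", ")
  paste0(
    "SELECT * FROM ", table,
    " WHERE serial IN (SELECT DISTINCT serial FROM ", table,
    " WHERE bplctry IN (", code_list, "))"
  )
}

paises <- c(21080, 21100, 21130)  # first few codes from the question, for illustration
query <- build_household_query(paises)
print(query)

# With the connection from the question (needs a running PostgreSQL server):
# usa.new <- dbGetQuery(con, query)
```

Note that dbGetQuery still returns its result as an in-memory data frame, so this only helps if the matching households are small enough to fit in RAM.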