Trying to select columns in R but it only assumes rows
I am trying to select 41 variables out of 827 existing variables. I am using this code:
`myvars <- c('newid', # CU id
'region', # region
'state', # state
'cutenure', # housing tenure
'fam_size', # family size
'no_earnr', # number of earners
'num_auto', # number of vehicles owned
'popsize', # population size (see codes)
'inclass', # income class
'age_ref', # age reference person
'educ_ref', # education reference person (see codes)
'ref_race', # race reference person (1=white, 2=black, 3= nat-am, 4=asian, 5=pac-isl, 6=multi-race)
'inc_hrs1', # hours x week by ref person
'inc_hrs2', # hours x week by spouse
'incweek1', # number weeks worked ref person
'incweek2', # number weeks worked spouse
'fincbtax', # income before tax past 12 month
'fincatax', # income after tax past 12 month
'fsalaryx', # wage and salary income before ded.
'totexppq', # tot exp prev quarter
'totexpcq', # tot exp curr quarter
'majapppq', # major appliances prev quarter
'majappcq', # major appliances curr quarter
'FOODHOME', # Expenditures food at home
'FOODAWAY', # Food away from home
'ALCBEV', # Alcoholic Beverages
'OWNDWECQ', # Owned Dwellings
'ZRENTDWL', # Rented Dwellings
'OTHLODCQ', # Other Lodging
'UTILCQ', # Utilities
'MISCEQPQ', # Household Equipment
'HOUSOPCQ', # Household Operations
'APPARCQ', # Apparel and Services
'VEHICLCQ', # Vehicle Expenditures
'OTHVEHCQ', # Other Vehicle Expenditures
'GASMOCQ', # Gasoline
'TRNOTHCQ', # Public Transportation
'HEALTHCQ', # Health Care
'ENTERTCQ', # Entertainment
'PERSCACQ', # Personal Care
'READCQ', # Reading
'EDUCACQ', # Education
'TOBACCCQ' # Tobacco
)
newdataQ1 = dataQ1[,myvars]`
After that I get the error:
Error in `[.data.frame`(dataQ1, , myvars): undefined columns selected
Traceback:
- dataQ1[, myvars]
- `[.data.frame`(dataQ1, , myvars)
- stop("undefined columns selected")
If I move the comma and type
newdataQ1 = dataQ1[myvars,]
it lets me proceed, but it keeps 41 rows instead of 41 columns and retains the original number of columns.
How can I solve this?
Thanks.
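dplyr may be a good way to subset your data.
It is a great package that makes this kind of request simple and easy to read
(I am using the pipe operator, which also makes life easier and the code more readable):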
newdataQ1 <- dataQ1 %>% select(myvars)
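It is worth going through a dplyr tutorial (for example on DataCamp) to get familiar with the syntax.
If you want to stick to base R, you need to specify that you want to select by colnames: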
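One caveat about the dplyr route, sketched below under assumptions: if the original error came from names in `myvars` that do not exist in the data, a strict selection would most likely fail in dplyr as well. The tidyselect helpers `all_of()` and `any_of()` (available in dplyr 1.0.0 and later) make the choice explicit. The toy data frame and the deliberately mismatched name `'FOODHOME'` are made up for illustration and are not taken from the real data.

```r
library(dplyr)

# Toy stand-in for dataQ1 (hypothetical values; the real data has 827 columns)
dataQ1 <- data.frame(
  newid = 1:3,
  region = c(1, 2, 3),
  foodhome = c(10, 20, 30),
  other = 4:6
)

# 'FOODHOME' is deliberately written in a case that matches no column
myvars <- c("newid", "region", "FOODHOME")

# any_of() keeps the requested columns that exist and skips the rest
subset_lenient <- dataQ1 %>% select(any_of(myvars))
names(subset_lenient)

# all_of() is strict: it raises an error because 'FOODHOME' is not a column
strict_result <- tryCatch(
  dataQ1 %>% select(all_of(myvars)),
  error = function(e) "error: some requested columns do not exist"
)
strict_result
```

`any_of()` is convenient, but it silently drops names it cannot find, so it is worth checking that the result has the expected number of columns.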
newdataQ1 <- dataQ1[, colnames(dataQ1) %in% myvars]
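The `%in%` version runs without an error because it keeps only the columns that actually exist, so it may return fewer than 41 columns. The "undefined columns selected" error usually means that at least one name in `myvars` is not a column of `dataQ1`. The list in the question mixes lower-case and upper-case names, so a case mismatch is a plausible cause, but that is an assumption. A minimal sketch for checking this, using a made-up toy data frame:

```r
# Toy stand-in for dataQ1 (hypothetical values)
dataQ1 <- data.frame(
  newid = 1:3,
  region = c(1, 2, 3),
  foodhome = c(10, 20, 30),
  other = 4:6
)
myvars <- c("newid", "region", "FOODHOME")

# Names that were requested but are absent from the data;
# any entry here is enough to trigger "undefined columns selected"
missing_vars <- setdiff(myvars, colnames(dataQ1))
missing_vars

# Which of the missing names exist under a different letter case?
case_mismatch <- missing_vars[tolower(missing_vars) %in% tolower(colnames(dataQ1))]
case_mismatch

# One possible fix, assuming the mismatch is only in letter case:
# map each requested name to the actual column name, then select by name
matched <- colnames(dataQ1)[match(tolower(myvars), tolower(colnames(dataQ1)))]
newdataQ1 <- dataQ1[, matched[!is.na(matched)], drop = FALSE]
names(newdataQ1)
```

Selecting by name keeps the columns in the order given in `myvars`, whereas the `%in%` mask keeps them in the order they appear in the data.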