Trying to select columns in R but it only assumes rows

I am trying to select 41 variables out of 827 existing variables. I am using this code:

`myvars <- c('newid',    # CU id
            'region',   # region
            'state',    # state
            'cutenure', # housing tenure 
            'fam_size',  # family size
            'no_earnr', # number of earners
            'num_auto', # number of vehicles owned
            'popsize',  # population size (see codes)
            'inclass',  # income class
            'age_ref',  # age reference person 
            'educ_ref', # education reference person (see codes)
            'ref_race', # race reference person (1=white, 2=black, 3= nat-am, 4=asian, 5=pac-isl, 6=multi-race)
            'inc_hrs1', # hours x week by ref person
            'inc_hrs2', # hours x week by spouse
            'incweek1', # number weeks worked ref person
            'incweek2', # number weeks worked spouse
            'fincbtax', # income before tax past 12 month
            'fincatax', # income after tax past 12 month
            'fsalaryx', # wage and salary income before ded.
            'totexppq', # tot exp prev quarter
            'totexpcq', # tot exp curr quarter
            'majapppq', # major appliances prev quarter
            'majappcq', # major appliances curr quarter
            'FOODHOME', # Expenditures food at home
            'FOODAWAY', # Food away from home
            'ALCBEV',   # Alcoholic Beverages
            'OWNDWECQ', # Owned Dwellings
            'ZRENTDWL', # Rented Dwellings
            'OTHLODCQ', # Other Lodging
            'UTILCQ',   # Utilities
            'MISCEQPQ', # Household Equipment
            'HOUSOPCQ', # Household Operations
            'APPARCQ',  # Apparel and Services
            'VEHICLCQ', # Vehicle Expenditures
            'OTHVEHCQ', # Other Vehicle Expenditures
            'GASMOCQ',  # Gasoline
            'TRNOTHCQ', # Public Transportation
            'HEALTHCQ', # Health Care
            'ENTERTCQ', # Entertainment
            'PERSCACQ', # Personal Care
            'READCQ',   # Reading
            'EDUCACQ',  # Education
            'TOBACCCQ' # Tobacco
           )

newdataQ1 = dataQ1[,myvars]`

After that I get this error:

Error in `[.data.frame`(dataQ1, , myvars): undefined columns selected
Traceback:

  1. dataQ1[, myvars]
  2. `[.data.frame`(dataQ1, , myvars)
  3. stop("undefined columns selected")

If I move the comma and type

newdataQ1 = dataQ1[myvars,]

it lets me continue, but I end up with 41 rows instead of 41 columns, and the original number of columns is kept.

How can I fix this?

Thanks.

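dplyr may be a good way to subset your data. It is a great package that makes a request like yours simple and easy to read. (I am using the pipe operator, which also makes life easier and the code more readable.)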

library(dplyr)
newdataQ1 <- dataQ1 %>% select(all_of(myvars))

It is worth working through a dplyr tutorial (for example on DataCamp) to get familiar with the syntax.
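One thing to watch for with the dplyr route: if some names in `myvars` are not columns of the data, a strict selection still fails. A minimal sketch, assuming dplyr 1.0.0 or later (which provides `all_of()` and `any_of()`) and using a small made-up data frame in place of the real `dataQ1`:

```r
library(dplyr)

# Toy stand-in for dataQ1 (hypothetical values; the real data has 827 columns)
dataQ1 <- data.frame(newid = 1:3, region = c(1, 2, 3), other = c(0, 0, 0))
myvars <- c("newid", "region", "FOODHOME")  # "FOODHOME" is deliberately absent here

# any_of() keeps the names that exist and skips the rest without an error
newdataQ1 <- dataQ1 %>% select(any_of(myvars))

# all_of() is strict: it raises an error if any requested name is missing
strict_result <- tryCatch(
  dataQ1 %>% select(all_of(myvars)),
  error = function(e) "error: some names in myvars are not columns"
)
```

`any_of()` is convenient, but it hides misspelled names, so it is probably worth checking which names were skipped before relying on the result.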

If you want to stick to base R, you need to specify that you are selecting by column name, using colnames:

newdataQ1 <- dataQ1[, colnames(dataQ1) %in% myvars]
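Note that the `%in%` version silently drops any name in `myvars` that has no matching column, so it can be useful to see which names those are. The sketch below uses a small made-up data frame in place of `dataQ1`. It assumes the mismatch is only a matter of letter case, which is a guess based on `myvars` mixing lower-case and upper-case names, not something the question confirms.

```r
# Toy stand-in for dataQ1 (hypothetical values; the real data has 827 columns)
dataQ1 <- data.frame(NEWID = 1:3, REGION = c(1, 2, 3),
                     FOODHOME = c(10, 20, 30), OTHER = c(0, 0, 0))
myvars <- c("newid", "region", "FOODHOME")

# Names that were requested but do not exist exactly as spelled
missing_vars <- setdiff(myvars, colnames(dataQ1))
print(missing_vars)

# Row indexing by unknown names does not fail; it returns rows filled with NA,
# which is why dataQ1[myvars, ] appears to "work"
na_rows <- dataQ1[myvars, ]

# Assumption: only the letter case differs, so match names case-insensitively
idx <- match(tolower(myvars), tolower(colnames(dataQ1)))
still_missing <- myvars[is.na(idx)]

# Keep the matched columns in the order given by myvars;
# drop = FALSE keeps a data frame even if only one column matches
newdataQ1 <- dataQ1[, idx[!is.na(idx)], drop = FALSE]
```

If `still_missing` is not empty after this, those variables are probably spelled differently or are not in the file at all, and they would need to be checked by hand.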