Subsetting specific columns and rows from a data.frame - error message "unexpected symbol in..."
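I'm a beginner learning how to subset specific rows and columns from a dataset in R. I'm using the state.x77 dataset in RStudio for practice. When I try to select specific columns, I get the following error message: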
library(dplyr)
library(tibble)
select(state.x77, Income, HS Grad)
Error: unexpected symbol in "select(state.x77, Income, HS Grad"
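I don't understand which symbol in that line of code is incorrect.
Also, if in addition to selecting certain columns (variables) I want to filter for a particular state, how do I use the filter function when the state names are stored as row names? When I try: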
rownames_to_column(state.x77, var = "State")
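it creates a column called State for the state names, but the change doesn't seem to be permanent when I look at state.x77 afterwards (so I can't use the filter function).
Sorry, I'm a beginner. Any help would be greatly appreciated.
Thanks.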
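There are two issues. First, state.x77 is a matrix, so you need to convert it to a data frame, because the select function from the dplyr package only takes a data frame as its first argument. Second, if a column name contains a space, you need to enclose the name in backticks (``) or double quotes ("").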
# Load package
library(dplyr)
# Show the class of state.x77
class(state.x77)
# [1] "matrix" "array"   (R >= 4.0; older versions print only "matrix")
# Convert state.x77 to a data frame
state.x77_df <- as.data.frame(state.x77)
# Show the class of state.x77_df
class(state.x77_df)
# [1] "data.frame"
# Select Income and `HS Grad` columns
# All the following will work
select(state.x77_df, Income, `HS Grad`)
select(state.x77_df, "Income", "HS Grad")
select(state.x77_df, c("Income", "HS Grad"))
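As a side note on where the original error comes from: it appears to be raised by R's parser, before select() is ever called, which is why the message points at the code text rather than at dplyr. The following is a minimal sketch using only base R's parse(); the variable names (bad, good, msg, parsed) are illustrative, and the exact wording of the message may differ in a non-English locale.

```r
# The unquoted name `HS Grad` is read as two symbols in a row,
# so the line fails at parse time, before any function runs.
bad <- "select(state.x77, Income, HS Grad)"
msg <- tryCatch(
  {
    parse(text = bad)
    "parsed OK"
  },
  error = function(e) conditionMessage(e)
)
# In an English locale the message mentions "unexpected symbol".
grepl("unexpected symbol", msg)

# With backticks, the name is a single symbol and the line parses.
good <- "select(state.x77_df, Income, `HS Grad`)"
parsed <- parse(text = good)
length(parsed)  # one parsed expression
```

Because the failure happens during parsing, no choice of first argument would have fixed it; the quoting of the column name has to be corrected first.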
For your second question, you have to save the output back to an object, as shown below.
library(tibble)
state.x77_df <- rownames_to_column(state.x77_df, var = "State")
head(state.x77_df)
#        State Population Income Illiteracy Life Exp Murder HS Grad Frost   Area
# 1    Alabama       3615   3624        2.1    69.05   15.1    41.3    20  50708
# 2     Alaska        365   6315        1.5    69.31   11.3    66.7   152 566432
# 3    Arizona       2212   4530        1.8    70.55    7.8    58.1    15 113417
# 4   Arkansas       2110   3378        1.9    70.66   10.1    39.9    65  51945
# 5 California      21198   5114        1.1    71.71   10.3    62.6    20 156361
# 6   Colorado       2541   4884        0.7    72.06    6.8    63.9   166 103766
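Once the state names live in a real column, the filtering the question asks about can be chained with the column selection. This is a minimal sketch rather than the only way to do it: it assumes dplyr and tibble are installed, uses the %>% pipe that dplyr re-exports, and the result name arizona is purely illustrative.

```r
library(dplyr)
library(tibble)

# Convert the matrix, keep the row names as a State column,
# then filter one state and select two columns.
arizona <- state.x77 %>%
  as.data.frame() %>%
  rownames_to_column(var = "State") %>%
  filter(State == "Arizona") %>%
  select(State, Income, `HS Grad`)

arizona
```

Nothing here modifies state.x77 itself; the result only persists because it is assigned to a new object.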
# Convert state.x77 to a data frame and move the row names into a State column
# Note: data.frame() replaces spaces in column names with dots (e.g. HS.Grad)
df <- tibble::rownames_to_column(data.frame(state.x77), var = "State")
## You can select any columns by their column names or by index
# by column names
col_names <- c("Income", "HS.Grad")
df[,col_names]
# by column index
col_index <- c(3,7)
df[, col_index]
# Filter (subset) the data by state
subset(df, State == "Arizona")
#     State Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
# 3 Arizona       2212   4530        1.8    70.55    7.8    58.1    15 113417
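The two answers end up with different column names (`HS Grad` versus HS.Grad), which can be confusing when mixing their code. A short sketch of the likely cause, using base R only: as.data.frame() on a matrix keeps the names as they are, while data.frame() applies its default check.names = TRUE and turns them into syntactic names. The variable names below are illustrative.

```r
# as.data.frame() keeps the matrix column names, spaces included.
names_kept <- names(as.data.frame(state.x77))

# data.frame() sanitizes names by default (spaces become dots).
names_mangled <- names(data.frame(state.x77))

# Turning the check off keeps the original names with data.frame() too.
names_raw <- names(data.frame(state.x77, check.names = FALSE))

"HS Grad" %in% names_kept
"HS.Grad" %in% names_mangled
"HS Grad" %in% names_raw
```

Which form to use is a matter of taste: dotted names avoid backticks in later code, while the original names match the dataset's documentation.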