Reshape a data frame to a matrix from a text file: value.var errors
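My problem is very similar to this question: Reshape three column data frame to matrix ("long" to "wide" format), except that I am reading my data from a text file and I am trying to use the reshape2 library and its dcast method.
Here is my text file: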
'Group','LiteracyLevel','Frequency'
'Shifting','Illerate',114
'Shifting','Primary',10
'Shifting','AtLeastMiddle',45
'Settled','Illerate',76
'Settled','Primary',2
'Settled','AtLeastMiddle',53
'Town','Illerate',93
'Town','Primary',13
'Town','AtLeastMiddle',208
It should be converted to the format below, because I want to call barplot(as.matrix(data)) on it.
'Group','Illerate','Primary','AtLeastMiddle'
'Shifting',114,10,45
'Settled',76,2,53
'Town',93,13,208
I don't know what to pass as the value.var argument of dcast. I assume it should be Frequency. My attempts at reshaping the data so far look like this:
> data <- read.csv("ex3-39.txt", header=TRUE)
> dcast(data, data$Group~data$LiteracyLevel, value.var="X.Frequency")
Error: value.var (X.Frequency) not found in input
> dcast(data, data$Group~data$LiteracyLevel, value.var="Frequency")
Error: value.var (Frequency) not found in input
> dcast(data, data$Group~data$LiteracyLevel, value.var="data$X.Frequency")
Error: value.var (data$X.Frequency) not found in input
> dcast(data, data$Group~data$LiteracyLevel, value.var=data$X.Frequency)
Error: value.var (1141045762539313208) not found in input
In addition: Warning message:
In if (!(value.var %in% names(data))) { :
the condition has length > 1 and only the first element will be used
> dcast(data, data$Group~data$LiteracyLevel, value.var=Frequency)
Error in match(x, table, nomatch = 0L) : object 'Frequency' not found
# Just to make sure we're dealing with the same data...
df <- read.csv(quote="'",text="'Group','LiteracyLevel','Frequency'
'Shifting','Illerate',114
'Shifting','Primary',10
'Shifting','AtLeastMiddle',45
'Settled','Illerate',76
'Settled','Primary',2
'Settled','AtLeastMiddle',53
'Town','Illerate',93
'Town','Primary',13
'Town','AtLeastMiddle',208")
df
# Group LiteracyLevel Frequency
# 1 Shifting Illerate 114
# 2 Shifting Primary 10
# 3 Shifting AtLeastMiddle 45
# 4 Settled Illerate 76
# 5 Settled Primary 2
# 6 Settled AtLeastMiddle 53
# 7 Town Illerate 93
# 8 Town Primary 13
# 9 Town AtLeastMiddle 208
library(reshape2)
dcast(df, Group~LiteracyLevel)
# Group AtLeastMiddle Illerate Primary
# 1 Settled 53 76 2
# 2 Shifting 45 NA NA
# 3 Town 208 93 13
# 4 Shifting NA 114 10
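The problem is that you need to specify the names of the columns in the formula (they are looked up in the data argument), not the columns themselves. When you specify a column the way you did, e.g. df$Group, the resulting vector is unnamed.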
names(df)
# [1] "Group" "LiteracyLevel" "Frequency"
names(df$Group)
# NULL
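To make the point above concrete, here is a minimal sketch that assumes the same data as in the question, read with quote="'". The formula uses bare column names and value.var is given as a string naming the value column. The variable name wide is only illustrative.

```r
library(reshape2)

# Same data as in the question, read with quote="'" so the header names are clean.
df <- read.csv(quote = "'", text = "'Group','LiteracyLevel','Frequency'
'Shifting','Illerate',114
'Shifting','Primary',10
'Shifting','AtLeastMiddle',45
'Settled','Illerate',76
'Settled','Primary',2
'Settled','AtLeastMiddle',53
'Town','Illerate',93
'Town','Primary',13
'Town','AtLeastMiddle',208")

# Bare column names in the formula; value.var is a column name passed as a string.
wide <- dcast(df, Group ~ LiteracyLevel, value.var = "Frequency")
wide
```

Passing value.var explicitly also avoids relying on dcast guessing which column holds the values.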
Does this help?
library(reshape2)
data<-read.csv("filename.csv",quote = "'")
dcast(data, data$Group~data$LiteracyLevel, value.var="Frequency")
This gives the output:
data$Group AtLeastMiddle Illerate Primary
1 Settled 53 76 2
2 Shifting 45 114 10
3 Town 208 93 13
I think you left out the quote="'" argument, so your column names look like this:
"X.Group." "X.LiteracyLevel." "X.Frequency."
If you do not want to use the quote="'" argument, use:
dcast(data, data$X.Group.~data$X.LiteracyLevel., value.var="X.Frequency.")
This gives the output:
data$X.Group. 'AtLeastMiddle' 'Illerate' 'Primary'
1 'Settled' 53 76 2
2 'Shifting' 45 114 10
3 'Town' 208 93 13
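The effect of the quote argument on the header can be checked directly. The sketch below is only an illustration on a shortened copy of the data; txt, raw_names and clean_names are made-up names. It relies on read.csv using the double quote as its default quoting character, so single quotes are kept as ordinary text and then cleaned up into syntactically valid names.

```r
# A shortened copy of the file contents, enough to inspect the header.
txt <- "'Group','LiteracyLevel','Frequency'
'Shifting','Illerate',114
'Settled','Primary',2"

# Default quote: the single quotes stay in the header and are converted
# into valid R names, which is where the X prefix and the dots come from.
raw_names <- names(read.csv(text = txt))
raw_names

# quote="'": the single quotes are treated as quoting characters and dropped.
clean_names <- names(read.csv(text = txt, quote = "'"))
clean_names
```

Running names() on the data frame right after reading the file is a quick way to see which spelling value.var has to match.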
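This part is just for fun. To create a nice bar plot after this code, do not convert the whole data frame to a matrix. You should keep the first column for the legend.
Let final_data contain the reshaped data. For the matrix, skip the first column and use it as the legend instead.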
barplot(as.matrix(final_data[,2:4]),legend=final_data$"data$Group")
This will give a nice chart.
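As a variation on the plotting step, here is a sketch that moves the group labels into the row names of the matrix, so the legend does not depend on how the first column happens to be named. It assumes the formula was written with bare column names, so that column is called Group rather than data$Group. The name m is only illustrative.

```r
library(reshape2)

# The question's data, built directly so the example is self-contained.
df <- data.frame(
  Group = rep(c("Shifting", "Settled", "Town"), each = 3),
  LiteracyLevel = rep(c("Illerate", "Primary", "AtLeastMiddle"), times = 3),
  Frequency = c(114, 10, 45, 76, 2, 53, 93, 13, 208)
)

final_data <- dcast(df, Group ~ LiteracyLevel, value.var = "Frequency")

# Drop the label column so the matrix is purely numeric, then keep the
# labels as row names.
m <- as.matrix(final_data[, -1])
rownames(m) <- as.character(final_data$Group)

# One stacked bar per literacy level; the legend is taken from the row names.
barplot(m, legend.text = rownames(m))
```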