Multiple linear regression for a dataset in R with ggplot2
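I am testing sentiment analysis on a dataset. Here I am trying to see whether there are any interesting relationships between message volume and buzz, or between message volume and scores...
My dataset looks like this: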
> str(data)
'data.frame': 40 obs. of 11 variables:
$ Date Time : POSIXct, format: "2015-07-08 09:10:00" "2015-07-08 09:10:00" ...
$ Subject : chr "MMM" "ACE" "AES" "AFL" ...
$ Sscore : chr "-0.2280" "-0.4415" "1.9821" "-2.9335" ...
$ Smean : chr "0.2593" "0.3521" "0.0233" "0.0035" ...
$ Svscore : chr "-0.2795" "-0.0374" "1.1743" "-0.2975" ...
$ Sdispersion : chr "0.375" "0.500" "1.000" "1.000" ...
$ Svolume : num 8 4 1 1 5 3 2 1 1 2 ...
$ Sbuzz : chr "0.6026" "0.7200" "1.9445" "0.8321" ...
$ Last close : chr "155.430000000" "104.460000000" "13.200000000" "61.960000000" ...
$ Company name: chr "3M Company" "ACE Limited" "The AES Corporation" "AFLAC Inc." ...
$ Date : Date, format: "2015-07-08" "2015-07-08" ...
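I thought of a linear regression, so I wanted to use ggplot. However, I think I went wrong somewhere in the code below, because no regression line appears... Is it because the regression is too weak? I based my code on the following: code of topchef
Mine is: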
library(ggplot2)
require(ggplot2)
library("reshape2")
require(reshape2)
data.2 = melt(data[3:9], id.vars='Svolume')
ggplot(data.2) +
geom_jitter(aes(value,Svolume, colour=variable),) +
geom_smooth(aes(value,Svolume, colour=variable), method=lm, se=FALSE) +
facet_wrap(~variable, scales="free_x") +
labs(x = "Variables", y = "Svolumes")
But I have probably misunderstood something, because I don't get what I want.
I am new to R, so I would appreciate any help.
I get this error:
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
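Finally, do you think it is possible to use a different colour for each Subject, instead of one colour per variable?
Can I add a regression line to each plot?
Thanks for your help.
Sample data: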
Date Time Subject Sscore Smean Svscore Sdispersion Svolume Sbuzz Last close Company name Date
1 2015-07-08 09:10:00 MMM -0.2280 0.2593 -0.2795 0.375 8 0.6026 155.430000000 3M Company 2015-07-08
2 2015-07-08 09:10:00 ACE -0.4415 0.3521 -0.0374 0.500 4 0.7200 104.460000000 ACE Limited 2015-07-08
3 2015-07-07 09:10:00 AES 1.9821 0.0233 1.1743 1.000 1 1.9445 13.200000000 The AES Corporation 2015-07-07
4 2015-07-04 09:10:00 AFL -2.9335 0.0035 -0.2975 1.000 1 0.8321 61.960000000 AFLAC Inc. 2015-07-04
5 2015-07-07 09:10:00 MMM 0.2977 0.2713 -0.7436 0.400 5 0.4895 155.080000000 3M Company 2015-07-07
6 2015-07-07 09:10:00 ACE -0.2331 0.3519 -0.1118 1.000 3 0.7196 103.330000000 ACE Limited 2015-07-07
7 2015-06-28 09:10:00 AES 1.8721 0.0609 1.9100 0.500 2 2.4319 13.460000000 The AES Corporation 2015-06-28
8 2015-07-03 09:10:00 AFL 0.6024 0.0330 -0.2663 1.000 1 0.6822 61.960000000 AFLAC Inc. 2015-07-03
9 2015-07-06 09:10:00 MMM -1.0057 0.2579 -1.3796 1.000 1 0.4531 155.380000000 3M Company 2015-07-06
10 2015-07-06 09:10:00 ACE -0.0263 0.3435 -0.1904 1.000 2 1.3536 103.740000000 ACE Limited 2015-07-06
11 2015-06-19 09:10:00 AES -1.1981 0.1517 1.2063 1.000 2 1.9427 13.850000000 The AES Corporation 2015-06-19
12 2015-07-02 09:10:00 AFL -0.8247 0.0269 1.8635 1.000 5 2.2454 62.430000000 AFLAC Inc. 2015-07-02
13 2015-07-05 09:10:00 MMM -0.4272 0.3107 -0.7970 0.167 6 0.6003 155.380000000 3M Company 2015-07-05
14 2015-07-04 09:10:00 ACE 0.0642 0.3274 -0.0975 0.667 3 1.2932 103.740000000 ACE Limited 2015-07-04
15 2015-06-17 09:10:00 AES 0.1627 0.1839 1.3141 0.500 2 1.9578 13.580000000 The AES Corporation 2015-06-17
16 2015-07-01 09:10:00 AFL -0.7419 0.0316 1.5699 0.250 4 2.0988 62.200000000 AFLAC Inc. 2015-07-01
17 2015-07-04 09:10:00 MMM -0.5962 0.3484 -1.2481 0.667 3 0.4496 155.380000000 3M Company 2015-07-04
18 2015-07-03 09:10:00 ACE 0.8527 0.3085 0.1944 0.833 6 1.3656 103.740000000 ACE Limited 2015-07-03
19 2015-06-15 09:10:00 AES 0.8145 0.1725 0.2939 1.000 1 1.6121 13.350000000 The AES Corporation 2015-06-15
20 2015-06-30 09:10:00 AFL 0.3076 0.0538 -0.0938 1.000 1 0.7071 61.440000000 AFLAC Inc. 2015-06-30
Input
data <- structure(list(`Date Time` = structure(c(1436361000, 1436361000,
1436274600, 1436015400, 1436274600, 1436274600, 1435497000, 1435929000,
1436188200, 1436188200, 1434719400, 1435842600, 1436101800, 1436015400,
1434546600, 1435756200, 1436015400, 1435929000, 1434373800, 1435669800
), class = c("POSIXct", "POSIXt"), tzone = ""), Subject = c("MMM",
"ACE", "AES", "AFL", "MMM", "ACE", "AES", "AFL", "MMM", "ACE",
"AES", "AFL", "MMM", "ACE", "AES", "AFL", "MMM", "ACE", "AES",
"AFL"), Sscore = c(-0.228, -0.4415, 1.9821, -2.9335, 0.2977,
-0.2331, 1.8721, 0.6024, -1.0057, -0.0263, -1.1981, -0.8247,
-0.4272, 0.0642, 0.1627, -0.7419, -0.5962, 0.8527, 0.8145, 0.3076
), Smean = c(0.2593, 0.3521, 0.0233, 0.0035, 0.2713, 0.3519,
0.0609, 0.033, 0.2579, 0.3435, 0.1517, 0.0269, 0.3107, 0.3274,
0.1839, 0.0316, 0.3484, 0.3085, 0.1725, 0.0538), Svscore = c(-0.2795,
-0.0374, 1.1743, -0.2975, -0.7436, -0.1118, 1.91, -0.2663, -1.3796,
-0.1904, 1.2063, 1.8635, -0.797, -0.0975, 1.3141, 1.5699, -1.2481,
0.1944, 0.2939, -0.0938), Sdispersion = c(0.375, 0.5, 1, 1, 0.4,
1, 0.5, 1, 1, 1, 1, 1, 0.167, 0.667, 0.5, 0.25, 0.667, 0.833,
1, 1), Svolume = c(8L, 4L, 1L, 1L, 5L, 3L, 2L, 1L, 1L, 2L, 2L,
5L, 6L, 3L, 2L, 4L, 3L, 6L, 1L, 1L), Sbuzz = c(0.6026, 0.72,
1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822, 0.4531, 1.3536,
1.9427, 2.2454, 0.6003, 1.2932, 1.9578, 2.0988, 0.4496, 1.3656,
1.6121, 0.7071), `Last close` = c(155.43, 104.46, 13.2, 61.96,
155.08, 103.33, 13.46, 61.96, 155.38, 103.74, 13.85, 62.43, 155.38,
103.74, 13.58, 62.2, 155.38, 103.74, 13.35, 61.44), `Company name` = c("3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc.", "3M Company",
"ACE Limited", "The AES Corporation", "AFLAC Inc."), Date = structure(c(16624,
16624, 16623, 16620, 16623, 16623, 16614, 16619, 16622, 16622,
16605, 16618, 16621, 16620, 16603, 16617, 16620, 16619, 16601,
16616), class = "Date")), .Names = c("Date Time", "Subject",
"Sscore", "Smean", "Svscore", "Sdispersion", "Svolume", "Sbuzz",
"Last close", "Company name", "Date"), row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
"15", "16", "17", "18", "19", "20"), class = "data.frame")
Note the warning Maybe you want aes(group = 1). All I did was add group = 1 to the aes for geom_smooth.
ggplot(data.2) +
geom_jitter(aes(value,Svolume, colour=variable),) +
geom_smooth(aes(value,Svolume, colour=variable, group = 1), method=lm, se=FALSE) +
facet_wrap(~variable, scales="free_x") +
labs(x = "Variables", y = "Svolumes")
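A likely reason for the warning, judging by the str() output in the question, is that Sscore, Smean, Svscore, Sdispersion and Sbuzz are stored as chr. After melt, value is then text, ggplot treats x as discrete, and each group ends up with a single x value. The sketch below is only an illustration under that assumption: df is a hypothetical miniature built from the first five sample rows, and the lm call is one way to check whether the relationship really is weak.

```r
# Hypothetical miniature of the question's data, with scores stored as text
# (values copied from the first five sample rows).
df <- data.frame(
  Svolume = c(8, 4, 1, 1, 5),
  Sscore  = c("-0.2280", "-0.4415", "1.9821", "-2.9335", "0.2977"),
  Sbuzz   = c("0.6026", "0.7200", "1.9445", "0.8321", "0.4895"),
  stringsAsFactors = FALSE
)

# Convert the text columns to numeric before reshaping or plotting.
text_cols <- c("Sscore", "Sbuzz")
df[text_cols] <- lapply(df[text_cols], as.numeric)

str(df)  # Sscore and Sbuzz should now be listed as num

# Fit a simple regression of volume on buzz and look at its strength.
fit <- lm(Svolume ~ Sbuzz, data = df)
r2 <- summary(fit)$r.squared
print(r2)
```

With numeric columns, geom_smooth(method = lm) may well draw a line in each facet even without group = 1, but that depends on the real data, which is not fully shown here.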
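Some unsolicited advice:
- You don't need to use both require and library, one or the other is enough.
- You only need to specify aes once.
- Your example data is invalid - I had to fiddle with it to read it in. See How to make a great R reproducible example? for advice.
Here is how I would write the ggplot code: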
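library(ggplot2)
library(reshape2)
# Reshape to long format, keeping Svolume as the id column
data.2 = melt(data[3:9], id.vars='Svolume')
ggplot(data.2) +
  # Set the shared aesthetics once; every layer inherits them
  aes(x = value, y = Svolume, colour = variable) +
  geom_jitter() +
  # group = 1 fits a single regression line within each facet
  geom_smooth(method=lm, se=FALSE, aes(group = 1)) +
  facet_wrap(~variable, scales="free_x") +
  labs(x = "Variables", y = "Svolumes")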
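On the question about colouring by Subject rather than by variable: one possible approach, sketched here as an assumption rather than something tested on the full dataset, is to keep Subject as a second id column when melting and map colour to it in the point layer only. The names small and data.3 are hypothetical, and small holds just the first eight sample rows so the example runs on its own.

```r
library(ggplot2)
library(reshape2)

# Hypothetical subset: first eight sample rows, three numeric columns only.
small <- data.frame(
  Subject = c("MMM", "ACE", "AES", "AFL", "MMM", "ACE", "AES", "AFL"),
  Sscore  = c(-0.2280, -0.4415, 1.9821, -2.9335, 0.2977, -0.2331, 1.8721, 0.6024),
  Svolume = c(8, 4, 1, 1, 5, 3, 2, 1),
  Sbuzz   = c(0.6026, 0.7200, 1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822),
  stringsAsFactors = FALSE
)

# Keep Subject alongside Svolume so it survives the reshape.
data.3 <- melt(small, id.vars = c("Subject", "Svolume"))

p <- ggplot(data.3) +
  aes(x = value, y = Svolume) +
  # Colour the points by Subject instead of by variable
  geom_jitter(aes(colour = Subject)) +
  # One overall regression line per facet, drawn in a neutral colour
  geom_smooth(method = lm, se = FALSE, aes(group = 1), colour = "black") +
  facet_wrap(~variable, scales = "free_x") +
  labs(x = "Variables", y = "Svolumes")

print(p)
```

If a separate line per Subject is wanted instead, replacing aes(group = 1) with aes(group = Subject, colour = Subject) and dropping the fixed colour should do it, although with only a few points per Subject those fits would be very unstable.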