Graphing Segments in R
I want to graph segmented data in R. That is, say I have data of the following form:
| Product | Date | Origination | Rate | Num | Balance |
|-----------------------|--------|-------------|------|-----|-----------|
| DEMAND DEPOSITS | 200505 | 198209 | 0 | 1 | 2586.25 |
| DEMAND DEPOSITS | 200505 | 198304 | 0 | 1 | 3557.73 |
| DEMAND DEPOSITS | 200505 | 198308 | 0 | 1 | 14923.72 |
| DEMAND DEPOSITS | 200505 | 198401 | 0 | 1 | 4431.67 |
| DEMAND DEPOSITS | 200505 | 198410 | 0 | 1 | 44555.23 |
| MONEY MARKET ACCOUNTS | 200505 | 198209 | 0.25 | 2 | 65710.01 |
| MONEY MARKET ACCOUNTS | 200505 | 198211 | 0.25 | 2 | 41218.41 |
| MONEY MARKET ACCOUNTS | 200505 | 198304 | 0.25 | 1 | 61421.2 |
| MONEY MARKET ACCOUNTS | 200505 | 198402 | 0.25 | 1 | 13620.17 |
| MONEY MARKET ACCOUNTS | 200505 | 198408 | 0.75 | 1 | 281897.74 |
| MONEY MARKET ACCOUNTS | 200505 | 198410 | 0.25 | 1 | 5131.33 |
| NOW ACCOUNTS | 200505 | 198209 | 0 | 1 | 142744.35 |
| NOW ACCOUNTS | 200505 | 198303 | 0 | 1 | 12191.6 |
| SAVING ACCOUNTS | 200505 | 198301 | 0.25 | 1 | 96936.24 |
| SAVING ACCOUNTS | 200505 | 198302 | 0.25 | 2 | 21764 |
| SAVING ACCOUNTS | 200505 | 198304 | 0.25 | 1 | 14646.55 |
| SAVING ACCOUNTS | 200505 | 198305 | 0.25 | 1 | 20909.7 |
| SAVING ACCOUNTS | 200505 | 198306 | 0.25 | 1 | 66434.56 |
| SAVING ACCOUNTS | 200505 | 198309 | 0.25 | 1 | 20005.56 |
| SAVING ACCOUNTS | 200505 | 198404 | 0.25 | 2 | 16766.56 |
| SAVING ACCOUNTS | 200505 | 198407 | 0.25 | 1 | 47721.97 |
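I'd like to plot one line per 'Product' type, with 'Balance' on the y-axis and 'Origination' on the x-axis. Ideally I'd also like to set colors to distinguish the lines. The data is not currently in data.frame form, so let me know if I need to convert it back to that form.
I'm sure one exists, but I haven't been able to find an informative solution online.
Thanks,
As @zx8754 mentioned, you should provide reproducible data.
I haven't tested the code (since there is no reproducible data), but I would suggest the following, assuming the data is in a data.frame called 'data':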
all_products <- unique(data$Product)
colors_use <- rainbow(length(all_products))
# Draw the first product and set the axis limits so that every product fits
plot(y = data[data$Product == all_products[1], "Balance"],
     x = data[data$Product == all_products[1], "Origination"],
     type = "l",
     col = colors_use[1],
     ylim = range(data$Balance, na.rm = TRUE),
     xlim = range(data$Origination, na.rm = TRUE))
# Add one line per remaining product (the loop is skipped if there is only one)
for (i_product in seq_along(all_products)[-1]) {
  lines(y = data[data$Product == all_products[i_product], "Balance"],
        x = data[data$Product == all_products[i_product], "Origination"],
        col = colors_use[i_product])
}
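I don't have enough reputation to comment, so I'm writing this as an answer. To make @tobiasegli_te's answer shorter, the first plot call can be plot(Balance~Origination,data=data,type='n'), and the subsequent lines calls can then run for i_product in 1:length(all_products). That way you don't need to worry about ylim. Here is an example using the Grunfeld data.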
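z <- read.csv('http://statmath.wu-wien.ac.at/~zeileis/grunfeld/Grunfeld.csv')
# read.csv() keeps firm as text in R >= 4.0, so map it to integer codes for use as a colour index
firm_id <- as.integer(factor(z$firm))
plot(invest~year,data=z,type='n')
for (i in unique(firm_id)) lines(invest~year,data=z,
                                 subset=firm_id==i, col=i)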
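Applied to the data in the question, the same idea might look like the sketch below. It assumes the data has already been put into a data.frame called data (only a few rows of the table are typed in here for illustration), and the legend call is an addition that is not part of the original answer.

```r
# A few rows copied from the question's table; the remaining rows follow the same pattern.
data <- data.frame(
  Product     = c("DEMAND DEPOSITS", "DEMAND DEPOSITS", "DEMAND DEPOSITS",
                  "NOW ACCOUNTS", "NOW ACCOUNTS"),
  Origination = c(198209, 198304, 198308, 198209, 198303),
  Balance     = c(2586.25, 3557.73, 14923.72, 142744.35, 12191.60),
  stringsAsFactors = FALSE
)

all_products <- unique(data$Product)
colors_use   <- rainbow(length(all_products))

# type = "n" sets up the axes from the full data without drawing anything,
# so xlim and ylim do not have to be computed by hand.
plot(Balance ~ Origination, data = data, type = "n")

# Draw one line per product, including the first one.
for (i in seq_along(all_products)) {
  lines(Balance ~ Origination, data = data,
        subset = data$Product == all_products[i], col = colors_use[i])
}
legend("topleft", legend = all_products, col = colors_use, lty = 1)
```

Because lines() connects points in row order, the rows of each product should be sorted by Origination before plotting.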
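Also note that your Origination is not equally spaced. The values are YYYYMM numbers, so going from 198212 to 198301 is a jump of 89 on the axis even though it is only one month. You need to change it to a Date or something similar.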
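One possible way to do that conversion, as a minimal sketch assuming Origination holds YYYYMM numbers as in the question (the variable names origination and origination_date are only illustrative):

```r
# Origination is stored as a YYYYMM number, e.g. 198209 for September 1982.
origination <- c(198209, 198304, 198308, 198401, 198410)

# as.Date() needs a day as well, so append "01" (first of the month) before parsing.
origination_date <- as.Date(paste0(origination, "01"), format = "%Y%m%d")

origination_date
```

The resulting Date vector can be used as the x variable in plot() or lines(), which then spaces the points by actual elapsed time.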
I guess you want something like the following:
df <- as.data.frame(df[c('Product', 'Balance', 'Origination')])
head(df)
Product Balance Origination
1 DEMAND DEPOSITS 2586.25 198209
2 DEMAND DEPOSITS 3557.73 198304
3 DEMAND DEPOSITS 14923.72 198308
4 DEMAND DEPOSITS 4431.67 198401
5 DEMAND DEPOSITS 44555.23 198410
6 MONEY MARKET ACCOUNTS 65710.01 198209
library(ggplot2)
library(scales)
ggplot(df, aes(Origination, Balance, group=Product, col=Product)) +
geom_line(lwd=1.2) + scale_y_continuous(labels = comma)
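I'm not sure what you want. Is this what you are looking for?
Assuming you put the data in data.txt, with the pipes removed and the spaces in the names replaced with "_":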
d = read.table("data.txt", header=TRUE)
prod.col = c("red", "blue", "green", "black" )
prod = unique(d$Product)
# Widen the left margin so the product labels fit
par(mai = c(0.8, 1.8, 0.8, 0.8))
# Empty plot: one row on the y-axis per record, Origination on the x-axis
plot(1, yaxt = 'n', type = "n", axes = TRUE, xlab = "Origination", ylab = "", xlim = c(min(d$Origination), max(d$Origination)), ylim=c(0, nrow(d)+5) )
axis(2, at=seq_len(nrow(d)), labels=d$Product, las = 2, cex.axis=0.5)
mtext(side=2, line=7, "Products")
# One horizontal segment per record, coloured by product
for( i in seq_len(nrow(d)) ){
  myProd = d$Product[i]
  myCol = prod.col[which(prod == myProd)]
  myOrig = d$Origination[i]
  segments( x0 = 0, x1 = myOrig, y0 = i, y1 = i, col = myCol, lwd = 5 )
}
legend( "topright", col=prod.col, legend=prod, cex=0.3, lty=c(1,1), bg="white" )
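As an alternative to editing the file by hand, the pipe-delimited table could probably be parsed directly. The sketch below is one way to do it, not part of the original answer; raw is a hypothetical variable holding a few rows pasted from the question, and with this approach the product names keep their spaces.

```r
# Hypothetical input: a few rows pasted from the question, pipes and all.
raw <- "| Product | Date | Origination | Rate | Num | Balance |
|-----------------------|--------|-------------|------|-----|-----------|
| DEMAND DEPOSITS | 200505 | 198209 | 0 | 1 | 2586.25 |
| MONEY MARKET ACCOUNTS | 200505 | 198209 | 0.25 | 2 | 65710.01 |
| NOW ACCOUNTS | 200505 | 198209 | 0 | 1 | 142744.35 |"

lines_in <- strsplit(raw, "\n", fixed = TRUE)[[1]]

# Drop the |-----| separator row under the header.
lines_in <- lines_in[!grepl("^\\|-", lines_in)]

# Remove the padding around each pipe, then the leading and trailing pipe.
lines_in <- gsub(" *\\| *", "|", lines_in)
lines_in <- gsub("^\\||\\|$", "", lines_in)

# Parse what is left as an ordinary pipe-separated table.
d <- read.table(text = lines_in, sep = "|", header = TRUE,
                stringsAsFactors = FALSE)
str(d)
```

For a file on disk, readLines("data.txt") could replace the strsplit() step.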