How to plot a copy number variation profile in R?
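I am trying to plot a copy number variation profile in R. This is what I am looking for, but for all the cells in my data.
Ploidy is on the Y axis and chromosome number is on the X axis.
Here is my data, and this is what I have tried so far, but it does not give me what I am looking for: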
library(ggplot2)
input <- data.frame(chrom = sample("chr1"),
                    start = sample(c(780000, 2920000, 4920000)),
                    stop = sample(c(2920000, 4920000, 692000)),
                    cell_0 = sample(1), cell_1 = sample(1, 3, 1), cell_2 = sample(2, 1, 2))
ggplot(input, aes(x = chrom, y = cell_0, group = 1)) +
  geom_point() +
  geom_line(color = "#00AFBB", size = 1)
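Here is the link to the whole file.
When I run the code from the answer, this is what I get. I would like all the chromosomes to be laid out horizontally, as in the figure above.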
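We can use facet_wrap to place each chrom side by side. I used a number of formatting options to make the plot look like the one shown above. To illustrate this better, I also made my own data with two chrom values. See below: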
input <- read.table(text = "chrom start stop cell_0 cell_1 cell_2
chr1 780000 2920000 2 2 2
chr1 2920000 4920000 1 2 3
chr1 4920000 6920000 2 3 2
chr2 480000 1920000 1 2 3
chr2 1920000 2920000 2 2 2
chr2 2920000 3920000 1 3 3", header = TRUE)
library(ggplot2)
library(tidyr)
input %>%
  # one row per segment endpoint, so each bin is drawn from its start to its stop
  pivot_longer(c(start, stop)) %>%
  ggplot(., aes(x = value, y = as.factor(cell_0), group = 1L)) +
  geom_point(colour = "grey") +
  # one panel per chromosome, with the chromosome label underneath
  facet_wrap(~chrom, strip.position = "bottom", scales = "free_x") +
  geom_line(color = "#00AFBB", size = 1) +
  theme_bw() +
  # remove the gaps between panels and hide the genomic coordinates on the x axis
  theme(panel.spacing.x = unit(0, "lines"),
        panel.spacing.y = unit(0, "lines"),
        axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        strip.background = element_rect(color = "black", fill = "white")) +
  scale_x_continuous(expand = c(.01, 0)) +
  scale_y_discrete("ploidy", expand = c(.3, .3)) +
  ggtitle("cell_596, 2Mb resolution, mean ploidy 3.04")
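To see why the pipeline above reshapes the data first, here is a minimal sketch of what pivot_longer(c(start, stop)) does to two segments. The data frame name `segments` and its values are made up for illustration. Each segment becomes two rows (its start and its stop), which is what lets geom_line draw a flat stretch at the segment's ploidy.

```r
library(tidyr)

# Two hypothetical segments on one chromosome (illustrative values only)
segments <- data.frame(
  chrom  = c("chr1", "chr1"),
  start  = c(780000, 2920000),
  stop   = c(2920000, 4920000),
  cell_0 = c(2, 1)
)

# start and stop are stacked into a "name"/"value" pair:
# one row per segment endpoint, with cell_0 repeated for both endpoints
long <- pivot_longer(segments, c(start, stop))
print(long)
```

The `value` column is then mapped to the x axis and `cell_0` to the y axis, so both endpoints of a segment sit at the same height.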
Updated solution for the whole data
I added another column to show how this works with two cell columns. The plot will be crowded, though.
# input <- read.table(file = "clipboard", header=T)
## read data from pastebin
library(ggplot2)
library(tidyr)
library(dplyr)
set.seed(123)
input %>%
  # simulated second cell: cell_0 shifted by +1 or -1
  # (sample.int(1, ...) always returns 1; n() is the number of rows)
  mutate(cell_1 = cell_0 +
           sample.int(1, n(), replace = TRUE) * sample(c(-1, 1), n(), replace = TRUE)) %>%
  pivot_longer(c(start, stop), names_to = "step", values_to = "time") %>%
  pivot_longer(c(cell_0, cell_1), names_to = "cell", values_to = "ploidy") %>%
  ggplot(data = ., aes(x = time, y = as.factor(ploidy), group = cell)) +
  geom_point(aes(colour = cell)) +
  # nrow = 1 keeps all chromosomes on a single horizontal row
  facet_wrap(~chrom, strip.position = "bottom", scales = "free_x", nrow = 1) +
  geom_line(aes(color = cell), size = 1, alpha = 0.5) +
  theme_bw() +
  scale_x_continuous(expand = c(.01, 0)) +
  scale_y_discrete("ploidy", expand = c(.1, .1)) +
  theme(panel.spacing.x = unit(0, "lines"),
        panel.spacing.y = unit(0, "lines"),
        axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        strip.background = element_rect(color = "black", fill = "white"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.line = element_line(colour = "black"),
        plot.title = element_text(hjust = 0.5)) +
  ggtitle("cell_596, 2Mb resolution, mean ploidy 3.04")
Final update:
library(ggplot2)
library(tidyr)
library(dplyr)
library(stringr)
input %>%
  pivot_longer(c(start, stop), names_to = "step", values_to = "time") %>%
  # order the chromosomes numerically (chr2 before chr10) instead of alphabetically
  mutate(chrom = factor(chrom, levels = str_sort(unique(chrom), numeric = TRUE))) %>%
  ggplot(data = ., aes(x = time, y = as.factor(cell_0), group = 1L)) +
  geom_point(colour = "grey", size = 0.5) +
  geom_line(color = "#00AFBB", size = 1, alpha = 0.5) +
  facet_wrap(~as.factor(chrom),
             strip.position = "bottom", scales = "free_x", nrow = 1) +
  theme_bw() +
  scale_x_continuous(expand = c(.01, 0)) +
  scale_y_discrete("ploidy", expand = c(.1, .1)) +
  theme(panel.spacing.x = unit(0, "lines"),
        panel.spacing.y = unit(0, "lines"),
        axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        strip.background = element_rect(color = "black", fill = "white"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.line = element_line(colour = "black"),
        plot.title = element_text(hjust = 0.5)) +
  ggtitle("cell_596, 2Mb resolution, mean ploidy 3.04")
Created on 2019-12-10 by the reprex package (v0.3.0)
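The chromosome ordering step in the final update can be checked on its own. The sketch below uses a made-up vector `chroms` (an assumption, not the asker's data) to show how str_sort(..., numeric = TRUE) produces factor levels in natural order, which facet_wrap then follows when laying out the panels.

```r
library(stringr)

# Hypothetical chromosome labels in arbitrary order (illustrative only)
chroms <- c("chr10", "chr2", "chr1", "chrX", "chr2")

# numeric = TRUE compares embedded digits as numbers rather than as characters
natural_levels <- str_sort(unique(chroms), numeric = TRUE)

# Facets are drawn in the order of the factor levels
chrom_factor <- factor(chroms, levels = natural_levels)
print(levels(chrom_factor))
```

Without numeric = TRUE the labels would be sorted as plain text, which places chr10 before chr2.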