R: obtain a matrix of pairwise overlap between ranges
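I have a data frame with ranges like this:
df <- data.frame(label = c("A", "B", "C"),
                 start = c(2, 11, 22),
                 stop  = c(37, 45, 29))
Now I would like to obtain a matrix showing how much overlap (as a percentage) there is between A:B, B:C, A:C, and so on, i.e. how much of range A falls within range B, etc. The output should look like this:
     A    B   C
A  100 76.5 100
B 74.3  100 100
C   20 20.6 100
I tried to get such a matrix with IRanges or GRanges, but that does not seem to be possible. Hopefully someone can help me with this!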
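Base R
# Pairwise overlap length is min(stop) - max(start). After dividing and transposing,
# out[i, j] is the overlap of ranges i and j as a percentage of range j's length.
out <- 100 * with(df, t((outer(stop, stop, pmin) - outer(start, start, pmax)) / (stop - start)))
dimnames(out) <- list(df$label, df$label)
out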
#           A         B   C
# A 100.00000  76.47059 100
# B  74.28571 100.00000 100
# C  20.00000  20.58824 100
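One caveat with the one-liner above: when two ranges do not intersect at all, min(stop) - max(start) is negative, so the matrix would contain a negative percentage. The following is a minimal sketch that clamps such pairs to zero. The function name overlap_pct and the extra range D are hypothetical, added only for illustration, and the convention that each cell is the overlap as a share of the column range's length is inferred from the expected output in the question.

```r
# Hypothetical helper (not part of the original answer): pairwise overlap as a
# percentage of the column range's length, with disjoint pairs clamped to 0.
overlap_pct <- function(start, stop, label) {
  # inter[i, j] = length of the intersection of ranges i and j (0 if disjoint)
  inter <- pmax(outer(stop, stop, pmin) - outer(start, start, pmax), 0)
  # sweep() divides column j by the length of range j
  out <- 100 * sweep(inter, 2, stop - start, "/")
  dimnames(out) <- list(label, label)
  out
}

label <- c("A", "B", "C")
start <- c(2, 11, 22)
stop  <- c(37, 45, 29)

res <- overlap_pct(start, stop, label)
round(res, 1)

# Illustrative extra range D = [50, 60], which touches none of the others
res2 <- overlap_pct(c(start, 50), c(stop, 60), c(label, "D"))
res2["A", "D"]  # clamped to 0 instead of a negative value
```

As a hand check of one cell: A and B share [11, 37], a length of 26, and B has length 34, so row A, column B is 100 * 26 / 34, about 76.5.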
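tidyverse
library(dplyr)
library(tidyr)
# Build every (Var1, Var2) pair, attach both ranges, then compute the overlap
# as a percentage of the Var2 range's length.
expand_grid(Var1 = df$label, Var2 = df$label) %>%
  left_join(df, by = c("Var1" = "label")) %>%
  left_join(df, by = c("Var2" = "label")) %>%
  mutate(
    start = pmax(start.y, start.x),
    stop = pmin(stop.x, stop.y),
    overlap = 100 * (stop - start) / (stop.y - start.y)
  ) %>%
  # id_cols is named explicitly; recent tidyr versions deprecate passing it by position
  pivot_wider(id_cols = Var1, names_from = Var2, values_from = overlap)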
# # A tibble: 3 x 4
#   Var1      A     B     C
#   <chr> <dbl> <dbl> <dbl>
# 1 A     100    76.5   100
# 2 B      74.3 100     100
# 3 C      20    20.6   100
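The tidyverse pipeline returns a tibble, whereas the question asks for a matrix. A minimal sketch of one way to convert it follows, assuming the tibble package is installed alongside dplyr and tidyr. The names wide and mat are hypothetical, and the intermediate start/stop columns are folded into a single mutate() for brevity.

```r
library(dplyr)
library(tidyr)

df <- data.frame(label = c("A", "B", "C"),
                 start = c(2, 11, 22),
                 stop  = c(37, 45, 29))

# Same computation as the pipeline above, stored in a (hypothetical) object `wide`
wide <- expand_grid(Var1 = df$label, Var2 = df$label) %>%
  left_join(df, by = c("Var1" = "label")) %>%
  left_join(df, by = c("Var2" = "label")) %>%
  mutate(overlap = 100 * (pmin(stop.x, stop.y) - pmax(start.x, start.y)) /
           (stop.y - start.y)) %>%
  pivot_wider(id_cols = Var1, names_from = Var2, values_from = overlap)

# Move the label column into the row names, then convert to a numeric matrix
mat <- wide %>%
  tibble::column_to_rownames("Var1") %>%
  as.matrix()
mat
```

After the conversion, cells can be addressed by label, for example mat["A", "B"].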