Divide one value in a data.frame by another in an alternate data.frame based on row and column metadata
I have a data frame with the following structure:
Gene Transcript_ID V1 V2 V3 V4
1 ENSG00000000003.14 ENST00000612152.4 0 6 0 3
2 ENSG00000000003.14 ENST00000373020.8 4 0 5 0
3 ENSG00000000003.14 ENST00000614008.4 0 0 0 0
4 ENSG00000000003.14 ENST00000496771.5 0 3 0 0
In another data frame I have the per-gene totals, structured as follows:
Category V1 V2 V3 V4
1 ENSG00000000003.14 4.00 9.00 5.00 3.00
2 ENSG00000000005.6 0.00 0.00 0.00 0.00
3 ENSG00000000419.12 61.00 94.00 103.00 71.00
4 ENSG00000000457.14 577.01 698.20 815.49 697.72
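I want to divide the values in data.frame 1 by the corresponding aggregate values in data.frame 2, to get the relative proportion of every value.
Could someone suggest some simple syntax for this? Many thanks!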
We could use a join here:
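library(data.table)
nm1 <- paste0("V", 1:4)  # value columns shared by both data frames
# convert df1 to a data.table by reference and make the value columns numeric,
# so the integer columns can hold fractional results
setDT(df1)[, (nm1) := lapply(.SD, as.numeric), .SDcols = nm1]
# join on Gene == Category; divide each df1 column by its i.-prefixed df2 counterpart
df1[df2, (nm1) := Map(`/`, mget(nm1),
      mget(paste0("i.", nm1))), on = .(Gene = Category)]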
Output:
df1
Gene Transcript_ID V1 V2 V3 V4
1: ENSG00000000003.14 ENST00000612152.4 0 0.6666667 0 1
2: ENSG00000000003.14 ENST00000373020.8 1 0.0000000 1 0
3: ENSG00000000003.14 ENST00000614008.4 0 0.0000000 0 0
4: ENSG00000000003.14 ENST00000496771.5 0 0.3333333 0 0
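The same lookup-then-divide idea can be sketched in base R without a join, using `match()` to find the totals row for each transcript. This is a minimal sketch rather than part of the answer above: the data are a trimmed copy of the sample data, and `idx` and `res` are illustrative names.

```r
# Trimmed copy of the sample data (values stored as doubles)
df1 <- data.frame(
  Gene = rep("ENSG00000000003.14", 4),
  Transcript_ID = c("ENST00000612152.4", "ENST00000373020.8",
                    "ENST00000614008.4", "ENST00000496771.5"),
  V1 = c(0, 4, 0, 0), V2 = c(6, 0, 0, 3),
  V3 = c(0, 5, 0, 0), V4 = c(3, 0, 0, 0)
)
df2 <- data.frame(
  Category = c("ENSG00000000003.14", "ENSG00000000005.6"),
  V1 = c(4, 0), V2 = c(9, 0), V3 = c(5, 0), V4 = c(3, 0)
)

nm1 <- paste0("V", 1:4)

# For every row of df1, the row of df2 holding that gene's totals
# (NA if the gene has no entry in df2)
idx <- match(df1$Gene, df2$Category)

# Divide column by column: each df1 column by the matching column of totals
res <- df1
res[nm1] <- Map(`/`, df1[nm1], df2[idx, nm1])
res
```

Unlike the `:=` version, this leaves `df1` untouched and stores the proportions in `res`.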
Data
df1 <- structure(list(Gene = c("ENSG00000000003.14", "ENSG00000000003.14",
"ENSG00000000003.14", "ENSG00000000003.14"), Transcript_ID = c("ENST00000612152.4",
"ENST00000373020.8", "ENST00000614008.4", "ENST00000496771.5"
), V1 = c(0L, 4L, 0L, 0L), V2 = c(6L, 0L, 0L, 3L), V3 = c(0L,
5L, 0L, 0L), V4 = c(3L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1",
"2", "3", "4"))
df2 <- structure(list(Category = c("ENSG00000000003.14", "ENSG00000000005.6",
"ENSG00000000419.12", "ENSG00000000457.14"), V1 = c(4, 0, 61,
577.01), V2 = c(9, 0, 94, 698.2), V3 = c(5, 0, 103, 815.49),
V4 = c(3, 0, 71, 697.72)), class = "data.frame", row.names = c("1",
"2", "3", "4"))
You can also do it this way:
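library(dplyr)
library(tidyr)
df1 %>%
  left_join(df2, by = c('Gene' = 'Category')) %>%  # shared V columns get .x / .y suffixes
  pivot_longer(starts_with('V'),
               names_to = c('name','.value'), names_sep = '[.]') %>%  # columns: name, x, y
  mutate(value = x/y, x = NULL, y = NULL) %>%
  pivot_wider()  # back to one column per V1..V4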
# A tibble: 4 x 6
Gene Transcript_ID V1 V2 V3 V4
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 ENSG000000~ ENST000006121~ 0 0.667 0 1
2 ENSG000000~ ENST000003730~ 1 0 1 0
3 ENSG000000~ ENST000006140~ 0 0 0 0
4 ENSG000000~ ENST000004967~ 0 0.333 0 0
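One caveat applies to either approach: a gene whose total is 0 in df2 (such as ENSG00000000005.6 in the sample data) would give 0/0, which R evaluates to NaN. Below is a minimal base R sketch for that case, assuming you would rather report a proportion of 0. `safe_ratio` is a hypothetical helper written for illustration, not a function from any package.

```r
# Hypothetical helper: divide x by total, returning 0 wherever the total is 0.
# Assumption: a zero total should give a proportion of 0 rather than NaN/Inf.
safe_ratio <- function(x, total) {
  out <- x / total
  out[!is.na(total) & total == 0] <- 0
  out
}

# Example: the middle element has a zero total
safe_ratio(c(6, 0, 3), c(9, 0, 9))
```

It could be used in place of `/` in either answer, for example `Map(safe_ratio, ...)` instead of ``Map(`/`, ...)``.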