How to apply several functions to every possible row combination within a data frame in R?
I have a data frame with coordinates (lon, lat):
lon <- list(505997.627175236, 505997.627175236, 505997.627175236, 505997.627175236)
lon <- do.call(rbind.data.frame, lon)
lat <- list(7941821.025438220, 7941821.025438220, 7941821.025438220, 7941821.025438220)
lat <- do.call(rbind.data.frame, lat)
coord <- cbind(lon, lat)
colnames(coord) <- c("lon", "lat")
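I am trying to compute the Euclidean distance and the angle between all possible row combinations within the data frame.
     lon  lat
v1   x1   y1
v2   x2   y2
v3   x3   y3
v4   x4   y4
The functions should be applied to every possible combination, such as v1-v2, v1-v3, v1-v4, v2-v3 and so on.
Here are the two functions, applied between v1 and v2:
**Euclidean distance** sqrt((x1-x2)^2 + (y1-y2)^2)
**angle** atan2((y1-y2),(x1-x2))*(180/pi)
How can I apply several functions to every possible row combination and get the results in their respective lists? My goal is to use these calculations at each iteration, whatever the number of rows in the input.
Thanks in advance for your answers, and sorry if the question seems silly. I have looked at many posts but could not find a solution that I could understand and reproduce.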
library(dplyr) # provides %>%, mutate() and left_join()

# two vectors (I changed them a little bit)
lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)
# a function for the euclidean distance
eDistance <- function(x1, x2, y1, y2) sqrt((x1-x2)^2 + (y1-y2)^2)
# now we create a dataframe...
df <- data.frame(lon, lat) %>%
  mutate(joinIndex = 1:nrow(.)) # and we add an index column
# ...that looks like this
# lon lat joinIndex
# 1 505997.6 7941821 1
# 2 505597.6 7945821 2
# 3 515997.6 7141821 3
# 4 505297.6 7921821 4
# create all combinations of the join indices
df_combinations <- expand.grid(1:nrow(df), 1:nrow(df))
# Var1 Var2
# 1 1 1
# 2 2 1
# 3 3 1
# 4 4 1
# 5 1 2
# 6 2 2
# 7 3 2
# 8 4 2
# 9 1 3
# 10 2 3
# 11 3 3
# 12 4 3
# 13 1 4
# 14 2 4
# 15 3 4
# 16 4 4
# and join our dataframe first on one index then on the other
df_final <- df_combinations %>%
  left_join(df, by = c("Var1" = "joinIndex")) %>%
  left_join(df, by = c("Var2" = "joinIndex"))
# and then finally calculate the euclidean distance
df_final %>%
  mutate(distance = eDistance(lon.x, lon.y, lat.x, lat.y))
Var1 Var2 lon.x lat.x lon.y lat.y distance
1 1 1 505997.6 7941821 505997.6 7941821 0.00
2 2 1 505597.6 7945821 505997.6 7941821 4019.95
3 3 1 515997.6 7141821 505997.6 7941821 800062.50
4 4 1 505297.6 7921821 505997.6 7941821 20012.25
5 1 2 505997.6 7941821 505597.6 7945821 4019.95
6 2 2 505597.6 7945821 505597.6 7945821 0.00
7 3 2 515997.6 7141821 505597.6 7945821 804067.26
8 4 2 505297.6 7921821 505597.6 7945821 24001.87
9 1 3 505997.6 7941821 515997.6 7141821 800062.50
10 2 3 505597.6 7945821 515997.6 7141821 804067.26
11 3 3 515997.6 7141821 515997.6 7141821 0.00
12 4 3 505297.6 7921821 515997.6 7141821 780073.39
13 1 4 505997.6 7941821 505297.6 7921821 20012.25
14 2 4 505597.6 7945821 505297.6 7921821 24001.87
15 3 4 515997.6 7141821 505297.6 7921821 780073.39
16 4 4 505297.6 7921821 505297.6 7921821 0.00
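The table above lists each pair twice (for example rows 2 and 5) along with the zero-distance self-pairs, and it only computes the distance. As a minimal base R sketch, not part of the answer above (the names `pair_grid`, `dx` and `dy` are illustrative), the grid could be filtered to `Var1 < Var2` and the angle from the question added in the same pass:

```r
lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)
df <- data.frame(lon, lat)

# all ordered index pairs, then keep each unordered pair once (no self-pairs)
pair_grid <- expand.grid(Var1 = seq_len(nrow(df)), Var2 = seq_len(nrow(df)))
pair_grid <- pair_grid[pair_grid$Var1 < pair_grid$Var2, ]

# coordinate differences, looked up by row index instead of joining
dx <- df$lon[pair_grid$Var1] - df$lon[pair_grid$Var2]
dy <- df$lat[pair_grid$Var1] - df$lat[pair_grid$Var2]

pair_grid$distance <- sqrt(dx^2 + dy^2)
pair_grid$angle <- atan2(dy, dx) * 180 / pi
```

With four input rows this leaves choose(4, 2) = 6 rows, one per unordered pair.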
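The base R function combn generates the combinations of the elements of a vector taken m at a time, and it can optionally apply a function FUN to those combinations. Since the input data is a "data.frame", I combine the rownames 2 by 2.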
euclidean <- function(k){
  f <- function(x, y) sqrt((x[1] - y[1])^2 + (x[2] - y[2])^2)
  x <- unlist(coord[k[1], 1:2])
  y <- unlist(coord[k[2], 1:2])
  f(x, y)
}
angle <- function(k){
  f <- function(x, y) atan2(x[2] - y[2], x[1] - y[1])*(180/pi)
  x <- unlist(coord[k[1], 1:2])
  y <- unlist(coord[k[2], 1:2])
  f(x, y)
}
combn(rownames(coord), 2, euclidean)
#[1] 4019.95 800062.50 20012.25 804067.26 24001.87 780073.39
combn(rownames(coord), 2, angle)
#[1] -84.28941 90.71616 87.99547 90.74110 89.28384 -89.21407
Data.
This is the data from the OP's answer, but without the id column.
lon <- c(505997.627175236, 505597.627175236,
515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220,
7141821.025438220, 7921821.025438220)
coord <- data.frame(lon, lat)
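Since the question asks for several functions at once, one possible variation on the combn approach is to let FUN return both values, so that the pairs are walked only once. This is a sketch under that assumption; `pair_stats` and `results` are hypothetical names, not from the answer above:

```r
lon <- c(505997.627175236, 505597.627175236,
         515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220,
         7141821.025438220, 7921821.025438220)
coord <- data.frame(lon, lat)

# k holds the two row indices of one pair; return both statistics together
pair_stats <- function(k){
  dx <- coord$lon[k[1]] - coord$lon[k[2]]
  dy <- coord$lat[k[1]] - coord$lat[k[2]]
  c(sqrt(dx^2 + dy^2), atan2(dy, dx)*(180/pi))
}

# FUN returns a length-2 vector, so combn() simplifies to a 2 x 6 matrix
res <- combn(seq_len(nrow(coord)), 2, pair_stats)

# split the rows into one list element per function
results <- list(distance = res[1, ], angle = res[2, ])
```

`results$distance` and `results$angle` then hold one value per pair, in the same pair order as the two separate combn calls.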
For a fast Euclidean distance calculation, you can have a look at [link missing].
For the other function, you can do this:
atan2(outer(coord$lat, coord$lat, `-`), outer(coord$lon, coord$lon, `-`))*180/pi
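To make the outer idea concrete, here is a self-contained sketch that builds both full pairwise matrices. Using stats::dist for the distances is an assumption on my part about what a fast option could look like, not necessarily what the answer pointed to; `angle_mat`, `dist_mat` and `d` are illustrative names:

```r
lon <- c(505997.627175236, 505597.627175236,
         515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220,
         7141821.025438220, 7921821.025438220)
coord <- data.frame(lon, lat)

# entry [i, j] is the angle of row i relative to row j, in degrees
angle_mat <- atan2(outer(coord$lat, coord$lat, `-`),
                   outer(coord$lon, coord$lon, `-`))*180/pi

# the same outer() trick gives the full distance matrix
dist_mat <- sqrt(outer(coord$lon, coord$lon, `-`)^2 +
                 outer(coord$lat, coord$lat, `-`)^2)

# dist() returns each unordered pair once (Euclidean by default)
d <- dist(coord)
```

Both matrices are n x n, with the diagonal holding the self-pairs. For the angle, `angle_mat[i, j]` and `angle_mat[j, i]` differ by 180 degrees, so it matters which triangle is read.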
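In the end, I adapted the code provided by Georgery, but I used "combn" instead of "expand.grid" to avoid duplicated row combinations when applying the functions to the final data frame. I also had to use the function "convert" from the package "hablar" to correctly convert the factors of my data frame "coord_combn" to numeric.
Here is the code: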
library(dplyr)  # %>%, mutate(), left_join()
library(hablar) # convert(), num()

lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)
# dataframe creation + adding of an id column
coord <- data.frame(lon, lat) %>%
  mutate(id = 1:nrow(.))
coord_combn <- combn(rownames(coord), 2) # all the possible row combinations
coord_combn <- as.data.frame(t(coord_combn)) # transpose columns into rows
coord_combn <- coord_combn %>%
  convert(num(V1, V2)) # factor (or character) to numeric
# join our dataframe first on one index then on the other
coord_final <- coord_combn %>%
  left_join(coord, by = c("V1" = "id")) %>%
  left_join(coord, by = c("V2" = "id"))
eDistance <- function(x1, x2, y1, y2) sqrt((x1-x2)^2 + (y1-y2)^2)
# use the built-in constant pi rather than a hand-typed approximation
eAngle <- function(x1, x2, y1, y2) atan2((y1-y2),(x1-x2))*(180/pi)
# euclidean distance calculation
coord_final <- coord_final %>%
  mutate(distance = eDistance(lon.x, lon.y, lat.x, lat.y))
# angle calculation
coord_final <- coord_final %>%
  mutate(angle = eAngle(lon.x, lon.y, lat.x, lat.y))
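Because the goal is to rerun this at each iteration for any number of rows, the steps above could also be wrapped in a function. The following is only a base R sketch: `pairwise_apply` and its `funs` argument are hypothetical names, and it assumes the input has columns `lon` and `lat` and at least two rows.

```r
eDistance <- function(x1, x2, y1, y2) sqrt((x1-x2)^2 + (y1-y2)^2)
eAngle <- function(x1, x2, y1, y2) atan2((y1-y2),(x1-x2))*(180/pi)

# funs: named list of functions taking (x1, x2, y1, y2), vectorised over pairs
pairwise_apply <- function(coord, funs) {
  idx <- t(combn(seq_len(nrow(coord)), 2)) # one row per pair, V1 < V2
  a <- coord[idx[, 1], , drop = FALSE]     # first point of each pair
  b <- coord[idx[, 2], , drop = FALSE]     # second point of each pair
  out <- data.frame(V1 = idx[, 1], V2 = idx[, 2])
  for (nm in names(funs)) {
    out[[nm]] <- funs[[nm]](a$lon, b$lon, a$lat, b$lat)
  }
  out
}

lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)
res <- pairwise_apply(data.frame(lon, lat),
                      list(distance = eDistance, angle = eAngle))
```

Each entry of `funs` becomes one result column, so adding a third statistic only means adding one more named function to the list.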
Thank you all, you have been a great help.