Select elements closest to a defined value while avoiding duplicates
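I have the following problem. I have a number of plots covering a biological gradient. From these plots, I want to select the 25 that cover the gradient best. To do so, I extracted the minimum and maximum values and computed the sequence of values that would cover the gradient optimally. I then selected the plots closest to those ideal values. This works well. However, sometimes one plot is the closest to two theoretical values, so I end up with duplicates in the list, which I want to avoid. Obviously, I could increase length.out, but from my point of view that is not an optimal solution. I would like to end up with 25 selected, unique plots.
The code below illustrates the problem: length.out is set to 25, but only 19 plots are selected.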
data <- structure(list(Plot = c("3", "4", "5", "6", "8", "12", "14",
"15", "17", "18", "19", "20", "21", "22", "23", "25", "26", "28",
"29", "30", "32", "33", "34", "35", "36", "37", "38", "39", "40",
"41", "42", "43", "44", "45", "46", "47", "48", "49"), Value = c(2.19490722347427,
0.817884294633935, 0.834577676660982, 1.19923035999043, 0.293146158435238,
1.93237941781986, 1.74536845664897, 2.22904916731729, 0.789604037117133,
0.439716474953651, 0.834321473446987, 1.07386786707173, 0.977203815084214,
0.539717907433468, 0.950019385036826, 1.10794069639141, 1.41499437622422,
1.12933520841724, 1.99342508363262, 1.05715847816517, 2.27711128641038,
1.9766526350752, 2.16657914911448, 2.01955890337827, 1.1080527140292,
1.16614766657035, 1.04478527637105, 0.980792736677819, 0.818000882117776,
0.656157422806534, 1.07223822052094, 0.799912719334531, 0.4365715090508,
0.824331627537106, 1.19478221856558, 1.06047128780385, 1.54822823084764,
0.582397279167692)), class = "data.frame", row.names = c("3",
"4", "5", "6", "8", "12", "14", "15", "17", "18", "19", "20",
"21", "22", "23", "25", "26", "28", "29", "30", "32", "33", "34",
"35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45",
"46", "47", "48", "49"))
opt_seq <- seq(min(data$Value), max(data$Value), length.out = 25)
sel_plots <- sapply(opt_seq, function(i) which.min(abs(data$Value - i))) # 25 indices, some repeated
length(unique(sel_plots))
Thank you very much for your help!!
You can try:
sel_plots <- logical(nrow(data))
for(i in opt_seq) {
sel_plots[which(!sel_plots)[which.min(abs(data$Value[!sel_plots] - i))]] <- TRUE
}
sel_plots <- which(sel_plots)
length(unique(sel_plots))
#[1] 25
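The loop above could also be wrapped in a small helper. This is only a sketch under a few assumptions: the function name select_unique_nearest and its arguments are hypothetical (not from the answer), the example vectors are made up, and the strategy is the same greedy one, so each target takes the nearest value not yet taken, in the order the targets are given.

```r
# Hypothetical helper: for each target, pick the nearest value
# that has not been picked by an earlier target.
select_unique_nearest <- function(values, targets) {
  # Unique picks are impossible if there are more targets than values
  stopifnot(length(targets) <= length(values))
  taken <- logical(length(values))      # TRUE once a position is used
  picked <- integer(length(targets))    # position chosen for each target
  for (k in seq_along(targets)) {
    free <- which(!taken)
    j <- free[which.min(abs(values[free] - targets[k]))]
    taken[j] <- TRUE
    picked[k] <- j
  }
  picked
}

# Made-up example: the first two targets are both nearest to vals[1]
vals <- c(1.0, 1.1, 5.0, 9.0)
tgts <- c(1.0, 1.04, 9.0)
naive <- sapply(tgts, function(t) which.min(abs(vals - t)))
idx <- select_unique_nearest(vals, tgts)
naive  # position 1 appears twice
idx    # three distinct positions
```

With the data from the question, the call would presumably be data$Plot[select_unique_nearest(data$Value, opt_seq)] to get plot labels. Because the selection is greedy, the result can depend on the order of the targets.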
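One way to do this is to find, in a for loop, the element with the smallest absolute difference using abs, rank, and which.min, and to remove that element after each iteration.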
y <- data$Value ## copy values column
r <- c() ## initialize result vector
for (x in opt_seq) {
i <- which.min(rank(abs(x - y)))
r <- c(r, y[i])
y <- y[-i]
}
r
# [1] 0.2931462 0.4365715 0.4397165 0.5397179 ...
stopifnot(!any(duplicated(r)) & length(r) == 25)
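Note that r holds the selected values, not the plot labels. If the plots themselves are needed, one possible variant is to shrink a vector of row indices instead of the value vector. The sketch below uses a made-up data frame df and made-up targets; only the removal-after-each-iteration idea comes from the answer.

```r
# Made-up data for illustration
df <- data.frame(Plot = c("a", "b", "c", "d", "e"),
                 Value = c(0.2, 0.5, 1.1, 1.6, 2.0))
targets <- c(0.2, 1.0, 1.2, 2.0)   # 1.0 and 1.2 are both nearest to plot "c"

remaining <- seq_len(nrow(df))     # row indices still available
chosen <- integer(0)               # row indices selected so far
for (x in targets) {
  i <- which.min(abs(x - df$Value[remaining]))  # position within 'remaining'
  chosen <- c(chosen, remaining[i])             # map back to the row of df
  remaining <- remaining[-i]                    # that row cannot be reused
}
df[chosen, ]   # selected rows, in target order
```

Here the third target falls back to the next-nearest free plot because the nearest one was already used by the second target.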