uniroot() function in source code does not work after modification; could not figure out the error
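I am trying to find the coordinates of the intersection point of two curves in R. The input data are the coordinates of empirical points from the two curves. My solution is to use the function curve_intersect(). I need to do this for 2000 replications (i.e., 2000 pairs of curves), so I put the data in two lists. Each list contains 2000 data frames, and each data frame holds the x and y coordinates of one curve.
Here is my data: data
Below is the code I used.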
threshold_or1 <- map2_df(recall_or1_4, precision_or1_4,
                         ~curve_intersect(.x, .y, empirical = TRUE, domain = NULL))
# recall_or1_4 is a list of 2000 data frames. Each data frame
# contains coordinates from curve #1.
# precision_or1_4 is a list of 2000 data frames. Each data frame
# contains coordinates from curve #2.
I received the error message below.
Error in uniroot(function(x) curve1_f(x) - curve2_f(x), c(min(curve1$x), : f() values at end points not of opposite sign
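Because curve_intersect() can be applied successfully to some individual data frames in the two lists, I ran the code below to see exactly which pair of data frames caused the process to fail.
for (i in 1:2000) {
  curve_intersect(recall_or1_4[[i]], precision_or1_4[[i]], empirical = TRUE, domain = NULL)
  print(paste("i=", i))
}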
Then I got the following output, which means the process ran successfully until it reached data pair #460. So I examined that individual pair.
[1] "i= 457"
[1] "i= 458"
[1] "i= 459"
Error in uniroot(function(x) curve1_f(x) - curve2_f(x), c(min(curve1$x), : f() values at end points not of opposite sign
I plotted data pair #460.
test1 <- precision_or1_4[[460]] %>% mutate(statistics = 'precision')
test2 <- recall_or1_4[[460]] %>% mutate(statistics = 'recall')
test3 <- rbind(test1, test2)
test3 <- test3 %>% mutate(statistics = as.factor(statistics))
curve_test3 <- ggplot(test3, aes(x = x, y = y))+
geom_line(aes(colour = statistics))
curve_test3
[Figure: Find coordinates of the intersection point]
Then I went and modified the source code of curve_intersect(). The original source code is:
curve_intersect <- function(curve1, curve2, empirical=TRUE, domain=NULL) {
  if (!empirical & missing(domain)) {
    stop("'domain' must be provided with non-empirical curves")
  }
  if (!empirical & (length(domain) != 2 | !is.numeric(domain))) {
    stop("'domain' must be a two-value numeric vector, like c(0, 10)")
  }
  if (empirical) {
    # Approximate the functional form of both curves
    curve1_f <- approxfun(curve1$x, curve1$y, rule = 2)
    curve2_f <- approxfun(curve2$x, curve2$y, rule = 2)
    # Calculate the intersection of curve 1 and curve 2 along the x-axis
    point_x <- uniroot(function(x) curve1_f(x) - curve2_f(x),
                       c(min(curve1$x), max(curve1$x)))$root
    # Find where point_x is in curve 2
    point_y <- curve2_f(point_x)
  } else {
    # Calculate the intersection of curve 1 and curve 2 along the x-axis
    # within the given domain
    point_x <- uniroot(function(x) curve1(x) - curve2(x), domain)$root
    # Find where point_x is in curve 2
    point_y <- curve2(point_x)
  }
  return(list(x = point_x, y = point_y))
}
I modified the uniroot() part of the third if statement. Instead of passing c(min(curve1$x), max(curve1$x)) as the interval argument of uniroot(), I used lower = -100000000, upper = 100000000. The modified function is:
curve_intersect_tq <- function(curve1, curve2, empirical=TRUE, domain=NULL) {
  if (!empirical & missing(domain)) {
    stop("'domain' must be provided with non-empirical curves")
  }
  if (!empirical & (length(domain) != 2 | !is.numeric(domain))) {
    stop("'domain' must be a two-value numeric vector, like c(0, 10)")
  }
  if (empirical) {
    # Approximate the functional form of both curves
    curve1_f <- approxfun(curve1$x, curve1$y, rule = 2)
    curve2_f <- approxfun(curve2$x, curve2$y, rule = 2)
    # Calculate the intersection of curve 1 and curve 2 along the x-axis
    point_x <- uniroot(function(x) curve1_f(x) - curve2_f(x),
                       lower = -100000000, upper = 100000000)$root
    # Find where point_x is in curve 2
    point_y <- curve2_f(point_x)
  } else {
    # Calculate the intersection of curve 1 and curve 2 along the x-axis
    # within the given domain
    point_x <- uniroot(function(x) curve1(x) - curve2(x), domain)$root
    # Find where point_x is in curve 2
    point_y <- curve2(point_x)
  }
  return(list(x = point_x, y = point_y))
}
I tried changing the values of the lower = and upper = arguments. It did not work. I received the same error message, shown below.
curve_intersect_tq(recall_or1_4[[460]], precision_or1_4[[460]], empirical = TRUE, domain = NULL)
Error in uniroot(function(x) curve1_f(x) - curve2_f(x), c(min(curve1$x), :
f() values at end points not of opposite sign
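I also tried possibly(fun, NA) from the tidyverse (purrr), hoping the process could keep running even when an error occurs. It did not work when I used
(1) possibly(curve_intersect(), NA)
or
(2) possibly(uniroot(), NA)
The same error message appeared.
Why am I getting this error message? What are possible solutions? Thanks in advance.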
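I may be a bit late to the party, but here is why your code still fails and what you can do about it, depending on what you want from your analysis:
First, the reason your code fails even after your adaptation: you only told uniroot to search a wider window in x. However, the underlying curves never intersect: there is simply no x for which curve1_f(x) - curve2_f(x) == 0.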
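One way to check this for a given pair is to evaluate the difference of the two interpolated curves on a grid over the observed x-range and count sign changes. The sketch below uses invented data, and the helper name count_sign_changes is hypothetical, not part of any package.

```r
# Hypothetical helper: count sign changes of curve1 - curve2 over curve1's x-range
count_sign_changes <- function(curve1, curve2, n = 1000) {
  f1 <- approxfun(curve1$x, curve1$y, rule = 2)
  f2 <- approxfun(curve2$x, curve2$y, rule = 2)
  grid <- seq(min(curve1$x), max(curve1$x), length.out = n)
  s <- sign(f1(grid) - f2(grid))
  s <- s[s != 0]          # drop exact ties so they are not counted twice
  sum(diff(s) != 0)       # number of times the sign flips along the grid
}

# Invented curves (assumptions, not the data from the question)
falling  <- data.frame(x = c(0, 0.5, 1), y = c(1.0, 0.5, 0.0))
rising   <- data.frame(x = c(0, 0.5, 1), y = c(0.0, 0.5, 1.0))
far_away <- data.frame(x = c(0, 0.5, 1), y = c(2.0, 2.5, 3.0))

n_cross_meet  <- count_sign_changes(falling, rising)    # these two cross once
n_cross_apart <- count_sign_changes(falling, far_away)  # these two never meet
print(c(meet = n_cross_meet, apart = n_cross_apart))
```

A count of zero means there is nothing for uniroot to find. Note also that uniroot only compares the signs at the two endpoints, so a pair that crosses an even number of times inside the range would trigger the same error even though intersections exist.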
From the documentation of uniroot:
"The function values at the endpoints must be of opposite signs (or zero), for extendInt="no", the default."
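In the original curve_intersect implementation, uniroot searches the x-interval defined by the data (i.e. c(min(curve1$x), max(curve1$x))). In your change, you told it to search the x-interval [-100000000, 100000000]. You could also set extendInt = "yes", but it would not change anything.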
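Here is a small sketch of the requirement quoted above, using made-up functions rather than your data. When the difference between two curves never changes sign, uniroot signals an error regardless of how wide the interval is, with or without extendInt = "yes". The helper try_root is a hypothetical convenience wrapper.

```r
# Toy stand-ins (assumptions, not the asker's data): two curves that never meet
curve_a <- function(x) rep(0.8, length(x))       # flat at 0.8
curve_b <- function(x) rep(0.3, length(x))       # flat at 0.3
diff_ab <- function(x) curve_a(x) - curve_b(x)   # always 0.5, never changes sign

# Hypothetical helper: return the root, or NA if uniroot() signals an error
try_root <- function(f, ...) {
  tryCatch(uniroot(f, ...)$root, error = function(e) NA_real_)
}

narrow   <- try_root(diff_ab, interval = c(0, 1))
wide     <- try_root(diff_ab, lower = -1e8, upper = 1e8)
extended <- try_root(diff_ab, interval = c(0, 1), extendInt = "yes")

# A function that does change sign on the interval is solved without trouble
crossing <- try_root(function(x) x - 0.25, interval = c(0, 1))

print(c(narrow = narrow, wide = wide, extended = extended, crossing = crossing))
```

Only the last call returns a number; the first three come back as NA because uniroot stops with an error each time.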
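The problem is not the search interval, but approxfun!
approxfun only helps you by interpolating the empirical data between points. Outside the data you pass in, the returned function does not know what to do.
approxfun lets you either specify explicit values of y to be returned outside the empirically defined window (via its arguments yleft/yright), or set a rule for each side.
In the code you posted above, rule = 2 means "use the value at the closest data extreme". So approxfun does not extrapolate the data you pass in; it only extends the known data.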
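To make the difference between these options concrete, here is a minimal sketch with invented points (the values are not from your data). It shows what the function returned by approxfun gives inside and outside the observed x-range under rule = 1, rule = 2, and explicit yleft/yright.

```r
# Invented example points: x observed only on [0.2, 0.8]
x_obs <- c(0.2, 0.5, 0.8)
y_obs <- c(1.0, 0.7, 0.4)

f_rule1 <- approxfun(x_obs, y_obs, rule = 1)               # NA outside the data
f_rule2 <- approxfun(x_obs, y_obs, rule = 2)               # nearest end value outside
f_fixed <- approxfun(x_obs, y_obs, yleft = 0, yright = 1)  # explicit outside values

x_new <- c(-5, 0.35, 100)   # left of the data, inside the data, right of the data

inside_outside <- data.frame(
  x     = x_new,
  rule1 = f_rule1(x_new),
  rule2 = f_rule2(x_new),
  fixed = f_fixed(x_new)
)
print(inside_outside)
```

With rule = 2 both curves are flat outside the data, so their difference is constant there and a wider search interval cannot produce a sign change.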
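We can plot how curve1_f and curve2_f extend beyond the empirically defined x-interval towards infinity:
# Assumed setup (not shown in the original answer): the failing pair #460,
# with curve1 = recall and curve2 = precision as in the curve_intersect() call
curve1 <- recall_or1_4[[460]]
curve2 <- precision_or1_4[[460]]
curve1_f <- approxfun(curve1$x, curve1$y, rule = 2)
curve2_f <- approxfun(curve2$x, curve2$y, rule = 2)

tibble(
  x = seq(0, 1, by = 0.001),
  curve1_approxed = curve1_f(x),
  curve2_approxed = curve2_f(x)
) %>%
  pivot_longer(starts_with("curve"), names_to = "curve", values_to = "y") %>%
  ggplot(aes(x = x, y = y, color = curve)) +
  geom_line() +
  geom_vline(xintercept = c(min(curve1$x), max(curve1$x)), color = "grey75")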
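So, what can you do to keep your code from crashing?
(Spoiler: it depends heavily on what you are trying to accomplish with your project.)
- Accept that there is no intersection within the observed range of your data.
If you do not want to make any assumptions, I suggest wrapping the mapped function in a tryCatch statement and letting it fail wherever the out-of-the-box solution gives you no result. Let's run this on the part of your list that previously crashed the whole thing: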
threshold_or1.fix1 <- map2_df(
  recall_or1_4, precision_or1_4,
  ~tryCatch({
    curve_intersect(.x, .y, empirical = TRUE, domain = NULL)
  }, error = function(e){
    return(tibble(.rows = 1))
  }),
  .id = "i"
)
Now, wherever curve_intersect fails to give you a result, there is just a row of NAs.
threshold_or1.fix1[459:461,]
# A tibble: 3 x 3
  i          x      y
  <chr>  <dbl>  <dbl>
1 459    0.116  0.809
2 460   NA     NA
3 461    0.264  0.773
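The possibly() function from purrr that you tried can do the same job as the tryCatch wrapper, with one caveat: possibly() takes the function itself (curve_intersect), not a call (curve_intersect()). Wrapping uniroot separately also has no effect on the uniroot call made inside curve_intersect. Here is a minimal sketch with a hypothetical stand-in function, find_root, since your data are not available here:

```r
library(purrr)

# Hypothetical stand-in for curve_intersect(): errors when no root is bracketed
find_root <- function(offset) {
  uniroot(function(x) x + offset, interval = c(0, 1))$root
}

# Pass the function object, not a call such as find_root()
safe_find_root <- possibly(find_root, otherwise = NA_real_)

# The second offset gives a function with no sign change on [0, 1]
roots <- map_dbl(c(-0.5, 2, -0.25), safe_find_root)
print(roots)

# With the data from the question this would look something like (untested here):
# safe_intersect <- possibly(curve_intersect,
#                            otherwise = list(x = NA_real_, y = NA_real_))
# threshold_or1 <- map2_df(recall_or1_4, precision_or1_4, safe_intersect, .id = "i")
```

The failing element becomes NA while the other two roots are returned as usual.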
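- Try to extrapolate your data with a linear model.
In this case we will use a custom curve_intersect function. Let's wrap the problematic uniroot call in a tryCatch and, if no root can be found, fit an lm to each curve and let uniroot search for the intersection of the fitted lines.
Depending on your experiment this may or may not make sense, so I will leave the judgment to you. Obviously, if your data are more complex than this, you could use models other than a simple lm...
Just to visualize this approach against the default one: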
tibble(
  x = seq(-1, 2, by = 0.001),
  curve1_approxed = curve1_f(x),
  curve2_approxed = curve2_f(x),
  curve1_lm = predict(lm(y ~ x, data = curve1), newdata = tibble(x = x)),
  curve2_lm = predict(lm(y ~ x, data = curve2), newdata = tibble(x = x))
) %>%
  pivot_longer(starts_with("curve"), names_to = "curve", values_to = "y") %>%
  ggplot(aes(x = x, y = y, color = curve)) +
  geom_line() +
  geom_vline(xintercept = c(min(curve1$x), max(curve1$x)), color = "grey75")
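You can see that where approxfun "fails", with lm we assume that we can extrapolate linearly, and we find an intersection at around x = 1.27, outside the observed frame.
To take the second approach and include the lm extrapolation in our search, you could cobble together something like this:
(Again, only the third if has been edited.)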
curve_intersect_custom <- function(curve1, curve2, empirical=TRUE, domain=NULL) {
  if (!empirical & missing(domain)) {
    stop("'domain' must be provided with non-empirical curves")
  }
  if (!empirical & (length(domain) != 2 | !is.numeric(domain))) {
    stop("'domain' must be a two-value numeric vector, like c(0, 10)")
  }
  if (empirical) {
    return(
      tryCatch({
        # Approximate the functional form of both curves
        curve1_f <- approxfun(curve1$x, curve1$y, rule = 2)
        curve2_f <- approxfun(curve2$x, curve2$y, rule = 2)
        # Calculate the intersection of curve 1 and curve 2 along the x-axis
        point_x <- uniroot(
          f = function(x) curve1_f(x) - curve2_f(x),
          interval = c(min(curve1$x), max(curve1$x))
        )$root
        # Find where point_x is in curve 2
        point_y <- curve2_f(point_x)
        return(list(x = point_x, y = point_y, method = "approxfun"))
      }, error = function(e) {
        tryCatch({
          # Fallback: fit each linear model once, then extrapolate with predict()
          curve1_lm <- lm(y ~ x, data = curve1)
          curve2_lm <- lm(y ~ x, data = curve2)
          curve1_lm_f <- function(x) predict(curve1_lm, newdata = tibble(x = x))
          curve2_lm_f <- function(x) predict(curve2_lm, newdata = tibble(x = x))
          point_x <- uniroot(
            f = function(x) curve1_lm_f(x) - curve2_lm_f(x),
            interval = c(min(curve1$x), max(curve1$x)),
            extendInt = "yes"
          )$root
          point_y <- curve2_lm_f(point_x)
          return(list(x = point_x, y = point_y, method = "lm"))
        }, error = function(e) {
          # Neither interpolation nor linear extrapolation found a root
          return(list(x = NA_real_, y = NA_real_, method = NA_character_))
        })
      })
    )
  } else {
    # Calculate the intersection of curve 1 and curve 2 along the x-axis
    # within the given domain
    point_x <- uniroot(function(x) curve1(x) - curve2(x), domain)$root
    # Find where point_x is in curve 2
    point_y <- curve2(point_x)
  }
  return(list(x = point_x, y = point_y))
}
For your problematic list element, it now tries to extrapolate using the naively fitted lm model:
threshold_or1.fix2 <- map2_df(
  recall_or1_4, precision_or1_4,
  ~curve_intersect_custom(.x, .y, empirical = TRUE, domain = NULL),
  .id = "i"
)
threshold_or1.fix2[459:461,]
# A tibble: 3 x 4
  i         x     y method
  <chr> <dbl> <dbl> <chr>
1 459   0.116 0.809 approxfun
2 460   1.27  0.813 lm
3 461   0.264 0.773 approxfun
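Because both fallback models are straight lines, the crossing point can also be computed directly from the fitted coefficients instead of running uniroot on the predictions. Here is a small sketch under that assumption, with invented data and a hypothetical helper name lm_intersect:

```r
# Hypothetical helper: intersection of two simple linear fits y ~ x
lm_intersect <- function(curve1, curve2) {
  b1 <- coef(lm(y ~ x, data = curve1))   # c(intercept, slope) for curve 1
  b2 <- coef(lm(y ~ x, data = curve2))   # c(intercept, slope) for curve 2
  slope_diff <- b1[[2]] - b2[[2]]
  if (abs(slope_diff) < sqrt(.Machine$double.eps)) {
    # (Nearly) parallel lines: no unique intersection
    return(list(x = NA_real_, y = NA_real_))
  }
  # Solve b1[1] + b1[2] * x == b2[1] + b2[2] * x for x
  x_cross <- (b2[[1]] - b1[[1]]) / slope_diff
  list(x = x_cross, y = b1[[1]] + b1[[2]] * x_cross)
}

# Invented curves observed on [0, 1] that only meet outside that range
curve1 <- data.frame(x = c(0, 0.5, 1), y = c(1.0, 0.9, 0.8))    # y = 1 - 0.2 x
curve2 <- data.frame(x = c(0, 0.5, 1), y = c(0.5, 0.55, 0.6))   # y = 0.5 + 0.1 x

result <- lm_intersect(curve1, curve2)
print(result)
```

Whether a straight-line extrapolation is meaningful for your curves is the same judgment call as before; this only avoids the iterative root search once you have decided to use lm.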
Hope this helps you understand and solve your problem :)