rpy2 - 'R' object has no attribute 'nls'
I am using rpy2 to run some non-linear regression in R from Python.
import rpy2.robjects as robjects
from rpy2.robjects import DataFrame, Formula
from rpy2.robjects import r
import rpy2.robjects.numpy2ri as npr
import numpy as np
from rpy2.robjects.packages import importr
r.nls(rates * 1-(1/(10^(a * count ^ (b-1)))), weights=count, start=list(a=a, b=b))
I get the following errors:
LookupError: 'nls' not found
AttributeError: 'R' object has no attribute 'nls'
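Python also reports '~' as invalid syntax (I changed it to * to get past that, but I really do need it to be '~').
What is the problem?
The code runs fine in R.
Here is the complete code that runs fine in R: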
#This recipe assumes that the data is in a csv file called 'ratedata.csv' and that the values are in columns titled:
#Entity, Trials and Successes
#Data must be sorted in order of number of applications (i.e. the 'Trials' column) highest to lowest.
data <- read.csv("ratedata.csv") #get the data
count <- data$Trials #define count as the number of trials
rates <- data$Successes / data$Trials #define rate as the success rate for each entity
a <- .05 #set initial values for a and b to generate predicted rates
b <- 1.1 #these values need to be reasonably sensible otherwise the later estimate will not converge sensibly
fit <- nls(rates ~ 1-(1/(10^(a * count ^ (b-1)))), weights=count, start=list(a=a, b=b)) #non-linear least squares fit of data, weighted by count (weighting is optional but helps if it won't converge sensibly)
summary(fit) #to show estimates of a and b
coef <- as.vector(coef(fit)) #extract the coefficients into a vector for re-use
a <- coef[[1]] # extract the calculated coefficient for a
b <- coef[[2]] # extract the calculated coefficient for b
confidence <- confint(fit)
intervals <- as.vector(confidence[c(2,4)])
predopt <- 1-(1/(10^(a * count ^ (b-1)))) #predict rate by count with optimised coefficients
se <- sqrt(( predopt* (1-predopt))/count) #calculate standard error for predicted rate
upper95 <- predopt + 2*se #upper 95% limit - roughly speaking. Wald interval is appropriate in this case.
lower95 <- predopt - 2*se #lower 95% limit
upper99 <- predopt + 3*se #upper 99% limit
lower99 <- predopt - 3*se #lower 99% limit
xlim <- range(count + 10) #setup plot
ylim <- range(c(upper99, 0)) #lower limit truncated at zero
main <- plot(count, rates, pch = 21, col = "navajowhite4", bg = "mistyrose4") #plot rates by organisation
lines(count, predopt, type="l", xlim=xlim, ylim=ylim, xlab="Trials", ylab="Predicted rate", col = "red") #plot predicted rate
lines (count, upper95, lty="dashed") #plot upper limit
lines (count, lower95, lty="dashed") #plot lower limit
lines (count, upper99, lty="dotted") #plot upper limit
lines (count, lower99, lty="dotted") #plot lower limit
cat("The least-squares values of a and b are", coef[[1]], "and", coef[[2]], "respectively", "\n")
print(confint(fit))
if (intervals[[1]] < 1 & intervals [[2]] > 1)
{
message ("There is probably no relationship between success rate and number of trials")
} else
{
message ("There is probably a relationship between success rate and number of trials")
}
The Trials and Successes columns are just two columns of 48 integers (they can be any integers; Trials ranges from 129 to 2359 and Successes from 8 to 365).
Question updated 25 January 2018, 19:40
The current code is:
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
count = ro.IntVector([1,2,3,4,5])
rates = ro.IntVector([1,2,3,4,5])
a = ro.FloatVector([0.5])
b = ro.FloatVector([1.1])
base = importr('base', robject_translations={'with': '_with'})
stats = importr('stats', robject_translations={'format_perc': '_format_perc'})
my_formula = stats.as_formula('rates ~ 1-(1/(10^(a * count ^ (b-1))))')
d = ro.ListVector({'a': a, 'b': b})
fit = stats.nls(my_formula, weights=count, start=d)
I get the error:
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-2-3f7fcd7d7851> in <module>()
6 d = ro.ListVector({'a': a, 'b': b})
7
----> 8 fit = stats.nls(my_formula, weights=count, start=d)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\rpy2\robjects\functions.py in __call__(self, *args, **kwargs)
176 v = kwargs.pop(k)
177 kwargs[r_k] = v
--> 178 return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
179
180 pattern_link = re.compile(r'\link\{(.+?)\}')
~\AppData\Local\Continuum\anaconda3\lib\site-packages\rpy2\robjects\functions.py in __call__(self, *args, **kwargs)
104 for k, v in kwargs.items():
105 new_kwargs[k] = conversion.py2ri(v)
--> 106 res = super(Function, self).__call__(*new_args, **new_kwargs)
107 res = conversion.ri2ro(res)
108 return res
RRuntimeError: Error in (function (formula, data = parent.frame(), start, control = nls.control(), :
parameters without starting value in 'data': rates, count
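I guess my count and rates variables are not lists? Or is it something else? I have tried messing around with them and converting them, but to no avail. Any help is much appreciated!
Here is the code I wrote for the data frame: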
dataf = ro.DataFrame({})
d = {'count': ro.IntVector((1,2,3,4,5)),'rates': ro.IntVector((1,2,3,4,5))}
dataf = ro.DataFrame(d)
count = dataf.rx(True, 'count')
rates = dataf.rx(True, 'rates')
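Consider importing R's stats and base libraries and then replicating the calls you need. Use as_formula to convert the string representation of the formula into an actual formula object. Since these are default R libraries, you can find out which function belongs to which package, e.g. stats::nls() and base::list().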
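The formula has to be passed as a string because Python only knows `~` as a unary operator (bitwise NOT), so an R formula typed directly as Python code never gets past the parser. A minimal sketch that checks this without R installed (the helper name is_valid_python is made up for illustration):

```python
def is_valid_python(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<formula>", "eval")
        return True
    except SyntaxError:
        return False


# A binary `~` does not exist in Python, so the bare formula is rejected.
print(is_valid_python("rates ~ count"))    # False
# Wrapped in quotes it is just a string literal, which is what as_formula needs.
print(is_valid_python("'rates ~ count'"))  # True
# The unary form parses fine, which is why the error message can be confusing.
print(is_valid_python("~count"))           # True
```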
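Also note that, to conform to Python's syntax rules, any period in an R name is converted to an underscore. Some other functions are renamed to avoid conflicts with Python's own names.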
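To make the renaming rule concrete, here is a rough pure-Python sketch of the idea. It is not rpy2's actual implementation; the function translate_r_names and its overrides argument are hypothetical stand-ins for what importr and robject_translations do.

```python
def translate_r_names(r_names, overrides=None):
    """Map R names to Python-safe names: '.' becomes '_' unless overridden.

    Returns a dict {python_name: r_name} and raises ValueError when two
    R names would end up with the same Python name.
    """
    overrides = overrides or {}
    mapping = {}
    for r_name in r_names:
        py_name = overrides.get(r_name, r_name.replace(".", "_"))
        if py_name in mapping:
            raise ValueError(
                f"{r_name!r} and {mapping[py_name]!r} both map to {py_name!r}"
            )
        mapping[py_name] = r_name
    return mapping


names = translate_r_names(["as.formula", "nls", "nls.control"])
print(names)
# {'as_formula': 'as.formula', 'nls': 'nls', 'nls_control': 'nls.control'}

# Two R names that collide after translation need an explicit override.
fixed = translate_r_names(
    ["format.perc", "format_perc"],
    overrides={"format_perc": "_format_perc"},
)
print(fixed)
# {'format_perc': 'format.perc', '_format_perc': 'format_perc'}
```

This kind of collision is presumably why the importr call for stats passes robject_translations={'format_perc': '_format_perc'}.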
...
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
base = importr('base', robject_translations={'with': '_with'})
stats = importr('stats', robject_translations={'format_perc': '_format_perc'})
my_formula = stats.as_formula('rates ~ 1-(1/(10^(a * count ^ (b-1))))')
d = ro.ListVector({'a': a, 'b': b})
fit = stats.nls(my_formula, weights=count, start=d)
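The error in the updated question ("parameters without starting value in 'data': rates, count") is what nls() reports when it cannot find a variable named in the formula, either in data or in the formula's environment, and therefore treats it as a parameter. Python variables called count and rates are not visible to R. One possible way around this, sketched here and not tested against the asker's data, is to pass them in a data frame through the data argument. The names FORMULA and fit_rates are made up for this sketch.

```python
FORMULA = "rates ~ 1 - (1 / (10^(a * count^(b - 1))))"


def fit_rates(counts, rates, a0=0.05, b0=1.1):
    """Fit the rate model with R's nls(), handing the variables over via `data`.

    rpy2 is imported here instead of at module level so that the module
    still loads on a machine without R; calling the function needs R and rpy2.
    """
    import rpy2.robjects as ro
    from rpy2.robjects.packages import importr

    stats = importr("stats", robject_translations={"format_perc": "_format_perc"})

    count_vec = ro.FloatVector([float(c) for c in counts])
    rates_vec = ro.FloatVector([float(r) for r in rates])

    # Column names must match the variable names used in FORMULA.
    dataf = ro.DataFrame({"count": count_vec, "rates": rates_vec})

    # Only a and b are parameters, so only they get starting values.
    start = ro.ListVector({"a": ro.FloatVector([a0]), "b": ro.FloatVector([b0])})

    fit = stats.nls(ro.Formula(FORMULA), data=dataf, start=start, weights=count_vec)

    # coef() returns a named R vector; turn it into a plain dict.
    coefs = stats.coef(fit)
    return dict(zip(coefs.names, coefs))
```

With the real data this would be called as fit_rates(trials, successes_over_trials), where both arguments are Python sequences of equal length. As the comments in the R script warn, the starting values still need to be reasonably sensible for the fit to converge.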