How to load a model saved as RDS file from Python and make predictions?

Question

I have a machine learning model saved in *.rds format. I want to open this model in Python to make predictions. To do that, I installed rpy2. This is my Jupyter Notebook code:

!pip install rpy2

import json
import pandas as pd
import numpy as np
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri
from rpy2.robjects.packages import importr

r = robjects.r
numpy2ri.activate()

model_rds_path = "model.rds"
model = r.readRDS(model_rds_path)

raw_data = '{"data":[[79],[63]]}'
data = json.loads(raw_data)["data"]

if type(data) is not np.ndarray:
    data = np.array(data)

result = r.predict(model, data, probability=False)
result

I get the following error on the r.predict(…) line:

RRuntimeError: Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : 
  'data' must be a data.frame, not a matrix or an array
Calls: <Anonymous> -> predict.lm -> model.frame -> model.frame.default

The training script in R looks like this:

library(caret)

# Reading `data` from CSV file
x <- data$height
y <- data$weight

model <- lm(y~x)

# Test predictions
df_test_heights <- data.frame(x = as.numeric(c(115,20)))
result <- predict(model,df_test_heights)
print(result)

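I'm stuck. I have spent a whole day trying to solve this. Does anyone know how to fix it? I would also appreciate it if anyone knows an alternative to rpy2 for opening RDS files from Python.

Thanks!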

Answer 1

The R function predict() expects an R data frame for the new data. At this point, however, what you have is a numpy array:

data = json.loads(raw_data)["data"]

if type(data) is not np.ndarray:
    data = np.array(data)

In Python, the pandas DataFrame object is conceptually closer to an R data frame. This section of the rpy2 documentation may help you:

https://rpy2.github.io/doc/v3.2.x/html/pandas.html
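The point above can be sketched without pandas at all, by building the R data frame directly with rpy2. This is a minimal sketch, not a tested solution for the asker's environment: the helper names payload_to_columns and predict_from_rds are hypothetical, and it assumes the model was fitted as lm(y ~ x), so the new data must have a column named x. The rpy2 import sits inside the function so that the JSON helper can be used even where R is not installed.

```python
import json


def payload_to_columns(raw_data, column_names):
    """Turn a JSON payload such as '{"data": [[79], [63]]}' into a
    dict mapping each column name to a list of floats."""
    rows = json.loads(raw_data)["data"]
    return {
        name: [float(row[i]) for row in rows]
        for i, name in enumerate(column_names)
    }


def predict_from_rds(model_path, columns):
    """Load an R model from an RDS file and call R's predict() on it.

    `columns` maps column names to lists of floats. The names must match
    the predictors used when the model was trained (assumed here: "x").
    """
    # Imported lazily: rpy2 needs a working R installation.
    import rpy2.robjects as ro

    model = ro.r["readRDS"](model_path)
    # Build a genuine R data.frame, which is what predict() requires.
    newdata = ro.DataFrame(
        {name: ro.FloatVector(values) for name, values in columns.items()}
    )
    result = ro.r["predict"](model, newdata)
    return list(result)
```

With R and rpy2 available, predict_from_rds("model.rds", payload_to_columns(raw_data, ["x"])) should return one prediction per input row. The column name matters as much as the type: predict() looks up the predictor by name, so a data frame with a differently named column would not be matched to x.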

Answer 2

Here is an option using PypeR:

import json
import pandas as pd
from pyper import R

r = R(use_pandas=True)

# Pass the path of the RDS file to R as a string variable
model_rds_path = "model.rds"
r.assign("rmodel", model_rds_path)

raw_data = '{"data":[[79],[63]]}'
data = json.loads(raw_data)["data"]

# predict() needs a data frame whose column name matches the
# predictor used in training, lm(y ~ x)
data = pd.DataFrame(data, columns=["x"])
r.assign("rdata", data)

# Load the model and predict inside R (note that R spells the boolean FALSE),
# then fetch the result back into Python
expr = "model <- readRDS(rmodel); result <- predict(model, rdata, probability=FALSE)"
r(expr)
res = r.get("result")

python

r

rpy2