Difference between graph and graph learner
I am trying to understand the difference between a Graph and a GraphLearner. I can call $train and $predict on a Graph, but I need the "wrapper" to use row selection and scoring (see the code below). Is there anything that can be done with a Graph that is not also a Learner (i.e. with gr in the code below instead of glrn)?
library(mlr3)
library(mlr3learners)    # provides classif.kknn
library(mlr3pipelines)

task = tsk("spam")  # the question does not define `task`; any binary classification task works

gr = po(lrn("classif.kknn", predict_type = "prob"),
        param_vals = list(k = 10, distance = 2, kernel = "rectangular")) %>>%
  po("threshold", param_vals = list(thresholds = 0.6))
glrn = GraphLearner$new(gr)  # build a GraphLearner from the Graph
glrn$train(task, row_ids = 1:300)  # n.b.: we need a GraphLearner to use row_ids etc.
predictions = glrn$predict(task, row_ids = 327:346)  # would not work with gr
predictions$score(msr("classif.acc"))
predictions$print()
A GraphLearner always wraps a Graph that takes a single Task as input and produces a single Prediction as output. A Graph, however, can represent any kind of computation, and may even take multiple inputs or produce multiple outputs. You would typically use such Graphs as intermediate building blocks while constructing the Graph that trains on a single Task and yields a single Prediction, and then wrap that as a GraphLearner.
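To see a Graph that could never become a GraphLearner, consider the following sketch (the pipeline and its PipeOps are my illustration, not part of the question): po("copy") duplicates the incoming Task, so the Graph ends in two outputs, and $train() returns a list with two entries instead of a single result.

library(mlr3)
library(mlr3pipelines)

# Duplicate the Task, then scale one copy and run PCA on the other.
# This Graph has two outputs, so it cannot be wrapped as a
# GraphLearner, which must produce a single Prediction.
gr2 = po("copy", outnum = 2) %>>%
  gunion(list(po("scale"), po("pca")))

out = gr2$train(tsk("iris"))  # a named list of two Tasks
names(out)                    # "scale.output" and "pca.output"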
In some cases it makes sense to use a Graph on its own: when you have some kind of preprocessing (e.g. imputation or PCA) that should also be applied, in exactly the same way, to unseen data (i.e. applying the same rotation that PCA fitted on the training data), even if your process as a whole is not classical machine learning that produces a prediction model:
data <- tsk("pima")
trainingset <- sample(seq(0, 1, length.out = data$nrow) < 2/3)
data.t <- data$clone(deep = TRUE)$filter(which(trainingset))
data.p <- data$clone(deep = TRUE)$filter(which(!trainingset))
# Operation:
# 1. impute missing values with mean of non-missings in same column
# 2. rotate to principal component axes
imputepca <- po("imputemean") %>>% po("pca")
# Need to take element 1 of result here: 'Graph' could have multiple
# outputs and therefore returns a list. In our case we only have one
# result that we care about.
rotated.t <- imputepca$train(data.t)[[1]]
rotated.t$head(2)
#> diabetes PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> 1: pos -4.744963 27.76824 -5.2432401 9.817512 -9.042784 0.4979002 0.4574355 -0.1058608
#> 2: neg 6.341357 -37.18033 -0.1210501 3.731123 -1.451952 3.6890699 2.3901156 0.0755521
# this data is imputed using the column means of the training data, and then
# rotated by the same rotation as the training data.
rotated.p <- imputepca$predict(data.p)[[1]]
rotated.p$head(2)
#> diabetes PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> 1: pos -11.535952 9.358736 25.1073705 4.761627 -23.313410 -9.743428 3.412071 -1.6403521
#> 2: neg 1.189971 -7.098455 -0.2785817 -3.280845 -0.281516 -2.277787 -6.746323 0.3434535
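What makes this work is that $train() stores whatever it learned in the $state of each PipeOp, and $predict() reuses that state rather than re-estimating it. A quick way to inspect this (the exact state fields below reflect my reading of mlr3pipelines and may vary between versions):

# Column means learned during training, used to impute data.p:
imputepca$pipeops$imputemean$state$model
# Rotation fitted during training, applied unchanged to data.p:
imputepca$pipeops$pca$state$rotation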
However, since mlr3pipelines was built mostly for mlr3, i.e. for Learners that can be trained, resampled and so on, you will usually end up wrapping your Graphs in GraphLearners.
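For example, once the gr from the question is wrapped, the whole pipeline plugs into the usual mlr3 machinery (a sketch, assuming the task defined in the first snippet):

glrn = GraphLearner$new(gr)
# The pipeline now behaves like any other Learner and can be resampled:
rr = resample(task, glrn, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.acc"))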