使用 Python 从经过训练的 Vertex AI 表格回归模型中获取特征重要性

Get feature importance from trained Vertex AI Tabular regression model using Python

我正在处理使用 GCP 中 Vertex 的 Tabular automl 训练的模型。 培训和批量预测工作正常。我正在尝试在可视化中使用特征重要性并尝试从 python 中获取它们。 我可以进入模型:

client = aiplatform.gapic.ModelServiceClient(client_options=client_options)
name = client.model_path(project=project, location='us-central1', model=modelnum)
response = client.get_model(name=name)

但我不知道如何获得训练管道中生成的训练模型的特征重要性。我可以在模型页面上看到它们,但无法从 python.

访问它们

要在“评估”页面上获取详细信息,您需要使用 list_model_evaluations()。这将 return google.cloud.aiplatform_v1.services.model_service.pagers.ListModelEvaluationsPager 包含您在“评估”页面上看到的值。由于您提到要获得特征重要性,因此您需要遍历所述对象并获得 model_explanation。请参阅下面的代码:

from google.cloud import aiplatform_v1 as aiplatform

api_endpoint = 'us-central1-aiplatform.googleapis.com'
client_options = {"api_endpoint": api_endpoint} # api_endpoint is required for client_options
client_model = aiplatform.services.model_service.ModelServiceClient(client_options=client_options)
project_id = 'your-project-id'
location = 'us-central1'
model_id = '9999999999999'

model_name = f'projects/{project_id}/locations/{location}/models/{model_id}'
list_eval_request = aiplatform.types.ListModelEvaluationsRequest(parent=model_name)
list_eval = client_model.list_model_evaluations(request=list_eval_request)
for val in list_eval:
    print(val.model_explanation)

为了测试,我使用了 Google 的示例数据 (gs://cloud-ml-tables-data/bank-marketing.csv)。

代码的响应:

mean_attributions {
  feature_attributions {
    struct_value {
      fields {
        key: "Age"
        value {
          number_value: 0.027145349596062344
        }
      }
      fields {
        key: "Balance"
        value {
          number_value: 0.009469658279914696
        }
      }
      fields {
        key: "Campaign"
        value {
          number_value: 0.009621628534664564
        }
      }
      fields {
        key: "Contact"
        value {
          number_value: 0.006477007587775141
        }
      }
      fields {
        key: "Day"
        value {
          number_value: 0.013976069802316006
        }
      }
      fields {
        key: "Default"
        value {
          number_value: 1.528606850783311e-08
        }
      }
      fields {
        key: "Duration"
        value {
          number_value: 0.1395725763431482
        }
      }
      fields {
        key: "Education"
        value {
          number_value: 0.007015091678270283
        }
      }
      fields {
        key: "Housing"
        value {
          number_value: 0.055101036115872845
        }
      }
      fields {
        key: "Job"
        value {
          number_value: 0.021222775094579954
        }
      }
      fields {
        key: "Loan"
        value {
          number_value: 0.002048753814978598
        }
      }
      fields {
        key: "MaritalStatus"
        value {
          number_value: 0.005709941134721149
        }
      }
      fields {
        key: "Month"
        value {
          number_value: 0.12325089337437695
        }
      }
      fields {
        key: "PDays"
        value {
          number_value: 0.023952343173674555
        }
      }
      fields {
        key: "POutcome"
        value {
          number_value: 0.06695149606670256
        }
      }
      fields {
        key: "Previous"
        value {
          number_value: 0.03921166116430856
        }
      }
    }
  }
}

来自“评估”页面:

编辑:20210920

我使用我的回归模型并使用 aiplatform 库获取数据。我仍然得到了属性 model_explanation。我正在使用 google-cloud-aiplatform==1.4.3 作为库版本。

使用的代码:

from google.cloud import aiplatform

api_endpoint = 'us-central1-aiplatform.googleapis.com'
client_options = {"api_endpoint": api_endpoint}
client_model = aiplatform.gapic.ModelServiceClient(client_options=client_options)
#client_model = aiplatform.services.model_service.ModelServiceClient(client_options=client_options)

project_id = 'your-project-id'
location = 'us-central1'
model_id = '999999999'

model_name = f'projects/{project_id}/locations/{location}/models/{model_id}'
list_eval = client_model.list_model_evaluations(parent=model_name)
print(list_eval)

完整 JSON 回复:

ListModelEvaluationsPager<model_evaluations {
  name: "projects/xxxxxxx/locations/us-central1/models/99999999/evaluations/8888888"
  metrics_schema_uri: "gs://google-cloud-aiplatform/schema/modelevaluation/regression_metrics_1.0.0.yaml"
  metrics {
    struct_value {
      fields {
        key: "meanAbsoluteError"
        value {
          number_value: 0.1303236
        }
      }
      fields {
        key: "meanAbsolutePercentageError"
        value {
          number_value: 9.991856
        }
      }
      fields {
        key: "rSquared"
        value {
          number_value: 0.39691383
        }
      }
      fields {
        key: "rootMeanSquaredError"
        value {
          number_value: 0.24697715
        }
      }
      fields {
        key: "rootMeanSquaredLogError"
        value {
          number_value: 0.10037828
        }
      }
    }
  }
  create_time {
    seconds: 1632106497
    nanos: 416614000
  }
  model_explanation {
    mean_attributions {
      feature_attributions {
        struct_value {
          fields {
            key: "Age"
            value {
              number_value: 0.033690840005874634
            }
          }
          fields {
            key: "Balance"
            value {
              number_value: 0.021756498143076897
            }
          }
          fields {
            key: "Campaign"
            value {
              number_value: 0.03156016394495964
            }
          }
          fields {
            key: "Contact"
            value {
              number_value: 0.09849491715431213
            }
          }
          fields {
            key: "Day"
            value {
              number_value: 0.08989512920379639
            }
          }
          fields {
            key: "Default"
            value {
              number_value: 0.00012870959471911192
            }
          }
          fields {
            key: "Duration"
            value {
              number_value: 0.3097792863845825
            }
          }
          fields {
            key: "Education"
            value {
              number_value: 0.01789841242134571
            }
          }
          fields {
            key: "Housing"
            value {
              number_value: 0.05525226518511772
            }
          }
          fields {
            key: "Job"
            value {
              number_value: 0.010000345297157764
            }
          }
          fields {
            key: "Loan"
            value {
              number_value: 0.00856288243085146
            }
          }
          fields {
            key: "MaritalStatus"
            value {
              number_value: 0.01715957187116146
            }
          }
          fields {
            key: "Month"
            value {
              number_value: 0.22002224624156952
            }
          }
          fields {
            key: "PDays"
            value {
              number_value: 0.026749607175588608
            }
          }
          fields {
            key: "POutcome"
            value {
              number_value: 0.05268073454499245
            }
          }
          fields {
            key: "Previous"
            value {
              number_value: 0.00636840146034956
            }
          }
        }
      }
    }
  }
}
>

来自“评估”页面: