Risk assessment models in R: getting the probability of a specific level of a factor
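I am a risk analyst, and my boss has given me a task that I don't know how to approach.
I want to get the probability of an outcome under certain specific conditions. For example, the data looks like this: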
sex hair_color Credit_Score Loan_Status
"Male" "Red" "256" "bad"
"Female" "black" "133" "bad"
"Female" "brown" "33" "bad"
"Male" "yellow" "123" "good"
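So we want to predict Loan_Status for each customer.
What I can do is treat "sex", "hair_color", and "credit_score" as factors
and put them into glm() in R.
But my boss wants to know: "If a new customer is male with red hair, what is the probability that his loan status will be 'good'?"
Or: "What is the probability that a male customer's loan status will be 'good'?"
What kind of method should I use, and how do I get the probability?
I am thinking about the marginal distribution, but I don't know whether that is valid or how I would compute it.
I hope the question is easy to understand. Can anyone help? Thank you very much for your time.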
I think this tutorial fits your problem well: http://www.theanalysisfactor.com/r-tutorial-glm1/
Applied to your data, it would look like this:
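# Toy data from the question; credit_score is kept numeric rather than a factor
sex <- factor(c("m", "f", "f", "m"))
hair_color <- factor(c("red", "black", "brown", "yellow"))
credit_score <- c(256, 133, 33, 123)
loan_status <- factor(c("b", "b", "b", "g"))
data <- data.frame(sex, hair_color, credit_score, loan_status)
# Logistic regression; with a factor response, glm() models the probability
# of the second level ("g"). Four rows are too few to estimate this model,
# so the fit only illustrates the syntax.
model <- glm(formula = loan_status ~ sex + hair_color + credit_score,
             data = data,
             family = "binomial")
# type = "response" returns a probability instead of log-odds
predict(object = model,
        newdata = data.frame(sex = "f", hair_color = "yellow", credit_score = 100),
        type = "response")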
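To connect this to the two questions from your boss, here is a rough sketch rather than a definitive recipe. The data are simulated purely for illustration (the `customers` data frame, the sample size, and the assumed link between credit score and loan status are all made up), because four rows are not enough to fit anything. For "male with red hair", one option is to fit a model on just those two predictors and call `predict()`. For "male customers in general", the simplest marginal estimate is the observed share of "good" among males, which a `glm()` with `sex` as the only predictor should reproduce.

```r
# Synthetic data for illustration only; NOT the asker's real data.
set.seed(42)
n <- 500
customers <- data.frame(
  sex          = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  hair_color   = factor(sample(c("black", "brown", "red", "yellow"), n, replace = TRUE)),
  credit_score = round(runif(n, 0, 300))
)
# Assumed (hypothetical) relationship: a higher score makes "good" more likely
p_good <- plogis(-2 + 0.015 * customers$credit_score)
customers$loan_status <- factor(ifelse(runif(n) < p_good, "good", "bad"),
                                levels = c("bad", "good"))

# Question 1: P(good | male, red hair)
fit <- glm(loan_status ~ sex + hair_color, data = customers, family = binomial)
p_male_red <- predict(fit,
                      newdata = data.frame(sex = "Male", hair_color = "red"),
                      type = "response")

# Question 2: P(good | male), marginal over everything else
# (a) observed proportion of "good" among male customers
p_male_empirical <- mean(customers$loan_status[customers$sex == "Male"] == "good")
# (b) the same quantity from a logistic model with sex as the only predictor
fit_sex <- glm(loan_status ~ sex, data = customers, family = binomial)
p_male_model <- predict(fit_sex,
                        newdata = data.frame(sex = "Male"),
                        type = "response")

print(c(male_red = unname(p_male_red),
        male_empirical = p_male_empirical,
        male_model = unname(p_male_model)))
```

Note that the first model leaves out credit score, so its prediction averages over whatever scores male, red-haired customers happen to have in the data. If you want the probability at a particular score, keep `credit_score` in the formula and supply a value in `newdata`, as in the code above.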