Create a for loop over a multiple regression using Pandas (StatsModels)

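I am running a multiple regression for each of the 50 states to determine each state's life expectancy from several variables. Right now my dataset is filtered down to Maine only, and I would like to know whether there is a way to write a for loop that goes through the whole State column and runs the regression for each state. That would be more efficient than creating 50 filters. Any help would be great!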

import pandas as pd
# pd.read_excel needs openpyxl installed to read .xlsx files; no explicit import is required
import statsmodels.formula.api as smf

df = pd.read_excel("C:/Users/File1.xlsx", sheet_name='States')

# Keep only the rows for Maine
dfME = df[df['State'] == "Maine"]

pd.set_option('display.max_columns', None)

dfME.head()

# The formula must be a string; Q("...") quotes a column name that contains a space
model = smf.ols('Q("Life Expectancy") ~ Race + Age + Weight + C(Pets)', data=dfME)
modelfit = model.fit()
modelfit.summary()
###### Assuming the rest of your code is correct, here is a strategy for the loop and for storing the model outputs:
pd.set_option('display.max_columns', None)
state_modelfit_summary = {}
states = df['State'].unique() # As you only need to loop once for each state
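for st in states:
    # Rows for the current state only
    df_state = df[df['State'] == st]
    # The formula must be a string; Q("...") quotes a column name that contains a space
    model = smf.ols('Q("Life Expectancy") ~ Race + Age + Weight + C(Pets)', data=df_state)
    modelfit = model.fit()
    # Store output in a dictionary with state name as key
    state_modelfit_summary[st] = modelfit.summary()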
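
If you would rather compare states side by side than read 50 separate summaries, one possible variation is to let groupby do the filtering and keep the fitted results themselves instead of the summary text. The sketch below runs on made-up data, not your spreadsheet: the demo frame, its values, and the two state names are assumptions for illustration only, and Race is left out because its type in your data is not known.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (NOT the asker's spreadsheet): two states, 40 rows each.
rng = np.random.default_rng(0)
n = 40
demo = pd.DataFrame({
    "State": np.repeat(["Maine", "Ohio"], n),
    "Age": rng.uniform(20, 80, size=2 * n),
    "Weight": rng.uniform(50, 110, size=2 * n),
    "Pets": rng.choice(["cat", "dog", "none"], size=2 * n),
})
demo["Life Expectancy"] = (
    85 - 0.05 * demo["Age"] - 0.03 * demo["Weight"] + rng.normal(0, 1, size=2 * n)
)

# Q("...") lets the formula refer to a column whose name contains a space.
formula = 'Q("Life Expectancy") ~ Age + Weight + C(Pets)'

# groupby yields one (state name, sub-DataFrame) pair per state,
# so no manual filter is needed inside the loop.
fits = {
    state: smf.ols(formula, data=group).fit()
    for state, group in demo.groupby("State")
}

# One row per state, one column per coefficient, plus R-squared.
coef_table = pd.DataFrame({state: fit.params for state, fit in fits.items()}).T
coef_table["r_squared"] = pd.Series({state: fit.rsquared for state, fit in fits.items()})
print(coef_table.round(3))
```

Keeping the fitted result objects means you can still call fits["Maine"].summary() for any single state later, while coef_table gives one row per state for quick comparison.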