Robust regression with empty pandas DataFrame columns
I have a pandas DataFrame. For example:
column1 column2 column3
34 nan 3
45 nan 1
45 nan 3
45 nan 3
46 nan 3
45 nan nan
45 nan 3
47 nan 5
45 nan 3
50 nan 3
I want to run a Theil-Sen regression on each column. I wrote the following script:
    import numpy as np
    from scipy import stats

    def LR(df):
        line = {}
        slope = {}
        for k, v in df.items():  # items() replaces iteritems(), which pandas 2.0 removed
            if v.empty:
                pass  # This is to check if a column is empty
            else:
                # Mask the NaN entries, then keep only the unmasked values
                xm = np.ma.masked_array(df.index.values, mask=np.isnan(df[k])).compressed()
                ym = np.ma.masked_array(df[k], mask=np.isnan(df[k])).compressed()
                res = stats.theilslopes(ym, xm, 0.90)
                line[k] = res[1] + res[0] * xm  # intercept + slope * x
                slope[k] = res[0]
        return line, slope
The problem is that I get this error:
ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,).
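When I step through in debug mode, the error seems to occur when a particular column is empty (all NaN, like column2).
What is the actual problem?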
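One thing worth checking here: `Series.empty` only reports whether a Series has zero length, so a column filled entirely with NaN is not empty by that test. The sketch below rebuilds the example table and compares `.empty` with a count of usable values. The helper name `usable_points` is hypothetical and only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example table from the question
df = pd.DataFrame({
    "column1": [34, 45, 45, 45, 46, 45, 45, 47, 45, 50],
    "column2": [np.nan] * 10,
    "column3": [3, 1, 3, 3, 3, np.nan, 3, 5, 3, 3],
})

def usable_points(col):
    """Hypothetical helper: number of non-NaN values in a column."""
    return int(col.notna().sum())

# .empty is about length, not about NaN content
is_empty = {k: bool(v.empty) for k, v in df.items()}
# Counting non-NaN values is what actually tells us whether a fit is possible
counts = {k: usable_points(v) for k, v in df.items()}

print(is_empty)
print(counts)
```

If that is what happens, the `if v.empty` branch never fires for column2, and the regression call receives zero-length arrays.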
I managed to work around it using sklearn, as follows:
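    from sklearn.linear_model import TheilSenRegressor

    for k, v in df.items():
        xm = np.ma.masked_array(df.index.values, mask=np.isnan(df[k])).compressed()
        ym = np.ma.masked_array(df[k], mask=np.isnan(df[k])).compressed()
        if len(xm) > 0 and len(ym) > 0:  # skip columns with no usable values
            model = TheilSenRegressor()
            xm = xm.reshape(-1, 1)  # sklearn expects X with shape (n_samples, n_features)
            # ym stays 1-D, which is the shape fit() expects for the target
            model.fit(xm, ym)
            # Inside the enclosing function: this returns at the first usable column
            return model.intercept_, model.coef_
        else:
            pass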
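For comparison, the same kind of guard can be applied while staying with SciPy. This is a sketch of one possible approach, not a definitive fix: it assumes that dropping NaN rows per column is acceptable and that columns with fewer than two remaining points should simply be skipped. The function name `theil_sen_by_column` and the `min_points` parameter are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import stats

def theil_sen_by_column(df, alpha=0.90, min_points=2):
    """Fit stats.theilslopes per column, skipping columns without enough data.

    Hypothetical helper: returns (lines, slopes), both keyed by column name.
    """
    lines, slopes = {}, {}
    for name, col in df.items():
        valid = col.dropna()              # drop NaN rows for this column only
        if len(valid) < min_points:       # all-NaN (or near-empty) column: skip it
            continue
        x = valid.index.to_numpy(dtype=float)
        y = valid.to_numpy(dtype=float)
        res = stats.theilslopes(y, x, alpha)
        slope, intercept = res[0], res[1]
        lines[name] = intercept + slope * x   # fitted values at the kept x positions
        slopes[name] = slope
    return lines, slopes

# Example table from the question
df = pd.DataFrame({
    "column1": [34, 45, 45, 45, 46, 45, 45, 47, 45, 50],
    "column2": [np.nan] * 10,
    "column3": [3, 1, 3, 3, 3, np.nan, 3, 5, 3, 3],
})

lines, slopes = theil_sen_by_column(df)
print(sorted(slopes))
```

Checking the count of non-NaN values, rather than `v.empty`, is the design choice that matters here: the all-NaN column is skipped before any regression routine sees a zero-length array.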