Statsmodels mosaic plot ValueError: cannot convert float NaN to integer
I have a simple pandas DataFrame that I want to create a mosaic plot for. Here is my code:
import pandas as pd
from statsmodels.graphics.mosaicplot import mosaic
mydata = pd.DataFrame({'id2': {64: 'Angelica',
65: 'DXW_UID', 66: 'casuid01',
67: 'casuid01', 68: 'EC93_uid',
69: 'EC93_uid', 70: 'EC93_uid',
60: 'DXW_UID', 61: 'AtmosFox',
62: 'DXW_UID', 63: 'DXW_UID'},
'id1': {64: 'TGP',
65: 'Retention01', 66: 'default',
67: 'default', 68: 'Musa_EC_9_3',
69: 'Musa_EC_9_3', 70: 'Musa_EC_9_3',
60: 'default', 61: 'default',
62: 'default', 63: 'default'}})
mydata
            id1       id2
60      default   DXW_UID
61      default  AtmosFox
62      default   DXW_UID
63      default   DXW_UID
64          TGP  Angelica
65  Retention01   DXW_UID
66      default  casuid01
67      default  casuid01
68  Musa_EC_9_3  EC93_uid
69  Musa_EC_9_3  EC93_uid
70  Musa_EC_9_3  EC93_uid
[11 rows x 2 columns]
When I exclude row 64, I can create the mosaic plot just fine.
mosaic(mydata[mydata.id1!='TGP'], ['id1','id2'])
(<matplotlib.figure.Figure object at 0x11E0D3B0>, OrderedDict([(('default', 'DXW_UID'), (0.0, 0.0, 0.594059405940594, 0.49504950495049505)), (('default', 'AtmosFox'), (0.0, 0.49834983498349833, 0.594059405940594, 0.16501650165016499)), (('default', 'casuid01'), (0.0, 0.66666666666666663, 0.594059405940594, 0.33003300330033009)), (('default', 'EC93_uid'), (0.0, 1.0, 0.594059405940594, 0.0)), (('Retention01', 'DXW_UID'), (0.599009900990099, 0.0, 0.09900990099009899, 0.99009900990099009)), (('Retention01', 'AtmosFox'), (0.599009900990099, 0.99339933993399343, 0.09900990099009899, 0.0)), (('Retention01', 'casuid01'), (0.599009900990099, 0.99669966996699666, 0.09900990099009899, 0.0)), (('Retention01', 'EC93_uid'), (0.599009900990099, 1.0, 0.09900990099009899, 0.0)), (('Musa_EC_9_3', 'DXW_UID'), (0.7029702970297029, 0.0, 0.29702970297029707, 0.0)), (('Musa_EC_9_3', 'AtmosFox'), (0.7029702970297029, 0.0033003300330033004, 0.29702970297029707, 0.0)), (('Musa_EC_9_3', 'casuid01'), (0.7029702970297029, 0.0066006600660066007, 0.29702970297029707, 0.0)), (('Musa_EC_9_3', 'EC93_uid'), (0.7029702970297029, 0.0099009900990099011, 0.29702970297029707, 0.99009900990099009))]))
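The plot is fine (apart from some labels looking a bit funny, but that is not the issue here).
The error occurs when I include row 64. My question is: why does this row cause the error, and how can I fix it? I can see that the error happens while the figure is being drawn, but where the NaN comes from is not at all obvious, especially since the previous plot worked fine.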
mosaic(mydata, ['id1','id2'])
(<matplotlib.figure.Figure object at 0x11D13ED0>, OrderedDict([(('default', 'DXW_UID'), (0.0, 0.0, 0.5373936408419167, 0.49342105263157893)), (('default', 'AtmosFox'), (0.0, 0.49671052631578938, 0.5373936408419167, 0.16447368421052627)), (('default', 'casuid01'), (0.0, 0.66447368421052622, 0.5373936408419167, 0.32894736842105265)), (('default', 'Angelica'), (0.0, 0.99671052631578938, 0.5373936408419167, 0.0)), (('default', 'EC93_uid'), (0.0, 1.0, 0.5373936408419167, 0.0)), (('TGP', 'DXW_UID'), (0.5423197492163009, 0.0, 0.08956560680698614, 0.0)), (('TGP', 'AtmosFox'), (0.5423197492163009, 0.0032894736842105261, 0.08956560680698614, 0.0)), (('TGP', 'casuid01'), (0.5423197492163009, 0.0065789473684210523, 0.08956560680698614, 0.0)), (('TGP', 'Angelica'), (0.5423197492163009, 0.0098684210526315784, 0.08956560680698614, 0.98684210526315785)), (('TGP', 'EC93_uid'), (0.5423197492163009, 1.0, 0.08956560680698614, 0.0)), (('Retention01', 'DXW_UID'), (0.6368114643976712, 0.0, 0.08956560680698614, 0.98684210526315785)), (('Retention01', 'AtmosFox'), (0.6368114643976712, 0.99013157894736836, 0.08956560680698614, 0.0)), (('Retention01', 'casuid01'), (0.6368114643976712, 0.99342105263157876, 0.08956560680698614, 0.0)), (('Retention01', 'Angelica'), (0.6368114643976712, 0.99671052631578938, 0.08956560680698614, 0.0)), (('Retention01', 'EC93_uid'), (0.6368114643976712, 1.0, 0.08956560680698614, 0.0)), (('Musa_EC_9_3', 'DXW_UID'), (0.7313031795790416, 0.0, 0.2686968204209583, 0.0)), (('Musa_EC_9_3', 'AtmosFox'), (0.7313031795790416, 0.0032894736842105261, 0.2686968204209583, 0.0)), (('Musa_EC_9_3', 'casuid01'), (0.7313031795790416, 0.0065789473684210523, 0.2686968204209583, 0.0)), (('Musa_EC_9_3', 'Angelica'), (0.7313031795790416, 0.0098684210526315784, 0.2686968204209583, 0.0)), (('Musa_EC_9_3', 'EC93_uid'), (0.7313031795790416, 0.013157894736842105, 0.2686968204209583, 0.98684210526315785))]))
When I run the above, I get this traceback:
File "C:\Python27\lib\site-packages\matplotlib\backends\backend_qt4.py", line 374, in idle_draw
self.draw()
File "C:\Python27\lib\site-packages\matplotlib\backends\backend_qt4agg.py", line 154, in draw
FigureCanvasAgg.draw(self)
File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 451, in draw
self.figure.draw(self.renderer)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\figure.py", line 1034, in draw
func(*args)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 2086, in draw
a.draw(renderer)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\axis.py", line 1096, in draw
tick.draw(renderer)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\axis.py", line 241, in draw
self.label1.draw(renderer)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\text.py", line 598, in draw
ismath=ismath, mtext=self)
File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 188, in draw_text
font.get_image(), np.round(x - xd), np.round(y + yd) + 1, angle, gc)
ValueError: cannot convert float NaN to integer
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\matplotlib\backends\backend_qt4.py", line 299, in resizeEvent
self.draw()
File "C:\Python27\lib\site-packages\matplotlib\backends\backend_qt4agg.py", line 154, in draw
FigureCanvasAgg.draw(self)
File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 451, in draw
self.figure.draw(self.renderer)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\figure.py", line 1034, in draw
func(*args)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 2086, in draw
a.draw(renderer)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\axis.py", line 1096, in draw
tick.draw(renderer)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\axis.py", line 241, in draw
self.label1.draw(renderer)
File "C:\Python27\lib\site-packages\matplotlib\artist.py", line 55, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "C:\Python27\lib\site-packages\matplotlib\text.py", line 598, in draw
ismath=ismath, mtext=self)
File "C:\Python27\lib\site-packages\matplotlib\backends\backend_agg.py", line 188, in draw_text
font.get_image(), np.round(x - xd), np.round(y + yd) + 1, angle, gc)
ValueError: cannot convert float NaN to integer
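I ran the above code in the Spyder IDE, using the default settings.
A similar question was resolved here, where numerical underflow turned out to be the culprit. If that is what is happening in this case, though, the cause is not at all apparent.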
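According to the docs, the first argument should be a contingency table. In fact, your way of doing things appears to be an undocumented feature.
The behavior you are seeing (including your "funny" looking labels) occurs because many of the entries in your contingency table are zero, and something in the label code for mosaic has a hard time with that.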
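The zero entries also show up in the value that mosaic returns. The following is a minimal sketch, not part of the original answer: it takes four entries copied from the mosaic(mydata, ['id1','id2']) output in the question and assumes, judging from that output, that each value is an (x, y, width, height) tuple. The helper name empty_tiles is made up for illustration.

```python
# Four entries copied from the mosaic(mydata, ['id1', 'id2']) output in the question.
# Assumption: each value is (x, y, width, height) for one tile.
rects = {
    ('default', 'DXW_UID'): (0.0, 0.0, 0.5373936408419167, 0.49342105263157893),
    ('default', 'Angelica'): (0.0, 0.99671052631578938, 0.5373936408419167, 0.0),
    ('TGP', 'DXW_UID'): (0.5423197492163009, 0.0, 0.08956560680698614, 0.0),
    ('TGP', 'Angelica'): (0.5423197492163009, 0.0098684210526315784,
                          0.08956560680698614, 0.98684210526315785),
}


def empty_tiles(rects):
    """Return the keys of tiles whose width or height is exactly zero."""
    return [key for key, (x, y, w, h) in rects.items() if w == 0 or h == 0]


# Tiles with no area correspond to category pairs that never occur in the data.
print(empty_tiles(rects))
```

Running the same helper over the full dictionary from the question would list every category pair that has no rows behind it.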
To see the zeros, convert your DataFrame to a contingency table:
In [161]: pd.crosstab(mydata.id1, mydata.id2)
Out[161]:
id2          Angelica  AtmosFox  DXW_UID  EC93_uid  casuid01
id1
Musa_EC_9_3         0         0        0         3         0
Retention01         0         0        1         0         0
TGP                 1         0        0         0         0
default             0         1        3         0         2
Then add a "little bit" to all of those zeros, and mosaic works fine.
In [165]: ct = pd.crosstab(mydata.id1, mydata.id2)
In [166]: ctplus = ct + 1
In [167]: mosaic(ctplus.unstack())
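The result is rather pretty:
The tiny drawback is that it is wrong! But you can remedy that with
ctplus = ct + 1e-8
which adds only a tiny amount to all of those zeros. The plot still works (but looks ugly, because the labels on all the zero tiles of the mosaic sit on top of each other):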
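One possible refinement, sketched here under assumptions rather than taken from the original answer: add the tiny amount only to the cells that are zero, and hand mosaic a labelizer that returns an empty string for those cells so their labels cannot pile up. The helper names nudge_zero_cells and make_labelizer are hypothetical, and the labels for non-empty tiles simply join the key with a newline. The DataFrame is rebuilt from the question's data so that the block is self-contained.

```python
import pandas as pd


def nudge_zero_cells(ct, eps=1e-8):
    """Return a float copy of ct in which only the zero cells are replaced by eps."""
    return ct.astype(float).mask(ct == 0, eps)


def make_labelizer(ct):
    """Build a labelizer that returns '' for tiles whose original count is zero."""
    # DataFrame.unstack() gives a Series keyed by (column value, row value) tuples.
    counts = ct.unstack().to_dict()

    def labelizer(key):
        return "" if counts.get(key, 0) == 0 else "\n".join(key)

    return labelizer


# Same data as in the question.
mydata = pd.DataFrame({
    "id1": ["default", "default", "default", "default", "TGP", "Retention01",
            "default", "default", "Musa_EC_9_3", "Musa_EC_9_3", "Musa_EC_9_3"],
    "id2": ["DXW_UID", "AtmosFox", "DXW_UID", "DXW_UID", "Angelica", "DXW_UID",
            "casuid01", "casuid01", "EC93_uid", "EC93_uid", "EC93_uid"],
}, index=range(60, 71))

ct = pd.crosstab(mydata.id1, mydata.id2)
nudged = nudge_zero_cells(ct)   # real counts untouched, zeros become 1e-8
labelizer = make_labelizer(ct)  # blank labels for the empty tiles
```

With statsmodels installed, the plot call would then be mosaic(nudged.unstack(), labelizer=labelizer). Whether this fully tidies up the figure may depend on the statsmodels and matplotlib versions in use, since the axis tick labels are drawn separately from the tile labels.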