Matplotlib scatter plot with different text at each point
Suppose I have the following 3 series:
>>> df[df['Type']=="Machine Learning"]['Cost']
0 2300.00
1 3200.00
4 1350.00
7 1352.00
8 4056.00
9 79.00
10 1595.00
Name: Cost, dtype: float64
>>> df[df['Type']=="Machine Learning"]['Rank']
0 1
1 1
4 1
7 2
8 2
9 2
10 2
Name: Rank, dtype: int64
>>> df[df['Type']=="Machine Learning"]['Univ/Org']
0 Massachusetts Institute of Technology
1 Massachusetts Institute of Technology
4 EDX/MIT
7 Stanford University
8 Stanford University
9 Coursera/Stanford University
10 Stanford University
Name: Univ/Org, dtype: object
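Now I want to draw a scatter plot with Cost on the y-axis, Rank on the x-axis, and each data point labeled with its Univ/Org.
After referring to this question, this is what I have so far: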
plt.scatter(df[df['Type']=="Machine Learning"]['Rank'], df[df['Type']=="Machine Learning"]['Cost'],marker='2', edgecolors='black')
for i, txt in enumerate(df[df['Type']=="Machine Learning"]['Univ/Org']):
    plt.annotate(txt, (df[df['Type']=="Machine Learning"]['Rank'][i], df[df['Type']=="Machine Learning"]['Cost'][i]))
It labels 2 data points and then raises an error.
The plot is: [image]
The error is:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-111-0d31107a166a> in <module>
1 plt.scatter(df[df['Type']=="Machine Learning"]['Rank'], df[df['Type']=="Machine Learning"]['Cost'],marker='2', edgecolors='black')
2 for i, txt in enumerate(df[df['Type']=="Machine Learning"]['Univ/Org']):
----> 3 plt.annotate(txt, (df[df['Type']=="Machine Learning"]['Rank'][i], df[df['Type']=="Machine Learning"]['Cost'][i]))
~/anaconda3/lib/python3.8/site-packages/pandas/core/series.py in __getitem__(self, key)
869 key = com.apply_if_callable(key, self)
870 try:
--> 871 result = self.index.get_value(self, key)
872
873 if not is_scalar(result):
~/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
4403 k = self._convert_scalar_indexer(k, kind="getitem")
4404 try:
-> 4405 return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
4406 except KeyError as e1:
4407 if len(self) > 0 and (self.holds_integer() or self.is_boolean()):
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 2
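The `KeyError: 2` can be reproduced in isolation. The following is a minimal sketch, assuming a small hypothetical Series named `cost` that carries the same non-contiguous index labels as the filtered data above. `enumerate` yields positions 0, 1, 2, ..., but `series[i]` on an integer index looks up labels, and label 2 was dropped by the filter.

```python
import pandas as pd

# Hypothetical stand-in for the filtered 'Cost' column; the index labels
# (0, 1, 4, 7, ...) mirror the output shown in the question.
cost = pd.Series(
    [2300.0, 3200.0, 1350.0, 1352.0, 4056.0, 79.0, 1595.0],
    index=[0, 1, 4, 7, 8, 9, 10],
    name="Cost",
)

# cost[i] looks up the index *label* i, so 0 and 1 happen to work ...
first_two = [cost[0], cost[1]]

# ... but label 2 does not exist after filtering, hence the KeyError.
try:
    cost[2]
    raised = False
except KeyError:
    raised = True

# Positional access with .iloc works for every position enumerate() yields.
third_by_position = cost.iloc[2]
```

This would explain why exactly two points get labeled before the loop fails: positions 0 and 1 coincide with existing labels, and position 2 does not.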
Two things.
First, I suggest you select your ML data into a new dataframe. You should also use the .loc and .at accessors to be more precise. So something like this:
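import matplotlib.pyplot as plt

# Keep only the Machine Learning rows in their own dataframe
mldf = df.loc[df['Type'] == "Machine Learning", :]
fig, ax = plt.subplots()
ax.scatter('Rank', 'Cost', data=mldf, marker='2', edgecolors='black')
# Loop over the index labels (0, 1, 4, ...) rather than positions
for i in mldf.index:
    ax.annotate(mldf.at[i, 'Univ/Org'], (mldf.at[i, 'Rank'], mldf.at[i, 'Cost']))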
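For anyone who wants to try this without the original data, here is a self-contained sketch of the same approach. It assumes a dataframe rebuilt only from the rows printed in the question (the other rows of the original `df` are unknown and are not reconstructed). The `Agg` backend is an assumption made so the code runs without a display, and `color='black'` replaces `edgecolors='black'` because `'2'` is an unfilled marker, for which recent Matplotlib versions may warn that edge colors are ignored.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Rows copied from the question's output, including the gaps in the index.
df = pd.DataFrame(
    {
        "Type": ["Machine Learning"] * 7,
        "Univ/Org": [
            "Massachusetts Institute of Technology",
            "Massachusetts Institute of Technology",
            "EDX/MIT",
            "Stanford University",
            "Stanford University",
            "Coursera/Stanford University",
            "Stanford University",
        ],
        "Rank": [1, 1, 1, 2, 2, 2, 2],
        "Cost": [2300.0, 3200.0, 1350.0, 1352.0, 4056.0, 79.0, 1595.0],
    },
    index=[0, 1, 4, 7, 8, 9, 10],
)

# Select the Machine Learning rows once instead of re-filtering on every access.
mldf = df.loc[df["Type"] == "Machine Learning", :]

fig, ax = plt.subplots()
ax.scatter("Rank", "Cost", data=mldf, marker="2", color="black")
ax.set_xlabel("Rank")
ax.set_ylabel("Cost")

# Iterate over index labels, not positions, so gaps in the index are harmless.
labels = [
    ax.annotate(mldf.at[i, "Univ/Org"], (mldf.at[i, "Rank"], mldf.at[i, "Cost"]))
    for i in mldf.index
]
```

Call `plt.show()` in an interactive session, or `fig.savefig(...)` with a file name of your choice, to see the result. Several points share the same Rank, so some labels will likely overlap.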