Changing marker shapes depending on a third string variable in matplotlib

Question

I have a pandas DataFrame named df_comunidades. Here is what its head() looks like:

df_comunidades.head()

  Tormenta  Comunidad  TIEPI  Gustmax
0      ANA  ANDALUCIA  0.050    130.2
1      ANA     ARAGON  0.250     90.5
2    BRUNO  ANDALUCIA  0.012    114.0
3    BRUNO  CATALUNYA  0.023     78.2
4   KARINE     ARAGON  3.500     80.2
5      ANA   BALEARES  2.000     97.2

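Each "Comunidad" has a different color in my scatter plot, but in addition I want each "Tormenta" to have a different marker shape. I tried many approaches, one of them similar to what I use for the colors. I also tried a loop, for i in range(len(markers)):, with all the markers stored in a list, markers=['o','v','<','>','1','8','s','*','x','d']. My code is: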

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# initialize list of lists
data = [['ANA', 'ANDALUCIA', 0.05, 130.2], ['ANA', 'ARAGON', 0.25, 90.5], ['BRUNO', 'ANDALUCIA', 0.012, 114], ['BRUNO', 'CATALUNYA', 0.023, 78.2],['KARINE', 'ARAGON', 3.5, 80.2], ['ANA', 'BALEARES', 2, 97.2]]
 
# Create the pandas DataFrame
df_comunidades = pd.DataFrame(data, columns = ['Tormenta', 'Comunidad', 'TIEPI', 'Gustmax'])

#I define every color for every "Comunidad"
colors = {'ANDALUCIA' : 'g',
          'CATALUNYA' : 'y',
          'BALEARES' : 'r', 
          'ARAGON' : 'c'}
c = [colors[comunid] for comunid in df_comunidades['Comunidad']]

plt.scatter(df_comunidades['TIEPI'], df_comunidades['Gustmax'], alpha=0.5, c=c)

ax = plt.subplot(1, 1, 1)
#code to title the axes and the plot: 
ax.set_xlim([0,3])
ax.set_xlabel("TIEPI")
ax.set_ylabel("Max Gusts in community")
plt.title("Relation between max gusts and TIEPI in autonomous communities")
plt.savefig('max_tiepi-gusts_comunid.png',dpi=300)

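This is what I get. It looks like the squares are drawn on top of the other markers, but each point in the scatter plot should be a "Tormenta", with the color representing the "Comunidad".

The whole data set looks like this:

Edited after the comments to make this clearer.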

Answer 1

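I think you are just lamenting that scatter() accepts a sequence of colors but only a single marker. So we need to iterate over the N points. (Or we could use .groupby() if we only want to make T calls for T tormentas.)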

There seems to be a discrepancy between the "Min Gusts" label and the "Gustmax" column.

There are many markers to choose from. You could try 'v', 's', 'p' to go through a progression of 3-, 4-, and 5-sided markers.
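If the storm names are not known in advance, one option is to build the name-to-marker mapping from the marker list given in the question instead of typing it by hand. This is only a minimal sketch: the names marker_list, tormentas and unique_tormentas are illustrative assumptions, and cycling through the list means markers repeat once there are more storms than markers.

```python
from itertools import cycle

# Marker list taken from the question.
marker_list = ['o', 'v', '<', '>', '1', '8', 's', '*', 'x', 'd']

# Stand-in for df_comunidades['Tormenta'] (values from the sample data).
tormentas = ['ANA', 'ANA', 'BRUNO', 'BRUNO', 'KARINE', 'ANA']

# dict.fromkeys drops duplicates while keeping first-seen order.
unique_tormentas = list(dict.fromkeys(tormentas))

# zip stops after the last storm name; cycle only matters when
# there are more storms than markers.
markers = dict(zip(unique_tormentas, cycle(marker_list)))
print(markers)  # {'ANA': 'o', 'BRUNO': 'v', 'KARINE': '<'}
```

The resulting dictionary can be used in the same way as a hand-written markers dictionary, for example as markers[row.Tormenta].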

I made these changes to produce the accompanying plot.

--- a/tmp/so_69076918_orig.py
+++ b/tmp/so_69076918.py
@@ -14,14 +14,22 @@ colors = {'ANDALUCIA' : 'g',
           'CATALUNYA' : 'y',
           'BALEARES' : 'r', 
           'ARAGON' : 'c'}
+markers = {'ANA': 'v',
+           'BRUNO': 'x',
+           'KARINE': 'd'}
 c = [colors[comunid] for comunid in df_comunidades['Comunidad']]
 
-plt.scatter(df_comunidades['TIEPI'], df_comunidades['Gustmax'], alpha=0.5, c=c)
-
 ax = plt.subplot(1, 1, 1)
 #code to title the axes and the plot: 
-ax.set_xlim([0,3])
+ax.set_xlim([-.1, 4])
 ax.set_xlabel("TIEPI")
+ax.set_ylim([0, 140])
 ax.set_ylabel("Min Gusts in community")
 plt.title("Relation between min gusts and TIEPI in autonomous communities")
+
+for row in df_comunidades.itertuples():
+    plt.scatter([row.TIEPI], [row.Gustmax], alpha=0.5,
+                c=colors[row.Comunidad],
+                marker=markers[row.Tormenta])
+
 plt.savefig('min_tiepi-gusts_comunid.png',dpi=300)
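The .groupby() alternative mentioned at the start of this answer can be sketched roughly as follows. It is a sketch under assumptions, not the code used for the plot: it reuses the question's sample data and color mapping with the markers chosen in this answer, switches to the Agg backend so it runs without a display, and leaves out the axis limits and the savefig call.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

data = [['ANA', 'ANDALUCIA', 0.05, 130.2], ['ANA', 'ARAGON', 0.25, 90.5],
        ['BRUNO', 'ANDALUCIA', 0.012, 114], ['BRUNO', 'CATALUNYA', 0.023, 78.2],
        ['KARINE', 'ARAGON', 3.5, 80.2], ['ANA', 'BALEARES', 2, 97.2]]
df_comunidades = pd.DataFrame(
    data, columns=['Tormenta', 'Comunidad', 'TIEPI', 'Gustmax'])

colors = {'ANDALUCIA': 'g', 'CATALUNYA': 'y', 'BALEARES': 'r', 'ARAGON': 'c'}
markers = {'ANA': 'v', 'BRUNO': 'x', 'KARINE': 'd'}

fig, ax = plt.subplots()

# One scatter() call per storm: the marker is fixed within a call,
# while the colors still vary point by point.
for tormenta, group in df_comunidades.groupby('Tormenta'):
    point_colors = group['Comunidad'].map(colors).tolist()
    ax.scatter(group['TIEPI'], group['Gustmax'], alpha=0.5,
               c=point_colors, marker=markers[tormenta])

ax.set_xlabel("TIEPI")
ax.set_ylabel("Max Gusts in community")
```

With the sample data this makes three scatter() calls instead of six. For a large DataFrame with only a handful of storms, grouping avoids creating one collection per row.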


Tags: python, matplotlib, markers, scatter-plot