使用 matplotlib 绘制散点图和曲线图
Scatter and curve plot using matplotlib
我正在尝试绘制 NN 模型随着时间推移的准确度演变。所以,我有一个 excel 文件,其中包含如下数据:
我编写了以下代码来提取数据并绘制散点图:
import pandas as pd
data = pd.read_excel (r'SOTA DNN.xlsx')
acc1 = pd.DataFrame(data, columns= ['Top-1-Acc'])
para = pd.DataFrame(data, columns= ['Parameters'])
dates = pd.to_datetime(data['Date'], format='%Y-%m-%d')
import matplotlib.pyplot as plt
plt.grid(True)
plt.ylim(40, 100)
plt.scatter(dates, acc1)
plt.show()
有没有办法在同一张图中画一条线,只显示达到最大值的那些,并同时打印他们的名字,就像这个例子一样:
是否也可以拉伸 x 轴(对于日期)?
仍然不清楚“拉伸 x 轴”是什么意思,并且您没有提供数据集,但这里有一个可能的通用方法:
import matplotlib.pyplot as plt
import pandas as pd
#fake data generation, this has to be substituted by your .xls import routine
from pandas._testing import rands_array
import numpy as np
np.random.seed(1234)
n = 30
acc = np.concatenate([np.random.randint(0, 10, 10), np.random.randint(0, 30, 10), np.random.randint(0, 100, n-20)])
date_range = pd.date_range("20190101", periods=n)
model = rands_array(5, n)
df = pd.DataFrame({"Model": model, "Date": date_range, "TopAcc": acc})
df = df.sample(frac=1).reset_index(drop=True)
#now to the actual data modification
#first, we extract the dataframe with monotonically increasing values after sorting the date column
df = df.sort_values("Date").reset_index()
df["Max"] = df.TopAcc.cummax().diff()
df.loc[0, "Max"] = 1
dfmax = df[df.Max > 0]
#then, we plot all data, followed by the best performers
fig, ax = plt.subplots(figsize=(10, 5))
ax.scatter(df.Date, df.TopAcc, c="grey")
ax.plot(dfmax.Date, dfmax.TopAcc, marker="x", c="blue")
#finally, we annotate the best performers
for _, xylabel in dfmax.iterrows():
ax.text(xylabel.Date, xylabel.TopAcc, xylabel.Model, c="blue", horizontalalignment="right", verticalalignment="bottom")
plt.show()
示例输出:
我正在尝试绘制 NN 模型随着时间推移的准确度演变。所以,我有一个 excel 文件,其中包含如下数据:
import pandas as pd
data = pd.read_excel (r'SOTA DNN.xlsx')
acc1 = pd.DataFrame(data, columns= ['Top-1-Acc'])
para = pd.DataFrame(data, columns= ['Parameters'])
dates = pd.to_datetime(data['Date'], format='%Y-%m-%d')
import matplotlib.pyplot as plt
plt.grid(True)
plt.ylim(40, 100)
plt.scatter(dates, acc1)
plt.show()
有没有办法在同一张图中画一条线,只显示达到最大值的那些,并同时打印他们的名字,就像这个例子一样:
仍然不清楚“拉伸 x 轴”是什么意思,并且您没有提供数据集,但这里有一个可能的通用方法:
import matplotlib.pyplot as plt
import pandas as pd
#fake data generation, this has to be substituted by your .xls import routine
from pandas._testing import rands_array
import numpy as np
np.random.seed(1234)
n = 30
acc = np.concatenate([np.random.randint(0, 10, 10), np.random.randint(0, 30, 10), np.random.randint(0, 100, n-20)])
date_range = pd.date_range("20190101", periods=n)
model = rands_array(5, n)
df = pd.DataFrame({"Model": model, "Date": date_range, "TopAcc": acc})
df = df.sample(frac=1).reset_index(drop=True)
#now to the actual data modification
#first, we extract the dataframe with monotonically increasing values after sorting the date column
df = df.sort_values("Date").reset_index()
df["Max"] = df.TopAcc.cummax().diff()
df.loc[0, "Max"] = 1
dfmax = df[df.Max > 0]
#then, we plot all data, followed by the best performers
fig, ax = plt.subplots(figsize=(10, 5))
ax.scatter(df.Date, df.TopAcc, c="grey")
ax.plot(dfmax.Date, dfmax.TopAcc, marker="x", c="blue")
#finally, we annotate the best performers
for _, xylabel in dfmax.iterrows():
ax.text(xylabel.Date, xylabel.TopAcc, xylabel.Model, c="blue", horizontalalignment="right", verticalalignment="bottom")
plt.show()
示例输出: