Generating a Scatter Plot from a Matrix

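I have code that generates random matrices of 0s and 1s, and I want to convert these matrices into scatter plots in which each point's coordinates correspond to the matrix row/column and its color corresponds to the value (for example, red for 0 and blue for 1).

I have been able to do this with matplotlib, but my use case involves generating thousands of these images, and matplotlib is very slow for that purpose. For this reason I have been trying pyqtgraph instead, but I have run into some trouble.

Matplotlib code: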

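import itertools
import random
import numpy as np
import matplotlib.pyplot as plt

d = 25
w = 10
l = 5
num = 3  # number of images to generate; placeholder value, not given in the original post

for n in range(num):
    # d + 1 ones and d - 1 zeros, shuffled and arranged as a w x l matrix
    lst = list(itertools.repeat(1, d + 1)) + list(itertools.repeat(0, d - 1))
    random.shuffle(lst)
    a = np.array(lst).reshape((w, l))
    for i in range(w):
        for j in range(l):
            # one plt.scatter call per point
            if a[i, j] == 1:
                plt.scatter(i + 1, j + 1, c="red")
            else:
                plt.scatter(i + 1, j + 1, c="blue")
    path = f"plot_{n}.png"  # placeholder output path; not given in the original post
    plt.savefig(path)
    plt.clf()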

Pyqtgraph code attempt:

import pyqtgraph as pg
import pyqtgraph.exporters
import numpy as np
import itertools
import random

w = 10
l = 5
d = 25
num = 3  # number of images to generate; placeholder value, not given in the original post

for n in range(num):
    plt = pg.plot()
    lst = list(itertools.repeat(1, d + 1)) + list(itertools.repeat(0, d - 1))
    random.shuffle(lst)
    a = np.array(lst).reshape((w, l))
    for i in range(w):
        for j in range(l):
            # one ScatterPlotItem per point; the original post passed brush=None in both branches
            if a[i, j] == 1:
                p = pg.ScatterPlotItem([i + 1], [j + 1], brush="r")
            else:
                p = pg.ScatterPlotItem([i + 1], [j + 1], brush="b")
            plt.addItem(p)

    exporter = pg.exporters.ImageExporter(plt.plotItem)
    exporter.parameters()['width'] = 100
    # the original post used the fixed name 'fileName.png'; the index avoids overwriting
    exporter.export(f'fileName_{n}.png')

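The pyqtgraph code runs very slowly, so I must be doing something wrong, since I am not familiar with the package. Thanks for your help!

Edit: To clarify, the desired end product is a grid of solid dots separated by whitespace. There need to be 26 red dots and 24 blue dots, in random order.

I think calling plt.scatter inside the nested loops is where your program wastes most of its time. It is better to call plt.scatter only once, passing it the (x, y) coordinates of a meshgrid together with a randomly shuffled list of colors.

For example, I can generate the same plot without any loops or conditionals, and I only need to call plt.scatter once in total rather than once per point (5x10 = 50 times!).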

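# grid of (x, y) coordinates: x runs over 1..w, y over 1..l
x = np.arange(1, w + 1)
y = np.arange(1, l + 1)
xx, yy = np.meshgrid(x, y)

# 26 red and 24 blue points in random order
colors = ['r'] * 26 + ['b'] * 24
random.shuffle(colors)
plt.scatter(xx, yy, color=colors)  # a single call draws all 50 points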
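
If the colors have to come from an existing 0/1 matrix rather than from a shuffled color list, the same single-call idea still applies. The following is only a minimal sketch: the helper names `random_matrix` and `matrix_to_points`, the `Agg` backend, and the output name `grid.png` are my own assumptions and do not come from the question. It keeps the question's convention that `a[i, j]` is drawn at `(i + 1, j + 1)`, red for 1 and blue for 0.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, since only image files are needed
import matplotlib.pyplot as plt
import numpy as np


def random_matrix(w=10, l=5, d=25, rng=None):
    """Return a (w, l) matrix holding d + 1 ones and d - 1 zeros in random order."""
    rng = np.random.default_rng() if rng is None else rng
    values = np.array([1] * (d + 1) + [0] * (d - 1))
    return rng.permutation(values).reshape(w, l)


def matrix_to_points(a):
    """Flatten a 0/1 matrix into x, y and color sequences for a single scatter call."""
    rows, cols = np.indices(a.shape)  # index grids with the same shape as a
    x = (rows + 1).ravel()            # x follows the row index, as in the question
    y = (cols + 1).ravel()            # y follows the column index
    colors = np.where(a.ravel() == 1, "red", "blue").tolist()
    return x, y, colors


a = random_matrix()
x, y, colors = matrix_to_points(a)

fig, ax = plt.subplots()
ax.scatter(x, y, c=colors)  # one call draws every point
fig.savefig("grid.png")     # hypothetical output file name
plt.close(fig)
```

Because `np.where` picks the color for every cell at once, no Python-level loop or conditional is needed, and the matrix itself stays available if it is needed elsewhere.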

I added some benchmarks to demonstrate the performance improvement we are looking at:

import itertools
import random
import numpy as np
import matplotlib.pyplot as plt

d = 25
w = 10
l = 5

## original program using matplotlib and nested loops
def make_matplotlib_grid():
    lst = list(itertools.repeat(1, d + 1)) + list(itertools.repeat(0, d - 1))
    random.shuffle(lst)
    a = np.array(lst).reshape((w, l))
    for i in range(w):
        for j in range(l):
            if a[i, j] == 1:
                plt.scatter(i + 1, j + 1, c="red")
            else:
                plt.scatter(i + 1, j + 1, c="blue")

## using numpy mesh grid
def make_matplotlib_meshgrid():
    x = np.arange(1,w+1)
    y = np.arange(1,l+1)
    xx,yy = np.meshgrid(x,y)

    colors = ['r']*26 + ['b']*24
    random.shuffle(colors)
    plt.scatter(xx,yy,color=colors)


## benchmarking to compare speed between the two methods
if __name__ == "__main__":
    import timeit
    n_plots = 10
    setup = "from __main__ import make_matplotlib_grid"
    make_matplotlib_grid_time = timeit.timeit("make_matplotlib_grid()", setup=setup, number=n_plots)
    print(f"original program creates {n_plots} plots with an average time of {make_matplotlib_grid_time / n_plots} seconds")
    setup = "from __main__ import make_matplotlib_meshgrid"
    make_matplotlib_meshgrid_time = timeit.timeit("make_matplotlib_meshgrid()", setup=setup, number=n_plots)
    print(f"numpy meshgrid method creates {n_plots} plots with average time of {make_matplotlib_meshgrid_time / n_plots} seconds")
    print(f"on average, the numpy meshgrid method is roughly {make_matplotlib_grid_time / make_matplotlib_meshgrid_time}x faster")

Output:

original program creates 10 plots with an average time of 0.1041847709 seconds
numpy meshgrid method creates 10 plots with average time of 0.003275972299999985 seconds
on average, the numpy meshgrid method is roughly 31.80270202528894x faster
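
Note that the benchmark functions above only time the plt.scatter calls; they do not write any image. When thousands of images are needed, one possible further step is to create the figure and the scatter artist once and only change the colors before each save. The sketch below illustrates that idea under my own assumptions: the function name `render_grids`, the `Agg` backend, and rendering to in-memory PNG bytes are not from the original posts, and I have not measured how much time this saves.

```python
import io
import random

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for batch rendering
import matplotlib.pyplot as plt
import numpy as np

w, l, d = 10, 5, 25


def render_grids(n_plots):
    """Render n_plots randomly colored grids, reusing one figure and one scatter artist."""
    fig, ax = plt.subplots()
    xx, yy = np.meshgrid(np.arange(1, w + 1), np.arange(1, l + 1))
    colors = ["r"] * (d + 1) + ["b"] * (d - 1)
    points = ax.scatter(xx.ravel(), yy.ravel(), c=colors)  # created only once

    images = []
    for _ in range(n_plots):
        random.shuffle(colors)
        points.set_color(colors)  # updates face and edge colors of the existing points
        buf = io.BytesIO()
        fig.savefig(buf, format="png")  # render the current state to PNG bytes
        images.append(buf.getvalue())

    plt.close(fig)
    return images


images = render_grids(3)
print(f"rendered {len(images)} images")
```

To write files instead of collecting bytes, `fig.savefig` can be given a file path in place of the buffer.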