How can one initialize data for a contour plot using a function that takes one input and outputs a scalar value?

Question

Note: The post looks longer than it should because of the docstrings and the array of 40 datetimes.

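I have some time-series data. As an example, suppose I have three parameters, each consisting of 40 data points: datetimes (given by dts), speeds (given by vobs), and elapsed hours (given by els), which are combined by key into the dictionary data_dictionary.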

import numpy as np

dts = np.array(['2006/01/01 02:30:04', '2006/01/01 03:30:04', '2006/01/01 03:54:04'
 ,'2006/01/01 05:30:04', '2006/01/01 06:30:04', '2006/01/01 07:30:04'
 ,'2006/01/01 08:30:04', '2006/01/01 09:30:04', '2006/01/01 10:30:04'
 ,'2006/01/01 11:30:04', '2006/01/01 12:30:04', '2006/01/01 13:30:04'
 ,'2006/01/01 14:30:04', '2006/01/01 15:30:04', '2006/01/01 16:30:04'
 ,'2006/01/01 17:30:04', '2006/01/01 18:30:04', '2006/01/01 19:30:04'
 ,'2006/01/01 20:30:04', '2006/01/01 21:30:04', '2006/01/01 21:54:05'
 ,'2006/01/01 23:30:04', '2006/01/02 00:30:04', '2006/01/02 01:30:04'
 ,'2006/01/02 02:30:04', '2006/01/02 03:30:04', '2006/01/02 04:30:04'
 ,'2006/01/02 05:30:04', '2006/01/02 06:30:04', '2006/01/02 07:30:04'
 ,'2006/01/02 08:30:04', '2006/01/02 09:30:04', '2006/01/02 10:30:04'
 ,'2006/01/02 11:30:04', '2006/01/02 12:30:04', '2006/01/02 13:30:04'
 ,'2006/01/02 14:30:04', '2006/01/02 15:30:04', '2006/01/02 16:30:04'
 ,'2006/01/02 17:30:04'])

vobs = np.array([158, 1, 496, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
    , 1, 1, 823, 1, 1, 1, 1, 303, 1, 1, 1, 1, 253, 1, 1, 1, 408, 1
    , 1, 1, 1, 321])

els = np.arange(len(vobs))

data_dictionary = {'datetime' : dts, 'values' : vobs, 'elapsed' : els}

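I have a function that takes a dictionary as input and outputs a single scalar value of type <float> or <int>. The function given below is simpler than my actual use case and serves only as an example.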

def get_z(dictionary):
    """ This function returns a scalar value. """
    return np.sum(dictionary['elapsed'] / dictionary['values'])

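I want to see how this function's output changes as the time interval changes. So I created a function that takes a dictionary as input and outputs a new dictionary whose array values are sliced at the input indices for every key of the input dictionary. Note that the consecutive elapsed hours can serve as indices.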

def subsect(dictionary, indices):
    """ This function returns a dictionary, the array values
        of which are sliced at the input indices. """
    return {key : dictionary[key][indices] for key in list(dictionary.keys())}
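
As a quick check of how subsect and get_z combine, here is a minimal sketch on a five-point toy dictionary. The toy values are made up for illustration and are not part of the data above.

```python
import numpy as np

def get_z(dictionary):
    """Return a scalar computed from the dictionary's arrays."""
    return np.sum(dictionary['elapsed'] / dictionary['values'])

def subsect(dictionary, indices):
    """Return a new dictionary with every array sliced at `indices`."""
    return {key: dictionary[key][indices] for key in dictionary}

# Toy data (hypothetical values, for illustration only).
toy = {'elapsed': np.arange(5), 'values': np.array([2, 1, 4, 1, 1])}

window = subsect(toy, slice(1, 4))   # keeps indices 1, 2 and 3
z = get_z(window)                    # 1/1 + 2/4 + 3/1
print(window['elapsed'], window['values'], z)
```

Each (start, stop) pair passed as a slice therefore yields one scalar, which is the quantity that would be plotted at one point of the contour plot.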

To verify that the function above works, one can run the for loop below, which uses the function read_dictionary(...).

def read_dictionary(dictionary):
    """ This function prints the input dictionary as a check. """
    for key in list(dictionary.keys()):
        print(" .. KEY = {}\n{}\n".format(key, dictionary[key]))

print("\nORIGINAL DATA DICTIONARY\n")
read_dictionary(data_dictionary)

# for i in range(1, 38):
    # mod_dictionary = subsect(data_dictionary, indices=slice(i, 39, 1))
    # print("\n{}th MODIFIED DATA DICTIONARY\n".format(i))
    # read_dictionary(mod_dictionary)

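My problem is that I want a contour plot. The x-axis would contain the lower bound of the datetime interval (the first entry of each mod_dictionary), and the y-axis would contain the upper bound of the datetime interval (the last entry of each mod_dictionary). Normally, when making a contour plot, one has arrays of (x, y) values that are turned into a grid (X, Y) via numpy.meshgrid. Since my actual function (not the one in the example) is not vectorized, I can use X.copy().reshape(-1) and reshape my result using (...).reshape(X.shape).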
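
The flatten-and-reshape idea can be sketched as follows. This is only a minimal illustration: scalar_only is a hypothetical stand-in for a function that cannot take arrays, not the actual function.

```python
import math
import numpy as np

def scalar_only(x, y):
    """Hypothetical stand-in for a function that only accepts scalars."""
    return math.hypot(x, y)

x = np.arange(4)
y = np.arange(3)
X, Y = np.meshgrid(x, y)          # both have shape (len(y), len(x))

# Evaluate point by point on the flattened grid, then restore the grid shape.
z_flat = np.array([scalar_only(xv, yv)
                   for xv, yv in zip(X.reshape(-1), Y.reshape(-1))])
Z = z_flat.reshape(X.shape)
print(Z.shape)
```

Because reshape(-1) and reshape(X.shape) use the same element order, Z[i, j] corresponds to the point (X[i, j], Y[i, j]).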

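My exact problem is that I do not know how to build a grid of the different parameters when a single dictionary is the input to a function that outputs a single scalar value. Is there a way to do this?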

Answer 1

If I understood your idea correctly, this should be what you need. It does require the following packages:

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.mlab import griddata
import pandas as pd

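First, the required values are stored in three lists. I had to change the for loop slightly, because in your example all the upper bounds are the same, which makes a contour plot impossible: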

lower_bounds = []
upper_bounds = []
z_values = []
for j in range(1, 30):
    for i in range(j):
        # Slice the data to the interval [i, j) and record its bounds and z value.
        mod_dictionary = subsect(data_dictionary, indices=slice(i, j, 1))
        lower_bounds.append(mod_dictionary['datetime'][0])
        upper_bounds.append(mod_dictionary['datetime'][-1])
        z_values.append(get_z(mod_dictionary))

Then the datetime strings are converted to pandas Timestamps and, through .value, to integer nanoseconds since the epoch:

lower_bounds_dt = [pd.Timestamp(date).value for date in lower_bounds]
upper_bounds_dt = [pd.Timestamp(date).value for date in upper_bounds]
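
To make the numeric values explicit, here is a rough standard-library equivalent of that conversion. It is a sketch rather than what the code above does: to_epoch_ns is a hypothetical helper, and it assumes the strings should be read as UTC, which is how pandas treats a timezone-naive Timestamp when computing .value.

```python
from datetime import datetime, timezone

def to_epoch_ns(text, fmt='%Y/%m/%d %H:%M:%S'):
    """Parse a datetime string and return integer nanoseconds since the Unix epoch.

    The string is interpreted as UTC (an assumption made for this sketch).
    """
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp()) * 10**9

lower_ns = to_epoch_ns('2006/01/01 02:30:04')
upper_ns = to_epoch_ns('2006/01/01 03:30:04')
print(upper_ns - lower_ns)  # one hour, expressed in nanoseconds
```

The resulting integers are plain numbers, so they can be passed to np.linspace and to the interpolation step like any other coordinate.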

And the grid for the contour plot is generated:

xi = np.linspace(min(lower_bounds_dt), max(lower_bounds_dt), 100)
print(xi)
yi = np.linspace(min(upper_bounds_dt), max(upper_bounds_dt), 100)
print(yi)

griddata is then used to interpolate z values at the grid points where they are missing.

zi = griddata(lower_bounds_dt, upper_bounds_dt, z_values, xi, yi)
print(zi)
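
Note that matplotlib.mlab.griddata was removed in Matplotlib 3.1, so the import above fails on current versions. One possible replacement is scipy.interpolate.griddata, sketched below. The scattered samples are hypothetical stand-ins for lower_bounds_dt, upper_bounds_dt and z_values, and the choice of linear interpolation is an assumption.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered samples: every (lower, upper) pair with lower < upper.
pairs = [(i, j) for j in range(1, 8) for i in range(j)]
lower = np.array([p[0] for p in pairs], dtype=float)
upper = np.array([p[1] for p in pairs], dtype=float)
z = upper - lower                       # toy z values (interval length)

xi = np.linspace(lower.min(), lower.max(), 50)
yi = np.linspace(upper.min(), upper.max(), 50)
XI, YI = np.meshgrid(xi, yi)

# SciPy takes the sample points as a tuple of coordinate arrays and the target
# grid as a tuple of 2-D arrays; points outside the convex hull become NaN.
zi_raw = griddata((lower, upper), z, (XI, YI), method='linear')
zi = np.ma.masked_invalid(zi_raw)       # mask the NaNs before plotting
print(zi.shape)
```

The masked array zi can then be passed to contourf together with xi and yi, as in the plotting step of this answer.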

Finally, you can use contour or contourf to generate the contour plot:

fig1 = plt.figure(figsize=(10, 8))
ax1 = fig1.add_subplot(111)
ax1.contourf(xi, yi, zi)
fig1.savefig('graph.png')

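Since the data generated so far covers only a small region (because the lower and upper bounds increase together in the for loop), the result looks like this: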

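You can easily change this by changing how the for loop spans the data arrays. With pd.to_datetime you can also display the x- and y-axes in whatever datetime format you prefer.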
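
As one way to get readable axis labels, the nanosecond values can be turned back into strings by a small tick-formatting callback. The sketch below uses the standard library instead of pd.to_datetime; ns_to_label and its default format are hypothetical, and the values are assumed to be UTC.

```python
from datetime import datetime, timezone

def ns_to_label(value, pos=None, fmt='%m/%d %H:%M'):
    """Format nanoseconds since the Unix epoch as a UTC datetime label.

    The (value, pos) signature is the one matplotlib.ticker.FuncFormatter
    passes to its callback; `fmt` is a hypothetical default.
    """
    seconds = int(value) // 10**9
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(fmt)

label = ns_to_label(1136082604000000000)
print(label)
```

With the figure above, something like ax1.xaxis.set_major_formatter(matplotlib.ticker.FuncFormatter(ns_to_label)) should apply it to the x-axis, and the same call on ax1.yaxis to the y-axis.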

Edit: I uploaded the complete example to repl.it

Answer 2

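Using the solution posted by @Axel, I was able to make the contour plot without griddata or pandas. (I still need to edit the tick labels, but that is not my concern here. The elapsed hours in the original dictionary can be used as indices to slice the datetime array for that purpose.) The advantage of this approach is that no interpolation is needed, and using numpy vectorization is faster than the double for loop.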

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.ticker

def initialize_xy_grid(data_dictionary):
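    """ Build meshgrids of interval lower bounds (x) and upper bounds (y),
        both in elapsed hours and in datetimes. """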
    params = {'x' : {}, 'y' : {}}
    params['x']['datetime'] = data_dictionary['datetime'][:-1]
    params['x']['elapsed'] = data_dictionary['elapsed'][:-1]
    params['y']['datetime'] = data_dictionary['datetime'][1:]
    params['y']['elapsed'] = data_dictionary['elapsed'][1:]
    X_dt, Y_dt = np.meshgrid(params['x']['datetime'], params['y']['datetime'])
    X_hr, Y_hr = np.meshgrid(params['x']['elapsed'], params['y']['elapsed'])
    return X_hr, Y_hr, X_dt, Y_dt

def initialize_z(data_dictionary, X, Y):
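    """ Evaluate get_z on the data sliced from each lower index in X
        to each upper index in Y; returns a flat array. """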
    xx = X.copy().reshape(-1)
    yy = Y.copy().reshape(-1)
    return np.array([get_z(subsect(data_dictionary, indices=slice(xi, yi, 1))) for xi, yi in zip(xx, yy)])

def initialize_Z(z, shape):
    """ Reshape the flat z values back to the grid shape. """
    return z.reshape(shape)

X_hr, Y_hr, X_dt, Y_dt = initialize_xy_grid(data_dictionary)
z = initialize_z(data_dictionary, X_hr, Y_hr)
Z = initialize_Z(z, X_hr.shape)

ncontours = 11
plt.contourf(X_hr, Y_hr, Z, ncontours, cmap='plasma')
contours = plt.contour(X_hr, Y_hr, Z, ncontours, colors='k')
fmt_func = lambda x, pos : "{:1.3f}".format(x)
fmt = matplotlib.ticker.FuncFormatter(fmt_func)
plt.clabel(contours, inline=True, fontsize=8, fmt=fmt)
plt.show()
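
One detail of this grid is that it also contains cells whose lower index is not below the upper index. There the slice is empty and get_z returns 0.0, so part of the plot is filled with zeros that are not real data. A possible refinement, sketched here on a small hypothetical data set, is to mask those cells before plotting:

```python
import numpy as np

def get_z(dictionary):
    """Return a scalar computed from the dictionary's arrays."""
    return np.sum(dictionary['elapsed'] / dictionary['values'])

def subsect(dictionary, indices):
    """Return a new dictionary with every array sliced at `indices`."""
    return {key: dictionary[key][indices] for key in dictionary}

# Small hypothetical data set, for illustration only.
els = np.arange(6)
vobs = np.array([2, 1, 4, 1, 5, 1])
data = {'values': vobs, 'elapsed': els}

# X holds the lower index of each interval, Y the upper index.
X, Y = np.meshgrid(els[:-1], els[1:])

z = np.array([get_z(subsect(data, slice(lo, hi)))
              for lo, hi in zip(X.reshape(-1), Y.reshape(-1))])
Z = z.reshape(X.shape)

# Intervals with lower >= upper are empty; hide them instead of plotting 0.0.
Z_masked = np.ma.masked_where(X >= Y, Z)
print(Z_masked)
```

Passing Z_masked instead of Z to plt.contourf and plt.contour leaves the masked region blank.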

scalar

numpy

matplotlib

contour

python-3.x