Overlay of two plots from two different data sources using Python / hvplot

I want to draw a line plot (source: pandas DataFrame) on top of an hvplot (source: xarray/NetCDF).

The xarray dataset looks like this:

import xarray as xr

dataDIR = 'ceilodata.nc'
DS = xr.open_dataset(dataDIR)
DS = DS.transpose()
print(DS)

<xarray.Dataset>
Dimensions:         (range_hr: 32, range: 1024, layer: 3, time: 5760)
Coordinates:
  * range_hr        (range_hr) float32 0.001 4.995 9.99 ... 144.9 149.9 154.8
  * range           (range) float32 14.98 29.97 44.96 ... 1.533e+04 1.534e+04
  * layer           (layer) int32 1 2 3
  * time            (time) datetime64[ns] 2022-03-18 ... 2022-03-18T23:59:46
Data variables: (12/41)
    zenith          float32 ...
    wavelength      float32 ...
    scaling         float32 ...
    range_gate_hr   float32 ...
    range_gate      float32 ...
    longitude       float32 ...
    ...              ...
    cbe             (layer, time) int16 ...
    beta_raw_hr     (range_hr, time) float32 ...
    beta_raw        (range, time) float32 ...
    bcc             (time) int8 ...
    base            (time) float32 ...
    average_time    (time) int32 ...
Attributes: (12/13)
    comment:           
    software_version:  15.06.1 2.13 1.040 1
    title:             CHM15k Nimbus
    wmo_id:            10865
    month:             3
    source:            CHM160138
    ...                ...
    serlom:            TUB160038
    location:          muenchen
    year:              2022
    device_name:       CHM160138
    institution:       DWD
    day:               18

The pandas DataFrame source looks like this:

import pandas as pd

df = pd.read_csv('PTU.csv')
print(df)

               Unnamed: 0                PTU
0     2022-03-18 07:38:56            451.839
1     2022-03-18 07:38:57            468.826
2     2022-03-18 07:38:58            469.093
3     2022-03-18 07:38:59            469.356
4     2022-03-18 07:39:00            469.623
...                   ...                ...
6140  2022-03-18 09:21:16          31690.600
6141  2022-03-18 09:21:17          31694.700
6142  2022-03-18 09:21:18          31692.900
6143  2022-03-18 09:21:19          31712.000
6144  2022-03-18 09:21:20          31711.500

[6145 rows x 2 columns]

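Both are time-dependent datasets, but with different timestamps and frequencies. Time is the index in each dataset.

I tried to plot them together using the additionally imported holoviews. Each plot works fine on its own, but combining them does not work the way I tried it: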

import hvplot.pandas
import hvplot.xarray  # needed for the .hvplot accessor on xarray objects
import holoviews as hv

# cmap of the xarray:
ceilo = DS.b_r.hvplot(cmap="viridis_r", width=850, height=600, title='title', clim=(5, 80))

# line plot of the data frame
p = df.hvplot.line()

# add pressure line plot to pcolormeshplot using * which overlays the line on the plot
ceilo * p

But this ends with an error message and the following full traceback:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-10-2b1c6baca339> in <module>
     24 p = df.hvplot.line()
     25 # add pressure line plot to pcolormeshplot using * which overlays the line on the plot
---> 26 ceilo * df

c:\python38\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
     68         other = item_from_zerodim(other)
     69 
---> 70         return method(self, other)
     71 
     72     return new_method

c:\python38\lib\site-packages\pandas\core\arraylike.py in __rmul__(self, other)
    118     @unpack_zerodim_and_defer("__rmul__")
    119     def __rmul__(self, other):
--> 120         return self._arith_method(other, roperator.rmul)
    121 
    122     @unpack_zerodim_and_defer("__truediv__")

c:\python38\lib\site-packages\pandas\core\frame.py in _arith_method(self, other, op)
   6936         other = ops.maybe_prepare_scalar_for_op(other, (self.shape[axis],))
   6937 
-> 6938         self, other = ops.align_method_FRAME(self, other, axis, flex=True, level=None)
   6939 
   6940         new_data = self._dispatch_frame_op(other, op, axis=axis)

c:\python38\lib\site-packages\pandas\core\ops\__init__.py in align_method_FRAME(left, right, axis, flex, level)
    275     elif is_list_like(right) and not isinstance(right, (ABCSeries, ABCDataFrame)):
    276         # GH 36702. Raise when attempting arithmetic with list of array-like.
--> 277         if any(is_array_like(el) for el in right):
    278             raise ValueError(
    279                 f"Unable to coerce list of {type(right[0])} to Series/DataFrame"

c:\python38\lib\site-packages\holoviews\core\element.py in __iter__(self)
     94     def __iter__(self):
     95         "Disable iterator interface."
---> 96         raise NotImplementedError('Iteration on Elements is not supported.')
     97 
     98 

NotImplementedError: Iteration on Elements is not supported.

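Is the different time frequency the problem here? The line plot should be aligned along the x and y axes according to the timestamps and heights of the underlying cmap (matplotlib) plot.

To illustrate my goal, here is a picture of what I am aiming for:

Thanks for reading/helping.

I found a solution for this case: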

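The time columns of both datasets must have the same format. In my case that is datetime64[ns] (taken from the NetCDF xarray). That is why I converted the DataFrame's time column to datetime64[ns]:

df.Datetime = df.Datetime.astype('datetime64[ns]')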

I also found that the data was of type 'object', so I changed it to 'float':

df.PTU = df.PTU.astype(float) # convert to correct data type
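
The two conversions above can be tried on a small stand-in for the CSV. This is only a sketch: the three rows are copied from the printout in the question, `dtype=str` is used to mimic columns that arrive as plain strings, and renaming the unnamed first column to `Datetime` is an assumption made to match the column name used in this answer.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for PTU.csv; the unnamed first column
# is what pandas reports as "Unnamed: 0".
csv_text = """,PTU
2022-03-18 07:38:56,451.839
2022-03-18 07:38:57,468.826
2022-03-18 07:38:58,469.093
"""

# dtype=str mimics columns that were read in as plain strings
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Assumption: the time column is renamed to "Datetime" to match the code above
df = df.rename(columns={"Unnamed: 0": "Datetime"})

# Same two conversions as in the answer
df["Datetime"] = df["Datetime"].astype("datetime64[ns]")
df["PTU"] = df["PTU"].astype(float)

print(df.dtypes)
```

Checking `df.dtypes` before plotting is a quick way to confirm that the time column matches the datetime64[ns] coordinate of the xarray dataset.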

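The last step was to choose hvplot's quadmesh, since hvplot makes it easy to plot xarray data:

import hvplot.xarray  # registers the .hvplot accessor, which provides .hvplot.quadmesh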

Here is my final solution:

title = 'Ceilo data' + '\ndate: ' + str(DS.year) + '-' + str(DS.month) + '-' + str(DS.day)

ceilo = (DS.br.hvplot.quadmesh(cmap="viridis_r", width = 850, height = 600, title = title, 
                               clim = (1000, 10000),  # set colorbar limits
                               cnorm = ('log'), # choose log scale
                               clabel = ('colorbar title'),
                               rot = 0  # degree rotation of ticks
                               )
         )

# from: https://justinbois.github.io/bootcamp/2020/lessons/l27_holoviews.html
# take care! may take 2...3 minutes to be plotted:
p = hv.Points(data=df,
              kdims=['Datetime', 'PTU'],
              ).opts(#alpha=0.7, 
                    color='red',
                    size=1,
                    ylim=(0, 5000))

# add PTU line plot to quadmesh plot using * which overlays the line on the plot
ceilo * p
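
One more observation on the traceback in the question: its line 26 reads `ceilo * df`, so the raw DataFrame rather than the plot object `p` was on the right-hand side of `*`. The frames that follow suggest that the element did not handle the multiplication itself, Python then fell back to the DataFrame's `__rmul__`, and pandas tried to iterate over the element. The sketch below reproduces that dispatch with two hypothetical stand-in classes (`FakeElement`, `FakeFrame`); it is a simplified model of the mechanism, not the actual HoloViews or pandas implementation.

```python
class FakeElement:
    """Hypothetical stand-in for a HoloViews element."""

    def __mul__(self, other):
        # Only another element can be overlaid; defer for anything else
        if isinstance(other, FakeElement):
            return ("overlay", self, other)
        return NotImplemented

    def __iter__(self):
        raise NotImplementedError("Iteration on Elements is not supported.")


class FakeFrame:
    """Hypothetical stand-in for a pandas DataFrame."""

    def __rmul__(self, other):
        # Treats the left operand as list-like data and iterates over it
        return [item for item in other]


ceilo, p, df = FakeElement(), FakeElement(), FakeFrame()

# element * element: handled by FakeElement.__mul__
overlay = ceilo * p

# element * frame: __mul__ returns NotImplemented, so Python calls
# FakeFrame.__rmul__, which fails while iterating over the element
try:
    ceilo * df
except NotImplementedError as err:
    message = str(err)

print(message)
```

In other words, overlaying with `*` needs a plot object on both sides, as in `ceilo * p` above.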