How to transform a DataFrame into a TensorFlow dataset for a GRU model? ValueError: (Unsupported numpy type: NPY_DATETIME)

I am trying to build a GRU model, but I ran into a problem with setting up the timestamps.

Here is a sample of my input:

import pandas as pd
import tensorflow as tf

Date = ['2021-08-06', '2021-08-07', '2021-08-08', '2021-08-09', '2021-08-10']
Date = pd.to_datetime(Date)
Close_SP = [4436.52, 4436.52, 4436.52, 4432.35, 4436.75]
Close_DJ = [333.96, 333.96, 333.96, 332.12, 328.85]
Close_Nasdaq = [14835.8, 14835.8, 14835.8, 14860.2, 14788.1]

X = pd.DataFrame({'Close_SP': Close_SP, 'Close_DJ': Close_DJ, 'Close_Nasdaq': Close_Nasdaq}, index = Date)

X.head()

            Close_SP  Close_DJ  Close_Nasdaq
2021-08-06   4436.52    333.96       14835.8
2021-08-07   4436.52    333.96       14835.8
2021-08-08   4436.52    333.96       14835.8
2021-08-09   4432.35    332.12       14860.2
2021-08-10   4436.75    328.85       14788.1

The input shape for a GRU model is (batch size, timestamp, features), so I planned to extract the dates and the features separately and then zip them together.

x1 = tf.convert_to_tensor(X.index)
x2 = tf.convert_to_tensor(X)

input = tf.data.Dataset.zip((x1, x2))

However, I get ValueError: Failed to convert a NumPy array to a Tensor (Unsupported numpy type: NPY_DATETIME)

So how can I solve this problem? Is there another efficient way to achieve my goal?

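I think you just need to convert the datetime objects to integer timestamps, because TensorFlow has no dtype for NumPy datetime64 values.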

import numpy as np

x1 = tf.convert_to_tensor(X.index.values.astype(np.int64))
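One thing to watch with this cast: the integers it produces are counts in whatever unit the index stores internally (often nanoseconds), so the magnitude can vary between pandas versions. A minimal sketch, assuming you want a fixed unit, is to cast to an explicit datetime64 unit first. The variable names `seconds` and `days` are only illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(['2021-08-06', '2021-08-07', '2021-08-08'])

# Pin the unit explicitly so the integers do not depend on the
# resolution the DatetimeIndex happens to use internally.
seconds = dates.values.astype('datetime64[s]').astype(np.int64)  # seconds since 1970-01-01
days = dates.values.astype('datetime64[D]').astype(np.int64)     # days since 1970-01-01

print(seconds)
print(days)
```

Either array can then be passed to tf.convert_to_tensor without the NPY_DATETIME error, since it is a plain int64 array.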

I also ran into another error on this line:

input = tf.data.Dataset.zip((x1, x2))

TypeError: The argument to Dataset.zip() must be a (nested) structure of Dataset objects.

To fix this, I converted both tensors to datasets.

d1 = tf.data.Dataset.from_tensors(x1)
d2 = tf.data.Dataset.from_tensors(x2)

input = tf.data.Dataset.zip((d1, d2))

This results in the object <ZipDataset shapes: ((5,), (5, 3)), types: (tf.int64, tf.float64)>
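Note that from_tensors wraps each whole tensor as a single dataset element, which is why the shapes above are (5,) and (5, 3). If one element per row is what you are after, from_tensor_slices may be a better fit. The sketch below rebuilds the sample data from the question; the names `whole` and `per_row` are only illustrative.

```python
import numpy as np
import pandas as pd
import tensorflow as tf

dates = pd.to_datetime(['2021-08-06', '2021-08-07', '2021-08-08',
                        '2021-08-09', '2021-08-10'])
features = np.array([
    [4436.52, 333.96, 14835.8],
    [4436.52, 333.96, 14835.8],
    [4436.52, 333.96, 14835.8],
    [4432.35, 332.12, 14860.2],
    [4436.75, 328.85, 14788.1],
])

x1 = tf.convert_to_tensor(dates.values.astype('datetime64[s]').astype(np.int64))
x2 = tf.convert_to_tensor(features)

# from_tensors: one element that holds the entire tensor.
whole = tf.data.Dataset.zip((tf.data.Dataset.from_tensors(x1),
                             tf.data.Dataset.from_tensors(x2)))

# from_tensor_slices: slices along the first axis, one element per row.
per_row = tf.data.Dataset.from_tensor_slices((x1, x2))

n_whole = sum(1 for _ in whole)
n_rows = sum(1 for _ in per_row)
print(n_whole, n_rows)
```

With per_row, each element is a (scalar date, 3-feature vector) pair, so calling .batch() on it groups rows rather than whole tables.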

String to Unix timestamp

import numpy as np
import pandas as pd
import tensorflow as tf

Date = ['2021-08-06', '2021-08-07', '2021-08-08', '2021-08-09', '2021-08-10']
Close_SP = [4436.52, 4436.52, 4436.52, 4432.35, 4436.75]
Close_DJ = [333.96, 333.96, 333.96, 332.12, 328.85]
Close_Nasdaq = [14835.8, 14835.8, 14835.8, 14860.2, 14788.1]

X = pd.DataFrame({'Date':Date, 'Close_SP': Close_SP, 'Close_DJ': Close_DJ, 'Close_Nasdaq': Close_Nasdaq})
X.head()

inp = X.iloc[:, :1].values.astype('datetime64[s]').astype(np.int64)  # date strings -> Unix seconds
input_dates = tf.convert_to_tensor(inp)  # dates as an int64 tensor
print(input_dates)
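On the (batch size, timestamp, features) shape from the question: the second axis of a Keras GRU input is normally the number of timesteps in each sequence, not the date value itself. A rough sketch of how the feature rows could be cut into overlapping windows of that shape is shown below. It uses NumPy's sliding_window_view (NumPy 1.20 or later), and the window length of 2 is an arbitrary choice for illustration, not something taken from the question.

```python
import numpy as np

features = np.array([
    [4436.52, 333.96, 14835.8],
    [4436.52, 333.96, 14835.8],
    [4436.52, 333.96, 14835.8],
    [4432.35, 332.12, 14860.2],
    [4436.75, 328.85, 14788.1],
])

timesteps = 2  # hypothetical window length, chosen only for this example

# sliding_window_view appends the window axis last: shape (4, 3, 2).
windows = np.lib.stride_tricks.sliding_window_view(features, timesteps, axis=0)

# Reorder the axes to (batch, timesteps, features): shape (4, 2, 3).
windows = windows.transpose(0, 2, 1)

print(windows.shape)
```

The resulting array can be handed to tf.data.Dataset.from_tensor_slices, and the date integers can be kept alongside it if they are needed as an extra feature or for bookkeeping.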
