How to transform dataframe to tensorflow data set for GRU model? ValueError: (Unsupported numpy type: NPY_DATETIME)
I am trying to build a GRU model, but I ran into a problem setting up the timestamps.
Here is a sample of my input:
import pandas as pd
import tensorflow as tf
Date = ['2021-08-06', '2021-08-07', '2021-08-08', '2021-08-09', '2021-08-10']
Date = pd.to_datetime(Date)
Close_SP = [4436.52, 4436.52, 4436.52, 4432.35, 4436.75]
Close_DJ = [333.96, 333.96, 333.96, 332.12, 328.85]
Close_Nasdaq = [14835.8, 14835.8, 14835.8, 14860.2, 14788.1]
X = pd.DataFrame({'Close_SP': Close_SP, 'Close_DJ': Close_DJ, 'Close_Nasdaq': Close_Nasdaq}, index = Date)
X.head()
            Close_SP  Close_DJ  Close_Nasdaq
2021-08-06   4436.52    333.96       14835.8
2021-08-07   4436.52    333.96       14835.8
2021-08-08   4436.52    333.96       14835.8
2021-08-09   4432.35    332.12       14860.2
2021-08-10   4436.75    328.85       14788.1
The GRU model expects input of shape (batch size, timestamp, features), so I planned to extract the dates and the features separately and then zip them.
x1 = tf.convert_to_tensor(X.index)
x2 = tf.convert_to_tensor(X)
input = tf.data.Dataset.zip((x1, x2))
However, I get ValueError: Failed to convert a NumPy array to a Tensor (Unsupported numpy type: NPY_DATETIME).
So how can I fix this? Is there another efficient way to achieve my goal?
I think you just need to convert the datetime objects to integer timestamps.
x1 = tf.convert_to_tensor(X.index.values.astype(np.int64))
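One detail worth checking with the conversion above is the unit of the resulting integers. When the index is stored at nanosecond resolution (the usual case in many pandas versions), `astype(np.int64)` yields nanoseconds since the epoch rather than seconds. The following is a minimal sketch, assuming you want to choose the unit explicitly; the variable names `dates`, `ns` and `secs` are only illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(['2021-08-06', '2021-08-07', '2021-08-08'])

# Cast to an explicit datetime64 unit first, so the meaning of the
# integers does not depend on how pandas stores the index internally.
ns = dates.values.astype('datetime64[ns]').astype(np.int64)   # nanoseconds since epoch
secs = dates.values.astype('datetime64[s]').astype(np.int64)  # seconds since epoch

print(ns[0], secs[0])
```

Either integer array can then be passed to `tf.convert_to_tensor`. Seconds (or days) keep the values smaller, which may matter if the timestamps are later cast to float and fed to the model as a feature.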
I also ran into another error on this line:
input = tf.data.Dataset.zip((x1, x2))
TypeError: The argument to Dataset.zip() must be a (nested) structure of Dataset objects.
To fix this, I converted both tensors to datasets.
d1 = tf.data.Dataset.from_tensors(x1)
d2 = tf.data.Dataset.from_tensors(x2)
input = tf.data.Dataset.zip((d1, d2))
This produces the object <ZipDataset shapes: ((5,), (5, 3)), types: (tf.int64, tf.float64)>.
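Note that `from_tensors` wraps each whole array as a single dataset element, which is why the shapes above are (5,) and (5, 3). If the goal is one element per row, or sliding windows shaped (batch size, timesteps, features) for a GRU, something along these lines may be closer to what is needed. This is only a sketch under assumptions: the window length of 3, the batch size of 2 and the stand-in arrays are illustrative choices, not part of the original question.

```python
import numpy as np
import tensorflow as tf

timestamps = np.arange(5, dtype=np.int64)            # stand-in for the converted dates
features = np.random.rand(5, 3).astype(np.float32)   # stand-in for the three Close columns

# from_tensors: each whole array becomes ONE dataset element.
whole = tf.data.Dataset.zip((
    tf.data.Dataset.from_tensors(timestamps),
    tf.data.Dataset.from_tensors(features),
))

# from_tensor_slices: one dataset element per row (timestamp, feature vector).
per_row = tf.data.Dataset.from_tensor_slices((timestamps, features))

n_whole = len(list(whole))
n_rows = len(list(per_row))

# Sliding windows of 3 consecutive rows, then batches of 2 windows,
# giving tensors shaped (batch, timesteps, features).
windows = (
    tf.data.Dataset.from_tensor_slices(features)
    .window(3, shift=1, drop_remainder=True)
    .flat_map(lambda w: w.batch(3))
    .batch(2)
)
first_batch = next(iter(windows))
print(n_whole, n_rows, first_batch.shape)
```

In this layout the "timestamp" axis of the GRU input is the position inside each window, so the actual date values do not need to be zipped in unless they are meant to be used as an extra feature.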
String to Unix timestamp
import numpy as np
import pandas as pd
import tensorflow as tf
Date = ['2021-08-06', '2021-08-07', '2021-08-08', '2021-08-09', '2021-08-10']
Close_SP = [4436.52, 4436.52, 4436.52, 4432.35, 4436.75]
Close_DJ = [333.96, 333.96, 333.96, 332.12, 328.85]
Close_Nasdaq = [14835.8, 14835.8, 14835.8, 14860.2, 14788.1]
X = pd.DataFrame({'Date':Date, 'Close_SP': Close_SP, 'Close_DJ': Close_DJ, 'Close_Nasdaq': Close_Nasdaq})
X.head()
inp = X.iloc[:, :1].values.astype('datetime64[s]').astype(np.int64)  # date strings -> Unix seconds, shape (5, 1)
input_dates = tf.convert_to_tensor(inp) # dates
print(input_dates)
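Because `X.iloc[:, :1]` selects a one-column DataFrame, the resulting array has shape (5, 1). If a flat vector of timestamps is preferred, selecting the column by name and parsing it with `pd.to_datetime` is one alternative. A minimal sketch, assuming the same `Date` column of ISO-formatted strings (shortened here to two rows):

```python
import numpy as np
import pandas as pd

X = pd.DataFrame({
    'Date': ['2021-08-06', '2021-08-07'],
    'Close_SP': [4436.52, 4436.52],
})

# Parse the strings, then cast to second resolution before taking integers.
unix_seconds = (
    pd.to_datetime(X['Date'])
    .to_numpy()
    .astype('datetime64[s]')
    .astype(np.int64)
)
print(unix_seconds.shape, unix_seconds)
```

The resulting 1-D int64 array can be passed to `tf.convert_to_tensor` in the same way as `inp` above.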