How would one use an RNN when predicting temperature?



How would I shape my data frame so that it can be used in an RNN with Keras?

Assuming you have the following data structure, and we want to predict the temperature given 1 day in the past:

import tensorflow as tf
import pandas as pd
import numpy as np

df = pd.DataFrame(data={
    'temperature': np.random.random((1, 20)).ravel(),
    'pressure': np.random.random((1, 20)).ravel(),
    'humidity': np.random.random((1, 20)).ravel(),
    'wind': np.random.random((1, 20)).ravel()
})

temperature pressure humidity wind
0 0.0589101 0.278302 0.875369 0.622687
1 0.594924 0.797274 0.510012 0.374484
2 0.511291 0.334929 0.401483 0.77062
3 0.711329 0.72051 0.595685 0.872691
4 0.495425 0.520179 0.516858 0.628928
5 0.676054 0.67902 0.0213801 0.0267594
6 0.058189 0.69932 0.885174 0.00602091
7 0.708245 0.871698 0.345451 0.448352
8 0.958427 0.471423 0.412678 0.618024
9 0.941202 0.825181 0.211916 0.0808273
10 0.49252 0.541955 0.00522009 0.396557
11 0.323757 0.113585 0.797503 0.323961
12 0.819055 0.637116 0.285361 0.569794
13 0.95123 0.00604303 0.208746 0.150214
14 0.89466 0.948916 0.556422 0.555165
15 0.705789 0.269704 0.289568 0.391438
16 0.154502 0.703137 0.184157 0.765623
17 0.25974 0.934706 0.172775 0.412022
18 0.403475 0.144796 0.0224043 0.891236
19 0.922302 0.805214 0.0232178 0.951568


features = df.iloc[::2, :]  # even-indexed rows (0, 2, 4, ...): the "previous day" inputs
labels = df.iloc[1::2, :]  # odd-indexed rows (1, 3, 5, ...): the following day, since we predict the temperature given 1 day in the past


temperature pressure humidity wind
0 0.0589101 0.278302 0.875369 0.622687
2 0.511291 0.334929 0.401483 0.77062
4 0.495425 0.520179 0.516858 0.628928
6 0.058189 0.69932 0.885174 0.00602091
8 0.958427 0.471423 0.412678 0.618024
10 0.49252 0.541955 0.00522009 0.396557
12 0.819055 0.637116 0.285361 0.569794
14 0.89466 0.948916 0.556422 0.555165
16 0.154502 0.703137 0.184157 0.765623
18 0.403475 0.144796 0.0224043 0.891236


temperature pressure humidity wind
1 0.594924 0.797274 0.510012 0.374484
3 0.711329 0.72051 0.595685 0.872691
5 0.676054 0.67902 0.0213801 0.0267594
7 0.708245 0.871698 0.345451 0.448352
9 0.941202 0.825181 0.211916 0.0808273
11 0.323757 0.113585 0.797503 0.323961
13 0.95123 0.00604303 0.208746 0.150214
15 0.705789 0.269704 0.289568 0.391438
17 0.25974 0.934706 0.172775 0.412022
19 0.922302 0.805214 0.0232178 0.951568


features = features.to_numpy() # shape (10, 4)
labels = labels['temperature'].to_numpy() # shape (10,)
features = np.expand_dims(features, axis=1) # shape (10, 1, 4)

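Note that a time dimension is added to features. This means that each sample in the dataset represents one timestep (one day), and each timestep has 4 features (temperature, pressure, humidity, wind).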
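If you want each sample to cover more than one day, the same `(samples, timesteps, features)` layout still applies. The following is only a sketch under my own assumptions: `make_windows`, `window` and `target_col` are hypothetical names that are not part of the code above, and the random array stands in for `df.to_numpy()`.

```python
import numpy as np

def make_windows(values, target_col=0, window=3):
    """Slice a (days, features) array into overlapping windows.

    Returns X with shape (samples, window, features) and y with shape
    (samples,), where y[i] is the target column on the day right after
    window i ends.
    """
    values = np.asarray(values, dtype=np.float64)
    n_samples = len(values) - window
    if n_samples <= 0:
        raise ValueError("need more rows than the window length")
    # Window i covers days i .. i + window - 1
    X = np.stack([values[i:i + window] for i in range(n_samples)])
    # The label for window i is the target on day i + window
    y = values[window:, target_col]
    return X, y

rng = np.random.default_rng(0)
data = rng.random((20, 4))  # 20 days, 4 features (stand-in for df.to_numpy())
X, y = make_windows(data, target_col=0, window=3)
print(X.shape, y.shape)  # (17, 3, 4) (17,)
```

With `window=1` this yields arrays of shape `(19, 1, 4)` and `(19,)`, i.e. the same layout as above but using every consecutive pair of days rather than every other one.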

Build and run an RNN model:

inputs = tf.keras.layers.Input(shape=(features.shape[1], features.shape[2]))
rnn_out = tf.keras.layers.SimpleRNN(32)(inputs)
outputs = tf.keras.layers.Dense(1)(rnn_out) # one output = temperature

model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss="mse")
model.summary()  # prints the layer table shown below
history = model.fit(features, labels, batch_size=2, epochs=3)
Model: "model_1"
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, 1, 4)]            0         
 simple_rnn (SimpleRNN)      (None, 32)                1184      
 dense_1 (Dense)             (None, 1)                 33        
Total params: 1,217
Trainable params: 1,217
Non-trainable params: 0
Epoch 1/3
5/5 [==============================] - 1s 9ms/step - loss: 0.7859
Epoch 2/3
5/5 [==============================] - 0s 7ms/step - loss: 0.5862
Epoch 3/3
5/5 [==============================] - 0s 6ms/step - loss: 0.4354
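The parameter counts in the summary above can be checked by hand. A SimpleRNN layer has an input kernel, a recurrent kernel and a bias; a Dense layer has a kernel and a bias. The helper names below are my own, used only for this check:

```python
def simple_rnn_params(input_dim, units):
    # input kernel (input_dim x units) + recurrent kernel (units x units) + bias (units)
    return input_dim * units + units * units + units

def dense_params(input_dim, units):
    # kernel (input_dim x units) + bias (units)
    return input_dim * units + units

rnn = simple_rnn_params(input_dim=4, units=32)
dense = dense_params(input_dim=32, units=1)
print(rnn, dense, rnn + dense)  # 1184 33 1217
```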


samples = 1
model.predict(tf.random.normal((samples, 1, 4)))
# array([[-1.610171]], dtype=float32)


# You would usually also normalize your data before training
mean = df.mean(axis=0)
std = df.std(axis=0)
df = (df - mean) / std  # parentheses matter: subtract the mean first, then divide by std
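In practice it is common to compute the normalization statistics on the training portion only and to reuse them for new data and for converting predictions back to real units. A minimal sketch, assuming a hypothetical split after 16 days and using a random array as a stand-in for `df.to_numpy()`; `denormalize_temperature` is an illustrative helper, not part of the code above:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((20, 4))  # stand-in for df.to_numpy(); column 0 = temperature

split = 16  # hypothetical train/test boundary
train, test = data[:split], data[split:]

# Statistics come from the training rows only
mean = train.mean(axis=0)
std = train.std(axis=0)

train_norm = (train - mean) / std
test_norm = (test - mean) / std  # reuse the training statistics

def denormalize_temperature(pred, col=0):
    """Map a normalized temperature prediction back to the original scale."""
    return pred * std[col] + mean[col]

print(train_norm.mean(axis=0).round(6))
```

Note that NumPy's `std` uses the population formula by default while pandas' `DataFrame.std` uses the sample formula, so the two give slightly different scales. Either works as long as the same statistics are used everywhere.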
