Shift Dataframe and filling up NaN

Question

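I want to create a DataFrame that contains the hourly heat demand of n consumers (here n=5) for 10 hours. --> DataFrame called "Village", with n columns (each representing a consumer) and 10 rows (10 hours). All consumers follow the same demand profile; the only difference is that it is shifted by a random number of hours. The random number follows a normal distribution.

I managed to create a list of discrete numbers that follow a normal distribution, and I managed to create a DataFrame with n columns in which the same demand profile is shifted by that random number.

The problem I can't solve is that NaN appears instead of the shifted-in hours being filled with the values that were cut off by the shift.

Example: if the demand profile is shifted by 1 hour (e.g. Consumer 5), "NaN" now appears as the demand in the first hour. Instead of "NaN", I would like the value of the 10th hour of the original demand profile (4755.005240) to be shown there. So rather than shifting the values of the demand profile, I want it to "rotate".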

   heat_demand
0  1896.107462
1  1964.878199
2  2072.946499
3  2397.151402
4  3340.292937
5  4912.195496
6  6159.893152
7  5649.024821
8  5157.805271
9  4755.005240

    Consumer 1   Consumer 2   Consumer 3   Consumer 4   Consumer 5
0  1896.107462          NaN  1964.878199          NaN          NaN
1  1964.878199          NaN  2072.946499          NaN  1896.107462
2  2072.946499          NaN  2397.151402          NaN  1964.878199
3  2397.151402  1896.107462  3340.292937  1896.107462  2072.946499
4  3340.292937  1964.878199  4912.195496  1964.878199  2397.151402
5  4912.195496  2072.946499  6159.893152  2072.946499  3340.292937
6  6159.893152  2397.151402  5649.024821  2397.151402  4912.195496
7  5649.024821  3340.292937  5157.805271  3340.292937  6159.893152
8  5157.805271  4912.195496  4755.005240  4912.195496  5649.024821
9  4755.005240  6159.893152          NaN  6159.893152  5157.805271

Can anyone give me a hint on how to solve this problem? Many thanks in advance and kind regards,

Louise

import numpy as np
import pandas as pd
import os

path = os.path.dirname(os.path.abspath(os.path.join(__file__)))

#Create a list with discrete numbers following normal distribution
n = 5
timeshift_1h = np.random.normal(loc=0.1085, scale=1.43825, size=n)
timeshift_1h = np.round(timeshift_1h).astype(int)
print ("Time Shift in h:", timeshift_1h)

#Read the Standard Load Profile
cols = ["heat_demand"]
df_StandardLoadProfile = pd.read_excel(os.path.join(path, '10_h_example.xlsx'),usecols=cols)
print(df_StandardLoadProfile)

#Create a df for n consumers, whose demand equals a shifted StandardLoadProfile.
#It is shifted by a random amount of hours, that is taken from the list timeshift_1h
list_consumers = list(range(1,n+1))
Village=pd.DataFrame()
for i in list_consumers:
    a=timeshift_1h[i-1]
    name = "Consumer {}".format(i)
    Village[name] = df_StandardLoadProfile.shift(a)
print(Village)

Answer 1

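There is a very nice numpy function for this use case, namely np.roll (see the NumPy documentation). It takes an array and shifts it by the number of steps given in shift. Elements that are pushed past one end of the array are re-introduced at the other end, so no NaN values appear.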
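As a quick illustration of this wrap-around behaviour, here is a minimal sketch on a small made-up array (the values are purely illustrative and not taken from the question):

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50])  # illustrative data only

# Positive shift: elements move toward higher indices,
# and the last element wraps around to the front.
rolled_forward = np.roll(values, shift=1)
print(rolled_forward)   # [50 10 20 30 40]

# Negative shift: elements move toward lower indices,
# and the first elements wrap around to the end.
rolled_backward = np.roll(values, shift=-2)
print(rolled_backward)  # [30 40 50 10 20]
```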

For your example, this could look like the following:

import pandas as pd
import numpy as np

df = pd.read_csv("demand.csv")
df['Consumer 1'] = np.roll(df["heat_demand"], shift=1)
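Extending this to all n consumers might look like the sketch below. It makes a few assumptions: the heat_demand values are typed in from the question instead of being read from a file, the shifts are hard-coded to the ones implied by the question's example table (0, 3, -1, 3, 1) so that the result is reproducible, and the name village is only an illustrative stand-in for the question's Village.

```python
import numpy as np
import pandas as pd

# Standard load profile, values copied from the question
profile = pd.Series(
    [1896.107462, 1964.878199, 2072.946499, 2397.151402, 3340.292937,
     4912.195496, 6159.893152, 5649.024821, 5157.805271, 4755.005240],
    name="heat_demand",
)

# Assumption: shifts hard-coded to match the question's example output.
# In the question's script they come from
# np.round(np.random.normal(...)).astype(int).
timeshift_1h = np.array([0, 3, -1, 3, 1])

# One rotated column per consumer; np.roll wraps the values around
# instead of introducing NaN the way DataFrame.shift does.
village = pd.DataFrame({
    "Consumer {}".format(i): np.roll(profile.to_numpy(), shift=int(a))
    for i, a in enumerate(timeshift_1h, start=1)
})
print(village)
```

With a shift of 1, the first hour of Consumer 5 then holds 4755.005240, the value from the 10th hour of the original profile, which is the behaviour asked for in the question.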

Answer 2

You can fill the NaN values from the reversed column:

df = pd.DataFrame(np.arange(10))
df
#   0
#0  0
#1  1
#2  2
#3  3
#4  4
#5  5
#6  6
#7  7
#8  8
#9  9

df[0].shift(3).fillna(pd.Series(reversed(df[0])))
#0    9.0
#1    8.0
#2    7.0
#3    0.0
#4    1.0
#5    2.0
#6    3.0
#7    4.0
#8    5.0
#9    6.0
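Note that the output above fills the gap in reversed order (9, 8, 7), whereas a rotation as described in the question would give 7, 8, 9. If a pandas-only rotation is preferred, one possible sketch is to re-assemble the Series from its tail and head. The helper name rotate is hypothetical and not part of pandas:

```python
import pandas as pd

def rotate(series, k):
    """Rotate a Series by k positions with wrap-around (hypothetical helper)."""
    n = len(series)
    if n == 0:
        return series.copy()
    k = k % n  # normalise negative or oversized shifts
    # Put the last k values in front of the first n - k values
    rotated = pd.concat([series.iloc[n - k:], series.iloc[:n - k]])
    rotated.index = series.index  # keep the original index labels
    return rotated

s = pd.Series(range(10))
print(rotate(s, 3).tolist())   # [7, 8, 9, 0, 1, 2, 3, 4, 5, 6]
print(rotate(s, -1).tolist())  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
```

Unlike shift followed by fillna, this keeps the integer dtype, because no NaN is ever introduced.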

Tags: python, time-series, shift, dataframe, pandas