Pandas:根据另一个数据框的行创建一个数据框
Pandas: Create a data frame based on rows of another data frame
我有一个问题,有点复杂。
我有这样一个数据框:
commodity_name first_delivery_date last_delivery_date last_trading_date tenor delivery_window new_tenor Vol
<chr> <dttm> <dttm> <dttm> <chr> <chr> <chr> <int>
1 oil 2021-06-01 00:00:00 2021-06-30 00:00:00 2021-04-30 00:00:00 month Jun 21 Jun 21 29000
2 gold 2022-03-01 00:00:00 2022-03-31 00:00:00 2022-02-28 00:00:00 month Mar 22 Mar 22 -800
3 oil 2021-07-01 00:00:00 2021-07-31 00:00:00 2021-05-31 00:00:00 month Jul 21 Jul 21 -21000
4 gold 2021-09-01 00:00:00 2021-09-30 00:00:00 2021-08-31 00:00:00 month Sep 21 Sep 21 1100
5 gold 2021-02-01 00:00:00 2021-02-28 00:00:00 2021-01-29 00:00:00 month Feb 21 Feb 21 -3000
6 depower 2021-01-01 00:00:00 2021-01-31 00:00:00 2020-12-30 00:00:00 quarter Jan 21 Q1 21 -3
7 oil 2022-04-01 00:00:00 2022-04-30 00:00:00 2022-02-28 00:00:00 month Apr 22 Apr 22 23000
8 czpower 2023-02-01 00:00:00 2023-02-28 00:00:00 2023-01-30 00:00:00 quarter Feb 23 Q1 23 26
9 oil 2021-02-01 00:00:00 2021-02-28 00:00:00 2020-12-31 00:00:00 quarter Feb 21 Q1 21 -17000
10 gold 2021-05-01 00:00:00 2021-05-31 00:00:00 2021-04-30 00:00:00 month May 21 May 21 2400
我想根据以下条件从中创建另一个数据框:
- 对于
YY
年,如果旧数据框中的 new_tenor
为 Q1 YY
:在新数据框中创建三行,其中 new_tenor
为 [=分别为 17=、Feb YY
和 Mar YY
。所有其他变量保持不变;
- 如果旧数据框中的
new_tenor
为 Q2 YY
:在新数据框中创建三行,其中 new_tenor
为 Apr YY
,May YY
和 Jun YY
。所有其他变量保持不变;
- 如果旧数据框中的
new_tenor
为 Q3 YY
:在新数据框中创建三行,其中 new_tenor
为 Jul YY
,Aug YY
和 Sep YY
。所有其他变量保持不变;
- 如果旧数据框中的
new_tenor
为 Q4 YY
:在新数据框中创建三行,其中 new_tenor
为 Oct YY
,Nov YY
和 Dec YY
。所有其他变量保持不变;
- 如果旧数据框中的
new_tenor
为 Cal YY
:在新数据框中创建六行,其中 new_tenor
为 Jan YY+1
,Feb YY+1
、Mar YY+1
、Q2 YY+1
、Q3 YY+1
和 Q4 YY+1
。所有其他变量保持不变;
问题很简单,主要取决于 YY
的值,新数据框中的其他所有内容都与旧数据框相同。
我尝试使用以下代码解决问题:
my_df = []
for index, row in ss.iterrows():
# d = row["NewTenor"].split()
# year = d[1]
print(year)
if "Cal" in row["NewTenor"]:
# Go to next year
# Add Jan, Feb, and Mar
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Jan','21':'22','22':'23','23':'24'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Cal':'Feb','21':'22','22':'23','23':'24'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Cal':'Mar','21':'22','22':'23','23':'24'})
# Add Q2, Q3, and Q4
temp_4 = row
temp_4['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q2','21':'22','22':'23','23':'24'})
temp_5 = row
temp_5['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q3','21':'22','22':'23','23':'24'})
temp_6 = row
temp_6['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q4','21':'22','22':'23','23':'24'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
my_df.append(temp_4)
my_df.append(temp_5)
my_df.append(temp_6)
elif "Q1" in row["NewTenor"]:
# Add Jan, Feb, and Mar
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q1':'Jan'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q1':'Feb'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q1':'Mar'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
elif "Q2" in row["NewTenor"]:
# Add Apr, May, and Jun
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q2':'Apr'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q2':'May'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q2':'Jun'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
elif "Q3" in row["NewTenor"]:
# Add Jul, Aug, and Sep
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q3':'Jul'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q3':'Aug'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q3':'Sep'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
else :
# Add Oct, Nov, and Dec
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q4':'Oct'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q4':'Nov'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q4':'Dec'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
my_df = pd.DataFrame(my_df)
这并不复杂,而且总是给我错误。
有人可以帮我创建新的数据框吗?提前谢谢你。
如果我理解正确的话:
def split_tenor(tenor):
start, year = tenor.split(" ")
if start == "Cal":
months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
year = int(year) + 1
elif start == "Q1":
months = ["Jan", "Feb", "Mar"]
elif start == "Q2":
months = ["Apr", "May", "Jun"]
elif start == "Q3":
months = ["Jul", "Aug", "Sep"]
elif start == "Q4":
months = ["Oct", "Nov", "Dec"]
else:
return tenor
return [f"{m} {year}" for m in months]
df["new_tenor"] = df["new_tenor"].apply(split_tenor)
df.explode("new_tenor")
我有一个问题,有点复杂。
我有这样一个数据框:
commodity_name first_delivery_date last_delivery_date last_trading_date tenor delivery_window new_tenor Vol
<chr> <dttm> <dttm> <dttm> <chr> <chr> <chr> <int>
1 oil 2021-06-01 00:00:00 2021-06-30 00:00:00 2021-04-30 00:00:00 month Jun 21 Jun 21 29000
2 gold 2022-03-01 00:00:00 2022-03-31 00:00:00 2022-02-28 00:00:00 month Mar 22 Mar 22 -800
3 oil 2021-07-01 00:00:00 2021-07-31 00:00:00 2021-05-31 00:00:00 month Jul 21 Jul 21 -21000
4 gold 2021-09-01 00:00:00 2021-09-30 00:00:00 2021-08-31 00:00:00 month Sep 21 Sep 21 1100
5 gold 2021-02-01 00:00:00 2021-02-28 00:00:00 2021-01-29 00:00:00 month Feb 21 Feb 21 -3000
6 depower 2021-01-01 00:00:00 2021-01-31 00:00:00 2020-12-30 00:00:00 quarter Jan 21 Q1 21 -3
7 oil 2022-04-01 00:00:00 2022-04-30 00:00:00 2022-02-28 00:00:00 month Apr 22 Apr 22 23000
8 czpower 2023-02-01 00:00:00 2023-02-28 00:00:00 2023-01-30 00:00:00 quarter Feb 23 Q1 23 26
9 oil 2021-02-01 00:00:00 2021-02-28 00:00:00 2020-12-31 00:00:00 quarter Feb 21 Q1 21 -17000
10 gold 2021-05-01 00:00:00 2021-05-31 00:00:00 2021-04-30 00:00:00 month May 21 May 21 2400
我想根据以下条件从中创建另一个数据框:
- 对于
YY
年,如果旧数据框中的new_tenor
为Q1 YY
:在新数据框中创建三行,其中new_tenor
为 [=分别为 17=、Feb YY
和Mar YY
。所有其他变量保持不变; - 如果旧数据框中的
new_tenor
为Q2 YY
:在新数据框中创建三行,其中new_tenor
为Apr YY
,May YY
和Jun YY
。所有其他变量保持不变; - 如果旧数据框中的
new_tenor
为Q3 YY
:在新数据框中创建三行,其中new_tenor
为Jul YY
,Aug YY
和Sep YY
。所有其他变量保持不变; - 如果旧数据框中的
new_tenor
为Q4 YY
:在新数据框中创建三行,其中new_tenor
为Oct YY
,Nov YY
和Dec YY
。所有其他变量保持不变; - 如果旧数据框中的
new_tenor
为Cal YY
:在新数据框中创建六行,其中new_tenor
为Jan YY+1
,Feb YY+1
、Mar YY+1
、Q2 YY+1
、Q3 YY+1
和Q4 YY+1
。所有其他变量保持不变;
问题很简单,主要取决于 YY
的值,新数据框中的其他所有内容都与旧数据框相同。
我尝试使用以下代码解决问题:
my_df = []
for index, row in ss.iterrows():
# d = row["NewTenor"].split()
# year = d[1]
print(year)
if "Cal" in row["NewTenor"]:
# Go to next year
# Add Jan, Feb, and Mar
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Jan','21':'22','22':'23','23':'24'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Cal':'Feb','21':'22','22':'23','23':'24'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Cal':'Mar','21':'22','22':'23','23':'24'})
# Add Q2, Q3, and Q4
temp_4 = row
temp_4['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q2','21':'22','22':'23','23':'24'})
temp_5 = row
temp_5['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q3','21':'22','22':'23','23':'24'})
temp_6 = row
temp_6['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q4','21':'22','22':'23','23':'24'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
my_df.append(temp_4)
my_df.append(temp_5)
my_df.append(temp_6)
elif "Q1" in row["NewTenor"]:
# Add Jan, Feb, and Mar
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q1':'Jan'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q1':'Feb'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q1':'Mar'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
elif "Q2" in row["NewTenor"]:
# Add Apr, May, and Jun
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q2':'Apr'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q2':'May'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q2':'Jun'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
elif "Q3" in row["NewTenor"]:
# Add Jul, Aug, and Sep
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q3':'Jul'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q3':'Aug'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q3':'Sep'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
else :
# Add Oct, Nov, and Dec
temp_1 = row
temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q4':'Oct'})
temp_2 = row
temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q4':'Nov'})
temp_3 = row
temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q4':'Dec'})
# Append to data frame
my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
my_df = pd.DataFrame(my_df)
这并不复杂,而且总是给我错误。
有人可以帮我创建新的数据框吗?提前谢谢你。
如果我理解正确的话:
def split_tenor(tenor):
start, year = tenor.split(" ")
if start == "Cal":
months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
year = int(year) + 1
elif start == "Q1":
months = ["Jan", "Feb", "Mar"]
elif start == "Q2":
months = ["Apr", "May", "Jun"]
elif start == "Q3":
months = ["Jul", "Aug", "Sep"]
elif start == "Q4":
months = ["Oct", "Nov", "Dec"]
else:
return tenor
return [f"{m} {year}" for m in months]
df["new_tenor"] = df["new_tenor"].apply(split_tenor)
df.explode("new_tenor")