Sorting Data in Pandas
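I have a dataset like the one shown below. I'm trying to rearrange it so that the columns appear in this order: Week End, Australia, Germany, France, etc.
I tried using loc and assigning each subset to a variable, but creating a new DataFrame from them raises an error. Any help would be appreciated.
Here is the data before the change: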
Region | Week End | Value |
---|---|---|
Australia | 2014-01-11 | 1.480510 |
Germany | 2014-01-11 | 1.481258 |
France | 2014-01-11 | 0.986507 |
United Kingdom | 2014-01-11 | 1.973014 |
Italy | 2014-01-11 | 0.740629 |
Here is my desired output:
Week End | Australia | Germany | France | United Kingdom | Italy |
---|---|---|---|---|---|
2014-01-11 | 1.480510 | 1.481258 | 0.986507 | 1.973014 | 0.740629 |
What I tried:
cols = (['Region','Week End','Value'])
df = GS.loc[GS['Brand'].isin(rows)]
df = df[cols]
AUS = df.loc[df['Region'] == 'Australia']
JPN = df.loc[df['Region'] == 'Japan']
US = df.loc[df['Region'] == 'United States of America']
I think you can actually just do this:
df.pivot(index="Week End", columns="Region", values="Value")
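One thing to watch with pivot: the resulting columns come out sorted alphabetically (Australia, France, Germany, ...), not in the order asked for in the question. Below is a minimal sketch of one way to restore the original order, assuming the sample data from the question. The names `region_order` and `wide` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Region": ["Australia", "Germany", "France", "United Kingdom", "Italy"],
    "Week End": ["2014-01-11"] * 5,
    "Value": [1.480510, 1.481258, 0.986507, 1.973014, 0.740629],
})

# Order in which the regions first appear in the long-format data.
region_order = list(df["Region"].unique())

wide = (
    df.pivot(index="Week End", columns="Region", values="Value")
      .reindex(columns=region_order)  # undo the alphabetical column sort
      .reset_index()                  # turn "Week End" back into a regular column
)
wide.columns.name = None  # drop the leftover "Region" label on the column axis
print(wide)
```

If a fixed order is needed regardless of how the rows arrive, `region_order` could just as well be a hand-written list of region names.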
User 965311532's answer is more concise, but an alternative using a dictionary is:
# .iloc[0] takes the first row by position, even if the index does not start at 0
new_df = {'Week End': df['Week End'].iloc[0]}
new_df.update({region: value for region, value in zip(df['Region'], df['Value'])})
new_df = pd.DataFrame(new_df, index = [0])
As user 965311532 pointed out, the code above won't work if there is more than one date. In that case we can use pandas groupby:
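dates = []
for date, group in df.groupby('Week End'):
    date_df = {'Week End': date}
    # Read the values from this date's group, not from the whole df
    date_df.update({region: value for region, value in zip(group['Region'], group['Value'])})
    date_df = pd.DataFrame(date_df, index = [0])
    dates.append(date_df)
# ignore_index gives the combined rows a fresh 0..n-1 index
new_df = pd.concat(dates, ignore_index=True)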
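To see the multi-date case end to end, here is a small self-contained sketch of the same groupby idea. It is a variation rather than the answer's exact code: it collects one plain dict per date and builds the DataFrame once at the end instead of concatenating one-row frames. The 2014-01-18 rows are made-up values added purely for illustration.

```python
import pandas as pd

# The 2014-01-11 rows come from the question; the 2014-01-18 rows are
# invented values used only to illustrate a second date.
df = pd.DataFrame({
    "Region": ["Australia", "Germany", "France"] * 2,
    "Week End": ["2014-01-11"] * 3 + ["2014-01-18"] * 3,
    "Value": [1.480510, 1.481258, 0.986507, 1.5, 1.6, 0.9],
})

rows = []
for date, group in df.groupby("Week End"):
    row = {"Week End": date}
    # Pair each region with its value within this date's group only.
    row.update(zip(group["Region"], group["Value"]))
    rows.append(row)

# One row per date, one column per region.
new_df = pd.DataFrame(rows)
print(new_df)
```

For data like this, where each (date, region) pair occurs once, the result should hold the same values as the pivot one-liner, with the regions in their order of appearance.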