将数据框转换为不同的形式
Transform data frame to a different form
这是我的数据框。
Date
Country
Value
1/4/1971
Sweden
5.1643
1/5/1971
Sweden
5.1628
1/6/1971
Sweden
5.1614
1/7/1971
Sweden
5.1649
1/8/1971
Sweden
5.1631
1/4/1971
Canada
1.0109
1/5/1971
Canada
1.0102
1/6/1971
Canada
1.0106
1/7/1971
Canada
1.0148
1/8/1971
Canada
1.0154
1/4/1971
India
8.02
1/5/1971
India
8.00
1/6/1971
India
8.01
1/7/1971
India
8.00
1/8/1971
India
8.03
我想使用 python 和 panda 像下面这样的数据框。
Date
Sweden
Canada
India
1/4/1971
5.1643
1.0109
8.02
1/5/1971
5.1628
1.0102
8
1/6/1971
5.1614
1.0106
8.01
1/7/1971
5.1649
1.0148
8
1/8/1971
5.1631
1.0154
8.03
请帮帮我。
谢谢。
我们在这里创建您的数据框以供测试..
import pandas as pd
arr = [['1/4/1971', 'Sweden', '5.1643'],
['1/5/1971', 'Sweden', '5.1628'],
['1/6/1971', 'Sweden', '5.1614'],
['1/7/1971', 'Sweden', '5.1649'],
['1/8/1971', 'Sweden', '5.1631'],
['1/4/1971', 'Canada', '1.0109'],
['1/5/1971', 'Canada', '1.0102'],
['1/6/1971', 'Canada', '1.0106'],
['1/7/1971', 'Canada', '1.0148'],
['1/8/1971', 'Canada', '1.0154'],
['1/4/1971', 'India', '8.02'],
['1/5/1971', 'India', '8.00'],
['1/6/1971', 'India', '8.01'],
['1/7/1971', 'India', '8.00'],
['1/8/1971', 'India', '8.03']]
df = pd.DataFrame(arr,columns=['Date','Country','Value'])
print('old form')
print(df)
输出应该是这样的:
old form
Date Country Value
0 1/4/1971 Sweden 5.1643
1 1/5/1971 Sweden 5.1628
2 1/6/1971 Sweden 5.1614
3 1/7/1971 Sweden 5.1649
4 1/8/1971 Sweden 5.1631
5 1/4/1971 Canada 1.0109
6 1/5/1971 Canada 1.0102
7 1/6/1971 Canada 1.0106
8 1/7/1971 Canada 1.0148
9 1/8/1971 Canada 1.0154
10 1/4/1971 India 8.02
11 1/5/1971 India 8.00
12 1/6/1971 India 8.01
13 1/7/1971 India 8.00
14 1/8/1971 India 8.03
让我们施展魔法:
note: this code not optimized but works well
table = {}
for row in df.values:
date = row[0]
country = row[1]
value = row[2]
if date not in table:table[date] = {country:value}
else:table[date][country] = value
arr = []
for date in table.keys():
row = table[date]
row = [date,row['Sweden'],row['Canada'],row['India']]
arr.append(row)
df2 = pd.DataFrame(arr,columns=['Date','Sweden','Canada','India'])
print('new form')
print(df2)
最终输出应该是
new form
Date Sweden Canada India
0 1/4/1971 5.1643 1.0109 8.02
1 1/5/1971 5.1628 1.0102 8.00
2 1/6/1971 5.1614 1.0106 8.01
3 1/7/1971 5.1649 1.0148 8.00
4 1/8/1971 5.1631 1.0154 8.03
您可以使用数据框的 pivot
方法执行此操作。
代码
以下代码假定原始数据位于名为 test.csv
.
的文件中
import pandas as pd
df = pd.read_csv('test.csv')
print(df)
df = df.pivot(index='Date', columns='Country', values = 'Value').reset_index()
print(df)
之前
Date
Country
Value
1/4/1971
Sweden
5.1643
1/5/1971
Sweden
5.1628
1/6/1971
Sweden
5.1614
1/7/1971
Sweden
5.1649
1/8/1971
Sweden
5.1631
1/4/1971
Canada
1.0109
1/5/1971
Canada
1.0102
1/6/1971
Canada
1.0106
1/7/1971
Canada
1.0148
1/8/1971
Canada
1.0154
1/4/1971
India
8.02
1/5/1971
India
8
1/6/1971
India
8.01
1/7/1971
India
8
1/8/1971
India
8.03
之后
Country
Date
Canada
India
Sweden
0
1/4/1971
1.0109
8.02
5.1643
1
1/5/1971
1.0102
8
5.1628
2
1/6/1971
1.0106
8.01
5.1614
3
1/7/1971
1.0148
8
5.1649
4
1/8/1971
1.0154
8.03
5.1631
这是我的数据框。
Date | Country | Value |
---|---|---|
1/4/1971 | Sweden | 5.1643 |
1/5/1971 | Sweden | 5.1628 |
1/6/1971 | Sweden | 5.1614 |
1/7/1971 | Sweden | 5.1649 |
1/8/1971 | Sweden | 5.1631 |
1/4/1971 | Canada | 1.0109 |
1/5/1971 | Canada | 1.0102 |
1/6/1971 | Canada | 1.0106 |
1/7/1971 | Canada | 1.0148 |
1/8/1971 | Canada | 1.0154 |
1/4/1971 | India | 8.02 |
1/5/1971 | India | 8.00 |
1/6/1971 | India | 8.01 |
1/7/1971 | India | 8.00 |
1/8/1971 | India | 8.03 |
我想使用 python 和 panda 像下面这样的数据框。
Date | Sweden | Canada | India |
---|---|---|---|
1/4/1971 | 5.1643 | 1.0109 | 8.02 |
1/5/1971 | 5.1628 | 1.0102 | 8 |
1/6/1971 | 5.1614 | 1.0106 | 8.01 |
1/7/1971 | 5.1649 | 1.0148 | 8 |
1/8/1971 | 5.1631 | 1.0154 | 8.03 |
请帮帮我。 谢谢。
我们在这里创建您的数据框以供测试..
import pandas as pd
arr = [['1/4/1971', 'Sweden', '5.1643'],
['1/5/1971', 'Sweden', '5.1628'],
['1/6/1971', 'Sweden', '5.1614'],
['1/7/1971', 'Sweden', '5.1649'],
['1/8/1971', 'Sweden', '5.1631'],
['1/4/1971', 'Canada', '1.0109'],
['1/5/1971', 'Canada', '1.0102'],
['1/6/1971', 'Canada', '1.0106'],
['1/7/1971', 'Canada', '1.0148'],
['1/8/1971', 'Canada', '1.0154'],
['1/4/1971', 'India', '8.02'],
['1/5/1971', 'India', '8.00'],
['1/6/1971', 'India', '8.01'],
['1/7/1971', 'India', '8.00'],
['1/8/1971', 'India', '8.03']]
df = pd.DataFrame(arr,columns=['Date','Country','Value'])
print('old form')
print(df)
输出应该是这样的:
old form
Date Country Value
0 1/4/1971 Sweden 5.1643
1 1/5/1971 Sweden 5.1628
2 1/6/1971 Sweden 5.1614
3 1/7/1971 Sweden 5.1649
4 1/8/1971 Sweden 5.1631
5 1/4/1971 Canada 1.0109
6 1/5/1971 Canada 1.0102
7 1/6/1971 Canada 1.0106
8 1/7/1971 Canada 1.0148
9 1/8/1971 Canada 1.0154
10 1/4/1971 India 8.02
11 1/5/1971 India 8.00
12 1/6/1971 India 8.01
13 1/7/1971 India 8.00
14 1/8/1971 India 8.03
让我们施展魔法:
note: this code not optimized but works well
table = {}
for row in df.values:
date = row[0]
country = row[1]
value = row[2]
if date not in table:table[date] = {country:value}
else:table[date][country] = value
arr = []
for date in table.keys():
row = table[date]
row = [date,row['Sweden'],row['Canada'],row['India']]
arr.append(row)
df2 = pd.DataFrame(arr,columns=['Date','Sweden','Canada','India'])
print('new form')
print(df2)
最终输出应该是
new form
Date Sweden Canada India
0 1/4/1971 5.1643 1.0109 8.02
1 1/5/1971 5.1628 1.0102 8.00
2 1/6/1971 5.1614 1.0106 8.01
3 1/7/1971 5.1649 1.0148 8.00
4 1/8/1971 5.1631 1.0154 8.03
您可以使用数据框的 pivot
方法执行此操作。
代码
以下代码假定原始数据位于名为 test.csv
.
import pandas as pd
df = pd.read_csv('test.csv')
print(df)
df = df.pivot(index='Date', columns='Country', values = 'Value').reset_index()
print(df)
之前
Date | Country | Value |
---|---|---|
1/4/1971 | Sweden | 5.1643 |
1/5/1971 | Sweden | 5.1628 |
1/6/1971 | Sweden | 5.1614 |
1/7/1971 | Sweden | 5.1649 |
1/8/1971 | Sweden | 5.1631 |
1/4/1971 | Canada | 1.0109 |
1/5/1971 | Canada | 1.0102 |
1/6/1971 | Canada | 1.0106 |
1/7/1971 | Canada | 1.0148 |
1/8/1971 | Canada | 1.0154 |
1/4/1971 | India | 8.02 |
1/5/1971 | India | 8 |
1/6/1971 | India | 8.01 |
1/7/1971 | India | 8 |
1/8/1971 | India | 8.03 |
之后
Country | Date | Canada | India | Sweden |
---|---|---|---|---|
0 | 1/4/1971 | 1.0109 | 8.02 | 5.1643 |
1 | 1/5/1971 | 1.0102 | 8 | 5.1628 |
2 | 1/6/1971 | 1.0106 | 8.01 | 5.1614 |
3 | 1/7/1971 | 1.0148 | 8 | 5.1649 |
4 | 1/8/1971 | 1.0154 | 8.03 | 5.1631 |