Update join in pandas - python
How can I implement an update join with native pandas statements (for example .apply())?
I want to update PhySci in Dataframe1 with PSMark from Dataframe2.
Dataframe1 (tbl_ex):
| | Sname | Tamil | English | Maths | Science | Sscience | PhySci |
|---:|:--------|--------:|----------:|--------:|----------:|-----------:|---------:|
| 0 | Abu | 35 | 65 | 64 | 98 | 36 | 0 |
| 1 | Eric | 70 | 54 | 65 | 32 | 58 | 25 |
| 2 | Mani | 56 | 25 | 32 | 32 | 78 | 10 |
| 3 | Ram | 80 | 24 | 68 | 54 | 76 | 0 |
| 4 | Tom | 40 | 26 | 56 | 69 | 42 | 65 |
| 5 | Eva | 50 | 18 | 56 | 87 | 56 | 0 |
Dataframe2 (tbl_fy):
| | Sname | PSMark |
|---:|:--------|---------:|
| 0 | Tom | 69 |
| 1 | Ram | 54 |
| 2 | Mani | 32 |
| 3 | Eva | 87 |
| 4 | Sam | 89 |
I implemented this with the sqldf module:
q="""
UPDATE tbl_ex
SET
PhySci = (SELECT tbl_fy.PSMark
FROM tbl_fy
WHERE tbl_fy.Sname = tbl_ex.Sname )
WHERE
EXISTS (
SELECT *
FROM tbl_fy
WHERE tbl_fy.Sname = tbl_ex.Sname );
"""
sqldf.run(q)
print(tbl_ex.to_markdown())
Final result of tbl_ex:
| | Sname | Tamil | English | Maths | Science | Sscience | PhySci |
|---:|:--------|--------:|----------:|--------:|----------:|-----------:|---------:|
| 0 | Abu | 35 | 65 | 64 | 98 | 36 | 0 |
| 1 | Eric | 70 | 54 | 65 | 32 | 58 | 25 |
| 2 | Mani | 56 | 25 | 32 | 32 | 78 | 32 |
| 3 | Ram | 80 | 24 | 68 | 54 | 76 | 54 |
| 4 | Tom | 40 | 26 | 56 | 69 | 42 | 69 |
| 5 | Eva | 50 | 18 | 56 | 87 | 56 | 87 |
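Use pd.merge to left-merge the dataframes df1 and df2 on the column Sname. Rows of df1 with no match in df2 get NaN in PSMark, so use Series.fillna to fill those NaN values from the PhySci column, then assign the filled PSMark column back to PhySci: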
df = pd.merge(df1, df2, on='Sname', how='left')
df = df.assign(PhySci=df.pop('PSMark').fillna(df['PhySci']).astype(int))
Result:
# print(df)
Sname Tamil English Maths Science Sscience PhySci
0 Abu 35 65 64 98 36 0
1 Eric 70 54 65 32 58 25
2 Mani 56 25 32 32 78 32
3 Ram 80 24 68 54 76 54
4 Tom 40 26 56 69 42 69
5 Eva 50 18 56 87 56 87
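For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It rebuilds df1 and df2 from the tables in the question and applies the same merge and fillna steps. The variable names merged, lookup and mapped are only illustrative. The second variant, based on Series.map, is not part of the answer above; it is an alternative that should give the same result as long as Sname is unique in df2.

```python
import pandas as pd

# Sample data copied from the tables in the question.
df1 = pd.DataFrame({
    "Sname": ["Abu", "Eric", "Mani", "Ram", "Tom", "Eva"],
    "Tamil": [35, 70, 56, 80, 40, 50],
    "English": [65, 54, 25, 24, 26, 18],
    "Maths": [64, 65, 32, 68, 56, 56],
    "Science": [98, 32, 32, 54, 69, 87],
    "Sscience": [36, 58, 78, 76, 42, 56],
    "PhySci": [0, 25, 10, 0, 65, 0],
})
df2 = pd.DataFrame({
    "Sname": ["Tom", "Ram", "Mani", "Eva", "Sam"],
    "PSMark": [69, 54, 32, 87, 89],
})

# Approach from the answer: left merge, then fill unmatched rows
# (NaN in PSMark) with the existing PhySci value.
merged = pd.merge(df1, df2, on="Sname", how="left")
merged = merged.assign(
    PhySci=merged.pop("PSMark").fillna(merged["PhySci"]).astype(int)
)

# Alternative sketch (assumes Sname is unique in df2): look up PSMark
# by name with Series.map, which yields NaN for names missing in df2.
lookup = df2.set_index("Sname")["PSMark"]
mapped = df1.assign(
    PhySci=df1["Sname"].map(lookup).fillna(df1["PhySci"]).astype(int)
)

print(merged)
```

The left merge keeps every row of df1 and ignores Sam, who appears only in df2, which matches the EXISTS condition of the SQL statement. If df2 contained repeated Sname values, the merge would duplicate rows of df1 and the map variant would raise an error, so deduplicating df2 first would be advisable in that case.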