How to highlight unmatched rows and update a marker column in Excel using Python?
I'm teaching myself programming and have run into a problem.
- I have two different Excel files to compare...
Data1.xlsx
| Name | Reg Date |
|Annie | 2021-07-01 |
|Billy | 2021-07-02 |
|Cathrine | 2021-07-03 |
|David | 2021-07-04 |
|Eric | 2021-07-04 |
Data2.xlsx
| Name | City | Reg Date | Gender | Data1.xlsx |
|Alex | Hong Kong | 2021-07-04 | Male | |
|Annie | Hong Kong | 2021-07-01 | Female | |
|Bob | Taipei | 2021-07-02 | Male | |
|Lucy | Tokyo | 2021-07-01 | Female | |
|David | London | 2021-07-04 | Male | |
|Kate | New York | 2021-07-03 | Female | |
|Cathrine | London | 2021-07-03 | Female | |
|Rose | Hong Kong | 2021-07-04 | Female | |
I use 'Name' and 'Reg Date' as the merge keys:
import pandas as pd
dt1 = pd.read_excel('Data1.xlsx')
dt2 = pd.read_excel('Data2.xlsx')
df_merge = pd.merge(dt1.iloc[:, [0, 1]], dt2.iloc[:, [0, 2]], on=['Name', 'Reg Date'], how='outer', indicator=True)
i = 0
rows_to_color = []
for a in df_merge.iloc[:, [2]].values:
    if a == 'both':
        rows_to_color.append(i)
    i += 1
| Name | Reg Date | _merge |
|Alex | 2021-07-04 | right_only |
|Annie | 2021-07-01 | both |
|Billy | 2021-07-02 | left_only |
|Bob | 2021-07-02 | right_only |
|Lucy | 2021-07-01 | right_only |
|David | 2021-07-04 | both |
|Eric | 2021-07-04 | left_only |
|Kate | 2021-07-03 | right_only |
|Cathrine | 2021-07-03 | both |
|Rose | 2021-07-04 | right_only |
I tried to write code that highlights the 'left_only' and 'right_only' rows in 'Data2.xlsx', but it doesn't work.
def bg_color(col):
    color = '#ffffff'
    return ['background-color: %s' % color
            if i in rows_to_color
            else ''
            for i, x in col.items()]
styled = df_merge.style.apply(bg_color)
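I don't know how to highlight the unmatched rows in 'Data2.xlsx' and mark them 'Y/N'. The image below shows my expected result. Could you show me how to code this?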
[Image: expected result]
Use a left join in merge, and first set the Y/N column with numpy.where:
import numpy as np

# Swap the order: dt2 on the left, dt1 on the right
df_merge = pd.merge(dt2,
                    dt1[['Name', 'Reg Date']],
                    on=['Name', 'Reg Date'],
                    how='left', indicator=True)
# 'both' means the row's keys also exist in Data1.xlsx
df_merge['Data1.xlsx'] = np.where(df_merge.pop('_merge').eq('both'), 'Y', 'N')
print(df_merge)
       Name       City    Reg Date  Gender  Data1.xlsx
0      Alex  Hong Kong  2021-07-04    Male           N
1     Annie  Hong Kong  2021-07-01  Female           Y
2       Bob     Taipei  2021-07-02    Male           N
3      Lucy      Tokyo  2021-07-01  Female           N
4     David     London  2021-07-04    Male           Y
5      Kate   New York  2021-07-03  Female           N
6  Cathrine     London  2021-07-03  Female           Y
7      Rose  Hong Kong  2021-07-04  Female           N
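To try this step without the Excel files, here is a minimal sketch that rebuilds the two sample tables in memory. The frames are typed in from the tables in the question, and `mark_matches` is a hypothetical helper name, not something from pandas. It also drops duplicate key pairs from the lookup table before merging, because a left join would otherwise repeat rows of the main table; that is an extra precaution I am assuming is wanted, not part of the code above.

```python
import numpy as np
import pandas as pd

KEYS = ['Name', 'Reg Date']

def mark_matches(left, right, keys=KEYS, flag_col='Data1.xlsx'):
    """Return `left` plus a Y/N column: do this row's keys occur in `right`?"""
    # Keep only the key columns and one copy of each key pair,
    # so the left join cannot multiply rows of `left`.
    right_keys = right[keys].drop_duplicates()
    # Drop a pre-existing (empty) flag column so only one remains.
    left = left.drop(columns=[flag_col], errors='ignore')
    merged = left.merge(right_keys, on=keys, how='left', indicator=True)
    # '_merge' is 'both' when the keys were found in `right`.
    merged[flag_col] = np.where(merged.pop('_merge').eq('both'), 'Y', 'N')
    return merged

# Sample data copied from the question (Data1.xlsx and Data2.xlsx).
dt1 = pd.DataFrame({
    'Name': ['Annie', 'Billy', 'Cathrine', 'David', 'Eric'],
    'Reg Date': pd.to_datetime(['2021-07-01', '2021-07-02', '2021-07-03',
                                '2021-07-04', '2021-07-04']),
})
dt2 = pd.DataFrame({
    'Name': ['Alex', 'Annie', 'Bob', 'Lucy', 'David', 'Kate', 'Cathrine', 'Rose'],
    'City': ['Hong Kong', 'Hong Kong', 'Taipei', 'Tokyo',
             'London', 'New York', 'London', 'Hong Kong'],
    'Reg Date': pd.to_datetime(['2021-07-04', '2021-07-01', '2021-07-02',
                                '2021-07-01', '2021-07-04', '2021-07-03',
                                '2021-07-03', '2021-07-04']),
    'Gender': ['Male', 'Female', 'Male', 'Female',
               'Male', 'Female', 'Female', 'Female'],
})

result = mark_matches(dt2, dt1)
print(result)
```

Because the join is a left join on `dt2`, the result keeps one row per row of `dt2`, in the same order.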
Then set the color for the rows marked N:
def bg_color(x):
    c = 'background-color: yellow'
    # condition
    m = x["Data1.xlsx"].eq('N')
    # DataFrame of styles
    df1 = pd.DataFrame('', index=x.index, columns=x.columns)
    # set columns by condition
    return df1.mask(m, c)
styled = df_merge.style.apply(bg_color, axis=None)
styled.to_excel('styled.xlsx', engine='openpyxl', index=False)
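If the whole-table style function feels heavy, a row-wise variant may be easier to read. The sketch below is an alternative, not the method above: `highlight_unmatched` is a hypothetical name, and the two-row frame is only a stand-in for `df_merge`. The function returns one CSS string per cell of a row, which is the shape `Styler.apply` expects when called with `axis=1`.

```python
import pandas as pd

def highlight_unmatched(row, flag_col='Data1.xlsx', color='yellow'):
    """Style one row: color every cell when the flag column holds 'N'."""
    style = f'background-color: {color}' if row[flag_col] == 'N' else ''
    # One style string per cell in the row.
    return [style] * len(row)

# Small stand-in for df_merge, just to preview the styles.
df = pd.DataFrame({
    'Name': ['Alex', 'Annie'],
    'Data1.xlsx': ['N', 'Y'],
})

# Preview the per-row styles without building a Styler.
preview = [highlight_unmatched(row) for _, row in df.iterrows()]
print(preview)
```

To use it on the real data, pass it as `df_merge.style.apply(highlight_unmatched, axis=1)` and write the result with `to_excel` in the same way as above.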