How to add up a total for a range of columns across rows with associated weights
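I want to take the dementia_yn and tumour_yn columns and sum them across each row, per patient. But instead of a plain sum, I want each column multiplied by the weight associated with it before adding.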
weights = {
    "dementia_yn": 2,
    "tumour_yn": 4
}
import pandas as pd

patients = [('pat1', 'C77', 'F01', 'M32', 'M315', 1, 1),
            ('pat2', 'I099', 'I278', 'M05', 'F01', 1, 0),
            ('pat3', 'N057', 'N057', 'N058', 'N057', 0, 0)]
labels = ['patient_num', 'DIAGX1', 'DIAGX2', 'DIAGX3', 'DIAGX4', 'dementia_yn', 'tumour_yn']
df_patients = pd.DataFrame.from_records(patients, columns=labels)
df_patients
Input
patient_num DIAGX1 DIAGX2 DIAGX3 DIAGX4 dementia_yn tumour_yn
pat1 C77 F01 M32 M315 1 1
pat2 I099 I278 M05 F01 1 0
pat3 N057 N057 N058 N057 0 0
Expected output
patient_num DIAGX1 DIAGX2 DIAGX3 DIAGX4 dementia_yn tumour_yn total
pat1 C77 F01 M32 M315 1 1 6
pat2 I099 I278 M05 F01 1 0 2
pat3 N057 N057 N058 N057 0 0 0
Let's try dot:
# weights = {"dementia_yn": 2, "tumour_yn": 4}
# list(weights) selects the weighted columns in the same order as weights.values()
df_patients['total'] = df_patients[list(weights)].dot(list(weights.values()))
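For anyone who wants to run the dot approach end to end, here is a minimal self-contained sketch. It assumes every key in `weights` is an existing column name, and it relies on Python dicts preserving insertion order so that the selected columns and the weight values line up.

```python
import pandas as pd

patients = [('pat1', 'C77', 'F01', 'M32', 'M315', 1, 1),
            ('pat2', 'I099', 'I278', 'M05', 'F01', 1, 0),
            ('pat3', 'N057', 'N057', 'N058', 'N057', 0, 0)]
labels = ['patient_num', 'DIAGX1', 'DIAGX2', 'DIAGX3', 'DIAGX4',
          'dementia_yn', 'tumour_yn']
df_patients = pd.DataFrame.from_records(patients, columns=labels)

weights = {"dementia_yn": 2, "tumour_yn": 4}

# Select the weighted columns in dict order, then take the row-wise
# dot product with the weight values (same order as the keys).
cols = list(weights)
df_patients['total'] = df_patients[cols].dot(list(weights.values()))

print(df_patients[['patient_num', 'total']])
```

For pat1 this computes 1*2 + 1*4 = 6, for pat2 1*2 + 0*4 = 2, and for pat3 0.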
Assuming: weights = {"dementia_yn": 2, "tumour_yn": 4}
Use a classic multiply and sum.
Manual alignment:
df_patients['total'] = df_patients[list(weights)].mul(weights).sum(axis=1)
print(df_patients)
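Automatic alignment (works even if the dictionary contains extra keys):
# select_dtypes('number') keeps the string columns out of the multiplication
df_patients['total'] = df_patients.select_dtypes('number').mul(pd.Series(weights)).sum(axis=1)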
Output:
patient_num DIAGX1 DIAGX2 DIAGX3 DIAGX4 dementia_yn tumour_yn total
0 pat1 C77 F01 M32 M315 1 1 6
1 pat2 I099 I278 M05 F01 1 0 2
2 pat3 N057 N057 N058 N057 0 0 0
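To illustrate the extra-keys case, here is a small sketch on a trimmed-down frame. The `stroke_yn` weight is hypothetical and has no matching column. The sketch restricts the frame to its numeric columns first, which is my assumption for keeping string columns out of the arithmetic.

```python
import pandas as pd

df = pd.DataFrame({
    "patient_num": ["pat1", "pat2", "pat3"],
    "dementia_yn": [1, 1, 0],
    "tumour_yn": [1, 0, 0],
})

# "stroke_yn" is a made-up key with no corresponding column.
weights = {"dementia_yn": 2, "tumour_yn": 4, "stroke_yn": 3}

# mul() aligns the Series index with the column labels; labels present
# on only one side produce NaN columns, which sum() skips by default.
weighted = df.select_dtypes("number").mul(pd.Series(weights))
df["total"] = weighted.sum(axis=1).astype(int)

print(df)
```

The unmatched key simply contributes nothing to the total, so no filtering of the dictionary is needed beforehand.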
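If your dictionary keys really do lack the "_yn" suffix, using a Series lets you add it on the fly:
weights = {"dementia": 2, "tumour": 4}
# add_suffix relabels the Series index so it matches the column names
df_patients['total'] = df_patients.select_dtypes('number').mul(pd.Series(weights).add_suffix('_yn')).sum(axis=1)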
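A quick way to check what the suffix step does is to look at the Series index before multiplying. This is a sketch on a two-column frame that holds only the flag columns, so no dtype filtering is needed here.

```python
import pandas as pd

weights = {"dementia": 2, "tumour": 4}

# For a Series, add_suffix appends the suffix to each index label.
w = pd.Series(weights).add_suffix("_yn")
print(w.index.tolist())

df = pd.DataFrame({"dementia_yn": [1, 1, 0], "tumour_yn": [1, 0, 0]})

# The relabelled index now lines up with the column names.
total = df.mul(w).sum(axis=1)
print(total.tolist())
```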