How to replace NULL values with the mean value of a category in SQL?

Question

I have a dataset that contains null values in the 'revenues_from_appointment' column.

Dataset:

appointment_date	patient_id	practitioner_id	appointment_duration_min	revenues_from_appointment
2021-06-28	42734	748	30	90.0
2021-06-29	42737	747	60	150.0
2021-07-01	42737	747	60	NaN
2021-07-03	42736	748	30	60.0
2021-07-03	42735	747	15	42.62
2021-07-04	42734	748	30	NaN
2021-07-05	42734	748	30	100.0
2021-07-10	42738	747	15	50.72
2021-08-12	42739	748	30	73.43

I would like to replace each NULL value with the mean of the rows that share the same patient_id, practitioner_id and appointment_duration_min.

With a pandas DataFrame, I do it like this:

df['revenues_from_appointment'] = df['revenues_from_appointment'].fillna(df.groupby(['patient_id', 'practitioner_id', 'appointment_duration_min'])['revenues_from_appointment'].transform('mean'))

How can I get the same result using SQL?

Final output:

appointment_date	patient_id	practitioner_id	appointment_duration_min	revenues_from_appointment
2021-06-28	42734	748	30	90.0
2021-06-29	42737	747	60	150.0
2021-07-01	42737	747	60	150.0
2021-07-03	42736	748	30	60.0
2021-07-03	42735	747	15	42.62
2021-07-04	42734	748	30	95.0
2021-07-05	42734	748	30	100.0
2021-07-10	42738	747	15	50.72
2021-08-12	42739	748	30	73.43

Answer 1

You can use the AVG window function, partitioned by the three columns of interest, and wrap it in COALESCE so that only the NULL values are replaced:

SELECT appointment_date,
       patient_id,
       practitioner_id,
       appointment_duration_min,
       COALESCE(revenues_from_appointment,
                AVG(revenues_from_appointment) OVER (PARTITION BY patient_id,
                                                                  practitioner_id,
                                                                  appointment_duration_min)) AS revenues_from_appointment
FROM tab
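
To check the query against the sample data from the question, here is a minimal sketch that runs it from Python with the standard-library sqlite3 module. It makes a few assumptions that are not in the original post: SQLite (3.25 or later, for window function support) stands in for SQL Server, the column types in the CREATE TABLE statement are guesses, the output column alias and the ORDER BY are added only for readable output, and the helper name fill_with_group_mean is hypothetical.

```python
import sqlite3

# Sample rows from the question; None plays the role of the NaN/NULL revenues.
ROWS = [
    ("2021-06-28", 42734, 748, 30, 90.0),
    ("2021-06-29", 42737, 747, 60, 150.0),
    ("2021-07-01", 42737, 747, 60, None),
    ("2021-07-03", 42736, 748, 30, 60.0),
    ("2021-07-03", 42735, 747, 15, 42.62),
    ("2021-07-04", 42734, 748, 30, None),
    ("2021-07-05", 42734, 748, 30, 100.0),
    ("2021-07-10", 42738, 747, 15, 50.72),
    ("2021-08-12", 42739, 748, 30, 73.43),
]

# Same COALESCE + AVG ... OVER (PARTITION BY ...) pattern as in the answer.
# The alias and ORDER BY are additions for this sketch only.
QUERY = """
SELECT appointment_date,
       patient_id,
       practitioner_id,
       appointment_duration_min,
       COALESCE(revenues_from_appointment,
                AVG(revenues_from_appointment) OVER (
                    PARTITION BY patient_id,
                                 practitioner_id,
                                 appointment_duration_min
                )) AS revenues_from_appointment
FROM tab
ORDER BY appointment_date, patient_id
"""


def fill_with_group_mean(rows):
    """Load rows into an in-memory table and return the filled result set."""
    conn = sqlite3.connect(":memory:")
    try:
        # Column types are assumed; the question does not give a schema.
        conn.execute(
            "CREATE TABLE tab ("
            "appointment_date TEXT, "
            "patient_id INTEGER, "
            "practitioner_id INTEGER, "
            "appointment_duration_min INTEGER, "
            "revenues_from_appointment REAL)"
        )
        conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = fill_with_group_mean(ROWS)
for row in result:
    print(row)
```

Note that AVG ignores NULLs inside each partition, so a group whose revenues are all NULL still comes back as NULL; COALESCE accepts further arguments if a fallback value is needed for that case.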

