Access a row from a df based on a column value
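I am trying to find the rows where reliability is < 0.70, but the output also seems to include rows where reliability is 0.70. What is wrong?
Original DF:
po_id po_name product year measure rate denominator numerator is_reported reliability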
0 1051408 Aberdeen Care Alliance Commercial HMO/POS 18 CHLAMSCR 67.740000 62.0 42.0 True NaN
1 1051408 Aberdeen Care Alliance Commercial HMO/POS 19 AMROV64 80.000000 20.0 16.0 True NaN
2 1051408 Aberdeen Care Alliance Commercial HMO/POS 19 CISCOMBO10 17.650000 34.0 6.0 True NaN
3 1051408 Aberdeen Care Alliance Commercial HMO/POS 19 OFCSTAFF 69.440000 NaN NaN True 0.76
4 1051408 Aberdeen Care Alliance Commercial HMO/POS 18 BCS5274 86.420000 302.0 261.0 True NaN
5 1051408 Aberdeen Care Alliance Commercial HMO/POS 19 SPD1 57.810000 64.0 37.0 True NaN
6 1051408 Aberdeen Care Alliance Commercial HMO/POS 19 PDCS 79.530000 127.0 101.0 True NaN
7 1051408 Aberdeen Care Alliance Commercial HMO/POS 19 TCOC_250K_GEO_RISKADJ 289.281096 NaN NaN False NaN
8 1051408 Aberdeen Care Alliance Commercial HMO/POS 19 CBPD4 67.440000 129.0 87.0 True NaN
9 1051408 Aberdeen Care Alliance Commercial HMO/POS 18 COORDINATE3 55.370000 NaN NaN True 0.74
Code added to locate the rows where reliability is less than 0.70:
awards_test_df.loc[awards_test_df['reliability'] < 0.70]
Output:
po_id po_name product year measure rate denominator numerator is_reported reliability
191 1008200 Advancements Physicians Medical Center Commercial HMO/POS 18 ACCESS3 58.13 NaN NaN True 0.60
515 1021102 Baird Medical Group Commercial HMO/POS 18 COORDINATE3 60.02 NaN NaN True 0.70
... ... ... ... ... ... ... ... ... ... ...
8606 1038400 Vf Healthcare Commercial HMO/POS 18 OFCSTAFF 68.78 NaN NaN True 0.70
8620 1038400 Vf Healthcare Commercial HMO/POS 18 MDINTERACT3 79.57 NaN NaN True 0.70
8800 1006001 Viva Physicians Commercial HMO/POS 18 ACCESS3 66.25 NaN NaN True 0.70
8869 1017708 Waltz Hospital Commercial HMO/POS 19 MDINTERACT3 81.01 NaN NaN True 0.70
9142 1028100 Zeke Medical Group Commercial HMO/POS 18 ACCESS3 56.37 NaN NaN True 0.70
Your code looks fine when I reproduce it:
import pandas as pd
data = [ { "po_id": 191, "po_name": 1008200, "product": "Advancements Physicians Medical Center Commercial HMO/POS", "year": 18, "measure": "ACCESS3", "rate": 58.13, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.6 }, { "po_id": 515, "po_name": 1021102, "product": "Baird Medical Group Commercial HMO/POS", "year": 18, "measure": "COORDINATE3", "rate": 60.02, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7 }, { "po_id": 8606, "po_name": 1038400, "product": "Vf Healthcare Commercial HMO/POS", "year": 18, "measure": "OFCSTAFF", "rate": 68.78, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7 }, { "po_id": 8620, "po_name": 1038400, "product": "Vf Healthcare Commercial HMO/POS", "year": 18, "measure": "MDINTERACT3", "rate": 79.57, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7 }, { "po_id": 8800, "po_name": 1006001, "product": "Viva Physicians Commercial HMO/POS", "year": 18, "measure": "ACCESS3", "rate": 66.25, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7 }, { "po_id": 8869, "po_name": 1017708, "product": "Waltz Hospital Commercial HMO/POS", "year": 19, "measure": "MDINTERACT3", "rate": 81.01, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7 }, { "po_id": 9142, "po_name": 1028100, "product": "Zeke Medical Group Commercial HMO/POS", "year": 18, "measure": "ACCESS3", "rate": 56.37, "denominator": "NaN", "numerator": "NaN", "is_reported": True, "reliability": 0.7 } ]
awards_test_df = pd.DataFrame(data)
awards_test_df.loc[awards_test_df['reliability'] <0.70]
Output:
| | po_id | po_name | product | year | measure | rate | denominator | numerator | is_reported | reliability |
|---:|--------:|----------:|:-----------------------------------------------------------|-------:|:----------|-------:|--------------:|------------:|:--------------|--------------:|
| 0 | 191 | 1008200 | Advancements Physicians Medical Center Commercial HMO/POS | 18 | ACCESS3 | 58.13 | nan | nan | True | 0.6 |
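This is only display formatting: values slightly below 0.70 are rounded to 0.70 when printed. Try the check below: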
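import numpy as np
df = pd.DataFrame({"reliability": np.random.uniform(.65, .75, 100)})
# keep values strictly below 0.7 that nevertheless round to 0.7 at two decimals
df = df.loc[df.reliability.lt(.7)].assign(twodp=df.reliability.round(2)).query("twodp == 0.7")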
| reliability | twodp |
|------------:|------:|
| 0.695661 | 0.7 |
| 0.699588 | 0.7 |
| 0.698993 | 0.7 |
| 0.697933 | 0.7 |
| 0.698356 | 0.7 |
| 0.699906 | 0.7 |
| 0.695279 | 0.7 |
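If the goal is to exclude every row that prints as 0.70, one option is to compare the rounded values instead of the stored ones. The following is a minimal sketch, not the asker's actual data: the sample values are made up for illustration, and it assumes the rounded display comes from an option such as `display.precision` being set to 2.

```python
import pandas as pd

# Hypothetical values: 0.6951 and 0.6996 are below 0.70 but round to 0.70.
df = pd.DataFrame({"reliability": [0.60, 0.6951, 0.6996, 0.70, 0.74]})

# Comparison on the stored values: keeps everything strictly below 0.70,
# including the two values that would print as 0.70 at two decimals.
raw_mask = df["reliability"] < 0.70

# Comparison on the values rounded to two decimals: keeps only rows that
# also display as less than 0.70.
rounded_mask = df["reliability"].round(2) < 0.70

# Print the raw-filtered rows with two-decimal and ten-decimal precision
# to see the difference between displayed and stored values.
with pd.option_context("display.precision", 2):
    print(df[raw_mask])
with pd.option_context("display.precision", 10):
    print(df[raw_mask])

print(df[rounded_mask])
```

Which mask is appropriate depends on whether the 0.70 threshold is meant to apply to the stored values or to the values as reported at two decimals.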