Accessing pd.Series Multi-index with get() does not find element
Below is a minimal example of my code.
I have a grouped pd.Series with a multi-index and can access a single element with grouped["stallion", "london"], but grouped.get(["stallion", "london"]) returns None (or -1 when that default is given).
import pandas as pd
a = pd.DataFrame({
    "breed": ["stallion", "stallion", "stallion", "stallion", "pony", "pony", "pony"],
    "stable": ["hogwarts", "hogwarts", "london", None, "hogwarts", "london", "london"],
    "weight": [800, 900, 982, 400, 230, 300, 500],
})
# Mean weight per (breed, stable) pair; dropna=False keeps the group whose stable is missing
grouped = a.groupby(["breed", "stable"], dropna=False)["weight"].mean()
# Series.append was removed in pandas 2.0, so pd.concat adds the per-breed and overall means
grouped = pd.concat([
    grouped,
    a.groupby("breed", dropna=False)["weight"].mean(),
    pd.Series(a["weight"].mean(), index=["all_breeds"]),
])
print(grouped)
print()
print(grouped["stallion", "london"])
print(grouped.get(["stallion", "london"]))
print(grouped.get(["stallion", "london"], -1))
print(f'The same? {grouped["stallion", "london"] == grouped.get(["stallion", "london"]) == grouped.get(["stallion", "london"], -1)}')
Expected behavior
I expected all three expressions to give the same result:
grouped["stallion", "london"] == grouped.get(["stallion", "london"]) == grouped.get(["stallion", "london"], -1)
I am using get() because I want the most specific entry that exists, falling back to broader ones:
grouped.get(["stallion", "london"], grouped.get("stallion", grouped["all_breeds"]))
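You have to use a tuple to get the value, because your index contains tuples (it is not a MultiIndex). A list is treated as a collection of separate labels ("stallion" and "london"); "london" on its own is not in the index, so the lookup fails and get() returns the default.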
>>> grouped.index
Index([ ('pony', 'hogwarts'), ('pony', 'london'),
('stallion', 'hogwarts'), ('stallion', 'london'),
('stallion', nan), 'pony',
'stallion', 'all_breeds'],
dtype='object')
>>> grouped.get(["stallion", "london"])
None
>>> grouped.get(("stallion", "london"))
982.0
###
>>> grouped.get(["stallion", "london"], -1)
-1
>>> grouped.get(("stallion", "london"), -1)
982.0
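To make the difference easier to see in isolation, here is a minimal sketch that builds the two kinds of index by hand instead of going through groupby. The names flat, flat_index and multi are made up for this illustration, and the numbers are the means that follow from the question's data; the behaviour described is what I would expect on pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical stand-in for `grouped`: a flat object Index mixing tuples and strings.
# tupleize_cols=False makes sure pandas does not try to build a MultiIndex.
flat_index = pd.Index(
    [("stallion", "hogwarts"), ("stallion", "london"), "stallion", "all_breeds"],
    tupleize_cols=False,
)
flat = pd.Series([850.0, 982.0, 770.5, 4112 / 7], index=flat_index)

# A genuine MultiIndex holding only the two-level entries, for comparison.
multi = pd.Series(
    [850.0, 982.0],
    index=pd.MultiIndex.from_tuples(
        [("stallion", "hogwarts"), ("stallion", "london")],
        names=["breed", "stable"],
    ),
)

print(isinstance(flat.index, pd.MultiIndex))   # flat index: plain Index of objects
print(isinstance(multi.index, pd.MultiIndex))  # real MultiIndex

# A tuple is a single label in both cases.
print(flat.get(("stallion", "london")))
print(multi.get(("stallion", "london")))

# A list is read as several separate labels; "london" alone is missing,
# so get() falls back to the default.
print(flat.get(["stallion", "london"], -1))
```

Checking isinstance(grouped.index, pd.MultiIndex) right after building the Series is a quick way to find out which of the two situations you are in.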
Note that grouped["stallion", "london"] is equivalent to grouped[("stallion", "london")]; the tuple is just implicit.
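The implicit tuple is plain Python syntax rather than anything pandas does. A small sketch with a made-up class (ShowKey is purely illustrative) that returns whatever key it receives shows what the square brackets actually pass along:

```python
class ShowKey:
    """Tiny stand-in for an indexable object: it returns the key it was given."""

    def __getitem__(self, key):
        return key


probe = ShowKey()

comma_key = probe["stallion", "london"]    # comma inside [] builds a tuple
tuple_key = probe[("stallion", "london")]  # the same tuple, written explicitly
list_key = probe[["stallion", "london"]]   # a list stays a list

print(type(comma_key).__name__, type(tuple_key).__name__, type(list_key).__name__)
```

With get() there are no square brackets to build the tuple for you, so the parentheses have to be written out.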
Final output:
>>> grouped.get(("stallion", "london"), grouped.get("stallion", grouped["all_breeds"]))
982.0
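If the nested get() calls get hard to read, the same fallback idea can be written as a small loop. This is only a sketch: first_available and _MISSING are names I made up, and the Series is a hand-built stand-in for grouped using the means from the question's data.

```python
import pandas as pd

_MISSING = object()  # sentinel, so a stored value such as 0 or NaN is not mistaken for "not found"


def first_available(series, keys, default=None):
    """Return the value of the first key in `keys` that exists in `series`."""
    for key in keys:
        value = series.get(key, _MISSING)
        if value is not _MISSING:
            return value
    return default


# Hypothetical stand-in for `grouped`: tuples for (breed, stable), strings for the rest.
index = pd.Index(
    [("stallion", "london"), "stallion", "all_breeds"],
    tupleize_cols=False,
)
weights = pd.Series([982.0, 770.5, 4112 / 7], index=index)

exact = first_available(weights, [("stallion", "london"), "stallion", "all_breeds"])
by_breed = first_available(weights, [("stallion", "paris"), "stallion", "all_breeds"])
overall = first_available(weights, [("unicorn", "paris"), "unicorn", "all_breeds"])
print(exact, by_breed, overall)
```

One practical difference from the nested version: there, grouped["all_breeds"] is evaluated before the outer get() runs, so it raises KeyError if that label is missing even when a more specific entry exists. The loop only looks up the keys it needs.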