Comparing columns in pandas DataFrame gives unsolvable ValueError
I have the following pandas DataFrame:
import pandas as pd

df = pd.DataFrame({"id": [0, 1, 2, 3, 4, 5, 6],
                   "from": ["A", "B", "B", "D", "B", "C", "B"],
                   "to": ["B", "C", "D", "F", "G", "F", "E"],
                   "cases": [[1, 2, 44], [2, 4, 3], [5, 2], [5], [1, 7], [4], [44, 7]],
                   "start1": [1, 5, 4, 4, 23, 12, 8],
                   "start2": [4, 7, 9, 30, 26, 15, 18],
                   "end1": [5, 7, 11, 32, 15, 17, 21],
                   "end2": [9, 12, 15, 35, 17, 20, 25]})
which looks like this:
   id  from  to       cases  start1  start2  end1  end2
0   0     A   B  [1, 2, 44]       1       4     5     9
1   1     B   C   [2, 4, 3]       5       7     7    12
2   2     B   D      [5, 2]       4       9    11    15
3   3     D   F         [5]       4      30    32    35
4   4     B   G      [1, 7]      23      26    15    17
5   5     C   F         [4]      12      15    17    20
6   6     B   E     [44, 7]       8      18    21    25
I am trying to create a column adjacency_list that holds, for each row i, the id values of all rows j where:
- i["to"] == j["from"]
- i["cases"] overlaps with j["cases"]
- the intervals (i["end1"], i["end2"]) and (j["start1"], j["start2"]) overlap
I am trying to run the following code to achieve this:
data["adjacency_list"] = data.apply(
    lambda x: [
        row["id"]
        for i, row in data[(x["to"] == data["from"])].iterrows()
        if ((not set(row["cases"]).isdisjoint(x["cases"])) and ((x["end1"] <= test["start1"] <= x["end2"]) or (test["start1"] <= x["end1"] <= test["start2"])))
    ],
    axis=1,
)
The output should look like this:
   id  from  to       cases  start1  start2  end1  end2  adjacency_list
0   0     A   B  [1, 2, 44]       1       4     5     9       [1, 2, 6]
1   1     B   C   [2, 4, 3]       5       7     7    12             [5]
2   2     B   D      [5, 2]       4       9    11    15             [3]
3   3     D   F         [5]       4      30    32    35              []
4   4     B   G      [1, 7]      23      26    15    17              []
5   5     C   F         [4]      12      15    17    20              []
6   6     B   E     [44, 7]       8      18    21    25              []
But it gives me the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
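I have read many other answers from users who hit this error in different contexts, and I tried replacing and and or with & and |, but that did not work. Splitting each chained <= comparison into two single <= comparisons did not help either.
How can I solve this?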
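(test["start1"] <= x["end1"] <= test["start2"]) produces a Series of booleans rather than a single value: test["start1"] is a Series, so the comparison is applied to every element. Python then needs one truth value for the chained comparison and for the if clause, and asking a Series for its truth value is what raises the ValueError.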
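To see the failure in isolation, here is a minimal sketch. The names start1, end1 and end2 and the sample values are made up for illustration; start1 stands in for test["start1"], whose real contents are not shown in the question.

```python
import pandas as pd

start1 = pd.Series([5, 4, 23])  # hypothetical stand-in for test["start1"]
end1, end2 = 5, 9               # scalars, like x["end1"] and x["end2"]

# Python expands the chained comparison to (end1 <= start1) and (start1 <= end2).
# The `and` calls bool() on the intermediate boolean Series, which raises.
try:
    end1 <= start1 <= end2
    raised = False
except ValueError:
    raised = True

# Element-wise version: two separate comparisons joined with `&`.
mask = (end1 <= start1) & (start1 <= end2)

# Scalar version: take one value out of the Series and the chained form works.
first_ok = end1 <= start1.iloc[0] <= end2
```

Note that mask is still a Series, so it cannot be used directly in the if clause of a list comprehension. That is probably why swapping and/or for &/| alone did not help: the condition has to be evaluated on scalars.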
Try comparing each row with x instead:
df["adjacency_list"] = df.apply(
    lambda x: [
        row["id"]
        for _, row in df[(x["to"] == df["from"])].iterrows()
        if (
            (
                not set(row["cases"]).isdisjoint(x["cases"])
            ) and (
                (x["end1"] <= row["start1"] <= x["end2"])
                or
                (row["start1"] <= x["end1"] <= row["start2"])
            )
        )
    ],
    axis=1,
)
Output:
   id  from  to       cases  start1  start2  end1  end2  adjacency_list
0   0     A   B  [1, 2, 44]       1       4     5     9       [1, 2, 6]
1   1     B   C   [2, 4, 3]       5       7     7    12             [5]
2   2     B   D      [5, 2]       4       9    11    15             [3]
3   3     D   F         [5]       4      30    32    35              []
4   4     B   G      [1, 7]      23      26    15    17              []
5   5     C   F         [4]      12      15    17    20              []
6   6     B   E     [44, 7]       8      18    21    25              []
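If the nested iterrows loop becomes slow on a larger frame, one possible alternative is to self-merge on to/from and filter the resulting pairs. This is only a sketch and not part of the answer above; the suffixes _i and _j and the names pairs, cases_overlap, interval_overlap, matches and adjacency are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4, 5, 6],
    "from": ["A", "B", "B", "D", "B", "C", "B"],
    "to": ["B", "C", "D", "F", "G", "F", "E"],
    "cases": [[1, 2, 44], [2, 4, 3], [5, 2], [5], [1, 7], [4], [44, 7]],
    "start1": [1, 5, 4, 4, 23, 12, 8],
    "start2": [4, 7, 9, 30, 26, 15, 18],
    "end1": [5, 7, 11, 32, 15, 17, 21],
    "end2": [9, 12, 15, 35, 17, 20, 25],
})

# Condition 1: pair every row i with every row j where i["to"] == j["from"].
pairs = df.merge(df, left_on="to", right_on="from", suffixes=("_i", "_j"))

# Condition 2: the two case lists share at least one element (plain Python per pair).
cases_overlap = pd.Series(
    [not set(a).isdisjoint(b) for a, b in zip(pairs["cases_i"], pairs["cases_j"])],
    index=pairs.index,
    dtype=bool,
)

# Condition 3: the same interval test as the apply version, written element-wise
# with & and | because both sides are now Series.
interval_overlap = (
    ((pairs["end1_i"] <= pairs["start1_j"]) & (pairs["start1_j"] <= pairs["end2_i"]))
    | ((pairs["start1_j"] <= pairs["end1_i"]) & (pairs["end1_i"] <= pairs["start2_j"]))
)

# Keep the pairs that pass both filters and collect the j ids per i id.
matches = pairs[cases_overlap & interval_overlap]
adjacency = matches.groupby("id_i")["id_j"].agg(list)

# Rows without any match get an empty list.
df["adjacency_list"] = [adjacency.get(i, []) for i in df["id"]]
```

Here & and | are the right operators because every operand is a whole column. The merge materialises all to/from pairs at once, so it uses more memory than the row-by-row version.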