Remove duplicates with a given condition and return a unique list of tuples in Python
I have a list of tuples as follows:
ls = [("red", "apple"), ("black", "grapes"),
("green", "apple"), ("yellow", "banana"),
("white", "litchi"), ("brown", "grapes")]
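As you can see, "apple" appears with both red and green, and "grapes" appears with both black and brown.
So I want to drop one tuple from each such pair and keep the other. The output should look like this: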
output = [("red", "apple"), ("black", "grapes"),
("yellow", "banana"), ("white", "litchi")]
So in the output, (green, apple) and (brown, grapes) have been removed.
Is there any way to achieve this? I have tried many times but could not figure it out. Please help. :)
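If you need to remove duplicates by the second value of each tuple, use DataFrame.drop_duplicates:
import pandas as pd
# Column 1 holds the fruit; the first row for each fruit is kept,
# then every remaining row is converted back to a tuple.
a = pd.DataFrame(ls).drop_duplicates([1]).apply(tuple, axis=1).tolist()
print(a)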
[('red', 'apple'), ('black', 'grapes'), ('yellow', 'banana'), ('white', 'litchi')]
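I managed it by converting the list to a pandas DataFrame, dropping duplicates based on the "fruit" column, and converting the result back to a list of tuples.
import pandas as pd
ls = [("red", "apple"), ("black", "grapes"),
("green", "apple"), ("yellow", "banana"),
("white", "litchi"), ("brown", "grapes")]
df = pd.DataFrame(ls, columns=["color", "fruit"])
# Keep only the first row seen for each fruit.
df = df.drop_duplicates(subset=["fruit"], keep="first")
# name=None yields plain tuples rather than NumPy records or namedtuples.
print(list(df.itertuples(index=False, name=None)))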
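As a side note on the keep argument used above, here is a small sketch comparing keep="first" with keep="last". The variable names first and last are only for illustration, and it assumes pandas is installed.

```python
import pandas as pd

ls = [("red", "apple"), ("black", "grapes"),
      ("green", "apple"), ("yellow", "banana"),
      ("white", "litchi"), ("brown", "grapes")]

df = pd.DataFrame(ls, columns=["color", "fruit"])

# keep="first" retains the earliest row for each fruit.
first = list(
    df.drop_duplicates(subset="fruit", keep="first")
      .itertuples(index=False, name=None)
)

# keep="last" retains the latest row for each fruit instead;
# surviving rows stay in their original relative order.
last = list(
    df.drop_duplicates(subset="fruit", keep="last")
      .itertuples(index=False, name=None)
)

print(first)
print(last)
```

With keep="first" the red apple and black grapes survive, which matches the output asked for in the question; with keep="last" the green apple and brown grapes survive.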
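pandas is overkill here. This can be done without importing any extra modules.
Build an intermediate dictionary keyed by fruit, then rebuild the list of tuples from it: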
ls = [("red", "apple"), ("black", "grapes"),
("green", "apple"), ("yellow", "banana"),
("white", "litchi"), ("brown", "grapes")]
d = [(v, k) for k, v in {v:k for k, v in ls}.items()]
print(d)
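Output (the dictionary keeps the last color seen for each fruit, so green and brown survive rather than red and black):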
[('green', 'apple'), ('brown', 'grapes'), ('yellow', 'banana'), ('white', 'litchi')]
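If the first occurrence of each fruit should be kept instead, so that the result matches the output in the question exactly, one possible sketch is to track the fruits already seen in a set. The helper name unique_by_second is made up for this example and is not part of any library.

```python
ls = [("red", "apple"), ("black", "grapes"),
      ("green", "apple"), ("yellow", "banana"),
      ("white", "litchi"), ("brown", "grapes")]


def unique_by_second(pairs):
    """Return the pairs whose second element has not appeared earlier."""
    seen = set()      # fruits encountered so far
    result = []
    for color, fruit in pairs:
        if fruit not in seen:
            seen.add(fruit)
            result.append((color, fruit))
    return result


output = unique_by_second(ls)
print(output)
```

This walks the list once, preserves the original order, and needs no imports.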