Find duplicate items at a given position in a 2D list in Python

I have a 2D list of route numbers and coordinates generated from a KML file. Each entry is ordered as 'routenumber', 'coordinates', for example:

[['90', '54.93920,25.52,0.0'], ['93', '37.326,19.39,0.0'], ['94', '-110.67,24.395,0.0'], ['95', '-102.154599,17.915081,0.0'], ['96', '-109.177574,25.537,0.0'], ['97', '54.93920,25.52,0.0'], ['98', '55.319,25.506,0.0'], ['911', '54.939206,25.5249,0.0'], ['914', '54.93920,25.52,0.0'], ['915', '54.9169,25.5031,0.0'], ['916', '55.3709,25.35949,0.0'], ['917', '54.939206,25.5249,0.0'], ['920', '56.4641,25.21,0.0'], ['921', '56.4916,25.376,0.0']]

For each route in the list, I am trying to find the other routes that have the same coordinates, for example:

Output:
54.939206,25.5249,0.0 is seen in 911, 917
54.93920,25.52,0.0 is seen in 90, 97, 914

So, at a simple level, is it possible to identify the duplicates within the second 'column' of a list such as:

[['1','10'],['2','50'],['3','10'],['4','0'],['5','50']]

and have it give me:

Output:
10 is seen in 1, 3
50 is seen in 2, 5

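I have tried the following, but I am now stuck and wondering whether I should switch to a dictionary instead. I am pretty new to Python, as you can probably tell. I could not find anything on the forum here about comparing a specific item at a given position in a 2D list (which is what made me wonder about using a dictionary).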

for idx, (route,mcoords) in enumerate(endcoordlist):
    #print(idx, route, coords)
    if any(mcoords in coords for idx, (route,coords) in enumerate(endcoordlist)):    
        print(idx, route, coords)
    

Any help would be greatly appreciated.

Update: Thanks to sushanth for the code. I have tweaked it slightly for readability, so that noobs like me can see what is going on.

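groups = {}  # coordinates -> list of route numbers seen at those coordinates

for v in endcoordlist:
    if groups.get(v[1]):
        # These coordinates were already recorded for an earlier route
        print(v[0], "is a duplicate with coordinates:", v[1])
        groups[v[1]].append(v[0])
    else:
        # First time these coordinates have been seen
        groups[v[1]] = [v[0]]
        print(v[1], "is unique")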

for k, v in groups.items():
    print(f"{k} seen in {','.join(v)}")

Here is a solution you can try.

Declare a dict that uses the coordinates as keys and a list of routes as the value for each key.

groups = {}  # coordinates -> list of route numbers

for v in input_:  # input_ is the 2D list from the question
    if groups.get(v[1]):
        groups[v[1]].append(v[0])
    else:
        groups[v[1]] = [v[0]]

for k, v in groups.items():
    print(f"{k} seen in {','.join(v)}")

54.93920,25.52,0.0 seen in 90,97,914
37.326,19.39,0.0 seen in 93
...
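
The question only asks for coordinates shared by more than one route, so the single-route groups can be filtered out. A minimal sketch of that variant, assuming `collections.defaultdict` is acceptable in place of the plain dict; the names `find_duplicates` and `routes` are made up for illustration and are not part of the original code:

```python
from collections import defaultdict


def find_duplicates(rows):
    """Group first-column values by their second-column value and keep
    only the groups that contain more than one entry."""
    groups = defaultdict(list)
    for route, coords in rows:
        groups[coords].append(route)
    # Drop coordinates that belong to a single route
    return {coords: seen for coords, seen in groups.items() if len(seen) > 1}


# Hypothetical sample data: the small example from the question
routes = [['1', '10'], ['2', '50'], ['3', '10'], ['4', '0'], ['5', '50']]

for coords, seen in find_duplicates(routes).items():
    print(f"{coords} is seen in {', '.join(seen)}")
# 10 is seen in 1, 3
# 50 is seen in 2, 5
```

The `defaultdict(list)` removes the need for the `if`/`else` branch, because a missing key starts out as an empty list.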

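If you sort the input list by coordinates, you only need to walk the sorted list looking for adjacent values that are equal. This may not give you the exact output format you want, but it does show how you could go about it. There may well be a more elegant way: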

RN = [['90', '54.93920,25.52,0.0'], ['93', '37.326,19.39,0.0'], ['94', '-110.67,24.395,0.0'], ['95', '-102.154599,17.915081,0.0'], ['96', '-109.177574,25.537,0.0'], ['97', '54.93920,25.52,0.0'], ['98', '55.319,25.506,0.0'],
      ['911', '54.939206,25.5249,0.0'], ['914', '54.93920,25.52,0.0'], ['915', '54.9169,25.5031,0.0'], ['916', '55.3709,25.35949,0.0'], ['917', '54.939206,25.5249,0.0'], ['920', '56.4641,25.21,0.0'], ['921', '56.4916,25.376,0.0']]

RNS = sorted(RN, key=lambda x: x[1])
m = set()
for i in range(len(RNS) - 1):
    if RNS[i][1] == RNS[i + 1][1]:
        coord = RNS[i][1]
        m.add(RNS[i][0])
        m.add(RNS[i + 1][0])
    else:
        if len(m) > 0:
            print(coord, m)
            m = set()
if len(m) > 0:
    print(coord, m)
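
One possibly tidier way to express the same sort-then-compare-neighbours idea is `itertools.groupby`, which collects runs of adjacent equal keys. This is only a sketch; the function name `duplicates_by_sorting` and the `sample` list are illustrative assumptions, not part of the code above:

```python
from itertools import groupby
from operator import itemgetter


def duplicates_by_sorting(rows):
    """Sort rows by their second item, then collect runs of equal values
    that contain more than one row."""
    ordered = sorted(rows, key=itemgetter(1))  # stable sort keeps route order
    result = []
    for coords, run in groupby(ordered, key=itemgetter(1)):
        routes = [route for route, _ in run]
        if len(routes) > 1:
            result.append((coords, routes))
    return result


# Hypothetical sample data: the small example from the question
sample = [['1', '10'], ['2', '50'], ['3', '10'], ['4', '0'], ['5', '50']]

for coords, routes in duplicates_by_sorting(sample):
    print(f"{coords} is seen in {', '.join(routes)}")
# 10 is seen in 1, 3
# 50 is seen in 2, 5
```

The coordinates are compared as strings here, so the sort order is lexical rather than numeric. That is enough for grouping identical values, but values that differ only in formatting (for example `25.52` and `25.520`) would not be treated as equal.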