Python: find substring in string by index

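I am building some lists and need to exclude certain cases before zipping them. Every item in the lists has a similar code: "A001_AA", "A002_AA", and so on. What I want to do is zip the lists while dropping the duplicate pairings. I need to be able to drop them based on the first 4 characters of each string.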

Below is what I would like my output to look like, in case it helps.

listA = ["A001_AA", "A002_AA", "A003_AA", "A004_AA"]
listB = ["A001_BB", "A002_BB", "A003_BB", "A004_BB"]
listZipped = [("A001_AA", "A002_BB"), ("A001_AA", "A003_BB"), ("A001_AA", "A004_BB"), ("A002_AA", "A001_BB")]  # etc.

So I basically need to be able to do something like this:

for i in listA:
    for x in listB:
        if i[first 4 letters] == x[first 4 letters]:
            do not add to zipped list

I hope this makes sense.

This works! Let me know if you have any questions.

listA = ["A001_AA", "A002_AA", "A003_AA", "A004_AA"]
listB = ["A001_BB", "A002_BB", "A003_BB", "A004_BB"]
listZipped = []

for i in listA:
    for j in listB:
        # Keep the pair only when the first 4 characters differ.
        if i[0:4] != j[0:4]:
            listZipped.append((i, j))

print(listZipped)

Alternatively, if you find it more readable, you can drop the nested for loops and replace them with a list comprehension:

listZipped = [(i, j) for i in listA for j in listB if i[0:4] != j[0:4]]

Output:

[('A001_AA', 'A002_BB'), ('A001_AA', 'A003_BB'), ('A001_AA', 'A004_BB'), ('A002_AA', 'A001_BB'), ('A002_AA', 'A003_BB'), ('A002_AA', 'A004_BB'), ('A003_AA', 'A001_BB'), ('A003_AA', 'A002_BB'), ('A003_AA', 'A004_BB'), ('A004_AA', 'A001_BB'), ('A004_AA', 'A002_BB'), ('A004_AA', 'A003_BB')]
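
If you need the same filter in more than one place, here is a minimal sketch that wraps it in a function and uses itertools.product from the standard library instead of the nested loops. The names cross_pairs and prefix_len are ones I made up for illustration; they are not from the question. It assumes the part you want to compare is always a fixed number of leading characters.

```python
from itertools import product


def cross_pairs(left, right, prefix_len=4):
    """Return every (a, b) pair whose first prefix_len characters differ.

    cross_pairs and prefix_len are illustrative names, not from the question.
    """
    return [
        (a, b)
        for a, b in product(left, right)  # same order as the nested loops
        if a[:prefix_len] != b[:prefix_len]
    ]


listA = ["A001_AA", "A002_AA", "A003_AA", "A004_AA"]
listB = ["A001_BB", "A002_BB", "A003_BB", "A004_BB"]

pairs = cross_pairs(listA, listB)
print(len(pairs))  # 12
print(pairs[0])    # ('A001_AA', 'A002_BB')
```

Note that a[:4] is the same slice as a[0:4], and slicing a string shorter than 4 characters just returns the whole string rather than raising an error, so short items are compared in full.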