How to make if inside for loop using lambda?

Question

I have list_a and string_tmp like this:

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'

I want to know whether any word of string_tmp is in list_a: if so, type = L1, otherwise type = L2.

# for example
type = ''
for k in string_tmp.split():
    if k in list_a:
        type = 'L1'
if len(type) == 0:
    type = 'L2'

That is the actual problem, but in my project len(list_a) = 200,000 and len(string_tmp) = 10,000, so I need this to be very fast.

# this is the output of the example 
type = 'L1'

Answer 1

We can try using a regular expression and a list comprehension:

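import re

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'
# one label per entry of list_a; re.escape guards against regex metacharacters in the keys
output = ['L1' if re.search(r'\b' + re.escape(x) + r'\b', string_tmp) else 'L2' for x in list_a]
print(output)  # ['L1', 'L2', 'L2']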
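
Note that the list above holds one label per entry of list_a, whereas the question asks for a single label. A minimal sketch of collapsing it with any(), assuming a single 'L1'/'L2' result is what is wanted (the names found and label are only illustrative):

```python
import re

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'

# any() stops at the first key that occurs as a whole word in the string
found = any(re.search(r'\b' + re.escape(x) + r'\b', string_tmp) for x in list_a)
label = 'L1' if found else 'L2'
print(label)  # L1
```

This still runs one regex search per key, so with 200,000 keys it is probably not the fastest option.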

Answer 2

Converting the reference list and the string's tokens to sets should improve performance. Like this:

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'

def get_type(s, r): # s is the string, r is the reference list
    s = set(s.split())
    r = set(r)
    return 'L1' if any(map(lambda x: x in r, s)) else 'L2'

print(get_type(string_tmp, list_a))

Output:

L1
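
The same membership test can probably be written without a lambda by using set.isdisjoint, which accepts any iterable. A minimal sketch, where get_type_disjoint is a hypothetical name and not part of the answer above:

```python
def get_type_disjoint(s, r):
    # s is the string, r is the reference list (or, better, a prebuilt set)
    tokens = s.split()
    # isdisjoint returns True when no token appears in the reference set
    return 'L2' if set(r).isdisjoint(tokens) else 'L1'

list_a = ['AA', 'BB', 'CC']
print(get_type_disjoint('Hi AA How Are You', list_a))  # L1
print(get_type_disjoint('Hi DD How Are You', list_a))  # L2
```

If list_a is reused for many strings, building set(list_a) once outside the function and passing the set in avoids rebuilding it on every call.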

Answer 3

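Efficiency depends on which of the two inputs changes least. For example, if list_a stays the same while the strings to test vary, it may be worth converting the list to a regular expression once and then reusing it for the different strings.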

Here is a solution that creates a class instance for a given list. That instance can then be reused for different strings:

import re

class Matcher:
    def __init__(self, lst):
        self.regex = re.compile(r"\b(" + "|".join(re.escape(key) for key in lst) + r")\b")

    def typeof(self, s):
        return "L1" if self.regex.search(s) else "L2"

# demo

list_a = ['AA', 'BB', 'CC']

matcher = Matcher(list_a)

string_tmp = 'Hi AA How Are You'
print(matcher.typeof(string_tmp))  # L1

string_tmp = 'Hi DD How Are You'
print(matcher.typeof(string_tmp))  # L2

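A side effect of this regex is that it also matches words that have punctuation next to them. For example, when the string is 'Hi AA, How Are You' (with an extra comma), the code above still returns "L1".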
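
To make the difference concrete, here is a small sketch comparing the word-boundary regex with a plain whitespace split on the comma example (regex_label and split_label are illustrative names only):

```python
import re

list_a = ['AA', 'BB', 'CC']
s = 'Hi AA, How Are You'

# word-boundary regex: the comma is not a word character, so "AA" still matches
regex = re.compile(r"\b(" + "|".join(re.escape(k) for k in list_a) + r")\b")
regex_label = "L1" if regex.search(s) else "L2"

# whitespace split: the token is "AA," (comma included), which is not in list_a
split_label = "L2" if set(list_a).isdisjoint(s.split()) else "L1"

print(regex_label)  # L1
print(split_label)  # L2
```

Which of the two behaviours is correct depends on whether punctuation should count as part of a word in your data.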


python

list-comprehension