Searching a list of long strings for a substring and then printing out the next 4 characters

Question

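Thanks for reading my question. I'll be working on this as we speak and will update the question if I find a solution. I'm worried this may be a bit too advanced for my skill set, so I'd appreciate any help!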

I have a list of strings, each of which contains an error message.

'Error: Customer (ABC 111) has an activation error'

'Error: Customer (ABC 112) has an activation error'

For each string in this list, I want to find the substring 'ABC' and then print the four characters that follow it, which correspond to the ID number.

OUT: ' 111', ' 112'

I know how to find a substring in a list of strings, but printing the characters that follow it has me stumped.

I'll update as I work on the code, or until some coding legend helps me figure it out!

Thanks!!

Edit: adding an MRE and my final code below:

Basically, the data was originally supplied in an Excel file with two headers, which was converted into a Pandas DataFrame.

CONT_ID	ERROR_DESC
123	Error: Customer (ABC 111) has an activation error
124	Error: Customer (ABC 112) has an activation error

And so on.

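I need to iterate over the ERROR_DESC column to select the CUSTOMER_ID from each row. In the real world the data is a bit more complicated: there are different codes before the ID, and I also need another substring from the string. For the MRE, though, I'll use ABC as a constant.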

My final MRE code is below.


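cust_id = []
for index, row in df.iterrows():
    desc = row['ERROR_DESC']

    # Position of 'ABC'; raises ValueError if it is missing
    i = desc.index('ABC')
    # Skip 'ABC' plus the space, then take the three ID digits
    id_num = desc[i+4:i+7]
    cust_id.append(id_num)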

Answer 1

You can get the index of ABC with index() on the string and slice from there:

a = 'Error: Customer (ABC 111) has an activation error'
i = a.index("ABC")
num = a[i+4:i+7]  # '111'
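
To apply the same slicing to every string in the list, a minimal sketch could look like the following. The helper name `chars_after` is hypothetical, and using `str.find` (which returns -1 rather than raising `ValueError` when the marker is missing) is an assumption about how missing markers should be handled, not part of the answer above.

```python
def chars_after(text, marker, count):
    """Return the `count` characters following `marker` in `text`, or None if absent."""
    start = text.find(marker)
    if start == -1:
        # Marker not present in this string
        return None
    begin = start + len(marker)
    return text[begin:begin + count]


errors = [
    'Error: Customer (ABC 111) has an activation error',
    'Error: Customer (ABC 112) has an activation error',
]

# The four characters after 'ABC' include the leading space, as in the question
ids = [chars_after(e, 'ABC', 4) for e in errors]
print(ids)  # [' 111', ' 112']
```

Calling `.strip()` on each result would drop the leading space if only the digits are wanted.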

Answer 2

Without an MRE to work with, I'll ask you to adapt this to fit your use case:

import re

#setup
list_of_strings = ['Error: Customer (ABC 111) has an activation error',
                   'Error: Customer (ABC 112) has an activation error',
                  ]
pattern = r'(?<=ABC )(\d{3})'

#the thing you want
customer_ids = [int(cust_id.group(0)) for long_string in list_of_strings\
     if (cust_id:=re.search(pattern,long_string))]

#produces
print(customer_ids)

[Out]: [111, 112]
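
The question mentions that the real data has different codes before the ID. One possible way to handle that is sketched below, assuming the code is always three uppercase letters followed by a space and a three-digit ID inside parentheses. That format, the `XYZ` code, and the third sample string are assumptions for illustration; the pattern would need adjusting to the actual data.

```python
import re

# Assumed format: "(<3 uppercase letters> <3 digits>)"
pattern = re.compile(r'\(([A-Z]{3}) (\d{3})\)')

list_of_strings = [
    'Error: Customer (ABC 111) has an activation error',
    'Error: Customer (XYZ 112) has an activation error',  # hypothetical second code
    'Error: no customer reference here',                  # hypothetical non-matching row
]

results = []
for long_string in list_of_strings:
    match = pattern.search(long_string)
    if match:
        # Group 1 is the prefix code, group 2 is the numeric ID
        code, cust_id = match.groups()
        results.append((code, int(cust_id)))

print(results)  # [('ABC', 111), ('XYZ', 112)]
```

Strings without a match are skipped here; collecting `None` for them instead would keep the results aligned with the input rows.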


python

string

substring

list