Issues with regular_expression

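I wrote a script that opens multiple text files and checks whether each line matches a specific pattern. After that, I would like to compare the line against my country pattern (which contains the country codes) and, for now, print the country.

(Later I will try to create a method that moves each text file into a folder based on its country.)

Basically, each text file contains lines like this: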

25/02/2015|11:06:21|MYS|MYS14_FRC6-7_MY1_AA1_WP|MMS1|WXP2632|ashraf|true|120|0|false|

As you can see, this example contains the country code "MYS".

import os
import string
import re
import sys
import glob
import fileinput

country_pattern = 'MYS','IDN','ZAF', 'THA','TWN','SGP', 'NWZ', 'AUS','ALB','AUT','BEL', 'BGR', 'BIH', 'CHE','CZE', 'DEU', 'DNK', 'ESP','EST','SRB','MDK','MNE','BIH', 'BIH','MNE','FIN', 'FRA', 'GBR','GRC', 'HRV', 'HUN', 'IRL', 'ITA', 'LIE', 'LTU', 'LUX', 'LVA', 'MDA', 'SMR','CYP','NLD','NOR','POL','PRT','ROU','SCG', 'SVK','SVN','SWE','TUR','BRA','CAN','USA','MEX','CHL','ARG','RUS'
pattern = r'(\d+)/(\d+)/(\d+)|(\d+):(\d+):(\d+)|(\S+)|(\S+)|(\S+|(\S+)|(\S+)|(\S+)|(\d+)|(\d+)|(\S+)|'
src = raw_input("Enter source disk location: ")
src = os.path.dirname(src) # returns the path to the file
for dir,_,_ in os.walk(src): # walks through multiple folders
    file_path = glob.glob(os.path.join(dir,"*.txt")) # look for the mdi files
    print(file_path)
for file in file_path: 
    f = open(file, 'r')
    object_name = f.readlines()
    f.close()


    for line_name_tmp in object_name: 
        line_name = line_name_tmp.replace('\n','')
        if line_name == '':
            line_name.split()
            continue
        else:
            try:
                re.search(pattern, line_name) 
            except:
                print line_name
                pass

        searchObj = re.search(pattern, line_name) 
        m = searchObj.group(1)
        if m in country_pattern:
            print "searchObj.group(1) : ", searchObj.group(1)
        else:
            print 'did not find any'

Unfortunately, I get this error:

  File "<string>", line 254, in run_nodebug
  File "C:\Users\kostrzew\Desktop\REPORTS\MdiAdmin.py", line 43, in <module>
  searchObj = re.search(pattern, line_name) #
  File "C:\Python27\Lib\re.py", line 142, in search
    return _compile(pattern, flags).search(string)
  File "C:\Python27\Lib\re.py", line 245, in _compile
    raise error, v # invalid expression
  sre_constants.error: unbalanced parenthesis

I don't know how to resolve this error. Am I missing something in the pattern?

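You simply left out a closing parenthesis in the regular expression. The ninth group, (\S+, is never closed, which is what "unbalanced parenthesis" is reporting: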

pattern = r'(\d+)/(\d+)/(\d+)|(\d+):(\d+):(\d+)|(\S+)|(\S+)|(\S+|(\S+)|(\S+)|(\S+)|(\d+)|(\d+)|(\S+)|'
                                                               ^
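
As a rough sketch of what the fix could look like: closing the ninth group makes the pattern compile. However, an unescaped | in a regular expression means alternation, so the balanced pattern would match only the date and group(1) would be the day rather than the country. The example below assumes the intent is to match the literal | separators. The names ESCAPED, country_from_regex and country_from_split, as well as the shortened country set, are made up for illustration and are not part of the original script.

```python
import re

# Sample line taken from the question.
LINE = "25/02/2015|11:06:21|MYS|MYS14_FRC6-7_MY1_AA1_WP|MMS1|WXP2632|ashraf|true|120|0|false|"

# Hypothetical, shortened set of country codes (the question uses a longer tuple).
COUNTRIES = {"MYS", "IDN", "ZAF", "THA", "TWN", "SGP"}

# The question's pattern with the ninth group closed: it now compiles.
BALANCED = (r'(\d+)/(\d+)/(\d+)|(\d+):(\d+):(\d+)|(\S+)|(\S+)|(\S+)|(\S+)'
            r'|(\S+)|(\S+)|(\d+)|(\d+)|(\S+)|')

# Assumed intent: match the literal "|" separators, so they are escaped.
# Only the first three fields are matched here to keep the sketch short.
ESCAPED = re.compile(r'(\d+)/(\d+)/(\d+)\|(\d+):(\d+):(\d+)\|([^|]+)\|')


def country_from_regex(line):
    """Return the country code via the escaped pattern, or None."""
    m = ESCAPED.match(line)
    if m is None:
        return None
    code = m.group(7)
    return code if code in COUNTRIES else None


def country_from_split(line):
    """Return the country code by splitting on '|', or None."""
    fields = line.split('|')
    if len(fields) < 3:
        return None
    return fields[2] if fields[2] in COUNTRIES else None


# With unescaped "|" the first alternative alone matches the date,
# so group(1) is the day, not the country.
day = re.search(BALANCED, LINE).group(1)
print(day)                       # 25
print(country_from_regex(LINE))  # MYS
print(country_from_split(LINE))  # MYS
```

Since the fields are separated by a fixed delimiter, splitting on | may be simpler than a regular expression here. Both helpers return None instead of raising when a line does not have the expected shape.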