Matching items in a string
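I'm trying to process output received from a job application database, and I need to organize each account's information into a list. I'm not sure how to match up an applicant's start date: start dates are numbered from #1, but each one pairs with a different applicant number depending on whether that person was hired.
For example, StartDate#1 needs to be paired with name #3, and StartDate#2 with name #4.
What I'm starting with: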
FirstName#1 = Joe | FirstName#2 = Michael | FirstName#3 = Harold | FirstName#4 = John | LastName#1 = Miles | LastName#2 = Gomez | LastName#3 = Hall | LastName#4 = Hancock | Hired#1 = False | Hired#2 = False | Hired#3 = True | Hired#4 = True | StartDate#1 = 10/31/2018 | StartDate#2 = 10/25/2018 |
Desired output:
[['Joe','Miles','False'], ['Michael','Gomez','False'], ['Harold','Hall','True','10/31/2018'], ['John','Hancock','True','10/25/2018']]
You can do it as follows (you may want to read up on Python's exec):
import re

db_output = """FirstName#1 = Joe | FirstName#2 = Michael | FirstName#3 = Harold | FirstName#4 = John | LastName#1 = Miles | LastName#2 = Gomez | LastName#3 = Hall | LastName#4 = Hancock | Hired#1 = False | Hired#2 = False | Hired#3 = True | Hired#4 = True | StartDate#1 = 10/31/2018 | StartDate#2 = 10/25/2018 |"""

# define the keys
keys = ['FirstName', 'LastName', 'Hired', 'StartDate']

# create an empty dict named after each key
# (exec creates real variables only when run at module level)
for key in keys:
    exec('{} = dict()'.format(key))

# parse each "Key#N = value" item and store it in the matching dict
for data in db_output.split(' |'):
    data = data.strip()
    if data == '':
        continue
    reobj = re.search(r'(\S+)#(\d+)\s+=(.*)', data)
    if reobj:
        key = reobj.group(1).strip()
        num = int(reobj.group(2))
        value = reobj.group(3).strip()
        exec('{}[num] = value'.format(key))

# get the indexes for FirstName and sort them
exec('firstname_indexes = {}.keys()'.format(keys[0]))
firstname_indexes = sorted(firstname_indexes)

# get all local variables (required because the dicts were created with exec)
local = locals()

# variable to collect all data
output = list()

# StartDate numbering starts at 1
startdate_track = 1

# walk the FirstName indexes, build a tmp list per applicant, and give the
# next unused start date to each applicant whose Hired value is 'True'
for fn_index in firstname_indexes:
    tmp = list()
    for key in keys:
        if key == 'StartDate':
            # StartDate is the last key, so tmp[-1] is the Hired value
            if tmp[-1] == 'True':
                tmp.append(local[key][startdate_track])
                startdate_track += 1
            output.append(tmp)
        else:
            tmp.append(local[key][fn_index])

print(output)
Output:
[['Joe', 'Miles', 'False'], ['Michael', 'Gomez', 'False'], ['Harold', 'Hall', 'True', '10/31/2018'], ['John', 'Hancock', 'True', '10/25/2018']]
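The same pairing can also be done without exec by keeping the parsed values in a dict of dicts. The following is only a sketch under the same assumption as the answer above (start dates are numbered in the order in which hired applicants appear); the function name group_applicants and the regular expression are illustrative choices, not part of the original answer.

```python
import re
from collections import defaultdict


def group_applicants(db_output):
    """Group 'Field#N = value' items into one list per applicant."""
    # field name -> {number: value}, e.g. fields['FirstName'][3] == 'Harold'
    fields = defaultdict(dict)
    for name, num, value in re.findall(r'(\w+)#(\d+)\s*=\s*([^|]*)', db_output):
        fields[name][int(num)] = value.strip()

    # Assumption: StartDate#1 belongs to the first hired applicant,
    # StartDate#2 to the second, and so on.
    start_dates = iter(
        [fields['StartDate'][n] for n in sorted(fields['StartDate'])]
    )

    rows = []
    for n in sorted(fields['FirstName']):
        row = [fields['FirstName'][n], fields['LastName'][n], fields['Hired'][n]]
        if row[-1] == 'True':
            date = next(start_dates, None)  # None if the dates run out
            if date is not None:
                row.append(date)
        rows.append(row)
    return rows


sample = ("FirstName#1 = Joe | FirstName#2 = Harold | "
          "LastName#1 = Miles | LastName#2 = Hall | "
          "Hired#1 = False | Hired#2 = True | "
          "StartDate#1 = 10/31/2018 |")
print(group_applicants(sample))
```

Because nothing is created through exec, this version also works inside a function, where names assigned by exec do not become local variables.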