Python dictionary and deque to print required output based on some condition
I have a CSV file containing some data produced by mining.
I want to print it in the required format.
Required Format
A -> B -> C -> D -> E -> F
A -> B -> C -> I
X -> Y -> Z
X -> Y -> P -> Q
A -> B -> K -> L
a.csv file
## code
from collections import deque
import pandas as pd
data = pd.read_csv("a.csv")
data['Start'] = data['Start'].str.replace(' ','_')
data['End'] = data['End'].str.replace(' ','_')
fronts = dict()
backs = dict()
sequences = []
position_counter = 0
selector = data.apply(lambda row: row.str.extractall("([\w+\d]+)"), axis=1)
for relation in selector:
    front, back = relation[0]
    llist = deque((front, back))
    finb = front in backs.keys()
    if finb:
        position = backs[front]
        llist2 = sequences[position]
        back_llist2 = llist2.pop()
        llist = llist2 + llist
        sequences[position] = llist
        backs[llist[-1]] = position
        if front in fronts.keys():
            del fronts[front]
        if back_llist2 in backs.keys():
            del backs[back_llist2]
    if not finb:
        sequences.append(llist)
        fronts[front] = position_counter
        backs[back] = position_counter
        position_counter += 1
data = []
for s in sequences:
    data.append(' -> '.join(str(el) for el in s))
data
What I get is:
'A -> B -> C -> D -> E -> F'
'C -> I'
'A -> N -> A'
'X -> Y -> Z'
'Y -> P -> Q'
'B -> K -> L'
'X1 -> Y1'
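You need to search the existing paths for the start element of each new row. If it is found at the end of a path, append the new end element to that path; if it is found earlier in a path, copy the path up to that element and append the new end element to the copy. If it is not found, start a new path.
Try this code: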
ss = '''
A B
B C
C D
D E
E F
C I
A N
N A
X Y
Y Z
Y P
P Q
B K
K L
X1 Y1
'''.strip()
lst = []
for r in ss.split('\n'):
    lst.append(r.split())
################
paths = []
for e in lst:  # each row in source data
    pnew = []  # new path
    for p in paths:
        if e[0] in p:  # if start in existing path
            if p.index(e[0]) == len(p)-1:  # if end of path
                p.append(e[1])  # add to path
            else:
                pnew.append(p[:p.index(e[0])+1]+[e[1]])  # copy path then add
            break
    else:  # loop completed, not found
        paths.append(list(e))  # create new path
    if len(pnew):  # copied path
        paths.extend(pnew)  # add copied path
print('\n'.join([' => '.join(e) for e in paths]))
Output
A => B => C => D => E => F
A => B => C => I
A => N => A
X => Y => Z
X => Y => P => Q
A => B => K => L
X1 => Y1
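The same search, then append-or-copy logic can be wrapped in a function that takes the (start, end) pairs directly, for example rows read from a CSV. This is only a minimal sketch: it assumes the file has `Start` and `End` columns, as the question's code suggests, and `read_edges` and `build_paths` are hypothetical names, not part of any library. The sample rows are a shortened subset of the data above.

```python
import csv
import io

def read_edges(csv_text):
    """Return (start, end) pairs from CSV text with Start and End columns (assumed layout)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Mirror the question's cleanup: spaces inside a name become underscores.
    return [(row["Start"].strip().replace(" ", "_"),
             row["End"].strip().replace(" ", "_")) for row in reader]

def build_paths(edges):
    """Attach each edge to the first path containing its start, or branch off a copy."""
    paths = []
    for start, end in edges:
        for p in paths:
            if start in p:
                i = p.index(start)
                if i == len(p) - 1:
                    p.append(end)                    # start is the tail: extend in place
                else:
                    paths.append(p[:i + 1] + [end])  # branch: copy the prefix, add end
                break                                # only the first matching path is used
        else:
            paths.append([start, end])               # start not seen yet: new path
    return paths

sample = "Start,End\nA,B\nB,C\nC,D\nC,I\nX,Y\n"
result = build_paths(read_edges(sample))
for path in result:
    print(" -> ".join(path))
```

Because of the `break`, a row is attached only to the first path that contains its start element, which is the same behavior as the loop above.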
According to the source data, A->N->A and X1->Y1 are correct. I don't know why they are excluded from the required output.
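If those two paths really should be left out, the question does not say by which rule. One guess, and it is only an assumption, is that paths which revisit a node (A -> N -> A) and paths that never grew beyond a single pair (X1 -> Y1) are unwanted. A sketch of such a filter, with `keep_path` as a hypothetical helper name and the path list copied from the output above:

```python
def keep_path(path, min_len=3):
    """Assumed rule: keep paths with at least min_len nodes and no repeated node."""
    return len(path) >= min_len and len(set(path)) == len(path)

paths = [
    ["A", "B", "C", "D", "E", "F"],
    ["A", "B", "C", "I"],
    ["A", "N", "A"],
    ["X", "Y", "Z"],
    ["X", "Y", "P", "Q"],
    ["A", "B", "K", "L"],
    ["X1", "Y1"],
]

# Drop the cyclic path and the two-node path, keep everything else in order.
filtered = [p for p in paths if keep_path(p)]
print("\n".join(" -> ".join(p) for p in filtered))
```

With this rule the remaining five paths match the required format, but a different rule could be intended, so check it against the real data.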