仅将员工 ID 复制到 excel sheet

Copy only Employee id to excel sheet

我有一个未格式化的数据包含在记事本文件中,如下所示。

#Civil
GROUP CIVIL RPatel66 LKohli12 m12 PSen72 m72
GROUP CIVIL SKumar22 ASekar32 m32 BSiva90 
#Mechanical
GROUP MECHANICAL OKhan78 m78 MShah81 JKumar11 
GROUP MECHANICAL VHiremath12 TVasu43 m43 NReddy21
#Electrical
GROUP ELECTRICAL LPathan88 SPatil56 m56 AParth33
GROUP ELECTRICAL HAnil45 m45 Khari67 m67 Skumar49

当我运行下面的代码

from openpyxl import Workbook
wb = Workbook()
ws = wb.active
f = open('C:\Users\Kiran\Desktop\Input.txt', 'r+') 
data = f.readlines()
spaces = ""
for i in range(len(data)):
    row = data[i].split(" ")  
    ws.append(row)
wb.save("Output1.xlsx")

import openpyxl
book= openpyxl.load_workbook('Output1.xlsx')
sheet = book['Sheet']
sheet.delete_cols(1,2) #deletes Column 1 and 2
book.save("Output1.xlsx") 

对于上面的内容,我遇到了错误,没有得到我需要的输出。

我需要在 excel sheet 中输出,如图所示 below.I 需要在 excel sheet 中输出 Eg:Rpatel66,LKohli12 等它不应该包含 m12,m72

RPatel66
LKohli12
PSen72
SKumar22 
ASekar32
BSiva90
OKhan78
MShah81
JKumar11
VHiremath12
TVasu43
NReddy21
LPathan88
SPatil56
AParth33
HAnil45
Khari67
Skumar49

请参考以下代码以获得您的查询所需的输出。假设数据存在于

import re
import pandas as pd
with open("<your-file-name.txt>",'r') as f:
    content=f.readlines()
    
content = [x for x in content if not x.startswith('#')]
temp_content_1=list(map(lambda x: x.replace('GROUP','').replace('MECHANICAL','').replace('CIVIL','').replace('ELECTRICAL','').strip(), content))

temp_content_2=list(map(lambda x: re.sub(' m\d+','',x), temp_content_1))

final=' '.join(temp_content_2).split()
df=pd.DataFrame({"Employee":final})
df.to_excel("<your-output-file-name.xlsx>", index=None)

备用解决方案

import re
import pandas as pd
with open("<your-file-name.txt>",'r') as f:
    content=f.readlines()

temp_content_1=list(map(lambda x: re.sub(' m\d+','',x), content))
temp_content_2=' '.join(temp_content_1)

final=re.findall(r'\w+\d+', temp_content_2)    
df=pd.DataFrame({"Employee":final})
df.to_excel("<your-output-file-name.xlsx>", index=None)