Duplicated data in Excel using openpyxl

I created a Python script that appends data to an Excel file. However, the data written to Excel contains multiple duplicates. Can someone help me fix my script?

tree = ET.parse('users.xml')
root = tree.getroot()
#create excel
wb = Workbook()
ws = wb.active
ws.title = ("Active Users")
df=pd.DataFrame(columns=["Login", "User Name", "Role", "Status"])
for user in root.findall('user'):
    login = user.find('login').text
    for m in tls.getUserByLogin(login):
        user_status = int(m.get("isActive"))
        
        if user_status == 1:
            lastname = m.get("lastName")
            firstname = m.get("firstName")
            userLogin = m.get("login")
            activeStatus = ("Active User")
            role = m.get("globalRole")
            tproject = m.get("tprojectRoles")    
            print("Login: " + userLogin + " " + lastname + " " + firstname + " Role: " + str(role['name']) + " " + str(activeStatus))
            df.loc[len(df.index)] =[userLogin, lastname, str(role['name']), str(activeStatus)]
            for row in dataframe_to_rows(df, index = False):
                ws.append(row)          
        else:
            inactive = (str(m.get("firstName")) + " " + str(m.get("lastName")) +": User is not Active")
            print(inactive)
    wb.save(filename = 'userData.xlsx')

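The output in Excel looks like this (Login = A1, User Name = B1, Role = C1, Status = D1):

  1. Login User Name Role Status
  2. admin admin admin Active
  3. Login User Name Role Status
  4. admin admin admin Active
  5. user1 Pedro leader Active
  6. Login User Name Role Status
  7. admin admin admin Active
  8. user1 Pedro leader Active
  9. user2 Juan leader Active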

Also, for the inactive users in the else branch, is it possible to append them to another sheet in the same Excel file? Thanks, everyone.

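The loop that calls ws.append() and the call to wb.save() should be outside all the for loops, including the first one. As written, every active user triggers another pass over the whole DataFrame, so the header and all rows collected so far are appended again each time. Updated code here.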


tree = ET.parse('users.xml')
root = tree.getroot()
#create excel
wb = Workbook()
ws = wb.active
ws.title = ("Active Users")
df=pd.DataFrame(columns=["Login", "User Name", "Role", "Status"])
for user in root.findall('user'):
    login = user.find('login').text
    for m in tls.getUserByLogin(login):
        user_status = int(m.get("isActive"))
        
        if user_status == 1:
            lastname = m.get("lastName")
            firstname = m.get("firstName")
            userLogin = m.get("login")
            activeStatus = ("Active User")
            role = m.get("globalRole")
            tproject = m.get("tprojectRoles")    
            print("Login: " + userLogin + " " + lastname + " " + firstname + " Role: " + str(role['name']) + " " + str(activeStatus))
            df.loc[len(df.index)] =[userLogin, lastname, str(role['name']), str(activeStatus)]
        else:
            inactive = (str(m.get("firstName")) + " " + str(m.get("lastName")) +": User is not Active")
            print(inactive)

### MOVED code here - note it should be outside ALL for loops ####
for row in dataframe_to_rows(df, index = False):
    ws.append(row)          

wb.save(filename = 'userData.xlsx')
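
To see why moving those lines matters, here is a minimal sketch that reproduces the duplication pattern without pandas or openpyxl. The plain lists `table` and `sheet` are stand-ins (assumptions for illustration) for the DataFrame `df` and the worksheet `ws`, and the sample rows are taken from the output shown in the question.

```python
HEADER = ["Login", "User Name", "Role", "Status"]


def append_inside_loop(users):
    """Mimic the original script: re-append the whole table after every user."""
    table = []  # stands in for the DataFrame `df`
    sheet = []  # stands in for the worksheet `ws`
    for user in users:
        table.append(user)
        # dataframe_to_rows(df, index=False) yields the header plus every
        # row collected so far, so all of it is written again each time.
        sheet.append(HEADER)
        sheet.extend(table)
    return sheet


def append_after_loop(users):
    """Mimic the corrected script: collect first, write once at the end."""
    table = []
    for user in users:
        table.append(user)
    return [HEADER] + table


users = [
    ["admin", "admin", "admin", "Active"],
    ["user1", "Pedro", "leader", "Active"],
    ["user2", "Juan", "leader", "Active"],
]

buggy = append_inside_loop(users)
fixed = append_after_loop(users)
print(len(buggy))  # 9 rows, the same pattern as in the question
print(len(fixed))  # 4 rows: one header plus three users
```

With n active users, the in-loop version writes n header rows and n(n+1)/2 data rows, which is why the sheet grows so quickly.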

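Are you sure users.xml contains only unique users?

If you are not sure, I think it is better to add logic that checks for users you have already processed.

To do this, you can use a dictionary or a list to temporarily store the users inside the loop and check whether the current user is already there.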

. . .
user_tmp = []
for user in root.findall('user'):
    login = user.find('login').text
    # Check if login is in the list
    if login not in user_tmp:
        user_tmp.append(login)
    else:
        # if login is in the list, continue the loop
        continue
 . . .
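
A self-contained variant of the same idea, as a sketch: it uses a set for the membership test (constant-time lookups) and a list to keep the original order. The XML string is a made-up stand-in for users.xml; its layout is only assumed from the `root.findall('user')` and `user.find('login')` calls in the question, and `unique_logins` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for users.xml.
SAMPLE_XML = """
<users>
    <user><login>admin</login></user>
    <user><login>user1</login></user>
    <user><login>admin</login></user>
    <user><login>user2</login></user>
</users>
"""


def unique_logins(root):
    """Return each login once, in the order it first appears."""
    seen = set()
    result = []
    for user in root.findall("user"):
        login = user.find("login").text
        if login in seen:
            # Already handled this login, skip the repeated entry.
            continue
        seen.add(login)
        result.append(login)
    return result


root = ET.fromstring(SAMPLE_XML.strip())
logins = unique_logins(root)
print(logins)  # ['admin', 'user1', 'user2']
```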

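Since you are using pandas DataFrames, you can generate multiple worksheets when saving them with to_excel:

# Example: active users are in df_active and inactive users in df_inactive
# create an Excel writer object (raw string, so "\f" is not read as an escape)
with pd.ExcelWriter(r"path to file\filename.xlsx") as writer: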
    # use to_excel function and specify the sheet_name and index
    # to store the dataframe in specified sheet
    df_active.to_excel(writer, sheet_name="Active", index=False)
    df_inactive.to_excel(writer, sheet_name="Inactive", index=False)
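
If you would rather stay with openpyxl alone, a second sheet can be added with `wb.create_sheet()`. The following is only a sketch under assumptions: the user records are plain dicts shaped like the ones the question reads via `m.get(...)`, the sample data is made up, and `build_workbook` and the "Inactive Users" sheet name are hypothetical choices.

```python
from openpyxl import Workbook

HEADER = ["Login", "User Name", "Role", "Status"]


def build_workbook(users):
    """Put active users on one sheet and inactive users on another."""
    wb = Workbook()
    active_ws = wb.active
    active_ws.title = "Active Users"
    inactive_ws = wb.create_sheet("Inactive Users")

    # Write each header exactly once, before looping over the users.
    active_ws.append(HEADER)
    inactive_ws.append(HEADER)

    for m in users:
        is_active = int(m.get("isActive")) == 1
        row = [
            m.get("login"),
            f'{m.get("firstName")} {m.get("lastName")}',
            str(m.get("globalRole")["name"]),
            "Active User" if is_active else "Inactive User",
        ]
        # Append only the current user's row to the matching sheet.
        (active_ws if is_active else inactive_ws).append(row)
    return wb


# Made-up records for illustration only.
sample_users = [
    {"login": "admin", "firstName": "Admin", "lastName": "Root",
     "isActive": "1", "globalRole": {"name": "admin"}},
    {"login": "user1", "firstName": "Pedro", "lastName": "Cruz",
     "isActive": "0", "globalRole": {"name": "leader"}},
]

wb = build_workbook(sample_users)
print(wb.sheetnames)
# To write the file: wb.save("userData.xlsx")
```

The save call is left as a comment so the sketch has no side effects; in a real script it would run once, after all users have been processed.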

I hope my suggestions give you a hint for solving your problem.

Hi @Redox and @taipei, thank you for your quick replies and answers. I have solved my duplication problem using a different format :)

import xml.etree.ElementTree as ET
from openpyxl import Workbook

# tls is the API client object used in the question; it is created elsewhere.

def getUserDetail():
    tree = ET.parse('users.xml')
    root = tree.getroot()
    # create excel
    workbook = Workbook()
    ws = workbook.active
    ws.title = "Active Users"
    # write the header row once, before the loops
    ws.append(['Login', 'User Name', 'Role', 'Status'])
    #logins = []
    for user in root.findall('user'):
        login = user.find('login').text
        #    logins.append(login)
        # for index in range(10):
        #     login = logins[index]
        for m in tls.getUserByLogin(login):
            user_status = int(m.get("isActive"))
            if user_status == 1:
                lastname = m.get("lastName")
                firstname = m.get("firstName")
                userLogin = m.get("login")
                activeStatus = "Active User"
                role = m.get("globalRole")
                tproject = m.get("tprojectRoles")
                print("Login: " + userLogin + " " + lastname + " " + firstname + " Role: " + str(role['name']) + " " + str(activeStatus))
                # only the current user's row is appended, so nothing repeats
                data = [[userLogin, lastname + firstname, str(role['name']), str(activeStatus)]]
                for row in data:
                    ws.append(row)
            else:
                inactive = (str(m.get("firstName")) + " " + str(m.get("lastName")) + ": User is not Active")
                print(inactive)
    ### MOVED code here - note it should be outside ALL for loops ####
    workbook.save(filename='userData.xlsx')

getUserDetail()