Pandas: count how many users connected to each computer
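I have a logon dataset. I want to count how many users connected to each computer, using only pandas built-in functions. I need the result dataset to be the same size as the original one, so every time a computer appears in the original table, it appears in the result table with the same user count.
So if this is the original table: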
Computer | User |
---|---|
computer1 | user1 |
computer1 | user2 |
computer1 | user3 |
computer2 | user1 |
computer2 | user1 |
computer3 | user1 |
computer3 | user2 |
computer3 | user2 |
I want the result table to look like this:
Computer | User_Count |
---|---|
computer1 | 3 |
computer1 | 3 |
computer1 | 3 |
computer2 | 1 |
computer2 | 1 |
computer3 | 2 |
computer3 | 2 |
computer3 | 2 |
With plain lists, this works for me:
    # Assumes user_and_computer is a list of (computer, user) tuples, one per logon
    result = []
    num_of_users = {}
    for computer in {logon[0] for logon in user_and_computer}:
        users = set()  # distinct users seen on this computer
        for logon in user_and_computer:
            if computer == logon[0]:
                users.add(logon[1])
        num_of_users[computer] = len(users)
    for logon in user_and_computer:
        result.append(num_of_users[logon[0]])
In addition, I tried to apply a condition on a third column (Fail or Success), so that only successful logons are counted:
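    # Assumes user_and_computer is a list of (computer, user, type) tuples, one per logon
    result = []
    num_of_users = {}
    for computer in {logon[0] for logon in user_and_computer}:
        users = set()  # distinct users with a successful logon on this computer
        for logon in user_and_computer:
            if computer == logon[0] and logon[2] == 'Success':
                users.add(logon[1])
        num_of_users[computer] = len(users)
    for logon in user_and_computer:
        result.append(num_of_users[logon[0]])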
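In this case the result table still has the same size as the original table, and only successful logons are counted. If all logons to a computer failed, the result table still shows that computer every time it appears in the original table.
One more thing: I am new to pandas, dataframes and tables, and I would like to know how you would describe a task like this without using an example. In other words, how should I title my question to make it more general?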
Use GroupBy.transform with DataFrameGroupBy.nunique. To count only Success rows, first replace the non-matching User values with missing values using Series.where; nunique ignores missing values by default, so those rows are not counted:
    print(df)
        Computer   User     Type
    0  computer1  user1     Fail
    1  computer1  user2  Success
    2  computer1  user3     Fail
    3  computer2  user1  Success
    4  computer2  user1     Fail
    5  computer3  user1  Success
    6  computer3  user2     Fail
    7  computer3  user2  Success

    df['User_Count'] = df.groupby('Computer')['User'].transform('nunique')
    df['User_Count_Success'] = (df['User'].where(df['Type'].eq('Success'))
                                          .groupby(df['Computer'])
                                          .transform('nunique'))

    print(df)
        Computer   User     Type  User_Count  User_Count_Success
    0  computer1  user1     Fail           3                   1
    1  computer1  user2  Success           3                   1
    2  computer1  user3     Fail           3                   1
    3  computer2  user1  Success           1                   1
    4  computer2  user1     Fail           1                   1
    5  computer3  user1  Success           2                   2
    6  computer3  user2     Fail           2                   2
    7  computer3  user2  Success           2                   2
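On the naming question: in pandas terms this is a group-wise transformation, meaning an aggregate is computed per group and returned with the same index as the original rows. The sketch below contrasts it with a plain aggregation. The DataFrame is rebuilt by hand from the printed sample, and the names `per_computer` and `per_row` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the printed DataFrame in this answer
df = pd.DataFrame({
    "Computer": ["computer1"] * 3 + ["computer2"] * 2 + ["computer3"] * 3,
    "User": ["user1", "user2", "user3", "user1", "user1",
             "user1", "user2", "user2"],
    "Type": ["Fail", "Success", "Fail", "Success", "Fail",
             "Success", "Fail", "Success"],
})

# Aggregation: one value per computer (3 rows here)
per_computer = df.groupby("Computer")["User"].nunique()

# Transformation: one value per original row (8 rows here)
per_row = df.groupby("Computer")["User"].transform("nunique")

print(per_computer.to_dict())
print(per_row.tolist())
```

The aggregated result has one entry per computer, while the transformed result keeps the length and index of `df`, which is why it can be assigned directly as a new column.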
Details:
    print(df['User'].where(df['Type'].eq('Success')))
    0      NaN
    1    user2
    2      NaN
    3    user1
    4      NaN
    5    user1
    6      NaN
    7    user2
    Name: User, dtype: object
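The question also asks what happens when every logon to a computer failed. A minimal sketch, using made-up data in which a hypothetical `computer4` has only failed logons: its rows are kept and, as far as I can tell, get a count of 0, because all of its User values become missing and `nunique` skips them.

```python
import pandas as pd

# Hypothetical data: computer4 has only failed logons
df = pd.DataFrame({
    "Computer": ["computer1", "computer1", "computer4", "computer4"],
    "User": ["user1", "user2", "user1", "user3"],
    "Type": ["Success", "Fail", "Fail", "Fail"],
})

# Keep User only where the logon succeeded, NaN elsewhere
success_users = df["User"].where(df["Type"].eq("Success"))

# Count distinct non-missing users per computer, aligned to the original rows
df["User_Count_Success"] = (success_users
                            .groupby(df["Computer"])
                            .transform("nunique"))

print(df["User_Count_Success"].tolist())
```

No rows are dropped, so the result stays the same size as the original table.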