How to sum the same numbers in a CSV file using pandas
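I have a CSV file that contains date, count, and service columns. There are many such date, count, and service columns, but the sample below shows the layout.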
Number Count Service Number Count service
0 13 NO SERVICE 0 10
1 14 tcpmux 1 10
2 9 compressnet 2 14
So I want a result like this:
Number Total Count Service
0 23 NO SERVICE
1 24 tcpmux
2 23 compressnet
How can I do this in pandas? My code so far:
import pandas as pd
df =pd.read_csv ("/Users/mani/Desktop/monthly report/geoip/2017-20dstipsum12.csv")
hasil = df.groupby(['NUMBER']).sum()
hasil.to_csv('gotttt.txt', sep='\t', encoding='utf-8')
If the Number column is the same across all the data:
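# sum all Count columns row by row
# (df['Count'] returns every column named Count when duplicate names are kept)
df['Total Count'] = df['Count'].sum(axis=1)
# select the first and third columns, then join the Total Count column
df = df.iloc[:, [0,2]].join(df['Total Count'])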
print (df)
Number Service Total Count
0 0 NO SERVICE 23
1 1 tcpmux 24
2 2 compressnet 23
In newer versions of pandas, read_csv deduplicates repeated column names (Count, Count.1, and so on), so filter is needed to select the columns:
print (df)
Number Count Service Number.1 Count.1 Service.1
0 0 13 NO SERVICE 0 10
1 1 14 tcpmux 1 10
2 2 9 compressnet 2 14
df['Total Count'] = df.filter(like='Count').sum(axis=1)
df = df[['Number','Total Count','Service']]
print (df)
Number Total Count Service
0 0 23 NO SERVICE
1 1 24 tcpmux
2 2 23 compressnet
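For anyone who wants to try the filter approach without the original file, here is a minimal self-contained sketch. The inline CSV text is a hypothetical stand-in for the real file (comma-separated, with an empty second Service column, which is an assumption based on the sample above), and the variable names csv_text and result are illustrative only.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the question's layout; not the real file.
csv_text = """Number,Count,Service,Number,Count,Service
0,13,NO SERVICE,0,10,
1,14,tcpmux,1,10,
2,9,compressnet,2,14,
"""

# read_csv renames the repeated headers to Number.1, Count.1, Service.1
df = pd.read_csv(io.StringIO(csv_text))

# Sum every column whose name contains "Count" (Count and Count.1), per row
df['Total Count'] = df.filter(like='Count').sum(axis=1)

# Keep the first Number and Service columns alongside the new total
result = df[['Number', 'Total Count', 'Service']]
print(result)
```

Note that filter(like='Count') matches any column name containing "Count", so running the summing line a second time would also include the Total Count column itself.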