What's the best (or fastest) way to make this code shorter?
For clarity, the file I'm pulling data from has a few thousand lines that look like this:
[12:29, 8.2.2020] Fabian Obst: Wir sind stammtisch heute raus
[12:30, 8.2.2020] Benedikt Stumpf: Dito
[12:40, 8.2.2020] Louis Rückel: Ich wär da
[12:41, 8.2.2020] Jan Hofmann: Ich geb nochmal bescheid
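If a professional programmer saw this code, their eyes would probably bleed, but I don't yet know an efficient way to shorten it. Can you help me?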
class Months():
    December17 = []
    January18 = []
    February18 = []
    March18 = []
    April18 = []
    May18 =[]
    June18 = []
    July18 = []
    August18 = []
    September18 = []
    October18 = []
    November18 = []
    December18 = []
    January19 = []
    February19 = []
    March19 = []
    April19 = []
    May19 =[]
    June19 = []
    July19 = []
    August19 = []
    September19 = []
    October19 = []
    November19 = []
    December19 = []
    January20 = []
    February20 = []
    March20 = []
    April20 = []
    May20 =[]
    with open('whatsapp.txt','r', encoding="UTF-8") as file:
        for line in file:
            if '12.2017' in line:
                December17.append(line)
            elif '.1.2018' in line:
                January18.append(line)
            elif '.2.2018' in line:
                February18.append(line)
            elif '3.2018' in line:
                March18.append(line)
            elif '4.2018' in line:
                April18.append(line)
            elif '5.2018' in line:
                May18.append(line)
            elif '6.2018' in line:
                June18.append(line)
            elif '7.2018' in line:
                July18.append(line)
            elif '8.2018' in line:
                August18.append(line)
            elif '9.2018' in line:
                September18.append(line)
            elif '10.2018' in line:
                October18.append(line)
            elif '11.2018' in line:
                November18.append(line)
            elif '12.2018' in line:
                December18.append(line)
            elif '.1.2019' in line:
                January19.append(line)
            elif '.2.2019' in line:
                February19.append(line)
            elif '3.2019' in line:
                March19.append(line)
            elif '4.2019' in line:
                April19.append(line)
            elif '5.2019' in line:
                May19.append(line)
            elif '6.2019' in line:
                June19.append(line)
            elif '7.2019' in line:
                July19.append(line)
            elif '8.2019' in line:
                August19.append(line)
            elif '9.2019' in line:
                September19.append(line)
            elif '10.2019' in line:
                October19.append(line)
            elif '11.2019' in line:
                November19.append(line)
            elif '12.2019' in line:
                December19.append(line)
            elif '.1.2020' in line:
                January20.append(line)
            elif '.2.2020' in line:
                February20.append(line)
            elif '3.2020' in line:
                March20.append(line)
            elif '4.2020' in line:
                April20.append(line)
            elif '5.2020' in line:
                May20.append(line)
    print (" December17:", len(December17),"\n",
           "January18:", len(January18),"\n",
           "February18:", len(February18),"\n",
           "March18:", len(March18),"\n",
           "April18:", len(April18),"\n",
           "May18:", len(May18),"\n",
           "June18:", len(June18),"\n",
           "July18:", len(July18),"\n",
           "August18:", len(August18),"\n",
           "September18:", len(September18),"\n",
           "October18:", len(October18),"\n",
           "November18:", len(November18),"\n",
           "December18:", len(December18),"\n",
           "January19:", len(January19),"\n",
           "February19:", len(February19),"\n",
           "March19:", len(March19),"\n",
           "April19:", len(April19),"\n",
           "May19:", len(May19),"\n",
           "June19:", len(June19),"\n",
           "July19:", len(July19),"\n",
           "August19:", len(August19),"\n",
           "September19:", len(September19),"\n",
           "October19:", len(October19),"\n",
           "November19:", len(November19),"\n",
           "December19:", len(December19),"\n",
           "January20:", len(January20),"\n",
           "February20:", len(February20),"\n",
           "March20:", len(March20),"\n",
           "April20:", len(April20),"\n",
           "May20:", len(May20),"\n",
           )
    Summary = len(December17+January18+February18+March18+April18
                  +May18+June18+July18+August18+September18+October18
                  +November18+December18+January19+February19+March19
                  +April19+May19+June19+July19+August19+September19
                  +October19+November19+December19+January20+February20
                  +March20+April20+May20)
    print ("There are", Summary, "messages in total.")
What it returns:
December17: 19
January18: 13
February18: 41
March18: 43
April18: 80
May18: 241
June18: 67
July18: 183
August18: 280
September18: 83
October18: 61
November18: 116
December18: 228
January19: 145
February19: 111
March19: 131
April19: 188
May19: 151
June19: 120
July19: 222
August19: 289
September19: 141
October19: 127
November19: 107
December19: 190
January20: 92
February20: 73
March20: 90
April20: 45
May20: 136
There are 3813 messages in total.
I could probably get by with just a few lines for the 30 lists at the top, and likewise for the if statements and the print statement at the end.
You want something like this:
from collections import OrderedDict
from datetime import datetime

months = OrderedDict()
with open('whatsapp.txt', 'r', encoding='utf-8') as file:
    for line in file:
        # Parse the "[HH:MM, D.M.YYYY" prefix that precedes the first ']'
        ts = datetime.strptime(line.split(']')[0], '[%H:%M, %d.%m.%Y')
        # Group the line under a month/year key built from the timestamp
        months.setdefault(ts.strftime('%b %Y'), []).append(line)

for month, messages in months.items():
    print(f'{month}:', len(messages))
print('There are {} messages in total.'.format(sum(map(len, months.values()))))
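line.split(']')[0] takes the beginning of each line, e.g. "[12:29, 8.2.2020", which is then parsed into a datetime object. That datetime is used to form a key such as "Feb 2020" in the ordered dictionary (use '%B %Y' instead of '%b %Y' if you want the full month name), and the line is appended to the list under that key. The rest is just computation on the aggregated data.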
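If only the counts are needed, the same idea can be sketched with a Counter instead of lists. This is a variation under a few assumptions, not part of the code above: the names TIMESTAMP and count_by_month are made up for illustration, lines that do not start with a timestamp are assumed to be continuations of a multi-line message and are skipped, and the last sample line is invented. Keying on (year, month) lets the result be sorted chronologically no matter what order the lines arrive in.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the "[HH:MM, D.M.YYYY]" prefix shown in the sample lines.
TIMESTAMP = re.compile(r'^\[(\d{1,2}:\d{2}, \d{1,2}\.\d{1,2}\.\d{4})\]')


def count_by_month(lines):
    """Count messages per (year, month), skipping lines without a timestamp."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            # Assumption: this is the continuation of a multi-line message.
            continue
        ts = datetime.strptime(match.group(1), '%H:%M, %d.%m.%Y')
        counts[(ts.year, ts.month)] += 1
    return counts


sample = [
    '[12:29, 8.2.2020] Fabian Obst: Wir sind stammtisch heute raus',
    '[12:30, 8.2.2020] Benedikt Stumpf: Dito',
    'a continuation line without a timestamp',
    '[09:15, 31.12.2017] Jan Hofmann: made-up line for illustration',
]

counts = count_by_month(sample)
# Tuples sort by year first, then month, so this prints in chronological order.
for (year, month), n in sorted(counts.items()):
    print(f'{datetime(year, month, 1):%B %Y}: {n}')
print(f'There are {sum(counts.values())} messages in total.')
```

For the real file, pass the open file object instead of sample, e.g. count_by_month(file) inside the with block. Note that the month names produced by %B depend on the active locale.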