Want to store variable names in a list, not the variables' contents
Sorry if the title is confusing; let me explain.
I wrote a program that uses tools from nltk and sklearn to classify emails by topic.
The code is as follows:
#Extract Emails (raw strings keep the backslashes in the Windows paths literal)
tech = extract_message(r"C:\Users\Cody\Documents\Emails\tech.html")
gary = extract_message(r"C:\Users\Cody\Documents\Emails\gary.html")
gary2 = extract_message(r"C:\Users\Cody\Documents\Emails\gary2.html")
jesus = extract_message(r"C:\Users\Cody\Documents\Emails\Jesus.html")
jesus2 = extract_message(r"C:\Users\Cody\Documents\Emails\jesus2.html")
hockey = extract_message(r"C:\Users\Cody\Documents\Emails\hockey.html")
hockey2 = extract_message(r"C:\Users\Cody\Documents\Emails\hockey2.html")
shop = extract_message(r"C:\Users\Cody\Documents\Emails\shop.html")
#Build dictionary of features
count_vect = CountVectorizer()
x_train_counts = count_vect.fit_transform(news.data)
#Downscaling
tfidf_transformer = TfidfTransformer()
x_train_tfidf = tfidf_transformer.fit_transform(x_train_counts)
tf_transformer = TfidfTransformer(use_idf=False).fit(x_train_counts)
x_train_tf = tf_transformer.transform(x_train_counts)
#Train classifier
clf = MultinomialNB().fit(x_train_tfidf, news.target)
#List of the extracted emails
docs_new = [gary, gary2, jesus, jesus2, shop, tech, hockey, hockey2]
#Extract features from emails
x_new_counts = count_vect.transform(docs_new)
x_new_tfidf = tfidf_transformer.transform(x_new_counts)
#Predict the categories for each email
predicted = clf.predict(x_new_tfidf)
Now I want to store each variable in the appropriate list based on its predicted label. I figured I could do it like this:
#Store Files in a category
hockey_emails = []
computer_emails = []
politics_emails = []
tech_emails = []
religion_emails = []
forsale_emails = []
#Print out results and store each email in the appropriate category list
for doc, category in zip(docs_new, predicted):
    print('%r ---> %s' % (doc, news.target_names[category]))
    if(news.target_names[category] == 'comp.sys.ibm.pc.hardware'):
        computer_emails.append(doc)
    if(news.target_names[category] == 'rec.sport.hockey'):
        hockey_emails.append(doc)
    if(news.target_names[category] == 'talk.politics.misc'):
        politics_emails.append(doc)
    if(news.target_names[category] == 'soc.religion.christian'):
        religion_emails.append(doc)
    if(news.target_names[category] == 'misc.forsale'):
        forsale_emails.append(doc)
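If I print one of these lists, for example hockey_emails, the output shows the contents stored in the variables rather than the variable names.
What I want is this: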
print(hockey_emails)
output: ['hockey', 'hockey2']
But what I get is:
output: ['View View online click here Hi Thanks for signing up as a EA SPORTS NHL insider You ll now receive all of the latest and greatest news and info at this e mail address as you ve requested EA com If you need technical assistance please contact EA Help Privacy Policy Our Certified Online Privacy Policy gives you confidence whenever you play EA games To view our complete Privacy and Cookie Policy go to privacy ea com or write to Privacy Policy Administrator Electronic Arts Inc Redwood Shores Parkway Redwood City CA Electronic Arts Inc All Rights Reserved Privacy Policy User Agreement Legal ActionsMark as UnreadMark as ReadMark as SpamStarClear StarArchive Previous Next ', 'View News From The Hockey Writers The Editor s Choice stories from The Hockey Writers View this email in your browser edition Recap Stars Steamroll Predators By Matt Pryor on Dec am As the old Mary Chapin Carpenter song goes Sometimes you re the windshield Sometimes you re the bug It hasn t happened very often this season but the Dallas Stars had a windshield Continue Reading A Review of Years in Blue and White Damien Cox One on One By Anthony Fusco on Dec pm The Toronto Maple Leafs are one of the most storied and iconic franchises in the entire National Hockey League They have a century of history that spans all the way back to the early s When you have an Continue Reading Bruins Will Not Miss Beleskey By Kyle Benson on Dec am On Monday it was announced that Matt Beleskey will miss the next six weeks due to a knee injury he sustained over the weekend in a game against the Buffalo Sabres Six weeks is a long stint to be without a potential top Continue Reading Recent Articles Galchenyuk Injury Costly for CanadiensFacing Off Picking Team Canada for World JuniorsAre Johnson s Nomadic Days Over Share Tweet Forward Latest News Prospects Anaheim Ducks Arizona Coyotes Boston Bruins Buffalo Sabres Calgary Flames Carolina Hurricanes Chicago Blackhawks Colorado Avalanche Columbus Blue Jackets Dallas 
Stars Detroit Red Wings Edmonton Oilers Florida Panthers Los Angeles Kings Minnesota Wild Montreal Canadiens Nashville Predators New Jersey Devils New York Islanders New York Rangers Philadelphia Flyers Pittsburgh Penguins Ottawa Senators San Jose Sharks St Louis Blues Tampa Bay Lightning Toronto Maple Leafs Vancouver Canucks Washington Capitals Winnipeg Jets Copyright The Hockey Writers All rights reserved You are receiving this email because you opted in at The Hockey Writers or one of our Network Sites Our mailing address is The Hockey Writers Victoria Ave St Lambert QC J R R CanadaAdd us to your address book unsubscribe from this list update subscription preferences ActionsMark as UnreadMark as ReadMark as SpamStarClear StarArchive Previous Next ']
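I thought this would be simple, but I'm sitting here scratching my head. Is this even possible? Should I use something other than a list? It's probably simple and I'm just blanking.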
You have to keep track of the names yourself; Python won't do it for you.
names = 'gary gary2 Jesus jesus2 shop tech hockey hockey2'.split()
docs_new = [extract_message(r"C:\Users\Cody\Documents\Emails\%s.html" % name)
            for name in names]
for name, category in zip(names, predicted):
    print('%r ---> %s' % (name, news.target_names[category]))
    if (news.target_names[category] == 'comp.sys.ibm.pc.hardware'):
        computer_emails.append(name)
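As a rough sketch of where this leads, the chain of `if` tests can be replaced by one dictionary that maps each category label to a list of names. The `target_names` and `predicted` values below are made-up stand-ins so the snippet runs on its own; in the real program they would come from `news.target_names` and `clf.predict(...)`. The name `emails_by_category` is also just an illustrative choice.

```python
from collections import defaultdict

# Stand-in data (assumptions): the real values come from
# news.target_names and clf.predict(x_new_tfidf).
names = 'gary gary2 Jesus jesus2 shop tech hockey hockey2'.split()
target_names = ['comp.sys.ibm.pc.hardware', 'misc.forsale', 'rec.sport.hockey',
                'soc.religion.christian', 'talk.politics.misc']
predicted = [4, 4, 3, 3, 1, 0, 2, 2]

# Map each category label to the list of email names assigned to it.
emails_by_category = defaultdict(list)
for name, category in zip(names, predicted):
    emails_by_category[target_names[category]].append(name)

print(emails_by_category['rec.sport.hockey'])  # ['hockey', 'hockey2']
```

Because the names are stored instead of the texts, printing a category gives the short labels the question asks for, and a new category needs no new list variable.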
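Don't try to recover variable names. Use a dictionary to hold your collection of emails; when you want to know what is what, you can print the dictionary keys.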
docs_new = dict()
docs_new["tech"] = extract_message(r"C:\Users\Cody\Documents\Emails\tech.html")
docs_new["gary"] = extract_message(r"C:\Users\Cody\Documents\Emails\gary.html")
# etc.
When you iterate over a dictionary, you get its keys.
for doc, category in zip(docs_new, predicted):
    print('%s ---> %s' % (doc, news.target_names[category]))
(More dictionary basics: to iterate over the dictionary's values, replace docs_new above with docs_new.values(); or use docs_new.items() for both keys and values.)
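One detail worth spelling out with the dictionary approach: the classifier needs the texts, so it should probably be given the dictionary's values, while the keys are what get printed. The sketch below illustrates that split with a toy `fake_predict` function and invented email texts, both of which are assumptions standing in for `extract_message(...)` and the sklearn pipeline. It relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on.

```python
# Hypothetical stand-ins so the sketch runs on its own: the real texts come
# from extract_message(...) and the real labels from clf.predict(...).
docs_new = {
    "tech": "new graphics card and motherboard drivers",
    "hockey": "the Bruins beat the Sabres in overtime",
    "shop": "for sale: used bike, price negotiable",
}

def fake_predict(texts):
    """Toy keyword 'classifier' (an assumption, not part of sklearn)."""
    labels = []
    for text in texts:
        if "Bruins" in text:
            labels.append("rec.sport.hockey")
        elif "sale" in text:
            labels.append("misc.forsale")
        else:
            labels.append("comp.sys.ibm.pc.hardware")
    return labels

# Pass the values (the texts) to the model, not the dict itself:
# iterating a dict yields its keys.
predicted = fake_predict(list(docs_new.values()))

# Keys and values iterate in the same insertion order, so the names
# line up with the predictions.
for name, label in zip(docs_new, predicted):
    print('%s ---> %s' % (name, label))
```

In the original program the equivalent change would be to call `count_vect.transform` on `docs_new.values()` instead of on `docs_new`.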