Fasttext: how to load a .csv column into model.predict
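I am new to Python and NLP.
I followed this tutorial (https://fasttext.cc/docs/en/supervised-tutorial.html) to train my fastText supervised model in Python.
I have a CSV with a text column, and I would like to predict a label for each row in the file.
My question is how to load (convert) the CSV column into the predict input and save the labels.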
model.predict("Which baking dish is best to bake a banana bread ?", k=-1, threshold=0.5)
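Instead of this hard-coded text ("Which baking....") I would like to load the column line by line and save the labels, preferably in a new column of the same CSV.
I would appreciate any help, or a tutorial I could follow.
So far I have tried converting the column to a list with pandas and to a numpy array, but both returned "AttributeError: 'function' object has no attribute 'find'"
Take this CSV as an example: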
index;id;text;author
0;id26305;This process, however, afforded me no means of...;EAP
1;id17569;It never once occurred to me that the fumbling...;HPL
2;id11008;In his left hand was a gold snuff box, from wh...;EAP
3;id27763;How lovely is spring As we looked from Windsor...;MWS
4;id12958;Finding nothing else, not even gold, the Super...;HPL
5;id22965;A youth passed in solitude, my best years spen...;MWS
6;id09674;The astronomer, perhaps, at this point, took r...;EAP
7;id13515;The surcingle hung in ribands from my body. ;EAP
8;id19322;I knew that you could not say to yourself 'ste...;EAP
9;id00912;I confess that neither the structure of langua...;MWS
You can use the following code:
import pandas as pd
import fasttext as ft  # older builds installed from GitHub exposed this module as fastText

# load the csv into a pandas dataframe
df = pd.read_csv('csv_file.csv', sep=';')

# load your trained fastText model (MODELPATH is the path to the saved model file)
model = ft.load_model(MODELPATH)

# line by line, make the predictions and store them in a list
predictions = []
for line in df['text']:
    # predict returns (labels, probabilities); labels is empty if nothing reaches the threshold
    labels, probs = model.predict(line, k=-1, threshold=0.5)
    predictions.append(labels[0] if len(labels) > 0 else '')

# add the list to the dataframe, then save the dataframe to a new csv
df['prediction'] = predictions
df.to_csv('csv_file_w_pred.csv', sep=';', index=False)
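The loop above can fail on real data in two ways. fastText's predict raises a ValueError when the text contains a newline, and an empty CSV cell is read by pandas as NaN rather than as a string. The helper below is a minimal sketch that guards against both. The name predict_top_labels is hypothetical, and it assumes model is any object whose predict(text, k=..., threshold=...) returns a (labels, probabilities) pair, as a loaded fastText model does. As for the error in the question, "'function' object has no attribute 'find'" most likely means a method was passed without being called (for example df['text'].tolist instead of df['text'].tolist()), since predict calls .find on its input to look for newlines.

```python
def predict_top_labels(model, texts, threshold=0.5):
    """Return one label per text, or '' when no label reaches the threshold."""
    results = []
    for text in texts:
        # Missing cells arrive as None or NaN; NaN is the only value not equal to itself.
        if text is None or text != text:
            results.append('')
            continue
        # predict() handles one line at a time, so flatten embedded newlines.
        cleaned = str(text).replace('\n', ' ').strip()
        labels, _probs = model.predict(cleaned, k=-1, threshold=threshold)
        # Keep the first returned label; fall back to '' if none was returned.
        results.append(labels[0] if len(labels) > 0 else '')
    return results
```

With the dataframe and model from the answer, this could be used as df['prediction'] = predict_top_labels(model, df['text']) before calling df.to_csv.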