How do I iterate over all the rows in a dataframe and return the results for all of them?
I have a dataframe with 1 column and 3 rows. The dataframe is shown below:
Text
0 Provided by Hindustan Times Wuhan Institute of...
1 Kattappa continues to narrate how he ended up ...
2 National Commercial Bank (NCB), Saudi Arabia’s...
I am trying to summarize all 3 rows and want to create another column, for example:
Text Summarize
0 Provided by Hindustan Times Wuhan Institute of... It's related to virus
1 Kattappa continues to narrate how he ended up ... It's a movie story
2 National Commercial Bank (NCB), Saudi Arabia’s... Article related to finance
I tried the code below:
for index, row in df.iterrows():
    chunks = generate_chunks(row['Text'])
    res = summarizer(chunks, max_length=1000, min_length=20)
    text = ' '.join([summ['summary_text'] for summ in res])
print(text)
But the output is:
Article related to finance
Can anyone help me solve this?
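You are overwriting the value of text on every iteration. It becomes "It's related to virus", then becomes "It's a movie story" and forgets the previous value, and finally becomes "Article related to finance" and forgets the previous value again. By the time the value is printed, only the last summary is left.
Instead of using a single string, use a list of strings and append to it on each iteration, like this: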
summaries = []
for index, row in df.iterrows():
    chunks = generate_chunks(row['Text'])
    res = summarizer(chunks, max_length=1000, min_length=20)
    text = ' '.join([summ['summary_text'] for summ in res])
    summaries.append(text)
print(summaries)
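Since the question also asks for a new column, here is a minimal sketch of how the collected list might be written back to the dataframe. The question does not show generate_chunks or summarizer, so the versions below are hypothetical stand-ins that only mimic the list-of-dicts shape the loop expects, and the three texts are placeholders rather than the real articles.

```python
import pandas as pd


def generate_chunks(text, max_words=50):
    # Hypothetical stand-in: split the text into chunks of at most max_words words.
    words = text.split()
    return [' '.join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarizer(chunks, max_length=1000, min_length=20):
    # Hypothetical stand-in for the real summarizer: keeps the first five words
    # of each chunk and returns one dict per chunk with a 'summary_text' key.
    return [{'summary_text': ' '.join(chunk.split()[:5])} for chunk in chunks]


# Placeholder texts, not the articles from the question.
df = pd.DataFrame({'Text': [
    'placeholder article about a virus outbreak',
    'placeholder story about a movie plot',
    'placeholder report about a bank merger',
]})

# Collect one summary per row, in row order.
summaries = []
for index, row in df.iterrows():
    chunks = generate_chunks(row['Text'])
    res = summarizer(chunks, max_length=1000, min_length=20)
    summaries.append(' '.join(summ['summary_text'] for summ in res))

# The list has one entry per row, so it can be assigned directly as a column.
df['Summarize'] = summaries
print(df)
```

Assigning the list works here because it was built in the same order as the rows and has the same length as the dataframe. Wrapping the loop body in a function and calling df['Text'].apply(...) would be another way to get the same column.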