DataFrame apply/append a function that returns a dict to each row

I want to apply get_sentiment to every row of a DataFrame and append the returned dict to that row as new columns. Is there a good way to do this?

def get_sentiment(txt: str) -> dict:
    response = client.detect_sentiment(Text=txt, LanguageCode='en')

    sentiment_data = dict()
    sentiment_data['Sentiment'] = response['Sentiment']
    sentiment_data['Sentiment_Score_Positive'] = response['SentimentScore']['Positive']
    sentiment_data['Sentiment_Score_Neutral'] = response['SentimentScore']['Neutral']
    sentiment_data['Sentiment_Score_Negative'] = response['SentimentScore']['Negative']
    return sentiment_data


def analyze_txt(df: DataFrame):
    df[] = df['Text'].apply(get_sentiment) #<- what I'm trying to do

Basically, I want df to go from

id Text
1 hello world
2 this is something here

to

id Text Sentiment Sentiment_Score_Positive Sentiment_Score_Neutral Sentiment_Score_Negative
1 hello world Neutral .5 .5 .5
2 this is something here Neutral .5 .5 .5

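When you apply get_sentiment to the Text column, it returns a Series of dicts, so one way to get the desired output is to convert that Series to a list of dicts, construct a DataFrame from the list, and join the result to df: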

new_df = df.join(pd.DataFrame(df['Text'].apply(get_sentiment).tolist()))
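
As a minimal sketch of that join, the snippet below swaps the real API call for a hypothetical stub, fake_get_sentiment, that returns fixed values so it can run offline. The stub and the sample data are assumptions for illustration only, not part of the original code.

```python
import pandas as pd


def fake_get_sentiment(txt: str) -> dict:
    # Hypothetical stand-in for get_sentiment: fixed values, no API call.
    return {
        'Sentiment': 'Neutral',
        'Sentiment_Score_Positive': 0.5,
        'Sentiment_Score_Neutral': 0.5,
        'Sentiment_Score_Negative': 0.5,
    }


df = pd.DataFrame({'id': [1, 2], 'Text': ['hello world', 'this is something here']})

# apply() yields a Series of dicts; tolist() turns it into a list of dicts,
# which pd.DataFrame expands into one column per key.
expanded = pd.DataFrame(df['Text'].apply(fake_get_sentiment).tolist())

# join() aligns on the index; both frames use the default 0..n-1 index here.
new_df = df.join(expanded)
print(new_df.columns.tolist())
```

The original columns come first, followed by one column per key of the returned dict.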

If df has a specific index that needs to be preserved, you can assign it when constructing the DataFrame to be joined:

s = df['Text'].apply(get_sentiment)
new_df = df.join(pd.DataFrame(s.tolist(), index=s.index))
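
To see why the index matters, here is a small sketch comparing the join with and without index=s.index on a frame whose index is not 0..n-1. The stub fake_get_sentiment and its length-based score are hypothetical and exist only to make the example runnable.

```python
import pandas as pd


def fake_get_sentiment(txt: str) -> dict:
    # Hypothetical stand-in for get_sentiment; the score is arbitrary.
    return {'Sentiment': 'Neutral', 'Sentiment_Score_Positive': len(txt) / 100}


# A non-default index, e.g. left over from filtering a larger frame.
df = pd.DataFrame({'Text': ['hello world', 'this is something here']}, index=[10, 20])

s = df['Text'].apply(fake_get_sentiment)

# Without index=..., the new frame is indexed 0..n-1, so no labels match
# and the joined columns come back as NaN.
misaligned = df.join(pd.DataFrame(s.tolist()))

# Reusing s.index keeps each result on the row it was computed from.
aligned = df.join(pd.DataFrame(s.tolist(), index=s.index))

print(misaligned['Sentiment'].isna().all())
print(aligned['Sentiment'].tolist())
```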

A faster approach might be to simply map get_sentiment over the Text column values:

new_df = df.join(pd.DataFrame(map(get_sentiment, df['Text'].tolist())))

pd.concat looks like a viable option, too. To turn a list of dictionaries (or a list of lists that represent rows) into a dataframe, you can use pd.DataFrame.from_records.

df2 = pd.concat([df, pd.DataFrame.from_records(df.Text.apply(get_sentiment))], axis=1)
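
A runnable sketch of the concat variant follows. It assumes a hypothetical stub, fake_get_sentiment, in place of the real API call, and it copies df's index onto the new frame before concatenating, since pd.concat with axis=1 aligns rows by index label.

```python
import pandas as pd


def fake_get_sentiment(txt: str) -> dict:
    # Hypothetical stand-in for get_sentiment: fixed values, no API call.
    return {'Sentiment': 'Neutral', 'Sentiment_Score_Positive': 0.5}


df = pd.DataFrame(
    {'id': [1, 2], 'Text': ['hello world', 'this is something here']},
    index=['a', 'b'],
)

# from_records builds a frame from a list of dicts, one row per dict.
records = pd.DataFrame.from_records(df['Text'].apply(fake_get_sentiment).tolist())

# Give the new frame the same labels as df so concat lines the rows up.
records.index = df.index

df2 = pd.concat([df, records], axis=1)
print(df2)
```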