DataFrame apply/append a function that returns a dict to each row
I want to apply `get_sentiment` to each row of a DataFrame and append the returned dict to that row. Is there a good way to do this?
def get_sentiment(txt: str) -> dict:
    response = client.detect_sentiment(Text=txt, LanguageCode='en')
    sentiment_data = dict()
    sentiment_data['Sentiment'] = response['Sentiment']
    sentiment_data['Sentiment_Score_Positive'] = response['SentimentScore']['Positive']
    sentiment_data['Sentiment_Score_Neutral'] = response['SentimentScore']['Neutral']
    sentiment_data['Sentiment_Score_Negative'] = response['SentimentScore']['Negative']
    return sentiment_data

def analyze_txt(df: DataFrame):
    df[] = df['Text'].apply(get_sentiment) #<- what I'm trying to do
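Basically, I want the df to go from

id | Text |
---|---|
1 | hello world |
2 | this is something here |

to

id | Text | Sentiment | Sentiment_Score_Positive | Sentiment_Score_Neutral | Sentiment_Score_Negative |
---|---|---|---|---|---|
1 | hello world | Neutral | .5 | .5 | .5 |
2 | this is something here | Neutral | .5 | .5 | .5 |
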
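When you apply `get_sentiment` to the `Text` column, it returns a Series of dicts, so one way to get the desired output is to convert that Series to a list of dicts, construct a DataFrame from it, and then `join` it to `df`: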
new_df = df.join(pd.DataFrame(df['Text'].apply(get_sentiment).tolist()))
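To try this without calling the sentiment service, here is a minimal sketch. `fake_get_sentiment` is a hypothetical stand-in that returns fixed values (the .5 placeholders from the question), and the two-row `df` is rebuilt from the example tables; neither is part of the original code.

```python
import pandas as pd

def fake_get_sentiment(txt: str) -> dict:
    # Hypothetical stand-in for get_sentiment: fixed values, no API call.
    return {
        'Sentiment': 'Neutral',
        'Sentiment_Score_Positive': 0.5,
        'Sentiment_Score_Neutral': 0.5,
        'Sentiment_Score_Negative': 0.5,
    }

df = pd.DataFrame({'id': [1, 2],
                   'Text': ['hello world', 'this is something here']})

# Series of dicts -> list of dicts -> DataFrame with one column per dict key
sentiment_df = pd.DataFrame(df['Text'].apply(fake_get_sentiment).tolist())

# join aligns on the index; both frames carry the default 0..n-1 index here
new_df = df.join(sentiment_df)
```

Each dict key becomes a column, so `new_df` ends up with the original `id` and `Text` columns followed by the four sentiment columns.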
If `df` has a specific index that needs to be preserved, you can assign it when constructing the DataFrame to be joined:
s = df['Text'].apply(get_sentiment)
new_df = df.join(pd.DataFrame(s.tolist(), index=s.index))
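The reason the index matters is that `join` matches rows by index label. The sketch below, again using a hypothetical `fake_get_sentiment` stub and a made-up index of 10 and 20, shows what happens with and without `index=s.index`.

```python
import pandas as pd

def fake_get_sentiment(txt: str) -> dict:
    # Hypothetical stand-in for get_sentiment: fixed values, no API call.
    return {'Sentiment': 'Neutral', 'Sentiment_Score_Positive': 0.5}

# Illustrative non-default index (10, 20) instead of 0, 1
df = pd.DataFrame({'Text': ['hello world', 'this is something here']},
                  index=[10, 20])
s = df['Text'].apply(fake_get_sentiment)

# Without index=..., the new frame is labeled 0, 1, so nothing matches 10, 20
misaligned = df.join(pd.DataFrame(s.tolist()))

# Reusing s.index keeps the labels, so every row finds its match
aligned = df.join(pd.DataFrame(s.tolist(), index=s.index))
```

In `misaligned` the sentiment columns are all NaN, while `aligned` carries the values on rows 10 and 20.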
A faster approach may be to simply map `get_sentiment` over the values of the `Text` column:
new_df = df.join(pd.DataFrame(map(get_sentiment, df['Text'].tolist())))
`pd.concat` looks like a viable option, too. To turn a list of dictionaries (or a list of lists that represent rows) into a dataframe, you can use `pd.DataFrame.from_records`.
df2 = pd.concat([df, pd.DataFrame.from_records(df.Text.apply(get_sentiment))], axis=1)
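A small runnable sketch of the `pd.concat` variant follows. It assumes a default 0..n-1 index on `df`, uses a hypothetical `fake_get_sentiment` stub in place of the real call, and converts the Series to a plain list before handing it to `from_records`; that conversion is a choice made here for explicitness, not something the line above requires.

```python
import pandas as pd

def fake_get_sentiment(txt: str) -> dict:
    # Hypothetical stand-in for get_sentiment: fixed values, no API call.
    return {'Sentiment': 'Neutral', 'Sentiment_Score_Positive': 0.5}

df = pd.DataFrame({'id': [1, 2],
                   'Text': ['hello world', 'this is something here']})

# One dict per row, in row order
records = df['Text'].apply(fake_get_sentiment).tolist()

# axis=1 places the new columns next to the existing ones, aligned on index
df2 = pd.concat([df, pd.DataFrame.from_records(records)], axis=1)
```

Because `concat` with `axis=1` also aligns on the index, a `df` with a non-default index would need the same care as in the `join` approach.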