How to apply TextBlob if values in a column are missing for some rows?

Question

I have a data frame that looks like this:

     Text
0    this is amazing
1    nan
2    wow you are great

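I want to pass the text in each cell of the data frame to TextBlob and store the resulting polarity in a new column. However, many of the rows contain nan.

I think this is causing TextBlob to assign a polarity score of 0.0 to every row in the new column, even the rows that contain text.

How can I run TextBlob.sentiment.polarity on each text in my column and create a new column with the polarity scores?

The new df should look like this: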

     Text                 sentiment
0    this is amazing      0.9
1    nan                  0.0
2    wow you are great    0.8

I do not care about the nan rows, so their sentiment value can be either nan or 0.

Current code that does not work:

for text in df.columns:
    a = TextBlob(text)
    df['sentiment']=a.sentiment.polarity
    print(df.value)

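Thank you in advance.

Edit:

To add, in case it makes a difference: the index on the df is not reset, because other parts of the df are grouped by the same index numbers.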

Answer 1

If nan is the problem, you can apply your function only to the rows of the Text column that are not nan, for example:

mask = df['Text'].notnull() #select the rows without nan
df.loc[mask,'sentiment'] = df.loc[mask,'Text'].apply(lambda x: TextBlob(x).sentiment.polarity)

Note: I do not have TextBlob installed, so I am assuming from your code that TextBlob(x).sentiment.polarity works.
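A minimal sketch of the masking approach on a frame whose index is not 0..n-1, as described in the question's edit. It also shows one likely reason the loop in the question gave every row the same score: iterating over df.columns yields the column labels, not the cell values, and assigning a single number to df['sentiment'] fills the whole column. The helper names polarity and add_sentiment and the index labels 7, 3, 12 are made up for illustration; the TextBlob call is the one from the question.

```python
import numpy as np
import pandas as pd


def polarity(text):
    """Polarity of one string via TextBlob (imported lazily, only when called)."""
    from textblob import TextBlob  # assumes textblob is installed
    return TextBlob(text).sentiment.polarity


def add_sentiment(df, column="Text", scorer=polarity):
    """Return a copy of df with a 'sentiment' column; rows without text stay NaN."""
    out = df.copy()
    mask = out[column].notna()       # True where there is text to score
    out["sentiment"] = np.nan        # start with a float column of NaN
    # Score only the masked rows; .loc aligns on index labels, so nothing is reset.
    out.loc[mask, "sentiment"] = out.loc[mask, column].apply(scorer)
    return out


# Hypothetical index labels, deliberately not 0, 1, 2.
df = pd.DataFrame(
    {"Text": ["this is amazing", np.nan, "wow you are great"]},
    index=[7, 3, 12],
)

# Iterating over df.columns gives the labels, here just 'Text'.
labels = list(df.columns)
```

Calling add_sentiment(df) uses TextBlob; any other function that maps a string to a number can be passed as scorer instead, which is handy for trying the masking logic without TextBlob installed.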

Answer 2

Try this:

>>> s=pd.Series(['this is amazing',np.nan,'wow you are great'],name='Text')
>>> s
Out[100]: 
0      this is amazing
1                  NaN
2    wow you are great
Name: Text, dtype: object

>>> s.apply(lambda x: np.nan if pd.isnull(x) else TextBlob(x).sentiment.polarity)
Out[101]: 
0    0.60
1     NaN
2    0.45
Name: Text, dtype: float64
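Since the question says nan or 0 would both be acceptable, here is a hedged variation on the same idea. It uses Series.map with na_action="ignore", which skips missing entries instead of testing for them inside the lambda, and then optionally fills the gaps with a constant. The function name score_series and its fill_value parameter are hypothetical; scorer stands for whatever function returns the polarity, for example lambda x: TextBlob(x).sentiment.polarity.

```python
import numpy as np
import pandas as pd


def score_series(texts, scorer, fill_value=None):
    """Apply scorer to every non-missing entry of a Series of strings.

    Missing entries are skipped and come back as NaN, or as fill_value
    when one is given.
    """
    # na_action="ignore" passes NaN through without calling scorer on it.
    scores = texts.map(scorer, na_action="ignore").astype(float)
    if fill_value is not None:
        scores = scores.fillna(fill_value)
    return scores
```

With the Series from this answer, score_series(s, scorer, fill_value=0.0) would give 0.0 in place of NaN for row 1, and the result can be assigned to a data frame column in the usual way, e.g. df['sentiment'] = score_series(df['Text'], scorer).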

Answer 3

Another solution:

d = {'text': ['text1', 'text2', 'text3', 'text4', 'text5'], 'desc': ['The weather is nice today in my city.', 'I hate this weather.', 'Nice weather today.', 'Perfect weather today.', np.nan]}
df = pd.DataFrame(data=d)
print(df)

    text                                   desc
0  text1  The weather is nice today in my city.
1  text2                   I hate this weather.
2  text3                    Nice weather today.
3  text4                 Perfect weather today.
4  text5                                    NaN

Apply sentiment analysis with TextBlob and add the result to a new column:

df['sentiment'] = df['desc'].apply(lambda x: 'NaN' if pd.isnull(x) else TextBlob(x).sentiment.polarity)
print(df)

    text                                   desc sentiment
0  text1  The weather is nice today in my city.       0.6
1  text2                   I hate this weather.      -0.8
2  text3                    Nice weather today.       0.6
3  text4                 Perfect weather today.         1
4  text5                                    NaN       NaN
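One thing to be aware of with this version: the placeholder is the string 'NaN', not a real missing value, so the new column holds a mix of numbers and text. The sketch below, which needs only pandas and NumPy, compares the two kinds of placeholder using the polarity values printed above as plain literals. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Polarity values copied from the printed output above.
scores = [0.6, -0.8, 0.6, 1.0]

as_string = pd.Series(scores + ["NaN"])   # placeholder is the text 'NaN'
as_float = pd.Series(scores + [np.nan])   # placeholder is a real missing value

print(as_string.dtype)         # object
print(as_float.dtype)          # float64
print(as_string.isna().sum())  # 0, the string is not seen as missing
print(as_float.isna().sum())   # 1

# An existing mixed column can be converted to floats with real NaN values.
repaired = pd.to_numeric(as_string, errors="coerce")
```

If later steps rely on dropna, fillna, or numeric aggregation, returning np.nan from the lambda instead of 'NaN' keeps the column numeric from the start.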

python

python-3.x

pandas

textblob