How to append an extra column with a default value to a pandas DataFrame?

Question

How can I append an extra column with a default value to a pandas DataFrame?

Consider the following code:

userID = "narendramodi"
# `api` is assumed to be an already authenticated tweepy.API instance
tweets = api.user_timeline(screen_name=userID,
                           # 200 is the maximum allowed count
                           count=200,
                           include_rts=True,
                           # Necessary to keep full_text,
                           # otherwise only the first 140 characters are extracted
                           tweet_mode='extended'
                           )

all_tweets = []
all_tweets.extend(tweets)
oldest_id = tweets[-1].id
while True:
    tweets = api.user_timeline(screen_name=userID, 
                           # 200 is the maximum allowed count
                           count=200,
                           include_rts = True,
                           max_id = oldest_id - 1,
                           # Necessary to keep full_text,
                           # otherwise only the first 140 characters are extracted
                           tweet_mode = 'extended'
                           )
    if len(tweets) == 0:
        break
    oldest_id = tweets[-1].id
    all_tweets.extend(tweets)
    print('N of tweets downloaded till now {}'.format(len(all_tweets)))


from pandas import DataFrame

# One row per tweet: id, creation time, like count, retweet count
outtweets = [[tweet.id_str,
              tweet.created_at,
              tweet.favorite_count,
              tweet.retweet_count] for tweet in all_tweets]

df = DataFrame(outtweets, columns=["id",
                                   "created_at",
                                   "favorite_count",
                                   "retweet_count"])
df.head(10)

The code above runs fine, but I want to add an extra column to the dataframe. Assume the default value for every tweet in the dataframe is domain = "NA".

Answer 1

It's as simple as this:

df['domain'] = "NA"
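
For illustration, here is a minimal sketch of why this works: pandas broadcasts a scalar on the right-hand side to every row. The small frame and the extra column names `lang` and `source` are made up for the example and are not part of the question's data; `assign` and `insert` are shown only as alternatives that may suit other situations.

```python
import pandas as pd

# Hypothetical stand-in for the tweets frame built in the question
df = pd.DataFrame({"id": ["1", "2", "3"], "favorite_count": [10, 0, 5]})

# A scalar on the right-hand side is broadcast to every row
df["domain"] = "NA"

# assign() returns a new frame and leaves df itself unchanged
df2 = df.assign(lang="NA")

# insert() adds the column in place at a chosen position (here: first)
df.insert(0, "source", "NA")

print(df)
print(df2)
```

Plain assignment is usually enough; `assign` may be handy in method chains, and `insert` when the column order matters.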

Answer 2

This fills new_col with NaN values:

import numpy as np

# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
df['new_col'] = np.nan
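
As a rough sketch of how this differs from the string default in the other answer: the text "NA" is an ordinary value, while NaN is what pandas treats as missing. The two-row frame below is invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame standing in for the tweets data
df = pd.DataFrame({"id": ["1", "2"]})

df["domain"] = "NA"      # an ordinary string, not a missing value
df["new_col"] = np.nan   # a real missing value; the column becomes float64

# isna() flags only the NaN column
print(df.isna().sum())
print(df.dtypes)
```

Which default to pick depends on whether later code should see the column as missing data (for example with `isna()` or `dropna()`) or as a regular label.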

Tags: python, dataframe, tweepy, pandas, twitterapi-python