How do I perform an API call for each item in a list?

Question

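I'm trying to get price data for each company in a list of tuples [(company_name, symbol)]. In this case, I'm using the TD Ameritrade API.

I'm also running into the same problem with Reddit. The only difference is that there I'm trying to retrieve all comments for each post 'id'. With the Reddit code, however, I'm pulling the IDs from a pandas df rather than from a list.

Here's where I am now:

This is for the TD Ameritrade API: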

    async def run_app(symbols):
        # empty list to append dataframes to
        all_dfs = []
        for sym,name in symbols:

            # gets the price history data 
            rdata = asyncio.get_price_data(symbol=sym, period='10', periodType='day', frequency='minute', frequencyType='1')

            # prepares data for panda df
            can = unpack_ph_data(rdata, 'candles', 'symbol') 

            # function for creating a panda df
            df = create_ph_df(can) # function for creating a panda df

            # append to all_dfs
            all_dfs.append(df)

            return rdata

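My thinking was that, with the for statement, each step would run for every item in the symbols list. I first tried it without asyncio, then I saw an example similar to this one that didn't use an API, so I thought I'd give it a try.

For Reddit:

I'm trying to do something similar with the praw package. Here, though, I'm pulling the data from each row of a pandas df, and I run into the same problem.

I have a function that gets all the data from a specified subreddit and returns a pandas df: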

def get_subreddit_data(subreddit="all", limit=25): 
    """
    :param subreddit: Which subreddit to get top posts. 
    :param limit: number of desired posts (This will be for setting the limit 
    after.hot(limit=num_of_posts)and eventually determined from user input. 
    Default 25

    :returns: top posts in subreddit or default (top posts on reddit).
    data is returned in pandas df with the following columns: 
    >>> title, score, id, subreddit, url, comments, selftext, created <<<
    """
    # empty list to insert data to: 
    posts = []

    # variable for data 
    top_posts = reddit.subreddit(subreddit).hot(limit=limit) # limit and subreddit params

    # FOR loop to append data to posts 
    for post in top_posts:
        posts.append([post.title, post.score, post.id, post.subreddit, post.url, post.num_comments, post.selftext, post.created])

    # Create df 
    df = pd.DataFrame(posts, columns=["title","score", "id", "subreddit", "url", "comments","selftext", "created"])

    return df

df = get_subreddit_data(subreddit, limit) works fine and returns a pandas df.

This is where I run into problems:

IDs = []
for ID in [df["id"]]:
    IDs.append(ID) # Add IDs to ID list

def return_comments_for(ID_list):
    # empty list to append comments to
    _comments = []
    """
    :param ID_list: list of post IDs 
    :returns: list of comments for each post ID 
    in ID_list
    """
    # for loop to extract each ID one by one
    for ID in ID_list:
        # Create submission instance
        submission = reddit.submission(id=ID)
        submission.comments.replace_more(limit=None)
        for comment in submission.comments.list():
            _comments.append(comment.body)

comments = return_comments_for(IDs)

That didn't work, so I tried it without a function, using a queue instead:

# Empty list for all IDs
queue = [] 
IDs = [df["id"]] # get IDs from DF
for i in IDs:
    queue.append(i) # Add IDs to queue 

# list to append comments to
_comments = []
while queue:
    # pop item index 0 and assign to ID
    ID = queue.pop(0)
    # create submission instance for ID
    submission = reddit.submission(id=ID)
    submission.comments.replace_more(limit=None)

    # for each comment in submission instance 
    for comment in submission.comments.list():
        _comments.append(comment.body) # append to main comment list

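That's not the only way I tried to use a queue or stack, either. I tried several different approaches, but I can't remember them all. Either way, none of them worked, so I'm missing something.

This is the full error message I get every time, no matter what I try.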

ValueError                                Traceback (most recent call last)
<ipython-input-9-2cddf98f54ba> in <module>
     20 
     21 
---> 22 comments = return_comments_for(IDs)
     23 print(comments)

<ipython-input-9-2cddf98f54ba> in return_comments_for(ID_list)
     14     for ID in ID_list:
     15         # Create submission instance
---> 16         submission = reddit.submission(id=ID)
     17         submission.comments.replace_more(limit=None)
     18         for comment in submission.comments.list():

C:\ProgramData\Anaconda3\lib\site-packages\praw\reddit.py in submission(self, id, url)
    847 
    848         """
--> 849         return models.Submission(self, id=id, url=url)

C:\ProgramData\Anaconda3\lib\site-packages\praw\models\reddit\submission.py in __init__(self, reddit, id, url, _data)
    532 
    533         """
--> 534         if (id, url, _data).count(None) != 2:
    535             raise TypeError("Exactly one of `id`, `url`, or `_data` must be provided.")
    536         self.comment_limit = 2048

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1476 
   1477     def __nonzero__(self):
-> 1478         raise ValueError(
   1479             f"The truth value of a {type(self).__name__} is ambiguous. "
   1480             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answer 1

The way you're extracting the list of IDs is wrong. Just do the following:

IDs = df["id"].values.tolist()
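To see the difference, here is a small sketch using a made-up DataFrame (the IDs are placeholders, not real posts, and the frame only stands in for the one returned by get_subreddit_data). It compares wrapping the column in a list, as the question does, with converting the column to a list:

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame from the question
df = pd.DataFrame({"id": ["abc123", "def456", "ghi789"]})

wrapped = [df["id"]]             # a list holding ONE element: the whole Series
ids = df["id"].values.tolist()   # a list holding one plain string per row

print(len(wrapped), type(wrapped[0]).__name__)
print(ids)
```

Looping over wrapped yields the entire Series once, whereas looping over ids yields each post ID in turn, which is what reddit.submission(id=...) expects.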

Answer 2

After running this:

IDs = []
for ID in [df["id"]]:
    IDs.append(ID) # Add IDs to ID list

IDs is not a list of individual IDs but a list of pandas.Series objects. It contains exactly one Series, namely df["id"].

# This does what you were trying to do:
IDs = []
for ID in df["id"]:
    IDs.append(ID)
    
# Which can be shortened to
IDs = list(df["id"])

# But I think just passing the Series to your function should work fine:

comments = return_comments_for(df["id"])
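This also accounts for the traceback. According to the traceback, praw validates its arguments with (id, url, _data).count(None). When id is a whole Series, that comparison ends up asking for the truth value of a Series, which pandas refuses to give. The following is a minimal sketch that reproduces only that check, without praw; the helper name count_none and the sample IDs are made up for illustration:

```python
import pandas as pd

series = pd.Series(["abc123", "def456"])  # made-up post IDs


def count_none(id_value):
    # Same shape of check as the one shown in the traceback
    return (id_value, None, None).count(None)


# A single ID string passes the check
print(count_none("abc123"))

# A whole Series does not: comparing it to None gives a Series of booleans,
# and pandas raises when that Series is used as a single True/False value
try:
    count_none(series)
except ValueError as exc:
    print(exc)
```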

The remaining bug is that comments will be None after this, because return_comments_for never returns anything, so it implicitly returns None.
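A sketch of the function with the missing return added might look like the following. Passing reddit in as a parameter is my own choice so the example can run offline; the Fake* classes are hypothetical stand-ins that only mimic the three praw calls used in the question (submission, replace_more, and list) and are not part of praw.

```python
def return_comments_for(reddit, id_list):
    """Return the bodies of all comments for each post ID in id_list."""
    comments = []
    for post_id in id_list:
        submission = reddit.submission(id=post_id)
        submission.comments.replace_more(limit=None)
        for comment in submission.comments.list():
            comments.append(comment.body)
    return comments  # without this line the function returns None


# --- Hypothetical stand-ins so the sketch runs without network access ---
class FakeComment:
    def __init__(self, body):
        self.body = body


class FakeCommentForest:
    def __init__(self, bodies):
        self._comments = [FakeComment(b) for b in bodies]

    def replace_more(self, limit=None):
        return []

    def list(self):
        return self._comments


class FakeSubmission:
    def __init__(self, bodies):
        self.comments = FakeCommentForest(bodies)


class FakeReddit:
    def __init__(self, data):
        self._data = data

    def submission(self, id):
        return FakeSubmission(self._data[id])


fake = FakeReddit({"abc123": ["first", "second"], "def456": ["third"]})
print(return_comments_for(fake, ["abc123", "def456"]))
```

With a real praw.Reddit instance, the call would be return_comments_for(reddit, df["id"].tolist()).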

Tags: python, reddit, dataframe, pandas