How do I perform an API call for each item in a list?
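I'm trying to get price data for each company in a list of tuples [(company_name, symbol)]. In this case I'm using the TD Ameritrade API.
I'm also running into the same problem with Reddit. The only difference is that there I'm trying to retrieve all the comments for each post 'id'. With the Reddit code, though, I'm pulling the IDs from a pandas df rather than a list.
Here's where I am right now:
This is for the TD Ameritrade API: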
async def run_app(symbols):
    # empty list to append dataframes to
    all_dfs = []
    for sym,name in symbols:
        # gets the price history data
        rdata = asyncio.get_price_data(symbol=sym, period='10', periodType='day', frequency='minute', frequencyType='1')
        # prepares data for panda df
        can = unpack_ph_data(rdata, 'candles', 'symbol')
        # function for creating a panda df
        df = create_ph_df(can)
        # append to all_dfs
        all_dfs.append(df)
    return rdata
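My thinking was that the for statement would run every step for each item in the symbols list. I first tried it without asyncio, then I saw an example similar to this one (though it didn't use an API), so I figured I'd give it a try.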
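For Reddit:
I'm trying to do something similar with the praw package. Here, though, I'm pulling the data from each row of a pandas df, and I run into the same problem.
I have a function that gets all the data from a specified subreddit and returns a pandas df: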
def get_subreddit_data(subreddit="all", limit=25):
    """
    :param subreddit: Which subreddit to get top posts.
    :param limit: number of desired posts (This will be for setting the limit
                  after .hot(limit=num_of_posts) and eventually determined from user input).
                  Default 25
    :returns: top posts in subreddit or default (top posts on reddit).
              data is returned in pandas df with the following columns:
              >>> title, score, id, subreddit, url, comments, selftext, created <<<
    """
    # empty list to insert data to:
    posts = []
    # variable for data
    top_posts = reddit.subreddit(subreddit).hot(limit=limit) # limit and subreddit params
    # FOR loop to append data to posts
    for post in top_posts:
        posts.append([post.title, post.score, post.id, post.subreddit, post.url, post.num_comments, post.selftext, post.created])
    # Create df
    df = pd.DataFrame(posts, columns=["title","score", "id", "subreddit", "url", "comments","selftext", "created"])
    return df
df = get_reddit_subreddit(subreddit, limit)
works fine and returns a pandas df.
This is where I run into problems:
IDs = []
for ID in [df["id"]]:
    IDs.append(ID) # Add IDs to ID list

def return_comments_for(ID_list):
    # empty list to append comments to
    _comments = []
    """
    :param ID_list: list of post IDs
    :returns: list of comments for each post ID
              in ID_list
    """
    # for loop to extract each ID one by one
    for ID in ID_list:
        # Create submission instance
        submission = reddit.submission(id=ID)
        submission.comments.replace_more(limit=None)
        for comment in submission.comments.list():
            _comments.append(comment.body)

comments = return_comments_for(IDs)
That didn't work, so I tried it without a function, using a queue:
# Empty list for all IDs
queue = []
IDs = [df["id"]] # get IDs from DF
for i in IDs:
    queue.append(i) # Add IDs to queue
# list to append comments to
_comments = []
while queue:
    # pop item index 0 and assign to ID
    ID = queue.pop(0)
    # create submission instance for ID
    submission = reddit.submission(id=ID)
    submission.comments.replace_more(limit=None)
    # for each comment in submission instance
    for comment in submission.comments.list():
        _comments.append(comment.body) # append to main comment list
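That's not the only way I tried using a queue or stack, either. I tried a number of different approaches, though I can't remember them all. Either way, none of them worked, so I'm missing something.
This is the full error message I get every time, no matter what I try: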
ValueError Traceback (most recent call last)
<ipython-input-9-2cddf98f54ba> in <module>
20
21
---> 22 comments = return_comments_for(IDs)
23 print(comments)
<ipython-input-9-2cddf98f54ba> in return_comments_for(ID_list)
14 for ID in ID_list:
15 # Create submission instance
---> 16 submission = reddit.submission(id=ID)
17 submission.comments.replace_more(limit=None)
18 for comment in submission.comments.list():
C:\ProgramData\Anaconda3\lib\site-packages\praw\reddit.py in submission(self, id, url)
847
848 """
--> 849 return models.Submission(self, id=id, url=url)
C:\ProgramData\Anaconda3\lib\site-packages\praw\models\reddit\submission.py in __init__(self, reddit, id, url, _data)
532
533 """
--> 534 if (id, url, _data).count(None) != 2:
535 raise TypeError("Exactly one of `id`, `url`, or `_data` must be provided.")
536 self.comment_limit = 2048
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1476
1477 def __nonzero__(self):
-> 1478 raise ValueError(
1479 f"The truth value of a {type(self).__name__} is ambiguous. "
1480 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
The way you're extracting the list of IDs is wrong. Just do this:
IDs = df["id"].values.tolist()
After
IDs = []
for ID in [df["id"]]:
    IDs.append(ID) # Add IDs to ID list
IDs is not a list of individual IDs but a list of pandas.Series. It contains exactly one Series, namely df["id"].
# This does what you were trying to do:
IDs = []
for ID in df["id"]:
    IDs.append(ID)
# Which can be shortened to
IDs = list(df["id"])
# But I think just passing the Series to your function should work fine:
comments = return_comments_for(df["id"])
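To see why the traceback mentions a Series at all, here is a small sketch with a made-up DataFrame (the IDs are placeholders, not real posts). It compares the wrapped list with a flat one, then repeats the `(id, url, _data).count(None)` check shown in the traceback to reproduce the same ValueError without calling Reddit:

```python
import pandas as pd

# Made-up post IDs standing in for the df returned by the question's function.
df = pd.DataFrame({"id": ["abc123", "def456", "ghi789"]})

wrapped = [df["id"]]      # a list holding ONE element: the whole Series
flat = df["id"].tolist()  # a list of the individual ID strings

print(len(wrapped), type(wrapped[0]).__name__)
print(flat)

# The traceback shows PRAW evaluating (id, url, _data).count(None).
# With a Series as `id`, comparing it to None gives a Series of booleans,
# and reducing that to a single True/False is what raises.
error_message = None
try:
    (wrapped[0], None, None).count(None)
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Looping over `flat` (or over `df["id"]` directly) hands `reddit.submission` one string at a time, so that check never sees a Series.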
The real error is that comments will be None after this, because return_comments_for doesn't return anything, so it implicitly returns None.
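A minimal sketch of the function with the missing return added. Passing `reddit` in as a parameter is my assumption so that the example can run without credentials; the question uses a module-level `reddit` object instead. `FakeReddit` and `FakeCommentForest` are hypothetical stand-ins that only mimic the three PRAW calls used in the question (`submission(id=...)`, `comments.replace_more(limit=None)`, `comments.list()`); they are not part of PRAW.

```python
from types import SimpleNamespace


def return_comments_for(reddit, id_list):
    """Return the body of every comment on every post ID in id_list."""
    comments = []
    for post_id in id_list:
        submission = reddit.submission(id=post_id)
        # Expand the "load more comments" placeholders before flattening.
        submission.comments.replace_more(limit=None)
        for comment in submission.comments.list():
            comments.append(comment.body)
    return comments  # without this line the caller gets None


# --- Hypothetical stand-ins so the sketch runs offline ---
class FakeCommentForest:
    def __init__(self, bodies):
        self._comments = [SimpleNamespace(body=b) for b in bodies]

    def replace_more(self, limit=None):
        return []  # nothing to expand in the fake

    def list(self):
        return self._comments[:]


class FakeReddit:
    def __init__(self, threads):
        self._threads = threads  # maps post ID -> list of comment bodies

    def submission(self, id):
        return SimpleNamespace(comments=FakeCommentForest(self._threads[id]))


fake = FakeReddit({"abc123": ["first", "second"], "def456": ["third"]})
result = return_comments_for(fake, ["abc123", "def456"])
print(result)
```

With a real PRAW client the call would look like `return_comments_for(reddit, df["id"].tolist())`.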