How can I use custom filters in the Stack Exchange API?

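I am trying to fetch questions and answers through StackAPI to train a deep learning model. My problem is that I don't understand how to use a custom filter so that I get only the body of each answer.

Here is my code: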

from stackapi import StackAPI
import torch
import torch.nn as nn

SITE = StackAPI('stackoverflow')
SITE.max_pages = 1
SITE.page_size = 1
data = SITE.fetch('questions', tagged='python', filter='!*SU8CGYZitCB.D*(BDVIficKj7nFMLLDij64nVID)N9aK3GmR9kT4IzT*5iO_1y3iZ)6W.G*', sort='votes')
for quest in data['items']:
    question = quest['title']
    print(question)
    question_id = quest['question_id']
    print(question_id)
    dataAnswer = SITE.fetch('questions/{ids}/answers', ids=[question_id], filter='withbody')
    print(dataAnswer)

This is what dataAnswer contains:

{'backoff': 0, 'has_more': True, 'page': 1, 'quota_max': 300, 'quota_remaining': 300, 'total': 0, 'items': [{'owner': {'reputation': 404, 'user_id': 11182732, 'user_type': 'registered', 'profile_image': 'https://lh6.googleusercontent.com/-F2a9OP4yGHc/AAAAAAAAAAI/AAAAAAAADVo/Mn4oVgim-m8/photo.jpg?sz=128', 'display_name': 'Aditya patil', 'link': 'https://stackoverflow.com/users/11182732/aditya-patil'}, 'is_accepted': False, 'score': 8, 'last_activity_date': 1609856797, 'last_edit_date': 1609856797, 'creation_date': 1587307868, 'answer_id': 61306333, 'question_id': 231767, 'content_license': 'CC BY-SA 4.0', 'body': '<p><strong>The yield keyword is going to 
replace return in a function definition to create a generator.</strong></p>\n<pre><code>def create_generator():\n   for i in range(100):\n   yield i\nmyGenerator = create_generator()\nprint(myGenerator)\n# &lt;generator object create_generator at 0x102dd2480&gt;\nfor i in myGenerator:\n   print(i) # prints 0-99\n</code></pre>\n<p>When the returned generator is first used—not in the assignment but the for loop—the function definition will execute until it reaches the yield statement. There, it will pause (see why it’s called yield) until used again. Then, it will pick up where it left off. Upon the final iteration of the generator, any code after the yield command will execute.</p>\n<pre><code>def create_generator():\n   print(&quot;Beginning of generator&quot;)\n   for i in range(4):\n      yield i\n   print(&quot;After yield&quot;)\nprint(&quot;Before assignment&quot;)\n\nmyGenerator = create_generator()\n\nprint(&quot;After assignment&quot;)\nfor i in myGenerator :\n   print(i) # prints 0-3\n&quot;&quot;&quot;\nBefore assignment\nAfter assignment\nBeginning of generator\n0\n1\n2\nAfter yield\n</code></pre>\n<p>The <strong>yield</strong> keyword modifies a function’s behavior to produce a generator that’s paused at each yield command during iteration. The function isn’t executed except upon iteration, 
which leads to improved resource management, and subsequently, a better overall performance. Use generators (and yielded functions) for creating large data sets meant for single-use iteration.</p>\n'}]}

Now I want to get only the body of the result. Can I replace the withbody filter with a custom filter? If so, which filter should I use?

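  1. Select your method from the API docs. In this case, it's the /questions/{ids}/answers one.
  2. Click [edit] next to the default filter, tick only the fields you need, and click Save.
  3. Copy the filter that appears and paste it into your code.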
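
Once you have a filter string, using it could look like the sketch below. The helper name fetch_answer_bodies, the placeholder 'PASTE_FILTER_HERE', and the OfflineSite stand-in are all hypothetical; the stand-in only exists so the example runs without network access. With a real StackAPI('stackoverflow') instance and the filter copied in step 3, the same helper should return the answer bodies, assuming the filter includes answer.body.

```python
def fetch_answer_bodies(site, question_id, answer_filter):
    """Return the HTML body of every answer to one question.

    `site` is assumed to behave like a stackapi.StackAPI instance:
    fetch(endpoint, **kwargs) returns a dict with an 'items' list.
    """
    response = site.fetch('questions/{ids}/answers',
                          ids=[question_id],
                          filter=answer_filter)
    # Skip items without a body in case the filter does not include it.
    return [item['body'] for item in response.get('items', []) if 'body' in item]


class OfflineSite:
    """Hypothetical stand-in for StackAPI so the sketch runs offline."""

    def fetch(self, endpoint, **kwargs):
        return {'items': [{'body': '<p>first answer</p>'},
                          {'body': '<p>second answer</p>'}]}


# 'PASTE_FILTER_HERE' is a placeholder for the filter copied from the docs page.
bodies = fetch_answer_bodies(OfflineSite(), 231767, 'PASTE_FILTER_HERE')
print(bodies)
```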

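Creating filters programmatically is complicated by the lack of (proper) documentation for the /filters/create method. Since you want the answer body, you need to include answer.body in the filter, along with the default .wrapper fields. For example: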

from stackapi import StackAPI

defaultWrapper = '.backoff;.error_id;.error_message;.error_name;.has_more;.items;.quota_max;.quota_remaining;'
includes = 'answer.body'

SITE = StackAPI('stackoverflow')
# See https://stackapi.readthedocs.io/en/latest/user/advanced.html#end-points-that-don-t-accept-site-parameter
SITE._api_key = None
data = SITE.fetch('filters/create', base = 'none', include = defaultWrapper + includes)
print(data['items'][0]['filter'])

Change includes accordingly.
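
If you need more than one field, building the include string from a list may be less error-prone than concatenating by hand. This is only a sketch: build_include and DEFAULT_WRAPPER_FIELDS are hypothetical names, and the wrapper fields are the same ones listed in defaultWrapper above. The result is meant to be passed as the include argument of the filters/create call.

```python
# Wrapper fields taken from the defaultWrapper string in the example above.
DEFAULT_WRAPPER_FIELDS = [
    '.backoff', '.error_id', '.error_message', '.error_name',
    '.has_more', '.items', '.quota_max', '.quota_remaining',
]


def build_include(fields):
    """Join the wrapper fields and the requested fields with ';'."""
    return ';'.join(DEFAULT_WRAPPER_FIELDS + list(fields))


# Request the answer body and score; adjust the list to your needs.
include = build_include(['answer.body', 'answer.score'])
print(include)

# With network access this would be used as in the example above:
# data = SITE.fetch('filters/create', base='none', include=include)
```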
