How do I scrape a randomly generated sentence from this website?

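I am using Python 3.8.x to try to scrape a randomly generated sentence from this website: https://randomwordgenerator.com/sentence.php. However, when I read the page, the generated sentence is not in the HTML. Can anyone help me find a way to scrape the generated sentence? I found the HTML tag that holds the sentence once it has been generated, but the sentence is not there when I request the page.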

Here is my code.

random_sentence_webpage = 'https://randomwordgenerator.com/sentence.php'

# The HTML tag for the generated sentence
start_marker = '"support-sentence">'
end_marker = '</span>'

from urllib.request import urlopen, Request
headers = {'User-Agent': 'Chrome/81.0.4044.129'}
reg_url = random_sentence_webpage
req = Request(url=reg_url, headers=headers) 
html = urlopen(req).read()
html_text = html.decode('utf-8', 'backslashreplace')

starting_position = html_text.find(start_marker)
end_position = html_text.find(end_marker,starting_position)

random_generated_sentence = html_text[starting_position + len(start_marker):end_position]

print(random_generated_sentence)

When I run your code, I get a blob of HTML output.

</div>
</div>
<div class="container pt bottom_desc">
<div class="row">
<div class="col-md-6">
<p>If you're visiting this page, you're likely here because you're searching for a random sentence. Sometimes a random word just isn't enough, and that is where the random sentence generator comes into play. By inputting the desired number, you can make a list of as many random sentences as you want or need. Producing random sentences can be helpful in a number of different ways.</p>
<p>For writers, a random sentence can help them get their creative juices flowing. Since the topic of the sentence is completely unknown, it forces the writer to be creative when the sentence appears. There are a number of different ways a writer can use the random sentence for creativity. The most common way to use the sentence is to begin a story. Another option is to include it somewhere in the story. A much more difficult challenge is to use it to end a story. In any of these cases, it forces the writer to think creatively since they have no idea what sentence will appear from the tool.</p>
<p>For those writers who have writers' block, this can be an excellent way to take a step to crumbling those walls. By taking the writer away from the subject matter that is causing the block, a random sentence may allow them to see the project they're working on in a different light and perspective. Sometimes all it takes is to get that first sentence down to help break the block.</p>
<p>It can also be successfully used as a daily exercise to get writers to begin writing. Being shown a random sentence and using it to complete a paragraph each day can be an excellent way to begin any writing session.</p>
<p>Random sentences can also spur creativity in other types of projects being done. If you are trying to come up with a new concept, a new idea or a new product, a random sentence may help you find unique qualities you may not have considered. Trying to incorporate the sentence into your project can help you look at it in different and unexpected ways than you would normally on your own.</p>
<p>It can also be a fun way to surprise others. You might choose to share a random sentence on social media just to see what type of reaction it garners from others. It's an unexpected move that might create more conversation than a typical post or tweet.</p>
<p>These are just a few ways that one might use the random sentence generator for their benefit. If you're not sure if it will help in the way you want, the best course of action is to try it and see. Have several random sentences generated and you'll soon be able to see if they can help with your project.</p>
<p>Our goal is to make this tool as useful as possible. For anyone who uses this tool and comes up with a way we can improve it, we'd love to know your thoughts. Please contact us so we can consider adding your ideas to make the random sentence generator the best it can be.</p>
<div class="faq" id="faq" itemscope="" itemtype="https://schema.org/FAQPage"><h2 style="margin-bottom:25px">Frequently Asked Questions</h2>
<div itemscope="" itemprop="mainEntity" itemtype="https://schema.org/Question">
<h3 class="faq__title" itemprop="name">Are random sentences computer generated?</h3>
<div itemscope="" itemprop="acceptedAnswer" itemtype="https://schema.org/Answer">
<div itemprop="text"><p>No, the random sentences in our generator are not computer generated. We considered using computer generated sentences when building this tool, but found the results to be disappointing. Even though it took a lot of time, all the sentences in this generator were created by us.</p></div>
</div>
</div>
<div itemscope="" itemprop="mainEntity" itemtype="https://schema.org/Question">
<h3 class="faq__title" itemprop="name">Can I use these random sentences for my project?</h3>
<div itemscope="" itemprop="acceptedAnswer" itemtype="https://schema.org/Answer">
<div itemprop="text"><p>Yes! Feel free to use any of the random sentences for any project that you may be doing.</p></div>
</div>
</div>

I guess you want to extract these p tags and the random text inside them.

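You can use BeautifulSoup to parse your output. When I run your code, I get many random text strings embedded in <p> tags. You need to find the right path to those p tags and then get their text.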

Here is a demo. You will need to adapt it to your needs.

random_sentence_webpage = 'https://randomwordgenerator.com/sentence.php'

# The HTML tag for the generated sentence
start_marker = '"support-sentence">'
end_marker = '</span>'

from urllib.request import urlopen, Request
headers = {'User-Agent': 'Chrome/81.0.4044.129'}
reg_url = random_sentence_webpage
req = Request(url=reg_url, headers=headers) 
html = urlopen(req).read()
html_text = html.decode('utf-8', 'backslashreplace')
starting_position = html_text.find(start_marker)
end_position = html_text.find(end_marker,starting_position)

random_generated_sentence = html_text[starting_position + len(start_marker):end_position]

# print(random_generated_sentence)

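from bs4 import BeautifulSoup

# Parse the full page text rather than the sliced string, so the result
# does not depend on whether the markers above were found.
# The "lxml" parser requires the third-party lxml package.
soup = BeautifulSoup(html_text, features="lxml")

# Find the description columns and print the text of each paragraph.
block_ps = soup.find_all("div", {"class": "col-md-6"})
for block in block_ps:
    for p in block.find_all("p"):
        print(p.get_text())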
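
A side note on the slicing in the original code: str.find returns -1 when the marker is not found, and slicing with -1 does not raise an error, so a missing marker can silently yield a large chunk of the page instead of an empty result. A minimal sketch of a guarded version follows; the helper name extract_between and the sample strings are hypothetical and only there for illustration.

```python
def extract_between(text, start_marker, end_marker):
    """Return the text between the two markers, or None if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        # Marker absent: report it instead of slicing from a bogus index.
        return None
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        return None
    return text[start:end]


# Hypothetical sample strings, not the real page content.
rendered = '<span class="support-sentence">Hello there.</span>'
static = '<p>No generated sentence in this markup.</p>'

print(extract_between(rendered, '"support-sentence">', '</span>'))
print(extract_between(static, '"support-sentence">', '</span>'))
```

If the helper returns None for the page you downloaded, the sentence is not in the raw HTML and has to be obtained some other way, for example by rendering the page's JavaScript.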

You can find more details here: Using python Requests with js pages

But the short solution is to use requests_html:

from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://randomwordgenerator.com/sentence.php')
r.html.render() 
print(r.html.find(".support-sentence")[0].text)

Output:

Having no hair made him look even hairier.
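
The snippet above indexes [0] directly, which raises an IndexError if the selector matches nothing. Below is a slightly more defensive sketch of the same idea. The function names fetch_rendered_text and main are hypothetical, and the 20-second timeout is an arbitrary assumption. Note that the first call to render() downloads a Chromium build, so it can take a while.

```python
URL = "https://randomwordgenerator.com/sentence.php"


def fetch_rendered_text(session, url, selector, timeout=20):
    """Return the text of the first element matching selector after
    JavaScript rendering, or None if nothing matches."""
    response = session.get(url)
    # Execute the page's JavaScript in a headless browser.
    response.html.render(timeout=timeout)
    # first=True returns a single element, or None when there is no match.
    element = response.html.find(selector, first=True)
    return element.text if element is not None else None


def main():
    # Third-party dependency: pip install requests-html
    from requests_html import HTMLSession

    session = HTMLSession()
    try:
        print(fetch_rendered_text(session, URL, ".support-sentence"))
    finally:
        # Shut down the session and its headless browser.
        session.close()
```

Call main() to run it. Passing the session in as an argument keeps the lookup logic separate from the network setup, so it can be exercised with a stand-in session object.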