How do I get a post description on Instagram?

I'm trying to get the post description for each picture on Instagram, but I only get a small part of it. Can someone help me get the full description of a picture post?

import requests
from bs4 import BeautifulSoup
from selenium import webdriver

# ---------------- getting hrefs in posts ------------------ #
# Step 1
driver = webdriver.Chrome('/Users/jjcauton/Documents/python/chromedriver')
driver.get('https://www.instagram.com/addict_for_sneakers/')


hrefs = driver.find_elements_by_tag_name('a')
print(hrefs)
hrefs_elem = [elem.get_attribute('href') for elem in hrefs]
# Skip anchors without an href (get_attribute returns None for those)
hrefs_elem = [href for href in hrefs_elem if href and '/p/' in href]
print(hrefs_elem)

for href in hrefs_elem:
    driver.get(href)
    page = requests.get(href)
    soup = BeautifulSoup(page.content, 'lxml')
    page_contents = soup.title
    contents = page_contents.get_text()
    print(contents)

The result looks like this:

Boricua Adicto A Tenis on Instagram: “ Giveaway   Win a FREE pair of Adidas Yeezy 350 v2 "Yeshaya" (Winner Picks His or Her Size) by following the simple steps below.  Here’s…”

Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”

Boricua Adicto A Tenis on Instagram: “The the future of sneakers trading is here   Make money by buying shares, then selling them for more than what you paid   Start with only…”

Boricua Adicto A Tenis on Instagram: “What’s your favorite AJ11?”

Boricua Adicto A Tenis on Instagram: “ Giveaway   Win a FREE pair of Retro 1 Fearless  by following the simple steps below.  Here’s how you can win: 1️⃣ Follow:…”

Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”

Boricua Adicto A Tenis on Instagram: “Choose One!”

Boricua Adicto A Tenis on Instagram: “FREEGIVEAWAY  Win the ️red 1️⃣1️⃣ for FREE by following these steps:  Step 1️⃣. Follow them: @_jsole_ @wallkicksofficial @pr_sneaks23…”

Boricua Adicto A Tenis on Instagram: “What’s your favorite retro 4?”

Boricua Adicto A Tenis on Instagram: “ Giveaway   Win a FREE pair of Retro 1 Turbo Green by following the simple steps below.  Here’s how you can win: 1️⃣ Follow:…”

Boricua Adicto A Tenis on Instagram: “1,2,3,4,5,6,7,8,9 or 10?
#Tecatodetenis”

Boricua Adicto A Tenis on Instagram: “✨LAST CHANCE✨ ☁️CHOOSE YOUR FAVORITE SHOE☁️ ⠀ To Enter Simply: 1️⃣: Like This Picture 2️⃣: Follow  @Luisanglcordova @Hypedseason…”

As you can see, it only gives a small part of the picture's post description. I need the full description. Thanks!

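You're looking at the wrong tag. Instagram only puts the full text of a post inside a <script> tag, so returning all the <a> tags won't help you. You need to find the <script> tag that contains 'edge_media_to_caption'. That script tag is very long, but somewhere in it you will find the following (taken from the Instagram account /katyperry/):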

"edge_media_to_caption": {
                             "edges": [{
                                 "node": {
                                     "text": "Many people wonder how the pyramids were actually built... but me, I am in constant awe and wonder of how such a loving/kind/compassionate/supportive/talented/deeply spiritual/did I mention incredibly good looking/James Bond of a human being can actually exist in the flesh!\n\nThere\u2019s a reason why all animals and children run straight into his arms... It\u2019s his heart, so pure. I love you Orlando Jonathan Blanchard Copeland Bloom. Happiest 43rd year. \u2665\ufe0f\ud83c\udf82\u2660\ufe0f"
                                 }
                             }]
                         },
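One way to locate that tag is sketched below with BeautifulSoup. This is only an illustration: the helper name find_caption_script and the inline sample_html are made up for the example, and it assumes the page still ships the caption in a <script> tag in the form shown above. In the question's loop, the HTML would come from page.content instead of the sample string.

```python
from bs4 import BeautifulSoup

def find_caption_script(html):
    """Return the text of the first <script> containing the caption key, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all("script"):
        body = tag.string  # None when the tag has no single text child
        if body and "edge_media_to_caption" in body:
            return body
    return None

# Made-up stand-in for a real post page (hypothetical content).
sample_html = """
<html><head>
<script>var unrelated = 1;</script>
<script>{"edge_media_to_caption": {"edges": [{"node": {"text": "Full caption here"}}]}}</script>
</head></html>
"""

script_text = find_caption_script(sample_html)
print(script_text)
```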

With that text, you can extract the data by slicing the string, string[index1:index2], where the indices can be found with string.find("some value").
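A minimal sketch of that find-and-slice idea follows. The function name extract_caption and the sample string are hypothetical. Instead of searching for the closing quote by hand, it lets json.JSONDecoder.raw_decode read the string value that starts at the found index, which also turns escapes such as \n and \u2019 into real characters. It assumes the first "text" key after 'edge_media_to_caption' is the caption, which may not hold for posts without a caption.

```python
import json

MARKER = '"edge_media_to_caption"'
TEXT_KEY = '"text":'

def extract_caption(script_text):
    """Return the caption found after the marker, or None if it is absent."""
    marker_pos = script_text.find(MARKER)
    if marker_pos == -1:
        return None
    # Assumption: the first "text" key after the marker holds the caption.
    key_pos = script_text.find(TEXT_KEY, marker_pos)
    if key_pos == -1:
        return None
    # Move past the key and any whitespace to the opening quote of the value.
    value_start = key_pos + len(TEXT_KEY)
    while value_start < len(script_text) and script_text[value_start].isspace():
        value_start += 1
    # Parse exactly one JSON value starting at value_start.
    caption, _end = json.JSONDecoder().raw_decode(script_text, value_start)
    return caption

# Made-up sample in the same shape as the snippet above.
sample = (
    '{"edge_media_to_caption": {"edges": [{"node": '
    '{"text": "Line one\\n\\nIt\\u2019s line two"}}]}}'
)
print(extract_caption(sample))
```

Decoding the value as JSON avoids cutting the caption short when the text itself contains an escaped double quote, which a plain search for the next '"' would trip over.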