How to parse all <p> tags within a certain <div> tag?

I am using BeautifulSoup to parse some HTML pages. I want to get all the text inside the <p> tags under <div id="commentary">. (The original post linked a screenshot of the HTML I want to extract.)

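When I use find_all to get all the <p> tags, the list contains only the first one. I used the following code to count the number of <p> tags under the <div>. As the screenshot shows, the highlighted <div> contains about 19 <p> tags, yet my code still prints 1.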

content = soup.find('div', attrs={'class':'company-profile'})
points = content.find('div', attrs={'id':'commentary'})
count = 0
for point in points.find_all('p'):
    count = count + 1
print(count)
print(points.text)

I don't know why this happens, or why find_all does not return the complete list. I also tried printing all the text inside <div id="commentary"> with points.text, but it only prints the content of the first <p> tag.

(mlenv) chirag@debian10:~/ML/Finaments$ python main.py
<class 'bs4.element.Tag'>

State Bank of India is a Fortune 500 company. It is an Indian Multinational, Public Sector banking and financial services statutory body headquartered in Mumbai. It is the largest and oldest bank in India with over 200 years of history.#

1
1

Ratios (Q3FY21)
Capital Adequacy Ratio - 14.50%
Net Interest Margin - 3.34%
Gross NPA - 4.77%
Net NPA - 1.23%
CASA Ratio - 45.15%#




(mlenv) chirag@debian10:~/ML/Finaments$ ^C
(mlenv) chirag@debian10:~/ML/Finaments$ 

Those 1s come from print(count); after them, print(points.text) prints only the content of the first <p> tag. I have just started using BeautifulSoup, so please help.

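The page appears to load the rest of the commentary from a separate URL, which would explain why the HTML you parsed contains only the first <p>. You can request that URL directly. You will need to pass a valid cookie and CSRF token with the request, though: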

import requests
from bs4 import BeautifulSoup

url = 'https://www.screener.in/wiki/company/3188/commentary/'
# The token and cookie values below are tied to one browser session;
# replace them with the values from your own session.
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36',
    'referer': 'https://www.screener.in/company/SBIN/consolidated/',
    'x-csrftoken': 'E8zDjm7CtmSqCM2B9rTYPXTcPMJ22w2oynWzWzT4bCgAIaKkt4DmrirBSEPdCP0W',
    'cookie': '_gcl_au=1.1.69436223.1621345270; _ga=GA1.2.2056656539.1621345271; _gid=GA1.2.1452432592.1621345271; csrftoken=E8zDjm7CtmSqCM2B9rTYPXTcPMJ22w2oynWzWzT4bCgAIaKkt4DmrirBSEPdCP0W; sessionid=mrdcmrlqpe72dqjrqgtrb2m2v375sjv0; _gat_UA-2456523-7=1',
}

response = requests.post(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')

count = 0
for point in soup.find_all('p'):
    count = count + 1
print(count)
print(soup.text)

Output:

19

Ratios (Q3FY21)
Capital Adequacy Ratio - 14.50%
Net Interest Margin - 3.34%
Gross NPA - 4.77%
Net NPA - 1.23%
CASA Ratio - 45.15%#
Branch Network
Presently, the bank operates a network of 22,330 branches and ~58,000 ATMs across India. It also operates ~71,000 business correspondent outlets across India.#
Market Share
The bank has a market share of 22.84% in deposits and 19.69% share in advances in India. It has a strong customer base of ~45 crore customers.#
Loan Book
Retail loans account for 39% of the loan book, followed by corporate (37%), SME (14%) and Agriculture (10%).#
Retail Book - Home loans account for 68% of the retail book, followed by xpress credit (22%), auto loans (9%), personal gold loans (2%) and others (9%).#
Exposure
The bank has a well-diversified loan book exposed to various sectors. Top sectors include home loans (23%), infrastructure (15%), services (12%) and agriculture (10%).
~75% of the corporate advances are rated A and better ratings from rating agencies. 38% of the corporate book accounts for PSUs & Govt. departments.#
Segmental NPAs
Presently, the total NPAs of the bank stands at 1,17,244 crores. agriculture segment accounts for the major ratio of NPAs i.e. 13.71% of all loans are NPA. Corporate segment accounts for 59,400 crores worth of NPAs i.e. 51% of total NPAs of the bank.#
International Business
The bank has a global footprint with a network of 233 branches/offices in 32 countries.# It has  presence in USA, Canada, Brazil, Russia, Germany, France, Turkey, Australia, Bangladesh, Nepal, Sri Lanka and other countries.#
Presently, Overseas business accounts for 3% of total deposits# and 13% of total advances.#
Government Business
SBI has always been the banker of choice to the government of India and is the market leader in government business. It had turnover of ~52,50,000 lakh crores and commissions of ~3,700 crores from government business in FY20.#
Financial Inclusion Business
The bank has ~71,000 BC outlets which has primary focus on financial inclusion customers.# The bank accounts for 40% of all PMJDY accounts i.e. more than 12 crore accounts.# Presently, the deposits from PMJDY accounts are ~42,500 crores i.e. 1.2% of total deposits of the bank.
Digital Metrics
Increasing digitization resulted in ~40% of asset accounts and ~60% of liability customers added via digital channels in FY21.# 67% of all transactions were initiated through digital channels in 2020 which is up from 58% in the previous year.#
Subsidiaries Operations
The bank owns various subsidiaries which are engaged in related business activities :-
1. SBI Capital Markets Ltd (100% stake) -  SBICAP is a leading investment banker, offering investment banking and corporate advisory services to clients across three product categories i.e. project advisory and structured finance, equity capital markets and debt capital markets.
This company further has wholly owned subsidiaries in related businesses viz. SBICAP Securities, SBICAP Trustee Co., SBICAP Ventures & others.#
2. SBI DHFI Ltd (72% stake) - It is a primary dealer and supports the book building process and provide depth and liquidity to secondary markets in G-Sec. It also deals in money market instruments, non G-Sec debt instruments, amongst others.#
3. SBI Cards and Payment Services Ltd (69% stake) - It is a non-banking financial company that offers extensive credit card portfolio to individual cardholders and corporate clients. It has diversified customer acquisition network that enables to engage prospective customers across multiple channels.#
The IPO of SBI Cards was launched in March 2020 wherein the company sold ~13 crore equity shares for a consideration of ₹10,350 crores.#
4. SBI Life Insurance Co. Ltd (57.6% stake) - It is one of the leading life insurance company in India which offers a wide range of individual and group insurance solutions that meet various life stage needs of customers.#
5. SBI Funds Management Pvt Ltd (63% stake) - It is a JV between SBI and AMUNDI (France). It is an asset management company with the fastest CAGR of 33% as against industrial average of 14% in the last 3 years.#
6. SBI General Insurance Company Ltd (70% stake) - It is a general insurance company which focuses on profitable growth in banc-assurance channel along  with other distribution channels and line of businesses. It is first non-life insurance company in India to cross 6,000 crores in a decade of operations.#
Amalgamation of Associate Banks
In March 2017, the bank acquired its 5 associate state banks and Bharatiya Mahila Bank by allotting ~13.5 crore equity shares of SBI.#
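
To check that find_all itself is not the problem, here is a minimal offline sketch. Both HTML strings are made-up stand-ins (assumptions, not the real screener.in markup): one mimics a page where only the first paragraph is present under div#commentary, the other mimics the fragment returned by the commentary URL. The helper name paragraph_texts is also hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the HTML of the main page:
# only the first paragraph is present inside div#commentary.
initial_html = """
<div class="company-profile">
  <div id="commentary">
    <p>First paragraph.</p>
  </div>
</div>
"""

# Hypothetical stand-in for the fragment served by the commentary URL.
full_html = """
<p>First paragraph.</p>
<p>Second paragraph.</p>
<p>Third paragraph.</p>
"""


def paragraph_texts(html, container_id=None):
    """Return the text of every <p> in html, optionally limited to one div."""
    soup = BeautifulSoup(html, "html.parser")
    root = soup
    if container_id is not None:
        root = soup.find("div", attrs={"id": container_id})
        if root is None:
            # The container is missing from this HTML entirely.
            return []
    return [p.get_text(strip=True) for p in root.find_all("p")]


initial_paragraphs = paragraph_texts(initial_html, container_id="commentary")
full_paragraphs = paragraph_texts(full_html)

print(len(initial_paragraphs))
print(len(full_paragraphs))
```

find_all returns every <p> that exists in the string it was given, so a count of 1 usually means the other paragraphs were never in the downloaded HTML. Printing response.text (or len(response.text)) before parsing is a quick way to confirm that for the real page.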