How to split lines after web scraping?

Question

I'm having a problem. This is the website I'm scraping: https://tw.dictionary.search.yahoo.com/search;_ylt=AwrtXGvbWIJibWYAFCp9rolQ;_ylc=X1MDMTM1MTIwMDM4MQRfcgMyBGZyA3NmcARmcjIDc2ItdG9wBGdwcmlkAwRuX3JzbHQDMARuX3N1Z2cDMARvcmlnaW4DdHcuZGljdGlvbmFyeS5zZWFyY2gueWFob28uY29tBHBvcwMwBHBxc3RyAwRwcXN0cmwDMARxc3RybAM0BHF1ZXJ5A3RhcGUEdF9zdG1wAzE2NTI3MDk5NTM-?p=take&fr2=sb-top&fr=sfp . It's a web dictionary provided by Yahoo. What I'm trying to do is take the word you want translated as input and print the results as output.

import requests
from bs4 import BeautifulSoup

def searchdic():
  global d
  a = "https://tw.dictionary.search.yahoo.com/search;_ylt=AwrtXGvbWIJibWYAFCp9rolQ"
  b = ";_ylc=X1MDMTM1MTIwMDM4MQRfcgMyBGZyA3NmcARmcjIDc2ItdG9wBGdwcmlkAwRuX3JzbHQDMARuX3N1Z2cDMARvcmlnaW4DdHcuZGljdGlvbmFyeS5zZWFyY2gueWFob28uY29tBHBvcwMwBHBxc3RyAwRwcXN0cmwDMARxc3RybAM0BHF1ZXJ5A3RhcGUEdF9zdG1wAzE2NTI3MDk5NTM-?"
  c = "p="
  e = "&fr2=sb-top&fr=sfp"
  search = a+b+c+d+e
  print(search)

  resp = requests.get(search)
  soup = BeautifulSoup(resp.text, 'html.parser')
  #print(soup.find('','compList mb-25 p-rel'))
  
  if soup.find('','compList mb-25 p-rel') == None:
    print("Invalid query!")
  else:  
    print(soup.find('div','compList mb-25 p-rel').text)
    #divs = soup.find_all('div', 'compList mb-25 p-rel')
    #for div in divs:
      #print(f"{[s for s in div.stripped_strings]}""\n") 

def changechinesetourl():
  global d
  from urllib import parse
  # percent-encode the query so non-ASCII characters are valid in the URL
  d = parse.quote(d)
  searchdic()

def is_contains_chinese():
    global d
    for _char in d:
        if '\u4e00' <= _char <= '\u9fa5':
            return True
    return False

d = input("What do you want to translate: ")

# percent-encode the query only when it contains Chinese characters
if is_contains_chinese():
  changechinesetourl()
else:
  searchdic()

Here's what I have written. If you type "take", my output looks like this:

vt. 拿，取；握，抱；拿走，取走；夺取，占领；抓，捕；吸引 vi. （染料）被吸收，染上；依法获得财产 n. 一次拍摄的电影（电视）镜头[C]；捕获量；收获量；收入[S1]

and what I want to see is the output separated like this:

vt. 拿，取；握，抱；拿走，取走；夺取，占领；抓，捕；吸引

vi. （染料）被吸收，染上；依法获得财产

n. 一次拍摄的电影（电视）镜头[C]；捕获量；收获量；收入[S1]

I've tried to use

#divs = soup.find_all('div', 'compList mb-25 p-rel')
    #for div in divs:
      #print(f"{[s for s in div.stripped_strings]}""\n")

but the result is the same, only with [ at the beginning and ] at the end.

I'm not sure whether this is because the original page's HTML doesn't contain line breaks.

This is part of the original page source:

<div class="compList mb-25 p-rel" ><ul ><li class="lh-22 mh-22 mt-12 mb-12 mr-25"><div class=" pos_button fz-14 fl-l mr-12">vt.</div> <div class=" fz-16 fl-l dictionaryExplanation">拿，取；握，抱；拿走，取走；奪取，佔領；抓，捕；吸引</div> </li><li class="lh-22 mh-22 mt-12 mb-12 mr-25"><div class=" pos_button fz-14 fl-l mr-12">vi.</div> <div class=" fz-16 fl-l dictionaryExplanation">（染料）被吸收，染上；依法獲得財產</div> </li><li class="lh-22 mh-22 mt-12 mb-12 mr-25 last"><div class=" pos_button fz-14 fl-l mr-12">n.</div> <div class=" fz-16 fl-l dictionaryExplanation">一次拍攝的電影（電視）鏡頭[C]；捕獲量；收穫量；收入[S1]</div>

Answer 1

To format the text the way you want, I had to do this:

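# divs comes from the line that is commented out in the question
divs = soup.find_all('div', 'compList mb-25 p-rel')
for div in divs:
    lis = div.find_all('li')
    for li in lis:
        # each <li> holds one part of speech and its definitions
        print(li.text.replace('\n', ''))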
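The same idea can be tried offline, without hitting the site. Below is a minimal sketch that assumes the page still looks like the snippet quoted in the question. The `html` string is a shortened copy of that markup, with the closing tags added so it is well-formed. `format_entries` is a hypothetical helper name, not something from the original code. It uses `get_text(" ", strip=True)` on each `<li>`, so the part of speech and its definitions end up on one line.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup quoted in the question. The closing </ul></div>
# tags are added here (an assumption) so the sample is well-formed.
html = """
<div class="compList mb-25 p-rel"><ul>
<li class="lh-22 mh-22 mt-12 mb-12 mr-25"><div class=" pos_button fz-14 fl-l mr-12">vt.</div> <div class=" fz-16 fl-l dictionaryExplanation">拿，取；握，抱</div> </li>
<li class="lh-22 mh-22 mt-12 mb-12 mr-25"><div class=" pos_button fz-14 fl-l mr-12">vi.</div> <div class=" fz-16 fl-l dictionaryExplanation">（染料）被吸收，染上</div> </li>
<li class="lh-22 mh-22 mt-12 mb-12 mr-25 last"><div class=" pos_button fz-14 fl-l mr-12">n.</div> <div class=" fz-16 fl-l dictionaryExplanation">捕獲量；收穫量</div> </li>
</ul></div>
"""


def format_entries(markup):
    """Return one 'part-of-speech definitions' string per <li> (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    # class_="compList" matches any element whose class list contains compList
    box = soup.find("div", class_="compList")
    if box is None:
        return []
    lines = []
    for li in box.find_all("li"):
        # Join the text of the nested <div>s with a single space and
        # drop the whitespace-only strings between them.
        lines.append(li.get_text(" ", strip=True))
    return lines


entries = format_entries(html)
# A blank line between entries, as in the desired output from the question
print("\n\n".join(entries))
```

Iterating over the `<li>` elements is what produces the split: calling `.text` on the outer `div` concatenates every string inside it, so the boundaries between entries are lost.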

python

beautifulsoup