Web crawling problem: can't delete \n characters

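I'm using Python to scrape data from a website. Things were going well until I found that I couldn't merge all the processed lines into one. Here is my faulty code (I'm using Scrapy for the scraping):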

rep = response.xpath('/html/body/div[1]/div[2]/div[3]/div[{:d}]'.format(i)).get()
rep = rep.replace('<div class="d-flex justify-content-between search-result-line py-3 px-3">','')
rep = rep.replace('<div class="font-weight-bold">','')
rep = rep.replace('<span>','')
rep = rep.replace('</span>','')
rep = rep.replace('</div></div>',',')
rep = rep.replace('</div>','":')
rep = rep.replace('<div>','"')
rep.join(rep.split('\n'))

The raw input to that code:

<div class="search-result py-4 px-0 col-12 col-md-6 col-lg-5 mx-auto mt-4"><div class="font-weight-bold mb-3 px-3">Candidate number : <span class="student-id text-dc3545">33000001</span></div><div class="d-flex justify-content-between search-result-line py-3 px-3"><div>Math</div><div class="font-weight-bold">6.40</div></div><div class="d-flex justify-content-between search-result-line py-3 px-3"><div>Literature</div><div class="font-weight-bold">4.50</div></div><div class="d-flex justify-content-between search-result-line py-3 px-3"><div>History</div><div class="font-weight-bold">6.50</div></div><div class="d-flex justify-content-between search-result-line py-3 px-3"><div>Geography</div><div class="font-weight-bold">7.50</div></div><div class="d-flex justify-content-between search-result-line py-3 px-3"><div>Foreign language (<span>N1</span>)</div><div class="font-weight-bold">3</div></div><div class="d-flex justify-content-between search-result-line py-3 px-3"><div>Civic Education</div><div class="font-weight-bold">7.75</div></div></div>

What I expect from this code is: "Math":6.40,"Literature":4.50, etc. But this is what I actually get:

"Math":6.40,
"Literature":4.50,
etc.

Did I mess something up?
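The last line of the snippet is the likely culprit. Python strings are immutable, so `rep.join(rep.split('\n'))` returns a new string that is never assigned, and it also uses `rep` itself as the separator rather than an empty string. A minimal sketch of the fix, assuming `rep` looks like the output shown above (the sample value here is a stand-in, not the real scraped string):

```python
# Stand-in for the string produced by the chain of replace() calls.
rep = '"Math":6.40,\n"Literature":4.50,\n'

# join() returns a new string, so the result has to be assigned.
# Joining the pieces on the empty string drops the newlines.
merged = ''.join(rep.split('\n'))
print(merged)  # "Math":6.40,"Literature":4.50,

# replace() is an equivalent, slightly shorter way to do the same thing.
same = rep.replace('\n', '')
```

Either form works; the important part is assigning the result back to a variable.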

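An alternative to stripping tags with replace() is to select each result row with XPath and extract its text nodes, for example in scrapy shell: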

In [1]: courses = response.xpath('//div[contains(@class, "d-flex justify-content-between search-result-line py-3 px-3")]')

In [2]: for course in courses:
   ...:     data = course.xpath('.//text()').getall()
   ...:     data_str = ' '.join(data)
   ...:     print(data_str)
   ...:
Math 6.40
Literature 4.50
History 6.50
Geography 7.50
Foreign language ( N1 ) 3
Civic Education 7.75
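If the goal is a subject-to-score mapping rather than printed lines, the text nodes of each row can be turned into a dictionary. The sketch below is one possible approach, not the only one: `rows` is a hard-coded stand-in for what `course.xpath('.//text()').getall()` would return per row, and `row_to_pair` is a hypothetical helper name. It assumes the last text node of every row is the score.

```python
# Stand-in data: in scrapy shell this would be
# [course.xpath('.//text()').getall() for course in courses]
rows = [
    ['Math', '6.40'],
    ['Literature', '4.50'],
    ['Foreign language (', 'N1', ')', '3'],
]

def row_to_pair(texts):
    """Split one row's text nodes into (subject, score).

    Assumes the last non-empty text node is the score and
    everything before it belongs to the subject name.
    """
    parts = [t for t in texts if t.strip()]
    *name_parts, score = parts
    return ''.join(name_parts).strip(), float(score)

scores = dict(row_to_pair(r) for r in rows)
print(scores)
```

Joining the name parts on the empty string keeps "Foreign language (N1)" intact instead of inserting spaces around the `<span>` text.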