YouTube Subscriptions List Scraping
I want to scrape my YouTube subscriptions list into a csv file. I typed this code (but I haven't finished coding it yet):
import requests
from bs4 import BeautifulSoup
import csv
url = 'https://www.youtube.com/feed/channels'
source = requests.get(url)
soup = BeautifulSoup(source, 'lxml')
I got this error:
File "/Users/hendy/YouTube subscriptions scraping.py", line 7, in <module>
    soup = BeautifulSoup(source, 'lxml')
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/bs4/__init__.py", line 312, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'Response' has no len()
I don't know what the problem is.

What happens?
You are passing the entire Response object to BeautifulSoup, which expects markup (a string or bytes), so nothing works.
How to fix it?

To build the BeautifulSoup object, use your response's content or text instead:
BeautifulSoup(source.content, 'lxml')
Example
from bs4 import BeautifulSoup
import requests

# A browser-like User-Agent makes the request look less like a bot
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'}
url = 'https://www.youtube.com/feed/channels'
source = requests.get(url, headers=headers)
# Pass .content (bytes) or .text (str), not the Response object itself
soup = BeautifulSoup(source.content, 'lxml')
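To finish the original goal of saving the list to a csv file, the parsed soup can be written out with the csv module. A caveat: the real /feed/channels page requires a logged-in session and is rendered with JavaScript, so a plain requests call will likely not contain the subscriptions; the HTML snippet and the `a.channel` selector below are stand-ins for illustration only.

```python
import csv
from bs4 import BeautifulSoup

# Stand-in HTML; the real page needs an authenticated, JS-rendered session.
html = b"""
<div id="subs">
  <a class="channel" href="/c/ChannelOne">Channel One</a>
  <a class="channel" href="/c/ChannelTwo">Channel Two</a>
</div>
"""

# bytes parse fine, just like response.content would.
# 'html.parser' is the stdlib parser; 'lxml' also works if installed.
soup = BeautifulSoup(html, 'html.parser')

with open('subscriptions.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['name', 'url'])
    for a in soup.select('a.channel'):  # hypothetical selector
        writer.writerow([a.get_text(strip=True), a['href']])
```

`newline=''` is the csv module's recommended way to open the output file, so it controls line endings itself.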