Access LinkedIn Profile with Python
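I am trying to access my own LinkedIn profile via the API in order to download my own posts. There are three recent Python wrappers that should give access to my profile: linkedin-sdk, pawl, and LinkedIn V2. However, I have been unable to make any of them work. The problem is the authentication. I have seen the well-known LinkedIn-API wrapper, but its authentication process is complex, possibly because LinkedIn has changed its authentication process and access scopes.
Based on this tutorial from last year, I have been able to access my own profile and see my name, country, language, and ID.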
import requests

# Step 1 - build the authorization URL; opening it in a browser yields the authorization code
def get_auth_link():
    URL = "https://www.linkedin.com/oauth/v2/authorization"
    client_id = 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    scope = 'r_liteprofile'
    PARAMS = {'response_type': 'code', 'client_id': client_id, 'redirect_uri': redirect_uri, 'scope': scope}
    r = requests.get(url=URL, params=PARAMS)
    return_url = r.url
    print('Please copy the URL and paste it in browser for getting authentication code')
    print('')
    print(return_url)

get_auth_link()
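The authorization link is just the endpoint plus URL-encoded query parameters, so it can also be built without sending a request. A minimal sketch using only the standard library; the function name `build_auth_url` and the `state` value (a random token that OAuth 2.0 lets the client send and compare on return) are additions for illustration, not part of the code above:

```python
import secrets
from urllib.parse import urlencode

AUTH_URL = "https://www.linkedin.com/oauth/v2/authorization"

def build_auth_url(client_id, redirect_uri, scope, state=None):
    """Return (url, state) for the authorization step of the OAuth 2.0 code flow."""
    # Assumption: a random state value is added so the redirect can be checked later
    state = state or secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{AUTH_URL}?{urlencode(params)}", state

url, state = build_auth_url("XXXX", "http://localhost:8080/login", "r_liteprofile")
print(url)
```

The printed URL is the one to open in the browser; keep `state` around to compare against the value LinkedIn sends back in the redirect.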
# Step 2 - make a POST request to exchange the authorization code for an access token
def get_access_token():
    headers = {'Content-Type': 'application/x-www-form-urlencoded', 'User-Agent': 'OAuth gem v0.4.4'}
    AUTH_CODE = 'XXXX'  # the "code" query parameter from the redirect URL
    ACCESS_TOKEN_URL = 'https://www.linkedin.com/oauth/v2/accessToken'
    client_id = 'XXXX'
    client_secret = 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    PARAM = {'grant_type': 'authorization_code',
             'code': AUTH_CODE,
             'redirect_uri': redirect_uri,
             'client_id': client_id,
             'client_secret': client_secret}
    response = requests.post(ACCESS_TOKEN_URL, data=PARAM, headers=headers, timeout=600)
    data = response.json()
    print(data)
    access_token = data['access_token']
    return access_token

get_access_token()
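The `AUTH_CODE` used above has to be copied out of the URL the browser is redirected to after login. A small sketch of how that could be done programmatically; the helper name `extract_auth_code` and the sample redirect URL are made up for illustration, and the `error` / `error_description` query parameters follow the OAuth 2.0 convention for failed authorizations:

```python
from urllib.parse import urlsplit, parse_qs

def extract_auth_code(redirect_url, expected_state=None):
    """Pull the authorization code out of the URL the browser was redirected to."""
    query = parse_qs(urlsplit(redirect_url).query)
    if "error" in query:
        # A failed authorization carries an error code instead of "code"
        description = query.get("error_description", [""])[0]
        raise ValueError(f"{query['error'][0]}: {description}")
    if expected_state is not None and query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; the redirect may not belong to this request")
    if "code" not in query:
        raise ValueError("no authorization code in redirect URL")
    return query["code"][0]

# Hypothetical redirect URL, made up for illustration
print(extract_auth_code("http://localhost:8080/login?code=AQT123&state=abc", "abc"))
```

Raising on the `error` parameter surfaces problems such as an unauthorized scope before any token exchange is attempted.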
# Step 3 - call the profile endpoint with the access token
access_token = 'XXXX'

def get_profile(access_token):
    URL = "https://api.linkedin.com/v2/me"
    headers = {'Content-Type': 'application/x-www-form-urlencoded',
               'Authorization': 'Bearer {}'.format(access_token),
               'X-Restli-Protocol-Version': '2.0.0'}
    response = requests.get(url=URL, headers=headers)
    print(response.json())

get_profile(access_token)
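As soon as I change the scope from r_liteprofile to r_basicprofile, I get unauthorized_scope_error: r_basicprofile is not authorized for your application. On my developer page, I have authorized the scopes r_emailaddress, r_liteprofile, and w_member_social, but only r_liteprofile works. As I understand the LinkedIn documentation, comments cannot be downloaded?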
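One way to catch this error before opening the browser is to compare the requested scopes with the ones listed on the developer page. A minimal sketch; the helper name `unauthorized_scopes` is made up, and the authorized set is simply the three scopes named above:

```python
# Scopes listed as authorized on the developer page in the question
AUTHORIZED_SCOPES = {"r_emailaddress", "r_liteprofile", "w_member_social"}

def unauthorized_scopes(requested, authorized=AUTHORIZED_SCOPES):
    """Return the requested scopes that the app has not been granted, sorted."""
    # OAuth 2.0 scope strings are space-delimited
    return sorted(set(requested.split()) - set(authorized))

print(unauthorized_scopes("r_liteprofile r_basicprofile"))  # ['r_basicprofile']
```

An empty list means every requested scope is on the authorized list; it does not guarantee that the scope grants access to the data you want.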
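So the real question is: can comments be downloaded through the API?
Bots or scrapers are not an option, because they require LinkedIn's explicit permission, which I do not have.
Update: So please, no illegal solutions. I knew they existed before writing this post.
Thanks for your help!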
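I found that using linkedin-api by tomquirk was really easy. However, a KeyError was raised when a post does not have any comments. I fixed it in a fork and submitted a pull request. If you install the fork with python setup.py install, the following code will fetch all of your posts along with their comments: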
from linkedin_api import Linkedin
import getpass

print("Please enter your LinkedIn credentials first (2FA must be disabled)")
username = input("user: ")
password = getpass.getpass('password: ')
api = Linkedin(username, password)

# Look up your own public identifier, then fetch your posts
my_public_id = api.get_user_profile()['miniProfile']['publicIdentifier']
my_posts = api.get_profile_posts(public_id=my_public_id)

for post in my_posts:
    # The post ID is the last segment of the URN
    post_urn = post['socialDetail']['urn'].rsplit(':', 1)[1]
    print('POST:' + post_urn + '\n')

    comments = api.get_post_comments(post_urn, comment_count=100)
    for comment in comments:
        commenter = comment['commenter']['com.linkedin.voyager.feed.MemberActor']['miniProfile']
        print(f"\t{commenter['firstName']} {commenter['lastName']}: {comment['comment']['values'][0]['value']}\n")
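Because the nested lookups above raise a KeyError as soon as one level is missing, it can help to read the comment dictionaries defensively. A sketch under the assumption that comments have the shape indexed in the loop above; the helper name `summarize_comment` and the sample data are made up for illustration:

```python
ACTOR_KEY = "com.linkedin.voyager.feed.MemberActor"

def summarize_comment(comment):
    """Return (commenter name, text) from a comment dict, tolerating missing keys."""
    mini = comment.get("commenter", {}).get(ACTOR_KEY, {}).get("miniProfile", {})
    name = f"{mini.get('firstName', '')} {mini.get('lastName', '')}".strip() or "(unknown)"
    values = comment.get("comment", {}).get("values", [])
    text = values[0].get("value", "") if values else ""
    return name, text

# Made-up sample shaped like the dictionaries used in the loop above
sample = {
    "commenter": {ACTOR_KEY: {"miniProfile": {"firstName": "Ada", "lastName": "Lovelace"}}},
    "comment": {"values": [{"value": "Nice post!"}]},
}
print(summarize_comment(sample))
print(summarize_comment({}))
```

A comment from a non-member actor (for example a company page) would fall through to "(unknown)" here rather than stopping the whole download.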
Note: this does not use the official API, and according to the README.md:
This project violates Linkedin's User Agreement Section 8.2, and because of this, Linkedin may (and will) temporarily or permanently ban your account.
However, as long as you only scrape comments from your own account, it should be fine.
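There are two legal options for downloading comments without violating LinkedIn's terms and conditions. Both require permission from LinkedIn.
Option A: Comment API
The Comment API is part of the Page Management APIs, which in turn are part of the Marketing Developer Program (MDP). LinkedIn describes the application process for its Marketing Developer Program here. It requires filling out a form that specifies the use case. LinkedIn then decides whether to grant access. These use cases will be restricted or not approved.
Option B: Web scraping and crawling LinkedIn with an exemption (whitelisting)
The exemption process is described here.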
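For option A, once access has been granted, a request for the comments on a post might be built roughly as follows. This is a sketch under assumptions: the endpoint shape `/v2/socialActions/{URN}/comments` is my reading of LinkedIn's Social Actions documentation and should be checked against the current docs, the helper name `comments_request` and the sample URN are made up, and nothing is actually sent:

```python
from urllib.parse import quote

API_BASE = "https://api.linkedin.com/v2"

def comments_request(post_urn, access_token):
    """Build the URL and headers for listing comments on a share or post (nothing is sent)."""
    # Assumed endpoint shape: /socialActions/{URL-encoded URN}/comments
    url = f"{API_BASE}/socialActions/{quote(post_urn, safe='')}/comments"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "X-Restli-Protocol-Version": "2.0.0",
    }
    return url, headers

# Hypothetical URN, made up for illustration
url, headers = comments_request("urn:li:share:1234567890", "XXXX")
print(url)
```

The returned pair can be passed to `requests.get(url, headers=headers)` with a token whose scopes cover the Comment API.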
I chose option A. Let's see whether they grant me access. I will update the post accordingly.
Update 19/05/2022
LinkedIn has granted access to the MDP. It took about 2 weeks.
Update 27/05/2022
Here is a great tutorial to get individual posts. Getting company page posts is another story, so opened a new