Access LinkedIn Profile with Python

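I am trying to access my own LinkedIn profile via the API in order to download my own posts. There are three recent Python wrappers that claim to access a profile, for example linkedin-sdk, pawl, and LinkedIn V2. However, I have been unable to make them work. The problem is the authentication. I have seen the famous LinkedIn-API wrapper, but its authentication process is complex, possibly because LinkedIn changed its authentication process and access scopes.

Based on this tutorial from last year, I have been able to access my own profile to see my name, country, language, and ID.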

import requests

# Goal: obtain an access_token via the OAuth 2.0 authorization code flow
# Step 1 - build the authorization URL that has to be opened in a browser
def get_auth_link():
    URL = "https://www.linkedin.com/oauth/v2/authorization"
    client_id= 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    scope='r_liteprofile'
    PARAMS = {'response_type':'code', 'client_id':client_id,  'redirect_uri':redirect_uri, 'scope':scope}
    r = requests.get(url = URL, params = PARAMS)
    return_url = r.url
    print('Please copy the URL and paste it in browser for getting authentication code')
    print('')
    print(return_url)

get_auth_link()

# Make a POST request to exchange the Authorization Code for an Access Token
import json

def get_access_token():
    headers = {'Content-Type': 'application/x-www-form-urlencoded', 'User-Agent': 'OAuth gem v0.4.4'}
    AUTH_CODE = 'XXXX'
    ACCESS_TOKEN_URL = 'https://www.linkedin.com/oauth/v2/accessToken'
    client_id= 'XXXX'
    client_secret= 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    PARAM = {'grant_type': 'authorization_code',
      'code': AUTH_CODE,
      'redirect_uri': redirect_uri,
      'client_id': client_id,
      'client_secret': client_secret}
    response = requests.post(ACCESS_TOKEN_URL, data=PARAM, headers=headers, timeout=600)
    data = response.json()
    print(data)
    access_token = data['access_token']
    return access_token

get_access_token()

access_token = 'XXXX'

def get_profile(access_token):
    URL = "https://api.linkedin.com/v2/me"
    headers = {'Content-Type': 'application/x-www-form-urlencoded','Authorization':'Bearer {}'.format(access_token),'X-Restli-Protocol-Version':'2.0.0'}
    response = requests.get(url=URL, headers=headers)
    print(response.json())

get_profile(access_token)
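
The code above leaves one step manual: after approving the app in the browser, LinkedIn redirects to redirect_uri with the authorization code in the query string, and that value has to be pasted into AUTH_CODE. A minimal sketch of pulling the code out of the redirected URL with the standard library follows. The function name `extract_auth_code` and the example URL are hypothetical and only illustrate the idea.

```python
from urllib.parse import urlparse, parse_qs

def extract_auth_code(redirected_url):
    """Return the `code` query parameter of the URL the browser was redirected to."""
    query = parse_qs(urlparse(redirected_url).query)
    codes = query.get('code')
    if not codes:
        raise ValueError('no authorization code found in the redirect URL')
    return codes[0]

# Hypothetical redirect URL, as copied from the browser address bar
example_url = 'http://localhost:8080/login?code=AQT123abc'
print(extract_auth_code(example_url))
```

The returned string can then be used in place of the hard-coded AUTH_CODE in get_access_token().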

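As soon as I change the scope from r_liteprofile to r_basicprofile, I get unauthorized_scope_error: r_basicprofile is not authorized for your application. On my developer page, the scopes r_emailaddress, r_liteprofile, and w_member_social are authorized, but only r_liteprofile works. As far as I understand the LinkedIn documentation, comments cannot be downloaded?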
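
For reference, a rough sketch of how several scopes can be requested at once and how an error such as unauthorized_scope_error can be read back from the redirect. It assumes the usual OAuth 2.0 conventions: scopes are sent as one space-delimited string, and a failed authorization comes back as `error` and `error_description` query parameters. The helper names and the example redirect URL are hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs, quote

AUTH_URL = 'https://www.linkedin.com/oauth/v2/authorization'

def build_auth_url(client_id, redirect_uri, scopes):
    """Build the authorization URL locally; scopes are joined with spaces."""
    params = {
        'response_type': 'code',
        'client_id': client_id,
        'redirect_uri': redirect_uri,
        'scope': ' '.join(scopes),
    }
    # quote_via=quote encodes the spaces as %20 instead of '+'
    return AUTH_URL + '?' + urlencode(params, quote_via=quote)

def oauth_error(redirected_url):
    """Return (error, description) if the redirect reports an error, else None."""
    query = parse_qs(urlparse(redirected_url).query)
    if 'error' not in query:
        return None
    return query['error'][0], query.get('error_description', [''])[0]

print(build_auth_url('XXXX', 'http://localhost:8080/login',
                     ['r_liteprofile', 'r_emailaddress', 'w_member_social']))

# Hypothetical redirect after requesting a scope the app is not authorized for
failed = ('http://localhost:8080/login?error=unauthorized_scope_error'
          '&error_description=r_basicprofile+is+not+authorized+for+your+application')
print(oauth_error(failed))
```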

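So the real big question is: can comments be downloaded via the API?

Bots or scrapers are not an option, because they require explicit permission from LinkedIn, which I do not have.

Update: So please, no illegal solutions. I knew they existed before I wrote this post.

Thanks for your help!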

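I found that using linkedin-api by tomquirk was really easy. However, a KeyError was raised when a post does not have any comment. I fixed it in a fork and submitted a pull request. If you install the fork with python setup.py install, the following code will fetch all of your posts together with their comments: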

from linkedin_api import Linkedin
import getpass

print("Please enter your LinkedIn credentials first (2FA must be disabled)")
username = input("user: ")
password = getpass.getpass('password: ')

api = Linkedin(username, password)

my_public_id = api.get_user_profile()['miniProfile']['publicIdentifier']

my_posts = api.get_profile_posts(public_id=my_public_id)
for post in my_posts:
    post_urn = post['socialDetail']['urn'].rsplit(':', 1)[1]
    print('POST:' + post_urn + '\n')
    comments = api.get_post_comments(post_urn, comment_count=100)
    for comment in comments:
        commenter = comment['commenter']['com.linkedin.voyager.feed.MemberActor']['miniProfile']
        print(f"\t{commenter['firstName']} {commenter['lastName']}: {comment['comment']['values'][0]['value']}\n")
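
The loop above indexes several levels deep into nested dictionaries, which is the kind of access that raises a KeyError when a field is missing. A minimal sketch of a defensive lookup is shown here. `dig` is a hypothetical helper, not part of linkedin-api and not the fix from the fork, and the sample record only mirrors the keys used in the code above.

```python
def dig(data, *keys, default=None):
    """Follow keys through nested dicts and lists; return default if a step is missing."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current

# Hypothetical comment record containing only the keys used above
ACTOR = 'com.linkedin.voyager.feed.MemberActor'
comment = {
    'commenter': {ACTOR: {'miniProfile': {'firstName': 'Ada', 'lastName': 'Lovelace'}}},
    'comment': {'values': [{'value': 'Nice post!'}]},
}

print(dig(comment, 'commenter', ACTOR, 'miniProfile', 'firstName'))
print(dig(comment, 'comment', 'values', 0, 'value'))
# A record without any comment text falls back to the default instead of raising
print(dig({}, 'comment', 'values', 0, 'value', default='<no text>'))
```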

Note: this does not use the official API, and according to the README.md:

This project violates Linkedin's User Agreement Section 8.2, and because of this, Linkedin may (and will) temporarily or permanently ban your account.

However, it should be fine as long as you only scrape comments from your own account.

There are two legal options for downloading comments that do not violate LinkedIn's terms and conditions. Both require permission from LinkedIn.

Option A: Comment API

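The Comment API is part of the Page Management APIs, which in turn are part of the Marketing Developer Program (MDP). LinkedIn describes the application process for its Marketing Developer Program here. It requires filling out a form that specifies the use case. LinkedIn then decides whether to grant access. LinkedIn also lists use cases that will be restricted or not approved.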
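
Once access has been granted, fetching the comments of a post might look roughly like the sketch below. The endpoint path (`/v2/socialActions/{urn}/comments`) is an assumption based on how LinkedIn's Comment API documentation described it, so check the current documentation before relying on it. The function name and the example URN are hypothetical. The sketch only builds the request, so nothing is sent.

```python
from urllib.parse import quote

API_ROOT = 'https://api.linkedin.com/v2'

def build_comments_request(post_urn, access_token):
    """Return (url, headers) for a GET on a post's comments (assumed endpoint)."""
    # The URN contains colons, so it is percent-encoded before going into the path
    url = f"{API_ROOT}/socialActions/{quote(post_urn, safe='')}/comments"
    headers = {
        'Authorization': f'Bearer {access_token}',
        'X-Restli-Protocol-Version': '2.0.0',
    }
    return url, headers

# Hypothetical share URN and placeholder token
url, headers = build_comments_request('urn:li:share:123', 'XXXX')
print(url)
```

The pair can then be passed to requests.get(url, headers=headers), in the same way as get_profile() in the question.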



Option B: Web scraping and crawling LinkedIn with an exemption (whitelisting)
The exemption process is described here.

I went with option A. Let's see whether they grant me access. I will update this post accordingly.

Update 19/05/2022
LinkedIn has granted access to the MDP. It took about 2 weeks.

Update 27/05/2022
Here is a great tutorial to get individual posts. Getting company page posts is another story, so opened a new