加载 csv 并将每个值传递给 url 参数？

Question

我正在尝试汇总来自 nba.com 的 NBA 球员数据。我在 csv 中有一个玩家 ID 列表。我想加载每个玩家 ID 并将其传递到参数列表中。然后使用requests.get中的参数列表。当我直接输入玩家 ID 作为参数时，request.get 起作用。并且通过函数循环的 csv 测试似乎有效。但是，我无法将 playerids 成功传递到参数列表中。我试过查看类似的 nba python 代码，但看不出哪里出错了。

'''

import pandas as pd
import requests
import csv

best_db=pd.DataFrame()


def table_Scrape():
    
    global best_db 
    
    with open("SHORT_ID_plyr.csv", "r") as f_urls: 
        f_urls_list = csv.reader(f_urls, delimiter=',') 
        next(f_urls_list)    
        
        ##step5. open 1st url from .csv
        for lines in f_urls_list:        
            u = lines[0]
            print(u) #<loop test
            # import requests
            player_id = u
            
            url = """
            http://stats.nba.com/stats/playergamelogs?DateFrom=&DateTo=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerID=203932&PlusMinus=N&Rank=N&Season=2021-22&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&VsConference=&VsDivision=
            """
                #url = """
                #https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2021-22&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&TwoWay=0&VsConference=&VsDivision=&Weight=
                #"""
            header_dict = {
                'User-Agent': 'Mozilla/5.0',
                'x-nba-stats-origin': 'stats',
                'x-nba-stats-token': 'true',
                'Referer': 'https://stats.nba.com',
                'Connection': 'keep-alive',
                'Pragma': 'no-cache',
                'Cache-Control': 'no-cache',
                'Host': 'stats.nba.com'
            }

            params = {
                'LastNGames': '0',
                'LeagueID': '00',
                'MeasureType': 'Base',
                'Month': '0',
                'OpponentTeamID': '0',
                'PORound': '0',
                'PaceAdjust': 'N',
                'PerMode': 'Totals',
                'Period': '0',
                'PlayerID': u,
                'PlusMinus': 'N',
                'Rank': 'N',
                'Season': '2021-22',
                'SeasonType': 'Regular+Season'
                }

            res = requests.get(url, headers=header_dict, params=params)
            json_set = res.json()
            headers = json_set['resultSets'][0]['headers']
            data_set = json_set['resultSets'][0]['rowSet']
            df = pd.DataFrame(columns=headers)
            df.head #test the dataframe NOTE: does not appear to be working either
            
table_Scrape() #call the function

Answer 1

您正在将有效载荷传递到一个 url 上，该 url 已经具有带有硬编码播放器 ID 的有效载荷。将 palyer_id 变量放入 url

import pandas as pd
import requests


best_db=pd.DataFrame()


def table_Scrape():
    
    df = pd.read_csv('SHORT_ID_plyr.csv')
    player_id_list = list(df['PLAYER_ID'])
        
    for player_id in player_id_list:
        print(player_id)
        url = f"""http://stats.nba.com/stats/playergamelogs?DateFrom=&DateTo=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerID={player_id}&PlusMinus=N&Rank=N&Season=2021-22&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&VsConference=&VsDivision="""

        header_dict = {
            'User-Agent': 'Mozilla/5.0',
            'x-nba-stats-origin': 'stats',
            'x-nba-stats-token': 'true',
            'Referer': 'https://stats.nba.com',
            'Connection': 'keep-alive',
            'Pragma': 'no-cache',
            'Cache-Control': 'no-cache',
            'Host': 'stats.nba.com'
        }

        res = requests.get(url, headers=header_dict)
        print(res.text)
        json_set = res.json()
        headers = json_set['resultSets'][0]['headers']
        data_set = json_set['resultSets'][0]['rowSet']
        df = pd.DataFrame(data_set,columns=headers)
        print(df.head()) #test the dataframe NOTE: does not appear to be working either
            
table_Scrape() #call the function

加载 csv 并将每个值传递给 url 参数？

Load csv and pass each value into a url parameter?

python

csv

parameter-passing

web-scraping

pandas