Pyplot scatter name not defined
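I scraped data from a web page and now I want to visualize it. When I try to draw a scatter plot, I get the error "NameError: name 'x' is not defined" at plt.scatter(data[x],data[y]). I have gone through the scraping code, the scraped data, and my own plotting code, but I cannot see why x and y do not work. Is there a fix?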
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
from mplsoccer.pitch import Pitch
from pandas.core.indexes.base import Index
text_color = 'w'
data = pd.read_csv(#filename)
fig, ax = plt.subplots(figsize=(13,8.5)) # creates the figure
fig.set_facecolor('#22312b')
ax.patch.set_facecolor('#22312b')
pitch = Pitch(pitch_color='#aabb97', line_color='white')
pitch.draw(ax=ax)
plt.scatter(data[x],data[y])
The CSV file I read the data from is produced by this script:
import requests
from bs4 import BeautifulSoup
import json
import pandas as pd
base_url = 'https://understat.com/match/'
match = input('Please enter the match id: ')
url = base_url + match
res = requests.get(url)
soup = BeautifulSoup(res.content, 'lxml')
scripts = soup.find_all('script')
strings = scripts[1].string
ind_start = strings.index("('")+2
ind_end = strings.index("')")
json_data = strings[ind_start:ind_end]
json_data = json_data.encode('utf8').decode('unicode_escape')
data = json.loads(json_data)
team = []
minute = []
xg = []
result = []
x = []
y = []
situation = []
player = []
data_away = data['a']
data_home = data['h']
for index in range(len(data_home)):
    for key in data_home[index]:
        if key == 'X':
            x.append(data_home[index][key])
        if key == 'Y':
            y.append(data_home[index][key])
        if key == 'xG':
            xg.append(data_home[index][key])
        if key == 'h_team':
            team.append(data_home[index][key])
        if key == 'result':
            result.append(data_home[index][key])
        if key == 'situation':
            situation.append(data_home[index][key])
        if key == 'minute':
            minute.append(data_home[index][key])
        if key == 'player':
            player.append(data_home[index][key])
for index in range(len(data_away)):
    for key in data_away[index]:
        if key == 'X':
            x.append(data_away[index][key])
        if key == 'Y':
            y.append(data_away[index][key])
        if key == 'xG':
            xg.append(data_away[index][key])
        if key == 'a_team':
            team.append(data_away[index][key])
        if key == 'result':
            result.append(data_away[index][key])
        if key == 'situation':
            situation.append(data_away[index][key])
        if key == 'minute':
            minute.append(data_away[index][key])
        if key == 'player':
            player.append(data_away[index][key])
col_names = ['Minute','Player','Situation','Team','xG','Result','x-coordinate','y-coordinate']
df = pd.DataFrame([minute,player,situation,team,xg,result,x,y], index=col_names)
df.to_csv('shotmaps.csv', encoding='utf-8')
df = df.T
This is my dataframe:
Unnamed: 0 0 1 ... 30 31 32
0 Minute 8 10 ... 78 79 86
1 Player Cristiano Ronaldo Cristiano Ronaldo ... Allan Saint-Maximin Joe Willock Joelinton
2 Situation OpenPlay OpenPlay ... OpenPlay FromCorner OpenPlay
3 Team Manchester United Manchester United ... Newcastle United Newcastle United Newcastle United
4 xG 0.05710771679878235 0.03967716544866562 ... 0.020885728299617767 0.013165773823857307 0.05987533554434776
5 Result MissedShots MissedShots ... BlockedShot SavedShot MissedShots
6 x-coordinate 0.9780000305175781 0.9719999694824218 ... 0.7390000152587891 0.705999984741211 0.9119999694824219
7 y-coordinate 0.33799999237060546 0.72 ... 0.47900001525878905 0.4640000152587891 0.5929999923706055
Error message:
File "C:\Users\#name\AppData\Local\Programs\PythonCodingPack\lib\site-packages\pandas\core\indexes\base.py", line 2889, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 97, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'x-coordinate'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "#filename", line 19, in <module>
plt.scatter(data['x-coordinate'],data['y-coordinate'])
File "#filename", line 2899, in __getitem__
indexer = self.columns.get_loc(key)
File "#filename", line 2891, in get_loc
raise KeyError(key) from err
KeyError: 'x-coordinate'
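Just because a variable is defined in one file that you run does not mean it is automatically available in another file that you run later. You have to pass the values across somehow, for example by having the second file import the first and call a function in it that returns the values you are looking for.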
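A minimal sketch of that idea follows. The module name scraper.py and the function name get_coordinates are hypothetical (neither appears in the original scripts), and the coordinates are placeholder values. Both halves are shown in one block so that it runs on its own.

```python
# Contents of a hypothetical scraper.py: wrap the result in a function
# instead of leaving x and y as loose module-level variables.
def get_coordinates():
    """Return the shot coordinates as two lists (placeholder values only)."""
    x = [0.978, 0.972]
    y = [0.338, 0.72]
    return x, y


# In the plotting file you would write:
#     from scraper import get_coordinates
# Here both parts live in one file, so the function is called directly.
x, y = get_coordinates()
print(x, y)
```

With the values returned explicitly, the plotting file no longer depends on names that only existed while the scraping script was running.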
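However, the solution to this particular problem is much simpler. In your plotting file, just change
plt.scatter(data[x],data[y])
to
plt.scatter(data["x-coordinate"],data["y-coordinate"])
This selects the data by the dataframe's column names, which is what you want. Without quotes, x and y are read as Python variables, and no such variables exist in the plotting file.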
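To illustrate the difference between a bare name and a quoted column label, here is a small sketch that assumes only pandas. The in-memory CSV text is a stand-in for the real file and holds two abbreviated rows.

```python
import io

import pandas as pd

# Stand-in for the CSV file: two rows, one column per field.
csv_text = """Minute,Player,x-coordinate,y-coordinate
13,Roberto Firmino,0.774,0.43
13,Andrew Robertson,0.883,0.688
"""
data = pd.read_csv(io.StringIO(csv_text))

# data[x] asks Python for a variable called x, which does not exist here.
try:
    data[x]
    outcome = "no error"
except NameError:
    outcome = "NameError"

# A quoted string is looked up among the column labels instead.
xs = data["x-coordinate"]
ys = data["y-coordinate"]
print(outcome, list(xs), list(ys))
```

The two Series, xs and ys, are what would be handed to plt.scatter.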
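Edit
The fix above will work, except for one small problem at the end of the scraping code:
df.to_csv('shotmaps.csv', encoding='utf-8')
df = df.T
You are saving df to CSV and only then transposing it, so the file still has one row per field and no column named 'x-coordinate'. That is why you now get KeyError: 'x-coordinate'. Swap these two lines, use my code above in the plotting file, and you should be all set. I don't have mplsoccer installed, so I just commented out those lines.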
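The corrected order can be sketched as follows, using a handful of placeholder values and an in-memory buffer in place of shotmaps.csv. The index=False argument is an addition of this sketch, not part of the original script. It keeps the row index out of the file.

```python
import io

import pandas as pd

col_names = ['Minute', 'x-coordinate', 'y-coordinate']
minute = ['13', '16']
x = ['0.774', '0.835']
y = ['0.43', '0.509']

# One row per field, as in the scraping script.
df = pd.DataFrame([minute, x, y], index=col_names)

# Transpose first so that each field becomes a column, then save.
df = df.T
buffer = io.StringIO()          # stands in for 'shotmaps.csv'
df.to_csv(buffer, index=False)  # index=False: leave the row index out of the file

# Read the data back the way the plotting file would.
buffer.seek(0)
data = pd.read_csv(buffer)
print(list(data.columns))
```

After the round trip, the column labels are the field names, so data["x-coordinate"] resolves.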
df should then look similar to the following sample, created with match id 14620:
# display(df.head())
Minute Player Situation Team xG Result x-coordinate y-coordinate
0 13 Roberto Firmino OpenPlay Liverpool 0.03234297037124634 BlockedShot 0.774000015258789 0.43
1 13 Andrew Robertson OpenPlay Liverpool 0.03856334835290909 MissedShots 0.8830000305175781 0.6880000305175781
2 16 Roberto Firmino OpenPlay Liverpool 0.07978218793869019 MissedShots 0.835 0.509000015258789
3 20 Xherdan Shaqiri OpenPlay Liverpool 0.04507734999060631 BlockedShot 0.7919999694824219 0.48900001525878906
4 21 Roberto Firmino OpenPlay Liverpool 0.09094344824552536 BlockedShot 0.9009999847412109 0.639000015258789
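If the CSV has already been written in the untransposed layout and re-running the scraper is inconvenient, one alternative (not part of the answer above) is to fix the orientation when reading. This sketch assumes the file looks like the dataframe shown in the question and uses an abbreviated in-memory copy of it.

```python
import io

import pandas as pd

# Stand-in for a 'shotmaps.csv' written before the transpose:
# one row per field, one column per shot.
csv_text = """,0,1
Minute,8,10
Player,Cristiano Ronaldo,Cristiano Ronaldo
x-coordinate,0.978,0.972
y-coordinate,0.338,0.72
"""

# Use the first CSV column as the index, then transpose so fields become columns.
data = pd.read_csv(io.StringIO(csv_text), index_col=0).T

# Each CSV column mixed names and numbers, so every value was read as text;
# convert the coordinates back to floats before plotting.
data['x-coordinate'] = data['x-coordinate'].astype(float)
data['y-coordinate'] = data['y-coordinate'].astype(float)
print(data[['x-coordinate', 'y-coordinate']])
```

Swapping the two lines in the scraper remains the cleaner fix, since the file then needs no special handling by whoever reads it.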