我有一个我不理解的值错误。我该如何解决?

I have a value error that I am not understanding. How can I fix it?

我收到 ValueError:形状不匹配:无法将对象广播到单个形状。

当我运行以下代码时出现错误:

plt.bar(range(1, 14), pca.explained_variance_ratio_, alpha=0.5,
...  align='center')
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/me/anaconda3/envs/new36/lib/python3.6/site-packages/matplotlib/pyplot.py", line 2648, in bar
    ret = ax.bar(*args, **kwargs)
  File "/me/anaconda3/envs/new36/lib/python3.6/site-packages/matplotlib/__init__.py", line 1717, in inner
    return func(ax, *args, **kwargs)
  File "/me/anaconda3/envs/new36/lib/python3.6/site-packages/matplotlib/axes/_axes.py", line 2019, in bar
    np.atleast_1d(x), height, width, y, linewidth)
  File "/me/anaconda3/envs/new36/lib/python3.6/site-packages/numpy/lib/stride_tricks.py", line 249, in broadcast_arrays
    shape = _broadcast_shape(*args)
  File "/me/anaconda3/envs/new36/lib/python3.6/site-packages/numpy/lib/stride_tricks.py", line 184, in _broadcast_shape
    b = np.broadcast(*args[:32])
ValueError: shape mismatch: objects cannot be broadcast to a single shape

这是用于复制错误的完整代码:

from sklearn.datasets.samples_generator import make_blobs
from pandas import DataFrame

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

from sklearn.decomposition import PCA

# generate a classification dataset
X, y = make_blobs(n_samples=1000, centers=3, n_features=10, random_state=1, 
   cluster_std=3)


X_train, X_test, y_train, y_test = \
   train_test_split(X, y, test_size=0.2, 
                 stratify=y,
                 random_state=0)

sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)

pca = PCA()
X_train_pca = pca.fit_transform(X_train_std)
pca.explained_variance_ratio_

import matplotlib.pyplot as plt
import numpy as np

# Error occurs here
plt.bar(range(1, 14), pca.explained_variance_ratio_, alpha=0.5, 
 align='center')


plt.step(range(1, 14), np.cumsum(pca.explained_variance_ratio_), where='mid')
plt.ylabel('Explained variance ratio')
plt.xlabel('Principal components')

plt.show()

这是因为你的range的大小和你的pca方差比大小不一样。可以用两个不同的范围复制相同的错误:

# same ranges are ok
plt.bar(range(10), range(10))
<Container object of 10 artists>

# different ones are not
plt.bar(range(15), range(10))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mm92400/anaconda3/envs/new36/lib/python3.6/site-packages/matplotlib/pyplot.py", line 2648, in bar
    ret = ax.bar(*args, **kwargs)
  File "/Users/mm92400/anaconda3/envs/new36/lib/python3.6/site-packages/matplotlib/__init__.py", line 1717, in inner
    return func(ax, *args, **kwargs)
  File "/Users/mm92400/anaconda3/envs/new36/lib/python3.6/site-packages/matplotlib/axes/_axes.py", line 2019, in bar
    np.atleast_1d(x), height, width, y, linewidth)
  File "/Users/mm92400/anaconda3/envs/new36/lib/python3.6/site-packages/numpy/lib/stride_tricks.py", line 249, in broadcast_arrays
    shape = _broadcast_shape(*args)
  File "/Users/mm92400/anaconda3/envs/new36/lib/python3.6/site-packages/numpy/lib/stride_tricks.py", line 184, in _broadcast_shape
    b = np.broadcast(*args[:32])
ValueError: shape mismatch: objects cannot be broadcast to a single shape

因此请使用与 pca args 长度相同的范围:

x = pca.explained_variance_ratio_.shape[0]

plt.bar(
    range(1, x+1), 
    pca.explained_variance_ratio_, 
    alpha=0.5, 
    align='center'
)