Is there any way to find the missing values in a given dataset?

Question

The code is as follows:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Dataset = pd.read_csv('/Users\HANISH\Desktop\mllearning\Datapreprocessing\Data.csv')
X = Dataset.iloc[:,:-1]
Y = Dataset.iloc[:,-1]

from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer.fit(X[:,1:3])
X[:,1:3] = imputer.transform(X[:,1:3])

print(X)

The data I am using is:

Dataset

The error I get is as follows:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File ~\.spyder-py3\temp1.py:18 in <module>
    imputer.fit(X[:,1:3])

  File C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py:3505 in __getitem__
    indexer = self.columns.get_loc(key)

  File C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py:3628 in get_loc
    self._check_indexing_error(key)

  File C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py:5637 in _check_indexing_error
    raise InvalidIndexError(key)

InvalidIndexError: (slice(None, None, None), slice(1, 3, None))

Please suggest the changes I need to make; I am just starting to learn.

Answer 1

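You need to change X[:,1:3] to X.iloc[:,1:3], both in the imputer.fit call and in the assignment from imputer.transform. X is a pandas DataFrame, so plain X[...] looks up column labels, and a tuple of slices is rejected with InvalidIndexError. Use .iloc to select rows and columns by integer position.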
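As a rough illustration, here is a minimal sketch of the corrected code. The contents of Data.csv are not shown in the question, so the small DataFrame below (its column names and values) is a made-up stand-in. The sketch also counts the missing values per column with isna().sum(), which is one common way to find them before imputing.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for Data.csv; the real columns and values are unknown.
Dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain"],
    "Age": [44.0, 27.0, np.nan, 38.0],
    "Salary": [72000.0, np.nan, 54000.0, 61000.0],
    "Purchased": ["No", "Yes", "No", "Yes"],
})

# Features are every column except the last; the target is the last column.
X = Dataset.iloc[:, :-1].copy()
Y = Dataset.iloc[:, -1]

# Find the missing values: number of NaN entries in each feature column.
missing_per_column = X.isna().sum()
print(missing_per_column)

# Replace NaN in the two numeric columns (positions 1 and 2) with the column mean.
# Positional selection on a DataFrame has to go through .iloc.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X.iloc[:, 1:3] = imputer.fit_transform(X.iloc[:, 1:3])

print(X)
```

If you would rather work with a NumPy array, as many tutorials do, convert with Dataset.iloc[:, :-1].values. After that, the original X[:, 1:3] indexing is valid.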


python

pandas

scikit-learn