Scikit-learn and Keras multicore option n_jobs = -1
I have just built an artificial neural network with Keras, and I want to pass it to the Scikit-learn function cross_val_score so that it is trained on some X_train and y_train data.
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score

def build_classifier():
    # Two hidden layers of 16 units and a sigmoid output for binary classification
    classifier = Sequential()
    classifier.add(Dense(units = 16, kernel_initializer = 'uniform', activation = 'relu', input_dim = 30))
    classifier.add(Dense(units = 16, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'rmsprop', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier

classifier = KerasClassifier(build_fn = build_classifier, batch_size=25, epochs = 10)
results = cross_val_score(classifier, X_train, y_train, cv=10, n_jobs=-1)
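The output I get is Epoch 1/1 repeated 4 times (I have 4 cores) and nothing else, because after that it hangs and the computation never finishes.
I tested n_jobs = -1 with other Scikit-learn algorithms and it works fine. I am not using a GPU, only the CPU.
To test the code, just add the following normalized dataset: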
import pandas as pd
from sklearn.datasets import load_breast_cancer
data = load_breast_cancer()
df = pd.DataFrame(data['data'])
target = pd.DataFrame(data['target'])

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df, target, test_size = 0.2, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
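After playing with n_jobs (set to 1, 2, 3, or -1), I got some strange results, such as Epoch 1/1 being repeated only 3 times instead of 4 (even with n_jobs = -1). And when I interrupt the kernel, this is what I get: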
Process ForkPoolWorker-33:
Traceback (most recent call last):
File "/home/myname/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/myname/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/myname/anaconda3/lib/python3.6/multiprocessing/pool.py", line 108, in worker
task = get()
File "/home/myname/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/pool.py", line 362, in get
return recv()
File "/home/myname/anaconda3/lib/python3.6/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/myname/anaconda3/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/myname/anaconda3/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt
It may be a problem with multiprocessing, but I don't know how to solve it.
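For context on the "4 times on 4 cores" observation: joblib, which Scikit-learn uses for n_jobs, documents that n_jobs = -1 means all CPUs and that values below -1 mean n_cpus + 1 + n_jobs workers. The sketch below only mirrors that documented rule with the standard library. The function name resolve_n_jobs is hypothetical and is not part of Scikit-learn or joblib, and positive values are simply returned unchanged here.

```python
import os


def resolve_n_jobs(n_jobs, n_cpus=None):
    """Illustrative only: number of workers implied by an n_jobs value.

    Hypothetical helper that follows joblib's documented convention:
    -1 means all CPUs, -2 means all CPUs but one, and so on.
    """
    if n_cpus is None:
        # os.cpu_count() may return None on some platforms
        n_cpus = os.cpu_count() or 1
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # Never go below one worker
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


if __name__ == "__main__":
    for value in (1, 2, 3, -1):
        print(value, "->", resolve_n_jobs(value, n_cpus=4))
```

On a 4-core machine this gives 4 workers for n_jobs = -1, which is consistent with seeing four training runs start at once.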
The code above works fine for me. Please upgrade your modules.
Step 1) pip install --upgrade tensorflow
Step 2) pip install keras
I tried that, and it works with the TensorFlow backend.
I have:
In [7]: sklearn.__version__
Out[7]: '0.19.1'
In [8]: keras.__version__
Out[8]: '2.2.4'
and:
import keras
/anaconda2/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
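Since the behaviour in this thread depends on the installed versions, it can help to report them all at once. The following is a minimal sketch using only the standard library, assuming Python 3.8 or later (newer than the interpreters shown above). The helper name installed_versions is hypothetical, and the package names are the usual PyPI distribution names.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = installed_versions(["scikit-learn", "keras", "tensorflow"])
    for name, version in report.items():
        print(name, version or "not installed")
```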
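I switched to sklearn version 0.20.1.
Now n_jobs "works", in the sense that the command runs and finishes in less time than with n_jobs = 1.
However:
1) the computation time does not improve significantly with n_jobs = 2 or higher
2) in some cases I get this warning: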
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 2 concurrent workers.
/home/my_name/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/externals/loky/process_executor.py:706:
UserWarning: A worker stopped while some jobs were given to the executor.
This can be caused by a too short worker timeout or by a memory leak.
"timeout or by a memory leak.", UserWarning
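One last remark: with n_jobs != 1, the per-epoch training progress of the neural network is no longer displayed in the Jupyter notebook, but it is still shown in the terminal (!?)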
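To put numbers on the first point above (little gain from n_jobs = 2 or more), the wall-clock time of each run can be measured directly. This is a small sketch with a hypothetical helper, time_call, that times any callable. It uses a stand-in workload so that it runs without Keras or Scikit-learn installed.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in workload; replace it with the call you want to measure
    result, seconds = time_call(sum, range(1_000_000))
    print(f"finished in {seconds:.4f} s")
```

With the objects defined in the question, the call would look like time_call(cross_val_score, classifier, X_train, y_train, cv=10, n_jobs=2), repeated for each n_jobs value to compare.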