AWS SageMaker - ModuleNotFoundError: No module named 'cv2'
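I am trying to run object detection code on AWS. Although opencv is listed in the requirements file, I get the error "No module named 'cv2'". I am not sure how to fix this error. Can someone help me?
My requirement.txt file contains: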
- opencv-python
- numpy>=1.18.2
- scipy>=1.4.1
- wget>=3.2
- tensorflow==2.3.1
- tensorflow-gpu==2.3.1
- tqdm==4.43.0
- pandas
- boto3
- awscli
- urllib3
- mss
I also tried installing "imgaug" and "opencv-python headless", but I still cannot get rid of this error.
sh-4.2$ python train_launch.py
[INFO-ROLE] arn:aws:iam::021945294007:role/service-role/AmazonSageMaker-ExecutionRole-20200225T145269
train_instance_type has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
train_instance_count has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
train_instance_type has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
2021-04-14 13:29:58 Starting - Starting the training job...
2021-04-14 13:30:03 Starting - Launching requested ML instances......
2021-04-14 13:31:11 Starting - Preparing the instances for training......
2021-04-14 13:32:17 Downloading - Downloading input data...
2021-04-14 13:32:41 Training - Downloading the training image..WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/__init__.py:1473: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.
2021-04-14 13:33:03,970 sagemaker-containers INFO Imported framework sagemaker_tensorflow_container.training
2021-04-14 13:33:05,030 sagemaker-containers INFO Invoking user script
Training Env:
{
"additional_framework_parameters": {},
"channel_input_dirs": {
"training": "/opt/ml/input/data/training"
},
"current_host": "algo-1",
"framework_module": "sagemaker_tensorflow_container.training:main",
"hosts": [
"algo-1"
],
"hyperparameters": {
"unfreezed_epochs": 2,
"freezed_batch_size": 8,
"freezed_epochs": 1,
"unfreezed_batch_size": 8,
"model_dir": "s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model"
},
"input_config_dir": "/opt/ml/input/config",
"input_data_config": {
"training": {
"TrainingInputMode": "File",
"S3DistributionType": "FullyReplicated",
"RecordWrapperType": "None"
}
},
"input_dir": "/opt/ml/input",
"is_master": true,
"job_name": "yolov4-2021-04-14-15-29",
"log_level": 20,
"master_hostname": "algo-1",
"model_dir": "/opt/ml/model",
"module_dir": "s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_smal/yolov4-2021-04-14-15-29/source/sourcedir.tar.gz",
"module_name": "train_indu",
"network_interface_name": "eth0",
"num_cpus": 8,
"num_gpus": 1,
"output_data_dir": "/opt/ml/output/data",
"output_dir": "/opt/ml/output",
"output_intermediate_dir": "/opt/ml/output/intermediate",
"resource_config": {
"current_host": "algo-1",
"hosts": [
"algo-1"
],
"network_interface_name": "eth0"
},
"user_entry_point": "train_indu.py"
}
Environment variables:
SM_HOSTS=["algo-1"]
SM_NETWORK_INTERFACE_NAME=eth0
SM_HPS={"freezed_batch_size":8,"freezed_epochs":1,"model_dir":"s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model","unfreezed_batch_size":8,"unfreezed_epochs":2}
SM_USER_ENTRY_POINT=train_indu.py
SM_FRAMEWORK_PARAMS={}
SM_RESOURCE_CONFIG={"current_host":"algo-1","hosts":["algo-1"],"network_interface_name":"eth0"}
SM_INPUT_DATA_CONFIG={"training":{"RecordWrapperType":"None","S3DistributionType":"FullyReplicated","TrainingInputMode":"File"}}
SM_OUTPUT_DATA_DIR=/opt/ml/output/data
SM_CHANNELS=["training"]
SM_CURRENT_HOST=algo-1
SM_MODULE_NAME=train_indu
SM_LOG_LEVEL=20
SM_FRAMEWORK_MODULE=sagemaker_tensorflow_container.training:main
SM_INPUT_DIR=/opt/ml/input
SM_INPUT_CONFIG_DIR=/opt/ml/input/config
SM_OUTPUT_DIR=/opt/ml/output
SM_NUM_CPUS=8
SM_NUM_GPUS=1
SM_MODEL_DIR=/opt/ml/model
SM_MODULE_DIR=s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_smal/yolov4-2021-04-14-15-29/source/sourcedir.tar.gz
SM_TRAINING_ENV={"additional_framework_parameters":{},"channel_input_dirs":{"training":"/opt/ml/input/data/training"},"current_host":"algo-1","framework_module":"sagemaker_tensorflow_container.training:main","hosts":["algo-1"],"hyperparameters":{"freezed_batch_size":8,"freezed_epochs":1,"model_dir":"s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model","unfreezed_batch_size":8,"unfreezed_epochs":2},"input_config_dir":"/opt/ml/input/config","input_data_config":{"training":{"RecordWrapperType":"None","S3DistributionType":"FullyReplicated","TrainingInputMode":"File"}},"input_dir":"/opt/ml/input","is_master":true,"job_name":"yolov4-2021-04-14-15-29","log_level":20,"master_hostname":"algo-1","model_dir":"/opt/ml/model","module_dir":"s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_smal/yolov4-2021-04-14-15-29/source/sourcedir.tar.gz","module_name":"train_indu","network_interface_name":"eth0","num_cpus":8,"num_gpus":1,"output_data_dir":"/opt/ml/output/data","output_dir":"/opt/ml/output","output_intermediate_dir":"/opt/ml/output/intermediate","resource_config":{"current_host":"algo-1","hosts":["algo-1"],"network_interface_name":"eth0"},"user_entry_point":"train_indu.py"}
SM_USER_ARGS=["--freezed_batch_size","8","--freezed_epochs","1","--model_dir","s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model","--unfreezed_batch_size","8","--unfreezed_epochs","2"]
SM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate
SM_CHANNEL_TRAINING=/opt/ml/input/data/training
SM_HP_UNFREEZED_EPOCHS=2
SM_HP_FREEZED_BATCH_SIZE=8
SM_HP_FREEZED_EPOCHS=1
SM_HP_UNFREEZED_BATCH_SIZE=8
SM_HP_MODEL_DIR=s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model
PYTHONPATH=/opt/ml/code:/usr/local/bin:/usr/lib/python36.zip:/usr/lib/python3.6:/usr/lib/python3.6/lib-dynload:/usr/local/lib/python3.6/dist-packages:/usr/lib/python3/dist-packages
Invoking script with the following command:
/usr/bin/python3 train_indu.py --freezed_batch_size 8 --freezed_epochs 1 --model_dir s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model --unfreezed_batch_size 8 --unfreezed_epochs 2
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/__init__.py:1473: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 4667030854237447206
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 3059419181456814147
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 6024475084695919958
physical_device_desc: "device: XLA_GPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 14949928141
locality {
bus_id: 1
links {
}
}
incarnation: 13034103301168381073
physical_device_desc: "device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5"
]
Traceback (most recent call last):
File "train_indu.py", line 12, in <module>
from yolov3.dataset import Dataset
File "/opt/ml/code/yolov3/dataset.py", line 3, in <module>
import cv2
ModuleNotFoundError: No module named 'cv2'
2021-04-14 13:33:08,453 sagemaker-containers ERROR ExecuteUserScriptError:
Command "/usr/bin/python3 train_indu.py --freezed_batch_size 8 --freezed_epochs 1 --model_dir s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model --unfreezed_batch_size 8 --unfreezed_epochs 2"
2021-04-14 13:33:11 Uploading - Uploading generated training model
2021-04-14 13:33:54 Failed - Training job failed
Traceback (most recent call last):
File "train_launch.py", line 41, in <module>
estimator.fit(s3_data_path, logs=True, job_name=job_name) #the argument logs is crucial if you want to see what happends
File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/sagemaker/estimator.py", line 535, in fit
self.latest_training_job.wait(logs=logs)
File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/sagemaker/estimator.py", line 1210, in wait
self.sagemaker_session.logs_for_job(self.job_name, wait=True, log_type=logs)
File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/sagemaker/session.py", line 3365, in logs_for_job
self._check_job_status(job_name, description, "TrainingJobStatus")
File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/sagemaker/session.py", line 2957, in _check_job_status
actual_status=status,
sagemaker.exceptions.UnexpectedStatusException: Error for Training job yolov4-2021-04-14-15-29: Failed. Reason: AlgorithmError: ExecuteUserScriptError:
Command "/usr/bin/python3 train_indu.py --freezed_batch_size 8 --freezed_epochs 1 --model_dir s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model --unfreezed_batch_size 8 --unfreezed_epochs 2"
Make sure your estimator has:
- framework_version = '2.3',
- py_version = 'py37',
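For illustration, here is a minimal sketch of how those two settings might be passed to the estimator with the SageMaker Python SDK v2 argument names (the log above warns that train_instance_type and train_instance_count were renamed). The entry point name comes from the log. The source directory `src`, the instance type, and the helper names are assumptions made for this example, not part of the original post. As far as I know, the framework containers only install dependencies from a file named exactly `requirements.txt` in the source directory, so the sketch also checks for that file name; treat this as something to verify rather than a confirmed cause.

```python
from pathlib import Path

# Hypothetical layout: training code and requirements.txt live in ./src.
SOURCE_DIR = Path("src")


def estimator_kwargs(role, source_dir=SOURCE_DIR):
    """Collect arguments for sagemaker.tensorflow.TensorFlow (SDK v2 names)."""
    return {
        "entry_point": "train_indu.py",       # script name taken from the log
        "source_dir": str(source_dir),        # assumption: code is in ./src
        "role": role,
        "instance_count": 1,                  # v2 name for train_instance_count
        "instance_type": "ml.g4dn.2xlarge",   # assumption: any GPU type works
        "framework_version": "2.3",           # as recommended in the answer
        "py_version": "py37",                 # as recommended in the answer
    }


def has_requirements_file(source_dir=SOURCE_DIR):
    """Return True if source_dir contains a file named requirements.txt."""
    return (Path(source_dir) / "requirements.txt").is_file()


def build_estimator(role):
    """Create the estimator; needs the sagemaker package and AWS credentials."""
    # Imported lazily so the helpers above work without the SDK installed.
    from sagemaker.tensorflow import TensorFlow

    return TensorFlow(**estimator_kwargs(role))
```

A launch script could call `has_requirements_file()` before `build_estimator(role).fit(...)` and stop early if the file is missing or misnamed (for example `requirement.txt`).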