How to compile caffe_rtpose on OSX?

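I recently stumbled upon caffe_rtpose and tried to compile and run the example. Unfortunately, I'm not very experienced with C++, so I ran into a lot of compilation and linking problems.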

I tried tweaking the Makefile config (adapted from the existing Ubuntu config). I'm on a system running OSX 10.11.5 with an nVidia GeForce 750M, and I have CUDA 7.5 and libcudnn installed:

## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
#   You should not set this flag if you will be reading LMDBs with any
#   possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
# OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
        -gencode arch=compute_35,code=sm_35 \
        -gencode arch=compute_50,code=sm_50 \
        -gencode arch=compute_50,code=compute_50 \
        -gencode arch=compute_52,code=sm_52 \
        # -gencode arch=compute_60,code=sm_60 \
        # -gencode arch=compute_61,code=sm_61
# Deprecated
#CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
#       -gencode arch=compute_20,code=sm_21 \
#       -gencode arch=compute_30,code=sm_30 \
#       -gencode arch=compute_35,code=sm_35 \
#       -gencode arch=compute_50,code=sm_50 \
#       -gencode arch=compute_50,code=compute_50

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
BLAS_INCLUDE := /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/Headers/
BLAS_LIB := /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
        /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
        # $(ANACONDA_HOME)/include/python2.7 \
        # $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \

# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
# WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
# INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
# LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
# Q ?= @

Here is a modified version of the install_caffe_and_cpm_osx.sh script:

#!/bin/bash



echo "------------------------- INSTALLING CAFFE AND CPM -------------------------"
echo "NOTE: This script assumes that CUDA and cuDNN are already installed on your machine. Otherwise, it might fail."



function exitIfError {
    if [[ $? -ne 0 ]] ; then
        echo ""
        echo "------------------------- -------------------------"
        echo "Errors detected. Exiting script. The software might have not been successfully installed."
        echo "------------------------- -------------------------"
        exit 1
    fi
}



# echo "------------------------- Checking Ubuntu Version -------------------------"
# ubuntu_version="$(lsb_release -r)"
# echo "Ubuntu $ubuntu_version"
# if [[ $ubuntu_version == *"14."* ]]; then
#     ubuntu_le_14=true
# elif [[ $ubuntu_version == *"16."* || $ubuntu_version == *"15."* || $ubuntu_version == *"17."* || $ubuntu_version == *"18."* ]]; then
#     ubuntu_le_14=false
# else
#     echo "Ubuntu release older than version 14. This installation script might fail."
#     ubuntu_le_14=true
# fi
# exitIfError
# echo "------------------------- Ubuntu Version Checked -------------------------"
# echo ""



echo "------------------------- Checking Number of Processors -------------------------"
NUM_CORES=$(grep -c ^processor /proc/cpuinfo 2>/dev/null || sysctl -n hw.ncpu)
echo "$NUM_CORES cores"
exitIfError
echo "------------------------- Number of Processors Checked -------------------------"
echo ""



echo "------------------------- Installing some Caffe Dependencies -------------------------"
# Basic
# sudo apt-get --assume-yes update
# sudo apt-get --assume-yes install build-essential
#General dependencies
brew install protobuf leveldb snappy hdf5
# with Python pycaffe needs dependencies built from source - from http://caffe.berkeleyvision.org/install_osx.html
# brew install --build-from-source --with-python -vd protobuf
# brew install --build-from-source -vd boost boost-python
# without Python the usual installation suffices
brew install boost
# sudo apt-get --assume-yes install libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler
# sudo apt-get --assume-yes install --no-install-recommends libboost-all-dev
# Remaining dependencies, 14.04
brew install gflags glog lmdb
# if [[ $ubuntu_le_14 == true ]]; then
#     sudo apt-get --assume-yes install libgflags-dev libgoogle-glog-dev liblmdb-dev
# fi
# OpenCV 2.4
# sudo apt-get --assume-yes install libopencv-dev
exitIfError
echo "------------------------- Some Caffe Dependencies Installed -------------------------"
echo ""



echo "------------------------- Compiling Caffe & CPM -------------------------"
cp Makefile.config.OSX.10.11.5.example Makefile.config
make all -j$NUM_CORES
# make test -j$NUM_CORES
# make runtest -j$NUM_CORES
exitIfError
echo "------------------------- Caffe & CPM Compiled -------------------------"
echo ""


# echo "------------------------- Installing CPM -------------------------"
# echo "Compiled"
# exitIfError
# echo "------------------------- CPM Installed -------------------------"
# echo ""



echo "------------------------- Downloading CPM Models -------------------------"
models_folder="./model/"
# COCO
coco_folder="${models_folder}coco/"
coco_model="${coco_folder}pose_iter_440000.caffemodel"
if [ ! -f "$coco_model" ]; then
    wget http://posefs1.perception.cs.cmu.edu/Users/tsimon/Projects/coco/data/models/coco/pose_iter_440000.caffemodel -P "$coco_folder"
fi
exitIfError
# MPI
mpi_folder="${models_folder}mpi/"
mpi_model="${mpi_folder}pose_iter_160000.caffemodel"
if [ ! -f "$mpi_model" ]; then
    wget http://posefs1.perception.cs.cmu.edu/Users/tsimon/Projects/coco/data/models/mpi/pose_iter_160000.caffemodel -P "$mpi_folder"
fi
exitIfError
echo "Models downloaded"
echo "------------------------- CPM Models Downloaded -------------------------"
echo ""



echo "------------------------- CAFFE AND CPM INSTALLED -------------------------"
echo ""

However, I get this error:

examples/rtpose/rtpose.cpp:1088:22: error: variable length array of non-POD element type 'Frame'
    Frame frame_batch[BATCH_SIZE];

I tried swapping the array for a vector:

    std::vector<Frame> frame_batch;
    std::cout << "allocating " << BATCH_SIZE << " frames" << std::endl;
    frame_batch.reserve(BATCH_SIZE);
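
One caveat with this swap: `reserve` only allocates capacity and leaves the vector empty, so indexing `frame_batch[n]` afterwards is not valid the way it was with the array. A minimal sketch of the difference follows. The `Frame` struct here is a hypothetical stand-in, not the real type from rtpose.cpp, and both helper functions are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Frame type in rtpose.cpp (assumption:
// the real struct has different members but is default-constructible).
struct Frame {
    std::vector<float> data;
    int index = 0;
};

// reserve() allocates capacity only: the returned vector has size() == 0,
// so frames[0] must not be accessed until elements are added.
std::vector<Frame> make_reserved_batch(std::size_t batch_size) {
    std::vector<Frame> frames;
    frames.reserve(batch_size);
    return frames;
}

// The sized constructor default-constructs batch_size elements, so
// frames[0] .. frames[batch_size - 1] are valid, as with the original array.
std::vector<Frame> make_sized_batch(std::size_t batch_size) {
    return std::vector<Frame>(batch_size);
}
```

If the rest of the code indexes into the batch directly, constructing the vector with its size (or calling `resize`) is probably the closer replacement for `Frame frame_batch[BATCH_SIZE];`.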

That seems to fix the compile error, but now I get a linker error:

ld: library not found for -lgomp
clang: error: linker command failed with exit code 1 (use -v to see invocation)

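I searched for libgomp and found several related posts about Caffe and OpenMP that mention problems with clang and OpenMP on OSX. What I have tried:

  1. I installed gcc 4.9 with Homebrew (because the Homebrew formula for gcc 5 installs 5.9, which might be too high?)
  2. I set -fopenmp=libomp, but that did not work for me: g++-4.9: error: unrecognized command line option '-fopenmp=libomp'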

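I can download and build Caffe on its own using the official instructions, but I can't work out how to compile this awesome-looking demo. Unfortunately, I have no experience with C++ and OpenMP, so I could really use your advice here. Thanks.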

Update: I tried Mark Setchell's helpful suggestion of installing clang through LLVM. I updated the Makefile config to use

CUSTOM_CXX := /usr/local/opt/llvm/bin/clang++

but CUDA doesn't like it:

nvcc fatal   : The version ('30801') of the host compiler ('clang') is not supported

I also tried compiling with CPU_ONLY, but I still get CUDA errors:

examples/rtpose/rtpose.cpp:235:5: error: use of undeclared identifier 'cudaMalloc'
    cudaMalloc(&net_copies[device_id].canvas, DISPLAY_RESOLUTION_WIDTH * DISPLAY_RESOLUTION_HEIGHT * 3 * sizeof(float));
    ^
examples/rtpose/rtpose.cpp:236:5: error: use of undeclared identifier 'cudaMalloc'
    cudaMalloc(&net_copies[device_id].joints, MAX_NUM_PARTS*3*MAX_PEOPLE * sizeof(float) );
    ^
examples/rtpose/rtpose.cpp:1130:146: error: use of undeclared identifier 'cudaMemcpyHostToDevice'
                cudaMemcpy(net_copies[tid].canvas, frame.data_for_mat, DISPLAY_RESOLUTION_WIDTH * DISPLAY_RESOLUTION_HEIGHT * 3 * sizeof(float), cudaMemcpyHostToDevice);
                                                                                                                                                 ^
examples/rtpose/rtpose.cpp:1136:108: error: use of undeclared identifier 'cudaMemcpyHostToDevice'
                cudaMemcpy(pointer + 0 * offset, frame_batch[0].data, BATCH_SIZE * offset * sizeof(float), cudaMemcpyHostToDevice);
                                                                                                           ^
examples/rtpose/rtpose.cpp:1178:13: error: use of undeclared identifier 'cudaMemcpyHostToDevice'
            cudaMemcpyHostToDevice);
            ^
examples/rtpose/rtpose.cpp:1192:155: error: use of undeclared identifier 'cudaMemcpyDeviceToHost'
                cudaMemcpy(frame_batch[n].data_for_mat, net_copies[tid].canvas, DISPLAY_RESOLUTION_HEIGHT * DISPLAY_RESOLUTION_WIDTH * 3 * sizeof(float), cudaMemcpyDeviceToHost);
                                                                                                                                                          ^
examples/rtpose/rtpose.cpp:1202:155: error: use of undeclared identifier 'cudaMemcpyDeviceToHost'
                cudaMemcpy(frame_batch[n].data_for_mat, net_copies[tid].canvas, DISPLAY_RESOLUTION_HEIGHT * DISPLAY_RESOLUTION_WIDTH * 3 * sizeof(float), cudaMemcpyDeviceToHost);

I'm no expert, but after a quick look at the code I can't see how a CPU_ONLY build is supposed to work with the CUDA calls in it.
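
For context, these errors are what you would expect when CUDA runtime calls are compiled without the CUDA headers: Caffe's Makefile passes -DCPU_ONLY for a CPU-only build, and code that should build in both modes has to guard its CUDA calls on that macro. A rough sketch of such a guard is below. The helper functions are hypothetical and not part of rtpose.cpp, and CPU_ONLY is defined inside the sketch only so that it builds without the CUDA toolkit.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Assumption for this sketch: CPU_ONLY is normally supplied by the build
// (-DCPU_ONLY). It is defined here only so the sketch compiles standalone;
// remove this block to exercise the CUDA branch instead.
#ifndef CPU_ONLY
#define CPU_ONLY 1
#endif

#ifndef CPU_ONLY
#include <cuda_runtime.h>
#endif

// Hypothetical helper: allocate `count` floats on the host or the GPU.
float* alloc_floats(std::size_t count) {
#ifdef CPU_ONLY
    return static_cast<float*>(std::malloc(count * sizeof(float)));
#else
    float* ptr = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&ptr), count * sizeof(float)) != cudaSuccess) {
        return nullptr;
    }
    return ptr;
#endif
}

// Hypothetical helper: copy host data into a buffer from alloc_floats().
void copy_to_buffer(float* dst, const float* src, std::size_t count) {
#ifdef CPU_ONLY
    std::memcpy(dst, src, count * sizeof(float));
#else
    cudaMemcpy(dst, src, count * sizeof(float), cudaMemcpyHostToDevice);
#endif
}

// Hypothetical helper: copy a buffer from alloc_floats() back to host memory.
void copy_from_buffer(float* dst, const float* src, std::size_t count) {
#ifdef CPU_ONLY
    std::memcpy(dst, src, count * sizeof(float));
#else
    cudaMemcpy(dst, src, count * sizeof(float), cudaMemcpyDeviceToHost);
#endif
}

// Hypothetical helper: release a buffer from alloc_floats().
void free_floats(float* ptr) {
#ifdef CPU_ONLY
    std::free(ptr);
#else
    cudaFree(ptr);
#endif
}
```

Judging by the errors above, rtpose.cpp calls cudaMalloc and cudaMemcpy unconditionally, so a CPU-only build would need changes of this kind throughout the example.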

Having another look at the caffe OSX Installation guide, I might try the route it marks as "not for the faint of heart".

I finally managed to compile the rtpose example.

Here is what I did:

In examples/rtpose/rtpose.cpp, swap the Frame array for a vector, as mentioned above:

    std::vector<Frame> frame_batch;
    std::cout << "allocating " << BATCH_SIZE << " frames" << std::endl;
    frame_batch.reserve(BATCH_SIZE);

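Use the default clang++ compiler (after failed attempts with g++-4.9 and with Homebrew's LLVM clang++), but remove the -fopenmp flag, and remove -pthread from the linker flags while keeping it as a compiler flag, based on this answer.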
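
As far as I understand it, dropping -fopenmp can still produce a working build because compilers ignore `#pragma omp` directives when OpenMP is disabled, so the affected loops simply run on a single thread. The sketch below illustrates that with two hypothetical helpers. I have not checked which OpenMP features rtpose.cpp or Caffe actually use, and any direct call into `omp.h` would need a guard like the one shown.

```cpp
#include <cstddef>
#include <vector>

// _OPENMP is defined by the compiler only when OpenMP is enabled
// (for example via -fopenmp), so the header is included conditionally.
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: multiplies every element by `factor` in place.
void scale_in_place(std::vector<float>& values, float factor) {
    const long n = static_cast<long>(values.size());
    // Without -fopenmp this pragma is ignored and the loop runs serially;
    // the result is the same either way.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        values[i] *= factor;
    }
}

// Hypothetical helper: number of threads a parallel region could use.
int max_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;  // OpenMP disabled: everything runs on the calling thread.
#endif
}
```

The trade-off is speed rather than correctness: anything that relied on OpenMP for parallelism will be slower in a build without it.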

Once it compiled, I tried to run it but got a libjpeg-related error:

dyld: Symbol not found: __cg_jpeg_resync_to_restart
  Referenced from: /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO
  Expected in: /usr/local/lib/libJPEG.dylib
 in /System/Library/Frameworks/ImageIO.framework/Versions/A/ImageIO
Trace/BPT trap: 5

The fix was mdemirst's answer: I symlinked libjpeg/libpng/libtiff/libgif from ImageIO.framework. I backed up the old symlinks first, just in case.

I have committed the config/setup scripts above on GitHub.

The example now compiles, but I still can't run it, possibly because there isn't enough GPU memory:

F0331 02:02:16.231935 528384 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
    @        0x10c7a89da  google::LogMessage::Fail()
    @        0x10c7a80d5  google::LogMessage::SendToLog()
    @        0x10c7a863b  google::LogMessage::Flush()
    @        0x10c7aba17  google::LogMessageFatal::~LogMessageFatal()
    @        0x10c7a8cc7  google::LogMessageFatal::~LogMessageFatal()
    @        0x1079481db  caffe::SyncedMemory::to_gpu()
    @        0x107947c9e  caffe::SyncedMemory::mutable_gpu_data()
    @        0x1079affba  caffe::CuDNNConvolutionLayer<>::Forward_gpu()
    @        0x107861331  caffe::Layer<>::Forward()
    @        0x107918016  caffe::Net<>::ForwardFromTo()
    @        0x1077a86f1  warmup()
    @        0x1077b211d  processFrame()
    @     0x7fff8b11899d  _pthread_body
    @     0x7fff8b11891a  _pthread_start
    @     0x7fff8b116351  thread_start
Abort trap: 6

I tried turning the settings down as far as possible:

./build/examples/rtpose/rtpose.bin -caffemodel ./model/coco/pose_iter_440000.caffemodel -caffeproto ./model/coco/pose_deploy_linevec.prototxt -camera_resolution "40x30" -camera 0 -resolution "40x30" -start_scale 0.1 -num_scales=0 -no_display true -net_resolution "16x16"

but to no avail. Actually running the example is probably a separate question in its own right.