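OpenCV User Guide (v1.5)
1. Overview
The PPU SDK is compatible with OpenCV's hardware acceleration; the OpenCV version verified so far is 4.10.
A hardware-accelerated opencv-python build based on OpenCV 4.10, opencv_contrib_python-4.10.0.84, has been added to the SAIL pip source and can be installed directly with pip install:
pip install opencv-contrib-python==4.10.0.84
The rest of this document describes how to build customized, hardware-accelerated OpenCV and OpenCV-python packages yourself.
Before installing the prebuilt package or building and installing your own hardware-accelerated OpenCV and OpenCV-python packages, we recommend uninstalling any OpenCV and OpenCV-python packages already present in the environment.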
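To see which OpenCV wheels are already present before uninstalling, a minimal sketch along the following lines may help. It assumes Python 3.8 or later (for importlib.metadata), and the helper name installed_opencv_distributions is illustrative, not part of any SDK. The four distribution names are the PyPI packages that all install the same cv2 module.

```python
from importlib import metadata

# PyPI distributions that all install the same `cv2` module.
OPENCV_DISTRIBUTIONS = (
    "opencv-python",
    "opencv-contrib-python",
    "opencv-python-headless",
    "opencv-contrib-python-headless",
)


def installed_opencv_distributions():
    """Return {name: version} for every OpenCV wheel found in this environment."""
    found = {}
    for name in OPENCV_DISTRIBUTIONS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # this variant is not installed
    return found


# Print one uninstall command per wheel that is currently installed.
for name, version in installed_opencv_distributions().items():
    print(f"pip uninstall -y {name}  # currently {version}")
```

If the loop prints nothing, no pip-managed OpenCV wheel was found. An OpenCV installed from source or by the system package manager is not detected by this check.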
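2. Building and installing OpenCV
2.1 Download the source code
2.2 Prepare the build script
Before building, make sure the PPU SDK is installed and envsetup.sh has been run so that the SDK environment is set up.
The build configuration removes some modules and enables as many CUDA acceleration modules as possible; add or remove options to suit your needs.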
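The download step can be sketched as follows. This is an illustration, not the original script: it assumes the three upstream repositories (opencv, opencv_contrib, opencv_extra) are cloned side by side at the tag 4.10.0, and the names clone_commands and download_sources are hypothetical.

```python
import subprocess

OPENCV_TAG = "4.10.0"  # assumed tag for the OpenCV 4.10 release
REPOS = ("opencv", "opencv_contrib", "opencv_extra")


def clone_commands(tag=OPENCV_TAG):
    """Build one shallow `git clone` command per repository, pinned to `tag`."""
    return [
        ["git", "clone", "--branch", tag, "--depth", "1",
         f"https://github.com/opencv/{name}.git"]
        for name in REPOS
    ]


def download_sources(tag=OPENCV_TAG):
    """Clone all repositories into the current directory (needs git and network)."""
    for cmd in clone_commands(tag):
        subprocess.run(cmd, check=True)
```

Calling download_sources() from an empty working directory leaves opencv, opencv_contrib, and opencv_extra next to each other, which is the layout the later sections rely on.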
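2.3 Configure FFmpeg support
If FFmpeg is not installed on the system, enabling only -DWITH_FFMPEG=ON at OpenCV configure time is not enough: the build cannot find FFmpeg, and the resulting libopencv_videoio.so will not include FFmpeg support.
For building and using FFmpeg, see the Video FFMpeg User Guide.
Assuming your FFmpeg install directory is FFMPEG_PATH, add the following to build_opencv.sh:
export PKG_CONFIG_LIBDIR=$FFMPEG_PATH/lib/pkgconfig:$PKG_CONFIG_LIBDIR
cmake_option+=" -DWITH_FFMPEG=ON -DOPENCV_FFMPEG_USE_FIND_PACKAGE=ON -DOPENCV_FFMPEG_SKIP_BUILD_CHECK=ON -DFFMPEG_DIR=./"
When the configuration succeeds, the build output shows the following (the value after YES is the local FFmpeg version):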
-- Video I/O:
-- FFMPEG: YES (find_package)
-- avcodec: YES (60.31.102)
-- avformat: YES (60.16.100)
-- avutil: YES (58.29.100)
-- swscale: YES (7.5.100)
-- avresample: NO
Make sure the FFmpeg directory points to your self-built FFmpeg installation that supports hardware acceleration.
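Checking the summary by eye is easy to get wrong in a long build log, so the check can be sketched as a small parser. The helper names ffmpeg_status and ffmpeg_enabled are illustrative. The same text layout is assumed for both the CMake output shown above and the string returned by cv2.getBuildInformation().

```python
import re

# Keys of interest in the "Video I/O" part of the configuration summary.
_KEYS = r"(FFMPEG|avcodec|avformat|avutil|swscale|avresample)"


def ffmpeg_status(build_info):
    """Map each FFmpeg-related key to its value, e.g. {'FFMPEG': 'YES (find_package)'}."""
    status = {}
    for line in build_info.splitlines():
        # Skip the leading "-- " and indentation, then capture "key: value".
        match = re.match(r"^\W*" + _KEYS + r":\s*(.+?)\s*$", line)
        if match:
            status[match.group(1)] = match.group(2)
    return status


def ffmpeg_enabled(build_info):
    """True when the summary reports FFMPEG: YES."""
    return ffmpeg_status(build_info).get("FFMPEG", "NO").startswith("YES")
```

After installation, the same helper can be applied to a running build, for example ffmpeg_enabled(cv2.getBuildInformation()).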
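2.4 Configure test cases and samples
Adding the following options builds and installs OpenCV's bundled test cases and test data; this step is optional:
cmake_option+=" -DBUILD_PERF_TESTS=ON -DBUILD_TESTS=ON -DINSTALL_TESTS=ON -DOPENCV_TEST_DATA_PATH=../opencv_extra/testdata"
Here opencv_extra is the directory that holds the test data, which the earlier script has already downloaded from GitHub.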
2.5 Build
2.6 Run the examples
Change to the bin directory under the install directory:
cd ../bin
export LD_LIBRARY_PATH=$FFMPEG_PATH/lib:../lib:$LD_LIBRARY_PATH # add the FFmpeg and OpenCV lib directories to LD_LIBRARY_PATH
export OPENCV_TEST_DATA_PATH=../share/opencv4/testdata # testdata comes from git clone https://github.com/opencv/opencv_extra.git
You can then run the relevant native test programs.
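When the native tests are driven from a script rather than an interactive shell, the same environment can be assembled programmatically. The sketch below mirrors the two export lines above. The function names, the example binary name, and the use of --gtest_filter (OpenCV's native tests are GoogleTest programs) are illustrative assumptions.

```python
import os
import subprocess


def native_test_env(ffmpeg_path, install_dir, base_env=None):
    """Return a copy of the environment with the library and test-data paths set."""
    env = dict(os.environ if base_env is None else base_env)
    lib_dirs = [os.path.join(ffmpeg_path, "lib"), os.path.join(install_dir, "lib")]
    if env.get("LD_LIBRARY_PATH"):
        lib_dirs.append(env["LD_LIBRARY_PATH"])  # keep existing entries last
    env["LD_LIBRARY_PATH"] = os.pathsep.join(lib_dirs)
    env["OPENCV_TEST_DATA_PATH"] = os.path.join(
        install_dir, "share", "opencv4", "testdata")
    return env


def run_native_test(binary, ffmpeg_path, install_dir, gtest_filter="*"):
    """Run one test binary from <install_dir>/bin and return its exit code."""
    cmd = [os.path.join(install_dir, "bin", binary),
           f"--gtest_filter={gtest_filter}"]
    return subprocess.run(
        cmd, env=native_test_env(ffmpeg_path, install_dir)).returncode


# Example (paths and binary name are placeholders):
# run_native_test("opencv_test_cudaarithm", "/opt/ffmpeg", "/opt/opencv")
```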
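3. Building and installing opencv-python
3.1 Download the source code
git clone --branch 84 --depth 1 https://github.com/opencv/opencv-python.git
Because the previous chapter built OpenCV 4.10, the matching opencv-python version must be downloaded here.
3.2 Modify setup.py
The opencv and opencv_contrib sources already downloaded in section 2.1 can be reused directly; there is no need to download them again.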
cd opencv-python
rm opencv opencv_contrib opencv_extra multibuild -r
# Assumes opencv-python, opencv, opencv_contrib, and opencv_extra all sit in the same directory
ln -sf ../opencv opencv
ln -sf ../opencv_contrib opencv_contrib
ln -sf ../opencv_extra opencv_extra
git clone https://github.com/multi-build/multibuild.git
Modify setup.py to remove the git submodule update and to change the build options (the changes are similar to those in section 2.2; add or remove options as needed):
- if os.path.exists(".git"):
- import pip._internal.vcs.git as git
- g = git.Git() # NOTE: pip API's are internal, this has to be refactored
- g.run_command(["submodule", "sync"])
- if build_rolling:
- g.run_command(
- ["submodule", "update", "--init", "--recursive", "--remote", cmake_source_dir]
- )
- if build_contrib:
- g.run_command(
- ["submodule", "update", "--init", "--recursive", "--remote", "opencv_contrib"]
- )
- else:
- g.run_command(
- ["submodule", "update", "--init", "--recursive", cmake_source_dir]
- )
-
- if build_contrib:
- g.run_command(
- ["submodule", "update", "--init", "--recursive", "opencv_contrib"]
- )
... ...
"-DBUILD_OPENEXR=ON",
+ "-DWITH_OPENCL=OFF",
+ "-DOPENCV_DNN_OPENCL=OFF",
+ "-DWITH_FFMPEG=ON",
+ "-DOPENCV_FFMPEG_USE_FIND_PACKAGE=ON",
+ "-DOPENCV_FFMPEG_SKIP_BUILD_CHECK=ON",
+ "-DFFMPEG_DIR=/usr/local", # replace with the install directory of your self-built, hardware-accelerated FFmpeg
+ "-DWITH_CUDA=ON",
+ "-DWITH_CUDNN=ON",
+ "-DWITH_CUBLAS=ON",
+ "-DWITH_CUFFT=ON",
+ "-DOPENCV_DNN_CUDA=ON",
+ "-DCUDA_ARCH_BIN=80",
+ "-DWITH_V4L=OFF",
+ "-DWITH_GSTREAMER=OFF",
+ "-DWITH_1394=OFF",
+ "-DWITH_ANDROID_MEDIANDK=OFF",
+ "-DWITH_GTK=OFF",
+ "-DWITH_IPP=OFF",
+ "-DBUILD_JAVA=OFF",
+ "-DWITH_VTK=OFF",
+ "-DENABLE_FLAKE8=OFF",
+ "-DENABLE_PYLINT=OFF",
+ "-DBUILD_opencv_legacy=ON",
+ "-DOPENCV_ENABLE_NONFREE=ON",
+ "-DBUILD_opencv_xfeatures2d=OFF",
+ "-DBUILD_opencv_matlab=OFF",
+ "-DBUILD_opencv_xobjdetect=OFF",
+ "-DBUILD_opencv_xphoto=OFF",
+ "-DBUILD_opencv_wechat_qrcode=OFF",
+ "-DBUILD_opencv_ximgproc=ON",
+ "-DBUILD_opencv_cudev=ON",
+ "-DBUILD_opencv_cudastereo=ON",
+ "-DBUILD_opencv_cudaarithm=ON",
+ "-DBUILD_opencv_cudaimgproc=ON",
+ "-DBUILD_opencv_cudacodec=ON",
+ "-DBUILD_opencv_cudafilters=ON",
+ "-DBUILD_opencv_cudawarping=ON",
+ "-DBUILD_opencv_cudaoptflow=ON",
+ "-DBUILD_opencv_cudabgsegm=ON",
+ "-DBUILD_opencv_cudafeatures2d=ON",
+ "-DBUILD_opencv_cudaobjdetect=ON",
+ "-DBUILD_opencv_cudalegacy=ON",
+ "-DOPENCV_GENERATE_PKGCONFIG=ON"
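]
3.3 Build
The build depends on skbuild, which can be installed with pip install scikit-build.
export ENABLE_CONTRIB=1
python3 setup.py bdist_wheel
3.4 Install
The built wheel, opencv_contrib_python-4.10.0.84-cp37-cp37m-linux_x86_64.whl, is placed in the dist directory.
pip install dist/opencv_contrib_python-4.10.0.84-cp37-cp37m-linux_x86_64.whl
You can then start an interactive Python session and run import cv2 to check that the import succeeds.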
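The import check can be taken one step further by also reporting the version and whether the build sees a CUDA-capable device. This is a minimal sketch with an illustrative helper name (describe_cv2). It relies only on cv2.__version__ and cv2.cuda.getCudaEnabledDeviceCount(), and it reports an import failure instead of raising.

```python
import importlib


def describe_cv2():
    """Summarize the installed cv2 module, or explain why it cannot be imported."""
    try:
        cv2 = importlib.import_module("cv2")
    except ImportError as exc:
        return {"ok": False, "error": str(exc)}
    info = {"ok": True, "version": cv2.__version__}
    cuda = getattr(cv2, "cuda", None)
    # A build without CUDA support reports 0 devices.
    info["cuda_devices"] = cuda.getCudaEnabledDeviceCount() if cuda else 0
    return info


print(describe_cv2())
```

A result with ok set to False usually points to a missing shared library; checking LD_LIBRARY_PATH for the FFmpeg and SDK library directories is a reasonable first step.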
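4. Known issues
cuBLAS:
GEMM: cublasCgemm_v2 is not supported.
cuFFT:
Dft/Convolve: cufftPlan2d is not supported.
cudacodec:
Older video formats such as MPEG4, MPEG2, VC1, and VP8 are not supported.
cudaoptflow:
Some test items are not supported because PPU has no optical-flow hardware.
In addition, a small number of native test items fail on PPU. Analysis shows that all of them can be skipped; they fall mainly into the following cases:
Results differ by a small floating-point error that is still within the allowed ULP tolerance. Some of these test items also fail on an NV A100 card.
Results differ because a kernel variable is not initialized to 0, or allocated GPU memory is not cleared with memset. In the NV environment, local variables and allocated device memory are zero-initialized by default, but on PPU they are not.
Results differ because a different operator or algorithm implementation is used.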
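To make "within the allowed ULP tolerance" concrete, the following sketch counts how many representable float32 values lie between two results. It is an illustration only: the helper names and the max_ulps threshold are assumptions, NaN inputs are not handled, and this is not how the OpenCV test suite itself compares results.

```python
import struct


def _ordered_bits(x):
    """Map a float32 value to an integer whose ordering matches the float ordering."""
    (bits,) = struct.unpack("<i", struct.pack("<f", x))  # reinterpret as signed int32
    # Negative floats have the sign bit set; mirror them below zero.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_distance(a, b):
    """Number of float32 steps between a and b (0 means bit-identical or +0/-0)."""
    return abs(_ordered_bits(a) - _ordered_bits(b))


def within_ulps(a, b, max_ulps=4):
    """True when a and b differ by at most max_ulps float32 steps (threshold is illustrative)."""
    return ulp_distance(a, b) <= max_ulps
```

For example, 1.0 and 1.0 + 2**-23 are adjacent float32 values, so their distance is 1 and within_ulps accepts them, while 1.0 and 1.001 are thousands of steps apart.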
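5. Enablement example
Ultralytics yolov5 uses opencv-python for the pre-processing and post-processing of object-detection inference; both steps can use hardware-accelerated video decoding and encoding.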
https://github.com/ultralytics/yolov5.git
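To enable PPU Video hardware-accelerated decoding in pre-processing, VideoCapture must explicitly use the FFmpeg backend and be configured with the cuvid hardware acceleration option.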
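One way to do this is sketched below; it is not the yolov5 patch itself. OpenCV's FFmpeg backend reads extra options from the OPENCV_FFMPEG_CAPTURE_OPTIONS environment variable in the form key;value|key;value. The decoder name h264_cuvid is an assumption and must match a decoder that your FFmpeg build exposes and the codec of the input stream. The helper names are illustrative.

```python
import os


def ffmpeg_capture_options(options):
    """Encode {'key': 'value'} as OpenCV's 'key;value|key;value' option string."""
    return "|".join(f"{key};{value}" for key, value in options.items())


def open_hw_capture(source, decoder="h264_cuvid"):
    """Open `source` through the FFmpeg backend with a cuvid decoder requested."""
    # Assumption: `decoder` names a cuvid decoder available in the local FFmpeg.
    # The variable is set before the capture is opened so the backend can read it.
    os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = ffmpeg_capture_options(
        {"video_codec": decoder})
    import cv2  # imported lazily so the helper above works without OpenCV
    cap = cv2.VideoCapture(source, cv2.CAP_FFMPEG)
    if not cap.isOpened():
        raise RuntimeError(f"cannot open {source!r} through the FFmpeg backend")
    return cap


# Example (the file name is a placeholder):
# cap = open_hw_capture("input.mp4")
# ok, frame = cap.read()
```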
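In post-processing, cv2.VideoWriter uses FFmpeg for video encoding by default. If a hardware-accelerated FFmpeg has been configured as described in this document, encoding goes through the hardware encoder by default and no code changes are needed.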
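For reference, a typical writer setup looks like the sketch below. The FourCC tag avc1 and the helper names are assumptions for illustration; which tag ends up on a hardware encoder depends on the local FFmpeg build. fourcc_code packs the tag the same way cv2.VideoWriter_fourcc does, written in plain Python so it can be checked without OpenCV.

```python
def fourcc_code(tag):
    """Pack a 4-character tag little-endian, as cv2.VideoWriter_fourcc does."""
    if len(tag) != 4:
        raise ValueError("a FourCC tag has exactly four characters")
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(tag))


def open_writer(path, fps, width, height, tag="avc1"):
    """Open a cv2.VideoWriter on the FFmpeg backend (the tag is an assumption)."""
    import cv2  # imported lazily so fourcc_code works without OpenCV
    writer = cv2.VideoWriter(
        path, cv2.CAP_FFMPEG, fourcc_code(tag), fps, (width, height))
    if not writer.isOpened():
        raise RuntimeError(f"cannot open {path!r} for writing")
    return writer


# Example (the file name and frame size are placeholders):
# writer = open_writer("result.mp4", 30, 1280, 720)
# writer.write(frame)   # frame: a 720x1280 BGR image
# writer.release()
```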