  • Parallel version of SVM::trainAuto: this function was using only one CPU core on my 10-core machine; with these changes, I got a 10x speedup (see the first sketch after this list).

  • Load DNN from a memory buffer: for one of my clients, I decided to use OpenCV DNN to do deep learning inference on iOS and Android. At that time, OpenCV DNN could only load model files from disk, which was a problem in sandboxed environments. Adding overloads of cv::dnn::readNetFromTensorflow and cv::dnn::readNetFromCaffe that take a memory buffer as input solved the issue (sketched below).

  • Allow the structured_light pipeline to be run from Python: this API was only partially exposed to Python. I added the needed annotations to the C++ code along with some Python tests (example after this list).

  • Memory leak when using OpenCV CUDA Stream in Python: using Streams from Python was slowing down execution by inserting unnecessary cudaStreamCreate and cudaStreamDestroy calls (see the stream sketch below).

  • Crash when using a trackbar in Python on macOS: a user reported the issue on IRC’s #opencv channel. Since I own several macOS machines and already had the OpenCV tree configured for a Debug build, reproducing the problem and finding a fix with Xcode’s debugging tools was a matter of minutes (a minimal repro is sketched after this list).

  • Improve performance of cv::cuda::Convolve: for a client, I extended cv::cuda::createTemplateMatching to support CV_TM_CCORR_NORMED for CV_32F depth. Template matching relies on convolution, which in turn uses an FFT to perform its computation. With some versions of CUDA, cv::cuda::Convolve was 10 times slower than the CPU version! Fixing that issue had the side effect of making the function 2x faster on all CUDA versions (see the template-matching sketch below).
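
For the parallel trainAuto item, here is a minimal Python sketch of the call that the change speeds up. The toy dataset is made up purely for illustration, and it assumes your build exposes the trainAuto overload that takes samples, layout and responses to Python.

```python
import numpy as np
import cv2 as cv

# Toy 2-class problem: 200 random 2-D points labelled by which half-plane they fall in.
samples = np.random.rand(200, 2).astype(np.float32)
responses = (samples[:, 0] > samples[:, 1]).astype(np.int32)

svm = cv.ml.SVM_create()
svm.setType(cv.ml.SVM_C_SVC)
svm.setKernel(cv.ml.SVM_RBF)

# trainAuto cross-validates over a grid of C/gamma values; the parallel version
# spreads those fold/parameter combinations over the available CPU cores.
svm.trainAuto(samples, cv.ml.ROW_SAMPLE, responses, 5)  # 5-fold cross-validation

_, predictions = svm.predict(samples)
print("training accuracy:", float(np.mean(predictions.ravel() == responses)))
```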
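
For the memory-buffer DNN loading, the sketch below shows the idea through the Python bindings (on iOS/Android the same overloads are reached from native code). The file names are placeholders, and it assumes the buffer-taking variant of readNetFromCaffe is exposed in your build.

```python
import numpy as np
import cv2 as cv

# In a sandboxed app the bytes would come from the app bundle / asset manager;
# here we just read placeholder files into memory to obtain raw buffers.
with open("deploy.prototxt", "rb") as f:
    proto_buf = np.frombuffer(f.read(), dtype=np.uint8)
with open("weights.caffemodel", "rb") as f:
    model_buf = np.frombuffer(f.read(), dtype=np.uint8)

# Load the network directly from the in-memory buffers instead of file paths.
net = cv.dnn.readNetFromCaffe(proto_buf, model_buf)
print("layers:", len(net.getLayerNames()))
```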
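
For the structured_light exposure, a minimal sketch of what becomes possible from Python, assuming the opencv_contrib structured_light bindings and the GrayCodePattern_create(width, height) factory; the resolution values are arbitrary.

```python
import cv2 as cv

# Build a Gray-code pattern generator for a 1024x768 projector and generate
# the images that would be projected during a structured-light scan.
width, height = 1024, 768
graycode = cv.structured_light.GrayCodePattern_create(width, height)

ok, patterns = graycode.generate()
print("generated", len(patterns), "pattern images of size", patterns[0].shape)
```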
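
For the CUDA Stream item, a sketch of the pattern the fix makes cheap: one Stream reused across asynchronous GPU calls rather than each call creating and destroying its own. It assumes an OpenCV build with the CUDA modules enabled; the image size is arbitrary.

```python
import numpy as np
import cv2 as cv

# Reuse a single stream for a sequence of asynchronous GPU operations.
stream = cv.cuda.Stream()

img = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)
gpu_in = cv.cuda.GpuMat()
gpu_in.upload(img, stream)

gpu_out = cv.cuda.resize(gpu_in, (960, 540), stream=stream)

stream.waitForCompletion()
result = gpu_out.download()
print(result.shape)
```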
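
For the macOS trackbar crash, a snippet along these lines is roughly what it takes to exercise the affected HighGUI code path from Python; the window and trackbar names are of course arbitrary.

```python
import numpy as np
import cv2 as cv

def on_change(value):
    print("trackbar value:", value)

# Minimal HighGUI usage: a window, a trackbar attached to it, and an event loop.
cv.namedWindow("demo")
cv.createTrackbar("threshold", "demo", 0, 255, on_change)
cv.imshow("demo", np.zeros((200, 400), dtype=np.uint8))
cv.waitKey(0)
cv.destroyAllWindows()
```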
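
For the template-matching item, a sketch of the CV_32F / TM_CCORR_NORMED combination that the extension enables, shown through the Python bindings. It assumes a CUDA-enabled build; the image and template are random data just to keep the example self-contained.

```python
import numpy as np
import cv2 as cv

# Random CV_32F image and template.
image = np.random.rand(512, 512).astype(np.float32)
templ = np.random.rand(64, 64).astype(np.float32)

gpu_image = cv.cuda.GpuMat()
gpu_templ = cv.cuda.GpuMat()
gpu_image.upload(image)
gpu_templ.upload(templ)

# Normalised cross-correlation on the GPU for CV_32F input, the combination
# added in this change; it goes through the FFT-based convolution path.
matcher = cv.cuda.createTemplateMatching(cv.CV_32F, cv.TM_CCORR_NORMED)
result = matcher.match(gpu_image, gpu_templ).download()

_, max_val, _, max_loc = cv.minMaxLoc(result)
print("best match at", max_loc, "score", max_val)
```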