At work, I had an OpenCV project that used warpPerspective quite extensively for real-time perspective correction. It worked great, but it was slow: when I profiled my solution, warpPerspective alone was taking ~20% of the CPU on my MacBook Pro.
Now, if you look at the code for warpPerspective, it's essentially doing a matrix multiplication and a perspective divide for every single pixel. Since I'm assuming my camera won't change position during a run, recomputing that identical mapping on every frame is a lot of wasted effort.
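Concretely, for each destination pixel (x, y), warpPerspective evaluates the (inverted) 3x3 transform M and does a perspective divide to find the source coordinates to sample from:

src_x = (M11*x + M12*y + M13) / (M31*x + M32*y + M33)
src_y = (M21*x + M22*y + M23) / (M31*x + M32*y + M33)

With a fixed camera, none of those terms change between frames, so the whole mapping can be computed once.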
I use the following code to calculate, up front, the same map that warpPerspective generates internally, so I can use it in a remap call later and get identical results for less than half the CPU cost. I store the final maps in cv::Mats named transformation_x and transformation_y.
transformationMatrix = cv::getPerspectiveTransform(originalCorners, destinationCorners);
// Since the camera won't be moving, let's pregenerate the remap LUT
cv::Mat inverseTransMatrix;
cv::invert(transformationMatrix, inverseTransMatrix);
// Generate the warp matrix
cv::Mat map_x, map_y, srcTM;
srcTM = inverseTransMatrix.clone(); // If you were using the WARP_INVERSE_MAP flag, set srcTM to transformationMatrix instead
map_x.create(sourceFrame.size(), CV_32FC1);
map_y.create(sourceFrame.size(), CV_32FC1);
double M11, M12, M13, M21, M22, M23, M31, M32, M33;
M11 = srcTM.at<double>(0,0);
M12 = srcTM.at<double>(0,1);
M13 = srcTM.at<double>(0,2);
M21 = srcTM.at<double>(1,0);
M22 = srcTM.at<double>(1,1);
M23 = srcTM.at<double>(1,2);
M31 = srcTM.at<double>(2,0);
M32 = srcTM.at<double>(2,1);
M33 = srcTM.at<double>(2,2);
for (int y = 0; y < sourceFrame.rows; y++) {
    double fy = (double)y;
    for (int x = 0; x < sourceFrame.cols; x++) {
        double fx = (double)x;
        double w = ((M31 * fx) + (M32 * fy) + M33);
        w = w != 0.0 ? 1.0 / w : 0.0;
        float new_x = (float)(((M11 * fx) + (M12 * fy) + M13) * w);
        float new_y = (float)(((M21 * fx) + (M22 * fy) + M23) * w);
        map_x.at<float>(y, x) = new_x;
        map_y.at<float>(y, x) = new_y;
    }
}
// Convert to a fixed-point representation of the mapping (CV_16SC2 holds the
// integer coordinates, CV_16UC1 the interpolation table indices) for ~4% more CPU savings
transformation_x.create(sourceFrame.size(), CV_16SC2);
transformation_y.create(sourceFrame.size(), CV_16UC1);
cv::convertMaps(map_x, map_y, transformation_x, transformation_y, CV_16SC2, false);
// If the fixed-point representation causes issues, replace it with this code
//transformation_x = map_x.clone();
//transformation_y = map_y.clone();
I then apply the map using a remap call:
cv::remap(sourceImage, destinationImage, transformation_x, transformation_y, cv::INTER_LINEAR);
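If you want to convince yourself that the cached maps really reproduce warpPerspective, a quick one-off check (just a sketch; viaWarp and viaRemap are scratch names) is to run both paths on the same frame and diff the outputs:

// The two results should match, modulo tiny rounding differences
// introduced by the fixed-point interpolation tables in the converted maps
cv::Mat viaWarp, viaRemap, diff;
cv::warpPerspective(sourceImage, viaWarp, transformationMatrix, sourceImage.size());
cv::remap(sourceImage, viaRemap, transformation_x, transformation_y, cv::INTER_LINEAR);
cv::absdiff(viaWarp, viaRemap, diff);
std::cout << "max abs difference: " << cv::norm(diff, cv::NORM_INF) << std::endl;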
Ended up dropping CPU usage for the call down to ~8%. I'll take it.
3 comments:
Hi,
Firstly, many thanks for a very detailed and interesting article!
I look forward to more of the same in the future!
Secondly, how would you say one should go about optimizing warpPerspective if the homography that warps the image is always changing, but the destination size remains the same? Any thoughts?
Hoping to hear from you soon, good day :)
bad_keypoints,
This optimization is for when you have a fixed camera position, unfortunately.
You might want to look into warpAffine and see if you get acceptable results from it.
What I've done in the past with real-time warping based on variable maps is to identify an invariant element of the mapping and generate new maps from it with a few low-latency arithmetic operations on Scalar values (shifting, scaling, that sort of thing). remap() itself is pretty fast, so the key to low-latency image warps is to cache as much of the map generation as possible. Also, don't forget that the required accuracy of a mapping is roughly limited by the image resolution, so you can usually get away with some pretty rough approximations, like the first couple of terms of a Taylor series.
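For example, if the frame-to-frame change boils down to a pure translation, something like this sketch (variable names are hypothetical; base_map_x/base_map_y are CV_32FC1 maps generated once as in the post, dx/dy the per-frame shift) avoids regenerating the maps entirely:

// Cache the expensive maps once, then apply the per-frame shift with cheap Scalar adds
cv::Mat shifted_x, shifted_y;
cv::add(base_map_x, cv::Scalar(dx), shifted_x); // shift every x-lookup by dx
cv::add(base_map_y, cv::Scalar(dy), shifted_y); // shift every y-lookup by dy
cv::remap(sourceFrame, warpedFrame, shifted_x, shifted_y, cv::INTER_LINEAR);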