Measurements from images

A few weeks ago I alluded to an amazing computer science class I took in which we tried to identify a car from some extremely poor surveillance video. In movies and TV shows enhancing video is a snap, but for some reason just typing “enhance” while watching the video didn’t work on our machines. Instead we had to use a whole bunch of computer vision techniques to estimate the car’s position, dimensions and color from 15 frames of terrible video recorded outside of a gas station in the middle of the night. Needless to say, every class was an episode of CSI, NUMB3RS and Mathnet rolled into one.

Our biggest success was getting good measurements from video. Our basic technique was to identify as many parallel lines and right angles in the scene as possible. From those we could infer the location of the camera, and then produce a correspondence between points in the 2D image and points in 3D space. Given this correspondence, along with the dimensions of one object in the image, we could estimate the length of everything else (e.g. the car).
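To give a flavor of the linear algebra involved, here is a minimal sketch of just the first step: finding the vanishing point where two lines that are parallel in the scene appear to meet in the image. It is not the code from the class. The function names and the pixel coordinates are made up for illustration, and a real system would combine many line pairs and handle noise.

```python
# Sketch: vanishing point of two scene-parallel lines, using homogeneous
# coordinates. All names and coordinates here are illustrative assumptions.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def vanishing_point(seg_a, seg_b):
    """Image point where the lines through two hand-drawn segments meet.

    Returns None if the lines are also parallel in the image, in which
    case the vanishing point lies at infinity.
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < 1e-12:
        return None
    return (x / w, y / w)

# Two edges of a road traced by hand on the image (pixel coordinates).
left_edge = ((100, 400), (300, 200))
right_edge = ((700, 400), (500, 200))
vp = vanishing_point(left_edge, right_edge)  # (400.0, 100.0)
```

Vanishing points from perpendicular directions in the scene are the ingredients used to recover the camera's orientation and focal length, which is where the right angles come in.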

It sounds complicated, but really the math involved isn’t that hard (it’s mostly linear algebra), and identifying parallel lines and right angles is something people can do manually by drawing on the image. All in all, the approach we used struck me as a useful tool that someone could either sell or put online. Since no one from the class did this, I was glad to read that a company called VisualSize will offer a similar service.

Their website doesn’t have much info, but according to the TechCrunch post I read, they require users to submit two images of a scene, taken from slightly different positions. This means that instead of parallel lines, they are using the difference between the two images to infer camera position and distance. This is very similar to how people use two eyes to perceive depth. The advantage of their approach (I think) is that a user doesn’t need to have measured the size of any object in the images. The disadvantage is that you need two images.
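The two-eyes intuition can be made concrete with the textbook formula for an idealized stereo pair: depth = focal length × baseline / disparity. The sketch below assumes two identical cameras pointed in the same direction and offset sideways by a known baseline. That is a simplification of my own, not a description of VisualSize's algorithm, and the numbers are invented.

```python
# Sketch: depth from disparity for an idealized, rectified stereo pair.
# This is the textbook relation, not VisualSize's actual method.

def depth_from_disparity(focal_px, baseline, x_left, x_right):
    """Distance to a point seen at column x_left in the left image and
    x_right in the right image.

    focal_px -- focal length in pixels (assumed equal for both cameras)
    baseline -- sideways distance between the two camera positions
    The result is in the same units as the baseline.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("point must shift toward the left in the right image")
    return focal_px * baseline / disparity

# A point that shifts by 20 pixels between shots taken 0.1 m apart.
depth_m = depth_from_disparity(focal_px=800.0, baseline=0.1,
                               x_left=420.0, x_right=400.0)  # about 4 m
```

Nearby objects shift a lot between the two views and distant ones barely move, which is why small pixel errors matter more for faraway measurements.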

It sounds like VisualSize’s length estimates are accurate to within about 1 to 5 percent (that’s about what we got). If VisualSize’s image measurements get a bit more reliable, they would be a big help to anyone who’s ever painstakingly taken measurements of a building or space. If they improve even more, they could potentially be used to automatically produce 3D models from 2D images. This would take something like Google Street View and turn it into 3D models of cities.

Computer vision has come a long way in the last 15 years, and regardless of VisualSize’s success, more and more computer vision research is going to make its way into commercial products. Facial recognition in digital cameras is another prime example.

Update: Professor Yuan-Fang Wang from VisualSize generously shared some of their algorithm’s sample outputs with me. Many of the measurements were very impressive. Here’s the one from the TechCrunch article. Professor Wang also pointed out that their algorithm requires a single known distance in the two images, since it is not theoretically possible to determine absolute distance without a frame of reference.
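Here is a minimal sketch of why one known distance is enough. A reconstruction from images alone gives 3D points in arbitrary units, so the whole scene could be twice as big and twice as far away and look identical. A single reference length fixes the scale for everything else. The function name and the coordinates below are hypothetical, chosen only to illustrate the idea.

```python
import math

# Sketch: turning an up-to-scale reconstruction into real-world lengths.
# Point coordinates are in arbitrary reconstruction units (assumed values).

def metric_length(p, q, ref_a, ref_b, ref_length):
    """Real-world distance between reconstructed points p and q, given that
    the real distance between ref_a and ref_b is known to be ref_length."""
    scale = ref_length / math.dist(ref_a, ref_b)
    return scale * math.dist(p, q)

# A door known to be 2.0 m tall spans 4 reconstruction units,
# so each unit corresponds to 0.5 m.
door_bottom, door_top = (0.0, 0.0, 0.0), (0.0, 4.0, 0.0)
car_front, car_rear = (1.0, 0.0, 2.0), (10.0, 0.0, 2.0)
car_length_m = metric_length(car_front, car_rear,
                             door_bottom, door_top, 2.0)  # 4.5
```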

2 Responses to “Measurements from images”

  1. Yuan-Fang Wang Says:

    Thank you for mentioning our (Visualsize) offer.

    You are absolutely correct that we use a slightly different technique. In our algorithm, we recover the camera motion in between two shots using point correspondences automatically established in two views. The whole process is automated, so the user does not have to manually identify parallel lines. The extra effort – that of clicking the camera button one more time to take one more picture – seems to be manageable.

    Thanks again.

  2. Jehad Says:

    thank you for talking about such important matter, i work as an NDT engineer and in many cases i find myself unable to reach the desired objects for measurements, this is extremely important for me as my job is to evaluate the job before doing it, having the ability to measure from photos depending on a reference value in the photo is what i was looking for, i have just sent e-mail request to VisualSize and i hope i get positive response! thanks again

A blog by EERac