We get questions about iPi Soft and Kinect all the time. People want to know how Kinect differs from OpenStage 2, the professional markerless motion capture system we offer, or they ask if Kinect can be used for accurate 3D tracking. The fact is that we love Kinect for home gaming and basic gesture capture, areas where it truly excels. However, as the basis of an accurate 3D motion capture solution suitable for professional applications, it has serious limitations.
Here’s a video we did a while ago that we often point people to when they want to understand what’s really going on behind the scenes.
It all comes down to the data that the system can collect and how that data is processed. In brief, Kinect works for basic gesture recognition at a very low consumer price point because it keeps the data simple. It has a 2D depth map view of the subject and software that has been trained against tons of sample data. A 2D depth map means that it is only looking from one angle but it can get information, based on an infrared sensor, about how far away different parts of that 2D image are. In sculptural terms, you can think of its data as a bas-relief, as opposed to a free-standing statue; it has depth, but only from one side. Because of this, Kinect can’t know if there is anything behind the surface or anything blocked by what it sees. But it can take the data it does get and match it against what it has learned in order to say, in effect, “that looks like a guy with his right arm raised.” It doesn’t really understand the arm’s true 3D position, its angle, etc. It just goes with its closest guess.
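To make the bas-relief idea concrete, here is a minimal sketch in Python. It is a toy model, not how the Kinect itself works internally: it assumes an orthographic sensor looking straight down the z axis, and the scene, the function names and the coordinates are all made up for illustration. The only point it shows is that a depth map stores one distance per pixel, so anything behind the nearest surface never reaches the software.

```python
def render_depth_map(points, width, height):
    """Build a toy depth map: for each (x, y) pixel keep only the nearest z."""
    depth = [[None] * width for _ in range(height)]
    for x, y, z in points:
        if 0 <= x < width and 0 <= y < height:
            current = depth[y][x]
            if current is None or z < current:
                depth[y][x] = z  # a closer surface replaces whatever was behind it
    return depth


def visible_points(depth):
    """Return the set of 3D points that can be recovered from the depth map."""
    return {
        (x, y, z)
        for y, row in enumerate(depth)
        for x, z in enumerate(row)
        if z is not None
    }


# Hypothetical scene: a flat 3x3 "torso" at distance 5,
# with a "hand" at distance 2 held in front of its centre pixel.
torso = [(x, y, 5) for x in range(3) for y in range(3)]
hand = [(1, 1, 2)]

depth = render_depth_map(torso + hand, width=3, height=3)
seen = visible_points(depth)
hidden = set(torso + hand) - seen  # surface points the sensor never reports
```

In this toy scene the part of the torso directly behind the hand ends up in `hidden`. The sensor has no measurement for it at all, which is why software working from a single depth map has to guess at whatever is blocked.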
In order to get more out of the technology, there have been a bunch of Kinect-based hacks. In fact, the video above shows how we hacked Kinect to add our own software and additional cameras. For example, we have tried, as have many others, using multiple Kinects to get a better 3D data set. While that does improve things to some degree, it also has real limitations. With two Kinects you still don’t get a full 3D view of the subject, but rather two 2D depth maps, which still cannot provide accurate information about blocked body parts. Adding more Kinects doesn’t work either, because the infrared patterns that each Kinect projects to collect its depth map data interfere with those of the other devices. We know because we’ve tried integrating Kinects beyond even what is shown in the video above, thinking that they might provide a replacement for some or all of the regular cameras we use. In the end we found that for true 3D data, multiple regular video cameras provide better data.
Of course, having an extra dimension to the data not only provides a huge advantage in measurement, it also multiplies the amount of data that needs to be processed. Processing all of that extra data in realtime requires sophisticated techniques. This capability, efficiently processing and understanding 3D data from multiple cameras, is at the heart of Organic Motion’s technology and the basis for the accurate tracking capabilities of OpenStage 2 and the rest of our products.
Here’s a video of our software in action, using just regular video cameras with no Kinect involved. As you can see, the actors are being accurately tracked in realtime through a wide variety of complex motions.
OpenStage 2 can accurately measure 3D position, rotation and angle based on what it sees from multiple cameras at once. This is not only more accurate, but also allows for tracking many types of motion which Kinect and similar solutions cannot handle. For example, poses where part of the body is occluded (blocked) from the perspective of the Kinect can never be tracked well without 3D data. Similar problems occur when the subject turns away from the camera or spins around, positions for which the Kinect software presumably has not been trained. Tracking multiple people is also only practical with real 3D data; otherwise it is impossible to tell what is going on when one subject steps in front of the other.
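For readers who want to see why several viewpoints give a real 3D position rather than a guess, here is a small Python sketch of textbook midpoint triangulation. It is an illustration only and not OpenStage 2's actual algorithm. It assumes two calibrated cameras whose positions are known, and that each camera has already turned its 2D observation of a joint into a viewing ray; the function name and the numbers are hypothetical.

```python
def triangulate_midpoint(origin_a, dir_a, origin_b, dir_b):
    """Estimate a 3D point from two viewing rays (one per camera).

    Each ray is origin + s * direction. The estimate is the midpoint of the
    shortest segment joining the two rays, so it still works when noise
    keeps the rays from intersecting exactly.
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    w = [oa - ob for oa, ob in zip(origin_a, origin_b)]
    a, b, c = dot(dir_a, dir_a), dot(dir_a, dir_b), dot(dir_b, dir_b)
    d, e = dot(dir_a, w), dot(dir_b, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays carry no depth information: the two views agree on
        # direction but cannot pin down distance.
        raise ValueError("rays are parallel; cannot triangulate")

    s = (b * e - c * d) / denom  # distance parameter along ray A
    t = (a * e - b * d) / denom  # distance parameter along ray B

    point_a = [o + s * k for o, k in zip(origin_a, dir_a)]
    point_b = [o + t * k for o, k in zip(origin_b, dir_b)]
    return tuple((pa + pb) / 2 for pa, pb in zip(point_a, point_b))


# Hypothetical setup: two cameras 4 units apart, both seeing the same wrist.
wrist = triangulate_midpoint(
    origin_a=(0.0, 0.0, 0.0), dir_a=(1.0, 2.0, 3.0),
    origin_b=(4.0, 0.0, 0.0), dir_b=(-3.0, 2.0, 3.0),
)
```

With these made-up rays the estimate lands at (1, 2, 3), the point where the two lines of sight cross. The practical payoff is that a body part hidden from one camera can still be located as long as at least two other cameras can see it.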
To see how good the tracking data is from OpenStage 2, check out these free motion capture data samples.