Markerless technologies identify natural features of the face, such as the nostrils, the corners of the lips and eyes, and wrinkles, and then track them. This technology has been discussed and demonstrated at CMU, IBM, the University of Manchester (where much of this work started with Tim Cootes, Gareth Edwards and Chris Taylor) and other locations, using active appearance models, principal component analysis, eigentracking, deformable surface models and other techniques to track the desired facial features from frame to frame. Because nothing has to be attached to the face, this technology is much less cumbersome than marker-based capture and allows greater expression for the actor.
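The idea of following a feature from frame to frame can be illustrated with a minimal sketch. It does not implement active appearance models or any of the other techniques named above; it substitutes plain template matching (sum of squared differences over a small search window), and the functions `ssd`, `track_feature` and `make_frame`, the toy 8×8 frames and the 2×2 patch are all assumptions made for illustration.

```python
def ssd(frame, template, top, left):
    """Sum of squared differences between the template and the frame patch at (top, left)."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            diff = frame[top + r][left + c] - value
            total += diff * diff
    return total


def track_feature(frame, template, prev_pos, radius=2):
    """Search a window around prev_pos and return the best-matching (top, left)."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best_pos, best_score = prev_pos, None
    # Clamp the search window so the template always stays inside the frame.
    for top in range(max(0, prev_pos[0] - radius), min(fh - th, prev_pos[0] + radius) + 1):
        for left in range(max(0, prev_pos[1] - radius), min(fw - tw, prev_pos[1] + radius) + 1):
            score = ssd(frame, template, top, left)
            if best_score is None or score < best_score:
                best_pos, best_score = (top, left), score
    return best_pos


def make_frame(height, width, top, left, patch):
    """Build a toy grayscale frame (all zeros) with the patch pasted at (top, left)."""
    frame = [[0] * width for _ in range(height)]
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            frame[top + r][left + c] = value
    return frame


# Hypothetical feature patch (say, a lip corner) moving one pixel diagonally per frame.
PATCH = [[9, 1], [1, 9]]
frames = [make_frame(8, 8, 2 + i, 3 + i, PATCH) for i in range(3)]

pos = (2, 3)          # known position in the first frame
track = [pos]
for frame in frames[1:]:
    pos = track_feature(frame, PATCH, pos)
    track.append(pos)
```

In this toy setup `track` follows the patch through all three frames. Real systems replace the fixed patch with a learned model of shape and appearance, which is what lets them cope with changes in expression and lighting.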
These vision-based approaches can also track pupil movement, eyelids, and teeth occlusion by the lips and tongue, which are obvious problems in most computer-animated features. Typical limitations of vision-based approaches are resolution and frame rate, both of which are becoming less of an issue as high-speed, high-resolution CMOS cameras become available from multiple sources.
The technology for markerless face tracking is related to that used in facial recognition systems, since a facial recognition system can potentially be applied sequentially to each frame of video, resulting in face tracking. For example, the Neven Vision system (formerly Eyematics, now acquired by Google) allowed real-time 2D face tracking with no person-specific training; their system was also amongst the best-performing facial recognition systems in the U.S. Government's 2002 Face Recognition Vendor Test (FRVT). On the other hand, some recognition systems do not explicitly track expressions, or even fail on non-neutral expressions, and so are not suitable for tracking. Conversely, systems such as deformable surface models pool temporal information to disambiguate and obtain more robust results, and thus cannot be applied to a single photograph.
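The contrast between applying a single-image system to every frame and pooling information over time can be sketched as follows. This is an illustrative assumption, not how any of the systems named above work: the `detect` function is a hypothetical stand-in for a per-frame detector, and simple exponential smoothing stands in for the far richer temporal models used by deformable surface approaches.

```python
def track_by_detection(frames, detect):
    """Run a single-image detector independently on every frame."""
    return [detect(frame) for frame in frames]


def track_with_temporal_pooling(frames, detect, alpha=0.5):
    """Blend each per-frame detection with the running estimate from earlier frames."""
    estimates = []
    state = None
    for frame in frames:
        x, y = detect(frame)
        if state is None:
            state = (x, y)  # first frame: nothing to pool with yet
        else:
            state = (alpha * x + (1 - alpha) * state[0],
                     alpha * y + (1 - alpha) * state[1])
        estimates.append(state)
    return estimates


def detect(frame):
    """Hypothetical detector: each toy 'frame' is already a noisy (x, y) measurement."""
    return frame


# Toy landmark measurements; the third frame contains an outlier in x.
frames = [(10.0, 20.0), (10.0, 20.0), (18.0, 20.0), (10.0, 20.0)]

independent = track_by_detection(frames, detect)
pooled = track_with_temporal_pooling(frames, detect)
```

The independent track reproduces the outlier exactly, while the pooled track damps it, at the cost of needing a sequence of frames: the second function has nothing to pool when given a single photograph.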
Markerless face tracking has progressed to commercial systems such as Image Metrics and Dynamixyz, and has been applied in movies such as The Matrix sequels and The Curious Case of Benjamin Button. The latter used the Mova Contour system to capture a deformable facial model, which was then animated with a combination of manual and vision tracking. Avatar was another prominent performance-capture movie; however, it used painted markers rather than a markerless approach.
Markerless systems can be classified according to several distinguishing criteria:
- 2D versus 3D tracking
- whether person-specific training or other human assistance is required
- real-time performance (which is only possible if no training or supervision is required)
- whether they need an additional source of information, such as projected patterns or the invisible paint used in the Mova system
To date, no system is ideal with respect to all these criteria. For example, the Neven Vision system was fully automatic and required no hidden patterns or per-person training, but was 2D. The Face/Off system is 3D, automatic, and real-time, but requires projected patterns.
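The classification above can be written down as a small data structure. The following is only a sketch: the class `MarkerlessSystem`, its field names and the helper `unmet_criteria` are hypothetical, the two entries encode just what is stated in this section, and treating "ideal" as 3D, fully automatic, real-time and free of any additional information source is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerlessSystem:
    name: str
    dimensions: int            # 2 for 2D tracking, 3 for 3D tracking
    needs_assistance: bool     # person-specific training or other human assistance
    real_time: bool
    needs_extra_source: bool   # e.g. projected patterns or invisible paint


def unmet_criteria(system):
    """List the criteria on which a system falls short of the assumed ideal."""
    unmet = []
    if system.dimensions != 3:
        unmet.append("3D tracking")
    if system.needs_assistance:
        unmet.append("no training or human assistance")
    if not system.real_time:
        unmet.append("real-time performance")
    if system.needs_extra_source:
        unmet.append("no additional information source")
    return unmet


# Entries reflect only the properties described in the text above.
SYSTEMS = [
    MarkerlessSystem("Neven Vision", dimensions=2, needs_assistance=False,
                     real_time=True, needs_extra_source=False),
    MarkerlessSystem("Face/Off", dimensions=3, needs_assistance=False,
                     real_time=True, needs_extra_source=True),
]

report = {s.name: unmet_criteria(s) for s in SYSTEMS}
ideal = [s.name for s in SYSTEMS if not unmet_criteria(s)]
```

Under these assumptions each system misses exactly one criterion and `ideal` is empty, which mirrors the statement that no system satisfies all of the criteria at once.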