Horizon: the EU Research & Innovation magazine | European Commission

Using natural images to model objects

Performance capture can yield far richer images than current techniques. Image courtesy of Christian Theobalt.

A German computer scientist has hit on a new way of modelling objects for films and video games that could create much richer images than current methods achieve, and cost less.

At the moment, actors wear light markers – usually placed at or near joints – and execute a series of movements while surrounded by a circle of cameras. A computer then creates a digital skeleton that moves in the same way as the actor, and can be manipulated electronically. This was how boy wizard Harry Potter managed to defeat his enemy Voldemort in Harry Potter and the Deathly Hallows, and how the blue humanoids in Avatar achieved their acrobatic flourishes.

However, the technique, called motion capture, only succeeds in creating lifelike movement: the shape and surfaces then need to be completed by a graphic artist, and the surfaces still don’t look real. So researcher Christian Theobalt, based in Saarbrücken, Germany, is using sophisticated algorithms to create digital models of moving characters with much less work.

Performance capture

Theobalt’s method – known as performance capture – uses textures and edges instead of light markers, providing models that already feature shape, movements and surface appearance. That yields far richer images with textures and moving surfaces. It could be applied in video games, as well as in fields such as medicine and engineering.

‘We have to find a replacement for markers in natural images,’ said Theobalt, who heads the Graphics, Vision and Video research group at the Max-Planck-Institute for Informatics, Germany. ‘The complexity is orders of magnitude higher.’

The complex interaction of light and objects in the real world produces drastic changes in the appearance of surfaces, which makes them hard to keep track of. So far, performance capture only works in special, restricted circumstances, and only for scenes such as a single person in simple clothing.

Under a five-year project funded by a European Research Council (ERC) grant, Theobalt is aiming to extend the technology to digitally capture multiple objects with complex surface textures under arbitrary lighting. The objects will also move and interact in unpredictable ways. He wants this to be done quickly, in detail and using just a few cameras, some of which may be moving.

‘So far, we are still just in the studio, where the lighting can be controlled,’ said Theobalt. ‘The actual challenge is that in the real world, things are much more complex. We are rethinking the way we do dynamic scenery construction.’

Moving target

Grasping moving objects is an area where computers still lag far behind the human brain. Typically, designers create virtual images in two stages. First comes the modelling stage, where a mathematical model is created for the shapes, textures, and movements that will be needed in a scene. Second, they use graphic visualization technology to twist, stretch, and move the model. This is done using an algorithm that ‘renders’ the model by simulating the way light interacts with its surface as it moves, making it appear real.
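To make the two stages concrete, here is a minimal sketch in Python. It is not Theobalt's method or any studio's renderer: it assumes the simplest textbook lighting rule, Lambertian (diffuse) shading, in which a surface's brightness depends on the angle between its orientation and the direction of the light. The names `lambert_shade`, `model_normals` and `light_direction` are invented for this illustration.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def lambert_shade(normal, to_light, albedo=1.0, light_intensity=1.0):
    """Brightness of a matte surface under Lambert's cosine law.

    normal   -- direction the surface faces
    to_light -- direction from the surface towards the light
    albedo   -- fraction of light the surface reflects (assumed value)
    """
    n = normalize(normal)
    l = normalize(to_light)
    cos_theta = sum(a * b for a, b in zip(n, l))
    # Surfaces facing away from the light receive nothing.
    return albedo * light_intensity * max(0.0, cos_theta)


# Stage 1 (modelling): a toy "model" made of three surface orientations.
model_normals = {
    "facing_light": (0.0, 0.0, 1.0),
    "tilted": (0.0, 1.0, 1.0),
    "facing_away": (0.0, 0.0, -1.0),
}

# Stage 2 (rendering): simulate how light falls on each surface.
light_direction = (0.0, 0.0, 1.0)
shades = {
    name: lambert_shade(normal, light_direction)
    for name, normal in model_normals.items()
}

for name, brightness in shades.items():
    print(f"{name}: {brightness:.3f}")
```

Even in this toy version, the same surface changes brightness as soon as it tilts or the light moves, which hints at why appearance in uncontrolled real-world lighting is so hard to pin down.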

Christian Theobalt, who heads the Graphics, Vision and Video research group at the Max-Planck-Institute for Informatics, Germany. Image courtesy of Christian Theobalt.

The tools for this second stage have leapt ahead in recent years, and computer graphic images are now common in visual media. Doctors and engineers also use them to display high-quality results from scanners and other imaging devices. But model generation has not kept up.

Models used to create computer-generated scenes in movies must often be crafted manually using software, a process that can take months of work for just a few seconds of footage.

‘The ultimate long-term goal of performance capture is to bridge this gap,’ Theobalt said.

The technology could be valuable: the global movies and entertainment market is expected to be worth EUR 69 billion by the end of 2016, according to Marketline, a UK-based research company, and Europe has many graphics and visual technology firms competing for a share of that. In addition, doctors could assess the healing process using computer models of treated parts of a patient’s body. Engineers and physicists could measure complex deforming surfaces. And new search tools could also be built for image databases.

‘Sports doctors said that if they can make such measurements, they can derive the underlying muscle movements and forces,’ said Theobalt. Despite the complexity of the work, he said, ‘In major areas we will make large progress.’
