“In traditional computer graphics, a pipeline renders a 3D model to a 2D screen. But there’s information to be gained from doing the opposite—a model that could infer a 3D object from a 2D image would be able to perform better object tracking, for example,” NVIDIA explains.
What the researchers came up with is a rendering framework called DIB-R, which stands for differentiable interpolation-based renderer. The goal was to design a renderer that could infer 3D properties from 2D images while integrating seamlessly with machine learning techniques.
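The “differentiable interpolation” in the name refers to how a rasterizer can be made friendly to gradient-based learning: instead of hard, discrete pixel assignments, each pixel’s value is computed as a smooth weighted blend of vertex attributes, so gradients can flow back to the mesh. The sketch below is not NVIDIA’s code, just a minimal illustration of the underlying idea using barycentric interpolation over a single 2D triangle:

```python
import numpy as np

def barycentric_weights(p, a, b, c):
    # Solve p = wa*a + wb*b + wc*c with wa + wb + wc = 1 (2D points).
    T = np.array([[a[0] - c[0], b[0] - c[0]],
                  [a[1] - c[1], b[1] - c[1]]])
    wa, wb = np.linalg.solve(T, np.array([p[0] - c[0], p[1] - c[1]]))
    return np.array([wa, wb, 1.0 - wa - wb])

def shade_pixel(p, tri, colors):
    # Interpolate per-vertex colors at pixel p. This weighted sum is a
    # smooth function of both vertex positions and vertex colors, which
    # is the key property a differentiable rasterizer needs: gradients
    # of a pixel-level loss can propagate back to the 3D geometry.
    w = barycentric_weights(p, *tri)
    return w @ colors

tri = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
colors = np.eye(3)  # red, green, blue at the three vertices
pixel = shade_pixel(np.array([0.25, 0.25]), tri, colors)
# pixel is a blend of the three vertex colors: [0.5, 0.25, 0.25]
```

The real DIB-R additionally handles occlusion, background pixels, texture maps, and lighting, but the same principle applies: every rendered pixel is a differentiable function of the scene parameters.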
“The result, DIB-R, produces high-fidelity rendering by using an encoder-decoder architecture, a type of neural network that transforms input into a feature map or vector that is used to predict specific information such as shape, color, texture and lighting of an image,” NVIDIA says.
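That encoder-decoder data flow, one image in, several 3D properties out, can be sketched in a few lines. Everything below is a toy: the dimensions, weight matrices, and the number of template vertices are made up for illustration (the real system uses a trained convolutional encoder and learned decoders), but the structure mirrors the description in the quote:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes -- not DIB-R's actual dimensions.
IMG = 64 * 64 * 3      # flattened input image
LATENT = 128           # encoder feature vector
N_VERTS = 642          # vertices of an assumed template mesh

W_enc = rng.normal(size=(LATENT, IMG)) * 0.01
W_shape = rng.normal(size=(N_VERTS * 3, LATENT)) * 0.01  # vertex offsets
W_tex = rng.normal(size=(N_VERTS * 3, LATENT)) * 0.01    # per-vertex RGB
W_light = rng.normal(size=(9, LATENT)) * 0.01            # lighting coeffs

def predict(image_flat):
    # Encoder: compress the image into a feature vector.
    z = np.tanh(W_enc @ image_flat)
    # Decoders: separate heads read shape, texture, and lighting
    # off the same latent vector.
    shape = (W_shape @ z).reshape(-1, 3)
    texture = (W_tex @ z).reshape(-1, 3)
    lighting = W_light @ z
    return shape, texture, lighting

shape, texture, lighting = predict(rng.normal(size=IMG))
```

Training would then render the predicted mesh with the differentiable renderer and compare it against the input photo, pushing gradients through the whole pipeline.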
Researchers at NVIDIA trained their model on several datasets, including a collection of bird images. Once trained, the system could look at a 2D image of a bird and produce a 3D model with the right shape and texture.
“This is essentially the first time ever that you can take just about any 2D image and predict relevant 3D properties,” says Jun Gao, one of a team of researchers who collaborated on DIB-R.
Beyond robotics, NVIDIA also sees this as being handy for transforming 2D images of dinosaurs and other extinct animals into lifelike 3D models in under a second.