Researchers Developed a New Method for Controlling How AI Systems Create Images

North Carolina State University researchers have created a new cutting-edge method for controlling how artificial intelligence (AI) systems generate images. The research has applications ranging from autonomous robotics to AI training.

The task at hand is conditional image generation, a type of AI task in which AI systems create images that meet a specific set of conditions. For example, a system could be trained to generate unique images of cats or dogs based on the animal requested by the user. More recent techniques have expanded on this to include image layout conditions. This allows users to specify which types of objects should appear in which locations on the screen. For example, the sky could go in one box, a tree in another, a stream in a different box, and so on.
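To make the idea of layout conditions concrete, here is a minimal Python sketch of how such a layout might be written down. The `LayoutBox` class, its field names, and the use of coordinates expressed as fractions of the image size are illustrative assumptions, not the researchers' actual input format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutBox:
    """One layout condition: an object label and where it should appear.

    Hypothetical structure for illustration only.
    """
    label: str
    x: float       # left edge, as a fraction of image width (0 to 1)
    y: float       # top edge, as a fraction of image height (0 to 1)
    width: float   # box width, as a fraction of image width
    height: float  # box height, as a fraction of image height

    def __post_init__(self):
        # Reject boxes that are empty or extend past the image borders.
        tolerance = 1e-9
        if self.x < 0 or self.y < 0 or self.width <= 0 or self.height <= 0:
            raise ValueError(f"invalid box for {self.label!r}")
        if self.x + self.width > 1 + tolerance or self.y + self.height > 1 + tolerance:
            raise ValueError(f"box for {self.label!r} leaves the image")


def describe(layout):
    """Return one human-readable line per box in the layout."""
    return [
        f"{box.label} at ({box.x:.2f}, {box.y:.2f}), "
        f"size {box.width:.2f} x {box.height:.2f}"
        for box in layout
    ]


# The example from the text: sky in one box, a tree in another, a stream in a third.
layout = [
    LayoutBox("sky", 0.0, 0.0, 1.0, 0.4),
    LayoutBox("tree", 0.1, 0.3, 0.3, 0.6),
    LayoutBox("stream", 0.5, 0.6, 0.5, 0.4),
]

for line in describe(layout):
    print(line)
```

A layout-conditioned generator would take a list like this as its input, alongside whatever else it needs, and produce an image whose objects fall inside the requested boxes.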

The new work builds on those techniques to give users more control over the resulting images, and to retain certain characteristics across a series of images.

“Our approach is highly reconfigurable,” says Tianfu Wu, an assistant professor of computer engineering at NC State and co-author of a paper on the work. “Our approach, like previous approaches, allows users to instruct the system to generate an image based on a specific set of conditions. However, ours allows you to keep that image and add to it. Users could, for example, instruct the AI to create a mountain scene. Users could then instruct the system to add skiers to the scene.”

Furthermore, the new approach allows users to instruct the AI to manipulate specific elements so that they are recognizable but have moved or changed in some way. For instance, the AI could generate a series of images in which skiers turn toward the viewer as they move across the landscape.

“One application for this would be to assist autonomous robots in ‘imagining’ what the end result might look like before beginning a given task,” Wu says. “The system could also be used to generate images for AI training. Rather than compiling images from external sources, you could use this system to generate images for training other AI systems.”

Researchers fine-tune control over AI image generation

The COCO-Stuff and Visual Genome datasets were used to test the researchers’ new approach. The new approach outperformed previous state-of-the-art image creation techniques on standard image quality measures.

“Our next step will be to see if we can extend this work to video and three-dimensional images,” Wu says. Training the new approach requires a significant amount of computational power; the researchers used a 4-GPU workstation. Deploying the trained system, however, is less computationally expensive. “We found that one GPU gives you almost real-time speed,” Wu says.

“In addition to our paper, we also published our source code for this approach on GitHub,” Wu says. “Having said that, we are always interested in collaborating with industry partners.”
