Researchers from Stanford, UC Berkeley and Adobe Research have developed a new AI model that can realistically insert specific humans into different scenes

https://arxiv.org/abs/2304.14406

The creative industries have witnessed a new era of possibilities with the advent of generative models, computational tools that can generate text or images based on training data. Inspired by these advances, researchers at Stanford University, UC Berkeley and Adobe Research have introduced a new model that can seamlessly insert specific humans into different scenes with impressive realism.

The researchers employed a self-supervised approach to train a diffusion model, a generative model that converts noise into the desired images by gradually corrupting training data with noise and then learning to reverse that corruption. The model was trained on videos of humans moving through various scenes: for each video, two frames were randomly selected, the person in the first frame was masked out, and the person visible in the second frame was used as a conditioning signal so the model could realistically reconstruct that person in the masked frame.
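To make that setup concrete, here is a minimal, hypothetical sketch of such a self-supervised objective in PyTorch. It is not the authors' code: `ConditionalUNet`, the tensor shapes, and the noise schedule are all placeholder assumptions; the point is only the structure of the training step, masking the person in one frame, adding noise, and training the network to denoise while conditioning on the same person taken from a second frame.

```python
# Hedged sketch of person-conditioned masked denoising training (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConditionalUNet(nn.Module):
    """Stand-in denoiser: predicts noise from (noisy frame, masked scene, mask, reference person, t)."""
    def __init__(self, channels=3, cond_dim=128):
        super().__init__()
        self.cond_encoder = nn.Sequential(
            nn.Conv2d(channels, cond_dim, 4, stride=4),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.net = nn.Sequential(
            nn.Conv2d(channels * 2 + 1 + cond_dim + 1, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, noisy_frame, masked_frame, mask, ref_person, t):
        b, _, h, w = noisy_frame.shape
        cond = self.cond_encoder(ref_person)[:, :, None, None].expand(b, -1, h, w)
        t_map = t.view(b, 1, 1, 1).expand(b, 1, h, w).float() / 1000.0
        x = torch.cat([noisy_frame, masked_frame, mask, cond, t_map], dim=1)
        return self.net(x)


def training_step(model, frame_a, frame_b_person, person_mask, alphas_cumprod):
    """One DDPM-style denoising step on a masked frame, conditioned on the
    same person taken from a different frame of the video."""
    b = frame_a.shape[0]
    masked_frame = frame_a * (1.0 - person_mask)            # hide the person in frame A
    t = torch.randint(0, len(alphas_cumprod), (b,))
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(frame_a)
    noisy = a_bar.sqrt() * frame_a + (1.0 - a_bar).sqrt() * noise  # forward (noising) process
    pred_noise = model(noisy, masked_frame, person_mask, frame_b_person, t)
    return F.mse_loss(pred_noise, noise)                     # learn to reverse the corruption


# Toy usage with random tensors standing in for video frames.
model = ConditionalUNet()
alphas_cumprod = torch.cumprod(1.0 - torch.linspace(1e-4, 0.02, 1000), dim=0)
frame_a = torch.rand(2, 3, 64, 64)            # frame containing the person to be masked
frame_b_person = torch.rand(2, 3, 64, 64)     # same person, cropped from another frame
person_mask = (torch.rand(2, 1, 64, 64) > 0.7).float()
loss = training_step(model, frame_a, frame_b_person, person_mask, alphas_cumprod)
loss.backward()
```

The actual model is of course far larger and trained on a large corpus of human videos; this toy version only illustrates the shape of the objective.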

Through this training process, the model learned to infer plausible poses from the context of the scene, repose the person accordingly, and integrate them seamlessly into the scene. The researchers found that their generative model worked exceptionally well at placing individuals in scenes, producing edited images that appeared highly realistic. On affordance prediction, estimating the possibilities for action or interaction that an environment offers, the model also surpassed previously introduced non-generative models.
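At inference time, the same conditioning can be used to place a reference person into a new scene. The sketch below is again a hedged illustration rather than the authors' sampler; it reuses the hypothetical `ConditionalUNet` and noise schedule from the training sketch above and runs a standard DDPM-style reverse loop that fills a masked region of the scene while conditioning on the reference person.

```python
# Hedged sketch of inference-time person insertion (reuses `model` and `alphas_cumprod` above).
import torch


@torch.no_grad()
def insert_person(model, scene, region_mask, ref_person, alphas_cumprod):
    """DDPM-style ancestral sampling, conditioned on the scene and a reference person."""
    betas = 1.0 - torch.cat([alphas_cumprod[:1], alphas_cumprod[1:] / alphas_cumprod[:-1]])
    masked_scene = scene * (1.0 - region_mask)
    x = torch.randn_like(scene)                      # start from pure Gaussian noise
    for t in reversed(range(len(alphas_cumprod))):
        t_batch = torch.full((scene.shape[0],), t, dtype=torch.long)
        a_bar = alphas_cumprod[t]
        a = 1.0 - betas[t]
        eps = model(x, masked_scene, region_mask, ref_person, t_batch)
        mean = (x - betas[t] / (1.0 - a_bar).sqrt() * eps) / a.sqrt()
        x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean
    # keep the original scene outside the masked region
    return scene * (1.0 - region_mask) + x * region_mask


# Toy usage with random tensors as stand-ins for a real scene and person photo.
scene = torch.rand(1, 3, 64, 64)
region_mask = torch.zeros(1, 1, 64, 64)
region_mask[..., 16:48, 16:48] = 1.0             # where the person should appear
ref_person = torch.rand(1, 3, 64, 64)
result = insert_person(model, scene, region_mask, ref_person, alphas_cumprod)
```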


The findings have significant potential for future research on affordance perception and related areas. They can contribute to advances in robotics research by identifying potential opportunities for interaction. Furthermore, the practical applications of the model extend to the creation of realistic media, including images and video. Integrating the model into creative software tools could enhance image-editing capabilities, supporting artists and media creators. Additionally, the model could be incorporated into photo-editing smartphone applications, allowing users to easily and realistically insert people into their photographs.

The researchers have identified several avenues for future exploration. They aim to incorporate greater controllability into the generated poses and to explore the generation of realistic human movement within scenes rather than static images. Furthermore, they seek to improve the efficiency of the model and to expand the approach beyond humans to arbitrary objects.

In conclusion, the researchers have introduced a new model that enables the realistic insertion of specific people into scenes. Leveraging diffusion-based generation and self-supervised training, the model demonstrates impressive affordance perception and holds potential for a variety of applications in the creative industries and robotics research. Future research will focus on refining and expanding the model's capabilities.


Check out the Paper. Don't forget to subscribe to our 22k+ ML SubReddit, Discord channel, and Email newsletter, where we share the latest news on AI research, cool AI projects, and more. If you have any questions regarding the above article or if you have missed anything, please do not hesitate to email us at Asif@marktechpost.com.


Niharika is a technical consulting intern at Marktechpost. She is a third-year student, currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a very enthusiastic individual with a keen interest in machine learning, data science, and artificial intelligence, and an avid reader of the latest developments in these fields.


