TK1051 : Converting 2D to 3D images of an object using deep learning for use in augmented reality
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2024
Authors:
[Author], Seyed Masoud Mirrezaei [Supervisor]
Abstract: Reconstructing 3D images from 2D images is a central task in machine vision and a prerequisite for applications such as augmented reality, virtual reality, video game development, 3D modeling of objects, medical diagnosis, and product design. This research aims to improve the quality of the generated 3D models, reduce the time needed to produce them, and examine the challenges of reconstructing 3D images from 2D images together with solutions to them. The main challenges in this field are insufficient resolution of the generated model, incorrect separation of the foreground from the background, and long generation times, which this study addresses with a new method. The study proposes a framework for generating 3D meshes based on large reconstruction models, combining the strengths of an off-the-shelf multi-view diffusion model and a sparse-view reconstruction model. By combining these two models and using geometric controls such as depth and normals, the method produces 3D models in a few seconds. One innovation of this research is a pre-processing stage that accurately separates the foreground from the background. Another is a post-processing stage that applies texture baking to the generated mesh, which contributes substantially to the accuracy of the final 3D model. Other notable innovations include replacing computationally heavy and time-consuming volumetric representations with an iso-surface extraction module, improving the camera's ability to correctly recognize different views of the object, optimizing the neural representation, and reducing execution time.
The main findings show that the proposed solutions have a significant impact on the production of realistic and interactive 3D models. For example, on the Google Scanned Objects (GSO) dataset, the proposed method achieves an F-score (FS) of 0.884 and a PSNR of 22.81, outperforming the TripoSR, LGM, CRM, SV3D, and InstantMesh methods, which makes the pin-guided method suitable for use in a variety of applications.
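The two metrics reported above can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function names, the pixel range `max_value`, the distance `threshold`, and the use of sampled surface points for the F-score are all assumptions made for illustration, since the abstract does not state these settings.

```python
import numpy as np


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    max_value is the assumed maximum pixel value (1.0 for normalized images).
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    mse = np.mean((rendered - reference) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def f_score(predicted_points, reference_points, threshold):
    """F-score between two point sets of shape (N, 3) and (M, 3).

    Precision is the fraction of predicted points within `threshold` of some
    reference point; recall is the same with the roles swapped. The threshold
    is an assumed parameter, not a value taken from the thesis.
    """
    p = np.asarray(predicted_points, dtype=np.float64)
    r = np.asarray(reference_points, dtype=np.float64)
    # Brute-force pairwise distances, shape (N, M); fine for small point sets.
    d = np.linalg.norm(p[:, None, :] - r[None, :, :], axis=-1)
    precision = np.mean(d.min(axis=1) <= threshold)
    recall = np.mean(d.min(axis=0) <= threshold)
    if precision + recall == 0.0:
        return 0.0
    return float(2.0 * precision * recall / (precision + recall))
```

Higher values are better for both metrics. In practice the F-score is usually computed on points sampled from the generated and ground-truth meshes, and PSNR on renderings from matching viewpoints.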
Keywords:
#3D model #augmented reality #virtual reality #diffusion model #3D mesh #iso surface extraction module
Keeping place: Central Library of Shahrood University