Implementation Approach for Duplicate Image Identification and Removal
1. Zaw Ye Htet,
Student, Yangon Technological University, Myanmar
2. Tin Shine Aung,
Lecturer, Yangon Technological University, Myanmar
This paper presents a systematic approach for identifying and removing duplicate images from collections of 3D images in various formats. The identification process considers image structure, density, meta descriptions, and other properties. The system employs a preprocessing module that standardises diverse formats such as STL, OBJ, and FBX and extracts their meta descriptions. A vector database, built with a tool such as FAISS or Milvus, stores the image vectors and meta descriptions for efficient similarity searches. Deep learning models, particularly Convolutional Neural Networks (CNNs), are trained to extract image features, and the resulting vectors are compared using cosine similarity or Euclidean distance. An integrated search engine allows users to find similar images by uploading an image and its meta description. A human validation interface is provided for manual confirmation of potential duplicates. This approach supports efficient management and retrieval of 3D images while improving storage utilisation. Future work will explore alternative models and similarity measures to improve system accuracy and efficiency.
This article details an implementation technique for finding and removing duplicate images in 3D image collections. It offers a comprehensive and scalable solution to a problem that many businesses encounter. Using deep learning methods, namely Convolutional Neural Networks (CNNs), the system extracts image features and compares the resulting vectors with cosine similarity or Euclidean distance. Duplicate detection draws on structure, density, and meta-descriptions together rather than on any single property, which is intended to keep the results accurate.
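The vector comparison described above can be illustrated with a minimal sketch. It assumes each image has already been reduced to a fixed-length feature vector (for example by a CNN); the function names and the `min_cosine` threshold of 0.98 are hypothetical and are not taken from the article.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return float(np.dot(a, b) / denom)


def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Straight-line distance between two feature vectors (0.0 means identical)."""
    return float(np.linalg.norm(a - b))


def is_probable_duplicate(a: np.ndarray, b: np.ndarray, min_cosine: float = 0.98) -> bool:
    """Flag a pair as a candidate duplicate. The threshold is an assumed value."""
    return cosine_similarity(a, b) >= min_cosine


# Hypothetical feature vectors standing in for CNN output.
original = np.array([0.2, 0.9, 0.4])
rescaled = 2.0 * original          # same direction, different magnitude
unrelated = np.array([0.9, -0.1, 0.0])

print(cosine_similarity(original, rescaled), euclidean_distance(original, rescaled))
print(is_probable_duplicate(original, unrelated))
```

Cosine similarity ignores vector magnitude while Euclidean distance does not, so the two measures can disagree on uniformly scaled vectors; which one suits a given collection would have to be determined empirically.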
The system's preprocessing module standardises various 3D image formats and extracts necessary meta-descriptions, facilitating consistent and accurate processing. Utilising a vector database like FAISS or Milvus, the system efficiently stores and retrieves image vectors, enabling rapid and precise similarity searches. Including a search engine allows users to find similar images by uploading an image and its meta description, further enhancing the system's utility.
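As a rough illustration of what the vector store contributes, the sketch below implements a brute-force similarity index in NumPy. It is a stand-in for FAISS or Milvus and does not use either library's API; the class name `InMemoryVectorIndex`, its methods, and the example file names are all hypothetical.

```python
import numpy as np


class InMemoryVectorIndex:
    """Brute-force cosine-similarity index (illustrative stand-in for a vector database)."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self._vectors = np.empty((0, dim), dtype=np.float32)
        self._meta: list[dict] = []

    @staticmethod
    def _normalise(v: np.ndarray) -> np.ndarray:
        # Unit-length vectors make the inner product equal to cosine similarity.
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v

    def add(self, vector, meta: dict) -> None:
        """Store one image vector together with its meta-description."""
        v = np.asarray(vector, dtype=np.float32).reshape(1, self.dim)
        self._vectors = np.vstack([self._vectors, self._normalise(v)])
        self._meta.append(meta)

    def search(self, query, k: int = 3) -> list[tuple[dict, float]]:
        """Return up to k stored items, most similar first, with their scores."""
        q = self._normalise(np.asarray(query, dtype=np.float32).reshape(self.dim))
        scores = self._vectors @ q
        top = np.argsort(-scores)[:k]
        return [(self._meta[i], float(scores[i])) for i in top]


# Hypothetical entries; the vectors stand in for extracted image features.
index = InMemoryVectorIndex(dim=3)
index.add([1.0, 0.0, 0.0], {"name": "bracket.stl", "format": "STL"})
index.add([0.0, 1.0, 0.0], {"name": "gear.obj", "format": "OBJ"})
index.add([0.9, 0.1, 0.0], {"name": "bracket_copy.fbx", "format": "FBX"})

for meta, score in index.search([1.0, 0.0, 0.0], k=2):
    print(meta["name"], round(score, 3))
```

A linear scan like this costs time proportional to the collection size on every query, which is the cost that dedicated vector indexes are designed to reduce on large collections.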
The human validation interface is a crucial system feature, allowing users to manually confirm or reject potential duplicates flagged by the AI. This human-centric approach ensures that the system can handle ambiguous cases and meet user expectations. Integrating user feedback helps continuously refine and improve the system's accuracy.
Exploring alternative approaches, such as using Transformer-based models like Vision Transformers (ViTs) for image feature extraction, offers promising avenues for further improvement. Additionally, considering different similarity measures, such as Jaccard similarity and Hamming distance, and leveraging Natural Language Processing (NLP) techniques for meta-description analysis can enhance the system's performance and accuracy.
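The two alternative measures named above can be sketched briefly. This example assumes Jaccard similarity is applied to the word sets of two meta-descriptions and Hamming distance to fixed-length binary codes stored as integers; the article does not specify how either would be applied, so both choices and the function names are assumptions.

```python
def jaccard_similarity(description_a: str, description_b: str) -> float:
    """Shared words divided by total distinct words across two meta-descriptions."""
    tokens_a = set(description_a.lower().split())
    tokens_b = set(description_b.lower().split())
    if not tokens_a and not tokens_b:
        # Two empty descriptions are treated as identical by convention.
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


# Hypothetical meta-descriptions and binary codes.
print(jaccard_similarity("Steel mounting bracket", "steel bracket"))
print(hamming_distance(0b1011, 0b1001))
```

Whitespace tokenisation is the simplest possible choice here; the NLP techniques mentioned above (for example stemming or embeddings) would replace it in a more robust comparison.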
In conclusion, the presented approach improves storage efficiency and enhances the accuracy and speed of duplicate image identification and removal. Future work will explore these alternative models and similarity measures to further optimise the system, ensuring it remains a robust solution for managing extensive collections of 3D images.
The study's design, data collection, result analysis, and manuscript preparation were entirely managed by the authors.
No grants from public, commercial, or non-profit funding agencies supported the research, authorship, or publication of this article.
The authors disclose no conflicts of interest in relation to this work.
There are no data available for sharing in this work.
The research did not involve the use of any particular software or tools.
Our gratitude goes to those who assisted in this study and manuscript preparation, and to the anonymous reviewers for their constructive insights.
Copyright: ©2024 Corresponding Author. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Htet, Zaw Ye, and Aung, Tin Shine. “Implementation Approach for Duplicate Image Identification and Removal.” Scientific Research Journal of Science, Engineering and Technology, vol. 2, no. 1, 2024, pp. 11-17, https://isrdo.org/journal/SRJSET/currentissue/implementation-approach-for-duplicate-image-identification-and-removal