Abstract
Digital facilitator networks that manage extensive visual documentation in multiple languages struggle to retrieve related content when the same diagram exists in several language versions. Traditional metadata-based search cannot identify visually similar materials, which reduces content reuse. This research addresses the implementation of an image similarity search system for organizational environments that process approximately 66 diagrams per week. The objective is to identify exact matches and language variants accurately while remaining computationally efficient on standard hardware. The system employs multi-scale feature extraction that combines color histogram analysis in the RGB and HSV spaces, multi-threshold edge detection with Canny operators, Sobel gradient texture characterization, and template signatures, producing a 261-dimensional feature representation. A tiered similarity assessment framework evaluates relationships between images, and language variant detection combines filename pattern analysis with visual similarity verification. The implementation uses the Flask framework with the Open Source Computer Vision Library (OpenCV). Testing on organizational diagrams in English, Russian, and Kazakh shows 92% accuracy in identifying exact matches and language variants, with an average response time of 1.8 seconds and peak memory usage of 72 MB. Language variant detection achieves 89% accuracy with a false positive rate below 3%. The modular architecture allows deployment on conventional office systems, demonstrating that traditional computer vision approaches remain applicable to organizational content management when adapted to practical constraints.
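
To illustrate how filename pattern analysis might be combined with visual similarity verification in a tiered assessment, the following Python sketch classifies a pair of images from their file names and precomputed feature vectors. The suffix convention (`_en`, `_ru`, `_kk`), the use of cosine similarity, both threshold values, the tier names, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math
import re
from pathlib import PurePath

# Hypothetical naming scheme: "<stem>_<lang>" or "<stem>-<lang>" (an assumption).
LANG_SUFFIX = re.compile(r"^(?P<stem>.+?)[_-](?P<lang>en|ru|kk)$", re.IGNORECASE)

# Hypothetical tier thresholds on cosine similarity (not taken from the paper).
EXACT_THRESHOLD = 0.98
VARIANT_THRESHOLD = 0.85


def split_language(filename):
    """Return (stem, lang); lang is None when no language suffix is found."""
    name = PurePath(filename).stem
    match = LANG_SUFFIX.match(name)
    if match:
        return match.group("stem").lower(), match.group("lang").lower()
    return name.lower(), None


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_pair(name_a, feat_a, name_b, feat_b):
    """Assign a tier using both the file names and the visual similarity."""
    sim = cosine_similarity(feat_a, feat_b)
    stem_a, lang_a = split_language(name_a)
    stem_b, lang_b = split_language(name_b)
    names_suggest_variant = (
        stem_a == stem_b
        and lang_a is not None
        and lang_b is not None
        and lang_a != lang_b
    )
    # The filename hint alone is not trusted; it must be confirmed visually.
    if names_suggest_variant and sim >= VARIANT_THRESHOLD:
        return "language_variant"
    if sim >= EXACT_THRESHOLD:
        return "exact_match"
    if sim >= VARIANT_THRESHOLD:
        return "similar"
    return "unrelated"
```

In this sketch, a call such as `classify_pair("org_chart_en.png", features_en, "org_chart_ru.png", features_ru)` reports a language variant only when the names share a stem, the language suffixes differ, and the feature vectors are sufficiently close. Requiring visual confirmation of the filename hint is one plausible way to keep false positives low.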


