Systems and Means of Informatics
2025, Volume 35, Issue 3, pp 17-32
DEVELOPMENT OF A SMALL-OBJECT AUGMENTATION METHOD BASED ON SUPER-RESOLUTION NETWORKS
- P. O. Arkhipov
- S. L. Philippskih
- M. V. Tsukanov
Abstract
The paper examines the limitations of modern data augmentation methods when applied to images captured by unmanned aerial vehicles in scenes with high object density and small object sizes. A specialized method, Contextual Small-Object Augmentation, is proposed that places visually enhanced objects into semantically relevant regions of the image while preserving spatial realism. The study focuses on the data augmentation module that uses super-resolution (SR) networks to improve the visual quality of small objects. Three state-of-the-art SR models were selected for this purpose: RCAN, Real-ESRGAN, and SwinIR.
The impact of the SR models on object detection and classification accuracy was evaluated using the SSD MobileNet V2 FPNLite 320x320 model trained on several versions of the VisDrone benchmark dataset. Detection results were compared against a baseline model trained on the original dataset, following the COCO evaluation protocol. The experimental results demonstrate that incorporating SR networks into the augmentation pipeline significantly improves the detection accuracy of small objects while maintaining computational efficiency.
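The general idea of enhancing a small object and pasting it elsewhere in the image can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe how placement regions are chosen or at what size the enhanced object is pasted, so the target position is simply passed in by the caller, and nearest-neighbour upscaling stands in for an SR network such as RCAN, Real-ESRGAN, or SwinIR. All function names are hypothetical. The only outside fact used is the COCO convention that an object with area below 32x32 pixels counts as "small".

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels

# COCO evaluation treats objects with area below 32^2 pixels as "small".
COCO_SMALL_AREA = 32 * 32


def is_small(box: Box) -> bool:
    """Return True if the box falls into the COCO 'small' size bucket."""
    _, _, w, h = box
    return w * h < COCO_SMALL_AREA


def crop(image: Image, box: Box) -> Image:
    """Cut the object patch out of the image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def upscale_nearest(patch: Image, scale: int) -> Image:
    """Stand-in for an SR network (assumption): nearest-neighbour upscaling."""
    out: Image = []
    for row in patch:
        wide = [p for p in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


def paste(image: Image, patch: Image, x: int, y: int) -> Tuple[Image, Box]:
    """Paste the patch at (x, y) on a copy of the image and return its new box."""
    h, w = len(patch), len(patch[0])
    if x < 0 or y < 0 or y + h > len(image) or x + w > len(image[0]):
        raise ValueError("patch does not fit at the requested location")
    result = [list(row) for row in image]
    for dy, row in enumerate(patch):
        result[y + dy][x:x + w] = row
    return result, (x, y, w, h)


def augment_small_object(image: Image, box: Box, x: int, y: int,
                         scale: int = 2) -> Tuple[Image, List[Box]]:
    """Enhance one small object and paste a copy at a caller-chosen position.

    Objects that are not 'small' are left alone. The returned box list holds
    the original annotation plus, if a paste happened, the new one.
    """
    if not is_small(box):
        return image, [box]
    enhanced = upscale_nearest(crop(image, box), scale)
    new_image, new_box = paste(image, enhanced, x, y)
    return new_image, [box, new_box]


if __name__ == "__main__":
    img = [[0] * 8 for _ in range(8)]
    for r in (1, 2):
        for c in (1, 2):
            img[r][c] = 9          # a 2x2 "object"
    aug, boxes = augment_small_object(img, (1, 1, 2, 2), x=4, y=4)
    print(boxes)
```

In a real pipeline the stand-in upscaler would be replaced by an SR model call, and the paste position would come from whatever context analysis the method uses. Both are outside what the abstract specifies.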
About this article
Cover Date
2025-11-10
DOI
10.14357/08696527250302
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
object detection; object classification; transformer; convolutional neural network; generative adversarial network; data augmentation
Authors
P. O. Arkhipov, S. L. Philippskih, and M. V. Tsukanov
Author Affiliations
Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation