JOURNAL OF NANJING FORESTRY UNIVERSITY ›› 2023, Vol. 47 ›› Issue (3): 1-10. DOI: 10.12302/j.issn.1000-2006.202206048

Special Issue: Selected Papers from the 3rd China Forestry and Grassland Computer Application Conference


Research on the optimized pest image instance segmentation method based on the Swin Transformer model

GAO Jiajun, ZHANG Xu, GUO Ying, LIU Yukun, GUO Anqi, SHI Mengmeng, WANG Peng, YUAN Ying

  1. Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091, China
  • Received: 2022-06-25  Revised: 2022-12-27  Online: 2023-05-30  Published: 2023-05-25
  • Contact: GUO Ying  E-mail: wuligaojiajun@163.com; guoying@ifrit.ac.cn

Abstract:

【Objective】To achieve accurate pest monitoring, the authors propose an optimized instance segmentation method based on the Swin Transformer that addresses the difficulty of recognizing and segmenting multiple larval individuals in images captured under complex real-world scenarios.【Method】The Swin Transformer was selected to improve the backbone network of the Mask R-CNN instance segmentation model and was used to identify and segment larvae of Heortia vitessoides, a pest of Aquilaria sinensis. The input and output dimensions of all layers of Swin Transformer and ResNet models with different structural parameters were adjusted, and both were set as backbone networks of Mask R-CNN for comparative experiments. The identification and segmentation performance of the resulting Mask R-CNN models for H. vitessoides larvae was analyzed quantitatively and qualitatively across the different backbone networks to determine the best model structure. A configuration sketch illustrating the backbone replacement is given below.【Result】(1) With the proposed method, the F1 score and AP reached 89.7% and 88.0%, respectively, for pest detection (bounding boxes), and 84.3% and 82.2%, respectively, for pest segmentation, improvements of 8.75% and 8.40% over the Mask R-CNN model for target framing and segmentation, respectively. (2) For small-target pest identification and segmentation, the F1 score and AP were 88.4% and 86.3%, respectively, for detection, and 84.0% and 81.7%, respectively, for segmentation, improvements of 9.30% and 9.45% over the Mask R-CNN model for target framing and segmentation, respectively.【Conclusion】In segmentation tasks under complex real-world scenarios, recognition and segmentation performance depends largely on the model's ability to extract image features. With the Swin Transformer integrated into its backbone network, the Mask R-CNN instance segmentation model extracts features more effectively and achieves better overall recognition and segmentation results. The method can provide technical support for pest identification and monitoring, and offers solutions for protecting agricultural, forestry, animal husbandry, and other industrial resources.
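The abstract does not state which training framework was used. As a minimal sketch only, the following assumes the MMDetection config system and shows one plausible way to replace the ResNet-50 backbone of Mask R-CNN with Swin-T while adjusting the FPN input channels to match the Swin stage outputs, which is the kind of dimension adjustment the Method section describes. The base config name is an assumption for illustration, not the authors' exact setting.

# Hypothetical MMDetection-style config (assumed framework, not confirmed by the paper):
# swap Mask R-CNN's ResNet-50 backbone for Swin-T and adapt the FPN input channels.

_base_ = './mask_rcnn_r50_fpn_1x_coco.py'  # assumed base config name

model = dict(
    backbone=dict(
        _delete_=True,                    # drop the inherited ResNet-50 settings
        type='SwinTransformer',           # Swin-T hierarchical backbone
        embed_dims=96,                    # stage-1 channel width (Swin-T)
        depths=[2, 2, 6, 2],              # transformer blocks per stage
        num_heads=[3, 6, 12, 24],         # attention heads per stage
        window_size=7,                    # local attention window size
        out_indices=(0, 1, 2, 3),         # feed all four stages to the FPN
        drop_path_rate=0.2),
    neck=dict(
        type='FPN',
        in_channels=[96, 192, 384, 768],  # Swin-T stage widths; ResNet-50 would be [256, 512, 1024, 2048]
        out_channels=256,
        num_outs=5))

The essential change illustrated here is the neck's in_channels list: the four Swin-T stages emit 96/192/384/768-channel feature maps rather than ResNet-50's 256/512/1024/2048, so the layer dimensions feeding the feature pyramid must be adjusted accordingly.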

Key words: pest recognition, Swin Transformer, Mask R-CNN, instance segmentation, Aquilaria sinensis, Heortia vitessoides
