Journal of Nanjing Forestry University (Natural Sciences Edition), 2024, Vol. 48, Issue (4): 93-103. doi: 10.12302/j.issn.1000-2006.202209055

Special topic: Special Report III: Forest Visualization Research in Smart Forestry (Executive editors-in-chief: LI Fengri, ZHANG Huaiqing, CAO Lin)

UAV forestry land-cover image segmentation method based on attention mechanism and improved DeepLabV3+

ZHAO Yugang1, LIU Wenping1,*, ZHOU Yan1, CHEN Riqiang1, ZONG Shixiang2, LUO Youqing2

  1. School of Information, Beijing Forestry University, Engineering Research Center for Forestry-oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083, China
  2. School of Forestry, Beijing Forestry University, Beijing 100083, China
  • Received: 2022-09-24 Revised: 2022-11-01 Online: 2024-07-30 Published: 2024-08-05
  • Corresponding author: *LIU Wenping (wendyl@vip.163.com), professor.
  • About the author: ZHAO Yugang (15621377528@163.com).
  • Funding: Major Emergency Science and Technology Project of the National Forestry and Grassland Administration (ZD202001); National Key Research and Development Program of China (2021YFD1400900)

Abstract:

【Objective】This study proposes Tree-DeepLab, a land-cover segmentation method for unmanned aerial vehicle (UAV) images of forest areas, based on attention mechanisms and the DeepLabV3+ semantic segmentation network, to extract the distribution of the main land-cover types in forest areas.【Method】First, the forest images were annotated according to six land-cover types (Platanus orientalis, Ginkgo biloba, Populus sp., grassland, road, and bare ground) to build a semantic segmentation dataset. Second, the following improvements were made to the semantic segmentation network: (1) the Xception network, the backbone of the DeepLabV3+ semantic segmentation network, was replaced by ResNeSt101, which has a split-attention mechanism; (2) the atrous convolutions in the atrous spatial pyramid pooling module were connected in a combined serial and parallel form, and the combination of dilation rates was changed; (3) a shallow feature fusion branch was added to the decoder; (4) a spatial attention module was added to the decoder; and (5) an efficient channel attention module was added to the decoder.【Result】Training and testing were performed on the self-built dataset. The Tree-DeepLab semantic segmentation model achieved a mean pixel accuracy (mPA) of 97.04% and a mean intersection over union (mIoU) of 85.01%, which are 4.03 and 14.07 percentage points higher, respectively, than those of the original DeepLabV3+, and it outperformed the U-Net and PSPNet semantic segmentation networks.【Conclusion】The Tree-DeepLab semantic segmentation network can effectively segment UAV aerial images of forest areas to obtain the distribution of the main land-cover types.
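
The two metrics reported in the abstract, mPA and mIoU, are commonly computed from a per-class confusion matrix. The sketch below illustrates those common definitions with NumPy on a toy example; it is not taken from the paper. The function names, the three-class toy labels, and the choice to skip classes that never occur are all assumptions made for illustration, and the exact definitions used by the authors may differ.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (truth, prediction) pair as a single index, then count.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_pixel_accuracy(cm):
    """Average over classes of (correct pixels / ground-truth pixels)."""
    tp = np.diag(cm).astype(float)
    gt = cm.sum(axis=1)
    present = gt > 0  # assumption: skip classes absent from the ground truth
    return float(np.mean(tp[present] / gt[present]))


def mean_iou(cm):
    """Average over classes of intersection / union."""
    tp = np.diag(cm).astype(float)
    union = cm.sum(axis=1) + cm.sum(axis=0) - tp
    present = union > 0  # assumption: skip classes that never occur
    return float(np.mean(tp[present] / union[present]))


# Hypothetical 2x3 label maps with three classes (not data from the paper).
y_true = np.array([[0, 0, 1], [1, 2, 2]])
y_pred = np.array([[0, 1, 1], [1, 2, 0]])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
mpa = mean_pixel_accuracy(cm)
miou = mean_iou(cm)
print(f"mPA = {mpa:.4f}, mIoU = {miou:.4f}")
```

For these toy label maps the per-class accuracies are 1/2, 1 and 1/2, and the per-class IoU values are 1/3, 2/3 and 1/2, so the sketch gives an mPA of 2/3 and an mIoU of 0.5. Because mIoU also penalizes false positives, it is typically lower than mPA, which is consistent with the gap between the two figures reported above.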

Key words: unmanned aerial vehicle (UAV), land-cover image segmentation, forestry images, DeepLabV3+, attention mechanism, ResNeSt

CLC number: