Birdsong classification research based on multi-view ensembles
LIU Jiang, ZHANG Yan, LYU Danju, LU Jing, XIE Shanshan, ZI Jiali, CHEN Xu, ZHAO Youjie
Journal of Nanjing Forestry University (Natural Sciences Edition), 2023, Vol. 47, Issue 4: 23-30.
【Objective】This study aimed to build a birdsong classification model with strong generalization ability that integrates multi-view features and makes full use of the available feature information, in support of research on bird species diversity protection and intelligent monitoring of the ecological environment.
【Method】Audio data for 16 types of birdsong were used as the research objects. Three feature extraction methods, the short-time Fourier transform (STFT), the wavelet transform (WT) and the Hilbert-Huang transform (HHT), were used to generate three types of birdsong spectrograms, which together constituted the multi-view feature data. With these spectrograms as the input of a convolutional neural network (CNN), one base classifier was trained per view: STFT-CNN, WT-CNN and HHT-CNN. The multi-view bagging ensemble convolutional neural network (MVB-CNN) and the multi-view stacking ensemble convolutional neural network (MVS-CNN) were then constructed using the bagging and stacking ensemble methods, respectively. Drawing on the feature extraction capability of CNNs, a multi-view cascaded ensemble convolutional neural network (MVC-CNN) was also proposed; it cascades and fuses the deep features that the CNNs extract from the different views and obtains the final classification with a support vector machine (SVM).
【Result】The accuracies of the base classification models WT-CNN, STFT-CNN and HHT-CNN were 89.11%, 88.36% and 81.00%, respectively. The accuracies of the ensemble models MVB-CNN and MVS-CNN were 89.92% and 93.54%, respectively, and the accuracy of the multi-view cascaded ensemble model MVC-CNN was 95.76%. The accuracy of MVC-CNN was therefore 6.65 to 14.76 percentage points higher than that of the single-view classification models, and 5.84 and 2.22 percentage points higher than that of MVB-CNN and MVS-CNN, respectively.
【Conclusion】The MVC-CNN model proposed in this study combines the complementary strengths of the multi-view birdsong features and improves birdsong classification performance, with greater stability and better generalization ability. It provides a technical solution for multi-view birdsong classification research.
feature extraction / multi-view / ensemble learning / convolutional neural network
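The abstract contrasts two ways of combining the three views: fusing the decisions of the base classifiers (as in the bagging and stacking ensembles) and fusing their deep features before a final classifier (as in MVC-CNN). The sketch below is only a minimal illustration of that distinction. It assumes plain majority voting as the decision-level rule, which the abstract does not specify, and it stops at feature concatenation without training an SVM. The function names, species labels and feature values are all hypothetical.

```python
from collections import Counter


def majority_vote(view_predictions):
    """Decision-level fusion: return the label predicted by most views.

    Ties go to the view listed first (an assumption made for this sketch).
    Raises ValueError if the list of predictions is empty.
    """
    counts = Counter(view_predictions)
    best = max(counts.values())
    for label in view_predictions:
        if counts[label] == best:
            return label


def cascade_features(view_features):
    """Feature-level fusion: concatenate per-view feature vectors into one."""
    fused = []
    for features in view_features:
        fused.extend(features)
    return fused


# Hypothetical outputs for a single recording (values invented for illustration).
predictions = {"STFT-CNN": "species_A", "WT-CNN": "species_B", "HHT-CNN": "species_A"}
deep_features = {"STFT-CNN": [0.1, 0.9], "WT-CNN": [0.4, 0.2], "HHT-CNN": [0.7, 0.3]}

voted_label = majority_vote(list(predictions.values()))
fused_vector = cascade_features(list(deep_features.values()))
```

In a cascade-style model, a vector such as `fused_vector` would be the input to the final classifier (an SVM in the abstract), whereas `voted_label` is already a final decision.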