Research on feature selection based on single feature classification accuracy (基于单个特征分类准确率的特征选择方法研究)

Journal of Nanjing Forestry University (Natural Sciences Edition) [ISSN: 1000-2006 / CN: 32-1161/S]

Issue:
2019, Issue 04
Page:
109-116
Column:
Research article
Publish date:
2019-07-24

Article Info:

Title:
Research on feature selection based on single feature classification accuracy
Article ID:
1000-2006(2019)04-0109-08
Author(s):
DU Xuehui, MENG Chun*, LIU Meishuang
(College of Engineering and Technology, Northeast Forestry University, Harbin 150040, China)
Keywords:
feature selection; single feature classification accuracy; Landsat-8 satellite; random forest (RF); support vector machine (SVM); remote sensing classification
Classification number:
S757.2
DOI:
10.3969/j.issn.1000-2006.201807059
Document Code:
A
Abstract:
【Objective】 Random forest (RF) and support vector machine (SVM) classifiers were used to explore a feature selection method that guarantees classification accuracy while reducing the feature dimension in remote sensing classification.
【Method】 The Fuxing Forest Farm in Antu County, Jilin Province, was the study area, and a 2015 Landsat-8 image was the data source. A total of 19 indicators were extracted as classification features: spectral (red, green, blue, near-infrared and shortwave infrared bands), vegetation index (NDVI, enhanced vegetation index, ratio vegetation index and bare soil vegetation index), texture (homogeneity, mean, second moment, variance, difference, contrast, entropy and correlation) and topographic (slope and aspect) information. Features were selected according to the classification accuracy of each single feature in the RF and SVM classifiers, with selection based on the RF feature importance estimate serving as the comparison. The selected features were then handled in two ways, with or without principal component analysis (PCA), and classified with the RF and SVM classifiers. Finally, classification accuracy was evaluated and the optimal combination of features and classifier was determined.
【Result】 The best-performing combination selected features by SVM single feature classification accuracy, applied PCA to the selected features and classified them with RF. With a feature dimension of 5, it reached an overall accuracy of 0.86 and a Kappa coefficient of 0.83. Compared with classification using all features, accuracy improved and the dimension decreased, which increased classification speed. RF classification of features selected by RF feature importance also achieved high accuracy, but the accuracy fluctuated greatly when the feature dimension was below 7: it peaked at 0.88 at a dimension of 4, dropped immediately to 0.83 and then stayed at that value. Accuracy for features selected by single feature classification accuracy changed more gradually; for the best combination described above, it fluctuated within a range of approximately 0.02. Selecting features by single feature classification accuracy did not affect the RF and SVM classifiers; in the subsequent classification, the SVM classifier was more accurate than RF. Two cross-classifier combinations, SVM classification of features selected by RF single feature classification accuracy and RF classification of features selected by SVM single feature classification accuracy, each with PCA applied to the selected features, were compared with using a single classifier (SVM or RF) for both selection and classification; the cross-classifier combinations showed higher accuracy.
【Conclusion】 Feature selection based on single feature classification accuracy can guarantee classification accuracy while reducing the feature dimension. Classification with features selected this way was more stable than with features selected by the RF feature importance estimate. Both the selected features and the final classification accuracy differed depending on which classifier supplied the single feature accuracy. Using different classifiers for feature selection and for classification performed better than using a single classifier for both. At medium and low dimensions, the classification accuracy of the RF classifier may be related to the input order of the features, and applying PCA to the input features may help improve the accuracy and stability of RF.
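
As a rough illustration of the workflow the abstract reports as best (rank features by SVM single feature classification accuracy, apply PCA to the selected features, classify with RF, then report overall accuracy and the Kappa coefficient), a minimal Python sketch using scikit-learn might look like the following. The synthetic two-class data, the helper names `rank_by_single_feature_accuracy` and `make_toy_data`, the 5-fold cross-validation, k = 3 and all hyperparameters are assumptions made for illustration only; the study's 19 Landsat-8 features, its samples and its parameter settings are not reproduced here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC


def rank_by_single_feature_accuracy(estimator, X, y, cv=5):
    """Score every feature by the cross-validated accuracy of `estimator`
    trained on that feature alone; return indices sorted best first."""
    scores = np.array([
        cross_val_score(estimator, X[:, [j]], y, cv=cv, scoring="accuracy").mean()
        for j in range(X.shape[1])
    ])
    order = np.argsort(-scores)  # descending accuracy
    return order, scores


def make_toy_data(n_samples=300, n_noise=6, seed=0):
    """Hypothetical stand-in data: two informative features plus pure noise."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    informative = np.column_stack([
        y + rng.normal(0, 0.3, n_samples),  # feature 0: separates classes well
        y + rng.normal(0, 0.6, n_samples),  # feature 1: separates classes weakly
    ])
    noise = rng.normal(0, 1, size=(n_samples, n_noise))
    return np.hstack([informative, noise]), y


X, y = make_toy_data()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Step 1: rank features by SVM single feature accuracy (training data only).
order, scores = rank_by_single_feature_accuracy(SVC(kernel="rbf"), X_train, y_train)
k = 3  # assumed number of features to keep
selected = order[:k]

# Step 2: apply PCA to the selected features (fitted on training data only).
pca = PCA(n_components=k).fit(X_train[:, selected])

# Step 3: classify the transformed features with a random forest.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(pca.transform(X_train[:, selected]), y_train)
y_pred = rf.predict(pca.transform(X_test[:, selected]))

# Step 4: evaluate with overall accuracy and the Kappa coefficient.
overall_accuracy = accuracy_score(y_test, y_pred)
kappa = cohen_kappa_score(y_test, y_pred)
print("selected feature indices:", selected.tolist())
print(f"overall accuracy = {overall_accuracy:.2f}, kappa = {kappa:.2f}")
```

Passing `RandomForestClassifier` instead of `SVC` to the ranking helper would give the RF single feature variant, and skipping the PCA step would correspond to the case without principal component analysis. Ranking and PCA are fitted on the training split only so that the reported accuracy is not inflated by the test samples; whether the study used the same protocol is not stated in the abstract.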


Last Update: 2019-07-22