
Research on forest fire risk evaluation based on machine learning algorithm
【Objective】 Forest fire risk maps help make patrols more effective and optimize the allocation of limited fire prevention resources. Using terrain, human activity, vegetation, and meteorological data, this study builds machine learning models to predict forest fire occurrence, as a reference for forest fire protection. 【Method】 Taking Jiushan Mountain in Chuzhou, Anhui Province, as the study area, we extracted eight potential driving factors for the forest area: slope, elevation, aspect, distance to settlements, distance to roads, topographic wetness index, normalized difference vegetation index, and temperature. We evaluated these as drivers of fire occurrence and grouped them into four categories: topography, human activity, vegetation, and meteorological factors. Historical fire points in the forest area were extracted from Sentinel fire products, and prediction models of forest fire occurrence were then built with machine learning algorithms. Finally, model accuracy was evaluated using confusion matrix metrics and receiver operating characteristic (ROC) curves. 【Result】 Vegetation, temperature, and distance to roads were the main driving factors of fire occurrence in the study area. The ROC curves of the two models showed that the logistic regression model had an accuracy of 71.07% and an area under the curve (AUC) of 0.7172, while the random forest model was more accurate, with an accuracy of 84.91% and an AUC of 0.8501. 【Conclusion】 The random forest model showed better predictive ability than the logistic regression model. The forest fire risk map generated from the random forest model shows that 11.91% (29.36 km²) of the study area falls in the high or extremely high risk levels. The forest fire risk map can help forest fire protection managers take appropriate fire prevention measures and protect forest resources.
fire risk / machine learning / prediction model / forest vegetation / meteorological factors
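The abstract reports two evaluation measures for each model: accuracy derived from a confusion matrix, and the area under the ROC curve. The sketch below shows one way these two quantities can be computed for a binary fire/no-fire classifier. It is a minimal illustration and not the authors' implementation: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for the example, and none of the numbers come from the study.

```python
# Illustrative sketch only. The labels, scores, and the 0.5 threshold
# below are invented for the example; they are not data from the study.

def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels (1 = fire, 0 = no fire)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Share of points classified correctly: (TP + TN) / all points."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen fire point receives a
    higher score than a randomly chosen non-fire point (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation set: true labels and predicted fire probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

# Assumed decision threshold of 0.5 to turn probabilities into classes.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
print(accuracy(y_true, y_pred))          # 0.75
print(roc_auc(y_true, scores))           # 0.9375
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two measures are usually reported together when comparing models such as logistic regression and random forest.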