Abstract
Particulate matter with an aerodynamic diameter smaller than 2.5 micrometers (PM₂.₅) is among the most critical atmospheric pollutants, with significant impacts on environmental systems and human health. Aerosol Optical Depth (AOD) has been widely used in remote sensing studies to estimate near-surface PM₂.₅ concentrations. However, columnar AOD values, typically retrieved from passive sensors such as MODIS, cannot separate the contributions of individual vertical layers or resolve the vertical distribution of PM₂.₅. The CALIOP lidar does provide vertical profiles, but it suffers from spatial and temporal limitations, providing profiles only every 16 days at ~5 km intervals. To overcome this, recent efforts have leveraged passive satellite sensors with higher temporal resolution, such as MODIS aboard the Terra (launched 1999) and Aqua (launched 2002) satellites, which offer near-daily global coverage at 1 km × 1 km resolution through the MAIAC product. In this study, several AOD–PM₂.₅ models, including eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Convolutional Neural Networks (CNN), were compared using vertically stratified AOD values derived with the aid of CALIOP lidar observations. The results indicate that the XGBoost model outperformed the others in terms of both accuracy and feature interpretability.
In this work, AOD was estimated for four vertical layers (0–1.5, 1.5–3, 3–5, and 5–10 km) using MODIS reflectance bands (470 nm and 550 nm), the MAIAC AOD product, and previously developed models. A comparison of seasonal, semi-annual, and annual models showed that the annual model was the most stable and the least sensitive to seasonal meteorological fluctuations. The annual model was therefore selected for retrieving multilayer AOD.
The extracted AOD values were collocated with PM₂.₅ measurements from ground monitoring stations, and a comprehensive dataset was created that combines temporal variables, geographic coordinates, meteorological parameters, and MODIS spectral reflectance. Classical and deep learning models (XGBoost, RF, Decision Tree, CNN, U-Net, ResNet, and LSTM) were then trained under several input scenarios to predict PM₂.₅ concentrations. First, each AOD layer (AOD₁.₅, AOD₃, AOD₅, AOD₁₀) was combined separately with spatiotemporal and meteorological features to evaluate its individual contribution to PM₂.₅ estimation. Second, cumulative AOD inputs were constructed for vertical extents of 0–1.5, 0–3, 0–5, and 0–10 km to assess the influence of lower-atmosphere aerosol loading on surface PM₂.₅. Third, a multilayer input configuration containing all four AOD layers was tested to determine whether full vertical information improves model performance. In addition, a total-layer input (0–10 km), representing the aggregated AOD from all layers, was assessed as a compact yet informative predictor. Finally, the proposed method was benchmarked against models trained on the columnar MAIAC AOD to evaluate the relative effectiveness of vertically resolved versus traditional columnar approaches. The results show that XGBoost consistently achieved low RMSE and MAE values and high R² scores across all AOD layers, indicating strong predictive power for PM₂.₅ concentrations. It also showed minimal overfitting, as reflected by stable performance across all vertical layers (AOD₁.₅ to AOD₁₀). These findings highlight the benefits of integrating vertically resolved AOD information with meteorological and spatiotemporal data to enhance PM₂.₅ estimation accuracy.
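The input scenarios described above can be illustrated with a minimal sketch. It is not the study's implementation: the layer names, the `build_scenarios` helper, and the array layout are hypothetical, and the cumulative inputs are assumed to be running sums of the layer AODs (AOD is additive over non-overlapping vertical layers).

```python
import numpy as np

# Hypothetical names for the four layers (0-1.5, 1.5-3, 3-5, 5-10 km).
LAYERS = ["aod_0_1p5", "aod_1p5_3", "aod_3_5", "aod_5_10"]
# Upper bounds used to label the cumulative extents (0-1.5, 0-3, 0-5, 0-10 km).
TOPS = ["1p5", "3", "5", "10"]


def build_scenarios(layer_aod, extra):
    """Assemble feature matrices for each input scenario.

    layer_aod: array of shape (n_samples, 4), one column per vertical layer.
    extra:     array of shape (n_samples, k) with spatiotemporal and
               meteorological features shared by every scenario.
    """
    layer_aod = np.asarray(layer_aod, dtype=float)
    extra = np.asarray(extra, dtype=float)

    # Assumption: cumulative AOD up to a given height is the sum of the
    # layer AODs below that height.
    cumulative = np.cumsum(layer_aod, axis=1)

    scenarios = {}
    # Scenario 1: one layer at a time plus the shared features.
    for i, name in enumerate(LAYERS):
        scenarios[f"single_{name}"] = np.column_stack([layer_aod[:, i], extra])
    # Scenario 2: cumulative AOD from the surface up to each layer top.
    # The 0-10 km entry is also the total-layer input.
    for i, top in enumerate(TOPS):
        scenarios[f"cumulative_0_{top}"] = np.column_stack([cumulative[:, i], extra])
    # Scenario 3: all four layers together plus the shared features.
    scenarios["multilayer"] = np.column_stack([layer_aod, extra])
    return scenarios
```

Each resulting matrix could then be passed to the same regressor, so that differences in RMSE, MAE, and R² reflect only the choice of AOD input.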