An interpretable significant wave height forecasting model using a causal AI framework with error correction
Abstract
Accurate forecasting of significant wave height (SWH) is crucial for early warning of marine hazards. However, existing artificial intelligence (AI) methods often fail to integrate domain-specific prior knowledge, which limits prediction accuracy and reliability. To overcome this limitation, this study proposes a causal sequence-to-sequence (C-Seq2Seq) model for SWH forecasting, combined with an Extreme Gradient Boosting (XGBoost) model that corrects its forecasting errors. The C-Seq2Seq architecture incorporates causal knowledge, derived from Peter and Clark Momentary Conditional Independence (PCMCI) causal inference, through a causality structure and causal weighting units, enabling it to capture temporal dependencies and causal relationships simultaneously. Furthermore, Bayesian Optimization (BO) and 3-fold Randomized Search Cross-Validation (3-fold RSCV) were applied to optimize the hyperparameters of the forecasting and error correction models, respectively. Compared with the Long Short-Term Memory (LSTM) baseline, the integrated C-Seq2Seq and XGBoost model achieved statistically significant reductions in Root Mean Square Error (RMSE) across 1- to 24-h lead times: 1.08% to 18.68% at Matsu Buoy (ID: C6W08) and 1.34% to 4.85% at Hualien Buoy (ID: 46699A). These findings indicate that incorporating causal and error correction mechanisms into time-series forecasting frameworks substantially strengthens predictive capability.
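
As a reading aid for the error correction step and the reported RMSE percentages, the following minimal Python sketch shows one plausible interpretation. It assumes that the corrector predicts the residual (observed minus forecast), which is then added to the base forecast, and that the improvement is the relative RMSE reduction against the baseline; neither definition is stated in the abstract. All function names and numerical values are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared) / len(y_true))


def rmse_improvement_pct(rmse_baseline, rmse_model):
    """Relative RMSE reduction versus a baseline, in percent (assumed definition)."""
    return (rmse_baseline - rmse_model) / rmse_baseline * 100.0


def apply_error_correction(base_forecast, predicted_error):
    """Add the corrector's predicted residual (observed minus forecast) to the forecast."""
    return [f + e for f, e in zip(base_forecast, predicted_error)]


# Hypothetical toy values in metres; these are NOT data from the study.
observed = [1.0, 1.5, 2.0, 2.5]
baseline = [1.2, 1.2, 2.4, 2.1]            # stand-in for an LSTM baseline forecast
base_forecast = [1.1, 1.3, 2.2, 2.3]       # stand-in for a C-Seq2Seq forecast
predicted_error = [-0.1, 0.1, -0.1, 0.1]   # stand-in for an XGBoost corrector output

corrected = apply_error_correction(base_forecast, predicted_error)
improvement = rmse_improvement_pct(rmse(observed, baseline), rmse(observed, corrected))
print(f"RMSE improvement over baseline: {improvement:.2f} %")
```

In practice the predicted residuals would come from a trained regressor rather than a fixed list, and the improvement would be computed separately for each lead time and buoy.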