Research Report
The Development of Artificial Intelligence Base Technologies for Objective Climate Predictions (II)
- Authors
- Dr. Miae Kim, Dr. Kyungwon Park, Dr. Seonkyu Lee, Dr. Yun-Young Lee, Dr. Uran Chung
- Date
- 2023.12.22
Executive Summary
Recently, artificial intelligence (AI)-based climate prediction models with forecast horizons of up to 14 days have been developed by global companies and research centers, including NVIDIA (FourCastNet), Google DeepMind (GraphCast), HUAWEI (Pangu-Weather), and ECMWF (the European Centre for Medium-Range Weather Forecasts, with its Artificial Intelligence/Integrated Forecasting System, AIFS). These models are in experimental operation via the ECMWF website. However, because they focus on forecast horizons of 14 days or less, their extension to sub-seasonal and seasonal timescales is still lacking. In addition, AI research requires a large amount of training data, but the ECMWF ERA-5 reanalysis, the climate data currently used by global companies for training, covers only about 74 years, from 1950 to the present. Once the data are divided into training, validation, and test sets, the portion available for actual learning is very scarce.
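To make the data-scarcity point concrete, the following minimal sketch splits 74 years chronologically and counts what is left for training. The 70/15/15 split ratios and the monthly sampling are illustrative assumptions, not values taken from the report.

```python
def chronological_split(years, train_frac=0.70, val_frac=0.15):
    """Split a list of years in time order (no shuffling) into three blocks.

    The fractions are illustrative assumptions; the remainder goes to the test set.
    """
    n = len(years)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = years[:n_train]
    val = years[n_train:n_train + n_val]
    test = years[n_train + n_val:]
    return train, val, test


years = list(range(1950, 2024))          # 74 years, as stated in the text
train, val, test = chronological_split(years)

# With monthly fields, each year contributes only 12 samples.
monthly_train_samples = len(train) * 12

print(len(train), len(val), len(test), monthly_train_samples)
```

Under these assumptions, only a few hundred monthly samples remain for training, which is small compared with typical deep learning datasets.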
Due to the climate crisis, the importance of highly accurate sub-seasonal prediction data in the applied climate field has grown significantly. At sub-seasonal timescales, it is difficult to predict the diverse interactions between the atmosphere and the oceans using only numerical models based on physical principles. To overcome this challenge, an Attention U-Net model and a seasonal prediction model were developed to give more weight to important patterns and features in the input data, with the aim of improving accuracy. Additionally, various sensitivity analyses were conducted on observational data using filter, wrapper, and embedded feature-selection methods to identify the characteristics inherent in each variable. By selecting specific combinations of variables from both the model and the observational data, it was confirmed that accuracy improved compared with ECMWF model predictions. However, for many combinations of variables, the statistical results did not show consistent improvement. This suggests that future improvements to AI models in the climate field that account for the characteristics of the variables and for recent advances could yield more accurate prediction data.
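Of the three feature-selection families mentioned above, the filter approach is the simplest to illustrate: each candidate variable is scored independently of any model and the top-scoring ones are kept. The sketch below ranks variables by absolute Pearson correlation with the target. The scoring criterion, the variable names (`t2m`, `slp`, `noise`), and the toy values are all assumptions for illustration; the report does not specify which filter statistic was used.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def filter_select(features, target, k):
    """Keep the k features with the largest |correlation| with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data with hypothetical variable names.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "t2m": [2.0, 4.0, 6.0, 8.0, 10.0],     # perfectly correlated with the target
    "slp": [10.0, 8.0, 7.0, 4.0, 1.0],     # strongly anti-correlated
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0],  # uncorrelated
}
selected, scores = filter_select(features, target, k=2)
print(selected)
```

Wrapper and embedded methods differ in that they score variable subsets through a trained model, so they are more expensive but can capture interactions that a per-variable filter misses.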
This research explored methods to enhance the prediction accuracy of Sub-seasonal to Seasonal (S2S) climate variables (such as daily maximum temperature and daily total precipitation) by extending the U-Net architecture with Attention and Residual mechanisms while optimizing hyperparameters. Attention measures the relevance between inputs and subsequent outputs in order to focus on specific information, while Residual learning addresses the vanishing gradient problem. In this study, an Attention U-Net and a Residual U-Net were constructed by adding each of these mechanisms to the existing U-Net, and an Attention-based Residual U-Net was constructed by combining the two. A grid search algorithm was employed to optimize hyperparameters, with epochs, batch size, and learning rate as the key parameters. S2S prediction data were used for training. The optimized hyperparameter combinations showed similar trends across most S2S climate models. In predictions of daily maximum temperature and daily total precipitation during the test period, the extended U-Net models with added Attention or Residual mechanisms showed improved accuracy, and the Attention-based Residual U-Net, which incorporates both, performed best. Specifically, the Residual mechanism appeared to influence temperature prediction, while Attention effectively improved precipitation prediction. However, limitations remained in improving the week 1 prediction of temperature and the week 2-3 prediction of precipitation. Subsequent research is expected to propose suitable methods for S2S temperature and precipitation prediction by using ensemble techniques to combine high-performing models.
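The grid search over epochs, batch size, and learning rate can be sketched as an exhaustive loop over every combination, keeping the one with the lowest validation loss. The candidate values and the `toy_validation_loss` function below are hypothetical stand-ins; in practice the evaluation step would train the U-Net variant with the given settings and return its validation error.

```python
import math
from itertools import product


def grid_search(param_grid, evaluate):
    """Evaluate every combination in param_grid and return the best (lowest score)."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_loss(params):
    """Hypothetical stand-in for 'train the model, return validation loss'.

    It is smallest at epochs=100, batch_size=32, learning_rate=1e-3 by construction.
    """
    return (
        abs(params["epochs"] - 100) / 100
        + abs(params["batch_size"] - 32) / 32
        + abs(math.log10(params["learning_rate"]) + 3)
    )


# Candidate values are illustrative assumptions, not the report's search space.
param_grid = {
    "epochs": [50, 100, 200],
    "batch_size": [16, 32, 64],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}
best_params, best_score = grid_search(param_grid, toy_validation_loss)
print(best_params)
```

The cost grows multiplicatively with each added hyperparameter (here 3 x 3 x 3 = 27 training runs), which is one reason a search like this is usually restricted to a few key parameters.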
Historical climate data are often insufficient to train a deep learning model, especially when working with monthly data. A specific climate phenomenon such as the Madden-Julian Oscillation (MJO) accounts for only part of the long-term climate data collected over decades. Semi-supervised learning (SSL) is well suited to such situations, as it can make effective use of a small amount of labeled data while maintaining accuracy. In this study, we developed an SSL-based deep learning model for MJO phase classification using decades of climate anomaly images. First, a supervised learning approach was applied to all labeled data to classify MJO phases. Various input variables, data split strategies, and model architectures were tested, and an optimal model was chosen based on test accuracy. The selected model was then trained in a semi-supervised learning framework. The sensitivity of the SSL-based model, developed with the Mean Teacher SSL algorithm, was analyzed with respect to data augmentation methods, the model training process, the consistency loss weight, and the number of labeled samples. Comparison with results obtained using most or all of the labeled data (supervised learning) showed that the SSL-based model achieved comparable or even better performance with only about half of the full labeled dataset. In addition to MJO phase classification, we also tested spatiotemporal predictive AI models that forecast MJO input fields for the next day or for the next 7 consecutive days from time series of input fields with various preceding sequence lengths. Different sampling methods, such as excluding inactive MJO cases or summer seasons, were also investigated.
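The two ingredients of the Mean Teacher algorithm referred to above can be sketched in a few lines: the teacher's weights are an exponential moving average (EMA) of the student's weights, and a consistency loss penalizes disagreement between student and teacher predictions on the same (differently augmented) input, with a supervised term added only for labeled samples. The EMA decay, the consistency weight, and the use of eight classes (the conventional number of MJO phases) are assumptions for illustration; the report does not give these settings.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Teacher <- alpha * teacher + (1 - alpha) * student, element by element."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]


def consistency_loss(student_logits, teacher_logits):
    """Mean squared difference between student and teacher class probabilities."""
    p = softmax(student_logits)
    q = softmax(teacher_logits)
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


def cross_entropy(logits, label):
    """Supervised loss for one labeled sample."""
    return -math.log(softmax(logits)[label])


def mean_teacher_loss(student_logits, teacher_logits, label, consistency_weight):
    """Consistency term for every sample; supervised term only if a label exists."""
    loss = consistency_weight * consistency_loss(student_logits, teacher_logits)
    if label is not None:
        loss += cross_entropy(student_logits, label)
    return loss


# Toy logits for an assumed 8-class (MJO phase) problem.
student = [2.0, 0.5, 0.1, 0.0, -0.3, 0.2, 0.0, -1.0]
teacher = [1.8, 0.6, 0.0, 0.1, -0.2, 0.1, 0.0, -0.9]
labeled_loss = mean_teacher_loss(student, teacher, label=0, consistency_weight=1.0)
unlabeled_loss = mean_teacher_loss(student, teacher, label=None, consistency_weight=1.0)
print(labeled_loss, unlabeled_loss)
```

Because unlabeled samples still contribute through the consistency term, the number of labeled samples can be reduced without removing those inputs from training, which is the setting the sensitivity analysis above examines.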
In the climate field, research using deep learning techniques is increasing rapidly. However, research on improving models by understanding how they work remains insufficient. In this study, explainable AI (XAI), loss surface analysis, and analysis and improvement of the model's internal structure were used to better understand deep learning models and to derive improvements to them. Using XAI techniques, we analyzed whether the models built with supervised and semi-supervised learning extracted similar information from the input data and used it to make predictions. We also analyzed the loss surface to determine whether the structure of the model is favorable for finding the global minimum. In addition, guided by an analysis of the internal structure and feature collapse of the deep learning model, we evaluated the predictive skill of models modified by adding attention mechanisms, expanding the dataset, increasing the number of layer filters, and improving the model structure.