EN
In the present article, an attempt is made to derive optimal data-driven machine learning methods for forecasting an average daily and monthly rainfall of the Fukuoka city in Japan. This comparative study is conducted concentrating on three aspects: modelling inputs, modelling methods and pre-processing techniques. A comparison between linear correlation analysis and average mutual information is made to find an optimal input technique. For the modelling of the rainfall, a novel hybrid multi-model method is proposed and compared with its constituent models. The models include the artificial neural network, multivariate adaptive regression splines, the k-nearest neighbour, and radial basis support vector regression. Each of these methods is applied to model the daily and monthly rainfall, coupled with a pre-processing technique including moving average and principal component analysis. In the first stage of the hybrid method, sub-models from each of the above methods are constructed with different parameter settings. In the second stage, the sub-models are ranked with a variable selection technique and the higher ranked models are selected based on the leave-one-out cross-validation error. The forecasting of the hybrid model is performed by the weighted combination of the finally selected models.