Abstract: A training set is built by combining diagnostic predictors calculated from the ECMWF fine-mesh reanalysis (ECthin) fields with short-duration heavy rain cases observed between May and September from 2011 to 2014. Based on the box difference indices of all predictors, a thresholding method is proposed as a preliminary step to reduce false alarms. A new class-balanced training set is then reconstructed using K-means clustering, and predictors with greater average weights are selected by the Relief algorithm. A forecast model for short-duration heavy rainfall in the Chongqing region, built around the XGBoost algorithm, is established. The results suggest three points. (1) The model provides probabilistic forecasts as well as deterministic binary forecasts generated by a customized probability threshold. (2) Verification on the independent 2015 validation set shows that the model achieves better overall classification performance and outperforms the ECthin hourly total precipitation reanalysis when the probability threshold is set to 0.1, with the threat score (TS) reaching 0.3. Two case studies show that the XGBoost-based model predicts the probability and area of potential short-duration heavy rain events with higher precision than the hourly ECthin, with TS between 0.2 and 0.4. (3) The TS of the XGBoost model on cases from recent years exceeds 0.1, outperforming the ECthin and rivalling the daily operational forecast, which indicates that its products are a useful reference.
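The way a probability threshold turns a probabilistic forecast into a binary one, and how the resulting TS is scored, can be illustrated with a minimal sketch. It assumes the standard definition TS = hits / (hits + misses + false alarms). The function name `threat_score` and the sample values are hypothetical and are not taken from the study's data or code.

```python
from typing import Sequence


def threat_score(probs: Sequence[float],
                 observed: Sequence[int],
                 threshold: float = 0.1) -> float:
    """Threat score of a thresholded probabilistic forecast.

    A grid point counts as a "yes" forecast when its probability is at
    least `threshold`. Correct negatives do not enter the score.
    """
    if len(probs) != len(observed):
        raise ValueError("probs and observed must have the same length")

    hits = misses = false_alarms = 0
    for p, o in zip(probs, observed):
        forecast_yes = p >= threshold
        if forecast_yes and o:
            hits += 1            # event forecast and observed
        elif forecast_yes:
            false_alarms += 1    # event forecast but not observed
        elif o:
            misses += 1          # event observed but not forecast

    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


# Made-up illustrative values (not data from the paper).
probs = [0.05, 0.30, 0.12, 0.02, 0.60, 0.08]
observed = [0, 1, 0, 0, 1, 1]

ts_low = threat_score(probs, observed, threshold=0.1)   # 2 hits, 1 miss, 1 false alarm
ts_high = threat_score(probs, observed, threshold=0.5)  # 1 hit, 2 misses, 0 false alarms
print(f"TS at 0.1: {ts_low:.3f}, TS at 0.5: {ts_high:.3f}")
```

A low threshold such as 0.1 trades additional false alarms for fewer misses, which can raise TS when events are rare, as in this toy sample.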