Background
China experienced an unprecedented outbreak of dengue in 2014, when the number of dengue cases reached its highest level in 25 years. Official case count data are released with a significant delay, and our ability to track the timing and magnitude of local dengue outbreaks in a timely manner remains limited.

Material and methods
We developed an ensemble penalized regression algorithm (EPRA) for initializing near-real-time forecasts of the dengue epidemic trajectory. The EPRA integrates different penalties (LASSO, Ridge, Elastic Net, SCAD and MCP) with iterative sampling and model averaging. Multiple streams of near-real-time data were used, including dengue-related Baidu searches, Sina Weibo posts, and climatic conditions, together with historical dengue incidence. We compared the predictive power of the EPRA with that of the alternatives, penalized regression models using a single penalty, in retrospectively forecasting weekly dengue incidence and detecting outbreak occurrence, defined using different cutoffs, during 2011–2016 in Guangzhou, south China.

Results
The EPRA showed the best, or at least comparable, performance for 1- and 2-week-ahead out-of-sample forecasts and for leave-one-out cross-validation forecasts. The findings indicate that skillful near-real-time forecasts of dengue, with confidence in those predictions, can be made. For detecting dengue outbreaks, the EPRA predicted periods of high dengue incidence more accurately than the alternatives.

Conclusion
This study developed a statistically rigorous approach for near-real-time forecasting of dengue in China. The EPRA provides skillful forecasts and can serve as a timely and complementary way to assess dengue dynamics, which will help in designing interventions to mitigate dengue transmission.
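The ensemble idea described in the methods (several penalties, repeated sampling, model averaging) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes bootstrap resampling as the sampling step and equal-weight averaging, covers only the LASSO, Ridge and Elastic Net penalties (SCAD and MCP are omitted), fixes the penalty strength `lam` rather than tuning it, and runs on made-up synthetic data. All function and variable names are hypothetical.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def fit_elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - b0 - X b||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*||b||^2).
    alpha=1 gives LASSO, alpha=0 gives Ridge, values between give Elastic Net."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    norms = [sum(c * c for c in col) / n for col in cols]
    beta = [0.0] * p
    b0 = sum(y) / n
    resid = [yi - b0 for yi in y]
    for _ in range(n_iter):
        for j in range(p):
            col = cols[j]
            # Correlation of feature j with the partial residual
            rho = sum(c * r for c, r in zip(col, resid)) / n + norms[j] * beta[j]
            denom = norms[j] + lam * (1.0 - alpha)
            new = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
        # Re-centre the intercept (it is not penalized)
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta

# Assumption: only three of the five penalties named in the abstract.
PENALTIES = {"lasso": 1.0, "ridge": 0.0, "elastic_net": 0.5}

def ensemble_forecast(X, y, x_new, lam=0.05, n_resamples=30, seed=0):
    """Average predictions over bootstrap resamples and penalties;
    return the mean and an empirical 95% interval of member predictions."""
    rng = random.Random(seed)
    n = len(X)
    preds = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        for alpha in PENALTIES.values():
            b0, beta = fit_elastic_net(Xb, yb, lam, alpha)
            preds.append(b0 + sum(b * v for b, v in zip(beta, x_new)))
    preds.sort()
    mean = sum(preds) / len(preds)
    lo = preds[int(0.025 * (len(preds) - 1))]
    hi = preds[int(0.975 * (len(preds) - 1))]
    return mean, (lo, hi)

# Synthetic toy data (not dengue data): y depends on the first feature only.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
y = [3.0 + 2.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]

forecast, (lower, upper) = ensemble_forecast(X, y, x_new=[1.0, 0.0])
print(forecast, lower, upper)
```

In this sketch the spread of member predictions gives a rough sense of forecast uncertainty. A real application would replace the synthetic features with lagged search, social media, climate and incidence series, and would choose `lam` by cross-validation.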