Eutrophication in lakes and reservoirs can lead to blue-green algae (BGA) blooms, which are detrimental to aquatic ecosystems and often pose a potentially serious health threat to people who have access to the impaired waters. Since the 1990s, Lake Dianchi has experienced BGA blooms almost annually, raising ever-increasing concern among the public and governments. To control eutrophication and reduce the outbreak frequency of BGA blooms in Lake Dianchi, it is necessary to identify the response relationship between BGA blooms and their influencing factors, including nutrient loadings and weather conditions, among others. In practice, however, this task is very difficult: because of the transient and heterogeneous nature of the Lake Dianchi system, no complete dataset exists, or can be obtained, that simultaneously covers the occurrence of BGA blooms and the influencing factors. Although remote sensing offers a cost-effective way to acquire data on the occurrence of BGA blooms, it is still far from sufficient to provide a complete dataset, because of data gaps caused by a lack of observations or by cloud interference.
To overcome this data limitation in deriving a reliable relationship between BGA blooms and the main influencing factors, this study applied a robust EMB algorithm to reconstruct a complete dataset from the available dataset with missing values, thereby forming a basis for quantitatively relating BGA blooms to their contributing factors. The analysis starts by identifying the general trend of BGA blooms in Lake Dianchi using boxplots of chlorophyll-a data from the lake, which show that the area represented by the Huiwanzhong monitoring station is the most likely location for BGA blooms. As for the temporal trend, the chlorophyll-a concentration varies less between years than within a year. In general, chlorophyll-a concentrations increase from February to August and then decrease until the following February. Subsequently, a multiple-imputation-based EMB algorithm was employed to reconstruct the complete basic datasets for April through October of 2004 through 2008 for the Waihai of Lake Dianchi, based on the available incomplete datasets of weather conditions and BGA blooms. With the complete datasets, the relationship between chlorophyll-a and the outbreak probability of BGA blooms was established using conditional probability curves for two time ranges: April to October and June to September. The analysis suggests that there is a threshold chlorophyll-a concentration that determines whether a BGA bloom will occur. However, the same method was found to be invalid for analyzing the relationship between outbreak frequency and nutrient concentrations. Scatter plots of TN vs. TP, labeled with BGA bloom occurrence, indicate that (1) there may be no threshold for TN or TP that controls BGA blooms; (2) reducing the TN concentration appears to be more efficient than reducing the TP concentration; and (3) TN and TP concentrations need to be kept within a reasonable and realistic range, whereas pursuing a higher water quality standard in the short term may not be necessary. The results of this study suggest that it is critical to reduce TN and TP to below water quality standard Level V by all means, while at the same time taking supplemental measures, such as water volume control, adjustment of water quality and hydrodynamic conditions, and aquatic ecosystem restoration, to direct the lake ecosystem to evolve from a phytoplankton-dominated regime to a macrophyte-dominated regime.
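As an illustration of the conditional probability curves mentioned above, the following is a minimal sketch and not the procedure used in this study. It assumes the curve is defined as the fraction of observations with a bloom among those whose chlorophyll-a concentration is at or above a candidate threshold; the function name `conditional_probability_curve`, the ">=" convention, and all data values are hypothetical and chosen only for demonstration.

```python
from typing import List, Optional, Tuple


def conditional_probability_curve(
    chla: List[float],
    bloom: List[int],
    thresholds: List[float],
) -> List[Tuple[float, Optional[float]]]:
    """Estimate P(bloom | chlorophyll-a >= c) for each candidate threshold c.

    chla       : chlorophyll-a concentrations, one per observation
    bloom      : 1 if a BGA bloom was recorded for that observation, else 0
    thresholds : candidate chlorophyll-a thresholds to evaluate
    Returns (threshold, probability) pairs; the probability is None when no
    observation reaches the threshold, since it is undefined in that case.
    """
    if len(chla) != len(bloom):
        raise ValueError("chla and bloom must have the same length")

    curve = []
    for c in thresholds:
        # Keep only the bloom labels of observations at or above the threshold.
        selected = [b for x, b in zip(chla, bloom) if x >= c]
        prob = sum(selected) / len(selected) if selected else None
        curve.append((c, prob))
    return curve


# Made-up illustrative values, NOT data from Lake Dianchi.
example_chla = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
example_bloom = [0, 0, 0, 1, 1, 1]
example_curve = conditional_probability_curve(
    example_chla, example_bloom, thresholds=[10.0, 40.0, 70.0]
)
for threshold, prob in example_curve:
    print(threshold, prob)
```

A sharp rise in such a curve over a narrow range of thresholds is the kind of pattern that would be read as a threshold concentration; a curve that rises gradually or not at all, as the abstract reports for nutrient concentrations, would not support one.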