With advances in technology and the turn toward multimodality in linguistics, software tools are increasingly applied to linguistic research. It is now recognized that metaphor is not confined to language but extends across various modes. Drawing on a multimedia corpus built with the annotation software ELAN, which allows a more detailed interpretation, this article explores multimodal metaphor in Chinese ecological educational advertisements. It addresses three questions: which metaphors are prominent in this type of advertisement, how modes collaborate in multimodal metaphor, and how meaning is created across modes. Four kinds of metaphor are found to be prominent: ontological metaphor, color metaphor, money metaphor, and BIG IS SIGNIFICANT. Various modes collaborate simultaneously to construct meaning, and the analysis suggests that one mode takes priority over the others: the more information a mode carries, the more significant its role in the collaboration. Because visualization is the most efficient way for TV educational advertisements to convey meaning, the visual mode usually takes priority, assisted by the other modes. Meaning is created and conveyed to audiences through two mechanisms: simultaneous cueing, in which the target and source domains are saliently represented at the same time, and perceptual resemblance, in which a resemblance experienced through visual or auditory perception triggers similarity between the source and target domains.