Asset pricing via machine learning offers a promising way to capture price trends: it fuses heterogeneous market factors and analyzes their joint impact on stock movements, rather than relying on the statistical and econometric models of finance to explore the causality between a market indicator and stock returns. However, this fusion also obscures the internal mechanism of stock movements. In this study, a deep learning framework with visual clues is presented to disentangle these factors and reveal their effects on stock movements. In particular, a context-aware hierarchical attention mechanism (CHARM) is first proposed to encode unstructured textual media information and trace the literal power of news on media-aware stock movements. The encoded media and other structured market factors are then fused via tensor-based learning to infer and visualize their interactions in stock fluctuations. Finally, a pre-estimating method that locates turning points as trading clues is used to improve the efficiency of each investment opportunity. Experiments conducted in real securities markets demonstrate that the proposed framework not only improves the readability of the investing strategies, but also enhances predictive accuracy and investing returns compared with state-of-the-art models, including AZFinText, TeSIA, eLSTM, CMT, MAC and SA-DLSTM.
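
As a rough illustration of the hierarchical attention idea behind CHARM, the following minimal sketch pools word vectors into sentence vectors, and sentence vectors into an article vector, while keeping the attention weights that a visualization could display. It is not the implementation used in this study: the dot-product scoring, the fixed query vectors `word_query` and `sentence_query` (standing in for the context), the function names, and the random toy inputs are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(vectors, query):
    """Attention pooling: score each vector against `query` (assumed
    dot-product scoring), normalize with softmax, and return the
    weighted sum together with the weights."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def encode_article(sentences, word_query, sentence_query):
    """Two-level (word, then sentence) attention over one article.

    `sentences` is a list of sentences, each a list of word vectors.
    Both query vectors are hypothetical stand-ins for learned context.
    """
    sentence_vectors, word_weights = [], []
    for words in sentences:
        vec, weights = attend(words, word_query)
        sentence_vectors.append(vec)
        word_weights.append(weights)
    article_vector, sentence_weights = attend(sentence_vectors, sentence_query)
    return article_vector, word_weights, sentence_weights


# Toy input: one article with a 3-word and a 2-word sentence, 4-dim embeddings.
random.seed(0)
DIM = 4
article = [
    [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(3)],
    [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(2)],
]
word_query = [random.gauss(0, 1) for _ in range(DIM)]
sentence_query = [random.gauss(0, 1) for _ in range(DIM)]

article_vector, word_weights, sentence_weights = encode_article(
    article, word_query, sentence_query
)
```

The returned `word_weights` and `sentence_weights` each sum to one per level, so they can be read as the relative influence of each word or sentence on the encoded article. In a trained model the queries and embeddings would be learned rather than sampled at random.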