Machine translation (MT) is a subfield of computational linguistics that focuses on automatically translating one natural language into another without human involvement. Because people communicate in many different languages, there is a great need to translate information between languages in order to convey and exchange ideas. However, existing neural machine translation (NMT) methods disregard the significance of the semantic information encoded in text features. In this paper, multimodal neural machine translation (MNMT) is proposed for Sanskrit-Hindi translation. The main goal of the proposed method is to fully utilize semantic text features in the NMT architecture and to reduce training and testing time. The MNMT method is validated on two different NMT architectures: a recurrent neural network (RNN) and a self-attention network (SAN). Its efficacy is demonstrated on the Sanskrit-Hindi Corpora dataset. Extensive experiments show that the proposed method improves over the baselines on both architectures. Three existing methods, namely an English-to-Indian MT system, a Sanskrit-Hindi MT system, and a hybrid MT system, are used as points of comparison to evaluate the MNMT method. Compared with these existing methods, RA-RNN achieves superior BLEU and METEOR scores of 80.5% and 75.3%, respectively, while RA-SAN achieves superior BLEU and METEOR scores of 78.2% and 77.1%, respectively.