To better cope with the significant nonlinear radiation distortions (NRD) and severe rotational distortions in multi-modal remote sensing image matching, this paper introduces a rotationally robust feature-matching method based on the maximum index map (MIM) and a 2D feature matrix, called the rotation-invariant local phase orientation histogram (RI-LPOH). First, features are detected using the weighted moment equation. Then, a 2D feature matrix is constructed from the MIM and a modified gradient location orientation histogram (GLOH), and rotational invariance is achieved by cyclically shifting the matrix along both its columns and rows, without estimating a principal orientation separately. To handle intensity inversion, each part of the sensed image's 2D feature matrix is additionally flipped up and down to obtain a second 2D matrix, and all the 2D matrices are concatenated by rows to form the final 1D feature vector. Finally, the RFM-LC algorithm is used to screen the initial matches, reducing the negative effect of the high proportion of outliers. The remaining outliers are then removed by the fast sample consensus (FSC) method to obtain the optimal transformation parameters. We validate RI-LPOH on six different types of multi-modal image datasets and compare it with four state-of-the-art methods: PSO-SIFT, MS-HLMO, CoFSM, and RI-ALGH. The experimental results show that the proposed method has clear advantages in success rate (SR) and number of correct matches (NCM). Compared with PSO-SIFT, MS-HLMO, CoFSM, and RI-ALGH, the mean SR of RI-LPOH is 170.3%, 279.8%, 81.6%, and 25.4% higher, respectively, and its mean NCM is 13.27, 20.14, 1.39, and 2.42 times those of the four methods, respectively.
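The idea of obtaining rotational invariance by cyclic shifting, rather than by estimating a principal orientation, can be illustrated with a minimal NumPy sketch. This is not the RI-LPOH implementation. It assumes a generic 2D matrix whose rows and columns are both cyclic bins, and it uses a hypothetical rule (shift the column and the row with the largest sum to index 0) to pick the shifts. The function names `canonicalize_cyclic` and `build_descriptor`, the shift rule, and the order of the flip and concatenation steps are all illustrative assumptions.

```python
import numpy as np


def canonicalize_cyclic(matrix: np.ndarray) -> np.ndarray:
    """Cyclically shift a 2D matrix so that its strongest column and
    strongest row both land at index 0.

    Assumption (not from the paper): "strongest" means largest sum.
    Column sums are unchanged by row shifts and row sums are unchanged
    by column shifts, so the result is the same for any cyclically
    shifted copy of the input, provided the maxima are unique.
    """
    col_shift = int(np.argmax(matrix.sum(axis=0)))
    shifted = np.roll(matrix, -col_shift, axis=1)
    row_shift = int(np.argmax(shifted.sum(axis=1)))
    return np.roll(shifted, -row_shift, axis=0)


def build_descriptor(matrix: np.ndarray) -> np.ndarray:
    """Flatten the canonical matrix and its up-down flipped copy,
    row by row, into a single 1D vector (hypothetical layout)."""
    canonical = canonicalize_cyclic(matrix)
    flipped = np.flipud(canonical)
    return np.concatenate([canonical.ravel(), flipped.ravel()])


if __name__ == "__main__":
    # Toy 6 x 8 matrix standing in for a 2D feature matrix.
    m = np.arange(48, dtype=float).reshape(6, 8)
    # Simulate an unknown rotation as cyclic shifts along both axes.
    m_shifted = np.roll(np.roll(m, 3, axis=1), 2, axis=0)
    same = np.array_equal(build_descriptor(m), build_descriptor(m_shifted))
    print("descriptors identical after cyclic shifts:", same)
```

In this toy setting, the two descriptors match because the canonical shift depends only on quantities that are themselves unaffected by cyclic shifts. How well that carries over to real image patches depends on how closely an image rotation corresponds to a cyclic shift of the matrix, which the sketch does not address.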