Non-volatile memory (NVM) is expected to serve as the second tier of memory in two-tier memory systems. However, because NVM has limited write endurance, it is vital to reduce the number of writes to it. Large-scale nested loops are a performance bottleneck in programs because their data cannot be held in the first tier of memory, which causes many write operations to NVM. Loop tiling groups iterations into tiles, and loop interchange changes the execution order of the loops; both improve data locality and thus reduce communication with NVM. However, little research combines loop interchange and loop tiling to minimize writes to NVM. In this paper, we propose a new loop tiling scheme and combine it with loop interchange to address this gap. Specifically, we first propose a strategy to generate a legal tile shape, which is a parallelogram. Then, we propose an optimal tile size selection technique to minimize the write operations to NVM. In addition, we adopt loop interchange to help loop tiling generate an optimal tile size for multi-dimensional loops. Finally, we schedule the memory accesses and computations in a pipelined fashion to hide the NVM latency. Experiments show that the proposed scheme effectively reduces writes to NVM. In addition, for 2-dimensional loops, NVM latency can be completely hidden.
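To make the effect of tiling on NVM writes concrete, the following is a minimal sketch and not the scheme proposed in the paper. It assumes a hypothetical accumulation loop `A[j] += B[i][j]`, a first tier modeled as a small write-back LRU buffer that holds only elements of `A`, and simple rectangular tiles rather than parallelogram tiles. All function names and parameters are illustrative assumptions.

```python
from collections import OrderedDict


def count_nvm_writes(access_order, capacity):
    """Count write-backs to the second tier (NVM) for a stream of writes.

    Assumption: the first tier is a write-back LRU buffer with `capacity`
    slots, and every access in `access_order` writes element A[j].
    """
    cache = OrderedDict()  # key: index j, value: dirty flag, order: recency
    writes = 0
    for j in access_order:
        if j in cache:
            cache.move_to_end(j)  # hit: mark as most recently used
        elif len(cache) >= capacity:
            _, dirty = cache.popitem(last=False)  # evict least recently used
            writes += int(dirty)  # a dirty element is written back to NVM
        cache[j] = True  # the access writes A[j], so it is dirty
    # Flush whatever is still dirty at the end of the loop nest.
    writes += sum(1 for dirty in cache.values() if dirty)
    return writes


def untiled(n, m):
    """Original order: for i in range(n): for j in range(m): A[j] += ..."""
    for _ in range(n):
        for j in range(m):
            yield j


def tiled(n, m, tile):
    """Tile the j loop, then run the tile loop outermost (interchange)."""
    for jj in range(0, m, tile):
        for _ in range(n):
            for j in range(jj, min(jj + tile, m)):
                yield j


if __name__ == "__main__":
    n, m, capacity = 8, 64, 16
    print("untiled:", count_nvm_writes(untiled(n, m), capacity))
    print("tiled:  ", count_nvm_writes(tiled(n, m, capacity), capacity))
```

Under these assumptions, the untiled order evicts every element of `A` before it is reused, so each of the `n * m` updates eventually reaches NVM. With a tile that fits in the first tier, each element is written back only once. The per-`j` order of updates is unchanged, so the reordering is legal for this particular loop; loops with cross-iteration dependences need a legality check such as the one the paper addresses with parallelogram tiles.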