Memristors have emerged as promising devices for efficient multiply-accumulate (MAC) operations in crossbar arrays, which are central to analog in-memory computing (AiMC). However, variations in memristors and their associated circuits degrade the accuracy of analog computation. This is typically mitigated by on-chip training, which is challenging for memristors with limited endurance. We present a hardware-software codesign for magnetic tunnel junction (MTJ)-based AiMC that achieves software-equivalent accuracy through off-chip calibration, without costly on-chip training. On the hardware side, MTJ devices exhibit ultralow cycle-to-cycle variation, experimentally evaluated across more than 1 million mass-produced devices. On the software side, we leverage this property in an off-chip training method that adjusts deep neural network parameters to enable accurate AiMC inference. We validate the approach on MAC operations, demonstrating improved transfer-curve linearity and reduced computation errors. In emulations of large-scale neural network models, our codesigned MTJ-based AiMC closely matches the software baseline accuracy and outperforms existing off-chip training methods, highlighting the potential of MTJs for AI tasks.
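To make the crossbar MAC operation concrete, the following is a minimal NumPy sketch (not from the paper; the crossbar size, voltage range, and variation level are illustrative assumptions) of the column-current computation I_j = Σ_i V_i G_ij, and of how device-level conductance variation perturbs the result:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target weights mapped to conductances (arbitrary units, assumed range).
G = rng.uniform(0.1, 1.0, size=(8, 4))   # 8 rows x 4 columns crossbar
V = rng.uniform(0.0, 0.5, size=8)        # input voltages applied to the rows

# Ideal analog MAC: Ohm's law + Kirchhoff's current law give the
# column currents I_j = sum_i V_i * G_ij, i.e. a matrix-vector product.
I_ideal = V @ G

# Device variation: each programmed conductance deviates from its target
# by multiplicative noise (sigma = 5% is an illustrative assumption).
sigma = 0.05
G_real = G * (1.0 + sigma * rng.standard_normal(G.shape))
I_real = V @ G_real

# Relative MAC error introduced by the variation.
err = np.abs(I_real - I_ideal) / np.abs(I_ideal)
print("mean relative MAC error:", round(err.mean(), 3))
```

An off-chip calibration scheme like the one described would adjust the network parameters so that inference remains accurate despite this `G_real` vs. `G` mismatch, rather than reprogramming the devices repeatedly on chip.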