STT-MRAM is a promising candidate for future on-chip cache memory because of its high density, low standby power, and nonvolatility. As the technology node scales, especially below 40 nm, STT-MRAM cell design becomes a key issue in achieving low power consumption, high access performance, and desirable reliability. The conventional 1T-1MTJ (one transistor, one magnetic tunnel junction) and 2T-2MTJ cell designs cannot address these challenges efficiently. In this paper, we propose a novel 3T-3MTJ cell structure that uses advanced perpendicular MTJ (p-MTJ) technology and stores 2 bits in three MTJs. Differential sensing can be used to read out the most significant bit as fast as in the 2T-2MTJ design, while the latency of sensing both bits of a cell is almost the same as the sensing latency of the 1T-1MTJ cell design. The 3T-3MTJ cell therefore combines the advantages of the 2T-2MTJ and 1T-1MTJ cells. Circuit-level simulations show that the proposed 3T-3MTJ cell structure achieves a desirable tradeoff among storage density, access performance, and energy consumption compared with the prior 1T-1MTJ and 2T-2MTJ cell structures. Additionally, we propose a novel adaptive cache design based on the 3T-3MTJ cell structure, which can work in different modes to satisfy the varying memory access demands of different applications. Architecture-level simulations validate the effectiveness of the proposed cache design.