Packet-loss resilience and security are two major challenges in real-time audio transmission over IP networks. Because compressive sensing (CS) can recover a signal from a small set of samples and introduces randomness into the acquisition process, it is a promising tool for addressing both problems. In this paper, we propose a secure and packet-loss-resilient real-time audio transmission framework (CS-SPT) based on the principle of CS. Inspired by the interleaving technique, the proposed CS-SPT adopts an ultralow-complexity scrambling matrix to improve packet-loss resilience by increasing information redundancy. Moreover, the energy of the ciphertext is homogenized using a diffusion operation. Experimental results show that, compared with existing methods, the proposed CS-SPT not only significantly improves packet-loss resilience but also resists several major attacks, such as ciphertext-only attacks (COAs) and known-plaintext attacks (KPAs).
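
To illustrate the interleaving idea that motivates the scrambling step, the following is a minimal sketch and not the CS-SPT scrambling matrix itself. It assumes a generic key-seeded permutation applied to a vector of measurements before packetization, so that the loss of one packet removes scattered measurements instead of a contiguous block. All names (`keyed_permutation`, `scramble`, `descramble`, `packetize`, `longest_gap`), the key value, the vector length, and the packet size are hypothetical choices made for illustration only.

```python
import random


def keyed_permutation(n, key):
    """Return a pseudo-random permutation of range(n) seeded by `key` (assumed scheme)."""
    perm = list(range(n))
    random.Random(key).shuffle(perm)
    return perm


def scramble(values, perm):
    """Output position i takes the value found at input position perm[i]."""
    return [values[p] for p in perm]


def descramble(values, perm):
    """Invert `scramble`: put each received value back at its original position."""
    out = [None] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out


def packetize(values, packet_size):
    """Split a list into consecutive packets of `packet_size` entries."""
    return [values[i:i + packet_size] for i in range(0, len(values), packet_size)]


def longest_gap(values):
    """Length of the longest run of consecutive missing (None) entries."""
    longest = run = 0
    for v in values:
        run = run + 1 if v is None else 0
        longest = max(longest, run)
    return longest


# Stand-in for a frame of CS measurements (illustrative values only).
measurements = list(range(64))
perm = keyed_permutation(len(measurements), key=2024)

# Sender: scramble, then split into packets of 8 measurements.
packets = packetize(scramble(measurements, perm), packet_size=8)

# Channel: one packet is lost in transit.
packets[3] = [None] * 8

# Receiver: flatten the packets and undo the scrambling with the shared key.
received = [v for pkt in packets for v in pkt]
recovered = descramble(received, perm)

# Baseline without scrambling: the same loss wipes out 8 adjacent measurements.
baseline = packetize(measurements, packet_size=8)
baseline[3] = [None] * 8
baseline_flat = [v for pkt in baseline for v in pkt]
```

In this sketch, `longest_gap(baseline_flat)` equals the packet size, whereas the gaps in `recovered` are dispersed across the frame. A dispersed loss pattern is generally easier for a reconstruction algorithm to tolerate than a burst, which is the intuition behind interleaving. The redundancy and diffusion mechanisms of CS-SPT are not modeled here.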