Terahertz security image analysis is crucial for non-contact, continuous, and covert human security screening and for the detection of dangerous items. However, current supervised detection algorithms require manual pixel-level annotation of the regions that contain hazardous items, a task that is repetitive and time-consuming. To address this issue, we propose an enhanced cascading pyramid weakly supervised learning framework (CPA) for terahertz images. Using only image-level labels, the framework segments each terahertz image into grids of various sizes based on multi-instance learning (MIL). Each grid cell is treated as an individual instance, and pseudo-labels are generated for it. By restricting computation to high-probability grid cells, we speed up both training and inference. Experimental results demonstrate that our approach outperforms other weakly supervised learning frameworks that use a comparable level of supervision. In summary, the proposed method offers a viable and effective way to learn from coarse-grained image-level labels while reducing annotation costs, and it provides an efficient alternative to manual pixel-level annotation in terahertz security image analysis.
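
The coarse-to-fine grid idea described above can be illustrated with a minimal sketch. This is not the CPA implementation: the function names (`grid_cells`, `score_cell`, `cascade_pseudo_labels`), the grid sizes, the `keep` and `threshold` parameters, and the mean-intensity scorer are all assumptions made for illustration, with the scorer standing in for a trained instance classifier. The sketch only shows how an image-level label and a pyramid of grids might yield cell-level pseudo-labels while finer cells are evaluated only inside the highest-scoring coarser cells.

```python
import numpy as np

def grid_cells(h, w, n):
    """Return (row_slice, col_slice) pairs for an n x n grid over an h x w image."""
    rows = np.linspace(0, h, n + 1).astype(int)
    cols = np.linspace(0, w, n + 1).astype(int)
    return [(slice(rows[i], rows[i + 1]), slice(cols[j], cols[j + 1]))
            for i in range(n) for j in range(n)]

def score_cell(patch):
    """Hypothetical instance scorer (assumption): mean intensity of the cell.
    In practice this would be the output of a trained network."""
    return float(patch.mean())

def cascade_pseudo_labels(image, image_label, grid_sizes=(2, 4), keep=1, threshold=0.5):
    """Assign cell-level pseudo-labels from an image-level label, coarse to fine."""
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    if image_label == 0:
        # MIL assumption: a negative bag contains only negative instances.
        return mask
    active = np.ones((h, w), dtype=bool)  # region still under consideration
    for level, n in enumerate(grid_sizes):
        scored = []
        for rs, cs in grid_cells(h, w, n):
            if not active[rs, cs].any():
                continue  # skip cells outside the high-probability region
            scored.append((score_cell(image[rs, cs]), rs, cs))
        scored.sort(key=lambda item: item[0], reverse=True)
        if level == len(grid_sizes) - 1:
            # Finest level: cells above the threshold become positive pseudo-labels.
            for score, rs, cs in scored:
                if score >= threshold:
                    mask[rs, cs] = True
        else:
            # Coarser level: keep only the top-scoring cells for the next level.
            active = np.zeros((h, w), dtype=bool)
            for _, rs, cs in scored[:keep]:
                active[rs, cs] = True
    return mask

# Toy example: an 8 x 8 image with a bright 2 x 2 object in the top-left corner.
toy = np.zeros((8, 8))
toy[0:2, 0:2] = 1.0
pseudo = cascade_pseudo_labels(toy, image_label=1)
```

In this toy case only 4 of the 16 fine cells are scored, because the other 12 lie outside the single coarse cell that was kept; that pruning is the source of the speed-up the abstract refers to.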