Noise data in text is one of the main factors affecting the quality of text categorization. This paper proposes a parallel noise data elimination algorithm, based on principal component analysis (PCA) and term frequency-inverse document frequency (TF-IDF), to address the noise data problem in massive text categorization. Five types of noise data that may occur during the text categorization process are analyzed and summarized. Before text categorization, a redundant noise elimination algorithm based on key feature selection is applied to redundant noise features. During text categorization, an error noise detection algorithm is applied to inaccurate noise features. The proposed method is compared with four other typical noise processing methods at different noise ratios on two common corpora. The results show that the proposed method is feasible and maintains more stable and better classification performance with lower running time.
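To make the idea of key feature selection concrete, the following is a minimal single-machine sketch of how TF-IDF weighting and PCA could be combined to rank terms. It is an illustration only and should not be read as the algorithm proposed in the paper, whose parallel design and exact selection criterion are not given in this abstract. The function names (`tfidf_matrix`, `first_component`, `select_key_terms`), the whitespace tokenizer, the use of only the first principal component, and the ranking by absolute loading are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, matrix); matrix[i][j] is the TF-IDF of term j in doc i."""
    tokenized = [d.lower().split() for d in docs]  # assumed: whitespace tokens
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        row = [
            (counts[t] / len(doc)) * math.log(n_docs / df[t]) if doc else 0.0
            for t in vocab
        ]
        matrix.append(row)
    return vocab, matrix


def first_component(matrix, iters=200):
    """Approximate the first principal component by power iteration."""
    n, m = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in matrix]
    # Deterministic, non-uniform start vector (an arbitrary choice).
    v = [float(j + 1) for j in range(m)]
    scale = math.sqrt(sum(x * x for x in v))
    v = [x / scale for x in v]
    for _ in range(iters):
        # Apply X^T X to v without forming the covariance matrix explicitly.
        proj = [sum(r[j] * v[j] for j in range(m)) for r in centered]
        w = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def select_key_terms(docs, k):
    """Keep the k terms with the largest absolute loading on the first component."""
    vocab, matrix = tfidf_matrix(docs)
    loadings = first_component(matrix)
    ranked = sorted(zip(vocab, loadings), key=lambda p: -abs(p[1]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    sample = [
        "the market stock price rises",
        "the stock market falls",
        "the team wins the match",
        "the match team loses",
    ]
    print(select_key_terms(sample, 3))
```

In this sketch, a term that appears in every document receives a TF-IDF weight of zero, so its loading is zero and it is ranked last. That mirrors, in a simplified way, how uninformative features might be treated as redundant noise before classification.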