Software Defect Prediction (SDP) plays a vital role in the software development life cycle because it helps identify and fix software defects. However, predicting defects from data with irrelevant features and overlapping classes is challenging and can lead to long training times and low model accuracy. To address these challenges, this research introduces a novel Depth Linear Discrimination-Oriented Feature Selection Method based on the Adaptive Sine Cosine Algorithm, named Depth Adaptive Sine Cosine Feature Selection (DASC-FS). DASC-FS uses the Adaptive Sine Cosine Algorithm (ASCA) as a search algorithm to determine the relevant features and adopts Depth Linear Discriminant Analysis (D-LDA) to identify the discriminative features that maximize class separation. ASCA, proposed in this paper, is a metaheuristic algorithm designed to enhance the search capabilities of the standard Sine Cosine Algorithm (SCA). By combining the simplicity of SCA with multiple mutation operators inspired by Genetic Algorithms (GA), ASCA increases the diversity of the solutions and adapts to a variety of situations. Furthermore, this study introduces D-LDA, a novel linear discriminant method that enhances the robustness of the original Linear Discriminant Analysis (LDA). D-LDA integrates the concept of matrix depth into LDA, offering a systematic approach to the challenges of scatter matrix estimation. Because matrix depth measures how central, or deep, a particular matrix is within a distribution with respect to different directions, it is an efficient tool for computing a robust scatter matrix estimator that can handle outliers and complex data structures. The experimental results show that, by integrating ASCA and D-LDA and thereby considering both accuracy optimization and class separation, DASC-FS consistently achieves higher accuracy than most existing methods.
The results also show that the multiple mutation operators in ASCA improve its search capability, and that D-LDA's capacity to reduce data dimensionality and increase class separation yields highly competitive results compared with other LDA variants. Finally, features related to code size and complexity emerge as key factors for SDP because they consistently rank as important across different classifiers and datasets. Through enhanced search capabilities, robust scatter matrix estimation, and dimensionality reduction, DASC-FS offers a valuable solution for improving predictive accuracy and for understanding the factors that contribute to software defects.
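
For orientation, the standard SCA position update that ASCA builds on can be sketched as follows. This is a minimal illustration of the commonly published form of SCA only; it does not include the mutation operators of ASCA or the D-LDA criterion, which are not specified here. The function names, the default `a=2.0`, and the threshold-based decoding of a position into a feature mask are assumptions made for illustration, not the method used in DASC-FS.

```python
import math
import random


def sca_step(positions, best, t, max_iter, a=2.0, rng=random):
    """Apply one iteration of the standard SCA update to every candidate.

    positions: list of candidate solutions (each a list of floats)
    best:      best solution found so far (the destination point)
    t:         current iteration index, 0 <= t <= max_iter
    """
    # r1 shrinks linearly from a to 0, shifting from exploration to exploitation.
    r1 = a - t * (a / max_iter)
    new_positions = []
    for x in positions:
        updated = []
        for xj, pj in zip(x, best):
            r2 = rng.uniform(0.0, 2.0 * math.pi)  # phase of the sine/cosine move
            r3 = rng.uniform(0.0, 2.0)            # random weight on the destination
            r4 = rng.random()                     # switches between sine and cosine
            wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
            updated.append(xj + r1 * wave * abs(r3 * pj - xj))
        new_positions.append(updated)
    return new_positions


def to_feature_mask(position, threshold=0.5):
    """Hypothetical decoding: keep feature j when its coordinate exceeds threshold."""
    return [xj > threshold for xj in position]
```

In a feature selection setting, each candidate would be decoded with something like `to_feature_mask` and scored by a fitness function; in DASC-FS that score would reflect both accuracy and class separation, but its exact form is not given here.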