This study uses a data-driven approach to address the complexities of research-focused multi-sleeve Cone Penetration Test (CPT) devices, specifically the multi-friction attachment (MFA) and multi-piezo-friction attachment (MPFA) CPT devices. These advanced devices are time-consuming to assemble and, because of their extensive electronic components, susceptible to losing sensor data streams; they must be optimized before they can move from research instruments to tools suitable for practice. The study aims to optimize the design of multi-sleeve CPT devices using machine learning, with soil type classification performance serving as the primary metric of how effective a device configuration is. The scope is therefore not soil classification by machine learning as such, but the refinement of the multi-sleeve CPT device design. A two-phase data-driven approach is adopted, in which various feature combinations are tested across eight machine learning models. The first phase identifies the most suitable model for the dataset; the second refines the feature set to balance minimizing the number of sensors against soil classification accuracy. The result is a proposed configuration for a multi-sleeve CPT device that simplifies the original design while maintaining robustness, thereby improving cost-efficiency and operational effectiveness in geotechnical practice. This work shows how machine learning can guide the design optimization of geotechnical instruments.
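
The two-phase procedure summarized above can be illustrated with a minimal sketch. It is not the study's implementation. The channel names in `FEATURES`, the synthetic two-class data, the two toy classifiers (standing in for the eight models), the leave-one-out scoring, and the `TOLERANCE` threshold are all assumptions made for illustration. Phase 1 picks the classifier with the highest accuracy on all channels; phase 2 scores every channel subset with that classifier and keeps the smallest subset whose accuracy stays within `TOLERANCE` of the best.

```python
import itertools
import random

# Hypothetical sensor channels; these are NOT the study's feature names.
FEATURES = ["tip", "sleeve_1", "sleeve_2", "sleeve_3"]
TOLERANCE = 0.02  # assumed acceptable loss in accuracy when dropping sensors


def make_toy_data(n_per_class=30, seed=0):
    """Synthetic two-class data (illustrative only, not CPT measurements)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in (("sand", 0.0), ("clay", 2.0)):
        for _ in range(n_per_class):
            X.append([
                rng.gauss(shift, 1.0),  # tip: informative
                rng.gauss(shift, 1.0),  # sleeve_1: informative
                rng.gauss(0.0, 1.0),    # sleeve_2: noise
                rng.gauss(0.0, 1.0),    # sleeve_3: noise
            ])
            y.append(label)
    return X, y


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict_1nn(train_X, train_y, x):
    """Label of the single nearest training sample."""
    idx = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i], x))
    return train_y[idx]


def predict_centroid(train_X, train_y, x):
    """Label of the nearest class mean."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda l: sq_dist(centroids[l], x))


def loo_accuracy(predict, X, y, cols):
    """Leave-one-out accuracy using only the feature columns in `cols`."""
    correct = 0
    for i in range(len(X)):
        tr_X = [[row[c] for c in cols] for j, row in enumerate(X) if j != i]
        tr_y = [lab for j, lab in enumerate(y) if j != i]
        xi = [X[i][c] for c in cols]
        correct += predict(tr_X, tr_y, xi) == y[i]
    return correct / len(X)


MODELS = {"1-nn": predict_1nn, "nearest-centroid": predict_centroid}
X, y = make_toy_data()
all_cols = tuple(range(len(FEATURES)))

# Phase 1: choose the model that scores best on the full feature set.
phase1 = {name: loo_accuracy(fn, X, y, all_cols) for name, fn in MODELS.items()}
best_model = max(phase1, key=phase1.get)

# Phase 2: score every non-empty channel subset with the chosen model.
predict = MODELS[best_model]
scores = {}
for k in range(1, len(FEATURES) + 1):
    for cols in itertools.combinations(all_cols, k):
        scores[cols] = loo_accuracy(predict, X, y, cols)

# Keep the smallest subset within TOLERANCE of the best accuracy;
# ties in size are broken in favour of higher accuracy.
best_acc = max(scores.values())
candidates = [c for c, a in scores.items() if a >= best_acc - TOLERANCE]
selected = min(candidates, key=lambda c: (len(c), -scores[c]))

print("phase 1 model:", best_model)
print("selected channels:", [FEATURES[c] for c in selected])
print("accuracy:", scores[selected], "best:", best_acc)
```

The exhaustive subset search is feasible here only because the sketch has four channels; with many sensors a greedy or recursive elimination strategy would be the more practical choice.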