Abstract

Intelligent analysis of small and medium-sized datasets with machine learning tools presents challenges in various application domains. Existing methods fail to provide sufficient accuracy, and their use is accompanied by a range of issues during data analysis. This paper proposes an improvement of the input doubling method for medium-sized data analysis. The existing method employs an augmentation procedure in which the size of the augmented data sample grows quadratically with the size of the original sample, which limits the method's use on medium-sized data. The authors propose enhancing the method by introducing an additional clustering procedure during data augmentation. The training and application algorithms of the improved method are described, and a visualization of the main steps of its operation is provided. Modeling is performed on two medium-sized datasets. Optimal parameters for the improved method are selected, demonstrating its high efficiency. Specifically, the volume of the augmented dataset is reduced significantly (8-9 times across the two datasets), and the duration of the training procedure is reduced substantially (by more than 100 and 260 times for the two datasets, respectively), while high accuracy is maintained.
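
To make the quadratic growth concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify how pairs or targets are built, nor how clusters are used, so the sketch assumes that (a) augmentation concatenates every ordered pair of input vectors, (b) the target of a pair is the difference of the two outputs, and (c) clustering restricts pairing to samples within the same cluster. The function names `double_inputs` and `double_inputs_per_cluster` and the toy data are hypothetical.

```python
from itertools import product


def double_inputs(X, y):
    """Pair every sample with every sample (itself included): n rows -> n * n rows."""
    rows, targets = [], []
    for (xi, yi), (xj, yj) in product(zip(X, y), repeat=2):
        rows.append(list(xi) + list(xj))  # concatenated ("doubled") input vector
        targets.append(yi - yj)           # ASSUMED target: difference of outputs
    return rows, targets


def double_inputs_per_cluster(X, y, labels):
    """ASSUMED use of clustering: apply the same pairing inside each cluster only."""
    rows, targets = [], []
    for c in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        r, t = double_inputs([X[i] for i in idx], [y[i] for i in idx])
        rows.extend(r)
        targets.extend(t)
    return rows, targets


# Toy data: six one-dimensional samples in two obvious groups.
X = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
y = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
labels = [0, 0, 0, 1, 1, 1]  # hypothetical cluster assignment

full_rows, full_targets = double_inputs(X, y)
clustered_rows, clustered_targets = double_inputs_per_cluster(X, y, labels)
print(len(full_rows), len(clustered_rows))  # 36 18
```

Under these assumptions, six samples yield 6 × 6 = 36 augmented rows without clustering and 3 × 3 + 3 × 3 = 18 rows with two equal clusters. The reductions reported in the abstract come from the authors' own procedure and parameter selection, which this sketch does not reproduce.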
