Abstract

Machine learning (ML) technology is rapidly evolving, and the quality of ML systems is becoming an increasingly focal point of attention. Since the ML system is shaped by the dataset it learns from, its quality largely depends on the quality of the dataset. However, the dataset is often collected in a non-standardized process and few requirements and analysis methods are given to assist in identifying the needed dataset. This leads to no guarantee for the quality of dataset, affecting the generalization ability of model and resulting in low training efficiency. To address these issues, this paper proposes an ontology-based requirement analysis method where ontology integrates domain knowledge into the process of data requirements analysis and the coverage criteria on ontology are given for specifying data requirements which can later be used to guide the high-quality construction of the dataset. We held an experiment on an image recognition system in the field of autonomous driving to validate our approach. The result shows that the ML system trained by the dataset constructed through our data requirements analysis method has a better performance.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call