Abstract
In data-based situation assessment, the sheer volume of data acquired and recorded by current technological systems poses a key problem: the data remain unlabeled because labeling would take too much time and incur prohibitive costs. The data must therefore speak for themselves, and the different situations, e.g., normal or faulty, must be learned from the data alone. Clustering methods, also called unsupervised classification methods, can be used for this purpose. They group samples according to some similarity criterion, and the resulting clusters can be associated with different situations whose discrimination may be relevant to obtaining a proper diagnosis. Numerous algorithms have been developed in recent years for clustering numeric data, but these methods are not applicable to categorical data. This is the case for the DyClee algorithm, referred to as DyClee-N in this paper. However, in many application domains, qualitative features are key to properly describing the different situations. DyClee-N was recast to produce a version, named DyClee-C, that accepts categorical features, but only categorical features. This paper presents DyClee-N&C, which subsumes both the numeric and the categorical feature-based algorithms, DyClee-N and DyClee-C respectively. DyClee-N&C is applied to a data set from the literature on risk evaluation in the automobile domain and compared with state-of-the-art clustering methods.
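To make the difficulty with mixed features concrete, the following is a minimal sketch of a Gower-style dissimilarity between two samples that carry both numeric and categorical features. It is not the DyClee-N&C algorithm, whose internals are not described in this abstract; the function name `mixed_dissimilarity`, the `numeric_ranges` parameter, and the example records are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence, Union

Value = Union[float, str]


def mixed_dissimilarity(
    a: Sequence[Value],
    b: Sequence[Value],
    numeric_ranges: Dict[int, float],
) -> float:
    """Gower-style dissimilarity in [0, 1] for mixed-type samples.

    numeric_ranges maps the index of each numeric feature to its range
    (max - min over the data set). Any index not listed is treated as
    categorical. This is an illustrative assumption, not the measure
    used by DyClee-N&C.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("samples must be non-empty and of equal length")

    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            # Numeric feature: absolute difference scaled by the feature range.
            r = numeric_ranges[i]
            total += abs(float(x) - float(y)) / r if r > 0 else 0.0
        else:
            # Categorical feature: simple matching (0 if equal, 1 otherwise).
            total += 0.0 if x == y else 1.0
    return total / len(a)


# Hypothetical automobile-like records: (engine size, body style, fuel type).
car_a = (2.0, "sedan", "gas")
car_b = (4.0, "sedan", "diesel")
d = mixed_dissimilarity(car_a, car_b, numeric_ranges={0: 4.0})
```

A purely numeric distance has no meaningful way to compare `"gas"` with `"diesel"`, while a purely categorical one discards the magnitude of the engine-size difference. Combining a scaled numeric term with a matching term per categorical feature is one common way to handle both in a single measure.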