Abstract

In syntactic and structural pattern recognition, data represented as strings appear in several supervised classification applications. In some situations, data collections show imbalanced class distributions, which typically biases the classifier toward the majority class. To address this problem, some oversampling methods have been proposed for data represented as strings. However, such methods have received little attention in the literature. Therefore, in this paper, we present an oversampling method that works in string space, balances the class distribution by oversampling the minority class, and achieves better classification results than state-of-the-art oversampling methods, especially for highly imbalanced problems. Furthermore, according to our experiments, the proposed method is much faster than those reported in the literature.
