Learning classifier systems (LCSs), an established evolutionary computation technique, are over 30 years old and are supported by extensive empirical testing and the foundations of a theoretical understanding. XCS is a well-tested LCS model that generates optimal (i.e., maximally general and accurate) classifier rules in the final solution. Previous work has hypothesized about the evolution mechanisms in XCS by identifying the bounds on learning and population requirements. However, no work has shown exactly how an optimum rule is evolved, or identified whether the methods within an LCS are being utilized effectively. In this paper, we introduce a method to trace the evolution of classifier rules generated in an XCS-based classifier system. Specifically, we introduce the concept of a family tree, termed a parent-tree, for each individual classifier rule generated in the system during training, which describes the whole generational process for that classifier. Experiments are conducted on two sample Boolean problem domains, the multiplexer and count ones problems, using two XCS-based systems, standard XCS and XCS with code-fragment actions. The analysis of parent-trees reveals, for the first time in XCS, that no matter how specific or general the initial classifiers are, all the optimal classifiers are reached through the mechanism `be specific then generalize' near the final stages of evolution. Populations in which the initial classifiers were slightly more specific than the known `ideal' specificity of the target solutions evolve faster than populations that start from very specific, ideal, or more general classifiers. Consequently, introducing the `flip mutation' method and reverting from the conventional wisdom back to applying rule discovery in the match set has demonstrated benefits in binary classification problems, which has implications for using XCS in knowledge discovery tasks.
It is further concluded that XCS does not directly utilize all relevant information or all breeding strategies to evolve the optimum solution, indicating areas for performance and efficiency improvement in XCS-based systems.
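
To make the parent-tree idea concrete, the following is a minimal illustrative sketch, not the implementation used in this work. It assumes that each classifier keeps references to the classifiers it was bred from and that conditions are ternary strings over {0, 1, #}. The names `Classifier`, `lineage_specificity`, and `tree_depth` are hypothetical, and the example rules for the 6-bit multiplexer are hand-written rather than taken from an experimental run.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Classifier:
    """Hypothetical rule record: ternary condition, action, and parent links."""
    condition: str                      # e.g. "01#1##", where '#' means "don't care"
    action: int
    parents: List["Classifier"] = field(default_factory=list)

    def specificity(self) -> float:
        """Fraction of condition positions that are not '#'."""
        return sum(ch != "#" for ch in self.condition) / len(self.condition)


def tree_depth(cl: Classifier) -> int:
    """Number of generations in the parent-tree rooted at this classifier."""
    return 1 + max((tree_depth(p) for p in cl.parents), default=0)


def lineage_specificity(cl: Classifier) -> List[float]:
    """Specificity along one ancestral line, oldest ancestor first.

    Simplifying assumption: only the first recorded parent is followed,
    although a classifier produced by crossover may have two parents.
    """
    path = []
    node = cl
    while node is not None:
        path.append(node.specificity())
        node = node.parents[0] if node.parents else None
    return path[::-1]


# Hand-written example lineage for the 6-bit multiplexer (illustrative only).
root = Classifier("011101", 1)                 # fully specific ancestor
middle = Classifier("01#1#1", 1, [root])       # partially generalized
optimal = Classifier("01#1##", 1, [middle])    # maximally general and accurate

trace = lineage_specificity(optimal)
depth = tree_depth(optimal)
```

In this toy lineage, the specificity falls from a fully specific ancestor to the 0.5 specificity of the optimal rule, which is the kind of `be specific then generalize' pattern that inspecting a parent-tree is meant to expose.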