Abstract
The marked degree of heterogeneity in persons with developmental dyslexia has motivated the investigation of possible subtypes. Attempts have proceeded from both theoretical models of reading and the application of unsupervised learning (clustering) methods. Previous cluster analyses of data obtained from persons with reading disabilities have suffered from the inherent limitations of unsupervised learning methods. Specifically, the reliability and stability of cluster solutions have proven difficult to determine. Recent developments in the clustering literature have addressed these concerns by permitting checks on the internal validity of the solution. Resampling methods produce consistent groupings of the data independent of initialization effects, while the gap statistic provides a confidence measure for the determination of the optimal number of clusters present in the data. Combining these methods produces a robust data-driven classification that can be compared with theoretically based subtypes to produce better-informed models of developmental dyslexia. The present study is a novel application of resampling (bootstrap aggregating or bagging) methods and the gap statistic to the subtyping of children with developmental dyslexia. The specific aims of this study are: (1) to illustrate the use of bagging methods and the gap statistic in multivariate data obtained from children with developmental dyslexia; and (2) to compare the bagged clustering thresholded by the gap statistic against the predictions of the double-deficit hypothesis. The double-deficit hypothesis is a prominent theoretical model of developmental dyslexia, which predicts three subtypes: phonological, rate, and phonological-rate impaired readers. Three simulated data sets with known cluster structure were created to check the validity and illustrate the utility of bagged clustering with the gap statistic.
Subsequently, a clinical database of standardized test data (eight tests) from 93 children with developmental dyslexia was clustered using these methods. This procedure was repeated on a database of 93 children without reading disability matched for gender and age as a control. Finally, the clustering was repeated on the entire database of 186 participants. Cluster solutions were obtained for an increasing number of clusters (1-10) and were tested against the null hypothesis that no subtypes were present, i.e. the data represented a single cluster. Four clusters were identified in the children with developmental dyslexia. There was no evidence of significant cluster structure in the children without dyslexia. Two clusters were identified when children with and without reading impairments were considered together. Among the participants with developmental dyslexia, there was evidence of a phonological-deficit cluster, a rapid-naming cluster, and a cluster showing both depressed phonological processing and rapid naming. These accounted for 73 of the 93 participants (78%). All three are predicted by the double-deficit hypothesis. The fourth cluster consisted of children with normal phonological and rapid-naming ability incommensurate with their high verbal ability. An analysis of variance with post-hoc multiple comparisons demonstrated that the phonological, rapid-naming, and double-deficit clusters did not differ significantly in age, but the fourth cluster was composed of significantly older children. The mixed data set revealed two clusters. One cluster consisted almost entirely of the double-deficit and phonological subtypes. The other consisted of the participants without dyslexia and the children with dyslexia demonstrating either a single rapid-naming deficit or standardized test scores in the normal range.
A silhouette analysis indicated that the four-cluster solution for the children with developmental dyslexia was superior to the two-cluster solution obtained for the entire data set. The study provides support for the presence of distinct subtypes in children with developmental dyslexia and for the double-deficit hypothesis. Specifically, this study finds three subtypes predicted by the double-deficit hypothesis without the assumption of an a priori theoretical model of reading. Taken together, these subtypes account for 78% of the participants with dyslexia (73 of 93). Further, the percentages of children in each subtype are in good agreement with previous studies. The participants in the subtype not predicted by the double-deficit hypothesis were significantly older than the other three groups. Recent advances in unsupervised learning can be expected to aid the improvement and refinement of the definition of developmental dyslexia. If reliable and consistent subtypes can be identified among persons with developmental dyslexia, it is reasonable to assume that diagnostic and intervention efforts will be greatly improved.