Abstract

In the world of design and decision making, perfect or optimal solutions typically only work in simplified worlds. In the complex constraint-driven nature of reality, satisfactory and sufficient (satisficing) designs and decisions are the norm (Leahey, 2003; Simon, 1956). Clearly, a manual produced with input from a large and diverse committee of experts and stakeholders, when committees have been characterized as “a cul-de-sac down which ideas are lured and then quietly strangled” (Barnett Cocks, 1973), will not be perfect.

Given this context, the opinions expressed here are based on the author's 45+ years of experience, including 12 years as a practicing school psychologist, an intelligence researcher and scholar, a university professor, an author of a major intelligence test (Woodcock-Johnson IV [WJ IV]), and a frequent expert regarding the intelligence quotient (IQ) prong in intellectual disability (ID) death penalty cases following Atkins v. Virginia. This author offers opinions on whether the American Association on Intellectual and Developmental Disabilities (AAIDD) 12th edition of Intellectual Disability: Definition, Diagnosis, Classification, and Systems of Support (Schalock et al., 2021; hereafter referred to as the purple manual) provides satisficing treatment on a handful of select issues. The complex and unresolved issue of using part scores for proxies of general intelligence (g) receives the largest discussion.

Yes. Satisficing. Grade B+. As I argued in 2009/2010 (tinyurl.com/5dsaqh43), the Cattell-Horn-Carroll (CHC) theory of intelligence was, at the time of the 11th edition of the AAIDD manual (hereafter referred to as the green manual; Schalock et al., 2010), the consensus taxonomy of cognitive abilities (Floyd et al., 2021; McGrew, 2005, 2009, 2015; Schneider & McGrew, 2012, 2018; Watson, 2015).
The purple manual now recognizes this consensus by stating, “The approach to intellectual assessment used in this manual incorporates the Cattell-Horn-Carroll theory of intelligence, which is currently the most comprehensive and empirically supported theory of intelligence” (Schalock et al., 2021, p. 25). The AAIDD purple manual IQ prong is now firmly grounded in contemporary intelligence theory and research evidence.

This author frequently finds CHC analysis of IQ scores from different IQ tests, or an earlier version from a series of related tests (e.g., various editions of the Wechsler Intelligence Scale for Children and the Wechsler Adult Intelligence Scale; McGrew, 2015; Watson, 2015), useful when explaining to others the actual consistency of an individual's abilities across time or tests that, at first blush, may appear as inconsistency when only attending to full-scale IQ scores. AAIDD's formal recognition of the CHC theory supports this type of analysis. I would have liked to have seen the inclusion of a CHC model figure (of which many exist in a variety of publications) and a brief table of CHC broad ability construct definitions. Users will need to consult other sources such as Floyd et al. (2021) and Schneider and McGrew (2018).

An A grade was not assigned given the manual's inadvertent muddying of the CHC waters. The CHC glossary definition only mentions fluid (Gf) and crystallized (Gc) intelligence. Gf and Gc are also the only broad CHC abilities mentioned on pages 25–28 (save one exception noted below) and are featured in Table 3 of the manual. Gf and Gc are indeed the consensus king and queen of the CHC taxonomy (McGrew, 2015; Schneider & McGrew, 2012, 2018). “These factors appear to have a degree of centrality in relationship to other intellectual abilities and are the broad ability factors most closely associated with the general factor of intelligence” (emphasis added; Watson, 2015, p. 128).
However, the CHC abilities comprising most contemporary IQ tests may also include Gv (visual-spatial processing), Ga (auditory processing), Gwm (short-term working memory), Gs (processing speed), Gl (learning efficiency), or Gr (retrieval fluency) abilities (McGrew, 2015; Schneider & McGrew, 2012, 2018). (Please note that this author would typically not mention such a minor error, but the manual incorrectly refers to learning efficiency [Gl], as it is named in the AAIDD-referenced source [Schneider & McGrew, 2018], simply as “efficiency.” However, as discussed later, the large number of copyedit errors in the purple manual tarnishes its authoritative stature. Another example is that when mentioning crystallized intelligence across pages 25–28 and in the CHC definition in the glossary [p. 118], the term is used 12 times, and is incorrectly spelled crystalized [sic] in 11 of the 12 instances.) The manual mentions these “additional” abilities after the king and queen (Gf and Gc) are first anointed as the basis of the full-scale score that represents general intelligence: “the full-scale IQ score is based on general intelligence (i.e., g) that encompasses crystalized [sic] intelligence and fluid intelligence part scores, along with as many as six additional broad-strata abilities” (emphasis added; p. 27). The repeated reference to, and preferential treatment of, fluid and crystallized intelligence suggests AAIDD implicitly or explicitly endorses only a partial CHC model, or is trying to straddle a theoretical or measurement issue fence (e.g., see discussion of part scores vs. full-scale IQ scores).

Yes and no. Marginally satisficing. Grade B-.
In high-stakes IQ prong settings (e.g., social security and special education eligibility, Atkins death penalty cases), certain measurement issues almost always require attention.

The purple manual, either in the glossary, text, or in both, provides satisficing treatment of the standard error of measurement (SEM), confidence band intervals, norm obsolescence or the Flynn effect, and practice effects. However, this author was frustrated when trying to locate clear AAIDD descriptions or guidance for certain measurement issues. For example, the explanation of practice effects, although not in the green manual glossary, was listed in the green manual topic index that conveniently directed the reader to page 38 for a definition and practice guidance. Practice effects are now included under progressive error in the purple manual glossary, which may not be immediately apparent or familiar to all users, and only receive a mention in Table 3.5 and a passing comment on page 39 (“frequent re-administrations may lead to overestimating the examinee's true intelligence [i.e., practice effects]”). The Flynn effect is mysteriously described under the second-level subheading of “Making a Retrospective Diagnosis.” A Flynn effect adjustment is a function of the time between when an IQ test was administered and the norming year of the test, temporally orthogonal to whether a potential diagnosis is retrospective or not.

There are other examples of the need to search for the needle (i.e., term, definition, practice guidance) in the haystack (the body of the manual) that could be resolved by retaining the topic index from the green manual. The inability to readily locate the corpus of AAIDD's treatment of key measurement terms and concepts in a coherent, tractable manner is frustrating.
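
As a rough illustration of the measurement concepts named above, the following Python sketch computes an SEM, a confidence band around an obtained score, and a Flynn effect adjustment. It is a minimal sketch under stated assumptions, not the manual's procedure: the function names are hypothetical, the reliability value and scores are illustrative, the band is centered on the obtained score rather than an estimated true score, and the default rate of 0.3 IQ points per year is a commonly cited convention that is not specified in this review.

```python
import math


def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_band(score: float, sd: float = 15.0,
                    reliability: float = 0.96, z: float = 1.96):
    """Band of +/- z * SEM around the obtained score (z = 1.96 is ~95%).

    Assumption: the band is centered on the obtained score, the simplest
    of several conventions.
    """
    half_width = z * sem(sd, reliability)
    return (score - half_width, score + half_width)


def flynn_adjusted(score: float, year_administered: int, norming_year: int,
                   points_per_year: float = 0.3) -> float:
    """Subtract norm obsolescence accumulated since the test was normed.

    Assumption: 0.3 points per year is an illustrative default rate.
    The adjustment depends only on the interval between norming and
    administration, not on when the diagnosis itself is made.
    """
    years_since_norming = year_administered - norming_year
    return score - points_per_year * years_since_norming


if __name__ == "__main__":
    # Hypothetical case: obtained IQ of 75 on a test normed in 1995
    # and administered in 2010.
    adjusted = flynn_adjusted(75, year_administered=2010, norming_year=1995)
    low, high = confidence_band(adjusted)
    print(f"Adjusted score: {adjusted:.1f}, band: {low:.1f} to {high:.1f}")
```

Because the adjustment takes only the administration year and the norming year as inputs, the sketch makes concrete why the adjustment is independent of whether a diagnosis is retrospective.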
The difficulty of locating this guidance is exacerbated when attempting to crosswalk the same measurement concepts between the green and purple manuals.

Finally, high-stakes ID cases often involve case files that include multiple IQ scores across time or from different IQ tests. Some form of guidance, at minimum in a passing reference, on the issues of the convergence of indicators and IQ score exchangeability would have been useful. Users will need to go beyond the AAIDD manual for guidance (see Floyd et al., 2021; McGrew, 2015; and Watson, 2015).

Not satisficing. Grade C. This marginally passing grade is due to AAIDD's part-score position: (a) being inconsistent and confusing within the manual, (b) being at variance with other authoritative sources, and (c) not recognizing central scientific and legal tenets that underlie the complex issue. AAIDD needs to address the part-score issue with preemptive vigor to mitigate confusion and potential misuse of its ambiguous statements. Otherwise, legal entities may fill the void and prescribe a variety of case-specific remedies of dubious quality.

The manual states that “Part scores should not be used in determining whether the individual's level of intellectual functioning … the current evidence indicates that there is no reason to question the validity of the full-scale IQ, even in individual cases where there is significant part/factor score profiles” (emphasis added; p. 28). The “just say no to part scores” position seems clear in the statement that “Gf or Gc scores [sic] should not be used as a proxy for general intelligence, even in unusual cases, such as when there is a substantial spread of subtest scores” (emphasis added; p. 28). (Please note the manual's inattention to what I call “misplaced italics” [e.g., Gc scores] is, unfortunately, frequent in the manual. I comment on this and other numerous copyedit errors later.)
Yet, in the next sentence there is a suggestion that Gf and Gc part scores can be used: “Consistent with current thinking … the valid use of intelligence part scores requires at least 3-6 subtests [emphasis added] of Gf and [sic] Gc” (p. 28). Furthermore, by featuring crystallized and fluid intelligence part or factor scores from common IQ tests in Table 3.2, there is the implicit suggestion that fluid and crystallized part scores hold special value. AAIDD's ambiguous part-score statements only muddy an already contentious and complex issue in high-stakes ID diagnostic settings.

In AAIDD's The Death Penalty and Intellectual Disability (Polloway, 2015), both McGrew (2015) and Watson (2015) suggest that part scores can be used in special cases. (Note that these two chapters, although published in an AAIDD book, do not necessarily represent the official position of AAIDD.) The limited use of part scores is also described in the 2002 National Research Council book on ID and social security eligibility (see McGrew, 2015; Watson, 2015). The authoritative Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) implies that part scores may be necessary when it states that “highly discrepant subtest scores may make an overall IQ score invalid” (American Psychiatric Association, 2013, p. 37). Finally, in the recent APA Handbook of Intellectual and Developmental Disabilities (Glidden, 2021), Floyd et al. state “in rare situations in which the repercussions of a false negative diagnostic decision would have undue or irreparable negative impact upon the client, a highly g-loaded part score (see McGrew, 2015a) might be selected to represent intellectual functioning” (emphasis added; p. 412).

Specifying (either implicitly or explicitly) fluid and crystallized intelligence measures as the most valid g-proxies for unique cases fails to recognize an important distinction between gf/gc and Gf/Gc.
As written, the purple manual references the broad Gf and Gc abilities as per contemporary CHC theory. However, it is not often understood that Horn and Carroll's broad Gf and Gc abilities are not isomorphic with Cattell's two gf/gc general abilities, constructs that are more consistent with the notion of general intelligence (g) as articulated by Cattell's mentor, Spearman (see Schneider & McGrew, 2018). (Note that CHC or the three-stratum Gf-Gc theory differentiates abilities at three levels [strata] of generality. General intelligence [g] is the most general and is at the apex [stratum III] of the hierarchy. Broad CHC abilities [Gf, Gc, Gv, etc.] are at stratum II, and narrow abilities at stratum I are subsumed by the broad abilities.) The purple manual's deference to fluid and crystallized intelligence, and particularly the passing mention of both abilities as potentially suitable part scores to represent general intelligence (see page 28), has a clear Cattell general ability (stratum III) construct ring, not the narrower (broad) notions of CHC Gf and Gc associated with the CHC theory endorsed in the manual. Perhaps this disconnect is the reason for the manual's ambiguous and contradictory treatment of fluid and crystallized intelligence g-proxy part scores.

AAIDD needs to provide guidance regarding whether g-proxy measures should be more broad-like CHC Gf and Gc composites present in most contemporary IQ batteries or more Cattell-like general gf/gc composites. For example, the Wechsler Intelligence Scale for Children, Fifth Edition (WISC-V) provides four-subtest Expanded Verbal (crystallized intelligence; Gc) and Expanded Fluid Index (Gf) scores that are consistent with the broad Gc and Gf CHC constructs. The Woodcock-Johnson III (WJ III) had a four-test Thinking Ability cluster that was more akin to Cattell's general gf, as it was composed of tests that measured Gf, Gv, Ga, and Glr (now split into Gl and Gr).
Interestingly, the Comprehensive Test of Nonverbal Intelligence, Second Edition (CTONI-2), typically considered a special purpose test, produces a six-test Gf-like score that is likely a more robust Gf measure than any Gf score from any individually administered IQ test. The popular CHC cross-battery assessment and interpretation methods and software (Flanagan et al., 2013) allow users to generate unique mixtures of broad CHC-like Gf and Gc scores across multiple test batteries for as many individual tests as a psychologist desires. Schneider (2013) has also provided information on formulas (and software tools; tinyurl.com/3oj2wu79; tinyurl.com/nn11zg81) to calculate statistically sound clusters for any mixture of tests.

With the clear movement to flexible tablet-based digital test libraries and centralized online scoring platforms, publishers are soon likely to provide a menu-driven test selection approach where users can obtain broad CHC-like Gf and Gc scores based on three to four (or more) tests from the same battery of co-normed tests, or across different test batteries within a publisher's stable of test products. For the test battery this author coauthors (WJ IV), three-test Gf and Gc CHC broad clusters are available. By ignoring the WJ IV packaging boundaries of the Cognitive, Oral Language, and Achievement batteries, with minimal psychometric work and a software patch to the online scoring platform, four-test Gf and up to seven-test Gc (Schneider, 2016) IQ scores could readily be made available. Depending on which broad CHC abilities one considers as representing a general Cattell gf score (e.g., Gf, Gv, Ga, Gl, Gr), the current WJ IV could generate such a score based on five to approximately a dozen tests. Without theoretically and psychometrically sound guidance, there is the increased possibility of fluid and crystallized part-score IQ roulette.

The core part- vs.
full-scale IQ score issue, in part, reflects a fundamental tension between science and law. “While science attempts to discover universals hiding among the particulars, trial courts attempt to discover the particulars hiding among the universals” (Faigman, 1999, p. 69). A central issue is whether the scientific principle of ergodicity holds. In simple terms, do group-based research findings generalize or remain invariant when applied to individuals (Fisher et al., 2018; Gomes et al., 2019)? In the courts this is referred to as the General-2-individual or G2i principle (Faigman et al., 2017; National Academies of Sciences, Engineering, and Medicine, 2018). Group-based research consistently suggests that discrepant part scores do not invalidate full-scale IQ scores (Floyd et al., 2021). However, the ergodicity and G2i principles have not been proven to hold: it cannot be known, with any degree of certainty, whether the group-based part- vs. full-scale research findings apply to a specific individual. In fact, almost all psychological processes are nonergodic (Gomes et al., 2019). In a unique n = 1 high-stakes setting, a psychologist may be ethically obligated to proffer an expert opinion on whether the full-scale score is (or is not) the best indicator of general intelligence. There must be room for the judicious use of clinical judgment-based part scores. AAIDD's purple manual complicates rather than elucidates guidance for psychologists and the courts. In high-stakes settings, a psychologist may be hard pressed to explain that their proffered expert opinions are grounded in the AAIDD purple manual, but then explain why they disagree with the “just say no to part scores” AAIDD position.

The theoretical construct of general intelligence (g) is the Loch Ness Monster of psychology. Since the early 1900s, psychologists have been searching for the theoretical basis of g in the form of a brain-based property, entity, or mechanism, to no avail.
There is a distinction made, typically overlooked in applied settings, between psychometric g (represented by a full-scale IQ score) and theoretical g (i.e., a brain-wide property or entity that produces psychometric g). Emerging contemporary research focused on brain networks, dynamic mutualism, and process overlap theories provides compelling evidence that theoretical g may not exist (Barbey, 2018; Kan, van der Maas, & Levin, 2019; Kovacs & Conway, 2016, 2019; Schneider & McGrew, 2019; van der Maas et al., 2017). These studies suggest that “g is an emergent property rather than a causal latent trait: It is the consequence, not the cause, of correlations between cognitive ability tests” (Kovacs & Conway, 2019, p. 192). If theoretical g does not exist, and psychometric g is nothing more than a statistical emergent property index (much like horsepower in a car engine does not represent an entity in the engine, but is the resulting emergent property index from the interaction of distinct engine components), the theoretical glue binding together part scores in the service of the superordinate full-scale g score is dissolved. This sets the stage for cogent theoretical and research-based arguments that certain part scores (viz., Gf/Gc; gf/gc) possess more coherent psychometric and theoretical validity than the atheoretical pragmatic full-scale IQ score (Kovacs & Conway, 2019; Schneider & McGrew, 2019). (For the statistically inclined readers, the fundamental issue is that g, as a theoretical construct, is modeled as a reflective latent trait construct. Contemporary non-g theoretical models suggest psychometric g is a pragmatic emergent formative construct that can explain the positive manifold among a collection of individual tests; Kovacs & Conway, 2019.)

AAIDD's ambiguous part-score statements raise more questions than answers. The wild west of easily crafted and psychometrically defensible three-or-more test Gf and Gc or gf/gc composite scores is here.
Guidance is needed on: (a) how many tests should be required, at a minimum, to comprise these fluid and crystallized psychometric g-proxy composites, (b) whether these composites should align more with broad Gf and Gc as per CHC theory or should align more with the general Cattell gf/gc, and (c) what psychometric methods are permissible for crafting such composite scores (e.g., only norm-based composites from tests from the same standardization sample; composites from statistically equated/linked test batteries; composites derived from statistical formulas for tests that do not share common or statistically linked standardization samples).

No. Not satisficing. Grade D. As of this writing, I have counted at least 20 copyedit errors, and these are only noted in the pages relevant to the IQ prong. Many are “misplaced italics” errors as described previously (e.g., page 28 and reference pages 138, 141, 146, 149). Kranzler is misspelled as Kanzler both on page 26 and in the references. On page 42 “test norms” is incorrectly written as “test e-norms.” The Floyd, Farmer, Schneider, and McGrew reference is correctly cited with the 2021 publication date in the references but is referenced as “in press” on pages 28 and 29. Other errors are mentioned earlier in this article as well.

Such copyedit errors, in the quantity and variety observed, should not be present in what is intended to be the authoritative definitive source for diagnosing ID. The word authoritative conveys something as being official, approved, or definitive. It also connotes that the object is precise, accurate, and correct. These easily preventable errors, plus the absence of a topic index, may suggest to some that the manual was carelessly and hastily tossed together.
These problems tarnish both the purple manual and the professional reputation of AAIDD.

Just as a full-scale IQ score may not always be the best psychometric proxy for estimating an individual's general intellectual functioning, this author refrains from providing a global summative grade for the IQ prong material in the purple manual. Given this author's primary intelligence theory-practice gap criticism of the 11th edition green manual, AAIDD's endorsement of the CHC theory of intelligence is the most important positive revision to the intellectual prong of the purple manual. The weakest part of the intellectual functioning component of the manual is the obfuscation of AAIDD's g-proxy part-score position, a position at variance with most other prominent professional sources and a position that fails to recognize the underlying central scientific and legal evidence issues. Practitioners and, more importantly, individuals with unique cognitive characteristics, cannot wait another decade until the 13th edition manual is published for more robust AAIDD part-score guidance.

I may be a tough grader, but my evaluative judgements (positive and negative) are intended to push AAIDD higher and farther to provide the best possible guidance for the identification of individuals with intellectual or developmental disabilities.

The author declares a possible financial conflict of interest as a royalty receiving author of the WJ IV, an individually administered IQ test mentioned in this review and often used in the diagnosis of ID. The author also received a complimentary copy of the new AAIDD manual as a member of the advisory committee. The author's complete set of credentials and all potential conflicts of interest can be found at: http://themindhub.com/about-iap/the-director. The author thanks Randy Floyd for comments on an earlier version of this review.
