This review addresses what should be assessed in language acquisition and how to assess it. The design of a language assessment is crucially connected to its purpose, whether diagnosis, development of an intervention plan, or research. Precise profiles of language strengths and weaknesses are required for clear definitions of the phenotypes of particular language and neurodevelopmental disorders. The benefits and costs of formal tests versus language sampling assessments are reviewed. Content validity, theoretically and empirically grounded in child language acquisition, is argued to be centrally important for appropriate assessment. Without this grounding, links between phenomena can be missed, and interpretations of underlying difficulties can be compromised. The sensitivity and specificity of assessment instruments are often evaluated against a gold standard of existing tests and diagnostic practices, but problems arise if that standard is biased against particular groups or dialects. The paper addresses the issues raised by the goal of unbiased assessment of children from diverse linguistic and cultural backgrounds, especially bilingual children and speakers of non-mainstream dialects. A variety of new approaches to language assessment are discussed, including dynamic assessment, experimental tools such as intermodal preferential looking, and training studies that assess generalization. Emphasis is placed on the need for measures of the process of acquisition rather than just levels of achievement. Copyright © 2010 John Wiley & Sons, Ltd. For further resources related to this article, please visit the WIREs website.