A clinically useful characterization of the cognitive aging process requires the development of valid and robust behavioral tests, with an emphasis on explaining and understanding typical inter-individual variability in cognition. Here, using a dataset that includes behavioral scores collected with the National Institutes of Health Toolbox Cognition Battery (NIHTB-CB) and other auxiliary tests, we examined (1) the differences between young and old adults across different cognitive domains, (2) the strength of across-subject correlations in behavioral test scores, (3) the consistency of low-dimensional behavioral representations across age using factor analysis, and (4) the accuracy of behavioral scores in predicting participants’ age. Our results revealed that (1) elderly females had better verbal episodic memory scores than elderly males, (2) across-subject correlations between behavioral tests varied with age group, (3) although a three-factor model explained the behavioral data in both age groups, some tasks loaded on different factors in the two groups, and (4) the age-performance relationship (i.e., a regression model linking age to cognitive scores) in one group could not be extrapolated to predict age in the other group, indicating an inconsistency in age-performance relationships across groups. These findings suggest that executive function tests might tap into different cognitive processes in different age groups, which in turn implies that a statistically significant between-group difference in test performance might not always reflect differences in the same underlying cognitive processes. Overall, this study calls for more caution when interpreting age-related differences and similarities between age groups with different cognitive abilities, even when the same tests are used.