The replication crisis has renewed discussion about the impact of measurement quality on the precision of psychological research. High measurement quality is associated with low measurement error, yet the role of reliability in the quality of experimental research is not always well understood. In this study, we examine the role of reliability through its relationship with statistical power, focusing on between-group designs in experimental studies. We outline a latent variable framework to investigate this nuanced relationship through equations. An under-evaluated aspect of the relationship is the variance, and hence the homogeneity, of the subpopulation from which the study sample is drawn. Higher homogeneity implies lower reliability but yields higher power. We then demonstrate the impact of this relationship between reliability and power by simulating different scenarios of large-scale replications with between-group designs. We find negative correlations between reliability and power when there are sizable differences in the latent variable variance and negligible differences in the other parameters across studies. Finally, we analyze the data from the replications of the ego depletion effect (Hagger et al., 2016) and the replications of the grammatical aspect effect (Eerland et al., 2016), both of which used between-group designs, and the results align with previous findings. These applications show that a negative relationship between reliability and power is a realistic possibility with consequences for applied work. We suggest that more attention be given to the homogeneity of the subpopulation when study-specific reliability coefficients are reported in between-group studies.
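The claim that higher homogeneity lowers reliability while raising power can be illustrated with a minimal numerical sketch. It is not the framework developed in the paper, and it rests on several simplifying assumptions: classical test theory (observed score = latent score + independent error), a raw latent mean difference and an error variance held fixed across studies, equal group sizes, and a two-sided z-test approximation to power. The function names and the parameter values (`raw_diff = 0.5`, `var_error = 1.0`, `n_per_group = 50`) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def reliability(var_true, var_error):
    """Classical test theory reliability: latent variance over observed variance."""
    return var_true / (var_true + var_error)


def power_two_group(raw_diff, var_true, var_error, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test on observed scores.

    Assumes equal group sizes and a common observed variance
    var_true + var_error in both groups (illustrative assumptions).
    """
    std_normal = NormalDist()
    sd_observed = sqrt(var_true + var_error)
    # Standardized observed effect, scaled by the standard error of the difference.
    noncentrality = (raw_diff / sd_observed) * sqrt(n_per_group / 2)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return (std_normal.cdf(noncentrality - z_crit)
            + std_normal.cdf(-noncentrality - z_crit))


# Hypothetical studies that differ only in latent variable variance,
# ordered from most heterogeneous to most homogeneous subpopulation.
latent_variances = (4.0, 2.0, 1.0, 0.5)
scenarios = [
    (v, reliability(v, 1.0), power_two_group(0.5, v, 1.0, 50))
    for v in latent_variances
]

for var_true, rel, pw in scenarios:
    print(f"latent variance={var_true:.1f}  reliability={rel:.3f}  power={pw:.3f}")
```

Under these assumptions, shrinking the latent variance reduces reliability, because error makes up a larger share of the observed variance, while the standardized observed effect and hence power increase. This mirrors the negative relationship described above when only the latent variable variance differs across studies.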