Proton magnetic resonance spectroscopy (1H-MRS) offers a growing variety of methods for querying potential diagnostic biomarkers of multiple sclerosis in living central nervous system tissue. Over the past three decades, 1H-MRS has yielded a rich dataset suggestive of numerous metabolic alterations in lesions, normal-appearing white matter, gray matter, and spinal cord of individuals with multiple sclerosis, but this body of information is not free of apparent internal contradiction. The use of 1H-MRS signals as diagnostic biomarkers depends on reproducible and generalizable sensitivity and specificity to disease state, which can be confounded by a multitude of influences. These include experimental group classification and demographics; acquisition sequence; spectral quality and quantifiability; the contribution of macromolecules and lipids to the spectroscopic baseline; spectral quantification pipeline; voxel tissue and lesion composition; T1 and T2 relaxation; B1 field characteristics; and other features of study design, spectral acquisition and processing, and metabolite quantification about which the experimenter may possess imperfect or incomplete information. The direct comparison of 1H-MRS data from individuals with and without multiple sclerosis poses a special challenge in this regard, as several lines of evidence suggest that experimental cohorts may differ significantly in some of these parameters. We review the existing in vivo 1H-MRS findings on central nervous system metabolic abnormalities in multiple sclerosis and its subtypes within the context of study design, spectral acquisition and processing, and metabolite quantification, and we offer an outlook on technical considerations, including the growing use of machine learning, for future investigations into diagnostic biomarkers of multiple sclerosis measurable by 1H-MRS.