To better interpret the Raman spectra from mammalian cells, it is often desirable to reduce their complexity by decomposing them into the spectral contributions from individual macromolecules or types of macromolecules. Diverse methods exist for demixing complex spectra, each with its own benefits and drawbacks. Some methods require a library of component spectra that might not be available, while others are hampered by noise and by peak congestion, in which many closely spaced peaks overlap. By rapidly fitting the individual peaks in every spectrum of a Raman hyperspectral data set, we obtained peak parameters from which we determined the amplitude trend of every peak across the data set. We then grouped similar trends with k-means clustering and used the parameters of all the peaks in a given cluster to reconstruct a spectrum representative of that cluster. This method produced spectra that were less distorted by unrelated overlapping peaks or noise and less congested than those in the hyperspectral set, thereby improving peak identification and macromolecule recognition. We demonstrated the method with Raman spectra from a perchlorate-polystyrene model system and extended it to complex spectra from methanol-fixed mammalian cells. We were able to recover independent spectra of perchlorate and polystyrene in the model system and spectra pertaining to individual macromolecular types (proteins, nucleic acids, lipids) from the mammalian cell data. We discuss how imperfections in spectral preprocessing and peak fitting can adversely affect the results. In summary, we have provided a proof of concept for a novel mixture resolution method whose attributes differ from those of existing methods.
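The pipeline described above (fit peaks, collect amplitude trends, cluster the trends with k-means, rebuild one spectrum per cluster) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the peak fitting has already been done, and it replaces real data with a synthetic peak table: the peak positions, widths, component labels "A" and "B", noise level, Lorentzian line shape, z-score normalization of the trends, and the use of mean amplitudes for reconstruction are all assumptions made for illustration.

```python
import math
import random

# Hypothetical fitted-peak table: (center, width, scale, true component).
# All values are synthetic and chosen only for illustration.
PEAKS = [(620, 6, 1.0, "A"), (1000, 5, 2.0, "A"), (1600, 8, 0.7, "A"),
         (935, 5, 3.0, "B"), (460, 7, 0.5, "B")]
N_SPECTRA = 10
rng = random.Random(0)  # fixed seed so the sketch is reproducible


def simulated_amplitudes(scale, component):
    """Amplitude of one peak across the data set: A rises, B falls, plus noise."""
    trend = [i / (N_SPECTRA - 1) for i in range(N_SPECTRA)]
    if component == "B":
        trend = [1 - t for t in trend]
    return [scale * (t + rng.gauss(0, 0.02)) for t in trend]


def zscore(trend):
    """Standardize a trend so clustering compares its shape, not its scale."""
    mean = sum(trend) / len(trend)
    sd = math.sqrt(sum((a - mean) ** 2 for a in trend) / len(trend)) or 1.0
    return [(a - mean) / sd for a in trend]


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def lorentzian(x, center, width, amplitude):
    """Lorentzian line shape (an assumed peak model)."""
    return amplitude * width ** 2 / ((x - center) ** 2 + width ** 2)


def reconstruct(cluster, labels, amplitudes, axis):
    """Sum the peaks assigned to one cluster, each at its mean amplitude."""
    spectrum = [0.0] * len(axis)
    for (center, width, _, _), amps, label in zip(PEAKS, amplitudes, labels):
        if label == cluster:
            mean_amp = sum(amps) / len(amps)
            for i, x in enumerate(axis):
                spectrum[i] += lorentzian(x, center, width, mean_amp)
    return spectrum


axis = list(range(400, 1801))  # wavenumber axis in cm^-1
amplitudes = [simulated_amplitudes(s, comp) for _, _, s, comp in PEAKS]
labels = kmeans([zscore(a) for a in amplitudes], k=2)
cluster_spectra = {c: reconstruct(c, labels, amplitudes, axis)
                   for c in set(labels)}
```

In this toy case the peaks that rise together end up in one cluster and the peaks that fall together in the other, so each reconstructed spectrum contains only the peaks of one component. With real data, the quality of the result would depend on the preprocessing and peak fitting that this sketch skips.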