The identification of molecules within complex mixtures is a major bottleneck in natural products (NPs) research. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) has emerged as the main tool for the high-throughput characterization of NPs. The large volume of data generated by LC-MS/MS presents a challenge for data processing and interpretation, and the LC-MS/MS molecular network (MN) is one of the most prominent tools for analyzing large MS/MS data sets, widely used for rapid classification, identification, and structural inference of unknown compounds. However, a large number of redundant nodes in the MN leads to false-positive results. To address this problem, we propose an in-depth analysis of MN. In this study, in-depth analysis of the MNs of five NPs, representing the common structural classes of saponin, steroid, flavonoid, alkaloid, and phenolic acid, revealed the presence of redundant nodes (other adducts, isotopes, and in-source fragments) in addition to the normal nodes, which can lead to false-positive identifications. The causes of the different types of redundant nodes were discussed and experimentally verified, and the impact of redundant nodes was found to be mitigated by optimizing the experimental conditions and employing Feature-Based Molecular Networking. Furthermore, Ion Identity Molecular Networking can rapidly discover and screen redundant nodes, simplifying the in-depth analysis of MN and improving the network connectivity of structurally related molecules. Finally, a combination formulation of seven NPs is used as an example to provide a guide to in-depth analysis of MN for the comprehensive characterization of complex systems. This study highlights the importance of in-depth analysis of MN for better understanding and use of MS/MS data in complex systems, reducing the false-positive rate of identification by screening and filtering redundant nodes.
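To make the notion of a redundant node concrete, the following minimal Python sketch flags pairs of co-eluting features whose m/z difference matches a common adduct, isotope, or in-source loss. It is an illustration only and not the procedure used in this study: the function name `flag_redundant_pairs`, the tolerances, the choice of positive-mode singly charged ions, and the feature values are all hypothetical assumptions. Ion Identity Molecular Networking relies on additional evidence, such as chromatographic peak-shape correlation, that is not modeled here.

```python
from itertools import combinations

# Monoisotopic m/z differences (Da) between singly charged positive ions.
# The selection of relationships is an assumption made for illustration.
KNOWN_DELTAS = {
    "[M+NH4]+ vs [M+H]+": 17.026549,
    "[M+Na]+ vs [M+H]+": 21.981944,
    "[M+K]+ vs [M+H]+": 37.955881,
    "13C isotope": 1.003355,
    "in-source H2O loss": 18.010565,
}


def flag_redundant_pairs(features, mz_tol=0.005, rt_tol=0.05):
    """Return (id_a, id_b, label) for co-eluting features whose m/z gap
    matches one of the known deltas.

    features: list of dicts with keys "id", "mz", and "rt" (minutes).
    mz_tol and rt_tol are hypothetical tolerances in Da and minutes.
    """
    flagged = []
    for a, b in combinations(features, 2):
        # Redundant ions of the same molecule should elute together.
        if abs(a["rt"] - b["rt"]) > rt_tol:
            continue
        gap = abs(a["mz"] - b["mz"])
        for label, delta in KNOWN_DELTAS.items():
            if abs(gap - delta) <= mz_tol:
                flagged.append((a["id"], b["id"], label))
    return flagged


# Hypothetical feature table; the values are invented for this example.
features = [
    {"id": 1, "mz": 500.3000, "rt": 5.00},  # taken as [M+H]+
    {"id": 2, "mz": 522.2819, "rt": 5.01},  # sodium adduct of feature 1
    {"id": 3, "mz": 501.3034, "rt": 5.00},  # 13C isotope peak of feature 1
    {"id": 4, "mz": 482.2894, "rt": 5.00},  # water-loss fragment of feature 1
    {"id": 5, "mz": 522.2819, "rt": 9.00},  # same m/z as feature 2, different RT
]

flagged = flag_redundant_pairs(features)
for id_a, id_b, label in flagged:
    print(f"features {id_a} and {id_b}: {label}")
```

In this sketch, features 2, 3, and 4 are each linked to feature 1 and could be collapsed or filtered before annotation, whereas feature 5 is left alone because it does not co-elute with the others.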