Several database search methods have been employed in untargeted metabolomics using high-resolution mass spectrometry to comprehensively annotate acquired product ion spectra. Recent advances in in silico analysis have made it possible to rank the molecular structures in a database by their degree of agreement with a query product ion spectrum. However, some search results may be false positives, necessitating a method for controlling the false discovery rate (FDR). This study proposes 4 simple methods for controlling the FDR in compound search results. Instead of preparing a decoy compound database, a decoy spectral dataset was created from the measured (target) product ion spectral dataset. Target and decoy product ion spectra were searched against an identical compound database to obtain target and decoy hits, and the FDR was estimated from the numbers of target and decoy hits. Three decoy generation methods were compared: polarity switching, mirroring, and spectral sampling. Additionally, a second-rank method was examined, in which the second-ranked hits in the target search results serve as decoy hits. The performance of these 4 methods was evaluated by annotating product ion spectra from the MassBank database using the SIRIUS 5 CSI:FingerID scoring method. The results indicate that the FDRs estimated using the second-rank method were the closest to the true FDR of 0.05. Using this method, a compound search was performed on 4 human metabolomic data-dependent acquisition datasets at an FDR of 0.05. The FDR-controlled compound search successfully identified several compounds not present in the Human Metabolome Database.
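The estimation of FDR from target and decoy hit counts can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the code below assumes the conventional target-decoy ratio (decoy hits divided by target hits at or above a score threshold). The function names `estimate_fdr` and `score_threshold_for_fdr` and the example scores are hypothetical, not taken from the study.

```python
from typing import Optional, Sequence


def estimate_fdr(target_scores: Sequence[float],
                 decoy_scores: Sequence[float],
                 threshold: float) -> float:
    """Estimate FDR as (decoy hits) / (target hits) at or above a threshold.

    Assumption: the conventional target-decoy ratio, capped at 1.0.
    """
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # no accepted target hits, so nothing to be false
    return min(1.0, n_decoy / n_target)


def score_threshold_for_fdr(target_scores: Sequence[float],
                            decoy_scores: Sequence[float],
                            max_fdr: float = 0.05) -> Optional[float]:
    """Return the lowest target score whose estimated FDR is <= max_fdr.

    Candidate thresholds are the observed target scores, tried from the
    most permissive (lowest) upward. Returns None if none qualifies.
    """
    for threshold in sorted(set(target_scores)):
        if estimate_fdr(target_scores, decoy_scores, threshold) <= max_fdr:
            return threshold
    return None


# Hypothetical top-hit scores for target and decoy spectra.
target = [0.9, 0.8, 0.7, 0.6, 0.5]
decoy = [0.55, 0.3, 0.2, 0.1, 0.05]
cutoff = score_threshold_for_fdr(target, decoy, max_fdr=0.05)
```

Under the same assumption, the second-rank method would pass the scores of the second-ranked hits from the target search as `decoy_scores`, so no separate decoy search is needed.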