Abstract

Background

In metabolomics research using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation, used to determine elemental compositions with similar theoretical masses. However, incorrect hits derived from errors in mass analysis will be included in the results of elemental composition searches. To assess the quality of peak annotation, this study presents a novel methodology for evaluating false discovery rates (FDRs). Based on the FDR analyses, several aspects of an elemental composition search are discussed, including setting a threshold, estimating the FDR, and the types of elemental composition databases most reliable for searching.

Methodology/Principal Findings

The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30–50%) were obtained when searching time-of-flight (TOF)-MS data against the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that the FDR could be improved by using a smaller compound database with higher completeness.

Conclusions/Significance

High-accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR <10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data.
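
The dependence of the FDR on mass accuracy, threshold width, and database properties can be illustrated with a toy Monte Carlo simulation. The sketch below is not the four-parameter method of this study. It is a simplified model under loudly stated assumptions: database masses are drawn uniformly between 100 and 1000 Da, measurement errors are Gaussian in ppm, only the nearest database entry is reported as the hit, and the function name `simulate_fdr` and all its parameters are hypothetical.

```python
import bisect
import random


def simulate_fdr(n_db=20000, completeness=0.5, sigma_ppm=3.0,
                 threshold_ppm=10.0, n_queries=2000, seed=0):
    """Toy estimate of the FDR of a nearest-mass search (illustrative only).

    completeness: assumed fraction of queried metabolites present in the database.
    sigma_ppm: assumed standard deviation of the mass measurement error (ppm).
    threshold_ppm: half-width of the search window (ppm).
    """
    rng = random.Random(seed)
    low, high = 100.0, 1000.0  # assumed mass range in Da
    # Assumption: database masses are uniformly distributed over the range.
    db = sorted(rng.uniform(low, high) for _ in range(n_db))

    hits = 0
    false_hits = 0
    for _ in range(n_queries):
        # Decide whether the true compound is covered by the database.
        in_db = rng.random() < completeness
        true_mass = rng.choice(db) if in_db else rng.uniform(low, high)
        # Simulate the measurement with a Gaussian relative error.
        measured = true_mass * (1.0 + rng.gauss(0.0, sigma_ppm) * 1e-6)

        # Find the database entry nearest to the measured mass.
        i = bisect.bisect_left(db, measured)
        candidates = db[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda m: abs(m - measured))

        # A hit is reported only if the nearest entry lies within the threshold.
        if abs(nearest - measured) / measured * 1e6 <= threshold_ppm:
            hits += 1
            # The hit is false unless it is exactly the true compound.
            if not (in_db and nearest == true_mass):
                false_hits += 1

    return false_hits / hits if hits else float("nan")


if __name__ == "__main__":
    for c in (0.9, 0.5, 0.1):
        print(f"completeness={c}: FDR={simulate_fdr(completeness=c):.2f}")
```

In this toy model, lowering the assumed completeness raises the simulated FDR, because queries for compounds missing from the database can only produce false hits. This is in line with the qualitative conclusion above, although the numbers it produces are not comparable to the FDR values reported in the study.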

Highlights

  • In recent metabolomics studies using mass spectrometry (MS), advances in high-resolution MS, including time-of-flight (TOF)- [1], Orbitrap- [2], and Fourier transform ion cyclotron resonance (FTICR)-MS [3], have made it possible to acquire metabolome data with accurate mass-to-charge ratios (m/z) [4,5,6]

  • Although putative elemental compositions can be assigned to many metabolite signals using these methods, incorrect hits derived from errors in mass analysis will be included in the search results [17]

  • False discovery rates (FDRs) of elemental composition search results are expected to be affected by three factors: (i) the accuracy of the mass analysis of the query metabolome data (σ), (ii) the width of the search threshold (Δthres), and (iii) the properties of the compound database, such as its density and completeness

Summary

Introduction

In recent metabolomics studies using mass spectrometry (MS), advances in high-resolution MS, including time-of-flight (TOF)- [1], Orbitrap- [2], and Fourier transform ion cyclotron resonance (FTICR)-MS [3], have made it possible to acquire metabolome data with accurate mass-to-charge ratios (m/z) [4,5,6]. Systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation, used to determine elemental compositions with similar theoretical masses [3,10,11,12,13,14,15]. Although putative elemental compositions can be assigned to many metabolite signals using these methods, incorrect hits (i.e., false positives) derived from errors in mass analysis will be included in the search results [17]. Evaluation of false discovery rates (FDRs) in elemental composition search results is therefore essential to minimize misinterpretation of metabolome data. Based on the FDR analyses, several aspects of an elemental composition search are discussed, including setting a threshold, estimating the FDR, and the types of elemental composition databases most reliable for searching.
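
The search step described above can be sketched in a few lines of Python. This is a minimal illustration, not the search tool used in the study: the function names `formula_mass` and `search_compositions`, the three-entry database, and the use of neutral monoisotopic masses (ion adducts are ignored) are all assumptions made for the example.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes, rounded to 6 decimals.
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def formula_mass(formula):
    """Theoretical monoisotopic mass of a simple formula such as 'C6H12O6'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def search_compositions(query_mass, database, threshold_ppm):
    """Return every formula whose theoretical mass lies within the threshold."""
    hits = []
    for formula in database:
        theoretical = formula_mass(formula)
        error_ppm = (query_mass - theoretical) / theoretical * 1e6
        if abs(error_ppm) <= threshold_ppm:
            hits.append(formula)
    return hits


if __name__ == "__main__":
    # Hypothetical mini-database: glucose, theobromine/theophylline, aspirin.
    database = ["C6H12O6", "C7H8N4O2", "C9H8O4"]
    query = 180.06339  # measured neutral mass, assumed for illustration
    for threshold in (10.0, 5.0):
        print(threshold, search_compositions(query, database, threshold))
```

With a 10 ppm threshold the query matches both C6H12O6 and C7H8N4O2, whose theoretical masses differ by about 7.4 ppm, so at least one hit is necessarily false. Narrowing the threshold to 5 ppm leaves only C6H12O6, which is only safe if the measurement error is small enough to justify such a narrow window.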
