MGDB: A Novel Bioinformatics Quality Control Tool for Clinical Next-Generation Sequencing
Background and Objectives: Next-generation sequencing (NGS) is transforming clinical diagnostics by enabling the detection of genetic variation with unprecedented precision. However, successful implementation of NGS workflows necessitates stringent quality control. This study introduces Molecular Genetics Dashboard (MGDB), a novel bioinformatics tool designed to enhance quality control in clinical NGS workflows. Methods: Using the Python Dash framework for visualizations and MySQL databases, we have developed a novel tool for variant-level monitoring of clinical NGS sequencing runs. MGDB uses Docker Compose containerization for improved portability and can flexibly include or exclude samples from accumulated statistics with notes from interpreters. Results: MGDB facilitates variant-level run-to-run monitoring, ensuring the consistency of variant detection across sequencing cycles. The tool provides an interactive platform for visualizing and assessing variant data, identifying potential inconsistencies or outliers and improving data management and interpretation compared to traditional methods. MGDB was tested using samples sequenced with Oncomine Focus/Comprehensive Plus assays on S5 sequencers and analyzed via IonReporter software. Conclusions: MGDB offers a robust and user-friendly solution for enhancing quality control in clinical NGS workflows, contributing to greater accuracy and reliability in variant detection. The tool is freely available on GitHub: https://github.com/acri-nb/GeneticVariantsDB.
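The abstract does not say how MGDB computes its run-to-run statistics, so the following is only a minimal sketch of the general idea, not the tool's implementation. It assumes a control variant whose allele frequency (VAF) is recorded once per run, and it flags runs with a modified z-score based on the median absolute deviation, a technique chosen here for illustration. The function name, the run IDs, and the 3.5 cutoff are all hypothetical.

```python
from statistics import median


def flag_outlier_runs(vaf_by_run, excluded=frozenset(), cutoff=3.5):
    """Return run IDs whose VAF is an outlier among the included runs.

    vaf_by_run: dict mapping run ID -> VAF of one control variant (hypothetical input).
    excluded:   run IDs an interpreter has chosen to leave out of the statistics.
    cutoff:     modified z-score threshold (3.5 is a common rule of thumb, assumed here).
    """
    kept = {run: vaf for run, vaf in vaf_by_run.items() if run not in excluded}
    if len(kept) < 3:
        return []  # too few runs to estimate spread
    center = median(kept.values())
    mad = median(abs(vaf - center) for vaf in kept.values())
    if mad == 0:
        return []  # no spread among the included runs, nothing to flag
    return sorted(
        run for run, vaf in kept.items()
        if 0.6745 * abs(vaf - center) / mad > cutoff
    )


# Illustrative values only; these are not data from the study.
runs = {"run1": 0.50, "run2": 0.51, "run3": 0.49,
        "run4": 0.50, "run5": 0.52, "run6": 0.80}
flagged = flag_outlier_runs(runs)
flagged_after_exclusion = flag_outlier_runs(runs, excluded={"run6"})
```

Excluding a run removes it both from the reference statistics and from the list of candidates, which mirrors the include/exclude behavior the abstract describes only in general terms.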
- Front Matter
- 10.1016/j.bdq.2015.09.003
- Sep 1, 2015
- Biomolecular Detection and Quantification
Guest editor's introduction for BDQ special issue: ‘Advanced Molecular Diagnostics for Biomarker Discovery’
- Research Article
- 10.1158/1538-7445.am2024-5054
- Mar 22, 2024
- Cancer Research
Formalin-fixed paraffin-embedded (FFPE) biopsies are highly valuable and widely used tissue specimens for clinical diagnostics. However, obtaining sufficient and high-quality nucleic acid material from limited FFPE samples presents a challenge for downstream molecular analysis, such as next-generation sequencing (NGS). We present an optimized sequential extraction method that generates high-quality DNA and RNA from a single set of input tissues that is automatable and operation-friendly. This workflow performs well with reduced FFPE tissue input and efficiently supports various high-throughput clinical NGS applications. The DNA/RNA yield, quality, purity, and impacts on NGS assay performances were on par or better than an existing validated comparator extraction method. With comparable FFPE input, the new method demonstrated a superior extraction performance with significantly higher yield, quality, and purity. For DNA, NGS libraries were made with two different library preparation methods: a TA-ligation-based method and a single-strand-based method, followed by hybrid capturing and sequencing. The DNA from the new extraction method demonstrated a superior library conversion rate and improved target enrichment uniformity with both chemistries. This provides the potential of reduced input requirements, allowing very limited tissue, such as fine needle aspirate or core needle biopsy, for clinical NGS testing. The sequencing results were highly concordant between the existing and new extraction methods when the extracted DNA was subjected to a validated comprehensive genomic profiling (CGP) clinical test, demonstrating the quality and robustness of the extraction. For RNA, NGS libraries were made following the validated CGP clinical test with hybrid capturing, and the results were comparable between the two extraction methods. 
In conclusion, an FFPE extraction method was optimized to allow more clinical biopsy samples to be tested with different NGS workflows, providing a better diagnostic value for patient care. Citation Format: Rebecca Hong, Jieying Chu, Steven Hsiao, Daniel Whang, Steven P. Rivera, Heng Xie, Jiannan Guo. Optimization and evaluation of an FFPE dual extraction protocol for next-generation sequencing applications [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 5054.
- Research Article
- 10.1016/j.jmoldx.2021.07.012
- Aug 5, 2021
- The Journal of Molecular Diagnostics
Accurate Detection and Quantification of FLT3 Internal Tandem Duplications in Clinical Hybrid Capture Next-Generation Sequencing Data
- Research Article
- 10.1200/jco.2020.38.15_suppl.3571
- May 20, 2020
- Journal of Clinical Oncology
3571 Background: Genomic events giving rise to driver-negative lung adenocarcinoma (LA) in never smokers remain elusive. Here we report results of whole exome sequencing (WES) and targeted RNA sequencing in never smokers (NS) who had no driver mutations found on routine clinical testing by targeted next-generation sequencing (NGS). Methods: Never smokers with EGFR/ALK-negative LA by clinical biomarker testing at Princess Margaret Cancer Centre were first subjected to various clinical NGS profiling platforms (table). Where tissue was available, those negative for potential drivers on clinical NGS then underwent WES (mean coverage > 200x) and Oncomine Comprehensive v.3 RNA sequencing. We analyzed mutational signatures (MS) of the driver-negative cohort based on the COSMIC catalog and assessed the median tumor mutation burden (mTMB, in mutations per megabase, mut/Mb) in cases without a smoking MS, to avoid confounders. Results: Of 159 never smokers profiled with clinical NGS, potential drivers were found in 86 (54%): 75 (87%) with mutations in known LA driver genes and 11 (13%) with fusions. Among the remaining never smokers that tested negative by clinical NGS, 35 (48%) had available tissue for further testing. The Oncomine panel identified 9 cases (25%) with fusions or MET exon14 mutation (n = 7). Within the driver-negative group, 24 (92%) underwent WES. Three tumors had WES base substitution patterns that were consistent with a smoking-related MS (MS 4). Twenty-one patients exhibited signatures common across all cancer types (MS 5) or associated with DNA mismatch repair (MS 6, MS 20) or APOBEC over-activation (MS 2, MS 13). In the driver-negative group, we identified 7 patients with somatic mutations in the KMT2 family (4 KMT2C, 4 KMT2A, 1 KMT2D), whose members are histone methyltransferases and putative tumor suppressors. The mTMB in the driver-negative group was 1.92, while one outlier with APOBEC MS and KMT2C/A mutations had a TMB of 16.8.
Conclusions: Never smokers with driver-negative LA are a heterogeneous group, with different MS and a wide TMB range. Mutations in the KMT2 family are frequently found in driver-negative LA in never smokers and warrant further investigation. [Table: see text]
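The abstract reports tumor mutation burden in mutations per megabase (mut/Mb) and summarizes it as a median. A minimal sketch of that arithmetic follows. The function name, mutation counts, and covered-region sizes are hypothetical and are not taken from the study, which does not state how its callable region was defined.

```python
from statistics import median


def tumor_mutation_burden(mutation_count, covered_bases):
    """Mutations per megabase: count divided by the covered region size in Mb."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return mutation_count / (covered_bases / 1_000_000)


# Hypothetical cases: (somatic mutation count, bases covered by the assay).
cases = [(30, 30_000_000), (60, 30_000_000), (480, 30_000_000)]
tmb_values = [tumor_mutation_burden(n, bases) for n, bases in cases]
median_tmb = median(tmb_values)
```

Reporting the median rather than the mean keeps a single hypermutated case, like the third one here, from dominating the cohort summary.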
- Research Article
- 10.1200/cci.21.00113
- May 1, 2022
- JCO Clinical Cancer Informatics
Clinicians use genetic testing to explain the molecular mechanism of disease and to suggest clinical actionability and new treatment options. To make better use of this testing, clinical next-generation sequencing (NGS) laboratories must deliver results as reports in PDF and discrete data element (HL7) formats. Although most clinical diagnostic tests examine a set of molecular markers and have a set range of values or a binary result (positive or negative), an NGS genetic test could examine hundreds or thousands of genes with no predefined list of variants. Although there are some commercial and open-source tools for clinically reporting genomics results for oncology testing, they often lack necessary features. Using several available software tools, including MySQL and MongoDB for data storage, Python for database querying, and Java and JavaScript for a web-based user application, we have developed a tool to store and query complex genomics and demographics data, which can be manually curated and reported by the user. This tool, Annotation SoftWare for Electronic Reporting (ANSWER), allows molecular pathologists to (1) filter variants to find those meeting quality control metrics in the genes that are clinically actionable by diagnosis; (2) visualize variants using data generated in the bioinformatics analysis; (3) create annotations that can be reused in future reports with association specific to the gene, variant, or diagnosis; (4) select variants and annotations that should be reported to match the details of the case; and (5) generate a report that includes demographics, reported variants, clinical actionability annotation, and references that can be exported into PDF or HL7 format, which can be electronically sent to an electronic health record. ANSWER can be installed locally and is designed to meet the reporting needs of a clinical oncology NGS laboratory.
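Step (1) above, filtering variants by quality control metrics within genes that are actionable for a diagnosis, can be sketched as follows. This is not ANSWER's code or data model: the thresholds, the gene list, and all names below are hypothetical and serve only to show the shape of such a filter.

```python
from dataclasses import dataclass

# Hypothetical QC thresholds and gene list, for illustration only.
MIN_DEPTH = 250
MIN_VAF = 0.05
ACTIONABLE_GENES = {"NSCLC": {"EGFR", "KRAS"}}


@dataclass(frozen=True)
class Variant:
    gene: str
    depth: int   # read depth at the variant position
    vaf: float   # variant allele frequency, 0..1


def reportable_candidates(variants, diagnosis):
    """Keep variants that pass QC and fall in genes actionable for the diagnosis."""
    genes = ACTIONABLE_GENES.get(diagnosis, set())
    return [
        v for v in variants
        if v.gene in genes and v.depth >= MIN_DEPTH and v.vaf >= MIN_VAF
    ]


calls = [
    Variant("EGFR", depth=900, vaf=0.31),   # passes QC, actionable gene
    Variant("KRAS", depth=120, vaf=0.40),   # fails the depth threshold
    Variant("TP53", depth=1000, vaf=0.45),  # passes QC, gene not in the list
]
kept = reportable_candidates(calls, "NSCLC")
```

Keeping thresholds and gene lists as data rather than code would let a laboratory adjust them per assay without touching the filter itself.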
- Research Article
- 10.1158/1538-7445.am2015-4863
- Aug 1, 2015
- Cancer Research
The results of numerous molecular screening and assay methods often rely on the overall quality of the genomic DNA (gDNA) input material. However, extraction of genetic material can be challenging and often yields gDNA samples of low amount or variable quality, which are then subjected to time- and cost-intensive downstream applications. For example, array comparative genome hybridization (aCGH) and Next Generation Sequencing (NGS) can require intact, high-quality gDNA to ensure high-quality, unambiguous results. It is therefore widely recommended to perform an initial quality control (QC) of the input material, especially as only the final step of these workflows reveals whether meaningful results have been achieved. In order to provide an objective and automated measure to standardize the gDNA integrity assessment, a software algorithm has been developed. This functionality of the 2200 TapeStation system provides a numerical determination of the gDNA integrity and is referred to as the DNA Integrity Number (DIN). This study demonstrates how DIN obtained by the upfront QC of gDNA on the Agilent Genomic DNA ScreenTape assay has allowed for significant saving of sequencing and sample preparation overhead using cancer samples in NGS workflows. Citation Format: Eva Schmidt, Isabell Pechtl, Barry McHoull, Melissa Liu. Streamlining NGS workflows using cancer samples by the application of the DNA Integrity Number (DIN) from the Genomic DNA ScreenTape assay. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr 4863. doi:10.1158/1538-7445.AM2015-4863
- Abstract
- 10.1016/j.cancergen.2015.05.026
- Jun 1, 2015
- Cancer Genetics
Streamlining NGS Workflows Using Cancer Samples by the Application of the DNA Integrity Number (DIN) from the Genomic DNA Screentape Assay
- Research Article
- 10.1093/ajcp/aqz112.019
- Sep 11, 2019
- American Journal of Clinical Pathology
Objectives: Our goal was to enhance our next-generation sequencing (NGS) molecular oncology workflow from sequencing to analysis through improvements to our custom-built and previously described NGS application. Methods: Over 1 year, we collected feedback regarding workflow pain-points and feature requests from all end users of our NGS application. The application consists of a series of scripted pipelines, a MySQL database, and a Java graphical user interface (GUI); the end users include molecular pathologists (MPs), medical technologists/medical laboratory technologists (MTs/MLTs), and the molecular laboratory manager. These feedback data were used to engineer significant changes to the pipelines and software architecture. These architecture changes provided the backbone to a suite of feature enhancements aimed to improve turnaround time, decrease manual processes, and increase efficiency for the molecular laboratory staff and directors. Summary: The key software architecture changes include implementing support for multiple environments, refactoring common code in the different pipelines, migrating from a per-run pipeline model to a per-sample pipeline model, and key updates to the MySQL database. These changes enabled development of many technical and user experience improvements. We eliminated the need for the pipelines to be launched manually from the Linux command line. Multiple pipelines can be executed concurrently. We created a per-sample pipeline status monitor. Sample entry is integrated with our Laboratory Information System (LIS) barcodes, thus reducing the possibility of transcription errors. We developed quality assurance reports. Socket-based integration with Integrative Genomics Viewer (IGV) was enhanced. We enabled rapid loading of key alignment data into IGV over a wireless network. Features to support resident- and fellow-driven variant and gene annotation reporting were developed. Support for additional clinical databases was implemented.
Conclusions: The designed feature enhancements to our previously reported NGS application have added significant sophistication and safety to our clinical NGS workflow. For example, our NGS consensus conference can be held in a conference room over a wireless network, and a trainee can prepare and present each case without ever leaving the application. To date, we have analyzed 2,540 samples using three different assays (TruSight Myeloid Sequencing Panel, AmpliSeq Cancer Hotspot Panel, GlioSeq) and four sequencing instruments (NextSeq, MiSeq, Proton, PGM) in this application. The code is freely available on GitHub.
- Research Article
- 10.1371/journal.pone.0152851
- Apr 4, 2016
- PLOS ONE
Next-generation sequencing (NGS) is a powerful platform for identifying cancer mutations. Routine clinical adoption of NGS requires optimized quality control metrics to ensure accurate results. To assess the robustness of our clinical NGS pipeline, we analyzed the results of 304 solid tumor and hematologic malignancy specimens tested simultaneously by NGS and one or more targeted single-gene tests (EGFR, KRAS, BRAF, NPM1, FLT3, and JAK2). For samples that passed our validated tumor percentage and DNA quality and quantity thresholds, there was perfect concordance between NGS and targeted single-gene tests with the exception of two FLT3 internal tandem duplications that fell below the stringent pre-established reporting threshold but were readily detected by manual inspection. In addition, NGS identified clinically significant mutations not covered by single-gene tests. These findings confirm NGS as a reliable platform for routine clinical use when appropriate quality control metrics, such as tumor percentage and DNA quality cutoffs, are in place. Based on our findings, we suggest a simple workflow that should facilitate adoption of clinical oncologic NGS services at other institutions.
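The comparison described above, NGS calls checked against targeted single-gene tests on the same specimens, reduces to a concordance calculation. The sketch below is an illustration under stated assumptions and not the authors' pipeline: calls are modeled as a mapping from a (sample, gene) pair to a boolean "mutation detected", and the sample IDs and results are invented.

```python
def concordance(ngs_calls, single_gene_calls):
    """Compare two call sets on the (sample, gene) pairs tested by both methods.

    Returns (fraction concordant, sorted list of discordant pairs).
    """
    shared = sorted(set(ngs_calls) & set(single_gene_calls))
    if not shared:
        raise ValueError("no (sample, gene) pairs were tested by both methods")
    discordant = [key for key in shared if ngs_calls[key] != single_gene_calls[key]]
    return 1 - len(discordant) / len(shared), discordant


# Hypothetical results; True means a mutation was reported.
ngs = {("S1", "EGFR"): True, ("S2", "KRAS"): False,
       ("S3", "FLT3"): False, ("S4", "BRAF"): True,
       ("S5", "JAK2"): True}                        # S5 has no single-gene result
targeted = {("S1", "EGFR"): True, ("S2", "KRAS"): False,
            ("S3", "FLT3"): True, ("S4", "BRAF"): True}
rate, mismatches = concordance(ngs, targeted)
```

Returning the discordant pairs alongside the rate matters in practice, since cases such as a call falling below a reporting threshold are resolved by manual inspection rather than by the summary number.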
- Abstract
- 10.1182/blood-2024-201658
- Nov 5, 2024
- Blood
Detection of Recurrent Mutations, Immunoglobulin Rearrangements and Copy Number Changes in Cell-Free DNA and Bone Marrow on Patients with Waldenstrom's Macroglobulinemia over a Course of Treatment
- Abstract
- 10.1016/j.cancergen.2015.05.025
- Jun 1, 2015
- Cancer Genetics
Large Cryptic Derivative Chromosome 8 Detected by SNP Chromosomal Microarray
- Research Article
- 10.1158/1538-7445.am2017-1724
- Jul 1, 2017
- Cancer Research
Background: Genomic characterization of circulating tumor cells (CTCs) provides insights into cancer genetic changes, and might be utilized for cancer prognosis, diagnosis, and monitoring of therapeutic efficacy. Targeted Panel Next Generation Sequencing (NGS) enables analyzing CTC genetic variants of a focused gene panel at a relatively lower cost (1). However, CTCs are rare, often resulting in very limited DNA quantities available that require whole genome amplification (WGA). In previous studies, we introduced the Vortex technology, a platform enabling label-free enrichment of CTCs from blood samples of colorectal cancer (CRC) patients and their use for genomic assays downstream (2). In this study, we developed a simple and efficient NGS workflow for CTC samples collected by this technology. Method: An optimized workflow using the Qiagen GeneRead DNAseq targeted panel and Illumina MiSeq NGS was first verified on the HCT116 CRC cell line before being applied to patient CTCs. For patient blood samples, CTCs were collected with the Vortex technology, immunostained (CK, Vimentin, CD45) and enumerated. Matched white blood cell (WBC) DNA was included to subtract germline background. Fresh frozen liver metastasis tissue was collected and analyzed using the same NGS workflow. DNA from CTCs was extracted and amplified using the Qiagen REPLI-g single cell WGA kit. Mutation detection on the WGA-amplified DNA was performed using the GeneRead DNAseq CRC targeted panel of 38 genes and MiSeq sequencing. The sequencing data were analyzed by the QIAGEN NGS Data Analysis Web Portal and Ingenuity Variant Analysis software. Results: The Vortex technology was validated for the capture of CTCs from CRC patients. REPLI-g performed a uniform, unbiased amplification on fresh rare cells with a coverage of 97.7%, which enabled further targeted panel NGS. Blood from 3 CRC patients (P1, P2, P3) and 2 healthy donors (HD1, HD2) was processed with the Vortex platform.
Fewer than 1 CTC/mL of blood was found in HD1 and HD2. P1 and P2 had 66 and 20 CTCs/mL of blood respectively, with many vimentin-positive CTC clusters. P3 had 2 CTCs/mL of blood. No somatic mutation was found in healthy donors. Somatic variants, absent from the matched germline WBCs, were detected only in the CTCs from patient samples. For P1, more mutations were found in the CTCs than in the liver metastasis, while it was the opposite for P2 and P3. Conclusion: For each patient, variants in CTCs and germline WBCs were analyzed from one blood sample using an optimized targeted NGS workflow and compared to liver metastases. Our optimized workflow, using the Qiagen REPLI-g and GeneRead DNAseq Targeted Panel NGS, enabled the detection of CTC mutations for 38 CRC-focused genes. The inclusion of a germline WBC control in the workflow allowed the detection of mutations from pooled CTC samples collected using the Vortex technology. References: 1. Altmüller J, et al. (2014). Biol Chem. 2. Kidess-Sigal E, et al. (2016). Oncotarget. Citation Format: Haiyan E. Liu, Melanie Triboulet, Amin Zia, Meghah Vuppalapaty, Evelyn Kidess-Sigal, John Coller, Vanita S. Natu, Vida Shokoohi, James Che, Corinne Renier, Natalie Chan, Violet Hanft, Elodie Sollier-Christen, Stefanie S. Jeffrey. Genomic profiling of Vortex-enriched CTCs using whole genome amplification and multiplex PCR-based targeted next generation sequencing [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2017; 2017 Apr 1-5; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2017;77(13 Suppl):Abstract nr 1724. doi:10.1158/1538-7445.AM2017-1724
- Abstract
- 10.1016/j.humimm.2014.08.156
- Sep 29, 2014
- Human Immunology
P094 : A FULL NGS WORKFLOW FOR REACHING THE ULTIMATE HLA TYPING RESOLUTION
- Research Article
- 10.1200/jco.2013.31.15_suppl.11099
- May 20, 2013
- Journal of Clinical Oncology
11099 Background: Next-generation sequencing (NGS) allows for simultaneous detection of numerous actionable somatic variants in cancer. We have implemented a clinical NGS panel to detect genetic alterations in 25 genes with established roles in cancer and report here the frequency of clinically actionable genetic variants in a variety of cancer types. Methods: NGS testing was performed in a CAP-certified, CLIA-licensed environment on DNA extracted from FFPE tissue in 209 cases spanning 41 histologic tumor types. DNA was enriched by hybrid capture and sequenced to >1,000x average coverage on Illumina sequencers with 2x101 bp or 2x150 bp reads. Variants were called with clinically validated parameters using the Genome Analysis Toolkit, Pindel, and the custom-written Clinical Genomicist Workstation. Results: Non-small cell lung cancer (45%), pancreatic cancer (10%), and colorectal cancer (8%) were the most common tumors sent for NGS analysis. An average of 3 (range 1-16) non-synonymous, non-SNP sequence variants per case (SNVs and indels) were detected in the 130 kb exonic target. Variants were most commonly seen in TP53, KRAS, and EGFR. 27% of cases (56/209) had one or more variants with therapeutic implications for the tumor type tested (e.g., EGFR mutation in NSCLC). 15% of cases (32/209) showed actionable variants not generally associated with the malignancy tested (e.g., detection of an activating KIT variant in thymic carcinoma). 10% of cases (21/209) had variants that were prognostically significant but not directly targetable. Some cases (9%) had variants that were prognostic/diagnostic and targetable. In 117 cases (56% of total), no therapeutically or prognostically significant variants were identified. Overall, in 92 cases (44%), NGS testing yielded information with therapeutic (majority), prognostic, or diagnostic ramifications.
Conclusions: We found that 44% of unselected cancer cases have clinically relevant sequence variants in a set of 25 commonly mutated cancer genes. Our data suggest that clinical NGS testing may serve as an integral tool in realizing the potential of precision medicine in oncology.
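The percentages quoted in this abstract can be checked directly against the case counts it gives. The short sketch below does only that arithmetic; the category labels are paraphrases added here for readability, and the counts are the ones stated in the Results.

```python
TOTAL_CASES = 209

# Case counts as stated in the abstract; labels are paraphrased.
counts = {
    "therapeutic implications for tumor type": 56,
    "actionable, not typical for tumor type": 32,
    "prognostic, not directly targetable": 21,
    "no significant variants": 117,
    "any clinically relevant information": 92,
}


def percent(n, total=TOTAL_CASES):
    """Percentage of the cohort, rounded to the nearest whole number."""
    return round(100 * n / total)


percentages = {label: percent(n) for label, n in counts.items()}
```

The categories overlap, since a case can carry both a therapeutic and a prognostic variant, so only the last two entries (117 and 92) are expected to sum to the full cohort of 209.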
- Research Article
- 10.3390/diagnostics16010037
- Dec 22, 2025
- Diagnostics
Background/Objectives: Conventional next-generation sequencing (NGS) workflows often require more than two weeks to complete, delaying treatment decisions and limiting access to precision oncology in community settings. This report aimed to demonstrate the feasibility of performing rapid, comprehensive cell-free DNA (cfDNA)-based genomic profiling by introducing a fully automated NGS workflow in a community hospital environment. Case Presentation: A postoperative patient with pancreatic ductal adenocarcinoma and liver metastasis underwent cfDNA-based liquid biopsy using plasma collected in PAXgene® Blood ccfDNA Tubes. Gene analysis was performed using the Oncomine Precision Assay GX5 on the Ion Torrent Genexus™ System (Thermo Fisher Scientific). Three pathogenic hotspot mutations—KRAS G12R, TP53 M246I/M246K, and GNA11—and one copy number gain in PIK3CA were identified, whereas no variants were detected in a healthy volunteer control. The total turnaround time from plasma separation to report generation was approximately 27 h, requiring only 40 min of total hands-on time. Discussion: This rapid, automated workflow enabled comprehensive cfDNA analysis within a clinically practical timeframe, overcoming key limitations of conventional multi-step NGS workflows that typically require external sample shipment and specialized personnel. The results confirm the technical feasibility of conducting high-quality molecular testing in a regional hospital setting. Conclusions: This report demonstrates that fully automated cfDNA-based NGS can achieve clinically meaningful genomic profiling within 27 h in a community hospital. This advancement addresses the time and cost barriers of traditional NGS analysis and represents a significant step toward promoting precision medicine in community healthcare.