Abstract

Next-generation sequencing (NGS) is widely used both in translational cancer genomics studies and in precision medicine. Identifying and stratifying an individual's ancestry is fundamental to the correct interpretation of genetic and genomic profiles. EthSEQ provides an easy and effective computational workflow for determining the ancestry of individuals from single nucleotide polymorphism (SNP) genotypes computed from NGS data of whole-exome and targeted sequencing experiments. EthSEQ either determines genotypes from sequencing alignment files (BAM files) or accepts them as input in Variant Call Format (VCF) or CoreArray Genomic Data Structure (GDS) format. It then assigns ancestry to each individual using a reference model and either a standard or a multi-step refinement approach based on Principal Component Analysis (PCA). EthSEQ generates a complete and detailed set of textual and graphical output files. It is easy to use, can be integrated into any NGS-based processing pipeline, and can exploit multi-core capabilities.

© 2023 The Authors. Current Protocols published by Wiley Periodicals LLC.

Basic Protocol 1: Perform ancestry analysis using a pre-computed reference model
Alternate Protocol: Perform ancestry analysis using a user-specified GDS file as reference model
Basic Protocol 2: Perform ancestry analysis using multi-step refinement
Support Protocol 1: Create a reference model from multiple VCF genotype data files
Support Protocol 2: Create VCF genotype data files from a BAM file
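
The general idea of PCA-based ancestry assignment against a reference model can be illustrated with a minimal Python sketch. This is not EthSEQ's implementation or API (EthSEQ is a separate tool with its own interface): the function names, the toy allele frequencies, the population labels, and the nearest-centroid assignment rule are all assumptions chosen for illustration. The sketch fits principal components on reference genotypes coded as 0/1/2 alternate-allele counts, projects target samples into the same space, and assigns each target to the closest reference group.

```python
import numpy as np


def fit_reference_pca(ref_genotypes, n_components=2):
    """Fit PCA on a reference genotype matrix (samples x SNPs, values 0/1/2)."""
    mean = ref_genotypes.mean(axis=0)
    centered = ref_genotypes - mean
    # Rows of vt are the principal axes of the centered reference data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(genotypes, mean, components):
    """Project genotypes into the PCA space defined by the reference."""
    return (genotypes - mean) @ components.T


def assign_ancestry(target_pcs, ref_pcs, ref_labels):
    """Assign each target to the reference group with the nearest centroid.

    Nearest-centroid is an illustrative simplification, not EthSEQ's rule.
    """
    ref_labels = np.asarray(ref_labels)
    groups = sorted(set(ref_labels.tolist()))
    centroids = np.array([ref_pcs[ref_labels == g].mean(axis=0) for g in groups])
    # Distance from every target sample to every group centroid.
    dists = np.linalg.norm(target_pcs[:, None, :] - centroids[None, :, :], axis=2)
    return [groups[i] for i in dists.argmin(axis=1)]


# Toy data (hypothetical): two reference groups with very different
# alternate-allele frequencies across 50 SNPs.
rng = np.random.default_rng(0)
n_snps = 50
ref_a = rng.binomial(2, 0.1, size=(20, n_snps))
ref_b = rng.binomial(2, 0.9, size=(20, n_snps))
reference = np.vstack([ref_a, ref_b]).astype(float)
labels = ["POP_A"] * 20 + ["POP_B"] * 20

# Two target samples, one drawn from each toy population.
targets = np.vstack([
    rng.binomial(2, 0.1, size=n_snps),
    rng.binomial(2, 0.9, size=n_snps),
]).astype(float)

mean, components = fit_reference_pca(reference)
ref_pcs = project(reference, mean, components)
target_pcs = project(targets, mean, components)
calls = assign_ancestry(target_pcs, ref_pcs, labels)
print(calls)
```

Projecting targets with the mean and axes learned from the reference, rather than refitting PCA on the pooled data, keeps the reference space fixed, so new samples can be assigned without recomputing the model.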
