FamPipe: An Automatic Analysis Pipeline for Analyzing Sequencing Data in Families for Disease Studies.

Ren-Hua Chung, Po-Ju Yao, Chen-Yu Kang, Wei-Yun Tsai, Hui-Ju Tsai, Chia-Hsiang Chen

doi:10.1371/journal.pcbi.1004980

Abstract

In disease studies, family-based designs have become an attractive approach to analyzing next-generation sequencing (NGS) data for the identification of rare mutations enriched in families. Substantial research effort has been devoted to developing pipelines for automating sequence alignment, variant calling, and annotation. However, fewer pipelines have been designed specifically for disease studies. Most of the current analysis pipelines for family-based disease studies using NGS data focus on a specific function, such as identifying variants with Mendelian inheritance or identifying shared chromosomal regions among affected family members. Consequently, some other useful family-based analysis tools, such as imputation, linkage, and association tools, have yet to be integrated and automated. We developed FamPipe, a comprehensive analysis pipeline with several family-specific analysis modules, including identification of shared chromosomal regions among affected family members, prioritization of variants under an assumed disease model, imputation of untyped variants, and linkage and association tests. We used simulation studies to compare the properties of some modules implemented in FamPipe and, based on the results, provide suggestions for selecting modules to achieve an optimal analysis strategy. The pipeline is released under the GNU GPL and can be downloaded for free at http://fampipe.sourceforge.net.

Highlights

Next-generation sequencing (NGS) is a popular technique for identifying novel rare variants that are potentially associated with diseases.
To address the challenges of family-based NGS analysis for disease studies, we developed a pipeline, FamPipe, which can be applied to the analysis of Mendelian disorders or complex diseases.
For identifying variants responsible for Mendelian disorders, three methods were implemented in the disease model identification (DMI) module in FamPipe: segregation scores [8], which can be used for identifying family-specific mutations at disease variants; the weighted-sum statistic [24], which is ideal for identifying mutations in multiple disease variants within a gene; and filtering rules for compound heterozygosity [33].
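
To make the idea of filtering rules for compound heterozygosity concrete, the following is a minimal sketch of one commonly used trio-based rule. It is not FamPipe's implementation and not necessarily the rule set of [33]. The function name `compound_het_genes`, the dictionary keys, and the 0/1/2 alternate-allele-count encoding are assumptions made for illustration.

```python
from collections import defaultdict


def compound_het_genes(variants):
    """Return genes in which an affected child looks compound heterozygous.

    Hypothetical input format (an assumption, not a FamPipe format): each
    variant is a dict with keys "id", "gene", "child", "father", "mother",
    where the last three hold alternate-allele counts (0, 1, or 2).
    """
    by_gene = defaultdict(list)
    for v in variants:
        # Keep variants that are heterozygous in the child and not
        # homozygous-alternate in either (assumed unaffected) parent.
        if v["child"] == 1 and v["father"] < 2 and v["mother"] < 2:
            by_gene[v["gene"]].append(v)

    hits = {}
    for gene, candidates in by_gene.items():
        # A variant is treated as paternal (maternal) only when exactly
        # one parent carries it, so its parental origin is unambiguous.
        paternal = [v["id"] for v in candidates
                    if v["father"] == 1 and v["mother"] == 0]
        maternal = [v["id"] for v in candidates
                    if v["mother"] == 1 and v["father"] == 0]
        if paternal and maternal:
            # Report every paternal/maternal pair within the gene.
            hits[gene] = [(p, m) for p in paternal for m in maternal]
    return hits


# Toy trio data with made-up variant and gene names.
example_variants = [
    {"id": "v1", "gene": "GENE_A", "child": 1, "father": 1, "mother": 0},
    {"id": "v2", "gene": "GENE_A", "child": 1, "father": 0, "mother": 1},
    {"id": "v3", "gene": "GENE_B", "child": 1, "father": 1, "mother": 0},
    {"id": "v4", "gene": "GENE_B", "child": 1, "father": 1, "mother": 0},
]
print(compound_het_genes(example_variants))
```

In this toy input only `GENE_A` qualifies, because `GENE_B` carries two variants that both come from the father. A real analysis would also filter on allele frequency and predicted functional effect, which this sketch omits.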

Summary

Introduction

Next-generation sequencing (NGS) is a popular technique for identifying novel rare variants that are potentially associated with diseases. Instead of using external controls, tools such as OVPDT [25], which accounts for both common and rare variants with different directions of effects on disease, and FBAT [26], which implements the weighted-sum approach [27], are available for family-based association analysis when the sample size is large.
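
As a rough illustration of what a weighted-sum approach computes, the sketch below scores each individual by summing minor-allele counts across the variants in a gene, with rarer variants weighted more heavily. The weighting used here is the scheme usually attributed to Madsen and Browning, swapped in as an assumption; it is not confirmed to be what FamPipe, FBAT, or [27] use. The sketch also treats individuals as unrelated, which a family-based test would not do. The function name `weighted_sum_scores` and the input layout are hypothetical.

```python
import math


def weighted_sum_scores(genotypes, affected):
    """Compute a per-individual weighted-sum score over a gene's variants.

    genotypes: list of per-individual lists of minor-allele counts (0/1/2),
               one entry per variant (hypothetical layout).
    affected:  list of booleans, True for affected individuals.
    Assumes at least one variant; ignores relatedness among individuals.
    """
    n_total = len(genotypes)
    n_variants = len(genotypes[0])
    unaffected = [g for g, a in zip(genotypes, affected) if not a]
    n_unaffected = len(unaffected)

    weights = []
    for i in range(n_variants):
        # Minor-allele frequency estimated in unaffected individuals,
        # with a +1/+2 adjustment so the estimate is never zero.
        minor_count = sum(g[i] for g in unaffected)
        q = (minor_count + 1) / (2 * n_unaffected + 2)
        weights.append(math.sqrt(n_total * q * (1 - q)))

    # Rare variants have small weights, so dividing by the weight
    # makes them contribute more to the score.
    return [sum(g[i] / weights[i] for i in range(n_variants))
            for g in genotypes]


# Toy data: four individuals, two variants, two affected carriers.
example_genotypes = [[1, 0], [0, 0], [0, 1], [0, 0]]
example_affected = [True, False, True, False]
print(weighted_sum_scores(example_genotypes, example_affected))
```

The scores alone are not a test statistic. In the unrelated-sample setting they are typically ranked and compared between affected and unaffected groups with a permutation procedure, which is left out here.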

Journal: PLOS Computational Biology
Publication Date: Jun 6, 2016
Citations: 5
License type: CC BY 4.0
