Abstract

Clonal mosaicism (a detectable post-zygotic mutational event in cellular subpopulations) is common in cancer patients. Detected segments of clonal mosaicism are usually bundled into large-locus regions for statistical analysis. However, this approach overlooks low-frequency genes and is not sufficient to elucidate qualitative differences between cancer patients and non-patients. Therefore, it is of interest to develop and describe a tool named Sub-GOFA for Sub-Gene Ontology function analysis in clonal mosaicism using semantic similarity. Sub-GOFA measures the semantic (logical) similarity among patients for clustering analysis, using sub-GO network structures of various sizes segmented from the Gene Ontology (GO). The root terms of sub-GOs that show significant differences are extracted as disease-associated genetic functions. In validation, Sub-GOFA selected a high ratio of cancer-associated genes at an acceptable threshold.

Highlights

  • Clonal mosaicism is a post-zygotic, large-scale mutational event in chromosomes and mitochondria within cellular subpopulations

  • Using the biological knowledge embedded in the Gene Ontology (GO) structure has enabled further comparison and classification of gene sets obtained by various omics analysis techniques, helping to explain the underlying biological phenomena. A number of semantic-based tools have played an important role in improving the analysis of proteomics and transcriptomics data at the level of functional genomics, using different semantic similarity measures among GO terms [13]

  • It is of interest to develop and describe a tool named Sub-GOFA for Sub-Gene Ontology function analysis in clonal mosaicism using semantic similarity


Summary

Introduction

Clonal mosaicism is a post-zygotic, large-scale mutational event in chromosomes and mitochondria within cellular subpopulations. A number of semantic-based tools have played an important role in improving the analysis of proteomics and transcriptomics data at the level of functional genomics, using different semantic similarity measures among GO terms [13]. This approach has not yet been attempted for large-scale genomic regional datasets, each of which consists of the list of genes in a specific genomic region. A pair-wise semantic similarity measure between such datasets must handle thousands of genes as variables. Even if two datasets share characteristic genetic functions within a specific segmented region of the GO network, that local similarity is homogenized within a global similarity measure computed over the entire GO network, and those genetic functions are overlooked.
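The homogenization effect described above can be illustrated with a minimal sketch. It is not the Sub-GOFA implementation: the toy ontology, the term names, the two patient annotations, and the use of Jaccard similarity over ancestor-closed term sets are all assumptions chosen only to make the effect visible.

```python
# Toy ontology: child term -> parent terms.
# Hypothetical IDs for illustration only, not real GO terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "B1": ["B"],
    "B2": ["B"],
    "B3": ["B"],
    "B4": ["B"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def descendants(sub_root):
    """Return sub_root plus every term below it (the 'sub-GO' region)."""
    return {t for t in PARENTS if sub_root in ancestors(t)}


def jaccard(x, y):
    """Jaccard similarity; an assumed stand-in for a semantic measure."""
    if not x and not y:
        return 0.0
    return len(x & y) / len(x | y)


def similarity(terms_p, terms_q, sub_root=None):
    """Compare two annotation sets globally, or inside one sub-GO region."""
    x = set().union(*(ancestors(t) for t in terms_p))
    y = set().union(*(ancestors(t) for t in terms_q))
    if sub_root is not None:
        region = descendants(sub_root)
        x &= region
        y &= region
    return jaccard(x, y)


# Hypothetical annotations: both patients share function A1 but differ under B.
patient_p = {"A1", "B1", "B2"}
patient_q = {"A1", "B3", "B4"}

global_sim = similarity(patient_p, patient_q)
sub_a_sim = similarity(patient_p, patient_q, sub_root="A")
sub_b_sim = similarity(patient_p, patient_q, sub_root="B")
print(global_sim, sub_a_sim, sub_b_sim)
```

In this toy case the two patients are identical within the region rooted at `A`, yet the global score is pulled down by their differences under `B`. Scoring each segmented region separately keeps the shared function visible, which is the motivation the text gives for working with sub-GO structures.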

