Abstract

Background: Retinal photoreceptors are highly specialised cells, which detect light and are central to mammalian vision. Many retinal diseases occur as a result of inherited dysfunction of the rod and cone photoreceptor cells. Development and maintenance of photoreceptors requires appropriate regulation of the many genes specifically or highly expressed in these cells. Over the last decades, different experimental approaches have been developed to identify photoreceptor enriched genes. Recent progress in RNA analysis technology has generated large amounts of gene expression data relevant to retinal development. This paper assesses a machine learning methodology for supporting the identification of photoreceptor enriched genes based on expression data.

Results: Based on the analysis of publicly-available gene expression data from the developing mouse retina generated by serial analysis of gene expression (SAGE), this paper presents a predictive methodology comprising several in silico models for detecting key complex features and relationships encoded in the data, which may be useful to distinguish genes in terms of their functional roles. In order to understand temporal patterns of photoreceptor gene expression during retinal development, a two-way cluster analysis was firstly performed. By clustering SAGE libraries, a hierarchical tree reflecting relationships between developmental stages was obtained. By clustering SAGE tags, a more comprehensive expression profile for photoreceptor cells was revealed. To demonstrate the usefulness of machine learning-based models in predicting functional associations from the SAGE data, three supervised classification models were compared. The results indicated that a relatively simple instance-based model (KStar model) performed significantly better than relatively more complex algorithms, e.g. neural networks. To deal with the problem of functional class imbalance occurring in the dataset, two data re-sampling techniques were studied. A random over-sampling method supported the implementation of the most powerful prediction models. The KStar model was also able to achieve higher predictive sensitivities and specificities using random over-sampling techniques.

Conclusion: The approaches assessed in this paper represent an efficient and relatively inexpensive in silico methodology for supporting large-scale analysis of photoreceptor gene expression by SAGE. They may be applied as complementary methodologies to support functional predictions before implementing more comprehensive, experimental prediction and validation methods. They may also be combined with other large-scale, data-driven methods to facilitate the inference of transcriptional regulatory networks in the developing retina. Furthermore, the methodology assessed may be applied to other data domains.
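The random over-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_oversample`, the class labels, and the toy tag profiles below are all hypothetical, and the sketch simply duplicates randomly chosen minority-class examples until every class matches the size of the largest one.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many examples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw, with replacement, as many extra copies as the class is short.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Made-up expression profiles (illustrative only, not SAGE data).
X = [[5, 0, 1], [4, 1, 0], [6, 0, 2], [5, 1, 1], [0, 7, 3]]
y = ["other", "other", "other", "other", "photoreceptor"]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # both classes now have four examples
```

In practice, over-sampling of this kind is usually applied to the training portion only, so that duplicated examples do not leak into the data used for evaluation.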

Highlights

  • Retinal photoreceptors are highly specialised cells, which detect light and are central to mammalian vision

  • Libraries were obtained from microdissected mouse photoreceptors from the retinal outer nuclear layers (ONL), retina from various mouse developmental stages and retina from the paired-homeodomain transcription factor Crx knockout mouse (Crx-/-) and its wild type counterpart (Crx+/+) at postnatal day (P)10, and from NIH3T3 mouse fibroblasts

  • Two libraries belonging to non-retinal tissues (3t3 and hypo) are clearly separated from other clusters, confirming that the serial analysis of gene expression (SAGE) libraries reflect tissue specificity
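How a non-retinal library ends up separated from the retinal ones in a hierarchical tree can be sketched with a small agglomerative clustering example. This is only an illustration under stated assumptions: the source does not specify the distance measure or linkage rule, so Pearson correlation distance and average linkage are assumed here, the helper names are hypothetical, and the tag counts are invented rather than taken from the SAGE libraries.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(a, b):
    """1 - Pearson correlation between two expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    Returns the merges in the order they happen, each as a pair of
    clusters (tuples of library names)."""
    clusters = [(name,) for name in profiles]
    merges = []

    def cluster_distance(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(pearson_distance(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges


# Invented tag counts per library (illustrative only).
libraries = {
    "retina_P0": [10, 8, 1, 0, 2],
    "retina_P10": [12, 9, 2, 1, 2],
    "retina_adult": [11, 10, 1, 1, 3],
    "3t3": [0, 1, 9, 12, 2],
}

merges = average_linkage(libraries)
for left, right in merges:
    print(left, "+", right)
```

With these toy profiles the three retinal libraries merge with each other first, and the 3t3 library joins only at the final step, which is the kind of separation the highlight describes.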

Summary

Introduction

Retinal photoreceptors are highly specialised cells, which detect light and are central to mammalian vision. Development and maintenance of photoreceptors requires appropriate regulation of the many genes specifically or highly expressed in these cells. Different experimental approaches have been developed to identify photoreceptor enriched genes. Many retinal diseases occur as a result of inherited dysfunction of the rod and cone photoreceptor cells. For example, Yoshida et al. [3] revealed that 43 genes, which are differentially expressed in the absence of Nrl (neural retina leucine zipper protein), are either associated with or are candidates for retinal diseases involving rod or cone photoreceptor dysfunction. Katsanis et al. [4] positioned 925 expressed sequence tags (ESTs) likely to be specifically or preferentially expressed in the retina and identified positional candidate genes for 42 of 51 uncloned retinopathies. Libraries were obtained from microdissected mouse photoreceptors from the retinal outer nuclear layers (ONL), retina from various mouse developmental stages, retina from the paired-homeodomain transcription factor Crx knockout mouse (Crx-/-) and its wild type counterpart (Crx+/+) at postnatal day (P)10, and from NIH3T3 mouse fibroblasts.
