ORdensity: user-friendly R package to identify differentially expressed genes

José María Martínez-Otzeta, Basilio Sierra, Itziar Irigoien, Concepción Arenas

doi:10.1186/s12859-020-3463-4

Open Access

https://doi.org/10.1186/s12859-020-3463-4

Abstract

Background: Microarray technology provides the expression level of many genes. An important issue is to select a small number of informative differentially expressed genes that provide biological knowledge and may be key elements for a disease. With the increasing volume of data generated by modern biomedical studies, software is required for effective identification of differentially expressed genes. Here, we describe an R package, called ORdensity, that implements a recent methodology (Irigoien and Arenas, 2018) developed to identify differentially expressed genes. The benefits of parallel implementation are discussed.

Results: ORdensity gives the user the list of genes identified as differentially expressed in an easy and comprehensible way. Experiments carried out on an off-the-shelf computer with parallel execution enabled show an improvement in run-time. The parallel implementation may also lead to substantial memory use. Results previously obtained with simulated and real data indicated that the procedure implemented in the package is robust and suitable for identifying differentially expressed genes.

Conclusions: The new package, ORdensity, offers a friendly and easy way to identify differentially expressed genes, which is very useful for users not familiar with programming.

Availability: https://github.com/rsait/ORdensity

Highlights

Introduction: An important issue is to select a small number of informative differentially expressed genes that provide biological knowledge and may be key elements for a disease.
Microarray technology provides the expression level of many genes.
Results and case of use: In addition to the experiments performed on our own computer, a code capsule has been created so that readers can experiment without having to install anything on their machines.

Summary

Introduction

An important issue is to select a small number of informative differentially expressed genes that provide biological knowledge and may be key elements for a disease. Analysis of gene expression using microarray or RNA-Seq technologies is a very important task, and the main goal is to identify a small number of informative genes whose patterns of expression differ according to the experimental conditions.

Martínez-Otzeta et al. BMC Bioinformatics (2020) 21:135

... in a neighbourhood (FP), and density of false positives in a neighbourhood (dFP) were introduced in [6]. This new procedure has been implemented in the ORdensity package described below.

... G and p ∈ Cp must contain small values corresponding to the majority of non-DEGs. The most differentially expressed genes should show a different behaviour: they can be considered as outliers in V. The index OR, previously introduced in [7, 8], which identifies outliers, and two measures called false positives in a K-Nearest Neighbourhood (FP) and density of false positives in a K-
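The idea of counting false positives in a K-nearest neighbourhood can be illustrated with a toy sketch. It is not the ORdensity implementation, and the exact definitions of FP and dFP in [6] may differ. The function name `knn_false_positive_counts`, the Euclidean distance, and the split into "observed" and "null" points (for example, values obtained under permuted labels) are all assumptions made for illustration only.

```python
import math


def knn_false_positive_counts(observed, null, k):
    """Toy illustration (hypothetical helper, not the ORdensity API).

    For each observed point, count how many of its k nearest neighbours,
    taken among all other observed points and all null points, are null.
    """
    labelled = [(p, False) for p in observed] + [(p, True) for p in null]
    counts = []
    for i, point in enumerate(observed):
        # Euclidean distance to every other point, skipping the point itself.
        dists = [
            (math.dist(point, other), is_null)
            for j, (other, is_null) in enumerate(labelled)
            if j != i
        ]
        dists.sort(key=lambda pair: pair[0])
        counts.append(sum(1 for _, is_null in dists[:k] if is_null))
    return counts


# Made-up 2D data: two observed points sit inside the null cloud,
# three observed points form a separate cluster far from it.
observed = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.2, 5.1), (4.9, 5.3)]
null = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (0.0, 0.1)]
print(knn_false_positive_counts(observed, null, k=2))
```

In this made-up data, the points far from the null cloud have no null neighbours, which is the behaviour one would expect of outliers in the sense described above; points surrounded by null values receive high counts.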

Results

Discussion

Conclusion