Simultaneous Parameter Learning and Bi-clustering for Multi-Response Models.

Ming Yu, Karthikeyan Natesan Ramamurthy, Aurélie C Lozano, Addie Thompson

doi:10.3389/fdata.2019.00027

Abstract

We consider multi-response and multi-task regression models, where the parameter matrix to be estimated is expected to have an unknown grouping structure. The groupings can be along tasks, or features, or both, the last one indicating a bi-cluster or “checkerboard” structure. Discovering this grouping structure along with parameter inference makes sense in several applications, such as multi-response Genome-Wide Association Studies (GWAS). By inferring this additional structure we can obtain valuable information on the underlying data mechanisms (e.g., relationships among genotypes and phenotypes in GWAS). In this paper, we propose two formulations to simultaneously learn the parameter matrix and its group structures, based on convex regularization penalties. We present optimization approaches to solve the resulting problems and provide numerical convergence guarantees. Extensive experiments demonstrate much better clustering quality compared to other methods, and our approaches are also validated on real datasets concerning phenotypes and genotypes of plant varieties.

Highlights

We consider multi-response and multi-task regression models, which generalize single-response regression to learn predictive relationships between multiple input and multiple output variables, referred to as tasks (Borchani et al., 2015).
The convex bi-clustering method (Chi et al., 2014) groups observations and features in a data matrix. Our approaches instead discover groupings in the parameter matrix of a multi-response regression model while jointly estimating that matrix; the discovered groupings reflect groupings in features and responses.
We introduce a surrogate parameter matrix Ŵ that will be used for bi-clustering.
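
To make the idea of grouping rows and columns of a matrix concrete, the following is a minimal sketch of a pairwise fusion penalty of the kind used in convex bi-clustering (Chi et al., 2014): a weighted sum of Euclidean distances between every pair of columns plus every pair of rows. The function name `fusion_penalty`, the default uniform weights, and the example matrix are assumptions made for illustration; this is not the authors' implementation, and the exact penalties used in the two proposed formulations are not reproduced here.

```python
import numpy as np


def fusion_penalty(theta, col_weights=None, row_weights=None):
    """Weighted sum of Euclidean distances over all column pairs and all row pairs.

    Weights default to 1 for every pair (an assumption for this sketch).
    """

    def pairwise(m, weights):
        # Vectors to be compared are the columns of m.
        k = m.shape[1]
        total = 0.0
        for i in range(k):
            for j in range(i + 1, k):
                w = 1.0 if weights is None else weights[i, j]
                total += w * np.linalg.norm(m[:, i] - m[:, j])
        return total

    # Column pairs of theta, then row pairs (columns of the transpose).
    return pairwise(theta, col_weights) + pairwise(theta.T, row_weights)


# Hypothetical checkerboard matrix: 2 row groups x 2 column groups.
blocks = np.array([[1.0, 0.0], [0.0, -1.0]])
checker = np.kron(blocks, np.ones((3, 2)))  # shape (6, 4)

penalty = fusion_penalty(checker)
```

Rows or columns that belong to the same group are identical in `checker`, so their pairs contribute nothing to the penalty; only pairs that straddle two groups contribute. This is why penalizing such a sum encourages a bi-cluster structure.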

Summary

INTRODUCTION

We consider multi-response and multi-task regression models, which generalize single-response regression to learn predictive relationships between multiple input and multiple output variables, referred to as tasks (Borchani et al., 2015). A motivating example is that of multi-response Genome-Wide Association Studies (GWAS) (Schifano et al., 2013), where for instance a group of Single Nucleotide Polymorphisms or SNPs (input variables or features) might influence a group of phenotypes (output variables or tasks) in a similar way, while having little or no effect on another group of phenotypes. As another example, the stock values of related companies can affect the future value of a group of stocks.
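
The multi-response setting described above can be sketched with simulated data. The example below assumes a linear model Y = XΘ + noise, in which groups of features affect groups of responses in the same way. All sizes, coefficient values, and variable names are hypothetical, and the plain least-squares fit is only a baseline that ignores the grouping; it is not one of the methods proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, features_per_group, tasks_per_group = 200, 3, 2

# Hypothetical checkerboard coefficients: feature group 1 affects task group 1 only,
# feature group 2 affects task group 2 only.
blocks = np.array([[2.0, 0.0], [0.0, -1.5]])
theta_true = np.kron(blocks, np.ones((features_per_group, tasks_per_group)))  # (6, 4)

# Simulated inputs (e.g. features) and multiple responses (tasks).
X = rng.standard_normal((n_samples, theta_true.shape[0]))
Y = X @ theta_true + 0.1 * rng.standard_normal((n_samples, theta_true.shape[1]))

# Ordinary least squares fits every response column at once,
# but does not exploit or reveal the group structure by itself.
theta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

The estimate `theta_hat` has one row per feature and one column per task, which is the matrix whose row and column groupings the paper aims to discover jointly with the estimation.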

Contributions

Related Work

Roadmap

PROBLEM STATEMENT AND PROPOSED METHODS

Formulation 1: “Hard Fusion”

Formulation 2: “Soft Fusion”

OPTIMIZATION ALGORITHMS FOR THE PROPOSED FORMULATIONS

Optimization for Formulation 1

Optimization for Formulation 2

Numerical Convergence

Weights and Sparsity Regularization

Penalty Multiplier Tuning

Result

Solution Paths

Bi-clustering Thresholds

SYNTHETIC DATA EXPERIMENTS

Performance Measures

Simulation Setup and Results

REAL DATA EXPERIMENTS

Phenotypic Trait Prediction From Remote Sensed Data

Multi-Response GWAS

CONCLUDING REMARKS

Journal: Frontiers in big data	Publication Date: Aug 14, 2019
Citations: 2	License type: CC BY 4.0

Similar Papers

Genome-wide Association Analysis for Multiple Continuous Secondary Phenotypes
Elizabeth D Schifano ... Xihong Lin
The American Journal of Human Genetics | VOL. 92
01 May 2013

Weighted Interaction SNP Hub (WISH) network method for building genetic networks for complex diseases and traits using whole genome genotype data.
Lisette Ja Kogelman ... Haja N Kadarmideen
BMC systems biology | VOL. Suppl 8 2
01 Jan 2014

ADuLT: An efficient and robust time-to-event GWAS
Emil M Pedersen ... Bjarni J Vilhjálmsson
Nature Communications | VOL. 14
09 Sep 2023

Translating GWAS findings into therapies for depression and anxiety disorders: gene-set analyses reveal enrichment of psychiatric drug classes and implications for drug repositioning.
Hon-Cheong So ... Kai Zhao
Psychological medicine | VOL. 49
20 Dec 2018
