Parallel Feed-forward Network Research Articles

The Mixture-of-Experts (MoE) approach has demonstrated outstanding scalability in multi-task learning, including low-level upstream tasks such as the concurrent removal of multiple adverse weather effects. However, the conventional MoE architecture with parallel Feed-Forward Network (FFN) experts incurs significant parameter and computational overheads that hinder its efficient deployment. In addition, the naive linear MoE router is suboptimal at assigning task-specific features to multiple experts, which limits further scalability. In this work, we propose an efficient MoE architecture with weight sharing across the experts. Inspired by the idea of linear feature modulation (FM), our architecture implicitly instantiates multiple experts via learnable activation modulations on a single shared expert block. The proposed Feature Modulated Expert (FME) serves as a building block for the novel Mixture-of-Feature-Modulation-Experts (MoFME) architecture, which can scale up the number of experts with low overhead. We further propose an Uncertainty-aware Router (UaR) to assign task-specific features to different FM modules with well-calibrated weights. This enables MoFME to effectively learn diverse expert functions for multiple tasks. Experiments on the multi-deweather task show that MoFME outperforms the state of the art in image restoration quality by 0.1-0.2 dB while saving more than 74% of parameters and 20% of inference time compared with the conventional MoE counterpart. Experiments on downstream segmentation and classification tasks further demonstrate the generalizability of MoFME to real open-world applications.
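The weight-sharing idea in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the class name `FeatureModulatedExperts`, the helper `conventional_moe_params`, the placement of the modulation on the hidden activation, and the plain softmax router (used here instead of the Uncertainty-aware Router) are all assumptions made for illustration. Each "expert" is only a per-expert scale and shift applied to the activations of one shared FFN.

```python
import math
import random


def linear(x, w, b):
    """Compute y = W x + b for a single feature vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class FeatureModulatedExperts:
    """Hypothetical sketch: one shared FFN plus per-expert scale/shift."""

    def __init__(self, dim, hidden, num_experts, seed=0):
        rng = random.Random(seed)
        self.dim, self.hidden, self.num_experts = dim, hidden, num_experts
        # Shared FFN weights, used by every implicit expert.
        self.w1, self.b1 = rand_matrix(hidden, dim, rng), [0.0] * hidden
        self.w2, self.b2 = rand_matrix(dim, hidden, rng), [0.0] * dim
        # Per-expert modulation (assumed to act on the hidden activation).
        self.gamma = [[1.0 + rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                      for _ in range(num_experts)]
        self.beta = rand_matrix(num_experts, hidden, rng)
        # Plain linear router; the paper's uncertainty-aware router is not modeled.
        self.wr, self.br = rand_matrix(num_experts, dim, rng), [0.0] * num_experts

    def forward(self, x):
        gates = softmax(linear(x, self.wr, self.br))
        # The shared hidden activation is computed once for all experts.
        h = [max(0.0, v) for v in linear(x, self.w1, self.b1)]
        out = [0.0] * self.dim
        for g, gamma, beta in zip(gates, self.gamma, self.beta):
            modulated = [ga * hv + be for ga, hv, be in zip(gamma, h, beta)]
            y = linear(modulated, self.w2, self.b2)
            out = [o + g * yi for o, yi in zip(out, y)]
        return out, gates

    def num_params(self):
        shared = 2 * self.dim * self.hidden + self.hidden + self.dim
        modulation = 2 * self.num_experts * self.hidden
        router = self.num_experts * self.dim + self.num_experts
        return shared + modulation + router


def conventional_moe_params(dim, hidden, num_experts):
    """Parameter count when every expert owns a full FFN (same router)."""
    per_expert = 2 * dim * hidden + hidden + dim
    return num_experts * per_expert + num_experts * dim + num_experts
```

In this toy setting, adding an expert costs `2 * hidden` extra parameters rather than a full FFN, which is the kind of saving the abstract describes. The sketch does not attempt to reproduce the reported 74% figure, which depends on the actual model configuration.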


Malignancy is one of the leading causes of death. It is on the rise in both developed and low-income countries, with survival rates of less than 40%. However, early diagnosis may increase survival chances. Histopathology images acquired from biopsies are a popular means of cancer diagnosis. In this work, we propose a deep convolutional neural network-based method that helps classify breast cancer tumor subtypes from histopathology images. The model is trained on the BreakHis dataset but is also tested on images from other datasets. The model is trained to recognize eight different tumor subtypes and also to perform binary classification (malignant/non-malignant). The CNN model combines an encoder–decoder architecture and a parallel feed-forward network with an attention mechanism. The proposed model provides state-of-the-art scores. Compared with other models, the accuracy of the proposed model is higher at different magnification and patient levels. The implementation is available at github.com/rangan2510/Residual_Unet


Articles published on Parallel Feed-forward Network

Efficient Deweather Mixture-of-Experts with Uncertainty-Aware Feature-Wise Linear Modulation

Multi-path Convolutional Neural Network to Identify Tumorous Sub-classes for Breast Tissue from Histopathological Images
