Abstract

Sparse learning has important applications in statistics, big data, bioinformatics and machine learning. In big data systems, large amounts of redundant, missing and noisy data cause sparsity, and rapid changes in information introduce uncertainty. Because traditional sparse learning models handle uncertain data poorly, we propose a Fuzzy Granular Sparse Learning (FGSL) model for identifying antigenic variants of influenza viruses. First, fuzzy set theory is introduced to measure and granulate the influenza viruses: granulating on a single feature induces fuzzy granules, from which a fuzzy granular vector is constructed, and fuzzy granular regression is then defined. We propose constraint norms for these objects, two for granules and four for granular vectors. The FGSL model is built from granular regression and these constraint norms, and it includes granular ridge and granular lasso regressions under different norms. Furthermore, we prove the derivative forms of two granular regression functions, which guarantees the convergence of the FGSL model. We then discuss the optimization problem of the FGSL model and design two gradient descent algorithms for it. Finally, we apply the FGSL model to serologic data and hemagglutinin sequences to learn antigenicity-associated mutations and infer antigenic variants. The experimental results show that the FGSL model converges quickly, achieves low RMSE and has strong feature selection ability, and that it successfully identifies antigenic variants of influenza viruses.
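
As a rough illustration of the ingredients named above, the following Python sketch shows a single-feature fuzzy granulation and gradient descent for ridge and lasso penalties. It is not the FGSL model itself: the abstract does not give the granule operators or the granular norms, so the similarity measure `1 - |a - b|` on features scaled to [0, 1], the function names `fuzzy_granule` and `fit`, and the use of ordinary real-valued vectors in the regression are all assumptions made for illustration.

```python
def fuzzy_granule(values, i):
    """Hypothetical single-feature granulation (an assumption, not the paper's
    definition): similarity of sample i to every sample on one feature,
    with feature values already scaled to [0, 1]."""
    return [1.0 - abs(values[i] - v) for v in values]


def fit(X, y, penalty="ridge", lam=0.01, lr=0.1, epochs=2000):
    """Plain (sub)gradient descent for linear regression with an L2 (ridge)
    or L1 (lasso) penalty on ordinary vectors, not on granular vectors."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        # Residuals r_i = x_i . w - y_i, computed once per epoch.
        r = [sum(xi[j] * w[j] for j in range(d)) - yi for xi, yi in zip(X, y)]
        for j in range(d):
            # Gradient of the mean squared error with respect to w_j.
            grad = 2.0 / n * sum(r[i] * X[i][j] for i in range(n))
            if penalty == "ridge":
                grad += 2.0 * lam * w[j]  # gradient of lam * w_j^2
            else:
                # Subgradient of lam * |w_j| (taken as 0 at w_j == 0).
                grad += lam * ((w[j] > 0) - (w[j] < 0))
            w[j] -= lr * grad
    return w


# Toy data: the target depends only on the first feature (y = 2 * x0).
X = [[0.0, 1.0], [0.5, 0.2], [1.0, 0.7], [0.3, 0.9], [0.8, 0.1]]
y = [2.0 * row[0] for row in X]

w_ridge = fit(X, y, penalty="ridge", lam=0.001)
w_lasso = fit(X, y, penalty="lasso", lam=0.01)
granule = fuzzy_granule([0.0, 0.5, 1.0], 0)
```

On this toy data both penalties should recover a weight close to 2 for the first feature, and the lasso penalty should hold the weight of the irrelevant second feature near zero, which is the feature selection behaviour that sparse models are used for.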
