Abstract
Due to the complexity of the underlying biological processes, gene expression data obtained from DNA microarray technologies are typically noisy and have very high dimensionality and these make the mining of such data for gene function prediction very difficult. To tackle these difficulties, we propose to use an incremental fuzzy mining technique called incremental fuzzy mining (IFM). By transforming quantitative expression values into linguistic terms, such as highly or lowly expressed, IFM can effectively capture heterogeneity in expression data for pattern discovery. It does so using a fuzzy measure to determine if interesting association patterns exist between the linguistic gene expression levels. Based on these patterns, IFM can make accurate gene function predictions and these predictions can be made in such a way that each gene can be allowed to belong to more than one functional class with different degrees of membership. Gene function prediction problem can be formulated both as classification and clustering problems, and IFM can be used either as a classification technique or together with existing clustering algorithms to improve the cluster groupings discovered for greater prediction accuracies. IFM is characterized also by its being an incremental data mining technique so that the discovered patterns can be continually refined based only on newly collected data without the need for retraining using the whole dataset. For performance evaluation, IFM has been tested with real expression datasets for both classification and clustering tasks. Experimental results show that it can effectively uncover hidden patterns for accurate gene function predictions.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.