
Related Topics

  • Data Mining Research
  • Data Mining Techniques
  • Data Mining

Articles published on Data stream mining

328 Search results
  • Research Article
  • 10.1186/s40537-025-01147-0
An accuracy-privacy optimization framework considering user’s privacy requirements for data stream mining
  • Jun 4, 2025
  • Journal of Big Data
  • Waruni Hewage + 2 more

Data stream mining is a critical process that organizations use to derive insights from real-time data. Preserving the privacy of sensitive information while maintaining high accuracy remains a persistent challenge. Privacy-preserving data mining techniques modify data to increase privacy, a process that invariably decreases the accuracy of data mining algorithms. Although different techniques have been proposed to preserve privacy, there is a lack of well-formulated frameworks to optimize the trade-off between accuracy and privacy. This paper introduces a novel Accuracy-Privacy Optimization Framework (APOF) that allows users to define privacy requirements and predicts achievable accuracy levels, enabling fine-tuning of this balance. Logistic cumulative noise addition, which has experimentally shown better performance, is used as the data perturbation method, and Hoeffding trees are used as the classifier. Additionally, a data fitting module based on kernel regression is integrated to predict accuracy levels from user-defined privacy thresholds. Experimental results show that the proposed framework achieves an optimal privacy level above 97% while minimising the accuracy loss across various datasets. By addressing critical gaps in privacy-preserving data mining, this study offers significant contributions to real-world applications, facilitating secure and efficient data utilization in dynamic environments.
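
The abstract names Hoeffding trees as the classifier without describing them. As a hedged illustration, the sketch below shows the standard Hoeffding bound that such trees use to decide when enough stream samples have been seen to split a node. It is not APOF's code; the function names `hoeffding_bound` and `should_split` and the example gain values are assumptions made for this sketch.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    bounded variable lies within epsilon of the mean observed over n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split when the gap between the two best split candidates exceeds the bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Hypothetical gains for two candidate attributes after 1000 samples.
eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
clear_winner = should_split(0.30, 0.15, 1.0, 1e-7, 1000)
too_close = should_split(0.30, 0.25, 1.0, 1e-7, 1000)
```

The bound shrinks as `n` grows, so a node with two similar candidates simply waits for more data before committing to a split.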

  • Research Article
  • 10.54097/6m7rrt90
An Ensemble Multi-Model Voting Method for Adapting to Concept Drift
  • Apr 29, 2025
  • Frontiers in Computing and Intelligent Systems
  • Min Wang

Aiming at the challenges posed by concept drift in streaming data mining, this paper proposes an Ensemble Multi-Model Voting Method for Adapting to Concept Drift (EMVM_ATCD). The method employs integrated multi-classifiers to improve model stability, uses online learning methods to update the model, and adds a dropout layer to force the model to learn different combinations to enhance generalization ability. A voting mechanism is used to process the model prediction results to enhance the ability to cope with concept drift. Experimental results show that the method achieves performance improvements ranging from 0.1% to 10% on multiple datasets, proving that it can effectively handle various types of data.
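
The voting step mentioned in this abstract can be pictured with a generic sketch. The abstract does not specify EMVM_ATCD's actual voting rule, so the two functions below (`majority_vote` and `weighted_vote`, both hypothetical names) only illustrate plain and weighted voting over the predictions of several classifiers.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the most frequent label; ties go to the earliest prediction."""
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight, e.g. recent accuracy per model."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


plain = majority_vote(["a", "b", "a"])
weighted = weighted_vote(["a", "b", "b"], [0.9, 0.3, 0.3])
```

In a drifting stream, weights would typically be refreshed from each model's recent performance, so that models trained on an outdated concept lose influence.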

  • Research Article
  • Citations: 7
  • 10.1109/tnnls.2024.3382033
Efficient Online Stream Clustering Based on Fast Peeling of Boundary Micro-Cluster.
  • Mar 1, 2025
  • IEEE transactions on neural networks and learning systems
  • Jiarui Sun + 3 more

A growing number of applications generate streaming data, making data stream mining a popular research topic. Classification-based streaming algorithms require pre-training on labeled data. Manually labeling a large number of samples in the data stream is impractical and cost-prohibitive. Stream clustering algorithms rely on unsupervised learning. They have been widely studied for their ability to effectively analyze high-speed data streams without prior knowledge. Stream clustering plays a key role in data stream mining. Currently, most data stream clustering algorithms adopt the online-offline framework. In the online stage, micro-clusters are maintained, and in the offline stage, they are clustered using an algorithm similar to density-based spatial clustering of applications with noise (DBSCAN). When data streams have clusters with varying densities and ambiguous boundaries, traditional data stream clustering algorithms may be less effective. To overcome the above limitations, this article proposes a fully online stream clustering algorithm called fast boundary peeling stream clustering (FBPStream). First, FBPStream defines a decay-based kernel density estimation (KDE). It can discover clusters with varying densities and identify the evolving trend of streams well. Then, FBPStream implements an efficient boundary micro-cluster peeling technique to identify the potential core micro-clusters. Finally, FBPStream employs a parallel clustering strategy to effectively cluster core and boundary micro-clusters. The proposed algorithm is compared with ten popular algorithms on 15 data streams. Experimental results show that FBPStream is competitive with the other ten popular algorithms.
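
The abstract does not give FBPStream's exact density definition. As a rough, assumed illustration of what a decay-based kernel density estimate can look like, the sketch below weights each one-dimensional point by an exponential decay of its age before applying a Gaussian kernel. The function name, the base-2 decay form, and the parameters `bandwidth` and `decay_rate` are all assumptions of this sketch.

```python
import math


def decayed_kde(query, points, timestamps, now, bandwidth=1.0, decay_rate=0.1):
    """Gaussian KDE at `query` where each point is weighted by 2^(-decay_rate * age)."""
    weighted_sum = 0.0
    total_weight = 0.0
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    for x, t in zip(points, timestamps):
        weight = 2.0 ** (-decay_rate * (now - t))  # older points count less
        u = (query - x) / bandwidth
        weighted_sum += weight * math.exp(-0.5 * u * u) / norm
        total_weight += weight
    return weighted_sum / total_weight if total_weight > 0 else 0.0


# An old point at 0.0 and a fresh point at 5.0: density follows the fresh one.
density_old_region = decayed_kde(0.0, [0.0, 5.0], [0, 100], now=100)
density_new_region = decayed_kde(5.0, [0.0, 5.0], [0, 100], now=100)
```

Because stale points fade, regions that stop receiving data lose density over time, which is one way a density-based method can follow an evolving stream.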

  • Research Article
  • Citations: 2
  • 10.1109/tcyb.2024.3489605
CA-GNN: A Competence-Aware Graph Neural Network for Semi-Supervised Learning on Streaming Data.
  • Feb 1, 2025
  • IEEE transactions on cybernetics
  • Hang Yu + 4 more

One challenge of learning from streaming data is that only a limited number of labeled examples are available, which makes semi-supervised learning (SSL) algorithms an efficient tool for streaming data mining. Recently, graph-based SSL algorithms have been proposed to improve SSL performance because the graph structure can utilize the interactivity between surrounding nodes. However, graph-based SSL algorithms have two main limitations when applied to streaming data. First, not all the labels in the streaming data may be reliable, and direct classification using a graph can lead to suboptimal performance. Second, graph-based SSL algorithms assume the structure of the graph is static, but the learning environment of streaming data is dynamic. Hence, we propose a competence-aware graph neural network (CA-GNN) to deal with these two limitations. Unlike other models, CA-GNN does not directly rely on graph information that could include mislabeled nodes. Instead, a competence model is used to explore latent semantic correlations in the streaming data and capture the reliability of each data item. A streaming learning strategy then evolves CA-GNN's parameters to capture the dynamism of the graph sequences. We conducted experiments using seven real datasets and four synthetic datasets and compared the outcomes across various methods. The results demonstrate that CA-GNN classifies streaming data more effectively than the state-of-the-art (SOTA) methods.

  • Research Article
  • 10.1109/tcyb.2025.3605663
A Multistream Concept Drift Handling Framework via Data Sharing.
  • Jan 1, 2025
  • IEEE transactions on cybernetics
  • Bin Zhang + 3 more

A frequent problem in data stream mining is concept drift, meaning the data distribution changes over time. A common issue when dealing with concept drift is insufficient data. Real-world applications of data stream mining often involve multiple data streams. However, most concept drift methods handle these data streams separately. This study uses data from other data streams to handle the problem of insufficient data. We propose a novel Multistream Concept Drift Handling Framework via data sharing, containing a fuzzy membership-based drift detection (FMDD) component and a fuzzy membership-based drift adaptation (FMDA) component, to train the new learning model for drifting streams by sharing weighted data from other nondrifting streams. A stream fuzzy set is defined with membership functions that measure the degree to which samples belong to a data stream. Our Concept Drift Handling Framework can detect when and in which stream concept drift occurs, and therefore the insufficient data issue can be solved by adding the weighted data from nondrifting streams to train new learning models. Synthetic and real-world experimental results show that our method can help avoid the insufficient data issue and thereby significantly improve the prediction performance.

  • Research Article
  • 10.1109/access.2025.3611957
A Framework for Dynamic User Modeling Integrating Data Stream Mining and Process Mining in Educational Contexts
  • Jan 1, 2025
  • IEEE Access
  • Maria Yesenia Zavaleta-Sanchez + 4 more

  • Research Article
  • 10.1016/j.asoc.2024.112353
An ensemble-based semi-supervised learning approach for non-stationary imbalanced data streams with label scarcity
  • Oct 18, 2024
  • Applied Soft Computing
  • Yousef Abdi + 2 more

  • Open Access
  • Research Article
  • Citations: 1
  • 10.1007/s10994-024-06621-z
Change detection and adaptation in multi-target regression on data streams
  • Oct 9, 2024
  • Machine Learning
  • Bozhidar Stevanoski + 3 more

An essential characteristic of data streams is the possibility of concept drift, i.e., change in the distribution of the data in the stream over time. The capability to detect and adapt to changes is thus a necessity for data stream mining methods. While methods for multi-target prediction on data streams have recently appeared, they have largely remained without such capability. In this paper, we propose novel methods for change detection and adaptation in the context of incremental online learning of decision trees for multi-target regression. One of the approaches we propose is ensemble based, while the other uses the Page–Hinckley test. We perform an extensive evaluation of the proposed methods on real-world and artificial data streams and show their effectiveness. We also demonstrate their utility in a case study from spacecraft operations, where cosmic events can cause change and demand appropriate and timely positioning of the spacecraft.
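
For readers unfamiliar with the Page–Hinckley test named above, here is a minimal sketch of its common textbook form for detecting an upward shift in a monitored signal such as a prediction error. It is not the authors' implementation; the class name, the default values of `delta`, `threshold`, and `min_samples`, and the toy stream are assumptions.

```python
class PageHinckley:
    """Page-Hinckley test for an increase in the mean of a stream of values."""

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level (often written lambda)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.min_cum = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one value; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.n >= self.min_samples and (self.cum - self.min_cum) > self.threshold


# Toy stream: 200 values at 0.0, then 200 values at 10.0.
detector = PageHinckley()
stream = [0.0] * 200 + [10.0] * 200
first_alarm = next((i for i, x in enumerate(stream) if detector.update(x)), None)
```

In an adaptive learner, an alarm would typically trigger replacing or retraining the affected model, after which the detector is reset.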

  • Research Article
  • 10.32913/mic-ict-research.v2024.n2.1249
Method for Mining High-utility Patterns in Transaction Stream Data based on Linked List Structure
  • Sep 15, 2024
  • ICT Research
  • Minh-Thai Tran + 3 more

Mining valuable patterns in data streams presents a significant challenge in the field of data mining. This task is crucial as it allows for the identification of highly profitable item sets within transaction databases. However, as new transactions are continually added, new valuable patterns emerge, thus changing the usefulness of previously analyzed data. It is essential to promptly update information regarding these changes to enable effective business decision-making. Consequently, existing mining methods applied to transaction flow datasets require considerable time to identify new patterns and update information related to new transactions. This article focuses on the research and proposal of a new transaction stream data mining method called High-Utility Stream Linked-List Mining. The method utilizes a linked list structure known as the High-Utility Stream Linked List (HUSLL) to store information about patterns in the database. Mining and updating transaction information are directly performed on the HUSLL structure. Experimental results demonstrate that this novel mining method exhibits more efficient execution times compared to previous solutions.

  • Open Access
  • Research Article
  • 10.14445/23488379/ijeee-v11i7p103
English
  • Jul 31, 2024
  • International Journal of Electrical and Electronics Engineering
  • Gollanapalli V Prasad + 1 more

The dynamic structure of data streams poses major challenges for sustaining prediction model accuracy over time. Concept drift, defined as changes in underlying data distributions, has been shown to have a considerable impact on the performance of machine learning models in real-time applications. While earlier methods often focus on either slow or abrupt concept drifts, a unified framework capable of identifying both types quickly is absent. To address this gap, we propose the Concept Drift Detection Framework with Hybrid Meta-Learning (CDDF-HML). The method applies meta-learning, adaptive feature selection, and an ensemble-based process to address both slow and sudden concept drifts. This makes the framework well suited to dynamic data stream mining, where the underlying structure is continually changing. The study shows how the framework identifies concept drift and accommodates various data conditions. It also presents a comparative analysis with other techniques to demonstrate that CDDF-HML is an effective tool for detecting concept drift. Future directions for CDDF-HML include applying the method within specific domains, developing more granular adjustment approaches, making structural changes and extensions to improve scalability, and partnering with professionals from various industries. The framework improves concept drift detection in data stream mining so that model reliability can be maintained in dynamic data situations.

  • Research Article
  • Citations: 2
  • 10.1016/j.compeleceng.2024.109420
An experimental review of the ensemble-based data stream classification algorithms in non-stationary environments
  • Jul 14, 2024
  • Computers and Electrical Engineering
  • Shirin Khezri + 2 more

  • Open Access
  • Research Article
  • 10.1007/s40747-024-01524-x
Scalable concept drift adaptation for stream data mining
  • Jun 20, 2024
  • Complex & Intelligent Systems
  • Lisha Hu + 3 more

Stream data mining aims to handle the continuous and ongoing generation of data flows (e.g., weather, stock, and traffic data), which often encounter concept drift as time progresses. Traditional offline algorithms struggle to learn from real-time data, making online algorithms a better fit for mining stream data with dynamic concepts. Among the families of online learning algorithms, single-pass methods stand out for their efficiency: they process one sample point at a time and inspect it at most once. Existing single-pass online algorithms for stream data convert the classification problem into a minimum enclosing ball problem. However, these methods mainly focus on expanding the ball to enclose the new data. An excessively large ball might overwrite data of the new concept, making it difficult to trigger the model updating process. This paper proposes a new online single-pass framework for stream data mining, namely Scalable Concept Drift Adaptation (SCDA), and presents three distinct online methods (SCDA-I, SCDA-II and SCDA-III) based on that framework. These methods dynamically adjust the ball by expanding or contracting it when new sample points arrive, thereby effectively avoiding the issue of excessively large balls. To evaluate their performance, we conduct experiments on 7 synthetic and 5 real-world benchmark datasets and compare against state-of-the-art methods. The experiments demonstrate the applicability and flexibility of the SCDA methods in stream data mining by comparing three aspects: predictive performance, memory usage and scalability of the ball. Among them, the SCDA-III method performs best in all these aspects.

  • Open Access
  • Research Article
  • 10.31272/jae.i139.1099
Analyzing Time series by using Data mining
  • Jun 9, 2024
  • Journal of Administration and Economics
  • Prof Dr Husam Abulrazzak + 1 more

Algorithms and complex data analysis techniques are used in a growing number of fields, which brings new challenges in handling more numerous and more complex data types. Research directions vary with the diversity of these fields, and the use of such techniques is increasing in artificial intelligence, which aims to facilitate human life in various domains. Mining of complex data types includes mining of time series, symbolic sequences, and biological sequences, in addition to mining of graphs, computer networks, mobile data, text, and data streams.

  • Research Article
  • 10.52783/jes.4235
Privacy Preserving Data Stream Classification: Recent Approaches and Open Challenges
  • Jun 1, 2024
  • Journal of Electrical Systems
  • Anita A Parmar

With the significant growth of big data streams, the research community has paid great attention to data stream mining, which has a wide range of applications such as banking, education, networking, telecommunication, weather forecasting, and the stock market. As a result, privacy preservation in data stream mining is receiving more attention from researchers. In this paper, we review privacy-preserving classification methods for data streams, which apply classification algorithms to big data streams while ensuring the privacy of the data. Recently, the emerging big data analytics context has cast new light on this exciting research area.

  • Research Article
  • Citations: 3
  • 10.1016/j.ins.2024.120575
Accelerating deep neural network learning using data stream methodology
  • Apr 10, 2024
  • Information Sciences
  • Piotr Duda + 2 more

  • Research Article
  • Citations: 3
  • 10.1145/3639285
Local Differentially Private Heavy Hitter Detection in Data Streams with Bounded Memory
  • Mar 12, 2024
  • Proceedings of the ACM on Management of Data
  • Xiaochen Li + 6 more

Top-k frequent items detection is a fundamental task in data stream mining. Many promising solutions are proposed to improve memory efficiency while still maintaining high accuracy for detecting the Top-k items. Despite the memory efficiency concern, the users could suffer from privacy loss if participating in the task without proper protection, since their contributed local data streams may continually leak sensitive individual information. However, most existing works solely focus on addressing either the memory-efficiency problem or the privacy concerns but seldom jointly, which cannot achieve a satisfactory tradeoff between memory efficiency, privacy protection, and detection accuracy. In this paper, we present a novel framework HG-LDP to achieve accurate Top-k item detection at bounded memory expense, while providing rigorous local differential privacy (LDP) protection. Specifically, we identify two key challenges naturally arising in the task, which reveal that directly applying existing LDP techniques will lead to an inferior "accuracy-privacy-memory efficiency" tradeoff. Therefore, we instantiate three advanced schemes under the framework by designing novel LDP randomization methods, which address the hurdles caused by the large size of the item domain and by the limited space of the memory. We conduct comprehensive experiments on both synthetic and real-world datasets to show that the proposed advanced schemes achieve a superior "accuracy-privacy-memory efficiency" tradeoff, saving 2300× memory over baseline methods when the item domain size is 41,270. Our code is anonymously open-sourced via the link.
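
To make the bounded-memory Top-k task concrete, the sketch below uses Space-Saving, a well-known bounded-memory counting algorithm swapped in here for illustration. It is not the HG-LDP framework and contains no local differential privacy randomization; the function name `space_saving` and the toy stream are assumptions.

```python
def space_saving(stream, capacity):
    """Track approximate counts of frequent items using at most `capacity` counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < capacity:
            counters[item] = 1
        else:
            # Evict the current minimum and let the newcomer inherit its count + 1,
            # so estimated counts never underestimate true counts.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters


# Two genuinely heavy items followed by a tail of rare items.
stream = ["a"] * 50 + ["b"] * 30 + list("cdefg") * 2
counts = space_saving(stream, capacity=3)
top2 = sorted(counts, key=counts.get, reverse=True)[:2]
```

Memory stays fixed at `capacity` counters regardless of how many distinct items appear, at the cost of overestimating counts for items that entered by eviction.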

  • Research Article
  • 10.3233/idt-230065
Explainable data stream mining: Why the new models are better
  • Feb 20, 2024
  • Intelligent Decision Technologies
  • Hanqing Hu + 2 more

Explainable Machine Learning brings explainability, interpretability, and accountability to Data Mining Algorithms. Existing explanation frameworks focus on explaining the decision process of a single model in a static dataset. However, in data stream mining, changes in data distribution over time, called concept drift, may require updating the learning models to reflect the current data environment. It is therefore important to go beyond static models and understand what has changed among the learning models before and after a concept drift. We propose a Data Stream Explainability framework (DSE) that works together with a typical data stream mining framework where support vector machine models are used. DSE aims to help non-expert users understand model dynamics in a concept drifting data stream. DSE visualizes differences between SVM models before and after concept drift to produce explanations of why the new model fits the data better. A survey was carried out among expert and non-expert users on the effectiveness of the framework. Although results showed non-expert users on average responded with less understanding of the issue compared to expert users, the difference is not statistically significant. This indicates that DSE successfully brings the explainability of model change to non-expert users.

  • Research Article
  • Citations: 1
  • 10.1016/j.is.2024.102351
SuperGuardian: Superspreader removal for cardinality estimation in data streaming
  • Feb 17, 2024
  • Information Systems
  • Jie Lu + 5 more

  • Research Article
  • Citations: 1
  • 10.1002/sam.11662
Rarity updated ensemble with oversampling: An ensemble approach to classification of imbalanced data streams
  • Feb 1, 2024
  • Statistical Analysis and Data Mining: The ASA Data Science Journal
  • Zahra Nouri + 2 more

Today's ever‐increasing generation of streaming data demands novel data mining approaches tailored to mining dynamic data streams. Data streams are non‐static in nature, continuously generated, and endless. They often suffer from class imbalance and undergo temporal drift. To address the classification of consecutive data instances within imbalanced data streams, this research introduces a new ensemble classification algorithm called Rarity Updated Ensemble with Oversampling (RUEO). The RUEO approach is specifically designed to exhibit robustness against class imbalance by incorporating an imbalance‐specific criterion to assess the efficacy of the base classifiers and employing an oversampling technique to reduce the imbalance in the training data. The RUEO algorithm was evaluated on a set of 20 data streams and compared against 14 baseline algorithms. The proposed RUEO algorithm achieves an average‐accuracy of 0.69 on the real‐world data streams, while the chunk‐based algorithms AWE, AUE, and KUE achieve average‐accuracies of 0.48, 0.65, and 0.66, respectively. The statistical analysis, conducted using the Wilcoxon test, reveals a statistically significant improvement in average‐accuracy for the proposed RUEO algorithm when compared to 12 out of the 14 baseline algorithms. The source code and experimental results of this research work will be publicly available at https://github.com/vkiani/RUEO.

  • Open Access
  • Research Article
  • Citations: 1
  • 10.3390/bdcc8020016
Fair-CMNB: Advancing Fairness-Aware Stream Learning with Naïve Bayes and Multi-Objective Optimization
  • Jan 31, 2024
  • Big Data and Cognitive Computing
  • Maryam Badar + 1 more

Fairness-aware mining of data streams is a challenging concern in the contemporary domain of machine learning. Many stream learning algorithms are used to replace humans in critical decision-making processes, e.g., hiring staff, assessing credit risk, etc. This calls for handling massive amounts of incoming information with minimal response delay while ensuring fair and high-quality decisions. Although deep learning has achieved success in various domains, its computational complexity may hinder real-time processing, making traditional algorithms more suitable. In this context, we propose a novel adaptation of Naïve Bayes to mitigate discrimination embedded in the streams while maintaining high predictive performance through multi-objective optimization (MOO). Class imbalance is an inherent problem in discrimination-aware learning paradigms. To deal with class imbalance, we propose a dynamic instance weighting module that gives more importance to new instances and less importance to obsolete instances based on their membership in a minority or majority class. We have conducted experiments on a range of streaming and static datasets and concluded that our proposed methodology outperforms existing state-of-the-art (SoTA) fairness-aware methods in terms of both discrimination score and balanced accuracy.
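
Balanced accuracy, one of the two metrics reported above, is commonly defined as the mean of per-class recall, which keeps a majority class from dominating the score. The sketch below computes it from scratch under that common definition; the function name and the toy labels are assumptions, and the paper's own evaluation code may differ.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# A classifier that always predicts the majority class on imbalanced labels.
y_true = [1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 1, 1]
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
```

Here plain accuracy looks strong while balanced accuracy exposes that the minority class is never predicted correctly.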
