Stream Clustering Research Articles

In recent years, data streams have become an increasingly important area of research for the computer science, database and statistics communities. Data streams are ordered and potentially unbounded sequences of data points created by a typically non-stationary data generating process. Common data mining tasks associated with data streams include clustering, classification and frequent pattern mining. New algorithms for these types of data are proposed regularly and it is important to evaluate them thoroughly under standardized conditions. In this paper we introduce stream, a research tool that includes modeling and simulating data streams as well as an extensible framework for implementing, interfacing and experimenting with algorithms for various data stream mining tasks. The main advantage of stream is that it seamlessly integrates with the large existing infrastructure provided by R. In addition to data handling, plotting and easy scripting capabilities, R also provides many existing algorithms and enables users to interface code written in many programming languages popular among data mining researchers (e.g., C/C++, Java and Python). In this paper we describe the architecture of stream and focus on its use for data stream clustering research. stream was implemented with extensibility in mind and will be extended in the future to cover additional data stream mining tasks like classification and frequent pattern mining.

Speaker diarization is the process of determining "who spoke when?", assigning speaker labels to the time regions in which each speaker was active. In previous work, a model-based speaker diarization system was built using tangential weighted Mel frequency cepstral coefficients as the feature parameters for voice activity detection and the Lion optimization algorithm for clustering the audio streams into speaker groups. In this paper, a speaker diarization system is proposed that uses multiple kernel weighted Mel frequency cepstral coefficient (MKMFCC) parameterization and Wu-and-Li Index (WLI)-fuzzy clustering. First, MKMFCC, which uses multiple kernels such as the tangential and exponential kernels to weight the MFCCs, is proposed for feature parameterization. Second, a clustering algorithm called WLI-fuzzy clustering is proposed for grouping the segments that belong to the same speaker. The proposed system is evaluated on the publicly available ELSDSR corpus, using audio signals with seven different speakers. Performance is analysed using the diarization error rate, F-measure and false alarm rate. The results show that the proposed system performs better at tracking the active speakers among multiple speakers, with improved tracking accuracy.

Articles published on Stream Clustering

Adaptive Clustering for Dynamic IoT Data Streams

TAILS FROM THE ORPHANAGE

Introduction to stream: An Extensible Framework for Data Stream Clustering Research with R

Fat node leading tree for data stream clustering with density peaks

State-of-the-art on clustering data streams

A clustering algorithm for stream data with LDA-based unsupervised localized dimension reduction

Speaker diarization system using MKMFCC parameterization and WLI-fuzzy clustering

Missing data imputation for paired stream and air temperature sensor data

Using internal evaluation measures to validate the quality of diverse stream clustering algorithms

Supervised Adaptive Incremental Clustering for data stream of chunks

An evolutionary algorithm for clustering data streams with a variable number of clusters

Gaps in globular cluster streams: giant molecular clouds can cause them too

Clustering Big Data streams: recent challenges and contributions

Distributed stream clustering using micro-clusters on Apache Storm

An Ensemble of Adaptive Neuro-Fuzzy Kohonen Networks for Online Data Stream Fuzzy Clustering

Efficiency of Stream Processing Engines for Processing BIGDATA Streams

Hyper-cylindrical micro-clustering for streaming data with unscheduled data removals

Data stream clustering by divide and conquer approach based on vector model

Penalty Parameter Selection for Hierarchical Data Stream Clustering

Topic Evolutionary Tweet Stream Clustering Algorithm and TCV Rank Summarization
