  • New
  • Open Access
  • Research Article
  • 10.1007/s11634-025-00663-4
A distance-based aggregation method for finding consensus in preference-approvals
  • Jan 6, 2026
  • Advances in Data Analysis and Classification
  • Alessandro Albano + 1 more

  • Research Article
  • 10.1007/s11634-025-00659-0
Data-driven logistic regression ensembles with applications in genomics
  • Nov 25, 2025
  • Advances in Data Analysis and Classification
  • Anthony-Alexander Christidis + 2 more

  • Research Article
  • 10.1007/s11634-025-00660-7
Editorial for ADAC issue 4 of volume 19 (2025)
  • Nov 17, 2025
  • Advances in Data Analysis and Classification
  • Maurizio Vichi + 2 more

  • Open Access
  • Research Article
  • 10.1007/s11634-025-00655-4
Low-bias discrimination of circular data with measurement errors
  • Oct 18, 2025
  • Advances in Data Analysis and Classification
  • Marco Di Marzio + 3 more

Abstract We study nonparametric discrimination among circular density populations when sample data are affected by measurement errors. Relatively little research seems to have been devoted to this topic. Notoriously, in these problems, a nonparametric method needs to account for an additional source of bias due to the presence of measurement errors, beyond the usual bias typical of local methods. In the described context of abundant bias, we propose a deconvolution approach involving lower bias kernel estimators. Some asymptotic properties are discussed, and numerical results are provided along with a real data case study.

  • Research Article
  • 10.1007/s11634-025-00650-9
Two-stage principal component analysis on interval-valued data using patterned covariance structures
  • Jul 19, 2025
  • Advances in Data Analysis and Classification
  • Anuradha Roy

  • Addendum
  • 10.1007/s11634-025-00648-3
Correction to: Sparse correspondence analysis for large contingency tables
  • Jun 26, 2025
  • Advances in Data Analysis and Classification
  • Ruiping Liu + 3 more

  • Open Access
  • Research Article
  • 10.1007/s11634-025-00646-5
Sparse constrained and unconstrained non-symmetric correspondence analysis
  • Jun 23, 2025
  • Advances in Data Analysis and Classification
  • Mark De Rooij + 1 more

Abstract In this paper, we propose to regularize non-symmetric correspondence analysis (NSCA) and its canonical variant by employing LASSO and group LASSO penalties. NSCA visualizes the asymmetric association structure of a categorical predictor variable and a categorical response variable through a biplot with points for the predictor categories and vectors for the response categories. In canonical NSCA, external information is available about the categories of the predictor variable and this information is used to linearly constrain the coordinates of the points. When the number of predictor categories is large or when the number of external variables is large, this leads to problems in terms of interpretation and/or estimation. To avoid these problems, we propose to use a LASSO or group LASSO penalty on the parameters. Such penalties shrink the parameters to zero, offering a sparse solution. Therefore, we first cast (constrained) NSCA as a least squares estimation problem and then add the penalty to the least squares loss function. We derive a Majorization-Minimization algorithm to minimize this loss function. A bootstrap procedure is proposed for model selection, that is, determining the optimal dimensionality and optimal value of the penalty parameter. The procedures are illustrated using two empirical data sets, one for constrained (i.e., canonical) NSCA, and one for unconstrained NSCA. We discuss in detail the model selection procedure and the interpretation of the selected model.

  • Open Access
  • Research Article
  • Citations: 1
  • 10.1007/s11634-025-00651-8
Flexible multi-class cost-sensitive thresholding
  • Jun 22, 2025
  • Advances in Data Analysis and Classification
  • Jorge C-Rella + 1 more

Abstract Classification involves categorizing input data into predefined classes based on their characteristics. Thresholding methods predict the optimal class for an observation given a score and a misclassification error cost specification. In multi-class classification, existing algorithms assume that a score is available for each possible response. However, there are scenarios where more classes can be predicted than the underlying response variable has. This paper extends the flexibility of the 2-DDR algorithm introduced by C-Rella et al. (Inf Sci 657:119956;2024) to the multi-class classification problem. The proposed method predicts the optimal classification in cost-sensitive multi-class problems considering a single score fitted over a binary variable, a problem not previously studied. Furthermore, a more efficient version of the algorithm is proposed. The good performance of the proposed multi-class method is demonstrated through extensive simulations and the analysis of four real data sets.
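
For orientation, the textbook binary case of cost-sensitive thresholding can be sketched as follows. This is not the 2-DDR algorithm or the multi-class method described in the abstract; it assumes the score is a calibrated probability of the positive class and that correct decisions cost nothing. The function names `cost_threshold` and `classify` are hypothetical.

```python
def cost_threshold(cost_fp: float, cost_fn: float) -> float:
    """Return the score cutoff that minimises expected cost.

    Assumption: the score p estimates P(y = 1 | x) and correct
    decisions cost nothing. Predicting 1 costs (1 - p) * cost_fp on
    average, predicting 0 costs p * cost_fn, so predicting 1 is the
    cheaper choice once p >= cost_fp / (cost_fp + cost_fn).
    """
    if cost_fp < 0 or cost_fn < 0 or cost_fp + cost_fn == 0:
        raise ValueError("costs must be non-negative and not both zero")
    return cost_fp / (cost_fp + cost_fn)


def classify(score: float, cost_fp: float, cost_fn: float) -> int:
    """Predict class 1 when the score reaches the cost-based cutoff."""
    return int(score >= cost_threshold(cost_fp, cost_fn))


# A false negative that is four times as costly as a false positive
# lowers the cutoff below the symmetric value of 0.5.
cutoff = cost_threshold(cost_fp=1.0, cost_fn=4.0)
decision = classify(0.3, cost_fp=1.0, cost_fn=4.0)
```

With symmetric costs the cutoff reduces to 0.5; making one type of error more expensive moves the cutoff so that this error becomes rarer.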

  • Open Access
  • Research Article
  • 10.1007/s11634-025-00639-4
Initialization strategies for clustering mixed-type data with the k-prototypes algorithm
  • Jun 12, 2025
  • Advances in Data Analysis and Classification
  • Rabea Aschenbruck + 2 more

Abstract One of the most popular partitioning cluster algorithms is k-means, which is only applicable to numerical data. An extension to mixed-type data containing numerical and categorical variables is the k-prototypes algorithm. Due to its iterative structure, the algorithm may converge only to a local minimum rather than a global minimum. Therefore, just like the solution of the original k-means, the resulting cluster partition depends on the initialization. In general, there are two ways of improving on the random-based initialization of the algorithm: one possibility is to determine concrete initial cluster centers, and the other strategy is to repeat the algorithm with different randomly chosen initial centers. In this work, algorithm initializations of both options are analyzed and evaluated comparatively in a benchmark study. To this end, selected initialization strategies of the k-means algorithm are adapted to mixed-type data. For the simulation study, several data sets are artificially generated and cluster partitions are determined by using the competing initialization strategies. It is shown that an improvement of the cluster algorithm’s target criterion can be achieved as well as the ability to identify appropriate groups, even with manageable time expenditure.
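
The second strategy mentioned in the abstract, repeating the algorithm from different random initial centers and keeping the best run, can be sketched roughly as below. This is an illustrative toy, not the benchmark code from the paper. It assumes the commonly used k-prototypes dissimilarity (squared Euclidean distance on the numerical part plus a weight `gamma` times the number of mismatching categories); `gamma`, `n_starts`, and all function names are hypothetical choices.

```python
import random
from collections import Counter


def dissimilarity(x_num, x_cat, c_num, c_cat, gamma):
    """Squared Euclidean distance plus gamma times categorical mismatches."""
    num = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    cat = sum(a != b for a, b in zip(x_cat, c_cat))
    return num + gamma * cat


def k_prototypes(num, cat, k, gamma, rng, max_iter=50):
    """One run from k randomly chosen observations; returns (cost, labels)."""
    n = len(num)
    protos = [(list(num[i]), list(cat[i])) for i in rng.sample(range(n), k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each observation joins its closest prototype.
        new_labels = [
            min(range(k), key=lambda j: dissimilarity(
                num[i], cat[i], protos[j][0], protos[j][1], gamma))
            for i in range(n)
        ]
        if new_labels == labels:
            break  # partition unchanged, so the run has converged
        labels = new_labels
        # Update step: means for numerical, modes for categorical variables.
        for j in range(k):
            members = [i for i in range(n) if labels[i] == j]
            if not members:
                continue  # keep the old prototype for an empty cluster
            means = [sum(num[i][d] for i in members) / len(members)
                     for d in range(len(num[0]))]
            modes = [Counter(cat[i][d] for i in members).most_common(1)[0][0]
                     for d in range(len(cat[0]))]
            protos[j] = (means, modes)
    cost = sum(dissimilarity(num[i], cat[i], *protos[labels[i]], gamma)
               for i in range(n))
    return cost, labels


def best_of_restarts(num, cat, k, gamma, n_starts=10, seed=0):
    """Repeat the algorithm and keep the run with the lowest target criterion."""
    rng = random.Random(seed)
    runs = (k_prototypes(num, cat, k, gamma, rng) for _ in range(n_starts))
    return min(runs, key=lambda run: run[0])


# Tiny mixed-type example: one numerical and one categorical variable.
num_data = [[0.0], [0.1], [5.0], [5.1]]
cat_data = [["a"], ["a"], ["b"], ["b"]]
best_cost, best_labels = best_of_restarts(num_data, cat_data, k=2, gamma=1.0)
```

More restarts make a poor local minimum less likely at the price of proportionally more computation, which is the trade-off between solution quality and time expenditure that the abstract refers to.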

  • Open Access
  • Research Article
  • 10.1007/s11634-025-00643-8
Modeling time-dependent population proportions in a finite mixture model setting
  • Jun 6, 2025
  • Advances in Data Analysis and Classification
  • Igor Melnykov + 1 more

Abstract This paper focuses on modeling population proportions with finite mixtures of Dirichlet distributions in a time-sensitive setting. Specifically, we assume that the proportions are observed in blocks or sequences so that all data points in a block need to be classified together in the resulting clustering solution. The example motivating our model comes from the United Nations’ demographic database that records the proportions of single, married, widowed, etc., participants at different ages for over 200 countries. Our methodology provides a way to distinguish several patterns that exist among the countries when it comes to changes in marital status at different ages. Conducting separate analyses by gender, we observe that the most pronounced split appears between the countries with more traditional gender roles versus those where respondents of both genders tend to get married later in life.