Abstract

The systematic processing of unstructured communication data and the recognition of patterns that identify communication groups in negotiations pose many challenges for machine learning. In particular, the so-called curse of dimensionality makes pattern recognition demanding and calls for further research in the negotiation context. In this paper, selected well-known clustering approaches are evaluated with regard to their pattern recognition potential on high-dimensional negotiation communication data. A research approach is presented that evaluates the application potential of the selected methods within a holistic framework comprising three main evaluation milestones: determining the optimal number of clusters, applying the clustering techniques, and evaluating performance. To this end, quantified Term Document Matrices (TDMs) are first pre-processed and then used as the underlying data sets to investigate the pattern recognition potential of the clustering techniques, taking into account the optimal number of clusters and measuring both internal and external performance. Overall, the holistic evaluation shows that certain cluster separations are recommended by the internal and external performance measures, whereas three of the cluster separations are eliminated based on the evaluation results.
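The three evaluation milestones named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's setup: the six toy messages and their reference labels are hypothetical, k-means with a farthest-first start stands in for the clustering techniques under study, the silhouette coefficient stands in for the internal measures, and purity stands in for the external measures.

```python
import math
from collections import Counter

def term_document_matrix(messages):
    """Rows are messages, columns are vocabulary terms, cells are raw counts."""
    vocab = sorted({w for m in messages for w in m.lower().split()})
    rows = []
    for m in messages:
        counts = Counter(m.lower().split())
        rows.append([counts[t] for t in vocab])
    return rows, vocab

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(rows, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-first start (assumption)."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(euclidean(r, c) for c in centers)))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: euclidean(r, centers[c])) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster runs empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(rows, labels):
    """Internal measure: mean silhouette coefficient, higher is better."""
    scores = []
    for i, r in enumerate(rows):
        dists = {}
        for j, other in enumerate(rows):
            if i != j:
                dists.setdefault(labels[j], []).append(euclidean(r, other))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            scores.append(0.0)  # singleton cluster, or only one cluster overall
            continue
        a = sum(own) / len(own)                             # mean distance inside own cluster
        b = min(sum(d) / len(d) for d in dists.values())    # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def purity(labels, truth):
    """External measure: share of items that belong to their cluster's majority class."""
    hits = 0
    for c in set(labels):
        classes = [t for lab, t in zip(labels, truth) if lab == c]
        hits += Counter(classes).most_common(1)[0][1]
    return hits / len(labels)

# Hypothetical toy messages and reference labels, not taken from the paper.
messages = [
    "price offer price discount",
    "discount price offer",
    "offer price discount discount",
    "delivery date shipping delay",
    "shipping delivery date",
    "delay delay shipping date",
]
truth = ["price", "price", "price", "delivery", "delivery", "delivery"]

tdm, vocab = term_document_matrix(messages)

# Milestone 1: choose the number of clusters by the best silhouette score.
scores = {k: silhouette(tdm, kmeans(tdm, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)

# Milestone 2: main clustering run with the selected number of clusters.
labels = kmeans(tdm, best_k)

# Milestone 3: internal and external performance of the final separation.
print("k =", best_k)
print("silhouette =", round(silhouette(tdm, labels), 3))
print("purity =", round(purity(labels, truth), 3))
```

On real negotiation data the matrix would first pass through a dimensionality reduction step, and several clustering techniques and performance measures would be compared rather than a single one of each.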

Highlights

  • We report on an analytical comparison of renowned clustering techniques applied to high-dimensional communication data from e-negotiations in Negoisst.

  • A TDM reduced with Optimize Selection (OS), with a matrix size of [72,826 × 4841], and a TDM reduced with Principal Component Analysis (PCA), with a size of [72,826 × 175], were found to be efficiently compressed in further studies.

  • Processing and evaluating unstructured textual communication data is challenging for pattern recognition because of the missing structure and the high number of dimensions (Bonev et al. 2008; Donoho 2000).
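One commonly cited aspect of the curse of dimensionality is that distances concentrate as the number of dimensions grows, which blurs the notion of a nearest neighbour that distance-based clustering relies on. The sketch below illustrates this under stated assumptions: it uses uniformly random points rather than negotiation data, and `relative_contrast` is a hypothetical helper name introduced here.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one query point.

    Values near zero mean that all points look almost equally far away.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)

# The contrast shrinks as the dimensionality grows.
for dim in (2, 20, 200, 2000):
    print(dim, round(relative_contrast(dim), 3))
```

Term Document Matrices with thousands of columns sit at the high end of this range, which is one reason a dimensionality reduction step is applied before clustering.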

Summary

Motivation

The analysis of high-dimensional data requires an extensive pre-evaluation of techniques, as common statistical methods are not suitable for such complex cases (Kumar 2009). Communication particles, e.g. in the form of negotiation messages or sentences, have so far been examined primarily through manual coding-based approaches to avoid the processing effort. The research question is as follows: How do renowned clustering methods perform with regard to the detection of pattern groups in high-dimensional negotiation communication data? The challenges of clustering techniques in high-dimensional space are presented in relation to the processing of textually exchanged messages and the so-called curse of dimensionality. The research paper concludes with a summary of the key findings and a research outlook.

The Importance of Communicative Interactions in Electronic Negotiations
Clustering of High‐Dimensional Negotiation Messages
Research Approach
Dimensionality Reduction
Calculation of Similarity Measure
Evaluation of Optimal Cluster Number
Clustering Techniques
Performance Evaluation
Results
Discussion
Conclusion and Outlook