Large-scale group decision-making (GDM) arises when a large group of decision-makers aims to compare several given options by ranking them or assigning scores to them. An example is determining a ranking for five finalist papers in a best-paper competition, based on a large number of votes submitted by scholars. We study two classes of GDM problems where the submitted votes as well as the final decisions are in the form of ranking (respectively, score) vectors, denoted by PR (respectively, PS). Because a group of decision-makers usually consists of clusters (subgroups) that are essentially different in how they evaluate options, identifying the clusters and their centroids (representative ranking/score vectors) is necessary. We show that the single-cluster cases of PR and PS are solvable in polynomial time. We present a general framework for deriving sufficient conditions on the number of decision-makers who propose a ranking/score vector to ensure their proposal is selected as the final decision. We prove that if a 'weak majority' (i.e., at least half) of the decision-makers vote for a ranking vector, and they are in 'extreme disagreement' (i.e., propose the most dissimilar ranking vectors) with the rest of the decision-makers, their proposal is an optimal decision of the group. Counterintuitively, we show that ensuring optimality of a ranking vector requires fewer votes if the decision-makers are in extreme disagreement. Additionally, ensuring optimality of a score vector requires more votes than ensuring optimality of a ranking vector. We prove that the optimal value of the classical k-means clustering problem is a lower bound on, and within 50% of, the optimal value of PR and PS. Interestingly, we establish the tightness of this worst-case gap for PR and PS with any number of clusters. We demonstrate the generalizability of this worst-case gap beyond PR and PS. We prove that PR and PS are NP-hard, using a reduction from the hypercube clustering problem.
Consequently, we present heuristic algorithms for finding good solutions for PR and PS in a reasonable amount of time. Our heuristic algorithms perform two major steps at each iteration: an assignment step that assigns vectors to clusters and an update step that determines the centroids of clusters. We propose six heuristic variations that use different assignment and update procedures, and show that each iteration of these heuristic algorithms runs in polynomial time. In our numerical experiments, the heuristic variation that determines the optimal centroids using our proposed polynomial procedures performs best, outperforming the other five variations.
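To illustrate the alternating assignment and update steps described above, the following is a minimal sketch and not one of the six variations studied here. It assumes squared Euclidean distance between vectors, which the abstract does not specify. Under that assumption, the mean of a cluster is its best score-vector centroid, and sorting the mean yields the closest ranking vector. The names `nearest_ranking` and `cluster_votes` are hypothetical.

```python
import numpy as np

def nearest_ranking(v):
    """Permutation of 1..n closest to v in squared Euclidean distance.

    Assumption: squared Euclidean distance. The smallest entry of v
    receives rank 1, the largest receives rank n.
    """
    order = np.argsort(v, kind="stable")
    ranks = np.empty(len(v), dtype=int)
    ranks[order] = np.arange(1, len(v) + 1)
    return ranks

def cluster_votes(votes, k, ranking=True, iters=50, seed=0):
    """Alternate assignment and update steps (illustrative sketch only)."""
    votes = np.asarray(votes, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct submitted votes chosen at random.
    centroids = votes[rng.choice(len(votes), size=k, replace=False)].copy()
    labels = None
    for _ in range(iters):
        # Assignment step: each vote joins its nearest centroid.
        dists = ((votes[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: recompute the centroid of each non-empty cluster.
        for c in range(k):
            members = votes[labels == c]
            if len(members) == 0:
                continue  # keep the previous centroid for an empty cluster
            mean = members.mean(axis=0)
            centroids[c] = nearest_ranking(mean) if ranking else mean
    cost = float(((votes - centroids[labels]) ** 2).sum())
    return labels, centroids, cost
```

With `ranking=True` the centroids are ranking vectors (the PR setting under the stated assumption), and with `ranking=False` they are unrestricted score vectors (the PS setting). As with any heuristic of this kind, the result depends on the random initialization and is not guaranteed to be optimal.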