Abstract
Let n respondents rank order d items, and suppose that . Our main task is to uncover and display the structure of the observed rank data by an exploratory riffle shuffling procedure that sequentially decomposes the n voters into a finite number of coherent groups plus a noisy group, where the noisy group represents the outlier voters and each coherent group is composed of a finite number of coherent clusters. We consider exploratory riffle shuffling of a set of items to be equivalent to an optimal two-block seriation of the items, with crossing of some scores between the two blocks. A riffle-shuffled coherent cluster of voters within its coherent group is essentially characterized by the following facts: 1) the voters share the same first TCA factor score, where TCA designates taxicab correspondence analysis, an L1 variant of correspondence analysis; 2) any preference is easily interpreted as a riffle shuffling of its items; 3) the nature of the different riffle shufflings of items can be seen in the structure of the contingency table of first-order marginals constructed from the Borda scorings of the voters; 4) the first TCA factor scores of the items of a coherent cluster are interpreted as a Borda scale of the items. We also introduce a crossing index, which measures the extent to which voters' scores cross between the two blocks of the seriation of the items. The novel approach is illustrated on the benchmark SUSHI data set, where we show that this data set has a very simple structure, which can also be communicated in tabular form.
Highlights
Ordering the elements of a set is a common decision-making activity, such as voting for a political candidate or choosing a consumer product.
Our main task is to uncover and display the structure of the observed rank data by an exploratory riffle shuffling procedure that sequentially decomposes the n voters into a finite number of coherent groups plus a noisy group, where the noisy group represents the outlier voters and each coherent group is composed of a finite number of coherent clusters.
A riffle-shuffled coherent cluster of voters within its coherent group is essentially characterized by the following facts: 1) the voters share the same first TCA factor score, where TCA designates taxicab correspondence analysis, an L1 variant of correspondence analysis; 2) any preference is interpreted as a riffle shuffling of its items; 3) the nature of the different riffle shufflings of items can be seen in the structure of the contingency table of first-order marginals constructed from the Borda scorings of the voters; 4) the first TCA factor scores of the items of a coherent cluster are interpreted as a Borda scale of the items.
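The Borda scorings and the contingency table of first-order marginals mentioned above can be illustrated with a minimal sketch. It assumes one common convention that the text here does not spell out: with d items, the most preferred item receives score d - 1 and the least preferred receives 0. The function names `borda_scores` and `first_order_marginals` are hypothetical, and the three-voter data are made up for illustration.

```python
def borda_scores(ranking):
    """Convert one voter's ranking into Borda scores.

    ranking[k] is the item placed at position k (0 = most preferred).
    Assumed convention: the top item scores d - 1, the bottom item scores 0.
    """
    d = len(ranking)
    scores = [0] * d
    for position, item in enumerate(ranking):
        scores[item] = d - 1 - position
    return scores


def first_order_marginals(rankings):
    """Build the d x d table whose entry (i, s) counts the voters
    who gave item i the Borda score s."""
    d = len(rankings[0])
    table = [[0] * d for _ in range(d)]
    for ranking in rankings:
        for item, score in enumerate(borda_scores(ranking)):
            table[item][score] += 1
    return table


# Toy data: three voters ranking items 0, 1, 2 (illustrative only).
toy_rankings = [(0, 1, 2), (0, 2, 1), (1, 0, 2)]
toy_table = first_order_marginals(toy_rankings)
for item, row in enumerate(toy_table):
    print(f"item {item}: {row}")
```

Every row and every column of such a table sums to the number of voters, since each voter gives each item exactly one score and uses each score exactly once.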
Summary
Ordering the elements of a set is a common decision-making activity, such as voting for a political candidate or choosing a consumer product. Rank data are often heterogeneous: they are composed of a finite mixture of components. The traditional methods of finding mixture components of rank data are mostly based on parametric probability models, distance models, or latent class models, and are useful for sparse data but not for diffuse data. Rank data are sparse if at most a small finite number of permutations capture the majority of the preferences; otherwise they are diffuse. The SUSHI data set is diffuse, because any single observed permutation has a count of at most three. It has been analyzed by, among others, [2], [3], and [4]. The APA data set is considered non-sparse by [2], because all 120 permutations occur with positive probability. For a general background on statistical methods for rank data, see the excellent monograph by [6] and the book [7].
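The distinction between sparse and diffuse rank data can be checked informally by counting how often each distinct permutation occurs. The sketch below is only an illustration under stated assumptions: the helper name `permutation_concentration` and the parameter `top_k` are hypothetical, the text gives no numeric threshold for "a small finite number" or "the majority", and the toy data are made up.

```python
from collections import Counter


def permutation_concentration(rankings, top_k=3):
    """Summarize how concentrated the observed permutations are.

    Returns the number of distinct permutations, the largest count for a
    single permutation, and the share of voters covered by the top_k most
    frequent permutations (top_k is an illustrative choice, not a rule).
    """
    counts = Counter(tuple(r) for r in rankings)
    top = counts.most_common(top_k)
    return {
        "distinct": len(counts),
        "max_count": top[0][1],
        "top_share": sum(c for _, c in top) / len(rankings),
    }


# Toy sparse data: one permutation accounts for 8 of the 10 voters.
sparse_toy = [(0, 1, 2)] * 8 + [(1, 0, 2)] + [(2, 1, 0)]
sparse_summary = permutation_concentration(sparse_toy, top_k=1)
print(sparse_summary)
```

On diffuse data such as the SUSHI set described above, the largest count would be very small relative to the number of voters, so no small set of permutations would cover most of them.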