PurposeThe concept of productivity is central to performance management and decision-making, although it is complex and multifaceted. This paper aims to describe a methodology based on the use of Big Data in a cluster analysis combined with a data envelopment analysis (DEA) that provides accurate and reliable productivity measures in a large network of retailers.Design/methodology/approachThe methodology is described using a case study of a leading kitchen furniture producer. More specifically, Big Data is used in a two-step analysis prior to the DEA to automatically cluster a large number of retailers into groups that are homogeneous in terms of structural and environmental factors and assess a within-the-group level of productivity of the retailers.FindingsThe proposed methodology helps reduce the heterogeneity among the units analysed, which is a major concern in DEA applications. The data-driven factorial and clustering technique allows for maximum within-group homogeneity and between-group heterogeneity by reducing subjective bias and dimensionality, which is embedded with the use of Big Data.Practical implicationsThe use of Big Data in clustering applied to productivity analysis can provide managers with data-driven information about the structural and socio-economic characteristics of retailers' catchment areas, which is important in establishing potential productivity performance and optimizing resource allocation. The improved productivity indexes enable the setting of targets that are coherent with retailers' potential, which increases motivation and commitment.Originality/valueThis article proposes an innovative technique to enhance the accuracy of productivity measures through the use of Big Data clustering and DEA. To the best of the authors’ knowledge, no attempts have been made to benefit from the use of Big Data in the literature on retail store productivity.