Customer data acquisition is an important task in data-driven business analytics. Recently, there has been growing interest in the effective use of an organization’s internal customer data, also known as first-party data. This work studies the acquisition of new data for business analytics based on first-party data resources. We address issues related to both acquisition cost and data quality. To reduce acquisition cost, we consider auction-based methods, such as the generalized second price (GSP) auction, which acquire data at differential prices for different customers. We find that the GSP-based data acquisition method incurs a lower cost and/or achieves a higher response rate than fixed-price methods. To maximize data quality, we propose novel optimization models for different data acquisition methods and data quality measures. The proposed models maximize the quality of the acquired data while satisfying budget constraints. We derive and discuss the solutions to the optimization models analytically and provide managerial insights from the solutions. The proposed approach is effective in increasing customer responses, reducing selection bias, and enabling more accurate estimation and prediction for business analytics. The results of the experimental evaluation demonstrate the advantage of the proposed approach over existing data acquisition methods.

History: Accepted by Ram Ramesh, Area Editor for Data Science and Machine Learning.

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0037) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2022.0037). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.