Abstract

A novel approach to statistical inference for multivariate binary transaction data is proposed. A fundamental question raised by such data, often called market basket data, is how the items relate to one another. These relationships are naturally expressed as a graph, and transactions can be modeled as samples of cliques from this association graph. A hierarchical model is developed from this generative idea, along with an MCMC sampling procedure that handles large datasets and allows inference on a broad set of parameters. Compared with frequent itemset mining (FIM) output, the model provides a sparser representation of associations between items without sacrificing predictive accuracy, and the breadth of parameters available for inference gives deeper insight into transaction data. Empirical results are reported for simulated data and for real transaction data from Instacart.
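
The generative idea, transactions as cliques drawn from an association graph, can be illustrated with a toy sketch. This is not the hierarchical model of the paper, whose details the abstract does not give. The item names, the edge set, the function names, and the uniform choice over cliques are all assumptions made purely for illustration.

```python
import itertools
import random

# Hypothetical toy association graph: items are nodes, edges join associated items.
ITEMS = ["bread", "butter", "jam", "chips", "salsa"]
EDGES = {("bread", "butter"), ("bread", "jam"), ("butter", "jam"), ("chips", "salsa")}


def is_clique(nodes, edges):
    """Return True if every pair of nodes is joined by an edge."""
    return all((a, b) in edges or (b, a) in edges
               for a, b in itertools.combinations(nodes, 2))


def all_cliques(items, edges):
    """Enumerate every non-empty clique by brute force (toy graphs only)."""
    return [subset
            for r in range(1, len(items) + 1)
            for subset in itertools.combinations(items, r)
            if is_clique(subset, edges)]


def sample_transactions(items, edges, n, seed=0):
    """Draw n transactions as binary vectors, one clique per transaction.

    Assumption: cliques are chosen uniformly at random, which stands in
    for whatever distribution the actual model places on cliques.
    """
    rng = random.Random(seed)
    cliques = all_cliques(items, edges)
    rows = []
    for _ in range(n):
        basket = set(rng.choice(cliques))
        rows.append([1 if item in basket else 0 for item in items])
    return rows


if __name__ == "__main__":
    for row in sample_transactions(ITEMS, EDGES, n=5):
        print(row)
```

Because every sampled basket is a clique, items that share no edge (here, bread and chips) never appear in the same transaction. In this sketch the graph alone therefore encodes which co-occurrences are possible.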

