Abstract
The stochastic block model (SBM) and its variants constitute an important family of probabilistic tools for studying network data. There is a rich literature on methods for estimating the block labels and model parameters of stochastic block models. Most of these methods require the number of communities K as an input, making the estimation of K an important problem. Cross‐validation is a natural option for this problem since it is a widely used generic method for evaluating model fit. However, cross‐validation is known to be inconsistent and prone to overfitting unless impractical split ratios are used. Cross‐validation with confidence (CVC) was proposed with better theoretical guarantees in conventional settings. We study the properties of CVC for stochastic block models. Our theoretical results show that CVC, unlike standard cross‐validation, can consistently select the optimal K under suitable conditions. We implement this method and compare its performance with that of other established methods on both synthetic and real datasets.