Abstract

Many machine learning algorithms used for dimensional reduction and manifold learning leverage on the computation of the nearest neighbors to each point of a data set to perform their tasks. These proximity relations define a so-called geometric graph, where two nodes are linked if they are sufficiently close to each other. Random geometric graphs, where the positions of nodes are randomly generated in a subset of R^{d}, offer a null model to study typical properties of data sets and of machine learning algorithms. Up to now, most of the literature focused on the characterization of low-dimensional random geometric graphs whereas typical data sets of interest in machine learning live in high-dimensional spaces (d≫10^{2}). In this work, we consider the infinite dimensions limit of hard and soft random geometric graphs and we show how to compute the average number of subgraphs of given finite size k, e.g., the average number of k cliques. This analysis highlights that local observables display different behaviors depending on the chosen ensemble: soft random geometric graphs with continuous activation functions converge to the naive infinite-dimensional limit provided by Erdös-Rényi graphs, whereas hard random geometric graphs can show systematic deviations from it. We present numerical evidence that our analytical results, exact in infinite dimensions, provide a good approximation also for dimension d≳10.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.