Abstract
Network data are becoming increasingly available, so there is a need for suitable methodology for their statistical analysis. Networks can be represented as graph Laplacian matrices, which are a type of manifold-valued data. Our main objective is to estimate a regression curve from a sample of graph Laplacian matrices conditional on a set of Euclidean covariates, for example, in dynamic networks where the covariate is time. We develop an adapted Nadaraya–Watson estimator that is uniformly weakly consistent when estimation uses Euclidean or power Euclidean metrics. We apply the methodology to the Enron email corpus to model smooth trends in monthly networks and to highlight anomalous networks. Another motivating application is in corpus linguistics, exploring trends in an author's writing style over time based on word co-occurrence networks.
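The regression idea can be illustrated with a minimal sketch restricted to the Euclidean metric, where a kernel-weighted Fréchet mean of graph Laplacians is an ordinary weighted average and, because the weights are non-negative, is itself a graph Laplacian. The Gaussian kernel, the bandwidth value, the function names and the toy data below are all assumptions made for illustration; this is not the authors' implementation, and the power Euclidean case is not shown.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return L = D - A for a symmetric, non-negative weighted adjacency matrix."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def nadaraya_watson_laplacian(x0, covariates, laplacians, bandwidth):
    """Kernel-weighted Euclidean mean of graph Laplacians at covariate value x0.

    covariates may be scalars (e.g. time) or vectors (e.g. spatial locations).
    """
    stack = np.asarray(laplacians, dtype=float)            # shape (n, m, m)
    x = np.asarray(covariates, dtype=float).reshape(len(stack), -1)
    x0 = np.asarray(x0, dtype=float).reshape(1, -1)
    sq_dist = ((x - x0) ** 2).sum(axis=1)
    # Gaussian kernel: an assumed choice, any non-negative kernel would do here.
    weights = np.exp(-0.5 * sq_dist / bandwidth**2)
    weights /= weights.sum()
    # Weighted average over the sample axis gives an (m, m) matrix.
    return np.tensordot(weights, stack, axes=1)


# Toy dynamic network: six time points, four nodes, edge weights growing with time.
rng = np.random.default_rng(0)
times = np.arange(6.0)
laplacians = []
for t in times:
    upper = np.triu(rng.random((4, 4)) * (1.0 + t), k=1)
    laplacians.append(graph_laplacian(upper + upper.T))

estimate = nadaraya_watson_laplacian(2.5, times, laplacians, bandwidth=1.0)
```

In this sketch the bandwidth controls how smooth the estimated trend is. A very small bandwidth essentially reproduces the nearest observed network, while a large one approaches the overall sample mean.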
Highlights
Networks are of wide interest and can represent many different phenomena, for example, social interactions and connections between regions in the brain (Ginestet et al., 2017; Kolaczyk, 2009).
Another motivating application is the study of evolving writing styles in the novels of Jane Austen and Charles Dickens, in which each network represents a novel based on word co-occurrences, and the covariate is the time at which writing of the novel began (Severn et al., 2020).
The two applications presented involve a scalar covariate, but the Nadaraya–Watson estimator also applies to more general covariates, for example, spatial covariates.
Funding information
Engineering and Physical Sciences Research Council, Grant/Award Number: EP/T003928/1.
Keywords
consistency, dynamic network, graph Laplacian, manifold, metric, Nadaraya–Watson