Abstract

Network data are becoming increasingly available, creating a need for suitable methodology for their statistical analysis. Networks can be represented as graph Laplacian matrices, which are a type of manifold‐valued data. Our main objective is to estimate a regression curve from a sample of graph Laplacian matrices conditional on a set of Euclidean covariates, for example, in dynamic networks where the covariate is time. We develop an adapted Nadaraya–Watson estimator which has uniform weak consistency for estimation using Euclidean and power Euclidean metrics. We apply the methodology to the Enron email corpus to model smooth trends in monthly networks and highlight anomalous networks. Another motivating application comes from corpus linguistics, where we explore trends in an author's writing style over time based on word co‐occurrence networks.
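
As a rough illustration of the idea described above, the following Python sketch computes a kernel-weighted average of graph Laplacians at a target covariate value. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian kernel, the function names (`graph_laplacian`, `matrix_power_psd`, `nadaraya_watson_laplacian`), and the toy three-node networks are all hypothetical choices made here for illustration. With `alpha=1.0` the average is taken under the Euclidean metric; other values of `alpha` average the matrix powers instead, in the spirit of a power Euclidean metric. Any step that maps the result back onto the space of graph Laplacians is omitted, so for `alpha` other than 1 the output may not be an exact graph Laplacian.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return L = D - A for a symmetric (weighted) adjacency matrix A."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def matrix_power_psd(matrix, alpha):
    """Power of a symmetric positive semi-definite matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off
    # Scale each eigenvector column by its powered eigenvalue, then recombine.
    return (eigvecs * eigvals**alpha) @ eigvecs.T


def nadaraya_watson_laplacian(t0, times, laplacians, bandwidth, alpha=1.0):
    """Kernel-weighted mean of Laplacians at covariate value t0.

    Assumption: Gaussian kernel with the given bandwidth. No projection
    back onto the space of graph Laplacians is applied.
    """
    times = np.asarray(times, dtype=float)
    weights = np.exp(-0.5 * ((times - t0) / bandwidth) ** 2)
    weights /= weights.sum()  # normalised Nadaraya-Watson weights
    embedded = np.array([matrix_power_psd(L, alpha) for L in laplacians])
    mean_embedded = np.tensordot(weights, embedded, axes=1)
    return matrix_power_psd(mean_embedded, 1.0 / alpha)


# Toy example (hypothetical data): a path graph at time 0, a triangle at time 1.
A0 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
A1 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
laplacians = [graph_laplacian(A0), graph_laplacian(A1)]

estimate = nadaraya_watson_laplacian(0.5, [0.0, 1.0], laplacians, bandwidth=1.0)
print(estimate)
```

Because the target value 0.5 is equidistant from the two observation times, the kernel weights are equal and the Euclidean estimate is simply the midpoint of the two Laplacians. Smaller bandwidths give an estimate that tracks nearby networks more closely, while larger bandwidths smooth more heavily.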

Highlights

  • Networks are of wide interest and able to represent many different phenomena, for example, social interactions and connections between regions in the brain (Ginestet et al, 2017; Kolaczyk, 2009)

  • Another motivating application is the study of evolving writing styles in the novels of Jane Austen and Charles Dickens, in which each network is a representation of a novel based on word co-occurrences, and the covariate is the time that writing of the novel began (Severn et al, 2020)

  • The two applications presented involve a scalar covariate, but the Nadaraya–Watson estimator is appropriate to more general covariates, for example, spatial covariates


Summary

ORIGINAL ARTICLE

Funding information: Engineering and Physical Sciences Research Council, Grant/Award Number: EP/T003928/1.

KEYWORDS: consistency, dynamic network, graph Laplacian, manifold, metric, Nadaraya–Watson

| INTRODUCTION
| DISCUSSION
