Non‐parametric regression for networks

Katie E Severn, Ian L Dryden, Simon P Preston

doi:10.1002/sta4.373

Abstract

Network data are becoming increasingly available, and so there is a need to develop a suitable methodology for statistical analysis. Networks can be represented as graph Laplacian matrices, which are a type of manifold‐valued data. Our main objective is to estimate a regression curve from a sample of graph Laplacian matrices conditional on a set of Euclidean covariates, for example, in dynamic networks where the covariate is time. We develop an adapted Nadaraya–Watson estimator which has uniform weak consistency for estimation using Euclidean and power Euclidean metrics. We apply the methodology to the Enron email corpus to model smooth trends in monthly networks and highlight anomalous networks. Another motivating application is given in corpus linguistics, which explores trends in an author's writing style over time based on word co‐occurrence networks.
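As a rough illustration of the estimator described above, the following is a minimal sketch for the Euclidean metric only. It relies on the fact that, under the Euclidean (Frobenius) metric, a kernel-weighted Fréchet mean reduces to a weighted average, and that a convex combination of graph Laplacians is again a graph Laplacian. The Gaussian kernel, the bandwidth value, the toy data, and the function names `laplacian`, `nw_weights` and `nw_laplacian` are assumptions made for illustration and are not taken from the article. The power Euclidean case, which would also need a matrix power map and a projection back to the space of graph Laplacians, is not shown.

```python
import numpy as np


def laplacian(adjacency):
    """Graph Laplacian L = D - A for a symmetric weighted adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    return np.diag(A.sum(axis=1)) - A


def nw_weights(x, covariates, bandwidth):
    """Normalised kernel weights at covariate value x (Gaussian kernel assumed)."""
    # Reshape so that scalar and vector covariates are handled the same way.
    X = np.asarray(covariates, dtype=float).reshape(len(covariates), -1)
    x = np.asarray(x, dtype=float).reshape(1, -1)
    sq_dist = ((X - x) ** 2).sum(axis=1)
    k = np.exp(-0.5 * sq_dist / bandwidth ** 2)
    return k / k.sum()


def nw_laplacian(x, covariates, laplacians, bandwidth):
    """Kernel-weighted average of graph Laplacians (Euclidean metric)."""
    w = nw_weights(x, covariates, bandwidth)
    # Contract the weight vector with the first axis of the (n, m, m) stack.
    return np.tensordot(w, np.asarray(laplacians, dtype=float), axes=1)


# Toy dynamic network: 3 nodes, one edge weight growing linearly with time.
times = np.linspace(0.0, 1.0, 11)
laps = [laplacian(np.array([[0.0, t, 0.0],
                            [t, 0.0, 1.0],
                            [0.0, 1.0, 0.0]])) for t in times]

# Estimated Laplacian at time 0.5 (bandwidth chosen arbitrarily).
est = nw_laplacian(0.5, times, laps, bandwidth=0.1)
```

Because the weights are non-negative and sum to one, `est` stays symmetric with zero row sums and non-positive off-diagonal entries, so no projection step is needed in this Euclidean case. In practice the bandwidth would be chosen from the data, for example by cross-validation.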

Highlights

Networks are of wide interest and can represent many different phenomena, for example, social interactions and connections between regions in the brain (Ginestet et al., 2017; Kolaczyk, 2009).
Another motivating application is the study of evolving writing styles in the novels of Jane Austen and Charles Dickens, in which each network is a representation of a novel based on word co-occurrences, and the covariate is the time at which writing of the novel began (Severn et al., 2020).
The two applications presented involve a scalar covariate, but the Nadaraya–Watson estimator is applicable to more general covariates, for example, spatial covariates.

Summary

ORIGINAL ARTICLE

Funding information: Engineering and Physical Sciences Research Council, Grant/Award Number: EP/T003928/1.

KEYWORDS: consistency, dynamic network, graph Laplacian, manifold, metric, Nadaraya–Watson



Journal: Stat
Publication Date: May 7, 2021
Citations: 4
License type: CC BY 4.0


