Abstract

The enormous volume of data distributed at the network edge, together with ubiquitous connectivity, has given rise to the new paradigm of distributed machine learning and large-scale data analytics. Distributed principal component analysis (PCA) concerns finding a low-dimensional subspace that captures the most important information in high-dimensional data distributed over the network edge; the subspace is useful for distributed data compression and feature extraction. This work advocates applying over-the-air federated learning to the efficient implementation of distributed PCA in a wireless network under a data-privacy constraint, a framework termed AirPCA. The design exploits the waveform-superposition property of a multi-access channel to realize over-the-air aggregation of local subspace updates that are computed and simultaneously transmitted by devices to a server, thereby reducing multi-access latency. The traditional drawback of this class of techniques, namely channel-noise perturbation of uncoded analog-modulated signals, is turned into a mechanism for escaping saddle points during stochastic gradient descent (SGD) in the AirPCA algorithm. As a result, the convergence of the AirPCA algorithm is accelerated. To materialize the idea, descent speeds in different types of descent regions are analyzed mathematically using martingale theory, accounting for wireless propagation effects, namely channel fading and noise, and for transmission techniques including broadband transmission and over-the-air aggregation. The results reveal that noise accelerates descent in saddle regions but has the opposite effect in other types of regions. This insight is applied to designing an online scheme that adapts the receive signal power to the type of the current descent region: the scheme amplifies the noise effect in saddle regions by reducing signal power and applies the resulting power savings to suppress that effect in other regions.
Experiments on real datasets show that such power control accelerates convergence while achieving the same convergence accuracy as in the ideal case of centralized PCA.
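As a toy illustration of the noise-aided saddle escape described above (a minimal sketch only, not the AirPCA algorithm itself; the objective, step size, and noise level are illustrative assumptions), the following compares noiseless and noise-perturbed gradient descent on a function with a saddle point:

```python
import numpy as np

# Toy objective f(x, y) = x^2 - y^2, which has a saddle at the origin.
# Noiseless gradient descent started exactly on the saddle stays there,
# while injected noise -- playing the role of channel noise in AirPCA --
# kicks the iterate off the unstable direction so it can descend along -y^2.

def grad(w):
    """Gradient of f(x, y) = x^2 - y^2."""
    x, y = w
    return np.array([2.0 * x, -2.0 * y])

def descend(w0, steps=200, lr=0.1, noise_std=0.0, seed=0):
    """Gradient descent with optional additive Gaussian perturbation."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad(w) + noise_std * rng.standard_normal(2)
    return w

# Start exactly at the saddle point.
w_clean = descend([0.0, 0.0], noise_std=0.0)   # remains pinned at the saddle
w_noisy = descend([0.0, 0.0], noise_std=0.01)  # escapes along the y-direction
print(w_clean, w_noisy)
```

The noiseless run stays at the saddle because the gradient there is exactly zero, whereas even small perturbations grow geometrically along the unstable direction; this mirrors the paper's point that channel noise can be exploited, rather than merely suppressed, in saddle regions.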
