Abstract

Aggregated inference on distributed data becomes more and more important due to the larger size of data collected in different industries. Modeling and inference are needed in the case where data cannot be obtained at a central location; aggregated statistical inference is a major tool to solve the aforementioned problems. In the literature, problems under the setting of regression model (more generally, M‐estimator) are extensively studied. There are at least two popular techniques for distributed estimation: (a) averaging estimators from local locations and (b) the one‐step approach, which combines the simple averaging estimator with a classical Newton's method (using the local Hessian matrices) to generate a “one‐step” estimator. It is proved that under certain assumptions, the above constructed estimators enjoy the same asymptotic properties as the centralized estimator, which is obtained as if all data were available at a central location. We review the aforementioned two major estimations. It can be seen that, in Big‐Data problems, dividing the data to multiple machines and then using the aggregation technique to solve the estimation problem in parallel can speed up the computation with little compromise of the quality of the estimators. We discuss potential extensions to other models, such as support vector machine, principle component analysis, and so on. Numerical examples are omitted due to the space limitation; they can be easily found in the literature.This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Knowledge Discovery Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods Statistical Models > Fitting Models Statistical and Graphical Methods of Data Analysis > Modeling Methods and Algorithms

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.