Abstract

Under current statistical and technical conditions, official statistical data are published with a certain lag: reports take time to compile, which can delay judgments about the current economic situation. Real-time network analysis based on big data has therefore gradually become a mainstay of data analysis. This paper puts forward a basic approach to the problem of statistical inference from non-probability samples. Sample selection can rely on methods such as sample matching and link-tracing sampling, so that the resulting non-probability samples resemble probability samples and the statistical inference theory of probability samples can be applied. Both random and non-random sampling techniques still have many applicable scenarios, not only in traditional sampling surveys but also in more modern information settings that evolve with the times.
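
As a concrete illustration of the sample-matching idea mentioned above, the sketch below matches a large non-probability sample to a small probability reference sample on shared covariates and then applies an ordinary weighted estimator. The data, weights, and matching rule are hypothetical placeholders, not the procedure studied in the paper.

```python
# Minimal sketch of sample matching for non-probability samples.
# Assumption: a small probability reference sample with known design weights
# is available, and both samples observe the same covariates.
import numpy as np

rng = np.random.default_rng(0)

# Probability reference sample: covariates + known design weights
ref_x = rng.normal(size=(50, 2))
ref_w = rng.uniform(1.0, 5.0, size=50)          # survey design weights

# Large non-probability (convenience) sample: covariates + observed outcome y
big_x = rng.normal(size=(5000, 2))
big_y = big_x @ np.array([1.5, -0.8]) + rng.normal(size=5000)

# Match each reference unit to its nearest non-probability unit on covariates,
# then transfer the reference unit's design weight to the matched observation.
dists = np.linalg.norm(ref_x[:, None, :] - big_x[None, :, :], axis=2)
matched = dists.argmin(axis=1)

# The matched subset is treated like a probability sample, so a standard
# weighted (Hajek-type) estimator of the population mean can be applied.
y_hat = np.sum(ref_w * big_y[matched]) / np.sum(ref_w)
print(f"matched-sample estimate of mean(y): {y_hat:.3f}")
```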

Highlights

  • With the rapid development and wide application of big data and artificial intelligence technologies, data worldwide are growing explosively

  • In the growth of a decision tree, the training sample set is split repeatedly and each branch grows by further splitting; growth stops once no split remains meaningful. The reported results show that a single classification tree performs slightly better than the combined classification tree on the training sample set, but for new samples the combined prediction model built from bootstrap samples is clearly better than the single model. This shows that a combined model built on multiple groups of random samples can improve the robustness of predictions to some extent, and that random samples retain application value in big data analysis (see the sketch after this list)

  • For the statistical inference of non-probability sampling in the big data setting, this paper shows how the statistical inference theory of probability samples can be used for inference
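
The comparison described in the second highlight can be sketched as follows: a single classification tree versus a combined model built on bootstrap samples (bagging). The data set, sample sizes, and tree settings are hypothetical placeholders, not the paper's experiment.

```python
# Single classification tree vs. bootstrap-combined (bagged) trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier        # bags decision trees by default
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

single = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
bagged = BaggingClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# The single tree tends to fit the training set more closely, while the
# bootstrap-combined model is usually more accurate on new samples.
print("train:", single.score(X_tr, y_tr), bagged.score(X_tr, y_tr))
print("test: ", single.score(X_te, y_te), bagged.score(X_te, y_te))
```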



Introduction

With the rapid development and wide application of big data and artificial intelligence technologies, data worldwide are growing explosively. The statistical characteristics of the big data era challenge traditional sampling analysis and call into question the representativeness and reliability of its results. The basic idea of the sampling approach is to randomly draw sub-samples from the initial big data and use them in place of the original data for estimation, prediction, and statistical inference on the model. The difficulty lies in designing the sampling probability distribution of the sub-samples. We must provide new methods to obtain effective information, adapt them to large-scale data processing, transform simple data into rich and diverse information assets with a high growth rate, and improve process optimization and decision-making capability.
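
The subsampling idea in this paragraph can be illustrated with a short sketch: draw a small sub-sample from the full data under a chosen sampling distribution, then correct with inverse-probability weights so the sub-sample estimate targets the full-data quantity. The size-proportional design and the auxiliary variable below are illustrative assumptions, not the design studied in the paper.

```python
# Inverse-probability-weighted estimation from a sub-sample of "big data".
import numpy as np

rng = np.random.default_rng(1)
y = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)   # target variable in the full data
x = y * rng.uniform(0.8, 1.2, size=y.size)               # auxiliary variable guiding the design

n = 2_000
p = x / x.sum()                                           # sampling probabilities (PPS on x)
idx = rng.choice(y.size, size=n, replace=True, p=p)

# Hansen-Hurwitz style estimate of the full-data mean: (1/n) * sum(y_i / (N * p_i))
estimate = np.mean(y[idx] / (y.size * p[idx]))
print(f"full-data mean: {y.mean():.4f}, sub-sample estimate: {estimate:.4f}")
```

A poorly chosen sampling distribution (for example, probabilities unrelated to the target variable) inflates the variance of this estimator, which is exactly why designing the sub-sample distribution is the hard part.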

Impact of Collection Method
Impact of Statistical Authority
Impact of Statistical Data Security
Tracking Sampling Method
Estimation of Optimal Sampling Design for Linear Model
Application Analysis of Statistical Method of Sampling Big Data
Summary