Abstract

Under current statistical and technical conditions, official statistical data are published with a certain lag: reports take time to compile, which can delay judgments about the current economic situation. Real-time network analysis based on big data has therefore gradually become a mainstay of data analysis. This paper puts forward a basic approach to the problem of statistical inference from non-probability samples. Sample selection can rely on methods such as sample matching and link-tracing sampling, so that the resulting non-probability samples resemble probability samples and the statistical inference theory of probability samples can be applied. Both random and non-random sampling techniques still have many applicable scenarios, not only in traditional sampling surveys but also in more modern information settings that evolve with the times.
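
As a concrete illustration of the sample-matching idea mentioned above, the sketch below matches a large non-probability sample to a small probability reference sample on shared covariates and then applies an ordinary weighted estimator. The data, weights, and matching rule are hypothetical placeholders, not the procedure studied in the paper.

```python
# Minimal sketch of sample matching for non-probability samples.
# Assumption: a small probability reference sample with known design weights
# is available, and both samples observe the same covariates.
import numpy as np

rng = np.random.default_rng(0)

# Probability reference sample: covariates + known design weights
ref_x = rng.normal(size=(50, 2))
ref_w = rng.uniform(1.0, 5.0, size=50)          # survey design weights

# Large non-probability (convenience) sample: covariates + observed outcome y
big_x = rng.normal(size=(5000, 2))
big_y = big_x @ np.array([1.5, -0.8]) + rng.normal(size=5000)

# Match each reference unit to its nearest non-probability unit on covariates,
# then transfer the reference unit's design weight to the matched observation.
dists = np.linalg.norm(ref_x[:, None, :] - big_x[None, :, :], axis=2)
matched = dists.argmin(axis=1)

# The matched subset is treated like a probability sample, so a standard
# weighted (Hajek-type) estimator of the population mean can be applied.
y_hat = np.sum(ref_w * big_y[matched]) / np.sum(ref_w)
print(f"matched-sample estimate of mean(y): {y_hat:.3f}")
```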

Highlights

  • With the rapid development and wide application of big data and artificial intelligence technologies, data worldwide are growing explosively

  • In the growth of a decision tree, the training sample set is split repeatedly and each branch grows by further splitting; growth stops once no split remains meaningful. The reported results show that a single classification tree performs slightly better than the combined classification tree on the training sample set, but for new samples the combined prediction model built from bootstrap samples is clearly better than the single model. This shows that a combined model built on multiple groups of random samples can improve the robustness of predictions to some extent, and that random samples retain application value in big data analysis (see the sketch after this list)

  • For the statistical inference of non-probability sampling in the big data setting, this paper shows how the statistical inference theory of probability samples can be used for inference
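
The comparison described in the second highlight can be sketched as follows: a single classification tree versus a combined model built on bootstrap samples (bagging). The data set, sample sizes, and tree settings are hypothetical placeholders, not the paper's experiment.

```python
# Single classification tree vs. bootstrap-combined (bagged) trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier        # bags decision trees by default
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

single = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
bagged = BaggingClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# The single tree tends to fit the training set more closely, while the
# bootstrap-combined model is usually more accurate on new samples.
print("train:", single.score(X_tr, y_tr), bagged.score(X_tr, y_tr))
print("test: ", single.score(X_te, y_te), bagged.score(X_te, y_te))
```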



Introduction

With the rapid development and wide application of big data and artificial intelligence technologies, data worldwide are growing explosively. The statistical characteristics of the big data era challenge traditional sampling analysis and call into question the representativeness and reliability of its results. The basic idea of the sampling approach is to randomly draw sub-samples from the initial big data and use them in place of the original data for estimation, prediction, and statistical inference on the model. The difficulty lies in designing the sampling probability distribution of the sub-samples. We must provide new methods to obtain effective information, adapt them to large-scale data processing, transform simple data into rich and diverse information assets with a high growth rate, and improve process optimization and decision-making capability.
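
The subsampling idea in this paragraph can be illustrated with a short sketch: draw a small sub-sample from the full data under a chosen sampling distribution, then correct with inverse-probability weights so the sub-sample estimate targets the full-data quantity. The size-proportional design and the auxiliary variable below are illustrative assumptions, not the design studied in the paper.

```python
# Inverse-probability-weighted estimation from a sub-sample of "big data".
import numpy as np

rng = np.random.default_rng(1)
y = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)   # target variable in the full data
x = y * rng.uniform(0.8, 1.2, size=y.size)               # auxiliary variable guiding the design

n = 2_000
p = x / x.sum()                                           # sampling probabilities (PPS on x)
idx = rng.choice(y.size, size=n, replace=True, p=p)

# Hansen-Hurwitz style estimate of the full-data mean: (1/n) * sum(y_i / (N * p_i))
estimate = np.mean(y[idx] / (y.size * p[idx]))
print(f"full-data mean: {y.mean():.4f}, sub-sample estimate: {estimate:.4f}")
```

A poorly chosen sampling distribution (for example, probabilities unrelated to the target variable) inflates the variance of this estimator, which is exactly why designing the sub-sample distribution is the hard part.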

Impact of Collection Method
Impact of Statistical Authority
Impact of Statistical Data Security
Tracking Sampling Method
Estimation of Optimal Sampling Design for Linear Model
Application Analysis of Statistical Method of Sampling Big Data
Summary