Fitting Models to Data as an Application of Optimization in Calculus
The teaching and learning of calculus is enhanced by applications to real-world problems. The topic of optimization (finding the maximum or minimum of a function) is a place where many calculus instructors emphasize applications. However, the breadth of applications explored in optimization is usually quite limited. In this paper, we show how the classic statistics problem of fitting a model to data can be incorporated as an application of optimization techniques in single and multivariable calculus classes. In particular, we introduce the reader to the method of maximum likelihood estimation, illustrate its use via a number of examples, and provide a link to a larger collection of worked examples online. We also describe the positive impact that including this material had on student attitudes in a calculus class.
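The fitting-by-optimization idea described above can be sketched in a few lines: with a hypothetical sample assumed to follow an exponential model, one derivative of the log-likelihood gives the closed-form maximum likelihood estimate of the rate.

```python
import math

# Hypothetical waiting-time sample, modeled as exponential with rate lam.
data = [0.5, 1.2, 0.3, 2.1, 0.9]

def log_likelihood(lam, xs):
    # log L(lam) = n * log(lam) - lam * sum(xs)
    return len(xs) * math.log(lam) - lam * sum(xs)

# Single-variable calculus: d/dlam log L = n/lam - sum(xs) = 0  =>  lam_hat = n / sum(xs)
lam_hat = len(data) / sum(data)

# Numerical check that lam_hat is a maximum, not just a critical point.
assert log_likelihood(lam_hat, data) > log_likelihood(0.9 * lam_hat, data)
assert log_likelihood(lam_hat, data) > log_likelihood(1.1 * lam_hat, data)
print(lam_hat)  # → 1.0 for this sample (n = 5, sum = 5.0)
```

This is exactly the kind of optimization exercise the paper advocates: set the derivative of the log-likelihood to zero and verify the critical point is a maximum.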
- Conference Article
- 10.1063/1.5016670
- Jan 1, 2017
The parameters of the binary probit regression model are commonly estimated with the maximum likelihood estimation (MLE) method. However, the MLE method breaks down when the binary data contain separation. Separation is the condition in which one or more independent variables perfectly separate the categories of the binary response. It causes the MLE estimators to fail to converge, so they cannot be used in modeling. One way to resolve separation is to use Firth's approach instead. This research has two aims: first, to compare the chance of separation occurring in the binary probit regression model between the MLE method and Firth's approach; second, to compare the performance of the estimators obtained by the two methods using the RMSE criterion. Both comparisons are performed by simulation under different sample sizes. The results showed that for small sample sizes the chance of separation is higher under the MLE method than under Firth's approach, while for larger sample sizes the probability decreases and is nearly identical between the two methods. Meanwhile, Firth's estimators have smaller RMSE than the MLE estimators, especially for smaller sample sizes; for larger sample sizes the RMSEs differ little. Overall, Firth's estimators outperformed the MLE estimators.
- Research Article
- 10.3389/fpsyg.2021.580015
- Jul 27, 2021
- Frontiers in Psychology
In this paper, a new item-weighted scheme is proposed to assess examinees' growth in longitudinal analysis. A multidimensional Rasch model for measuring learning and change (MRMLC) and its polytomous extension are used to fit the longitudinal item response data. The new item-weighted likelihood estimation method is suitable not only for complex longitudinal IRT models but also for unidimensional IRT models, for example, a combination of the two-parameter logistic (2PL) model and the partial credit model (PCM; Masters, 1982) with a varying number of categories. Two simulation studies are carried out to illustrate the advantages of the item-weighted likelihood estimation method over the traditional Maximum a Posteriori (MAP) estimation method, the maximum likelihood estimation (MLE) method, Warm's (1989) weighted likelihood estimation (WLE) method, and the type-weighted maximum likelihood estimation (TWLE) method. Simulation results indicate that the item-weighted likelihood estimation method recovers examinees' true ability levels better than the existing likelihood-based methods (MLE, WLE, and TWLE) and the MAP method, for both complex longitudinal IRT models and unidimensional IRT models, with smaller bias, root-mean-square error, and root-mean-square difference, especially at low and high ability levels.
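The baseline MLE of ability that the weighted methods improve upon can be sketched for the 2PL model named above. With hypothetical item parameters and responses, the ability estimate is the maximizer of the Bernoulli log-likelihood (a crude grid search stands in for Newton iteration here).

```python
import math

# Hypothetical 2PL items: (discrimination a, difficulty b).
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.5, 1.0)]
responses = [1, 1, 0, 0]  # examinee got the easier items right, harder ones wrong

def p_correct(theta, a, b):
    # 2PL item response function: P(correct) = sigmoid(a * (theta - b)).
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def loglik(theta):
    return sum(r * math.log(p_correct(theta, a, b)) +
               (1 - r) * math.log(1.0 - p_correct(theta, a, b))
               for (a, b), r in zip(items, responses))

# Grid-search MLE of the latent ability theta.
grid = [i / 100 for i in range(-400, 401)]
theta_hat = max(grid, key=loglik)
print(round(theta_hat, 2))
```

With a mixed response pattern the likelihood has an interior maximum; with all-correct or all-incorrect patterns it does not, which is one motivation for the weighted-likelihood refinements discussed in the abstract.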
- Research Article
60
- 10.1007/s00362-017-0971-z
- Dec 4, 2017
- Statistical Papers
In this study, we propose several ridge estimators for the gamma regression model (GRM), building on the work of Mansson (Econ Model 29(2):178–184, 2012), Dorugade (J Assoc Arab Univ Basic Appl Sci 15:94–99, 2014), and others. The GRM is a special form of the generalized linear model (GLM) in which the response variable is positively skewed and well fitted by the gamma distribution. The most common method for estimating the GRM coefficients is maximum likelihood (ML). The ML method performs well when the explanatory variables are uncorrelated; when the explanatory variables are correlated, however, the ML estimates become unstable and unreliable. For such situations, biased estimation methods have been proposed, the most popular of which is ridge estimation. The ridge estimators for the GRM are proposed and compared on the basis of mean squared error (MSE), and the performance gains of the proposed ridge estimators are also quantified. The comparison is carried out using a Monte Carlo simulation study and two real data sets. Results show that Kibria's and Mansson and Shukur's methods are preferred over the ML method.
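The instability that motivates ridge estimation can be shown in a minimal sketch. This uses an ordinary linear model rather than the paper's gamma GLM (an assumption made purely to keep the example short), with two nearly collinear predictors: the unpenalized estimate is erratic, while adding a ridge constant k to the diagonal of X'X stabilizes it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)     # nearly collinear with x1
X = np.column_stack([x1, x2])
y = X @ np.array([1.0, 1.0]) + 0.1 * rng.normal(size=n)

# Unpenalized ML / least-squares estimate: unstable under near-collinearity.
beta_ml = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge estimate: (X'X + k I)^{-1} X'y shrinks and stabilizes the coefficients.
k = 1.0
beta_ridge = np.linalg.solve(X.T @ X + k * np.eye(2), X.T @ y)

print(beta_ml, beta_ridge)
```

The ridge fit keeps the well-determined sum of the two coefficients near its true value of 2 while suppressing the wild, data-noise-driven split between them; choosing k is exactly what the estimators surveyed in the paper differ on.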
- Research Article
4
- 10.18187/pjsor.v19i4.4423
- Dec 6, 2023
- Pakistan Journal of Statistics and Operation Research
Researchers from various fields of science encounter phenomena of interest and seek to model their occurrence scientifically. One important modeling approach is the use of probability distributions. Probability distributions are probabilistic models with many applications in different research areas, including, but not limited to, environmental and financial studies. In this paper, we study a quartic transmuted Weibull distribution from a general quartic transmutation family of distributions as a generalization of, and an alternative to, the well-known Weibull distribution. We also investigate the practical application of this generalization by modeling climate-related data sets and checking the goodness-of-fit of the proposed model. The statistical properties of the proposed model, including non-central moments, generating functions, the survival function, and the hazard function, are derived. Several methods are considered to estimate the parameters of the proposed quartic transmuted distribution: the maximum likelihood estimation method, the maximum product of spacings method, two least-squares-based methods, and three goodness-of-fit-based estimation methods. A numerical illustration and an extensive comparative Monte Carlo simulation study are conducted to investigate the performance of the estimators under the considered inferential methods. Simulation outcomes indicated that the maximum likelihood estimation (MLE), Anderson-Darling estimation (ADE), and right-tail Anderson-Darling estimation (RADE) methods generally outperformed the other considered methods in estimation efficiency for large sample sizes, while all considered methods performed almost the same in terms of goodness-of-fit regardless of the values of the shape and transmutation parameters.
Two real-life data sets are used to demonstrate the suggested estimation methods and the applicability and flexibility of the proposed distribution compared with the Weibull, transmuted Weibull, and cubic transmuted Weibull distributions. The weighted least squares estimation (WLSE) and least squares estimation (LSE) methods provided the best-fitting estimates of the proposed distribution for the Wheaton River and rainfall data, respectively. The proposed quartic transmuted Weibull distribution provides a significantly improved fit for the two data sets compared with the other distributions.
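The MLE baseline against which the transmuted variants are compared can be sketched for the plain two-parameter Weibull. With a small hypothetical positive sample, a crude grid search over shape k and scale lam maximizes the Weibull log-likelihood (a real analysis would use Newton-type iteration, but the objective is the same).

```python
import math

# Hypothetical positive data (e.g. flood peaks), assumed Weibull(k, lam).
data = [1.2, 0.8, 2.5, 1.9, 0.6, 1.4, 2.1, 1.0]

def loglik(k, lam):
    # Sum of Weibull log-densities: log(k/lam) + (k-1) log(x/lam) - (x/lam)^k.
    return sum(math.log(k / lam) + (k - 1) * math.log(x / lam) - (x / lam) ** k
               for x in data)

# Crude grid-search MLE over shape k and scale lam, both in 0.5 .. 4.9.
k_hat, lam_hat = max(((k / 10, lam / 10) for k in range(5, 50)
                      for lam in range(5, 50)), key=lambda p: loglik(*p))
print(k_hat, lam_hat)
```

The transmuted families in the paper add transmutation parameters on top of exactly this (k, lam) pair, and the alternative estimators (maximum product of spacings, Anderson-Darling, least squares) simply swap in a different objective function over the same parameter space.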
- Research Article
20
- 10.1109/tifs.2016.2547865
- Aug 1, 2016
- IEEE Transactions on Information Forensics and Security
It is known that various types of location privacy attacks can be carried out using a personalized transition matrix that is learned for each target user, or a population transition matrix that is common to all target users. However, since many users disclose only a small amount of location information in their daily lives, the training data can be extremely sparse. The aim of this paper is to clarify the risk of location privacy attacks in this realistic situation. To achieve this aim, we propose a learning method that uses tensor factorization (or matrix factorization) to accurately estimate personalized transition matrices (or a population transition matrix) from a small amount of training data. To avoid the difficulty of directly factorizing the personalized transition matrices (or population transition matrix), our learning method first factorizes a transition count tensor (or matrix), whose elements are the transition counts the user has made, and then normalizes counts to probabilities. We focus on a localization attack, which derives the actual location of a user at a given time instant from an obfuscated trace, and compare our learning method with the maximum likelihood (ML) estimation method in both the personalized matrix mode and the population matrix mode. The experimental results using four real data sets show that the ML estimation method performs only as well as a random guess in many cases, while our learning method significantly outperforms the ML estimation method in all four data sets.
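The factorize-then-normalize step can be sketched in the matrix mode. This toy version uses a standard rank-2 nonnegative matrix factorization with multiplicative updates (an assumed stand-in for the paper's factorization machinery, not its actual algorithm) on a hypothetical sparse count matrix, then converts the smoothed counts to row-stochastic transition probabilities.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sparse transition-count matrix (rows: from-location, cols: to-location).
counts = rng.poisson(lam=1.0, size=(6, 6)).astype(float)

# Rank-2 nonnegative matrix factorization via Lee-Seung multiplicative updates.
W = rng.random((6, 2)) + 0.1
H = rng.random((2, 6)) + 0.1
for _ in range(200):
    H *= (W.T @ counts) / (W.T @ W @ H + 1e-9)
    W *= (counts @ H.T) / (W @ H @ H.T + 1e-9)

smoothed = W @ H + 1e-9  # low-rank smoothed counts, kept strictly positive
transition = smoothed / smoothed.sum(axis=1, keepdims=True)  # normalize rows to probabilities
print(transition.round(2))
```

The low-rank structure pools information across rows, which is why sparse counts still yield a usable transition matrix; direct ML estimation would just normalize the raw (mostly zero) counts row by row.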
- Research Article
16
- 10.1088/0031-9155/51/17/002
- Aug 8, 2006
- Physics in Medicine & Biology
In radiation therapy with highly energetic heavy ions, conformal irradiation of a tumour can be achieved by exploiting their advantageous features, such as good dose localization and high relative biological effectiveness around the mean range. To make effective use of these properties, it is necessary to evaluate the range of the incident ions and the deposited dose distribution in the patient's body. Several methods have been proposed to derive these physical quantities; one of them uses positron emitters generated through projectile fragmentation reactions of the incident ions with target nuclei. We previously proposed applying the maximum likelihood estimation (MLE) method to a detected annihilation gamma-ray distribution to determine the range of incident ions in a target, and demonstrated the effectiveness of the method with computer simulations. In this paper, water, polyethylene and polymethyl methacrylate targets were each irradiated with stable 12C, 14N, 16O and 20Ne beams. Except for a few combinations of incident beam and target, the MLE method could determine the range of incident ions, R_MLE, with a difference between R_MLE and the experimental range of less than 2.0 mm, with the measurement of annihilation gamma rays starting just after the 61.4 s irradiation and lasting for 500 s. In the process of evaluating the range of incident ions with the MLE method, we must calculate many physical quantities, such as the fluence and energy of both primary ions and fragments as a function of depth in the target; using these, the dose distribution can also be obtained. Thus, once the mean range of incident ions is determined with the MLE method, the annihilation gamma-ray distribution and the deposited dose distribution can be derived simultaneously.
The derived dose distributions in water for the mono-energetic heavy-ion beams of four species were compared with those measured with an ionization chamber. The good agreement between the derived and the measured distributions implies that the deposited dose distribution in a target can be estimated from the detected annihilation gamma-ray distribution with a positron camera.
- Research Article
125
- 10.1080/15459621003609713
- Feb 12, 2010
- Journal of Occupational and Environmental Hygiene
When analyzing censored datasets, where one or more measurements are below the limit of detection (LOD), the maximum likelihood estimation (MLE) method is often considered the gold standard for estimating the GM and GSD of the underlying exposure profile. A new and relatively simple substitution method, called β-substitution, is presented and compared with the MLE method and the common substitution methods (LOD/2 and LOD/√2 substitution) when analyzing a left-censored dataset with either single or multiple censoring points. A computer program was used to generate censored exposure datasets for various combinations of true geometric standard deviation (1.2 to 4), percent censoring (1% to 50%), and sample size (5 to 19 and 20 to 100). Each method was used to estimate four parameters of the lognormal distribution for the censored datasets: (1) the geometric mean (GM), (2) the geometric standard deviation (GSD), (3) the 95th percentile, and (4) the mean. When estimating the GM and GSD, the bias and root mean square error (rMSE) for the β-substitution method closely matched those for the MLE method, differing by only a small amount that decreased with increasing sample size. When estimating the mean and the 95th percentile, the bias results for the β-substitution method closely matched or bettered those for the MLE method. In addition, the overall imprecision, as indicated by the rMSE, was similar to that of the MLE method for all four parameters. The bias of the common substitution methods was highly variable, depending strongly on the range of GSD values. The β-substitution method produced results comparable to the MLE method and is considerably easier to calculate, making it an attractive alternative; in terms of bias it is clearly superior to the commonly used LOD/2 and LOD/√2 substitution methods.
The rMSE results for the two substitution methods were often comparable to those for the MLE method, but the substitution methods were often considerably biased.
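The gold-standard censored MLE and the simple LOD/2 substitution can be contrasted in a sketch (β-substitution itself is not reproduced here). With hypothetical lognormal exposure data, the censored likelihood uses the normal density for detects and the normal CDF mass for non-detects; LOD/2 substitution just plugs in a constant.

```python
import math

# Hypothetical lognormal exposure data with a detection limit (LOD).
LOD = 0.5
observed = [0.7, 1.3, 0.9, 2.4, 0.6, 1.1]   # detects (values above the LOD)
n_censored = 3                               # non-detects (below the LOD)

def norm_logpdf(z):
    return -0.5 * z * z - 0.5 * math.log(2 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def loglik(mu, sigma):
    # Censored lognormal likelihood: density for detects, CDF mass for non-detects.
    ll = sum(norm_logpdf((math.log(x) - mu) / sigma) - math.log(sigma * x)
             for x in observed)
    cens_p = norm_cdf((math.log(LOD) - mu) / sigma)
    ll += n_censored * math.log(max(cens_p, 1e-300))  # clamp to avoid log(0) at grid extremes
    return ll

# Grid-search MLE of the log-scale mean and sd, then the geometric mean.
mu_hat, sig_hat = max(((m / 100, s / 100) for m in range(-100, 101)
                       for s in range(10, 151)), key=lambda p: loglik(*p))
gm_mle = math.exp(mu_hat)

# Common LOD/2 substitution for comparison.
substituted = observed + [LOD / 2] * n_censored
gm_sub = math.exp(sum(math.log(x) for x in substituted) / len(substituted))
print(round(gm_mle, 2), round(gm_sub, 2))
```

The substitution estimate depends entirely on the arbitrary LOD/2 plug-in value, which is the source of the highly variable bias reported above; the likelihood treats the non-detects as genuine "somewhere below the LOD" information.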
- Conference Article
1
- 10.1109/vnc.2016.7835928
- Dec 1, 2016
In this paper, we focus on visible light communication (VLC) using an LED (transmitter) and a high-speed image sensor (receiver) for an intelligent transport system (ITS). The receiver of this system cannot detect the correct luminance values of each LED because the captured image is blurred by defocusing. To overcome this problem, our previous study proposed a method for demodulating data from blurred images using the maximum likelihood estimation (MLE) method. However, that method's complexity grows rapidly as the number of LEDs increases. In this paper, we propose a demodulation method whose performance approaches that of the MLE method with significantly reduced complexity. The proposed method first applies MMSE estimation to discern each LED's condition. Based on the estimation results, the symbols from each LED are divided into high-reliability and low-reliability symbols. High-reliability symbols are demodulated directly from the MMSE result, while the MLE method is applied only to the LEDs with low-reliability symbols. We conduct computer simulations and compare the performance of the proposed method with that of the previous MLE-based method.
- Research Article
2
- 10.6339/jds.2014.12(1).1213
- Mar 9, 2021
- Journal of data science
Mixtures of Weibull distributions have wide application in modeling heterogeneous data sets, and parameter estimation is one of the most important problems for such mixtures. In this paper, we propose an L-moment estimation method for a mixture of two Weibull distributions. The proposed method is compared with the maximum likelihood estimation (MLE) method with respect to bias, mean absolute error, mean total error, and the completion time of the algorithm, via a simulation study. Applications to real data sets are also given to show the flexibility and potential of the proposed estimation method. The comparison shows that the proposed method outperforms the MLE method.
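The building blocks of any L-moment method are the sample L-moments themselves. A minimal sketch of the first two (the L-location l1 and the L-scale l2, computed from probability-weighted moments of the ordered sample) looks like this; an L-moment fit then matches these statistics to their model counterparts.

```python
def sample_l_moments(xs):
    # First two sample L-moments: l1 (L-location) and l2 (L-scale),
    # via probability-weighted moments b0 and b1 of the ordered sample.
    x = sorted(xs)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))  # j = 0-based rank
    return b0, 2 * b1 - b0

l1, l2 = sample_l_moments([1.0, 2.0, 3.0])
print(l1, l2)  # l1 is the sample mean; l2 is half the mean pairwise distance
```

For the sample {1, 2, 3}, l1 = 2 and l2 = 2/3 (half of the average absolute pairwise difference 4/3), which is a quick sanity check on the formulas.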
- Research Article
- 10.5897/err2016.2807
- Aug 23, 2016
- Educational Research Review
Variance difference between maximum likelihood estimation method and expected A posteriori estimation method viewed from number of test items
- Conference Article
- 10.1109/iscid.2017.174
- Dec 1, 2017
In this paper, we consider the problem of improving the performance of an unmanned underwater vehicle (UUV) in target acquisition when navigation error is present. The mother ship guides the UUV to acquire the target based on the location information sent back by the UUV. Because of the navigation error, the probability of discovering the target is greatly decreased. Based on a model of the UUV's navigation error, a least squares (LS) estimation method and a maximum likelihood (ML) estimation method are proposed to compensate for the error, using the azimuth of the UUV measured by the sensors on the mother ship. Simulation results show that both proposed methods can greatly improve the acquisition performance, and that the ML estimation method performs better than the LS estimation method.
- Research Article
24
- 10.1038/s41598-018-20844-w
- Feb 5, 2018
- Scientific Reports
To examine for a causal relationship between vitamin D and glioma risk we performed an analysis of genetic variants associated with serum 25-hydroxyvitamin D (25(OH)D) levels using Mendelian randomisation (MR), an approach unaffected by biases from confounding. Two-sample MR was undertaken using genome-wide association study data. Single nucleotide polymorphisms (SNPs) associated with 25(OH)D levels were used as instrumental variables (IVs). We calculated MR estimates for the odds ratio (OR) for 25(OH)D levels with glioma using SNP-glioma estimates from 12,488 cases and 18,169 controls, using inverse-variance weighted (IVW) and maximum likelihood estimation (MLE) methods. A non-significant association between 25(OH)D levels and glioma risk was shown using both the IVW (OR = 1.21, 95% confidence interval [CI] = 0.90–1.62, P = 0.201) and MLE (OR = 1.20, 95% CI = 0.98–1.48, P = 0.083) methods. In an exploratory analysis of tumour subtype, an inverse relationship between 25(OH)D levels and glioblastoma (GBM) risk was identified using the MLE method (OR = 0.62, 95% CI = 0.43–0.89, P = 0.010), but not the IVW method (OR = 0.62, 95% CI = 0.37–1.04, P = 0.070). No statistically significant association was shown between 25(OH)D levels and non-GBM glioma. Our results do not provide evidence for a causal relationship between 25(OH)D levels and all forms of glioma risk. More evidence is required to explore the relationship between 25(OH)D levels and risk of GBM.
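The IVW estimate used above has a compact closed form: each SNP contributes a Wald ratio (outcome effect divided by exposure effect), and the ratios are averaged with first-order inverse-variance weights. A sketch with hypothetical summary statistics (none of these numbers are from the study):

```python
import math

# Hypothetical per-SNP summary statistics:
# (effect on 25(OH)D bx, effect on glioma by, standard error of by).
snps = [(0.10, 0.020, 0.010), (0.08, 0.015, 0.012), (0.12, 0.030, 0.015)]

# Wald ratio per instrument and first-order inverse-variance weights.
ratios = [by / bx for bx, by, se in snps]
weights = [(bx / se) ** 2 for bx, by, se in snps]  # 1 / Var(ratio), Var ≈ (se / bx)^2

beta_ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
or_ivw = math.exp(beta_ivw)  # odds-ratio scale for the binary glioma outcome
print(round(beta_ivw, 3), round(or_ivw, 3))
```

The MLE variant reported in the study maximizes a joint likelihood over the SNP-exposure and SNP-outcome associations instead of taking this weighted average, which is why the two methods can give slightly different confidence intervals for the same data.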
- Research Article
196
- 10.1371/journal.pone.0027731
- Nov 21, 2011
- PLoS ONE
Statistical methods for phylogeny estimation, especially maximum likelihood (ML), offer high accuracy with excellent theoretical properties. However, RAxML, the current leading method for large-scale ML estimation, can require weeks or longer when used on datasets with thousands of molecular sequences. Faster methods for ML estimation, among them FastTree, have also been developed, but their relative performance to RAxML is not yet fully understood. In this study, we explore the performance with respect to ML score, running time, and topological accuracy, of FastTree and RAxML on thousands of alignments (based on both simulated and biological nucleotide datasets) with up to 27,634 sequences. We find that when RAxML and FastTree are constrained to the same running time, FastTree produces topologically much more accurate trees in almost all cases. We also find that when RAxML is allowed to run to completion, it provides an advantage over FastTree in terms of the ML score, but does not produce substantially more accurate tree topologies. Interestingly, the relative accuracy of trees computed using FastTree and RAxML depends in part on the accuracy of the sequence alignment and dataset size, so that FastTree can be more accurate than RAxML on large datasets with relatively inaccurate alignments. Finally, the running times of RAxML and FastTree are dramatically different, so that when run to completion, RAxML can take several orders of magnitude longer than FastTree to complete. Thus, our study shows that very large phylogenies can be estimated very quickly using FastTree, with little (and in some cases no) degradation in tree accuracy, as compared to RAxML.
- Research Article
6
- 10.1177/1470785318796950
- Aug 30, 2018
- International Journal of Market Research
Using the Monte Carlo simulation method, this study analyzes how fit indices are affected by the degree of non-normality of the variables, the sample size, and the choice of estimation method. To address these issues, we use the causal model of consumer involvement elaborated by Mittal and Lee. Results of this study show that the adjusted goodness-of-fit index (AGFI) and the goodness-of-fit index (GFI) are sensitive to sample size, and their use requires a sample of at least 300 observations to be reliable. The comparative fit index (CFI) and the root mean square error of approximation (RMSEA) are more reliable with generalized least squares (GLS) than with the maximum likelihood estimation (MLE) method across different settings of sample size and degree of non-normality. Finally, the standardized root mean square residual (SRMR) is recommended for use with the MLE method. This study provides prescriptions for the choice of fit indices and the requirements on sample size and estimation method when testing the causal model of consumer involvement. The approach used here can be extended to any model before fitting it to real data, and it helps researchers avoid conflicting results arising from the choice of fit indices.
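Two of the indices discussed above have simple closed forms that are worth seeing worked out. With hypothetical fit results (the numbers below are illustrative, not from the study), RMSEA penalizes the per-degree-of-freedom excess of the model chi-square, and CFI compares that excess with the baseline model's.

```python
import math

# Hypothetical fit results: model chi-square/df, baseline chi-square/df, sample size.
chi2_m, df_m = 85.0, 40
chi2_b, df_b = 600.0, 45
n = 300

# RMSEA = sqrt(max(chi2 - df, 0) / (df * (n - 1)))
rmsea = math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))

# CFI = 1 - max(chi2_m - df_m, 0) / max(chi2_b - df_b, chi2_m - df_m, 0)
cfi = 1.0 - max(chi2_m - df_m, 0.0) / max(chi2_b - df_b, chi2_m - df_m, 0.0)
print(round(rmsea, 3), round(cfi, 3))
```

The explicit dependence of RMSEA on n in the denominator is one reason its behaviour shifts with sample size, which is exactly the kind of sensitivity the simulation study quantifies.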
- Research Article
15
- 10.1088/0031-9155/53/3/002
- Jan 7, 2008
- Physics in Medicine & Biology
In order to effectively utilize the prominent properties of heavy ions in radiotherapy, it is important to evaluate both the position of the field irradiated with incident ions and the absorbed dose distribution in a patient's body. One of the methods for this purpose is the utilization of the positron emitters produced through the projectile fragmentation reactions of stable heavy ions with target nuclei. In heavy-ion therapy, spread-out Bragg peak (SOBP) beams are used to achieve uniform biological dose distributions in the whole tumor volume. Therefore, in this study, we designed SOBP beams of 30 and 50 mm water-equivalent length (mmWEL) in width for 12C and 16O, and carried out irradiation experiments using them. Water, polyethylene and polymethyl methacrylate were selected as targets to simulate a human body. Pairs of annihilation gamma rays were detected by means of a limited-angle positron camera for 500 s, and annihilation gamma-ray distributions were obtained. The maximum likelihood estimation (MLE) method was applied to the detected distributions for evaluating the positions of the distal and proximal edges of the SOBP in a target. The differences between the positions evaluated with the MLE method and those derived from the measured dose distributions were less than 1.7 mm and 2.5 mm for the distal and the proximal edge, respectively, in all irradiation conditions. When the positions of both edges are determined with the MLE method, the most probable shape of the dose distribution in a target can be estimated simultaneously. The close agreement between the estimated and the measured distributions implied that the shape of the dose distribution in an irradiated target could be evaluated from the detected annihilation gamma-ray distribution.