Reliability, Resilience, and Alerts: Preferences for Autonomous Vehicles in the United States

Abstract

Self-driving vehicle (SDV) safety and reliability are becoming critical design parameters as SDVs increase their market share. This paper examines public preferences for key SDV safety features (system reliability, sensor resilience, failure behavior, and driver alert methods) using a choice-based conjoint survey of 403 U.S. respondents. A novel integration of conjoint analysis with Least Absolute Shrinkage and Selection Operator (LASSO) regression and generalized linear mixed-effects models (GLMMs) was applied to identify the most influential features and their demographic or behavioral predictors. Results show that multimodal driver alerts (i.e., audio + visual) were the most influential factor, accounting for nearly two-thirds of decision weight. System reliability (i.e., low human intervention rates) and sensor resilience (i.e., low tolerance for failures) were secondary, while failure behavior had minimal influence. Subgroup analyses revealed modest variations by willingness to pay for SDVs, income, race/ethnicity, marital status, education, driving frequency, and risk propensity, though the importance of alerts and reliability remained consistent across groups. This combined conjoint-LASSO-GLMM framework enhances the precision of preference estimation and offers actionable guidance for SDV manufacturers seeking to align safety feature design with consumer expectations.
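The conjoint-plus-LASSO stage of this framework can be sketched at toy scale. The snippet below is a hedged illustration using scikit-learn, not the paper's actual model: the attribute names, dummy coding, effect sizes, and sample are all invented, and the GLMM stage is omitted.

```python
# Hedged sketch: L1-penalized logistic regression on simulated
# choice-based conjoint data. Attribute names and effect sizes are
# illustrative assumptions, not the paper's design or estimates.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
# Dummy-coded attribute levels for each choice alternative (assumed):
# multimodal alert, high reliability, sensor resilience, graceful failure
X = rng.integers(0, 2, size=(n, 4)).astype(float)
true_beta = np.array([1.5, 0.7, 0.4, 0.0])  # alerts dominate; failure has ~no effect
p = 1.0 / (1.0 + np.exp(-(X @ true_beta - 1.0)))
y = rng.binomial(1, p)

# The L1 penalty shrinks weak attributes toward zero (feature selection)
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
coefs = dict(zip(["alert", "reliability", "resilience", "failure"], model.coef_[0]))
print(coefs)
```

The L1 penalty drives the uninformative failure-behavior coefficient toward zero while the dominant alert attribute survives, mirroring how the framework separates influential features from negligible ones.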

Similar Papers
  • Research Article
  • Citations: 2
  • 10.1080/10556788.2013.801970
A gradient method for the monotone fused least absolute shrinkage and selection operator
  • Jun 18, 2013
  • Optimization Methods and Software
  • Yu Xia + 1 more

We propose the monotone fused least absolute shrinkage and selection operator (LASSO) model and develop a continuous algorithm for it. The LASSO model is a special case of the fused LASSO model: the LASSO technique improves prediction accuracy and reduces the number of predictors, while the fused LASSO procedure additionally encourages flatness of the regression coefficients. The monotone fused LASSO model describes regression with monotonic constraints better than the fused LASSO model does. We adapt Nesterov's fast gradient methods to the monotone fused LASSO model, give closed-form solutions for each iteration, prove the boundedness of the optimal solution set, and provide convergence results. Numerical examples are provided and discussed. Our approach can easily be adapted to related problems, such as monotone regression and the fused LASSO model.
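As a hedged, partial illustration of the monotone component, isotonic regression fits a non-decreasing step function to noisy data; the fused-LASSO difference penalty would additionally encourage flat segments. This sketch (simulated data, scikit-learn) does not implement the paper's Nesterov-type gradient method.

```python
# Hedged illustration: the monotone constraint seen in isolation via
# isotonic regression, which fits a non-decreasing step function.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
x = np.arange(50)
# Noisy but broadly increasing signal
y = np.sort(rng.normal(size=50)) + rng.normal(scale=0.3, size=50)

fit = IsotonicRegression().fit_transform(x, y)  # fitted non-decreasing values
print(np.all(np.diff(fit) >= 0))
```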

  • Research Article
  • Citations: 32
  • 10.1109/access.2019.2909071
Hi-LASSO: High-Dimensional LASSO
  • Jan 1, 2019
  • IEEE Access
  • Youngsoon Kim + 4 more

High-throughput genomic technologies are leading to a paradigm shift in computational biology research. Computational analysis of high-dimensional data and its interpretation are essential for understanding complex biological systems. Most biological data (e.g., gene expression and DNA sequence data) are high-dimensional but contain far fewer samples than predictors. Such high-dimension, low-sample-size (HDLSS) data often cause computational challenges in biological data analysis. A number of least absolute shrinkage and selection operator (LASSO) methods have been widely used for identifying biomarkers or prognostic factors in bioinformatics. The LASSO solution has been improved through the development of LASSO derivatives, including elastic-net, adaptive LASSO, relaxed LASSO, VISA, random LASSO, and recursive LASSO. However, the existing LASSO solutions have several known limitations: multicollinearity (particularly with different signs), subset size limitation, and the lack of a statistical test of significance. We propose a high-dimensional LASSO (Hi-LASSO) that theoretically improves the LASSO model, providing better performance of both prediction and feature selection on extremely high-dimensional data. Hi-LASSO alleviates bias introduced by bootstrapping, refines importance scores, improves performance by taking advantage of the global oracle property, provides a statistical strategy to determine the number of bootstrap iterations, and allows tests of significance for feature selection with an appropriate distribution. The performance of Hi-LASSO was assessed by comparing it with existing state-of-the-art LASSO methods in extensive simulation experiments with multiple data settings. Hi-LASSO was also applied to survival analysis with GBM gene expression data.
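The bootstrapped-LASSO ingredient that Hi-LASSO refines can be sketched as below: fit LASSO on bootstrap resamples over random predictor subsets and use selection frequency as an importance score. This is a hedged simplification; Hi-LASSO's weighting scheme and significance test are more involved, and all sizes here are arbitrary.

```python
# Hedged sketch of bootstrapped LASSO importance scores on HDLSS data
# (fewer samples than predictors); not the actual Hi-LASSO procedure.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(11)
n, p = 80, 200                      # HDLSS: fewer samples than predictors
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.5, size=n)

freq = np.zeros(p)
B = 30
for _ in range(B):
    rows = rng.integers(0, n, size=n)              # bootstrap sample
    cols = rng.choice(p, size=50, replace=False)   # random predictor subset
    coef = Lasso(alpha=0.1).fit(X[rows][:, cols], y[rows]).coef_
    freq[cols[coef != 0]] += 1                     # count selections
importance = freq / B
print(importance[:2].round(2))
```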

  • Conference Article
  • 10.23919/ccc50068.2020.9188556
Ultra-short-term Interval Prediction of Wind Farm Cluster Power Based on LASSO
  • Jul 1, 2020
  • Yan Zhou + 4 more

Efficient and accurate power prediction for wind farm clusters is an effective way to improve the safety and reliability of power systems with large-scale wind power. In this paper, a probabilistic prediction model for regional wind power is studied. A nonparametric method based on the least absolute shrinkage and selection operator (LASSO) is used for ultra-short-term probabilistic prediction: a nonlinear quantile regression (NQR) model is built from quantile regression (QR) and an extreme learning machine (ELM), and LASSO is then used to shrink the output weights for sparsity. The LASSO penalty can prevent overfitting and improve the performance of the prediction intervals (PIs) without reducing computational efficiency. Using an actual dataset from wind farms in northeast China, the performance of the PIs is verified against other well-established benchmarks.

  • Research Article
  • Citations: 90
  • 10.1111/j.1541-0420.2006.00660.x
Predicting Patient Survival from Microarray Data by Accelerated Failure Time Modeling Using Partial Least Squares and LASSO
  • Mar 1, 2007
  • Biometrics
  • Susmita Datta + 2 more

We consider the problem of predicting survival times of cancer patients from the gene expression profiles of their tumor samples via linear regression modeling of log-transformed failure times. The partial least squares (PLS) and least absolute shrinkage and selection operator (LASSO) methodologies are used for this purpose, where we first modify the data to account for censoring. Three approaches to handling right-censored data (reweighting, mean imputation, and multiple imputation) are considered. Their performances are examined in a detailed simulation study and compared with those of full-data PLS and LASSO had there been no censoring. A major objective of this article is to investigate the performance of PLS and LASSO in the context of microarray data, where the number of covariates is very large and there are extremely few samples. We demonstrate that LASSO outperforms PLS in terms of prediction error when the list of covariates includes a moderate to large percentage of useless or noise variables; otherwise, PLS may outperform LASSO. For a moderate sample size (100, with 10,000 covariates), LASSO performed better than a no-covariate model (i.e., noise-based prediction). The mean imputation method appears to best track the performance of full-data PLS or LASSO. The mean imputation scheme is applied to an existing dataset on lung cancer; this reanalysis using mean-imputed PLS and LASSO identifies a number of genes known from previous studies to be related to cancer or tumor activities.
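The mean-imputation idea can be sketched crudely: replace each censored log-time by the average of the uncensored log-times exceeding it, then run LASSO. This is a simplified stand-in (the article's imputation is based on the Kaplan-Meier estimator), with simulated data and an arbitrary penalty.

```python
# Hedged, heavily simplified sketch of mean imputation for right-censored
# log failure times before LASSO; not the article's KM-based scheme.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, p = 120, 30
X = rng.normal(size=(n, p))
log_t = X[:, 0] + rng.normal(scale=0.5, size=n)   # true signal in column 0
censored = rng.random(n) < 0.3
obs = log_t.copy()
# Censoring shortens the observed times
obs[censored] -= np.abs(rng.normal(scale=0.5, size=censored.sum()))

unc = obs[~censored]
imputed = obs.copy()
for i in np.where(censored)[0]:
    tail = unc[unc > obs[i]]          # uncensored times beyond the censoring point
    if tail.size:
        imputed[i] = tail.mean()

coef = Lasso(alpha=0.1).fit(X, imputed).coef_
print(round(float(coef[0]), 2))
```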

  • Research Article
  • 10.7507/1001-5515.201508026
Generalized interaction LASSO based on alternating direction method of multipliers for liver disease classification
  • Jun 1, 2017
  • Sheng wu yi xue gong cheng xue za zhi = Journal of biomedical engineering = Shengwu yixue gongchengxue zazhi
  • Jing Li + 2 more

Features of liver disease, and the interactions between them, are of great significance for liver disease classification. Building on the least absolute shrinkage and selection operator (LASSO) and the interaction LASSO, a generalized interaction LASSO model is proposed in this paper for liver disease classification and compared with other methods. First, the generalized interaction logistic classification model was constructed and LASSO penalty constraints were added to the interaction model parameters. The model parameters were then solved by an efficient alternating direction method of multipliers (ADMM) algorithm, yielding sparse parameter estimates. Finally, test samples were fed to the model and classification results were obtained by choosing the class with the largest predicted probability. Experimental results on the liver disorder dataset and the Indian liver dataset showed that the coefficients of the model's interaction features were nonzero, indicating that interaction features contributed to classification. The accuracy of the generalized interaction LASSO method is better than that of the interaction LASSO method and of traditional pattern recognition methods. The generalized interaction LASSO method can also be extended to other disease classification areas.
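A hedged sketch of the core interaction-LASSO idea: an L1-penalized logistic model over main effects plus pairwise interactions. The ADMM solver and the generalized model are not reproduced; this uses scikit-learn's built-in solver on simulated data.

```python
# Hedged sketch: LASSO over main effects plus pairwise interactions,
# the core idea behind interaction-LASSO classification.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 4))
logit = X[:, 0] + 1.5 * X[:, 1] * X[:, 2]          # an interaction drives the label
y = (logit + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Main effects + all pairwise interaction columns (10 features for 4 inputs)
Z = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False).fit_transform(X)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(Z, y)
print(clf.coef_[0].round(2))  # index 7 corresponds to the x1*x2 interaction
```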

  • Research Article
  • Citations: 2
  • 10.1016/j.brainresbull.2024.110992
Mental workload evaluation using weighted phase lag index and coherence features extracted from EEG data
  • May 31, 2024
  • Brain Research Bulletin
  • Somayeh B Shafiei + 2 more


  • Research Article
  • Citations: 3
  • 10.4236/ojs.2020.101009
Variable Selection via Biased Estimators in the Linear Regression Model
  • Jan 1, 2020
  • Open Journal of Statistics
  • Manickavasagar Kayanan + 1 more

The Least Absolute Shrinkage and Selection Operator (LASSO) is used for variable selection and for handling the multicollinearity problem simultaneously in the linear regression model. LASSO produces estimates with high variance if the number of predictors exceeds the number of observations and if high multicollinearity exists among the predictor variables. To handle this problem, the Elastic Net (ENet) estimator was introduced by combining LASSO with the Ridge estimator (RE). The solutions of LASSO and ENet have been obtained using the Least Angle Regression (LARS) and LARS-EN algorithms, respectively. In this article, we propose an alternative algorithm to overcome these issues in LASSO by combining it with other existing biased estimators, namely the Almost Unbiased Ridge Estimator (AURE), Liu Estimator (LE), Almost Unbiased Liu Estimator (AULE), Principal Component Regression Estimator (PCRE), r-k class estimator, and r-d class estimator. Further, we examine the performance of the proposed algorithm using a Monte Carlo simulation study and real-world examples. The results showed that the LARS-rk and LARS-rd algorithms, which combine LASSO with the r-k and r-d class estimators, respectively, outperformed the other algorithms under moderate and severe multicollinearity.
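A hedged sketch of the LARS machinery the proposed algorithms build on: scikit-learn's lars_path computes the full LASSO path and reports the order in which variables enter. The r-k and r-d class combinations themselves are not available in standard libraries.

```python
# Hedged sketch: the LASSO solution path via LARS on simulated data.
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 8))
y = 3 * X[:, 2] + rng.normal(scale=0.5, size=100)

# active lists the variables in the order they enter the path
alphas, active, coefs = lars_path(X, y, method="lasso")
print(active[0])  # the strongest predictor enters first
```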

  • Research Article
  • Citations: 9
  • 10.1371/journal.pone.0296625
Predicting and identifying factors associated with undernutrition among children under five years in Ghana using machine learning algorithms.
  • Feb 13, 2024
  • PLOS ONE
  • Eric Komla Anku + 1 more

Undernutrition among children under the age of five is a major public health concern, especially in developing countries. This study aimed to use machine learning (ML) algorithms to predict undernutrition and identify its associated factors. Secondary data analysis of the 2017 Multiple Indicator Cluster Survey (MICS) was performed using R and Python. The main outcomes of interest were undernutrition (stunting: height-for-age (HAZ) < -2 SD; wasting: weight-for-height (WHZ) < -2 SD; and underweight: weight-for-age (WAZ) < -2 SD). Seven ML algorithms were trained and tested: linear discriminant analysis (LDA), logistic regression, support vector machine (SVM), random forest (RF), least absolute shrinkage and selection operator (LASSO), ridge regression, and extreme gradient boosting (XGBoost). The models were evaluated using accuracy, the confusion matrix, and the area under the receiver operating characteristic (ROC) curve (AUC). In total, 8564 children were included in the final analysis. The average age of the children was 926 days, and the majority were female. The weighted prevalence rates of stunting, wasting, and underweight were 17%, 7%, and 12%, respectively. Model accuracies for wasting were LDA: 84%, logistic: 95%, SVM: 92%, RF: 94%, LASSO: 96%, ridge: 84%, and XGBoost: 98%; for stunting, LDA: 86%, logistic: 86%, SVM: 98%, RF: 88%, LASSO: 86%, ridge: 86%, and XGBoost: 98%; and for underweight, LDA: 90%, logistic: 92%, SVM: 98%, RF: 89%, LASSO: 92%, ridge: 88%, and XGBoost: 98%. The AUC values for wasting were LDA: 99%, logistic: 100%, SVM: 72%, RF: 94%, LASSO: 99%, ridge: 59%, and XGBoost: 100%; for stunting, LDA: 89%, logistic: 90%, SVM: 100%, RF: 92%, LASSO: 90%, ridge: 89%, and XGBoost: 100%; and for underweight, LDA: 95%, logistic: 96%, SVM: 100%, RF: 94%, LASSO: 96%, ridge: 82%, and XGBoost: 82%. Age, weight, length/height, sex, region of residence, and ethnicity were important predictors of wasting, stunting, and underweight. The XGBoost model was the best model for predicting wasting, stunting, and underweight. The findings showed that different ML algorithms can be useful for predicting undernutrition and identifying important predictors for targeted interventions among children under five years in Ghana.
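The comparison loop can be sketched as below, with the MICS survey replaced by a toy dataset, XGBoost replaced by a random forest, and only three of the seven algorithms shown; the split, metrics, and hyperparameters are illustrative assumptions.

```python
# Hedged sketch: several classifiers trained on the same split and
# scored by held-out accuracy and AUC, mirroring the comparison design.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "lasso-logistic": LogisticRegression(penalty="l1", solver="liblinear"),
    "random-forest": RandomForestClassifier(random_state=0),
}
scores = {}
for name, m in models.items():
    m.fit(Xtr, ytr)
    proba = m.predict_proba(Xte)[:, 1]
    scores[name] = (accuracy_score(yte, m.predict(Xte)), roc_auc_score(yte, proba))
print(scores)
```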

  • Research Article
  • 10.1002/pds.70165
Use of Machine Learning to Compare Disease Risk Scores and Propensity Scores Across Complex Confounding Scenarios: A Simulation Study
  • Jun 1, 2025
  • Pharmacoepidemiology and Drug Safety
  • Yuchen Guo + 3 more

Purpose: In the second quarter of 2020, the surge of new COVID-19 treatments produced settings with low treatment prevalence and high outcome risk. Motivated by this, we conducted a simulation study comparing disease risk scores (DRS) and propensity scores (PS) across a range of treatment prevalences and outcome risks. Method: Four methods were used to estimate PS and DRS: logistic regression (reference method), least absolute shrinkage and selection operator (LASSO), multilayer perceptron (MLP), and XgBoost. Monte Carlo simulations generated data across 25 scenarios varying in treatment prevalence, outcome risk, data complexity, and sample size. Average treatment effects were calculated after matching, and relative bias and average absolute standardized mean difference (ASMD) were reported. Result: Estimation bias increased as treatment prevalence decreased. DRS showed lower bias than PS when treatment prevalence was below 0.1, especially in nonlinear data; however, DRS did not outperform PS in linear or small-sample data. PS had comparable or lower bias than DRS when treatment prevalence was 0.1–0.5. The three machine learning (ML) methods performed similarly, with LASSO and XgBoost outperforming the reference method in some nonlinear scenarios. ASMD results indicated that DRS was less affected by decreasing treatment prevalence than PS. Conclusion: Under nonlinear data, DRS reduced bias compared with PS in scenarios with low treatment prevalence, while PS was preferable when treatment prevalence exceeded 0.1, regardless of outcome risk. ML methods can outperform logistic regression for PS and DRS estimation. Both decreasing sample size and adding nonlinearity and nonadditivity increased bias for all methods tested.
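A hedged sketch of the two score types, each estimated with L1-penalized logistic regression on simulated data: a propensity score models treatment given covariates, while a disease risk score models the outcome given covariates among the untreated. Matching and bias computation from the simulation are omitted.

```python
# Hedged sketch: propensity score (PS) and disease risk score (DRS)
# estimation with L1-penalized logistic regression on simulated data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 1000
X = rng.normal(size=(n, 6))
treat = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
outcome = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 1] + 0.5 * treat))))

# PS: treatment ~ covariates (whole sample)
ps_model = LogisticRegression(penalty="l1", solver="liblinear").fit(X, treat)
# DRS: outcome ~ covariates, fit among the untreated only
drs_model = LogisticRegression(penalty="l1", solver="liblinear").fit(
    X[treat == 0], outcome[treat == 0])

ps = ps_model.predict_proba(X)[:, 1]
drs = drs_model.predict_proba(X)[:, 1]
print(round(float(ps.mean()), 2), round(float(drs.mean()), 2))
```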

  • Research Article
  • Citations: 14
  • 10.1016/j.ijrobp.2021.01.017
Flogging a Dead Salmon? Reduced Dose Posterior to Prostate Correlates With Increased PSA Progression in Voxel-Based Analysis of 3 Randomized Phase 3 Trials
  • Apr 20, 2021
  • International Journal of Radiation Oncology*Biology*Physics
  • Jane Shortall + 8 more


  • Research Article
  • 10.1017/s0021859624000479
Least absolute shrinkage and selection operator regression used to select important features when predicting wheat yield from various genotype groups
  • Jun 1, 2024
  • The Journal of Agricultural Science
  • Muhuddin Rajin Anwar + 5 more

Bread wheat and durum wheat genotypes were grown in field experiments at two locations in New South Wales, Australia, across several years and using two sowing times ('early' v. 'late'). Genotypes were grouped based on genetic similarity. Grain yield, grain size, soil characteristics, and daily weather data were collected, and the weather data were used to calculate water and heat stress indices for four key growth periods around flowering. Least absolute shrinkage and selection operator (LASSO) regression was used to predict grain yield and to identify the most influential features (a combination of index and growth period). A novel approach involving the crop water supply–demand ratio effectively summarized water relations during growth. LASSO predicted grain yield quite well (adjusted R2 from 0.57 to 0.98), especially in a set of durum genotypes, although the addition of other important variables such as lodging score, disease incidence, weed incidence, and insect damage could have improved the modelling results. Growth period 2 (30 days pre-flowering up to flowering) was the most sensitive to yield loss from heat and water stress for most features, although one group of bread wheat genotypes was more sensitive to water stress (drought) in period 3 (20 days pre-flowering to 10 days post-flowering). Evapotranspiration was a significant positive feature, but only in the vegetative phase (pre-flowering, period 1). This study confirms the usefulness of LASSO modelling as a technique for making predictions that could identify genotypes suitable for further investigation by breeders for their stress tolerance.
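A hedged sketch of the modelling step, with the stress-index features replaced by simulated regressors: cross-validated LASSO selects features, and an adjusted R² is computed for the retained model. The adjusted-R² formula and penalty grid are standard choices, not taken from the paper.

```python
# Hedged sketch: cross-validated LASSO with an adjusted-R² check on
# the selected model; regressors are simulated stand-ins.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(7)
n, p = 200, 15
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] - X[:, 3] + rng.normal(scale=0.5, size=n)

model = LassoCV(cv=5).fit(X, y)
r2 = model.score(X, y)
k = int(np.sum(model.coef_ != 0))               # number of retained features
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)   # adjusted R-squared
print(k, round(adj_r2, 2))
```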

  • Research Article
  • Citations: 87
  • 10.1371/journal.pone.0089700
Using Multivariate Regression Model with Least Absolute Shrinkage and Selection Operator (LASSO) to Predict the Incidence of Xerostomia after Intensity-Modulated Radiotherapy for Head and Neck Cancer
  • Feb 28, 2014
  • PLoS ONE
  • Tsair-Fwu Lee + 12 more

Purpose: The aim of this study was to develop a multivariate logistic regression model with the least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials: Quality-of-life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used for endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with a bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R2, chi-squared test, Omnibus test, Hosmer-Lemeshow test, and the AUC. Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by the Hosmer-Lemeshow test and AUC for the 3-month time point (Dmean-c, Dmean-i, and age), and five for the 12-month time point (Dmean-i, education, Dmean-c, smoking, and T stage). The overall performance of the NTCP model at both time points, in terms of scaled Brier score, Omnibus test, and Nagelkerke R2, was satisfactory and corresponded well with the expected values. Conclusions: Multivariate NTCP models with LASSO can be used to predict patient-rated xerostomia after IMRT.

  • Research Article
  • Citations: 182
  • 10.1186/s12874-016-0254-8
Least absolute shrinkage and selection operator type methods for the identification of serum biomarkers of overweight and obesity: simulation and application
  • Nov 14, 2016
  • BMC Medical Research Methodology
  • Monica M Vasquez + 5 more

Background: The study of circulating biomarkers and their association with disease outcomes has become progressively complex due to advances in measuring these biomarkers through multiplex technologies. The Least Absolute Shrinkage and Selection Operator (LASSO) is a data analysis method that may be utilized for biomarker selection in such high-dimensional data. However, it is unclear which LASSO-type method is preferable under data scenarios common in serum biomarker research, such as high correlation between biomarkers, weak associations with the outcome, and a sparse number of true signals. The goal of this study was to compare the LASSO to five LASSO-type methods under these scenarios. Methods: A simulation study was performed to compare the LASSO, Adaptive LASSO, Elastic Net, Iterated LASSO, Bootstrap-Enhanced LASSO, and Weighted Fusion for the binary logistic regression model. The simulation study was designed to reflect the data structure of the population-based Tucson Epidemiological Study of Airway Obstructive Disease (TESAOD), specifically the sample size (N = 1000 for the total population, 500 for sub-analyses), correlation of biomarkers (0.20, 0.50, 0.80), prevalence of overweight (40%) and obese (12%) outcomes, and the association of outcomes with standardized serum biomarker concentrations (log-odds ratio = 0.05–1.75). Each LASSO-type method was then applied to the TESAOD data of 306 overweight, 66 obese, and 463 normal-weight subjects with a panel of 86 serum biomarkers. Results: Based on the simulation study, no method had an overall superior performance. The Weighted Fusion correctly identified more true signals but incorrectly included more noise variables; the LASSO and Elastic Net correctly identified many true signals and excluded more noise variables. In the application study, biomarkers of overweight and obesity selected by all methods were Adiponectin, Apolipoprotein H, Calcitonin, CD14, Complement 3, C-reactive protein, Ferritin, Growth Hormone, Immunoglobulin M, Interleukin-18, Leptin, Monocyte Chemotactic Protein-1, Myoglobin, Sex Hormone Binding Globulin, Surfactant Protein D, and YKL-40. Conclusions: For the data scenarios examined, the choice of optimal LASSO-type method depended on the data structure and should be guided by the research objective. The LASSO-type methods identified biomarkers with known associations to obesity and obesity-related conditions.
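A toy, hedged version of the LASSO-versus-Elastic-Net comparison under correlated predictors (correlation 0.8, two true signals), echoing the simulation design at small scale; penalty strengths are arbitrary choices, not the study's settings.

```python
# Hedged sketch: LASSO vs Elastic Net selection under correlated
# predictors with a sparse true signal.
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(8)
n, p, rho = 500, 20, 0.8
base = rng.normal(size=(n, 1))
# All columns share a common factor, giving pairwise correlation ~rho
X = np.sqrt(rho) * base + np.sqrt(1 - rho) * rng.normal(size=(n, p))
y = X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n)

lasso_sel = np.flatnonzero(Lasso(alpha=0.05).fit(X, y).coef_)
enet_sel = np.flatnonzero(ElasticNet(alpha=0.05, l1_ratio=0.5).fit(X, y).coef_)
print(len(lasso_sel), len(enet_sel))
```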

  • Research Article
  • 10.47352/jmans.2774-3047.251
Performance of Ridge Regression, Least Absolute Shrinkage and Selection Operator, and Elastic Net in Overcoming Multicollinearity
  • Feb 23, 2025
  • Journal of Multidisciplinary Applied Natural Science
  • Dewi Retno Sari Saputro + 2 more

Multicollinearity is a violation of the assumptions of multiple linear regression analysis that can occur when there is high correlation between the independent variables; the same applies to variants of the multiple linear regression model such as the Geographically Weighted Regression (GWR) model. Multicollinearity makes parameter estimation using the quadratic method (QM) unstable and produces large variances, whereas what is wanted in parameter estimation is an estimator with minimum variance, even if it is biased. Thus, one way to overcome multicollinearity is to use biased estimators such as Ridge Regression (RR), the Least Absolute Shrinkage and Selection Operator (LASSO), and the Elastic Net (EN). In RR, the Least Square Method (LSM) coefficients are shrunk toward zero, but RR cannot select independent variables; the parameters obtained from ridge regression are biased, although the variance of the resulting regression coefficients is relatively small. In addition, RR becomes increasingly difficult to interpret as the number of independent variables grows. LASSO, meanwhile, is a computational method that uses quadratic programming and can apply the shrinkage principle of RR while also performing variable selection; it became widely known after the discovery of the Least-Angle Regression (LARS) algorithm, and it can shrink LSM coefficients exactly to zero to perform variable selection. LASSO also has weaknesses, which is why the EN is used. In this article, the performance of the three methods is compared from a mathematical perspective. In summary: RR is helpful for clustering effects, where collinear features can be selected together; LASSO is suited to feature selection when the dataset has features with poor predictive power; and EN combines LASSO and RR, which has the potential to lead to simple and predictive models.
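The contrast drawn above can be shown numerically: on collinear data, ridge shrinks coefficients but leaves none exactly zero, while LASSO zeroes some out. This is a hedged illustration with simulated data and arbitrary penalty strengths; the Elastic Net would sit between the two.

```python
# Hedged illustration: ridge keeps all predictors, LASSO zeroes some
# out, on highly collinear simulated data.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(9)
n = 200
z = rng.normal(size=n)
# Six near-duplicate columns: severe multicollinearity
X = np.column_stack([z + rng.normal(scale=0.1, size=n) for _ in range(6)])
y = z + rng.normal(scale=0.5, size=n)

ridge_zeros = int(np.sum(Ridge(alpha=1.0).fit(X, y).coef_ == 0))
lasso_zeros = int(np.sum(Lasso(alpha=0.1).fit(X, y).coef_ == 0))
print(ridge_zeros, lasso_zeros)
```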

  • Research Article
  • Citations: 14
  • 10.1186/1753-6561-3-s7-s21
Comparison between the stochastic search variable selection and the least absolute shrinkage and selection operator for genome-wide association studies of rheumatoid arthritis
  • Dec 1, 2009
  • BMC Proceedings
  • Sudeep Srivastava + 1 more

Because multiple loci control complex diseases, there is great interest in testing markers simultaneously instead of one by one. In this paper, we applied two model selection algorithms, stochastic search variable selection (SSVS) and the least absolute shrinkage and selection operator (LASSO), to two quantitative phenotypes related to rheumatoid arthritis (RA). The Genetic Analysis Workshop 16 data include 2,062 unrelated individuals and 545,080 single-nucleotide polymorphism markers from the Illumina 550 k chip. We performed our analyses on the cases, as the quantitative phenotype data were not provided for the controls, and compared the performance of the two algorithms. Using sure independence screening as the prescreening procedure, both SSVS and LASSO give small models. No markers were identified in the human leukocyte antigen region of chromosome 6 that has been shown to be associated with RA. SSVS and LASSO identify seven common loci, some of which are on the genes LRRC8D, LRP1B, and COLEC12; these genes have not been reported to be associated with RA. LASSO also identified a common locus on the gene KTCD21 for the two phenotypes (markers rs230662 and rs483731, respectively). SSVS outperforms LASSO in simulation studies. Both SSVS and LASSO give small models on the RA data; however, this depends on the model parameters. We also demonstrate the ability of both LASSO and SSVS to handle more markers than the number of samples.
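A hedged sketch of the sure-independence-screening-then-LASSO pipeline on simulated genotype-like data (not the GAW16 markers): rank markers by marginal correlation with the phenotype, keep the top few, and run LASSO on the screened set.

```python
# Hedged sketch: sure independence screening (SIS) followed by LASSO
# on simulated 0/1/2 genotype codes with many more markers than samples.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(10)
n, p = 150, 2000
X = rng.integers(0, 3, size=(n, p)).astype(float)  # genotype coding 0/1/2
y = 1.5 * X[:, 7] + rng.normal(size=n)             # marker 7 carries the signal

# SIS: keep the markers with the largest marginal correlation with y
corr = np.abs(np.corrcoef(X.T, y)[-1, :-1])
keep = np.argsort(corr)[-20:]                      # screened set of 20 markers
coef = Lasso(alpha=0.1).fit(X[:, keep], y).coef_
selected = keep[np.flatnonzero(coef)]              # markers LASSO retains
print(sorted(selected.tolist()))
```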
