Review of automated parallel test form assembly

Abstract

This paper reviews advanced automated parallel test form assembly (ATA) methods for computer-based testing (CBT) that use artificial intelligence approaches. Parallel test forms ensure equivalent measurement accuracy of examinees' scores across different sets of question items and thereby enable assessments to be administered at any time and from any location. First, the paper organizes ATA methods into four categories: Mixed-Integer Programming (MIP), multi-form MIP, metaheuristics, and form-maximization. Second, it compares the characteristics of these methods in terms of measurement accuracy, accessibility, computational complexity, and feasible testing frequency. Based on these characteristics, the review then presents an example decision framework for selecting ATA methods according to the requirements of CBT. Finally, numerical experiments compare representative ATA methods in terms of their relative strengths and weaknesses.

Similar Papers
  • Research Article
  • Cite Count: 36
  • 10.3310/hta14200
Antenatal screening for haemoglobinopathies in primary care: a cohort study and cluster randomised trial to inform a simulation model. The Screening for Haemoglobinopathies in First Trimester (SHIFT) trial
  • Apr 1, 2010
  • Health Technology Assessment
  • E Dormandy + 18 more

To assess the effectiveness, cost-effectiveness, acceptability and feasibility of offering universal antenatal sickle cell and thalassaemia (SCT) screening in primary care when pregnancy is first confirmed and to model the cost-effectiveness of early screening in primary care versus standard care. A population-based cohort study, cluster randomised trial and refinement of a published decision model. Twenty-five general practices from two UK primary care trusts (PCTs) in two inner city boroughs with a high proportion of residents from minority ethnic groups. Practices were considered eligible if they agreed to be randomised and they were able to provide anonymous data on all eligible pregnant women. Participants were at least 18 years old and consented to take part in the evaluation. Practices were allocated to intervention, using minimisation and stratifying for PCT and number of partners at the practice, as follows: screening in primary care with parallel father testing (test offered to mother and father simultaneously; n = 8 clusters, 1010 participants); screening in primary care with sequential father testing (test offered to father only if mother identified as carrier; n = 9 clusters, 792 participants); and screening in secondary care with sequential father testing (standard care; n = 8 clusters, 619 participants). Data on gestational age at pregnancy confirmation and screening date were collected from trial practices for 6 months before randomisation in the cohort phase. The primary outcome measure was timing of SCT screening, measured as the proportion of women screened before 70 days' (10 weeks') gestation. Other outcomes included: offer of screening, rates of informed choice and proportion of women who knew the carrier status of their baby's father by 77 days (11 weeks). 
For 1441 eligible women in the cohort phase, the median [interquartile range (IQR)] gestational age at pregnancy confirmation was 7.6 weeks (6.0 to 10.7 weeks) and 74% presented in primary care before 10 weeks. The median gestational age at screening was 15.3 weeks (IQR 12.6 to 18.0 weeks). Only 4.4% were screened before 10 weeks. The median delay between pregnancy confirmation and screening was 6.9 weeks (4.7 to 9.3 weeks). In the intervention phase, 1708 pregnancies from 25 practices were assessed for the primary outcome measure. Completed questionnaires were obtained from 464 women who met eligibility criteria for the main analysis. The proportion of women screened by 10 weeks (70 days) was 9/441 (2%) in standard care, compared with 161/677 (24%) in primary care with parallel testing, and 167/590 (28%) in primary care with sequential testing. The proportion of women offered screening by 10 weeks (70 days) was 3/90 (3%) in standard care (note offer of test ascertained for questionnaire respondents only), compared with 321/677 (47%) in primary care with parallel testing, and 281/590 (48%) in primary care with sequential testing. The proportion of women screened by 26 weeks (182 days) was similar across the three groups: 324/441 (73%) in standard care, 571/677 (84%, 0.09) in primary care with parallel testing, and 481/590 (82%, 0.148) in primary care with sequential testing. The screening uptake of fathers was 51/677 (8%) in primary care with parallel testing, and 16/590 (3%) in primary care with sequential testing, and 13/441 (3%) in standard care. The predicted average total cost per pregnancy of offering antenatal SCT screening was estimated to be 13 pounds in standard care, 18.50 pounds in primary care with parallel testing, and 16.40 pounds in primary care with sequential testing. 
The incremental cost-effectiveness ratio (ICER) was 23 pounds in primary care with parallel testing and 12 pounds in primary care with sequential testing when compared with standard care. Women offered testing in primary care were as likely to make an informed choice as those offered screening by midwives later in pregnancy, but less than one-third of women overall made an informed choice about screening. Offering antenatal SCT screening as part of pregnancy-confirmation consultations significantly increased the proportion of women screened before 10 weeks (70 days), from 2% in standard care to between 16% and 27% in primary care, but additional resources may be required to implement this. There was no evidence to support offering fathers screening at the same time as women. Current Controlled Trials ISRCTN00677850.

  • Conference Article
  • Cite Count: 4
  • 10.1109/autest.2004.1436842
The TPS development of parallel automatic test systems
  • Sep 20, 2004
  • Xiao-Ping Zhu + 2 more

With current-generation and next-generation test systems focusing on testing efficiency, it is critical to develop test strategies that maximize testing throughput, make better use of the increasingly expensive instruments used in test stations, and drive down test costs. Parallel test, in which multiple units-under-test (UUTs) undergo testing simultaneously, improves testing strategy by enhancing product flow, reducing aggregate test times, and improving instrument usage. The research and development of parallel automatic test systems (PATS) discussed in this paper builds on common automatic test systems (CATS). Given that PATS possesses all the testing resources of CATS, the question is how to make better use of instruments to increase throughput and cut testing costs. This paper regards TPS (test program set) development as the key to PATS. The main tasks of TPS development for PATS cover four themes: how to analyze test requirements; how to design a test unit adapter that supports parallel test; how to design and develop a test program supporting multiple tasks and threads, which is the core of PATS; and how to manage multiple tests of PATS. We use the COTS software TestStand™ to manage parallel testing tasks easily. The development and management procedure of the PATS TPS is discussed in detail, based on the TPS development of a missile PATS.

  • Research Article
  • Cite Count: 9
  • 10.1007/bf00144432
Syphilis and blood donors: comparison of two different diagnostic strategies.
  • Feb 1, 1996
  • European Journal of Epidemiology
  • M M D'Errico + 5 more

In this study the validity of the methods provided for by Italian law (VDRL or RPR tests) was compared with the diagnostic strategy suggested by WHO (the use of VDRL and TPHA tests in parallel). Sensitivity, specificity and the posterior probability of infection after a positive or a negative result were estimated. Applying the two tests in parallel produces a statistically significant increase in sensitivity, from 47% to 98%, while the increase in the proportion of false positives is not significant (from 15% to 16%). The probability of infection after a negative RPR result is 0.07%, while the probability of infection after negative results on both the RPR and the TPHA tests is 0.003%. Using the two tests (RPR and TPHA) in parallel gives the highest degree of sensitivity, which is indispensable for selecting possible blood donors, while maintaining a good degree of specificity. The authors concluded that the use of VDRL alone does not exclude infectivity of a blood sample and that, in accordance with WHO and international recommendations, the VDRL or RPR and TPHA tests should be used in parallel for syphilis screening.

  • Research Article
  • 10.1093/ofid/ofae631.965
P-770. Impact of parallel use of low-complexity automated nucleic acid amplification tests and urine lateral flow lipoarabinomannan assays to detect tuberculosis in people living with HIV
  • Jan 29, 2025
  • Open Forum Infectious Diseases
  • Dana Hassneiah + 7 more

Background: Tuberculosis (TB) is the leading cause of death in people living with HIV (PWH). Missed and delayed TB diagnoses in PWH lead to worse outcomes. The World Health Organization (WHO) recommends the use of low-complexity automated nucleic acid amplification tests (LC-aNAAT) as the initial diagnostic test for TB and urine lateral flow lipoarabinomannan assays (LF-LAM) to assist with TB diagnosis in PWH. This systematic review assesses the impact of parallel use of respiratory sample LC-aNAAT and urine LF-LAM in PWH on patient-important outcomes. Methods: We searched multiple databases up to 3 November 2023. Eligible studies were randomized trials that allocated PWH to parallel LC-aNAAT and LF-LAM testing or LC-aNAAT alone or LF-LAM alone. We used a standardized form to extract data on mortality, proportion diagnosed with TB, proportion treated for TB and time to diagnosis and treatment. We assessed study quality using Cochrane Risk of Bias tools. We used random-effects meta-analysis to estimate pooled risk ratios with 95% confidence intervals. Results: After 457 articles were screened, three randomized trials were included, all of which were conducted in inpatient settings in sub-Saharan Africa, with 4602 adult participants (Table 1). Risk of bias was low. The effect of parallel testing with respiratory LC-aNAAT and urine LF-LAM on mortality in PWH at eight weeks compared to testing with LC-aNAAT alone was uncertain (risk ratio 0.93; 95% CI 0.74 to 1.17) (Figure 2). Parallel testing may increase the proportion of patients with a confirmed TB diagnosis (pooled RR: 3.06, 95% CI 1.82 to 5.16) (Figure 3) and the proportion treated for TB (pooled RR: 1.47, 95% CI 1.25 to 1.73) (Figure 4) compared to LC-aNAAT alone. Time-to-treatment was shorter for the parallel testing group compared to the LC-aNAAT group in two trials and similar in the third. Conclusion: In inpatient settings, the effect of parallel use of LC-aNAAT and LF-LAM on mortality in PWH at eight weeks was uncertain. Parallel testing may increase the proportion of PWH diagnosed with TB and treated for TB. There were no data on children or outpatient settings. Evidence is limited by heterogeneity in implementation approaches in included trials and contextual differences that included the effects of the COVID pandemic on one trial.
Table 1. Comparison of mortality in randomized trials evaluating interventions that included respiratory LC-aNAATs and urine LF-LAM in adult inpatients with HIV.
Figure 1. Impact of interventions that included respiratory LC-aNAATs and urine LF-LAM on mortality in adult inpatients with HIV.
Figure 2. Impact of interventions that included respiratory LC-aNAATs and urine LF-LAM on the proportion of adult inpatients with HIV and with a confirmed tuberculosis diagnosis.
Figure 3. Impact of respiratory LC-aNAATs and urine LF-LAM on the proportion of adult inpatients with HIV treated for tuberculosis.
Disclosures: Laura Olbrich, PhD candidate, Cepheid: received Xpert MTB/Rif Ultra cartridges for free. Maunank Shah, MD, PhD, Scene Health: royalties for license.

  • Research Article
  • 10.1016/j.adaj.2025.12.019
A practitioner's guide to developing critical appraisal skills: How to understand and interpret frequentist and Bayesian approaches applied to serial and parallel diagnostic testing.
  • Feb 1, 2026
  • Journal of the American Dental Association (1939)
  • Michael Glick + 3 more

Serial and parallel diagnostic testing can improve diagnostic accuracy. Yet their application and interpretation remain underused in clinical practice. The authors synthesized statistical methodology literature on serial and parallel diagnostic testing configurations, comparing frequentist and Bayesian analytical frameworks. The authors used caries diagnosis as an example to illustrate the practical application of these approaches in guiding test selection and result interpretation throughout. Serial testing improves specificity and positive predictive value, making it ideal for confirmatory diagnoses, whereas parallel testing improves sensitivity and negative predictive value, making it ideal for initial screening. The frequentist framework provides population-level estimates of test performance, supporting test validation, regulatory decision making, and large-scale screening strategies. In contrast, the Bayesian approach focuses on individualized inference, allowing clinicians to incorporate prior beliefs on the basis of patient-specific information and update disease posterior probabilities as new information becomes available. Moreover, Bayesian methods allow for flexible combination of individual test results rather than treating multiple diagnostic tests as a single composite measure. Clinicians should use serial testing for confirmatory diagnosis and parallel testing for screening. Understanding both frequentist population-level benchmarks and Bayesian patient-specific probability updating enables more informed testing strategies and result interpretation.
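The trade-off described above can be illustrated with a minimal Python sketch. It assumes the two tests are conditionally independent given disease status, which is a simplifying assumption and not something the abstract states. The function names and the example sensitivity, specificity, and prevalence values are hypothetical and are not taken from the paper.

```python
def combine_parallel(se1, sp1, se2, sp2):
    """Parallel (OR) rule: call the result positive if either test is positive.

    Assumes the tests are conditionally independent given disease status.
    """
    sensitivity = 1 - (1 - se1) * (1 - se2)  # missed only if both tests miss
    specificity = sp1 * sp2                  # negative only if both are negative
    return sensitivity, specificity


def combine_serial(se1, sp1, se2, sp2):
    """Serial (AND) rule: call the result positive only if both tests are positive.

    Assumes the tests are conditionally independent given disease status.
    """
    sensitivity = se1 * se2                  # detected only if both tests detect
    specificity = 1 - (1 - sp1) * (1 - sp2)  # false positive only if both err
    return sensitivity, specificity


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical test characteristics, chosen only for illustration.
    se1, sp1 = 0.80, 0.90
    se2, sp2 = 0.70, 0.95
    prevalence = 0.10

    for name, rule in (("parallel", combine_parallel), ("serial", combine_serial)):
        se, sp = rule(se1, sp1, se2, sp2)
        ppv, npv = predictive_values(se, sp, prevalence)
        print(f"{name}: Se={se:.3f} Sp={sp:.3f} PPV={ppv:.3f} NPV={npv:.3f}")
```

Under these assumed inputs the parallel rule raises sensitivity and negative predictive value above either single test, while the serial rule raises specificity and positive predictive value, which matches the screening versus confirmation roles the abstract describes. When tests are correlated, the gains would be smaller than this independence-based sketch suggests.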

  • Research Article
  • Cite Count: 79
  • 10.1016/j.jviromet.2013.04.003
Evaluation of current rapid HIV test algorithms in Rakai, Uganda
  • Apr 11, 2013
  • Journal of Virological Methods
  • Ronald M Galiwango + 11 more


  • Research Article
  • Cite Count: 255
  • 10.1287/mnsc.47.5.663.10480
Parallel and Sequential Testing of Design Alternatives
  • May 1, 2001
  • Management Science
  • Christoph H Loch + 2 more

An important managerial problem in product design is the extent to which testing activities are carried out in parallel or in series. Parallel testing has the advantage of proceeding more rapidly than serial testing but does not take advantage of the potential for learning between tests, thus resulting in a larger number of tests. We model this trade-off in the form of a dynamic program and derive the optimal testing strategy (or mix of parallel and serial testing) that minimizes both the total cost and time of testing. We derive the optimal testing strategy as a function of testing cost, prior knowledge, and testing lead time. Using information theory to measure the test efficiency, we further show that in the case of imperfect testing (due to noise or simulated test conditions), the attractiveness of parallel strategies decreases. Finally, we analyze the relationship between testing strategies and the structure of design hierarchy. We show that a key benefit of modular product architecture lies in the reduction of testing cost.

  • Research Article
  • Cite Count: 5
  • 10.1007/s11235-013-9678-1
The study of spacecraft parallel testing
  • May 1, 2013
  • Telecommunication Systems
  • Zhongwen Li + 3 more

To avoid interference between parallel testing tasks in a spacecraft, this paper analyzes the testing process by dividing it into testing atoms and uses the parameter set as the basic unit of resource allocation for each testing atom. By modeling the parallel testing with the objective of minimizing the total testing time, it puts forward a parallel spacecraft testing task scheduling algorithm based on improved particle swarm optimization. The experimental results verify that this method can be efficiently applied to optimal scheduling of spacecraft parallel testing.

  • Research Article
  • Cite Count: 2
  • 10.7759/cureus.53645
Personalized Treatment of Recurrent, Metastatic Head and Neck Cancer Guided by Patient-Derived Xenograft Models.
  • Feb 5, 2024
  • Cureus
  • Morgan D Black + 9 more

Recurrent or metastatic head and neck squamous cell carcinoma (RMHNSCC) is associated with a poor prognosis and short survival duration. There is an urgent need to identify personalized predictors of drug response to guide the selection of the most effective therapy for each individual recurrence. We tested the feasibility of patient-derived xenografts (PDX) for guiding RMHNSCC salvage treatment. Fresh tumor samples from eligible, consented patients were implanted into mice. Established tumors were expanded in mouse PDX cohorts to identify responses to candidate salvage drug treatments in parallel testing. Patients alive and suitable for chemotherapy were treated based on responses determined by PDX testing. Nine patient tumors were successfully engrafted in mice with an average time of 89.2±41.7 days. Four patients' PDX models underwent parallel drug testing. Two patients received PDX-guided therapy. In one of these patients, single agents of cetuximab and paclitaxel demonstrated the best responses in the PDX model, and this patient exhibited sequential partial responses to each drug, including a 17-month clinical response to cetuximab. The main limitation of PDX testing for RMHNSCC was the time delay in obtaining testing results. Despite this, parallel PDX testing may be feasible for a subset of patients and appears to correlate with clinical benefit.

  • Research Article
  • Cite Count: 279
  • 10.1037/0021-9010.82.2.300
Reactions to cognitive ability tests: The relationships between race, test performance, face validity perceptions, and test-taking motivation.
  • Apr 1, 1997
  • Journal of Applied Psychology
  • David Chan + 4 more

The relationships among race, face validity perceptions, test-taking motivation, and test performance on a cognitive ability test were examined. Undergraduates completed 2 parallel cognitive ability tests and a test reactions measure. Results showed that test-taking motivation was related positively to subsequent performance on a parallel test even after the effects of race and performance on the first test were controlled. The effect of race on subsequent test performance was found to be mediated partially by motivation, which provided evidence that some portion of the Black-White difference in test performance may be explained through differences in test-taking motivation. Results also indicated that Black-White differences in face validity perceptions of the test may be a function of Black-White differences in test performance. Face validity perceptions of the test affected subsequent performance on the parallel test but only indirectly through test-taking motivation.

  • Research Article
  • Cite Count: 34
  • 10.1200/po.22.00201
Cost-Effectiveness of Parallel Versus Sequential Testing of Genetic Aberrations for Stage IV Non-Small-Cell Lung Cancer in the Netherlands.
  • Jul 1, 2022
  • JCO Precision Oncology
  • Henri B Wolff + 10 more

PURPOSE: A large number of targeted treatment options for stage IV nonsquamous non–small-cell lung cancer with specific genetic aberrations in tumor DNA is available. It is therefore important to optimize diagnostic testing strategies, such that patients receive adequate personalized treatment that improves survival and quality of life. The aim of this study is to assess the efficacy (including diagnostic costs, turnaround time (TAT), unsuccessful tests, percentages of correct findings, therapeutic costs, and therapeutic effectiveness) of parallel next generation sequencing (NGS)–based versus sequential single-gene–based testing strategies routinely used in patients with metastasized non–small-cell lung cancer in the Netherlands. METHODS: A diagnostic microsimulation model was developed to simulate 100,000 patients with prevalence of genetic aberrations, extracted from real-world data from the Dutch Pathology Registry. These simulated patients were modeled to undergo different testing strategies composed of multiple tests with different test characteristics including single-gene and panel tests, test accuracy, the probability of an unsuccessful test, and TAT. Diagnostic outcomes were linked to a previously developed treatment model, to predict average long-term survival, quality-adjusted life-years (QALYs), costs, and cost-effectiveness of parallel versus sequential testing. RESULTS: NGS-based parallel testing for all actionable genetic aberrations is on average €266 cheaper than single-gene–based sequential testing, and detects additional relevant targetable genetic aberrations in 20.5% of the cases, given a TAT of maximally 2 weeks. Therapeutic costs increased by €8,358, and 0.12 QALYs were gained, leading to an incremental cost-effectiveness ratio of €69,614/QALY for parallel versus sequential testing. CONCLUSION: NGS-based parallel testing is diagnostically superior over single-gene–based sequential testing, as it is cheaper and more effective than sequential testing. Parallel testing remains cost-effective with an incremental cost-effectiveness ratio of €69,614/QALY upon inclusion of therapeutic costs and long-term outcomes.

  • Research Article
  • Cite Count: 2
  • 10.3390/epidemiologia5030029
Sex Disparity in Stroke Mortality among Adults: A Time Series Analysis in the Greater Vitoria Region, Brazil (2000-2021).
  • Jul 17, 2024
  • Epidemiologia (Basel, Switzerland)
  • Orivaldo Florencio De Souza + 6 more

The disparity between the sexes in stroke mortality has been demonstrated in people from different locations. The objective of this study was to analyze the disparity between sexes in stroke mortality in adults in the metropolitan area of Greater Vitoria between 2000 and 2021. An ecological time series study was conducted using a database of the Brazilian Health System Informatics Department. The annual percentage change and average annual percentage change were calculated through joinpoint regression. Pairwise comparisons using parallelism and coincidence tests were applied to compare temporal trends between men and women. Men had higher mortality rates in most years between 2000 and 2021. In contrast, women had higher proportional mortality values in all years evaluated from 2000 to 2021. The paired comparison revealed a disparity between the sexes in the proportional mortality time series (parallelism test: p = 0.003; coincidence test: p < 0.001). However, the time series of the mortality rates showed no disparity between the sexes (parallelism test: p = 0.114; coincidence test: p = 0.093). From 2000 to 2021, there was a disparity in proportional mortality from stroke between the sexes of the population in the metropolitan area of Greater Vitoria, Brazil. However, the time series of mortality rates between the sexes did not reveal any disparity in the study period.

  • Research Article
  • 10.2196/82669
Combining machine learning models and screening to enhance suicide risk identification for American Indian patients: A Retrospective Cohort Study.
  • Mar 5, 2026
  • Journal of medical Internet research
  • Novelene Goklish + 7 more

American Indian and Alaska Native (AI/AN) communities experience disproportionately high suicide rates. While machine learning (ML) models leveraging electronic health records (EHR) have emerged as promising tools for suicide risk identification, the optimal integration of these models with existing screening practices remains unclear. The objective of this study was to compare parallel and serial testing strategies that combine an ML suicide risk model and the Ask Suicide-Screening Questions (ASQ) against using the ASQ alone. To achieve this, we conducted a retrospective secondary analysis of EHR data. The cohort consisted of adult Emergency Department visits at an Indian Health Service (IHS) facility between October 1, 2019, and October 2, 2021. Sensitivity, specificity, predictive values, and 95% confidence intervals were averaged across 10 cross-validated patient-level folds. The final sample included 7,897 American Indian patients with 26,896 visits, 824 (3.1%) of which had a positive ASQ result, and 102 (0.4%) had the outcome of suicide attempt or death within 90 days of the visit. The logistic regression ML model previously developed using IHS-specific data was operationalized at the 95th and 75th percentiles to evaluate high-risk and medium-risk thresholds, respectively. A sensitivity analysis was performed to evaluate identification approaches across all ED visits during this time period. The ML medium-risk threshold alone identified the most true positives (Sensitivity 0.782, 95% CI 0.648-0.915; Specificity 0.751, 95% CI 0.725-0.777; PPV 0.012, 95% CI 0.009-0.014; NPV 0.999, 95% CI 0.998-0.999) in comparison to the ML high-risk threshold alone (Sensitivity 0.429, 95% CI 0.287-0.572; Specificity 0.955, 95% CI 0.948-0.961; PPV 0.035, 95% CI 0.022-0.048; NPV 0.998, 95% CI 0.997-0.999) or the ASQ alone (Sensitivity 0.178, 95% CI 0.073-0.282; Specificity 0.970, 95% CI 0.968-0.971; PPV 0.022, 95% CI 0.010- 0.034; NPV 0.997, 95% CI 0.996-0.998). 
Combining the ML high-risk threshold with the ASQ in series yielded the greatest positive predictive ability (PPV 0.050, 95% CI 0.014-0.086) at the cost of reduced sensitivity (0.129, 95% CI 0.036-0.221). Finally, the parallel testing approach using the ML medium-risk threshold yielded the greatest sensitivity (Sensitivity 0.795, 95% CI 0.671-0.920; Specificity 0.742, 95% CI 0.716-0.766; PPV 0.012, 95% CI 0.009-0.014; NPV 0.999, 95% CI 0.989-0.999) without missing any cases identified by screening. Unlike existing studies that evaluate ML and screening tools in isolation, this study innovates by assessing combined parallel and serial testing strategies in a real-world setting. We demonstrate that while serial testing maximizes predictive accuracy, it is often infeasible. Instead, parallel testing brings value as a clinical "safety net" to catch at-risk patients missed by standard practices. Ultimately, integrating ML in suicide prevention requires balancing statistical accuracy with setting-specific, real-world workflows.

  • Research Article
  • Cite Count: 13
  • 10.1016/j.canep.2014.11.006
Adjunct screening of cervical or vaginal samples using careHPV testing with Pap and aided visual inspection for detecting high-grade cervical intraepithelial neoplasia
  • Dec 27, 2014
  • Cancer Epidemiology
  • Smita Asthana + 1 more


  • Research Article
  • 10.5555/2841955.2841959
Parallel and Sequential Testing of Design Alternatives
  • May 1, 2001
  • Management Science
  • Christoph H Loch + 2 more

An important managerial problem in product design is the extent to which testing activities are carried out in parallel or in series. Parallel testing has the advantage of proceeding more rapidly t...
