Abstract
To study the risk of spontaneous abortion (SAB) or termination using healthcare utilization databases, algorithms to estimate gestational age (GA) are needed. Using Medicaid data, we developed a hierarchical algorithm to classify pregnancy outcomes. We identified the subset of potential SAB and termination cases and abstracted the GA from linked electronic medical records (gold standard). We developed three approaches: (1) assign the median GA for SAB and termination cases in the US; (2) draw a random GA from the population distributions; (3) estimate GA based on regression models. Algorithm performance was assessed based on the proportion of pregnancies with estimated GA within 1 to 4 weeks of the gold standard, the mean squared error (MSE), and the R-squared. Approach 1 and Approach 3 had similar performance, though Approach 3, using random forest models with variables selected via the Boruta algorithm, had better MSE and R-squared. For SAB, 58.0% of pregnancies were correctly classified within 2 weeks of the gold standard (MSE: 8.7; R-squared: 0.09). For termination, the proportion was 66.3% (MSE: 11.7; R-squared: 0.35). SABs and terminations can be studied in healthcare utilization data with careful implementation of validated algorithms, although a higher level of GA misclassification is expected compared to live births.
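The three performance measures named above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the toy gold-standard values, and the use of the toy sample's own median and empirical distribution in place of the US population figures. It is not the study's code or data, and it does not attempt the regression-based Approach 3.

```python
import random
import statistics


def within_k_weeks(gold, est, k):
    """Proportion of cases whose estimated GA is within k weeks of the gold standard."""
    hits = sum(1 for g, e in zip(gold, est) if abs(g - e) <= k)
    return hits / len(gold)


def mse(gold, est):
    """Mean squared error between gold-standard and estimated GA (weeks squared)."""
    return sum((g - e) ** 2 for g, e in zip(gold, est)) / len(gold)


def r_squared(gold, est):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_gold = sum(gold) / len(gold)
    ss_res = sum((g - e) ** 2 for g, e in zip(gold, est))
    ss_tot = sum((g - mean_gold) ** 2 for g in gold)
    return 1 - ss_res / ss_tot


# Hypothetical gold-standard GAs in weeks (illustrative only, NOT study data).
gold = [6, 7, 8, 8, 9, 10, 12, 14]

# Approach 1 (sketch): assign one median GA to every case.
# Assumption: the median comes from the toy sample, not from US population data.
median_ga = statistics.median(gold)
approach1 = [median_ga] * len(gold)

# Approach 2 (sketch): draw a random GA for each case from a reference distribution.
# Assumption: the toy sample stands in for the population distribution.
rng = random.Random(0)  # fixed seed so the draws are reproducible
approach2 = [rng.choice(gold) for _ in gold]

for name, est in [("approach 1", approach1), ("approach 2", approach2)]:
    print(name,
          "within 2 weeks:", within_k_weeks(gold, est, 2),
          "MSE:", mse(gold, est),
          "R-squared:", r_squared(gold, est))
```

Under this definition of R-squared, a constant prediction such as the single median in Approach 1 cannot score above zero, so a positive R-squared requires estimates that vary with case-level information.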