In the fast-changing environment of value-based healthcare, providers must prepare and submit patient diagnosis files to payers under CMS contract requirements that continue to evolve. Because payer specifications change repeatedly, a flexible system is needed that can process these files automatically with minimal human intervention. This paper presents a scalable data engineering framework that uses PySpark and Apache Airflow to automate and streamline the generation of patient diagnosis files. The framework is designed to adapt dynamically to changes in specifications, improving both operational efficiency and compliance by automating complex data transformations and large-scale data processing. By improving accuracy and timeliness and reducing manual coding effort, it not only optimizes the reimbursement process but also strengthens the foundations of value-based care, supporting improved patient outcomes and financial alignment with payer requirements. The proposed framework significantly increases data processing speed, achieving a 50% reduction in processing time compared with a traditional ETL approach. It also improves scalability, expanding data handling from 1 TB to 5 TB, an increase of four times the previous limit, while minimizing manual effort through Airflow-based automation, resulting in a 70% reduction in manual intervention.
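As a rough illustration of the kind of pipeline the abstract describes, the sketch below shows an Airflow DAG driving a configuration-driven PySpark job that projects raw encounter data onto a payer-defined diagnosis file layout. This is not the authors' actual implementation; the file paths, the `payer_spec.json` structure, the column names, and the daily schedule are all illustrative assumptions, and only the general pattern (a spec file that can change without code changes, orchestrated by Airflow) reflects the approach described.

```python
# Minimal sketch: Airflow orchestrating a config-driven PySpark transformation.
# Paths, spec format, and column names are hypothetical, not from the paper.
import json
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def build_diagnosis_file(spec_path: str, input_path: str, output_path: str) -> None:
    """Read raw encounter data and project it onto the layout defined by a payer spec."""
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("diagnosis_file_generation").getOrCreate()

    # The spec is assumed to map each output column to a source column, so a
    # changed payer specification only requires a new JSON file, not new code.
    with open(spec_path) as fh:
        spec = json.load(fh)

    raw = spark.read.parquet(input_path)
    projected = raw.select(
        [F.col(src).alias(dst) for dst, src in spec["column_mapping"].items()]
    )
    projected.write.mode("overwrite").option("header", True).csv(output_path)
    spark.stop()


with DAG(
    dag_id="diagnosis_file_generation",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",  # hypothetical cadence
    catchup=False,
) as dag:
    generate = PythonOperator(
        task_id="generate_diagnosis_file",
        python_callable=build_diagnosis_file,
        op_kwargs={
            "spec_path": "/configs/payer_spec.json",       # assumed location
            "input_path": "/data/raw/encounters/",         # assumed location
            "output_path": "/data/out/diagnosis_files/",   # assumed location
        },
    )
```

In this pattern, reducing manual intervention comes from keeping payer-specific rules in the spec file rather than in code, so the scheduled DAG can keep running unchanged when requirements shift.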