Since the World Health Organization declared COVID-19 a pandemic in 2020, the global community has faced ongoing challenges in controlling and mitigating the transmission of the SARS-CoV-2 virus, as well as its evolving subvariants and recombinants. A significant challenge during the pandemic has not only been the accurate detection of positive cases but also the efficient prediction of risks associated with complications and patient survival probabilities. These tasks entail considerable clinical resource allocation and attention. In this study, we introduce COVID-Net Biochem, a versatile and explainable framework for constructing machine learning models. We apply this framework to predict COVID-19 patient survival and the likelihood of developing Acute Kidney Injury during hospitalization, utilizing clinical and biochemical data in a transparent, systematic approach. The proposed approach advances machine learning model design by seamlessly integrating domain expertise with explainability tools, enabling model decisions to be based on key biomarkers. This fosters a more transparent and interpretable decision-making process made by machines specifically for medical applications. More specifically, the framework comprises two phases: In the first phase, referred to as the “clinician-guided design” phase, the dataset is preprocessed using explainable AI and domain expert input. To better demonstrate this phase, we prepared a benchmark dataset of carefully curated clinical and biochemical markers based on clinician assessments for survival and kidney injury prediction in COVID-19 patients. This dataset was selected from a patient cohort of 1366 individuals at Stony Brook University. Moreover, we designed and trained a diverse collection of machine learning models, encompassing gradient-based boosting tree architectures and deep transformer architectures, specifically for survival and kidney injury prediction based on the selected markers. In the second phase, called the “explainability-driven design refinement” phase, the proposed framework employs explainability methods to not only gain a deeper understanding of each model’s decision-making process but also to identify the overall impact of individual clinical and biochemical markers for bias identification. In this context, we used the models constructed in the previous phase for the prediction task and analyzed the explainability outcomes alongside a clinician with over 8 years of experience to gain a deeper understanding of the clinical validity of the decisions made. The explainability-driven insights obtained, in conjunction with the associated clinical feedback, are then utilized to guide and refine the training policies and architectural design iteratively. This process aims to enhance not only the prediction performance but also the clinical validity and trustworthiness of the final machine learning models. Employing the proposed explainability-driven framework, we attained 93.55% accuracy in survival prediction and 88.05% accuracy in predicting kidney injury complications. The models have been made available through an open-source platform. Although not a production-ready solution, this study aims to serve as a catalyst for clinical scientists, machine learning researchers, and citizen scientists to develop innovative and trustworthy clinical decision support solutions, ultimately assisting clinicians worldwide in managing pandemic outcomes.