Abstract

Drug resistance in malaria is a growing concern in many areas of Sub-Saharan Africa and Southeast Asia. Since artemisinin resistance emerged in Cambodia in the late 2000s, research into the underlying mechanisms has been underway. The 2019 Malaria Challenge posed the task of developing computational models that address important problems in the fight against malaria. The first goal was to accurately predict the artemisinin resistance level of Plasmodium falciparum isolates, as quantified by the IC50. The second goal was to predict the parasite clearance rate of malaria parasite isolates from in vitro transcriptional profiles. In this work, we develop machine learning models using novel methods for transforming isolate data and for handling the tens of thousands of variables that result from these transformations. We demonstrate this with massively parallel data vectorization that feeds scalable machine learning. In addition, we show the utility of ensemble machine learning for highly effective predictions on both goals of the challenge, combining multiple machine learning algorithms with various scaling and normalization preprocessing steps. A voting ensemble then combines multiple models to generate the final prediction.

Highlights

  • Malaria is a serious disease caused by parasites belonging to the genus Plasmodium, which are transmitted by mosquitoes in the genus Anopheles

  • The ‘Preprocess Data?’ parameter enables the scaling and imputation of the features in the data. Note that these models were evaluated using random sampling of the input training dataset provided by the DREAM Challenge, though the evaluation within the challenge was performed on an unlabelled testing dataset

  • We efficiently transformed a matrix of over 40,000 genetic attributes for the IC50 use case and over 4,000 genetic attributes for the resistance rate use case. This was completed with scalable vectorization of the training data, which allowed for many machine learning models to be generated
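
The preprocessing and voting steps described above can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the helper `impute_and_scale`, the classes `MeanRegressor`, `NearestNeighborRegressor`, and `VotingEnsemble`, and the toy data are all hypothetical names and values chosen for illustration, and the two base learners are deliberately simple stand-ins for the multiple algorithms mentioned in the abstract.

```python
from statistics import mean, pstdev


def impute_and_scale(rows):
    """Mean-impute missing values (None), then standardize each column."""
    scaled_cols = []
    for col in zip(*rows):
        present = [v for v in col if v is not None]
        mu = mean(present)
        filled = [mu if v is None else v for v in col]
        sd = pstdev(filled) or 1.0  # constant column: avoid division by zero
        scaled_cols.append([(v - mu) / sd for v in filled])
    return [list(r) for r in zip(*scaled_cols)]


class MeanRegressor:
    """Toy base learner (assumption): always predicts the training mean."""

    def fit(self, X, y):
        self.mu = mean(y)
        return self

    def predict(self, X):
        return [self.mu for _ in X]


class NearestNeighborRegressor:
    """Toy base learner (assumption): copies the closest training target."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        preds = []
        for x in X:
            dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in self.X]
            preds.append(self.y[dists.index(min(dists))])
        return preds


class VotingEnsemble:
    """Averages the predictions of all fitted base models."""

    def __init__(self, models):
        self.models = models

    def fit(self, X, y):
        for model in self.models:
            model.fit(X, y)
        return self

    def predict(self, X):
        per_model = [model.predict(X) for model in self.models]
        return [mean(votes) for votes in zip(*per_model)]


# Hypothetical toy data: two features per isolate, one missing value.
raw = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [4.0, 40.0]]
y = [0.5, 1.0, 1.5, 2.0]

X = impute_and_scale(raw)
ensemble = VotingEnsemble([MeanRegressor(), NearestNeighborRegressor()]).fit(X, y)
predictions = ensemble.predict(X)
print(predictions)
```

In practice, a library such as scikit-learn offers equivalent building blocks (imputers, scalers, and voting estimators). The sketch only shows how the pieces fit together: each base model is trained on the same preprocessed matrix, and the ensemble averages their outputs.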

Introduction

Malaria is a serious disease caused by parasites belonging to the genus Plasmodium, which are transmitted by mosquitoes in the genus Anopheles. Plasmodium falciparum poses one of the greatest health threats in Southeast Asia, being responsible for 62.8% of malaria cases in the region in 2017[1]. Artemisinin-based therapies are among the best treatment options for malaria caused by P. falciparum[2]. Artemisinin used in combination with other drugs, known as artemisinin combination therapy, is the best treatment option available today against malaria infections. While polymorphisms in the kelch domain–carrying protein K13 of P. falciparum are known to be associated with artemisinin resistance, the underlying molecular mechanism that confers resistance remains unknown[4]. In early 2020, Birnbaum et al. discovered that the highly conserved gene kelch13 is associated with a molecular mechanism that allows the parasite to feed on host erythrocytes by endocytosis of hemoglobin[5]. Given that artemisinin is activated by hemoglobin degradation products, kelch13 mutations that disrupt this uptake can confer resistance to artemisinin.
