Abstract

Machine learning and artificial intelligence (AI) have been progressing rapidly in several fields, and one of these is healthcare and biomedical systems, where AI and deep learning algorithms have shown considerable success across applications and use cases such as virtual healthcare assistants, smart medical homes, automated diagnosis, processing of pathology reports, drug discovery, implantable medical devices, and many more. At this stage of progress, AI can reshape the healthcare workforce and bring about a drastic positive change in how biomedical data is understood and handled in automated systems. However, as these artificially intelligent systems continue to gain importance and surpass state-of-the-art systems in performance on biomedical and healthcare problems, preserving the privacy and ensuring the security of the data and its users becomes extremely pertinent. Deep learning techniques run the risk of memorizing confidential information rather than generalizing from it, which is their main aim; such memorization is often termed overfitting. Biomedical data is inherently complex and involves sensitive, private, and often confidential information about patients and users, which is vulnerable to attacks by malicious actors. Handling clinical data often requires more context than typical applications of deep learning algorithms, such as patient history, patient preferences, and social perspectives. Moreover, the “black box” nature of deep learning algorithms results in a lack of model interpretability and obscures exactly how an AI model achieves the performance it does. It is therefore not easy to identify a model’s weaknesses or the reasons for them, or even to extract additional biological explanations from its results. This opens the door to misuse of algorithms by attackers and poses potential threats to user/patient security in biomedical AI systems. Furthermore, biomedical systems usually leverage third-party cloud platforms for their scalability, storage, and performance benefits, and privacy compromises are likely in such settings unless secure sharing schemes and suitable encryption techniques are devised. Security problems also arise during the data integration and adoption needed to develop large-scale biomedical expert systems. Ways in which user data privacy can be jeopardized include indirect data leakage, data poisoning (i.e., injecting fake samples into the training set to drastically change the accuracy), linkage attacks (i.e., recovering the actual identities of anonymized users), dataset reconstruction from published results, adversarial examples (i.e., adding noise to data to mislead the algorithm), transferability attacks, model theft, and more. These are pertinent questions for machine learning security in general, and they are even more important in the biomedical domain, considering the stakes for human life and health. For biomedical AI systems, traditional security and privacy mechanisms are not suitable due to the ever-changing nature of research and the complexity of medical data.
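For readers less familiar with these attack classes, the minimal sketch below (our illustration, not part of the original chapter; the weight vector, inputs, and epsilon budget are hypothetical) shows how an adversarial example of the kind mentioned above can be crafted against a simple logistic-regression classifier using the Fast Gradient Sign Method.

```python
# Illustrative sketch: an FGSM-style adversarial perturbation against a
# logistic-regression classifier, using only NumPy. All names (w, b, epsilon)
# are hypothetical placeholders chosen for this demo.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, epsilon=0.1):
    """Craft an adversarial example by stepping in the direction of the sign
    of the loss gradient with respect to the input (Fast Gradient Sign Method)."""
    p = sigmoid(np.dot(w, x) + b)          # predicted probability of class 1
    grad_x = (p - y) * w                   # gradient of binary cross-entropy w.r.t. x
    return x + epsilon * np.sign(grad_x)   # small, worst-case input perturbation

# Toy usage: a clean sample that the model handles correctly may be pushed
# toward the wrong class by an imperceptibly small perturbation.
rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.0
x, y = rng.normal(size=5), 1.0
x_adv = fgsm_perturb(x, y, w, b, epsilon=0.5)
print(sigmoid(w @ x + b), sigmoid(w @ x_adv + b))  # prediction before vs. after
```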
To counter such security threats, many studies have suggested adopting a set of best practices when working with biomedical data and ensuring the optimal use of predictive models in research, especially to discourage inadequate studies with inaccurate results that may compromise the credibility of important and valid research in the field. Recent studies point to the practice of keeping training data private while simultaneously building accurate AI models. Two important techniques in this regard are differential privacy and federated learning, which may serve as potential solutions to the problem. To counter linkage attacks and similar threats, which is especially important with the rise of healthcare services on mobile devices that can compromise user identity and location data, recent studies have suggested private record linkage and entity resolution techniques, such as deriving unique fingerprints from genomes to preserve patient identity. Finally, it is extremely important to test AI models in real-time clinical situations (which are often complex and noisy) to better understand the fragility of such models and where their vulnerabilities can be exploited, so that better security schemes can be devised. After all, a problem must be understood thoroughly before actual and effective solutions can be found. In conclusion, addressing security and privacy in biomedical AI systems is complex, multidisciplinary, and involves ethical and legal perspectives. As newer and better machine learning and deep learning algorithms are devised to tackle problems in the healthcare and medical domain, new security threats will also emerge. We are confident that research in this field will result in quality solutions that achieve the true balance between performance and privacy that is conducive to users and patients in the healthcare and biomedical domains.
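As a rough intuition for how the two techniques named above protect training data, the sketch below (our illustration, not taken from the chapter; the dataset values, epsilon setting, and weight vectors are hypothetical) shows the Laplace mechanism answering a counting query under differential privacy, and a federated-averaging step in which a server combines locally trained model weights without ever accessing raw patient records.

```python
# Illustrative sketch of differential privacy and federated learning in their
# simplest forms. Names, epsilon values, and data are made up for this demo.
import numpy as np

def laplace_count(data, predicate, epsilon=1.0):
    """Differential privacy: answer a counting query with Laplace noise.
    A count has sensitivity 1, so noise scale 1/epsilon yields epsilon-DP."""
    true_count = sum(1 for record in data if predicate(record))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

def federated_average(client_weights, client_sizes):
    """Federated learning: the server aggregates locally trained model weights,
    weighted by each client's dataset size, without seeing the raw records."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Toy usage with made-up hospital records
patients = [{"age": 70, "diabetic": True}, {"age": 45, "diabetic": False},
            {"age": 63, "diabetic": True}]
print(laplace_count(patients, lambda r: r["diabetic"], epsilon=0.5))

# Three hypothetical hospitals send locally trained weight vectors
weights = [np.array([0.9, -0.2]), np.array([1.1, -0.1]), np.array([1.0, -0.3])]
sizes = [1200, 800, 500]
print(federated_average(weights, sizes))
```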
