Abstract

The growth of medical record documents is increasing over time, and the various types of diseases and therapies needed are increasing. However, this has not been followed by an effective and efficient search process. This study aims to deal with search problems that often take a long time with search results that are not necessarily as expected by building a search model for medical record documents using the vector space model (VSM) and TF-IDF methods. The VSM method allows retrieval of results that are not the same as the search queries entered by the user but are expected to provide still results relevant to the user's desired needs. The model development process was taken based on the data in the FS_ANAMNESA and FS_DIAGNOSA columns, followed by preprocessing, which consists of deleting blank lines, lowercase, removing punctuation marks, HTML tags, stop words, excess spaces between words, and normalizing typo words, then forming a TF-IDF matrix based on the frequency of occurrence of each word feature, and followed by the calculation of the similarity value of the search query compared to medical record documents based on the cosine similarity formula. The retrieval results were all columns of each existing medical record document and were sorted based on 10 rows with the highest similarity value. The model evaluation results were based on 1000 medical record documents and tested with 20 search queries in this study, which gave an average precision value of 0.548 and an average recall value of 0.796.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.