Abstract

Up-to-date, precise biomedical and healthcare information is essential during the current pandemic. The world has become a small place in which everyone wants quick, relevant medical data to help prevent contagious diseases. Doctors, nursing staff, medical practitioners, frontline Covid19 workers, and the general public all require updated, summarized biomedical statistics. This paper presents a study of graph-based biomedical text summarization using different similarity measures and ranking of sentence embeddings. Cosine and Dice similarities, together with a pre-trained BERT model that provides context via sentence embeddings, are combined with the TextRank and PageRank algorithms to produce rich extractive summaries of biomedical Cord19 Pubmed articles. Rouge-1 and Rouge-L scores are computed empirically to compare the average F-score, precision, and recall of the various graph-based sentence extraction methods. Cosine similarity and BERT sentence embeddings are observed to be equally effective when used with graph-based ranking algorithms. The main contribution is the proposed TextRank with BERT embeddings model, which the evaluation identifies as the preferred choice for summarizing short biomedical documents. For large documents, however, the BERT model is computationally heavy and adds execution latency, whereas LexRank with the Cosine measure still works efficiently for mid-size document summarization.
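The general idea of graph-based extractive summarization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: sentences are represented as plain bag-of-words counts rather than BERT embeddings, and the function names (`tokenize`, `cosine_similarity`, `dice_similarity`, `rank_sentences`, `summarize`), the damping factor, and the iteration count are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dice_similarity(tokens_a, tokens_b):
    """Dice coefficient over the sets of distinct tokens."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    return 2.0 * len(set_a & set_b) / (len(set_a) + len(set_b))


def rank_sentences(sentences, similarity=cosine_similarity,
                   damping=0.85, iterations=50):
    """Score sentences by PageRank-style power iteration on a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    # Weighted adjacency matrix of pairwise similarities, without self-loops.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge j -> i.
            incoming = sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1.0 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, top_k=2, similarity=cosine_similarity):
    """Return the top_k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences, similarity=similarity)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(top[:top_k])]
```

Passing `similarity=dice_similarity` to `summarize` swaps the edge weighting without changing the ranking step. An embedding-based variant would replace the token lists with sentence vectors and compute cosine similarity on those instead.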
