Abstract

Background: Transfer learning is a common practice in image classification with deep learning, where the available data is often too limited to train a complex model with millions of parameters from scratch. However, transferring language models requires special attention, since cross-domain vocabularies (e.g., between two different modalities such as MR and US) do not always overlap, whereas pixel intensity ranges largely overlap across imaging modalities.

Method: We present a concept of similar-domain adaptation in which we transfer inter-institutional language models (context-dependent and context-independent) between two different modalities (ultrasound and MRI) to capture liver abnormalities.

Results: Using MR and US screening exam reports for hepatocellular carcinoma as the use case, we apply the transfer-language-space strategy to automatically label imaging exams, with and without a structured template, with > 0.9 average F1-score.

Conclusion: We conclude that transfer learning, along with fine-tuning of the discriminative model, is often more effective for shared targeted tasks than training a language space from scratch.

Highlights

  • Hepatocellular carcinoma (HCC) is the most common primary liver malignancy, and is the fastest-rising cause of cancer-related deaths in the United States [1]

  • We explore recent language modeling methodologies that overcome these limitations: (1) context-independent word embedding models (Word2vec [11], GloVe [12]), where the language model (LM) learns a numerical representation of each word regardless of where the word occurs in a sentence, and (2) context-dependent word embedding models (BERT [13], ELMo [14]), which capture the context of a word – that is, its position in a sentence

  • This trend holds for both language model types (the context-independent Word2vec performing better than the context-dependent transformer models such as BERT) as well as both classifiers (Random Forest and 1D-CNN)
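The distinction drawn in the highlights above – context-independent versus context-dependent embeddings – can be sketched with a toy example. The vectors and the `contextual_vector` helper below are invented purely for illustration; a real pipeline would use gensim's Word2Vec or a Hugging Face BERT model instead:

```python
# Toy illustration: context-independent vs context-dependent embeddings.
# All vectors here are made up for illustration only.

# Context-independent (Word2vec/GloVe style): one fixed vector per word,
# regardless of the sentence in which the word appears.
static_embeddings = {
    "mass": [0.2, -0.1, 0.7],
    "liver": [0.5, 0.3, -0.2],
}

def static_vector(word, sentence):
    # The sentence is ignored: "mass" always maps to the same vector.
    return static_embeddings[word]

# Context-dependent (BERT/ELMo style): the vector is a function of the
# whole sentence, so the same word can get different vectors in
# different contexts. Here we fake that with a deterministic "context
# signal" derived from the surrounding words.
def contextual_vector(word, sentence):
    base = static_embeddings[word]
    context = " ".join(w for w in sentence.split() if w != word)
    shift = (sum(ord(c) for c in context) % 100) / 1000.0
    return [x + shift for x in base]

s1 = "hypoechoic mass in the liver"
s2 = "no mass identified"

# Static embedding: identical vectors for "mass" in both sentences.
assert static_vector("mass", s1) == static_vector("mass", s2)
# Contextual embedding: the two occurrences get different vectors.
assert contextual_vector("mass", s1) != contextual_vector("mass", s2)
```

This is why transferring a context-independent model amounts to transferring a lookup table, while transferring a context-dependent model transfers a function of entire sentences.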

Introduction

Hepatocellular carcinoma (HCC) is the most common primary liver malignancy, and is the fastest-rising cause of cancer-related deaths in the United States [1]. Different versions of LI-RADS are available for each of the HCC screening imaging modalities (including US and MRI). LI-RADS standardized reporting systems, and the use of structured HCC-screening imaging report templates, help facilitate information extraction from imaging reports, enabling the creation of large-scale annotated databases that can be used for clinical outcomes and machine learning research. Despite these efforts, adoption of structured reports remains low [5], and simple tasks like differentiating between benign and malignant exams from a large number of screening cases require hundreds of hours of manual review by the radiologist. Transferring language models requires special attention, since cross-domain vocabularies (e.g., between two different modalities such as MR and US) do not always overlap, whereas pixel intensity ranges largely overlap across imaging modalities.
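The limited cross-modality vocabulary overlap noted above can be quantified with a simple Jaccard measure over report tokens. The following is a minimal sketch; the example reports and whitespace tokenizer are assumptions for illustration, not the paper's actual data or preprocessing:

```python
# Sketch: measuring vocabulary overlap between two report corpora
# (e.g. US vs MR). The example reports below are invented.

def vocabulary(corpus):
    """Lowercased word types appearing anywhere in the corpus."""
    return {token for report in corpus for token in report.lower().split()}

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two vocabularies."""
    return len(a & b) / len(a | b)

us_reports = [
    "hypoechoic lesion in the right hepatic lobe",
    "no focal hepatic lesion identified",
]
mr_reports = [
    "arterial phase hyperenhancement with washout in the right hepatic lobe",
    "no arterially enhancing lesion identified",
]

overlap = jaccard(vocabulary(us_reports), vocabulary(mr_reports))
print(f"vocabulary overlap: {overlap:.2f}")
```

An overlap well below 1.0 between modalities is the motivation for adapting a transferred language space rather than reusing it unchanged, in contrast to images, where intensity ranges transfer more directly.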

