The increasing capabilities of Large Language Models (LLMs) have opened new opportunities for enhancing Ontology Learning (OL), a process crucial for structuring domain knowledge in a machine-readable format. This paper reports on the participation of the RWTH-DBIS team in the LLMs4OL Challenge at ISWC 2024, addressing two primary tasks: term typing and taxonomy discovery. We used LLaMA-3-8B and GPT-3.5-Turbo to assess the performance gap between open-source and commercial LLMs. For the open-source model, our methods included domain-specific continual training, fine-tuning, and knowledge-enhanced prompt-tuning. These approaches were evaluated on the challenge's benchmark datasets, including GeoNames, UMLS, Schema.org, and the Gene Ontology (GO). The results indicate that domain-specific continual training followed by task-specific fine-tuning improves the performance of open-source LLMs on these tasks, although a gap to commercial LLMs remains. The prompting strategies we developed also proved substantially useful. This research highlights the potential of LLMs to automate and improve the OL process and offers insights into effective methodologies for future work in this field.
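To illustrate what knowledge-enhanced prompting for a term-typing task might look like in practice, the sketch below injects retrieved domain context into a prompt before asking the model to assign a type. The abstract does not specify the team's actual templates or knowledge sources, so the `retrieve_context` helper, the prompt wording, and the UMLS-style example are all hypothetical assumptions; only the `openai` client calls follow the real library API.

```python
# Minimal sketch of knowledge-enhanced prompting for term typing.
# NOTE: retrieve_context() and the prompt template are hypothetical;
# the paper's actual design may differ.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def retrieve_context(term: str) -> str:
    """Hypothetical lookup of domain knowledge (e.g., a UMLS-style definition)."""
    knowledge = {
        "aspirin": "Aspirin is a nonsteroidal anti-inflammatory drug (NSAID).",
    }
    return knowledge.get(term.lower(), "No additional context available.")


def type_term(term: str, candidate_types: list[str]) -> str:
    """Ask the model to pick one type for `term`, grounded in retrieved context."""
    prompt = (
        f"Context: {retrieve_context(term)}\n"
        f"Term: {term}\n"
        f"Candidate types: {', '.join(candidate_types)}\n"
        "Answer with exactly one candidate type."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()


print(type_term("aspirin", ["Pharmacologic Substance", "Disease or Syndrome"]))
```

The same prompt construction would apply to a locally served open-source model such as LLaMA-3-8B; only the client endpoint would change.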