Abstract

Bibliographic references, typically comprising author names, journal titles, paper titles, and publication dates, play a vital role in academic research. Accurately identifying these structured pieces of information from references is a crucial step in developing intelligent bibliographic management systems. However, existing methods often rely on extensive high-quality training data. To reduce this reliance, we propose CONT_Prompt_ParseRef, a method that integrates prompt learning and contrastive learning for extracting structured information from bibliographic references. The approach uses contrastive learning to deepen the model's understanding of the different metadata label types and prompt learning to provide explicit guidance for their processing and recognition. We constructed a dataset of 12,000 samples, available in both Chinese and English versions. Experimental results on this bilingual dataset demonstrate the model's superior performance over existing techniques. Notably, CONT_Prompt_ParseRef shows strong robustness in low-resource settings: in scenarios with limited training data, both contrastive and prompt learning play pivotal roles in extracting labels from bibliographic references. The ablation study shows that omitting either component leads to a decline in performance, with contrastive learning being slightly more influential.
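
To make the combination of the two learning signals concrete, the following is a minimal sketch, not the paper's actual architecture: it tags each token of a reference string with a metadata label, prepends a learnable soft prompt as a stand-in for prompt-based guidance, and adds a supervised contrastive loss that pulls together token representations sharing the same label. The label set, model sizes, loss weighting, and toy data are all illustrative assumptions.

```python
# Illustrative sketch (not CONT_Prompt_ParseRef itself): a reference-string tagger
# combining a learnable soft prompt with a supervised contrastive loss over
# token representations that share the same metadata label.
import torch
import torch.nn as nn
import torch.nn.functional as F

LABELS = ["O", "AUTHOR", "TITLE", "JOURNAL", "DATE"]   # hypothetical label set
VOCAB, DIM, PROMPT_LEN = 1000, 64, 8                   # toy sizes for the sketch


class PromptTagger(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        # Learnable soft-prompt vectors prepended to every reference string.
        self.prompt = nn.Parameter(torch.randn(PROMPT_LEN, DIM) * 0.02)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, len(LABELS))

    def forward(self, token_ids):
        b = token_ids.size(0)
        x = self.embed(token_ids)
        x = torch.cat([self.prompt.expand(b, -1, -1), x], dim=1)
        h = self.encoder(x)[:, PROMPT_LEN:]            # drop prompt positions
        return h, self.head(h)


def supervised_contrastive_loss(reps, labels, temp=0.1):
    """Pull together tokens with the same metadata label, push apart the rest."""
    reps = F.normalize(reps, dim=-1)
    sim = reps @ reps.t() / temp                       # (N, N) similarity matrix
    mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    mask.fill_diagonal_(0)                             # exclude self-pairs
    logits = sim - torch.eye(len(labels), device=sim.device) * 1e9
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_per_row = mask.sum(1).clamp(min=1)
    return -(mask * log_prob).sum(1).div(pos_per_row).mean()


# Toy training step on random data, only to show how the two losses combine.
model = PromptTagger()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
tokens = torch.randint(0, VOCAB, (4, 20))              # 4 references, 20 tokens
gold = torch.randint(0, len(LABELS), (4, 20))

reps, logits = model(tokens)
ce = F.cross_entropy(logits.reshape(-1, len(LABELS)), gold.reshape(-1))
scl = supervised_contrastive_loss(reps.reshape(-1, DIM), gold.reshape(-1))
loss = ce + 0.5 * scl                                  # loss weighting is a guess
loss.backward()
opt.step()
print(f"ce={ce.item():.3f}  contrastive={scl.item():.3f}")
```

In this sketch the cross-entropy term drives label assignment while the contrastive term shapes the representation space so that tokens of the same metadata type cluster together, which is the intuition behind combining the two objectives; the abstract does not specify how the paper weights or schedules them.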
