Abstract
The biomedical literature is growing rapidly, and biomedical literature mining is becoming essential. A maximum entropy (ME) learning classifier for identifying abbreviations is proposed. Two novel Web-based features that extract additional semantic information are developed. The study shows that the Web can be incorporated effectively as a knowledge source in the machine learning framework, and that doing so significantly improves the classifier's performance. The ME classifier achieves 95% precision and 89% recall on the gold-standard corpus “Medstract”, and 91% precision and 84% recall on a larger test set of 128 full-text articles.