280 Birds With One Stone: Inducing Multilingual Taxonomies From Wikipedia Using Character-Level Classification

Amit Gupta,Karl Aberer,Rémi Lebret,Hamza Harkous

doi:10.1609/aaai.v32i1.11921

Open Access

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 26, 2018
Citations: 3

Affiliation: École Polytechnique Fédérale de Lausanne

#Target Language #Multilingual Resource

Abstract

We propose a novel fully-automated approach towards inducing multilingual taxonomies from Wikipedia. Given an English taxonomy, our approach first leverages the interlanguage links of Wikipedia to automatically construct training datasets for the isa relation in the target language. Character-level classifiers are trained on the constructed datasets, and used in an optimal path discovery framework to induce high-precision, high-coverage taxonomies in other languages. Through experiments, we demonstrate that our approach significantly outperforms the state-of-the-art, heuristics-heavy approaches for six languages. As a consequence of our work, we release presumably the largest and the most accurate multilingual taxonomic resource spanning over 280 languages.
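
The first step described in the abstract, using interlanguage links to build is-a training data in a target language, could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `project_training_pairs`, the toy French titles, and the labeling rule (a target-language candidate edge is positive when its English counterpart appears in the English taxonomy, negative otherwise) are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (child, parent)


def project_training_pairs(
    english_isa: Set[Edge],
    target_to_english: Dict[str, str],
    target_candidate_edges: List[Edge],
) -> List[Tuple[Edge, int]]:
    """Label target-language candidate edges via an English taxonomy.

    Hypothetical labeling rule (an assumption, not taken from the paper):
    label 1 if the linked English edge is in the taxonomy, else 0.
    Candidates with an unlinked endpoint are skipped.
    """
    labeled = []
    for child, parent in target_candidate_edges:
        en_child = target_to_english.get(child)
        en_parent = target_to_english.get(parent)
        if en_child is None or en_parent is None:
            continue  # no interlanguage link, so the edge cannot be labeled
        label = 1 if (en_child, en_parent) in english_isa else 0
        labeled.append(((child, parent), label))
    return labeled


# Toy data: invented titles standing in for Wikipedia pages and categories.
english_isa = {("Dog", "Mammal")}
target_to_english = {"Chien": "Dog", "Mammifère": "Mammal", "Paris": "Paris"}
candidates = [
    ("Chien", "Mammifère"),  # linked, and the English edge exists
    ("Paris", "Mammifère"),  # linked, but no such English edge
    ("Chat", "Mammifère"),   # "Chat" has no interlanguage link here
]

training_pairs = project_training_pairs(english_isa, target_to_english, candidates)
```

The labeled pairs would then serve as training data for a classifier over the target-language strings; the character-level classifier and the path discovery step mentioned in the abstract are not shown here.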
