Abstract

This special issue of Language Resources and Evaluation, entitled "New Frontiers in Asian Language Resources", complements the earlier special double issue on Asian Language Processing: State of the Art Resources and Processing (Huang et al. 2006) by presenting eight papers describing specific Asian language resources. As Bird and Simons (2003) explain, research on language resources must deal with how the resources can be acquired and documented as well as how the resources can be accessed and used. Among the eight papers in this issue, the first four focus on resources, while the latter four target specific application tasks and describe resource building in the contexts of these applications. In the early days of corpus building, a "large scale" corpus might consist of one million words. Kilgarriff and Grefenstette's (2003) survey of the historical developments in corpus construction shows that the size of English corpora has increased roughly tenfold every decade since the 1960s, when the one million word Brown Corpus was developed. In the 1980s, the COBUILD project built an eight million word corpus, and the British National Corpus (BNC), completed in 1994, includes 100 million words. This trend continues with LDC's Gigaword Corpus, published in 2003, which contains nearly two billion words. A central question for

