
Walta Information Center - Tagged Amharic News Corpus

by admin last modified 2008-06-08 04:41

A corpus of 1,065 Amharic news articles (about 210,000 words) from the Walta Information Center (http://www.waltainfo.com/). The articles span the period 1998 to 2002 and have been tagged for part of speech and punctuation.


Notes

The corpus contains roughly 210,000 tagged words. The tags follow the convention detailed in the 2006 paper "Manual Annotation of Amharic News Items with Part-of-Speech Tags and its Challenges" (Girma A. Demeke & Mesfin Getachew).


Resource Contact: Lars Asker


License of Use

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.


Downloads


Status & Future Directions

  • Last Edit — 2008/05/30.
  • Generally suitable for research purposes, although some cleanup remains.


