MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition

dc.contributor.author: Ifeoluwa Adelani, David
dc.contributor.author: Neubig, Graham
dc.contributor.author: Ruder, Sebastian
dc.contributor.author: Rijhwani, Shruti
dc.contributor.author: Nakatumba-Nabende, Joyce
dc.contributor.author: Ogundepo, Odunayo
dc.contributor.author: Yousuf, Oreen
dc.contributor.author: Moteu Ngoli, Tatiana
dc.contributor.author: Klakow, Dietrich
dc.date.accessioned: 2022-12-29T13:29:46Z
dc.date.available: 2022-12-29T13:29:46Z
dc.date.issued: 2022
dc.description.abstract: African languages are spoken by over a billion people, but are underrepresented in NLP research and development. The challenges impeding progress include the limited availability of annotated datasets, as well as a lack of understanding of the settings where current methods are effective. In this paper, we make progress towards solutions for these challenges, focusing on the task of named entity recognition (NER). We create the largest human-annotated NER dataset for 20 African languages, and we study the behavior of state-of-the-art cross-lingual transfer methods in an Africa-centric setting, demonstrating that the choice of source language significantly affects performance. We show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points across 20 languages compared to using English. Our results highlight the need for benchmark datasets and models that cover typologically-diverse African languages. [en_US]
dc.identifier.citation: Ifeoluwa Adelani, D., Neubig, G., Ruder, S., Rijhwani, S., Beukman, M., Palen-Michel, C., ... & Klakow, D. (2022). MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition. arXiv e-prints, arXiv-2210. [en_US]
dc.identifier.uri: https://github.com/masakhane-io/masakhane-ner/tree/main/MasakhaNER2.0
dc.identifier.uri: https://nru.uncst.go.ug/handle/123456789/6745
dc.language.iso: en [en_US]
dc.publisher: arXiv e-prints [en_US]
dc.title: MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition [en_US]
dc.type: Other [en_US]
Files
Original bundle
Name: MasakhaNER 2.0 Africa-centric Transfer Learning for Named Entity.pdf
Size: 1.07 MB
Format: Adobe Portable Document Format
Description: Conference Paper
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission