publications
Publications by category, in reverse chronological order.
An up-to-date list is available on Google Scholar.
2025
- BRIGHTER: Bridging the Gap in Human‑Annotated Textual Emotion Recognition Datasets for 28 Languages. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), 2025. ACL 2025 Best Resource Paper Award.
- SemEval‑2025 Task 11: Bridging the Gap in Text‑Based Emotion Detection. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval‑2025), 2025. Best Task Paper Award.
- IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models. In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2025, Long Papers), 2025. Outstanding Paper Award.
- The State of Large Language Models for African Languages: Progress and Challenges. arXiv preprint, 2025. Best Paper Award at Deep Learning Indaba 2025.
- AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages. In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2025, Long Papers), 2025.
- HausaNLP: Current Status, Challenges and Future Directions for Hausa Natural Language Processing. arXiv preprint, 2025.
- Automatic Speech Recognition for African Low‑Resource Languages: Challenges and Future Directions. In Proceedings of the AfricaNLP 2025 Workshop at ACL 2025, 2025.
- Who Wrote This? Identifying Machine vs Human‑Generated Text in Hausa. In Proceedings of the AfricaNLP 2025 Workshop at ACL 2025, 2025.
- INJONGO: A Multicultural Intent Detection and Slot‑filling Dataset for 16 African Languages. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025, Long Papers), 2025.
- POLAR: A Benchmark for Multilingual, Multicultural, and Multi‑Event Online Polarization. arXiv preprint, 2025.
- AfroXLMR‑Social: Adapting Pre‑trained Language Models for African Languages Social Media Text. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), 2025.
- AfriDoc‑MT: Document‑level MT Corpus for African Languages. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), 2025.
2024
- BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages. In Advances in Neural Information Processing Systems (NeurIPS 2024, Datasets and Benchmarks Track), 2024. Best Non‑archival Paper Award at C3NLP Workshop.
- SemEval‑2024 Task 1: Semantic Textual Relatedness for African and Asian Languages. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval‑2024), 2024. Honourable Mention, Best Task Paper Award.
- SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages. In Findings of the Association for Computational Linguistics: ACL 2024, 2024.
- Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low‑resource African Languages. arXiv preprint, 2024.
- AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under‑resourced African Languages. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2024, Long Papers), 2024.
- Correcting FLORES Evaluation Dataset for Four African Languages. In Proceedings of the Ninth Conference on Machine Translation (WMT 2024), 2024.
- HausaHate: An Expert Annotated Corpus for Hausa Hate Speech Detection. In Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024), 2024.
- Findings of WMT2024 English‑to‑Low Resource Multimodal Translation Task. In Proceedings of the Ninth Conference on Machine Translation (WMT 2024), 2024.
2023
- AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023, Main), 2023. Best Non‑archival Paper Award at AfricaNLP 2023 Workshop.
- HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language. In Findings of the Association for Computational Linguistics: ACL 2023, 2023.
- AfriMTE and AfriCOMET: Empowering COMET to Embrace Under‑resourced African Languages. In Proceedings of the 2023 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2023), 2023.
- AfriWOZ: Corpus for Exploiting Cross‑Lingual Transfer for Dialogue Generation in Low‑Resource African Languages. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023.
- Combining Symbolic and Deep Learning Approaches for Sentiment Analysis. In Compendium of Neurosymbolic Artificial Intelligence, 2023.
- MasakhaPOS: Part‑of‑Speech Tagging for Typologically Diverse African Languages. In Proceedings of the 2023 Annual Meeting of the Association for Computational Linguistics (ACL 2023), 2023.
- HausaNLP at SemEval‑2023 Task 10: Transfer Learning, Synthetic Data and Side‑information for Multi‑level Sexism Classification. arXiv preprint, 2023.
- AfriQA: Cross‑lingual Open‑Retrieval Question Answering for African Languages. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), 2023.
- The African Stopwords Project: Curating Stopwords for African Languages. arXiv preprint, 2023.
2022
- MasakhaNER 2.0: Africa‑centric Transfer Learning for Named Entity Recognition. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), 2022. Service: IJCNLP–AACL 2023 Area Chair Award.
- NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis. In Proceedings of the Thirteenth Language Resources and Evaluation Conference (LREC 2022), 2022.
- Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low‑Resourced African Languages. arXiv preprint, 2022.
- BibleTTS: A Large, High‑Fidelity, Multilingual, and Uniquely African Speech Corpus. arXiv preprint, 2022.
- Symbolic Versus Deep Learning Techniques for Explainable Sentiment Analysis. In Portuguese Conference on Artificial Intelligence, 2022.
- Hausa Visual Genome: A Dataset for Multi‑Modal English to Hausa Machine Translation. In Proceedings of the 2022 International Conference on Language Resources and Evaluation (LREC 2022), 2022.
- HERDPhobia: A Dataset for Hate Speech Against Fulani in Nigeria. arXiv preprint, 2022.
2021
- Quality at a Glance: An Audit of Web‑Crawled Multilingual Datasets. Transactions of the Association for Computational Linguistics, 2021.
2020
- Participatory Research for Low‑resourced Machine Translation: A Case Study in African Languages. In Findings of the Association for Computational Linguistics: EMNLP 2020, 2020. Wikimedia Research Award.
- Participatory Research for Low‑resourced Machine Translation: A Case Study in African Languages. arXiv preprint, 2020.
- Incremental Approach for Automatic Generation of Domain‑Specific Sentiment Lexicon. In Advances in Information Retrieval (ECIR 2020), 2020.
- A Survey on Machine Learning Techniques in Movie Revenue Prediction. SN Computer Science, 2020.
2019
- An Overview of Sentiment Analysis Approaches. arXiv preprint, 2019.
2017
- Massive Open Online Courses: Awareness, Adoption, Benefits and Challenges in Sub‑Saharan Africa. In Proceedings of OcRI, 2017.
- A Framework for Implementation of E‑Classroom System. In Proceedings of OcRI, 2017.
2016
- Massive Open Online Courses: A Success of Cloud Computing in Education. In Proceedings of OcRI, 2016.