World Bank Group: From Chalkboards to Chatbots – Evaluating the Impact of Generative AI on Learning Outcomes in Nigeria
A World Bank working paper finds that using a GPT-4-powered virtual tutor in Nigerian secondary schools significantly boosts English, digital, and AI skills, with stronger gains for higher-performing students, female students, and students from higher socioeconomic backgrounds. The learning gains were equivalent to 1.5–2 years of traditional schooling, and the intervention proved highly cost-effective, suggesting that scalable AI tutoring can enhance learning in low-resource settings, provided challenges like digital equity are addressed.
This is a Policy Research Working Paper from the World Bank's Education Global Department, published in May 2025. Titled "From Chalkboards to Chatbots: Evaluating the Impact of Generative AI on Learning Outcomes in Nigeria," it details a study on the effectiveness of using large language models, specifically Microsoft Copilot powered by GPT-4, as virtual tutors for secondary school students in Nigeria.
The research, conducted through a randomized controlled trial over six weeks, found that the intervention led to significant improvements in English, digital, and AI skills among participating students, particularly female students and those with higher initial academic performance.
The paper emphasizes the cost-effectiveness and scalability of this AI-powered tutoring approach in low-resource settings, although it also highlights the need to address potential inequities in access and digital literacy for broader implementation.
- Significant Positive Impact on Learning Outcomes: The program, which used Microsoft Copilot (powered by GPT-4) as a virtual tutor in Nigerian secondary schools, improved scores by 0.31 standard deviations on an assessment covering English language, artificial intelligence (AI), and digital skills for first-year senior secondary students over six weeks. The effect on English skills, the main outcome of interest, was 0.23 standard deviations. These effect sizes are notably high compared with other randomized controlled trials (RCTs) in low- and middle-income countries.
- High Cost-Effectiveness: The intervention produced substantial learning gains, estimated to be equivalent to 1.5 to 2 years of 'business-as-usual' schooling. A cost-effectiveness analysis found that the program ranks among the most cost-effective interventions for improving learning outcomes, achieving 3.2 equivalent years of schooling (EYOS) per $100 invested per participant. When long-term wage effects are considered, the estimated benefit-cost ratio is very high, ranging from 161 to 260.
- Heterogeneous Effects Identified: While the program yielded positive and statistically significant treatment effects across all levels of baseline performance, the effects were found to be stronger among students with better prior academic performance and those from higher socioeconomic backgrounds. Treatment effects were also stronger among female students, which the authors note appeared to compensate for a deficit in their baseline performance.
- Attendance Linked to Greater Gains: A strong linear association was found between the number of days a student attended the intervention sessions and improved learning outcomes. Based on attendance data, the estimated effect size was approximately 0.031 standard deviations per additional day of attendance. Further analysis predicts substantial gains (1.2 to 2.2 standard deviations) for students participating for a full academic year, depending on attendance rates.
- Key Policy Implications for Low-Resource Settings: The findings suggest that AI-powered tutoring using LLMs has transformative potential in the education sector in low-resource settings. Such programs can complement traditional teaching, enhance teacher productivity, and deliver personalized learning, particularly when designed and used properly with guided prompts, teacher oversight, and curriculum alignment. The use of free tools and local staff contributes to scalability, but policymakers must address potential inequities stemming from disparities in digital literacy and technology access through investments in infrastructure, teacher training, and inclusive digital education.
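The attendance and cost figures quoted above lend themselves to a quick back-of-the-envelope check. The sketch below is illustrative arithmetic only: it assumes the per-day effect stays constant as attendance grows (a simple linear extrapolation, which may not match the model the authors used for their full-year projection), and the function names are hypothetical, not taken from the paper.

```python
# Illustrative arithmetic only. The two constants are the figures quoted in the
# summary above; everything else is an assumption made for this sketch.
SD_PER_DAY = 0.031       # estimated gain (standard deviations) per extra day attended
EYOS_PER_100_USD = 3.2   # equivalent years of schooling per $100 per participant


def projected_gain_sd(days_attended: float, sd_per_day: float = SD_PER_DAY) -> float:
    """Linear extrapolation: assumes every extra day adds the same gain."""
    return days_attended * sd_per_day


def days_for_gain(target_sd: float, sd_per_day: float = SD_PER_DAY) -> float:
    """Days of attendance implied by a target gain under the same linear assumption."""
    return target_sd / sd_per_day


def cost_per_eyos_usd(eyos_per_100_usd: float = EYOS_PER_100_USD) -> float:
    """Dollars per equivalent year of schooling implied by the EYOS-per-$100 figure."""
    return 100.0 / eyos_per_100_usd


if __name__ == "__main__":
    for target in (1.2, 2.2):
        print(f"{target} SD -> about {days_for_gain(target):.0f} days attended")
    print(f"Cost per EYOS: about ${cost_per_eyos_usd():.2f}")
```

Under this linear assumption, the projected full-year range of 1.2 to 2.2 standard deviations corresponds to roughly 39 to 71 days of attendance, and 3.2 EYOS per $100 works out to about $31 per equivalent year of schooling.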