Predictive Tool for Identifying Out of School Girls
Using machine learning to predict the prevalence of out-of-school
girls in rural India
Problem Statement:
There are 3 million out of school girls (OOSGs) in India due to a wide variety of economic and social causes.
Educate Girls' (EG) mission is to get them back into school and make sure they and their peers receive a quality education.
Currently, EG’s programs have 3 steps:
1) Door to Door (D2D) survey to locate OOSGs
2) Enrollment efforts to get those girls back into school
3) Community education programs to support these girls and their classmates in learning basic literacy and numeracy in schools.
Step 1 is currently very expensive and resource intensive.
EG currently does a D2D census to identify out of school girls. However, 90% of OOSGs live in fewer than 50% of villages, leading to great programmatic inefficiency.
Key Question: How can EG better target its programming to locate and serve more OOSGs with a fixed budget?
Solution Summary:
The solution uses machine learning, developed in partnership with IDinsight, to predict which villages will have large numbers of out of school girls and so inform EG’s expansion. The algorithms combine data from the 3,000,000 surveys EG has done to date with nationally available census and education data at the village level. From this, more accurate predictions are derived that allow EG to prioritize the villages, blocks, and districts with large numbers of OOSGs.
How Solution is Revolutionary:
Increases the effectiveness of the program, allowing EG to reach millions more OOSGs and allowing educators to provide academic interventions sooner.
EG can reach 47-57% more OOSGs in Rajasthan and MP at approximately the same cost as current programming.
Serves as a proof of concept for how social programs can be targeted more efficiently, from education to health care to nutrition.
- Supportive ecosystems for educators
- Personalized teaching, especially in disadvantaged communities
Educate Girls is using machine learning to help identify out-of-school girls as expediently as possible. Previously, EG conducted resource-intensive door-to-door (D2D) canvassing to locate girls who were out of school for its programming. However, the majority of OOSGs are found in a minority of villages: 10% of villages contain half of all OOSGs in the four Rajasthan districts and three Madhya Pradesh (MP) districts where EG has conducted D2D surveying. By using machine learning predictions to better target EG’s programming, EG can reach 47-57% more OOSGs in Rajasthan and MP at approximately the same cost as current programming.
IDinsight has built machine learning algorithms that take EG’s existing D2D data and combine it with government data to predict the number of OOSGs in every village in Rajasthan and Madhya Pradesh (MP). The primary algorithm used is a variant of “Random Forests” built in Python, which aggregates the results of hundreds of “decision trees” to make accurate predictions. The predictions consistently distinguish between high- and low-burden villages and between high- and low-burden blocks, and they rank villages and blocks by OOSG prevalence. This ranking has direct programmatic relevance for targeted D2D expansion, as EG could reach thousands more OOSGs by targeting high-burden villages.
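The idea of aggregating many decision trees can be illustrated with a minimal sketch. This is not IDinsight's model: it is a toy bagged ensemble of one-split regression trees ("stumps") written with the Python standard library only, and the two village features (female literacy rate, distance to the nearest school) and all counts are hypothetical values chosen for illustration.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature_indices):
    """Fit a one-split regression tree by minimizing squared error."""
    best = None
    for f in feature_indices:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds sit halfway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for r, t in zip(rows, targets) if r[f] <= thr]
            right = [t for r, t in zip(rows, targets) if r[f] > thr]
            left_mean, right_mean = mean(left), mean(right)
            sse = (sum((t - left_mean) ** 2 for t in left)
                   + sum((t - right_mean) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, left_mean, right_mean)
    if best is None:
        # No split possible (sampled rows identical): predict the sample mean.
        m = mean(targets)
        return (feature_indices[0], float("inf"), m, m)
    return best[1:]


def fit_forest(rows, targets, n_trees=200, seed=0):
    """Train each stump on a bootstrap sample and a random subset of features."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    k = max(1, n_features // 2)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Average the predictions of all trees in the ensemble."""
    return mean(left if row[f] <= thr else right
                for f, thr, left, right in forest)


# Hypothetical training data: [female literacy rate, km to nearest school].
villages = [
    [0.30, 6.0], [0.35, 5.0], [0.40, 4.5], [0.45, 4.0],
    [0.60, 2.0], [0.65, 1.5], [0.70, 1.0], [0.75, 0.5],
]
oosg_counts = [42, 38, 35, 30, 9, 7, 5, 3]  # hypothetical surveyed counts

forest = fit_forest(villages, oosg_counts)
high_burden = predict(forest, [0.32, 5.8])  # resembles the high-count villages
low_burden = predict(forest, [0.72, 0.8])   # resembles the low-count villages
```

A production model would use deeper trees and many more features; the point here is only that each tree sees a different resample of the survey data, and the averaged prediction is what gets used to rank villages.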
Over the next year, EG and IDinsight plan to work together to develop an operational strategy to incorporate predictions into EG expansion in Rajasthan and MP, and to continue this collaboration into a third phase focused on Bihar, Uttar Pradesh, and other expansion states. These states have a combined population approximately equal to that of the United States, making the ability to target well imperative. In addition, EG and IDinsight plan to jointly author a short article or blog post summarizing the findings and approach for this novel use of machine learning.
Given the relevance of the predictions for scale-up of government programs, EG and IDinsight will continue to explore collaboration with Indian states to incorporate machine learning and prediction into the expansion of governmental programs focused on OOSG enrollment, so that more girls can be identified quickly and provided appropriate interventions at an earlier stage. EG also aims to partner with government and partner organizations to take its model to scale in more places.
- Child
- Female
- Rural
- Lower
- Europe and Central Asia
- East and Southeast Asia
The predictive tool is used by Educate Girls field personnel. Areas deemed high burden are targeted first to find the largest number of out of school girls (OOSGs). For example, if EG were to target the top 50% of villages per district using the tool, EG would reach approximately 73.5% of all OOSGs in a new district in MP and 78.3% of all OOSGs in a new district in Rajasthan. These predictions would enable EG to reach 47-57% more OOSGs (2,500-3,000 girls) in the same number of villages. EG can reach thousands of additional OOSGs by targeting high-burden villages.
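The targeting arithmetic above can be sketched as follows. The village counts are hypothetical, and the baseline is an assumption: an untargeted rollout is taken to reach OOSGs in proportion to the share of villages covered. Under that assumption, the 73.5% and 78.3% figures quoted above work out to roughly 47% and 57% more OOSGs.

```python
def share_captured(predicted, actual, top_fraction):
    """Share of all OOSGs found in the top-ranked fraction of villages."""
    ranked = sorted(range(len(predicted)),
                    key=lambda i: predicted[i], reverse=True)
    n_target = round(len(ranked) * top_fraction)
    return sum(actual[i] for i in ranked[:n_target]) / sum(actual)


def relative_gain(share, top_fraction):
    """Extra OOSGs reached versus a proportional (untargeted) baseline."""
    return share / top_fraction - 1


# Hypothetical predicted and surveyed OOSG counts for eight villages.
predicted = [40, 5, 22, 3, 31, 8, 2, 12]
actual = [36, 4, 30, 2, 25, 9, 6, 8]

toy_share = share_captured(predicted, actual, 0.5)

# Figures quoted in the text for targeting the top 50% of villages.
gain_mp = relative_gain(0.735, 0.5)         # about 0.47
gain_rajasthan = relative_gain(0.783, 0.5)  # about 0.57
```

Ranking by predicted rather than actual counts matters: the actual counts are only known after the survey, so the share captured measures how well the predictions stand in for them.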
Gender-based discrimination has resulted in India being home to the largest number of illiterate women in the world (over 200 million), with a female literacy rate of 61% and 3 million eligible girls out of school, mostly in rural areas. The reasons include severe social, economic, and cultural dimensions of marginalization and a paternalistic society. Exacerbating the issue of access to education, the quality of education provided in rural government primary schools is mostly very weak, resulting in inferior learning outcomes. Targeting OOSGs expediently allows earlier academic interventions and shifts the patriarchal narrative toward one of greater equity.
- Other (Please explain below)
- 6
- Less than 1 year
Educate Girls has existing relationships, a large operational presence (1,500+ staff), and a proven model, as measured by rigorous research. EG has a funding plan to see through its operational expansion to serve 20,000,000 children over the next 5 years.
IDinsight has the analytical skills necessary to build the models and the policy and government experience necessary to tailor solutions to fit the context.
Educate Girls is a nonprofit (US 501(c)(3)) and IDinsight is a for-profit developmental agency. Educate Girls receives funding through grants and donations. The value of this tool speaks for itself, especially in a country like India, where human trafficking is high and health care delivery poses challenges. The hope is that the government, as well as other organizations working in different sectors, will invest in the technology, which will allow them to find and reach their targeted populations more effectively.
Solve provides a great platform to discuss some of the world's biggest social issues. Our proposed tool is not itself a support for educators and teachers, but it does allow out of school girls to be found more expeditiously, so that educators and teachers can get interventions to those girls more rapidly, learning gaps are bridged sooner, and students (girls) can integrate into the mainstream classroom more quickly. Solve's approach to solving problems through innovation and technology provides great leverage for showcasing and refining our predictive indicator.
Experimental funding for analysis work targeted at the greater good, including preliminary analysis that can be used in presentations to government and partners. Critical review and support in developing better quantitative methods.
- Peer-to-Peer Networking
- Connections to the MIT campus
- Impact Measurement Validation and Support
- Media Visibility and Exposure
- Grant Funding

Founder/CEO