Transferring "Making Survey Data Useful" to the next generation
- United States
- Not registered as any organization
The problem we aim to solve is the lack of capacity among government and partner institutions based in South Asia and sub-Saharan Africa to identify and reduce survey nonsampling errors. Surveys are a critical tool to help program managers and decision makers understand key issues about the populations and localities they oversee, such as existing status and challenges, the most appropriate and acceptable strategies to address these challenges, and opportunities for monitoring and evaluation to determine the impacts of implemented strategies. The information provided by high-quality, accurate surveys can lead to more evidence-based and effective policies and programs. However, the utility of many surveys is significantly compromised by nonsampling errors. We have seen this in all 60 countries where we have worked.
Survey nonsampling errors result from problems: 1) with the sampling frame (i.e., the list of sample units from which a sample is selected), 2) with data collection, 3) response and non-response errors, 4) mistakes in processing and 5) errors in summary and presentation.
In contrast, sampling errors are the differences between the estimates computed from a sample (e.g., population totals, means, quartiles, coefficients of variation and variances) and the corresponding population values; they show up as variation in the results when a survey is repeated using identical procedures. Of the two sources of error, nonsampling errors are often left out of trainings, yet they are also the larger source of error and the more difficult to estimate and control within surveys. Controlling nonsampling errors makes survey data accurate, timely, objective (which gives survey data credibility) and cost-effective. This Solve Challenge proposal aims to improve capacity among governments and their partners to reduce these errors and carry out better surveys. Further, the recently launched Hangzhou declaration “Accelerating progress in the implementation of the Cape Town Global Action Plan for Sustainable Development Data” called for “an urgent and sustained increase in the level and scale of investments in data and statistics from domestic and international actors, from the public, private and philanthropic sectors, to strengthen statistical capacity in low-income countries and fragile states, close data gaps for vulnerable groups and enhance country resilience in the current context of economic crisis, conflict, climate change and increased food insecurity.” This statement explains the specific challenge we wish to help solve. It is a big challenge!
We propose to use the Solve Challenge funds to organize all of our training classes on nonsampling errors and to conduct a “Training of Trainers” for institutions in South Asia and sub-Saharan Africa, and for any other institutions that buy in, so that they have the capacity to support governments and partner institutions to design and conduct more accurate, timely, objective, cost-effective and credible surveys with minimal errors. Our team has worked over multiple decades with partners in over 60 countries to design and conduct surveys, and we will transfer our skills to survey designers based at training institutions in these regions. The technology is relevant for any setting, so it can be used to build capacity in additional regions if there are interested partners.
Pakistan and Rwanda are two countries with excellent in-country training institutions that can function as regional Centers of Excellence. In particular, the Space and Upper Atmosphere Research Commission in Pakistan (SUPARCO) and the National Institute of Statistics of Rwanda (NISR) in Africa are renowned institutions that have agreed to participate in this activity should the funding be awarded. Our proposal is to develop comprehensive training courses and to do advanced training of professional staff within these two institutions using advanced technologies that are increasingly becoming available for use across the globe. We will show participants how to optimize the latest resources including satellite imagery, satellite technologies, GIS programs, mathematical statistics and artificial intelligence (AI) to resolve nonsampling errors associated with the 1) sampling frame, 2) data collection, 3) response and non-response errors, 4) mistakes in processing and 5) errors in summary and presentation. By the end of the training, we will ensure that staff within these institutions have the capacity to teach these technical courses to other professionals and institutions across the globe.
Our solution will serve many Directors of institutions and project managers. First, SUPARCO and NISR Professionals in Pakistan and Rwanda will be trained to train others in their institutions to design and conduct more accurate, timely, objective, cost-effective and credible surveys with minimal errors.
Subsequently, additional institutions and project managers in Asia, Africa or any other region trained by these two institutions will benefit from this enhanced capacity. Further, the expertise transferred to these training institutions can ultimately benefit any sector or discipline (e.g., surveys examining poverty, food security, disease, health, population or climate change, etc.) because all sectors and disciplines require the best information to make sound decisions regarding policy and programs.
The team leading this solution has been supporting government and international institutions to carry out better surveys for over 40 years. William Wigton has a Master's degree in Mathematical Statistics and worked for 22 years in the Research Division of the National Agricultural Statistics Service at the US Department of Agriculture, which is the best agricultural statistics service in the world. He then started a company to serve other countries with advanced methods and has worked in over 60 countries. Dr. Alvaro Gonzalez Villalobos has a PhD in Probability Theory and worked as Head Statistician at the Food and Agriculture Organization of the United Nations for 12 years. He has worked in over 100 countries. Mr. Seghir Bouzaffour has a Master's degree in Statistics and has worked in over 30 countries. Mr. Bouzaffour is a genius at data collection in the field. Dr. Sohail Malik is an expert in poverty monitoring surveys for poverty reduction. Each of these experts knows about nonsampling errors in surveys and what they do to accuracy.
The two in-country institutions selected to be trained in this Training of Trainers initiative and to lead subsequent trainings (SUPARCO in Pakistan and NISR in Rwanda) are excellent research institutions. Both have conducted multiple survey-related training courses within their countries as well as with institutions outside their countries. They are prepared and interested to take on and master this challenge.
NOTE: I will make this section better when I get everyone onboard. I asked George Battese today and I am sure he will be with us.
- Generate new economic opportunities and buffer against economic shocks for workers, including good job creation, workforce development, and inclusive and attainable asset ownership.
- 1. No Poverty
- 2. Zero Hunger
- 3. Good Health and Well-Being
- 6. Clean Water and Sanitation
- 10. Reduced Inequalities
- 12. Responsible Consumption and Production
- 15. Life on Land
- 16. Peace, Justice, and Strong Institutions
- 17. Partnerships for the Goals
- Growth
While we have worked and trained in many countries, some of our training programs will need to be refocused on nonsampling errors specifically and updated so that they remain adaptable and relevant to new issues that may be encountered and to new technologies that are becoming more widely available (e.g., through AI).
These issues are common to all surveys, including those evaluating poverty, food security, disease, health, population or climate change.
Prior to the COVID-19 pandemic, our plan was to conduct a series of training programs to ensure that our technological expertise is transferred to statisticians and program managers working in low- and middle-income countries so they can lead Centers of Excellence based in at least two regions for the foreseeable future. However, this effort was interrupted by the pandemic and has been difficult to restart since then. These funds will enable us to ensure that this training occurs, and that the content is updated to incorporate the most recent technologies and learnings from the field. We will ensure that our training enhances the capacity of participants so that they are able to identify and address nonsampling error challenges even beyond what they may have encountered previously. This will be done to establish their ability to function as Centers of Excellence with their current and future partner institutions.
Moreover, MIT Solve experts are familiar with many different projects and may know of places where our expertise would be useful to those projects. We are open to that possibility.
- Product / Service Distribution (e.g. delivery, logistics, expanding client base)
Innovation is good but simple and effective is even better. We have helped project managers to be efficient and effective by providing specific targets and better monitoring and evaluation data at the right time and with the right amount of detail.
We teach what is important in setting up a data collection system together with the project managers who will need to monitor and make resource decisions. We teach how to complete a project manager's data user requirements study. We show how to start projects with clear objectives and then how to build the data collection and data summary system. We stress the areas that deserve the most attention.
Moreover, we help the project managers build the data system. Today, project managers are information processing engineers who wish to solve problems in society, so first they must solve their information processing problems. “To the extent that they are able to master the information problems within their purview, they establish their analytical capacity and their social usefulness. Successful information processing is in turn primarily a problem of the appropriate design of the information systems within which data are collected, analyzed, and acted upon by decision makers.” (Dr. James Bonnen, 1975)
We expect our training to have an impact because we have taught many full-scale programs and they have had an impact. If a survey is properly set up, the data collection aspect of the survey is less complicated, the samples are smaller, and the data are easier to collect, analyze, trust and act on.
We usually spend more time and resources setting up and less time collecting data.
Impact Goals for our solution
- Institutions that have been trained.
- Professionals trained in the institution where training took place.
- Number of surveys where nonsampling errors occurred.
- Number of possible nonsampling errors identified with a survey design.
- Number of nonsampling errors that actually took place.
- Corrections in survey procedures to counter the nonsampling errors on surveys.
- Data user requirements listed.
Poverty Reduction
“Investing in better data is key to supporting a rescue plan for people and planet”
'The need for data capacity building has never been so urgent, as countries face multiple crises on health, food, energy and climate, and need better data to support policymaking. It is also paramount to ensure effective monitoring and reporting on the progress towards achieving the SDGs.
The problem is that the data requirements are getting more stringent while at the same time their statistical budgets are getting more restrictive.
In response to the funding gap in data, the recently launched Hangzhou declaration “Accelerating progress in the implementation of the Cape Town Global Action Plan for Sustainable Development Data”, called for “an urgent and sustained increase in the level and scale of investments in data and statistics from domestic and international actors, from the public, private and philanthropic sectors, to strengthen statistical capacity in low-income countries and fragile states, close data gaps for vulnerable groups and enhance country resilience in the current context of economic crisis, conflict, climate change and increased food insecurity.”'
Moreover, governments don’t always want these data to be available to the public.
We find that with this goal in particular, measuring change from survey to survey is the most important objective. Data take on meaning over time. So, a goal is to identify the sources of nonsampling errors and to design systems that can measure change, rather than designing a new system for each 5-year survey period. That is, survey design parameters should not change after the first survey if at all possible. If the results of successive surveys are correlated, you can reduce the sample size, improve the management and quality of data handling, and obtain more informative data while staying within budget.
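The link between correlation and sample size can be illustrated with a small sketch. It rests on assumptions that are ours and not part of any specific survey: the same units are revisited in both rounds, the per-unit variance `sigma2` is equal in both rounds, sampling is simple random, and the finite population correction is ignored. Under those assumptions the variance of the estimated change in a mean is 2·sigma2·(1 − rho)/n, so a higher correlation `rho` between rounds allows a smaller sample for the same precision. The function names are illustrative only.

```python
import math


def variance_of_change(sigma2, n, rho):
    """Variance of (mean in round 2 - mean in round 1) when the same n units
    are measured in both rounds, each round has per-unit variance sigma2,
    and the two measurements on a unit have correlation rho."""
    return 2.0 * sigma2 * (1.0 - rho) / n


def sample_size_for_change(sigma2, target_variance, rho):
    """Smallest n whose variance of change does not exceed target_variance."""
    return math.ceil(2.0 * sigma2 * (1.0 - rho) / target_variance)


if __name__ == "__main__":
    # Same precision target, with and without correlation between rounds.
    for rho in (0.0, 0.5, 0.75):
        print(rho, sample_size_for_change(sigma2=100.0, target_variance=4.0, rho=rho))
```

Changing the design between rounds usually lowers the correlation, which is one reason to keep design parameters fixed after the first survey.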
Food Security
"The number of people facing hunger and food insecurity has been rising since 2015, with the pandemic, conflict, climate change and growing inequalities exacerbating the situation. In 2022, about 9.2 per cent of the world population was facing chronic hunger, equivalent to about 735 million people – 122 million more than in 2019. An estimated 29.6 per cent of the global population – 2.4 billion people – were moderately or severely food insecure, meaning they did not have access to adequate food. This figure reflects an alarming 391 million more people than in 2019. Despite global efforts, in 2022, an estimated 45 million children under the age of 5 suffered from wasting, 148 million had stunted growth and 37 million were overweight. A fundamental shift in trajectory is needed to achieve the 2030 nutrition targets."
Probability sampling and statistical inference. When samples are selected using probability methods from sampling frames that account for all units of a population, each unit of the population has a chance to be selected, data are collected without error and the samples are expanded properly, samples can be small and yet accurate. This is explained in detail next.
Probability sampling and statistical inference are the core traditional technologies that we use, and they can be supported by AI. When 1) samples are selected using probability procedures, 2) from a sampling frame constructed to account for all units of a population without duplication of any units, 3) each unit of the population has a chance (a known, nonzero probability) of being selected, 4) data are collected without error and 5) data are expanded properly, the resulting estimates of population parameters are unbiased. Only at this point is it time to think about sampling errors, and not before.
To get to this place, special types of artificial intelligence (AI) are used. More specifically, differential calculus and a special type of simulated annealing are used when enough data about each sampling unit are present. For example, suppose a population of interest is subdivided into 15,000 units, and we want to group these 15,000 units into 10 subpopulations (strata). Each unit can be placed in any of the 10 strata, so the number of possible groupings is far too large to enumerate, well beyond billions of combinations. So, which of these combinations is most efficient for sampling? A simulated annealing process searches this space and converges on near-optimal strata. This takes information about each sampling unit, a very fast computer and some smart programming.
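The unbiasedness claim can be checked with a small simulation. This is a minimal sketch under assumptions of our own: a made-up population, simple random sampling without replacement, a complete frame and error-free data. Each sampled value is expanded by the inverse of its selection probability (N/n), and the average of many such estimates should settle close to the true population total. The function names are hypothetical.

```python
import random


def expansion_estimate(population, n, rng):
    """Estimate the population total from one simple random sample of size n."""
    big_n = len(population)
    sample = rng.sample(population, n)  # every unit has selection probability n/N
    weight = big_n / n                  # expansion factor = 1 / selection probability
    return weight * sum(sample)


def average_estimate(population, n, replications=5000, seed=1):
    """Average the expansion estimate over many independent samples."""
    rng = random.Random(seed)
    total = sum(expansion_estimate(population, n, rng) for _ in range(replications))
    return total / replications


if __name__ == "__main__":
    population = list(range(1, 201))  # illustrative values only
    print("true total:", sum(population))
    print("average of estimates:", average_estimate(population, n=20))
```

Any single estimate will differ from the true total (that difference is the sampling error), but there is no systematic drift. Dropping units from the frame or expanding with the wrong weight would introduce a bias that no increase in sample size removes.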
But once set up, very efficient sampling can be done that can produce excellent data.
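The stratification search described above can be sketched as follows. This is not our production procedure; it is a minimal illustration that assumes a single auxiliary value per sampling unit, uses the total within-stratum sum of squares as the cost to minimize, and applies a simple geometric cooling schedule. All names and parameter values are hypothetical, and the starting temperature would need tuning to the scale of real data.

```python
import math
import random


def within_ss(values, labels, n_strata):
    """Total within-stratum sum of squared deviations (lower = more homogeneous)."""
    total = 0.0
    for h in range(n_strata):
        members = [v for v, lab in zip(values, labels) if lab == h]
        if members:
            mean = sum(members) / len(members)
            total += sum((v - mean) ** 2 for v in members)
    return total


def anneal_strata(values, n_strata, steps=5000, t_start=10.0, t_end=0.01, seed=0):
    """Assign each unit to a stratum by simulated annealing.

    Returns (best_labels, best_cost, initial_cost)."""
    rng = random.Random(seed)
    labels = [rng.randrange(n_strata) for _ in values]  # random starting assignment
    cost = within_ss(values, labels, n_strata)
    initial_cost = cost
    best_labels, best_cost = labels[:], cost
    for step in range(steps):
        # Temperature falls geometrically from t_start to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(len(values))
        old, new = labels[i], rng.randrange(n_strata)
        if new == old:
            continue
        labels[i] = new  # propose moving one unit to another stratum
        delta = within_ss(values, labels, n_strata) - cost
        # Always accept improvements; accept worse moves with probability exp(-delta/temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost += delta
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[i] = old  # reject the move
    return best_labels, best_cost, initial_cost


if __name__ == "__main__":
    rng = random.Random(42)
    # Illustrative data: 60 units drawn around three different levels.
    values = [rng.gauss(mu, 1.0) for mu in (10, 50, 90) for _ in range(20)]
    labels, best, start = anneal_strata(values, n_strata=3)
    print("initial cost:", round(start, 1), "best cost:", round(best, 1))
```

Recomputing the full cost at every step is acceptable for a few dozen units. For a population of 15,000 units the cost would be updated incrementally, which is part of the "smart programming" mentioned above.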
The traditional technologies we use are random sampling, systematic sampling, double sampling and multiple frame sampling. On the satellite side, we use data classification (some type of discriminant analysis assisted by AI) and other digital technologies.
Each of these technologies can be useful in specific instances.
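As one concrete instance, systematic sampling can be sketched in a few lines. The example assumes an ordered list frame and a fractional sampling interval with a random start; the function name and the village labels are made up for illustration.

```python
import random


def systematic_sample(frame, n, rng):
    """Select n units from an ordered frame using a random start and a fixed interval."""
    big_n = len(frame)
    if not 0 < n <= big_n:
        raise ValueError("sample size must be between 1 and the frame size")
    step = big_n / n             # sampling interval (may be fractional)
    start = rng.random() * step  # random start in [0, step)
    # Guard the index against floating-point rounding at the upper end.
    return [frame[min(int(start + k * step), big_n - 1)] for k in range(n)]


if __name__ == "__main__":
    frame = [f"village_{i:03d}" for i in range(1, 101)]
    print(systematic_sample(frame, 10, random.Random(7)))
```

Because the selected units are spread evenly through the frame, sorting the frame by geography or size before selection gives an implicit stratification at no extra cost.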
- A new application of an existing technology
- Artificial Intelligence / Machine Learning
- Big Data
- GIS and Geospatial Technology
- Imaging and Sensor Technology
We have the following part-time staff. We are all retired from full-time work as experts in survey technology.
William H Wigton (MS)
Alvaro Gonzalez Villalobos (PhD)
Seghir Bouzaffour (MS)
Every one of our experts has more than 40 years of experience working on these problems. We have worked both together and separately.
We have all worked together in Rwanda for the National Institute of Statistics of Rwanda (NISR).
William H Wigton is a US citizen.
Seghir Bouzaffour was born in Morocco and is both a Moroccan and Canadian citizen.
Alvaro Gonzalez Villalobos is an Argentine and French citizen.
We provide services to international and government institutions who ask for our services. The services are always tailored to the specific survey problem.
We have always had to compete for work with other consulting firms. In Rwanda, five firms originally wrote proposals. The field was narrowed to three firms in the first round, and then one firm dropped out, leaving two. In the final round, AAIC was underbid by over $200,000, but the competing firm was rejected by the National Institute of Statistics of Rwanda (NISR). We explained that we could also do an inexpensive consultancy, but that they would pay more and get less in the end. NISR management agreed.
In Pakistan, the Government estimated that it had a surplus of wheat and sold wheat on the international market. Six months later it found that it was short of wheat and had to buy one million tons, which cost the government three hundred million dollars.
The Space and Upper Atmosphere Research Commission (SUPARCO) went to FAO/UN and asked for help. We were hired, and two years later they were estimating hectares and production of crops in Pakistan.
Our business plan is to provide consulting services billed by the day, with audited overhead rates on consultancies. Direct expenses such as travel and per diem are based on USAID or UN rates, with a contingency rate on the total.
We do not charge overhead rates on direct expenses.
- Organizations (B2B)
Agricultural Assessments International Corporation (AAIC) was in business from October 1983 until 2020, when the COVID-19 pandemic started. The company was started by William H Wigton, and all of our key members worked for it as well as for FAO/UN and USAID as consultants.