
Stata, Python
Multiple Regression, Statistical Modeling, Predictive Analysis 

Sample 1: Statistical Modeling / Multiple Regression


We used statistical modeling and multiple regression to assess the association between national education spending and youth literacy rates, using data on 189 World Bank member countries. Our team was among the top finalists at the Harvard Student Research Symposium.

Our Primary Research Questions

  • What is the relationship between education spending and youth literacy rate controlling for a country’s average years of schooling, primary pupil to teacher ratio, and development status?

  • Does the relationship between education spending and youth literacy rate differ by development status controlling for average years of schooling and primary pupil to teacher ratio?


Our Findings

Our analysis of the 2015 HDR dataset compiled by the UNDP investigated the relationship between education spending and youth literacy rate. We were specifically interested in examining this relationship across development status, controlling for average years of schooling and primary pupil to teacher ratio. Our study found a statistically significant association between education spending and youth literacy rate, as well as evidence that this association differs according to a country’s development status. In developing countries, higher education spending was associated with a statistically significant improvement in youth literacy rates, while developed countries exhibited a negative association, in which increased education spending was associated with lower youth literacy rates. As a result, we recommend that developing countries direct their education policy towards increasing education spending, while developed countries critically reassess their investment and allocation strategies for education spending.


We first investigated the uncontrolled association between these variables and found a statistically significant, positive association, where a doubling in education spending predicted a 4.91 percentage point increase in youth literacy rate.
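A "per doubling" interpretation like the one above usually arises when the predictor enters the model on a log scale. The source does not state which transformation or log base was used, so the following is only a sketch under that assumption. It shows how a coefficient on a logged predictor converts to a per-doubling effect; the function name is illustrative and not taken from the original analysis.

```python
import math

def effect_per_doubling(coef, log_base):
    """Predicted change in the outcome when the predictor doubles,
    given the coefficient on log_base(predictor)."""
    # Doubling x raises log_b(x) by log_b(2), so the outcome shifts by coef * log_b(2).
    return coef * math.log(2, log_base)

# Assumption: if spending entered as log2(spending), the coefficient
# is itself the per-doubling effect.
print(effect_per_doubling(4.91, 2))

# Assumption: if spending entered as ln(spending) instead, this larger
# coefficient would correspond to the same 4.91-point effect per doubling.
ln_coef = 4.91 / math.log(2)
print(round(ln_coef, 2), round(effect_per_doubling(ln_coef, math.e), 2))
```

Either way, the reported 4.91 percentage points describes a multiplicative change in spending (a doubling), not a fixed dollar increase.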

Controlling for average years of schooling and primary pupil to teacher ratio, a doubling of education spending in developing countries predicted a 3.25 percentage point increase in youth literacy rate. This finding suggests that in developing countries, education spending is positively associated with youth literacy rate. More surprisingly, a doubling of education spending under similar conditions in developed countries predicted a 1.74 percentage point decrease in youth literacy rate. It appears that increased education spending in developed countries is associated with lower youth literacy rates.
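One way to read the two slopes above is as a single model with an interaction between spending and development status. The sketch below assumes that setup, with developing countries as the reference group and spending measured in doublings. The parameterization and all names are assumptions made for illustration; only the two slopes come from the text.

```python
import math

# Slopes per doubling of spending, as reported in the text.
SLOPE_DEVELOPING = 3.25
SLOPE_DEVELOPED = -1.74

# Assumed parameterization: developing countries are the reference group,
# so the interaction term is the difference between the two slopes.
b_spend = SLOPE_DEVELOPING
b_interaction = SLOPE_DEVELOPED - SLOPE_DEVELOPING

def predicted_change(spend_before, spend_after, developed):
    """Predicted change in youth literacy (percentage points) for a change
    in spending, holding the other covariates fixed."""
    doublings = math.log2(spend_after / spend_before)
    slope = b_spend + b_interaction * (1 if developed else 0)
    return slope * doublings

print(round(b_interaction, 2))
print(round(predicted_change(100, 200, developed=False), 2))
print(round(predicted_change(100, 200, developed=True), 2))
```

Under this reading, the interaction coefficient is what tests whether the association differs by development status, which is the second research question.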



While our findings indicate a statistically significant relationship between education spending and youth literacy rate, we cannot infer a causal relationship between these variables.

Because of dataset limitations, our findings cannot be generalized to all World Bank member countries: 89 countries were excluded due to missing data. Although a sample of 100 countries is substantial, the particularities of the countries included, such as greater prioritization of education, may bias our results.
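Excluding countries with missing data, as described above, amounts to a complete-case (listwise deletion) filter. The sketch below illustrates the idea on toy rows. The field names and values are hypothetical and are not drawn from the HDR dataset.

```python
def complete_cases(rows, required):
    """Keep only rows where every required field is present (not None)."""
    return [r for r in rows if all(r.get(k) is not None for k in required)]

# Toy rows with hypothetical field names; not the actual HDR data.
rows = [
    {"country": "A", "spending": 4.1, "literacy": 98.0, "ptr": 18.0},
    {"country": "B", "spending": None, "literacy": 75.5, "ptr": 40.0},
    {"country": "C", "spending": 5.3, "literacy": None, "ptr": 22.0},
    {"country": "D", "spending": 3.0, "literacy": 88.2, "ptr": 30.0},
]
required = ["spending", "literacy", "ptr"]

kept = complete_cases(rows, required)
dropped = [r["country"] for r in rows if r not in kept]
print(len(kept), dropped)
```

Listing the dropped countries makes it easier to check whether the excluded group differs systematically from the retained one, which is the source of the bias noted above.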

Available data provided neither a gender inclusive literacy statistic nor a weighting by gender, and we had the option of selecting either male or female literacy rate as the outcome variable. Because women’s educational opportunities are more heavily influenced by geographic, cultural, and religious contexts, male youth literacy rate serves as a more reliable measure for examining the controlled association between education spending and literacy. However, the conclusions drawn from examining the youth literacy of males rather than the aggregate of both males and females limit the scope and validity of our findings. Furthermore, though the outcome we examined was exclusive to male youth, our predictors reflected data on both sexes.











Needs Analysis, Focus Groups, Surveys, Literature Review, Comprehensive Reports, SWOT Analysis

Sample 2:  Needs Analysis / Surveys for a Climate Curriculum


Inspired by the United Nations 2030 Sustainable Development Goals, the ultimate goal of the course is to produce graduates from schools of education worldwide who can serve as leaders in the 21st century global movement on the pressing issue of climate change mitigation, adaptation, impact reduction and early warning. Implementation of the course was documented in Springer Nature (below).

Survey Administration and Interpretation

To assess preliminary needs and interest in a climate change curriculum for the 21st century at HGSE, we administered a Knowledge, Attitude, and Practice (KAP) survey to 66 respondents, most of them HGSE students. A randomized sample was collected through an online survey soliciting students within HGSE and the greater Harvard community. Limitations of the survey include probable bias towards climate activism due to its opt-in design, as well as its small sample size, representing less than 10% of the overall HGSE student population. The 66 participants were affiliated with various programs at HGSE and planned to go into different sectors in education, reflecting the diverse student composition in education sectors that our curriculum targets. Each question was designed specifically to correspond with a Knowledge, Attitude, or Practice assessment of students’ understanding of and interest in climate change.


In a self-assessment of their knowledge of climate change, 66.2% of the respondents reported that they had gathered some information about the subject, and 15.4% reported very limited knowledge of climate change. To test actual knowledge of climate change, we administered two multiple-choice questions on its causes and effects. Only 13 respondents, or 19.7% of the total sample, answered both questions correctly, demonstrating a significant gap in students’ actual knowledge of climate change.


90.5% of the respondents were very certain that climate change was happening, showing a strong consensus on the existence of the phenomenon. 95.4% of respondents expressed concern about climate change, with 64.6% responding that they were very concerned. There was a general consensus that education has a significant role in mitigating climate change, with 95.4% of respondents selecting 4 or 5 on a scale of 5, where higher numbers indicate greater significance. Although slightly weaker (84.6%), strong agreement was also observed in responses to the question of whether educators were significant in mitigating climate change.


By analyzing qualitative responses to questions about respondents’ efforts to mitigate climate change, we extracted five general themes: reducing one’s own carbon footprint, raising awareness, engaging in political action, no effort, and skepticism of individual effort in mitigating climate change. The majority of the respondents (90.4%) had made efforts to mitigate climate change, with the most common responses related to reducing their own carbon footprint (80.7%).

Next, an analysis of respondents’ attempts to mitigate climate change through education yielded themes including informal conversations, teaching students, using social media, organizing events, and advocacy. The most common attempts were engaging in informal conversations (25.5%) and teaching students about climate change (23.4%).


34 students, or 52.3% of total respondents, showed an interest in taking a course on climate change and education.


Although formative assessments like ours are meant to be neither generalizable nor predictive of actual behavior, this figure provides some indication that there is considerable interest in climate change education (CCE). The authors shared the results of the survey, and the feedback was immediately positive. The urgency of the situation and the demand from the student population clearly demonstrate the need for this class.

Sample 3:  Needs Analysis - Focus Group for a Media Production

Formative Evaluation

To better understand the needs and interests of our target audience, we reached out to the 4th graders at Garden Pilot Academy. We asked questions on topics pertaining to media and obesity to build our understanding of the knowledge and preferences of 9- and 10-year-olds.

Our formative evaluation consisted of two large segments: a paper survey (Appendix D), in which we assessed students’ baseline knowledge, attitude, and practice with regard to individual health, and a group interview to gather data on their understanding of the storyline and how much it appealed to them. These valuable insights will enhance the usability of our programming.

Knowledge. To assess the basic knowledge of 9-year-olds on health and nutrition, we asked students, “What does ‘staying healthy’ mean to you?” Eight of the 10 students surveyed (80%) answered correctly, and 2 answered incorrectly. Students seemed to have an adequate baseline understanding of healthy habits.

When asked, “What does it mean if someone is ‘overweight’?” 9 of the 10 students demonstrated a correct understanding of the concept “overweight.” Only 1 student wrote “I don’t know.”

Attitude. To get a feel for students’ socio-emotional connection to the issue, we asked students, “Have you felt different or teased because of the way your body looks?”

7 students answered “No,” 2 students answered “Kind of/Maybe,” and 1 student answered “Yes.”

Behavior. Finally, to understand students’ typical snacking habits, we asked, “What are some of your favorite snacks?”

Responses skewed heavily towards junk food: 5 students answered “candy/gummy candy” but also “yogurt”; 3 students named “fruits” or “chips” as their go-to snacks; and the remaining single responses were cookies, soda, Hot Cheetos, crackers, Goldfish, peppers, cucumber, apple, Oreos, and chocolate.

Media Habits. Self-reports show that students spend the most time reading books and that television ranked 4th in media preference. Despite these findings from the field, we felt that our research on the needs and preferences of local Chinese children was strong enough to proceed with our intervention campaign.


Media Habits of 4th Graders

Character Choice. Unable to decide whether to use human or cute, furry animal characters as protagonists of our animated series, we decided to turn to our 9-year-olds for help.

The 4th graders overwhelmingly preferred animated animal characters over human characters. Their choice of animals was surprising: a rabbit for Charlie, a fun-loving, sporty 12-year-old, and a tiger for Lucy, an independent 9-year-old girl. The children impressed upon us the importance of making Lucy a strong, fierce, and likable protagonist, reflective of 21st century movements of female empowerment. This refreshing juxtaposition of the characters provides an element of fun, suspense, and binary dynamism to our story.

Plot Appeal. During a short group reading of the plot (a formative storymatic), three members of our group observed the students and tallied the children’s behaviors and comments during each segment of the plot. Our key takeaways: the children laughed at plot points such as “getting stuck in the toilet,” asked for details about them, and showed curiosity with questions such as “How did he get out?” Attention waned somewhat after “every child in town” but returned when the children heard key phrases such as “the kind scientist,” “watch,” “sugar fighter,” or “I will come back.”

Students were curious to learn about the relationship between the two doctors, and we soon realized that our original names for the doctors could confuse the students. We elaborated on a backstory for the doctors and came up with catchy names based on the feedback.

Students helped brainstorm plot choices such as what the Energizer should do (add more features such as a tracking device for protection), where to set the context (a jungle or forest for the battle to take place), and additional plot choices (a magical fluoride toothpaste or floss) for our protagonists to fight villains (such as “Super Soda,” a very bad king).

This formative storymatic gave us a sound basis for making changes to our final product.

Dance. To see how students would respond to a short, animated dance clip, we played two separate clips of animated characters dancing. Students immediately responded by imitating dance gestures or telling us how ‘cool’ the moves were. This feedback led us to include a catchy ‘dance’ in our television programming that students could imitate as part of a physical movement campaign.

Limitations. Despite an overwhelmingly positive experience in the field, our small sample size limited our research efforts in several ways. First, it would be difficult to generalize our findings to the overall population, and certainly not to our target audience of Chinese children living in Tier 1 cities. Second, the surveys were self-reported by 4th grade students, so inaccuracy and bias should be heavily taken into consideration.

Sample 4:  Market Research and User Persona
