MIT Solve

Solution Overview & Team Lead Details

Solution name.

Taal - Know Your Notes

Provide a one-line summary of your solution.

Making music education accessible and equitable. An application to help underprivileged students and teachers self-learn singing and evaluate assignments using note detection and automatic music transcription.

What specific problem are you trying to solve?

My overarching goal is to improve access to high-quality music education using technology. In my community, among those interested in music, there is a significant separation between those who have the resources to pursue it actively and those who do not. More specifically, private vocal lessons are costly.

A survey of 3000 music educators done by the DMR (Dynamic Music Room) revealed that only 19.8% feel moderately well-prepared or very prepared for the job. Most public schools could not continue choir or other music education during remote learning. In our private school, teachers asked each student to submit recordings of their singing so they could listen and provide individual feedback, which took a lot of work to both understand and correct. Based on a 2021 study conducted by NAMM, an average school district spent 1.9% of its total operating expenditures on music programs, and 85.4% of that was spent on music educators' salaries.

These stats and situations imply a need for more technology applications to assist teachers and students in music education.

With the increase in global accessibility to online education, many people use online courses to teach themselves various skills, like programming, math, science, writing, etc. However, only some of these courses are effective in art-related fields, especially music.

The reason is that none of the music education platforms include an immediate feedback system. Students are either expected to judge themselves or wait hours or days for a teacher to evaluate their singing. The critical gap in music education at the elementary level is the lack of immediate feedback on students' singing and of assistance for teachers in assessing it. With current tools and applications, teachers are left to manually evaluate each singing piece and give that feedback to students.

Elevator pitch

What is your solution?

My solution is a learning tool for students and teachers that can assist with music education. My prototype has two components: a website and a Chrome extension.

The website allows students to practice independently without a structured class environment. Students can upload audio recordings or record directly on the website and transcribe the recording into a piece of notated music with the click of a button. The automated transcription is incredibly useful to students because they can see how they are singing and compare it to the sheet music they were singing.

The Chrome extension uses the same core technology to help teachers grade assignments on Sight Reading Factory, a paid platform that enables teachers to assign sight reading exercises to students. However, this platform does not have an automatic grading feature, meaning teachers must spend extra time grading. To make this process faster, music teachers can use my Chrome extension, which, when the teacher is on a grading webpage, extracts the student recording to be graded and transcribes it into notated music. The transcribed sheet music is displayed on the screen below the original notated music the student was trying to sing. With this Chrome extension, grading is reduced to a simple comparison, making the process easy and fast for teachers.

The main code for both components is written in Python. The analysis starts with pitch detection using the SPICE pitch model, followed by rhythm analysis. The extracted notes and rhythm are encoded in the MusicXML format using music21 (open-source software), and OSMD (OpenSheetMusicDisplay) is then used to display the result to the user.

This is the beginning of a complete music education platform that can help students and teachers accelerate music learning with automatic detection and transcription of vocal music.

Who does your solution serve? In what ways will the solution impact their lives?

There are two equally important target population sets for this project - students who are starting to learn choir singing and music teachers. When addressing the student population, this project helps the students in the early stages of music learning the most, as that is when they need the most help with a guided curriculum and active feedback. As for the teacher target population, the project can help teachers at all levels improve their efficiency in grading the students' singing.

With the current learning tools and processes, amateur music students rely heavily on music teachers to guide them. The music lessons are costly, and students from low-income families cannot afford them. In addition, since all the feedback and evaluation are manual, they get feedback once a week at most, which means they don't improve as fast as they would have if there was an automated feedback mechanism. My solution is to build an automatic music transcription service for singing which can provide feedback within minutes to the singer. With this service, the students can record music with a web browser or a mobile phone and immediately get the transcription of the piece, including notes and rhythm. They can now compare this to the sheet music they were singing and understand where they went wrong. With this feedback loop, they can practice independently and go to the teacher for expert feedback. This can completely transform the learning experience for underprivileged students and help them reach their full musical capabilities.

The project also helps teachers use the same automatic transcription to grade music assignments much faster. Currently, teachers evaluating an assignment need to listen to the entire submitted singing sample, highlight the errors manually for the student, and then give that feedback. In most cases, teachers communicate the errors on paper, listing the time and the notes where each mistake happened. Imagine how different this would be with automated transcription, where teachers are shown the most probable error locations with a marker in time. With this tool, they will also be able to go to that location, play that part of the music, write detailed feedback in addition to the error notes the product points out, and then send it to students from within the system. I plan to extend the project further to allow teachers to record the correct singing and deliver that to students. This makes the entire grading experience digital and extremely efficient.

In addition to helping students and teachers, once fully developed, this project will allow financially constrained public schools and nonprofits to scale their music education cost-effectively.

How are you and your team well-positioned to deliver this solution?

I am currently working alone, without a team. This problem falls at the intersection of math, machine learning, and music. Over the last four years, I have trained myself on several different fronts, making me the best candidate to solve this challenge. I have been singing in the school choirs, learning music theory, and extensively programming. I have also taught myself machine learning using online course material and taken AP computer science courses at my school.

I also understand the community I am targeting because I struggled with music education early in my singing journey. I have sung in choirs since the 4th grade. My parents are first-generation immigrants, and no one in my family is a musician. They lacked the knowledge to recognize my talent and passion for music or help find good coaches to train me. Because of my struggles with access to music education in my early years, I have many times failed to qualify for technical auditions despite having a good voice. This has been my source of motivation and passion for solving this problem.

Based on my knowledge, I have built a very credible prototype and have been testing it with my peer students and teachers at my school. I am aware that I lack deep knowledge of machine learning which is needed to build a complete solution. Still, my passion, music knowledge, and current computer science education level can take me far enough to get traction across local public schools. I then plan to hire/partner with other engineers to extend the solution.

What steps have you taken to understand the needs of the population you want to serve?

I started by understanding the problem through interviews with my teachers and my peers who are learning music. I wrote down the requirements and suggestions proposed in each interview, then collected and filtered them into an initial set of base requirements. I also analyzed different solutions already available in the market. Based on this research, I concluded that these solutions don't address the problem that I am trying to solve. All other solutions address either professional singers or casual hobbyists. Music students and teachers are a completely underserved segment.

After building an initial prototype, I tested it and collected user feedback. I presented my solution to music teachers at my school, including the one who originally raised her concerns about the problem, and they continue to provide me with feedback that I am using to improve it. I have also shared my solution with other students interested in music from my school. They have tested my website and also continue to provide feedback that helps me improve my pitch detection and display algorithm.

In addition to the user testing done with students and teachers at my school, I am now talking with some industry experts to collect new datasets for better training of the transcription model. These are Ph.D. students and professors working in the MIR field (Music Information Retrieval - https://en.wikipedia.org/wiki/...).

Which aspects of the Challenge does your solution most closely address?

Improving learning opportunities and outcomes for learners across their lifetimes, from early childhood on (Learning)

What is your solution’s stage of development?

Prototype: A venture or organization building and testing its product, service, or business model

In what city, town, or region is your solution team located?

Los Altos, CA, USA

Who is the Team Lead for your solution?

Saanvi Bhargava

More About Your Solution

If your solution has a website or an app, provide the link here:

https://www.knowyournotes.app/

What makes your solution innovative?

My solution is a new approach to enable music education for the masses. Current music education software is designed only to give assignments to the students and rarely provides feedback, while my solution takes an entirely different approach. My software provides an immediate feedback system, which actively helps students improve. It allows students to grade themselves and enhances the velocity at which teachers can grade student assignments, thus democratizing music education in our underfunded arts programs in the public school system. I expect my product to enable broader positive impacts in this space. My solution can be expanded to become a fully interactive online music learning platform, which would change the market entirely.

What are your impact goals for the next year, and how will you achieve them?

I have two impact goals for my project for the next year:

Increase the accuracy of detection and improve a few aspects of the product. Based on my early testing with teachers and students, the detection algorithm does not work well when the note sung is short. Based on my research, there are a few ways to solve this. One of my goals is to improve this pitch detection algorithm.
Drive adoption: My biggest and most important goal for the next year is to expand my chrome extension and website usage. I have tested the product with a few students and teachers at my high school. Based on early testing, they see the value. I want to expand the user base to other schools in my area. I plan to expand into different public schools with low music funding where I can make more of an impact, possibly gaining up to 30 teacher users. For schools, I will focus mostly on teachers to drive the adoption of my chrome extension which helps teachers grade assignments. In addition, I would like to drive adoption with middle school students who are just starting with choir programs. Considering my current progress, 120 consistent students per month is an attainable goal in a year.

Describe the core technology that powers your solution.

The complete solution includes a web application and a Chrome extension. Both use HTML, JavaScript, and CSS for the front end, and both use numerous Chrome APIs to record music and to inject scripts into training websites to capture audio files. In addition, I use the OSMD library for displaying music notes.

The core technology that powers the solution is the backend server that runs on Google cloud infrastructure and is written in Python. The entire detection of notes and transcription is a multi-stage process:

Stage 1: Music conversion - The first stage in the algorithm is to convert the audio file into wav format and then extract samples from it.

Stage 2: Frequency detection - The detection algorithm uses artificial intelligence to detect the frequency of a music sample using the SPICE pitch detection model.

Stage 3: Rhythm detection - After pitch extraction, each pitch is converted to a note based on a frequency lookup for each note. The note conversion is based on the fact that notes follow an exponential scale, and the frequency of a note doubles with each octave. In addition to this, a number of heuristics are applied to detect and minimize errors in pitch detection.
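
The frequency-to-note conversion described in Stage 3 can be illustrated with a minimal sketch. It is not the code used in the product; it assumes equal temperament with A4 tuned to 440 Hz, and the function names are hypothetical. Because frequency doubles every octave and an octave has 12 semitones, the number of semitones between a detected frequency and A4 is 12 times the base-2 logarithm of their ratio.

```python
import math

# Assumptions for illustration: equal temperament, A4 = 440 Hz (MIDI note 69).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
A4_HZ = 440.0
A4_MIDI = 69


def frequency_to_midi(freq_hz: float) -> int:
    """Return the MIDI number of the note nearest to freq_hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Frequency doubles per octave (12 semitones), so semitone distance
    # from A4 is 12 * log2(f / 440).
    return round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))


def midi_to_name(midi: int) -> str:
    """Convert a MIDI number to a name such as 'A4' (MIDI 60 is C4)."""
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


def frequency_to_note(freq_hz: float) -> str:
    """Map a detected frequency in Hz to the nearest note name."""
    return midi_to_name(frequency_to_midi(freq_hz))


if __name__ == "__main__":
    for f in (220.0, 261.63, 440.0, 880.0):
        print(f, frequency_to_note(f))
```

Rounding to the nearest semitone means a slightly sharp or flat voice still maps to the intended note, which is one reason a logarithmic conversion is convenient for singing.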

Stage 4: MusicXML and display - The last stage converts the notes into MusicXML format using the music21 library, a library developed by the MIT Music department. The front end uses OSMD for display.

Stages 2 and 3 are the core technology pieces of the solution and can be improved with a larger dataset for training the AI model and with further research.
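
As an illustration of the kind of error-reducing heuristic Stage 3 refers to, the sketch below collapses per-frame note labels into timed events and treats very short runs as pitch-detection glitches. This is an assumed example, not the heuristic used in the product: the function name, the `min_frames` threshold, and the use of `None` for silent frames are all hypothetical.

```python
from itertools import groupby


def frames_to_events(frame_labels, frame_seconds, min_frames=3):
    """Collapse per-frame labels into (label, duration_seconds) events.

    frame_labels: one note name per analysis frame, or None for silence.
    Runs shorter than min_frames are assumed to be detection glitches and
    are absorbed into the preceding event so total time is preserved.
    """
    # Run-length encode consecutive identical labels.
    runs = [(label, len(list(group))) for label, group in groupby(frame_labels)]

    merged = []
    for label, count in runs:
        if count < min_frames:
            if merged:
                merged[-1][1] += count  # absorb the glitch into the previous event
            continue  # a glitch at the very start is dropped
        if merged and merged[-1][0] == label:
            merged[-1][1] += count  # same note continues after a glitch
        else:
            merged.append([label, count])

    return [(label, count * frame_seconds) for label, count in merged]


if __name__ == "__main__":
    frames = ["C4"] * 5 + ["D4"] * 1 + ["C4"] * 4 + [None] * 3 + ["E4"] * 5
    print(frames_to_events(frames, frame_seconds=0.25))
```

A threshold like this trades robustness for sensitivity: it removes spurious one-frame jumps, but it would also discard genuinely short notes, so the threshold would need tuning against real recordings.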

Please select the technologies currently used in your solution:

Artificial Intelligence / Machine Learning
Audiovisual Media
Software and Mobile Applications

In which countries do you currently operate?

United States

How many people does your solution currently serve, and how many do you plan to serve in the next year? If you haven’t yet launched your solution, tell us how many people you plan to serve in the next year.

My solution currently serves around 20 people. Based on my projections, I expect to serve up to 150 people in the next year: 30 teachers and 120 students. My goal is to focus on getting active feedback from these early users to improve the algorithm and hence I would like to limit the user base to the above numbers in this early stage.

What barriers currently exist for you to accomplish your goals in the next year?

The biggest challenge I face is that I need more time or need to partner/hire another engineer to continue working on the solution. In addition, I feel that a larger dataset of singing samples, deeper understanding of machine learning and more test users can help improve the solution significantly.

In addition, reaching out to public school teachers and convincing them to adopt a technical product will be hard. Most music teachers are not comfortable with technology, so I focused on keeping the interface very simple, so that adoption is faster. My other option is to reach out to a local non-profit supporting music education to break the adoption barrier.

Your Team

Business Model