Key learnings from building a “voice search” for Hindi film songs

A couple of posts ago, I talked about the idea that “audio search” makes so much sense for a music app. We have been working behind the scenes, evaluating speech-to-text technologies with a view to offering voice search in our app “Filmi Filmy”.

We are happy to report that we were completely wrong when we first thought of this. Since all of the song titles are entered in English but represent Hindi words phonetically (e.g. “O mere dil ke chain”, “Gata rahe mera dil”), we thought we could use a speech-to-text engine to take the user's input, turn it into phonetic English, and use the English text as the search key.

It turns out that it is much more elegant and natural to take the voice input “O mere dil ke chain”, render it as the Hindi string “ओ मेरे दिल के चैन”, and search for that Hindi string in the database. One significant advantage is that this removes the ambiguity of phonetic English spelling. It no longer matters whether “ke” is spelled “ke” or “key”, because in Hindi it will always be spelled “के”.
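To make the idea concrete, here is a minimal sketch of a lookup keyed on Hindi titles. It is not our production code: the catalogue, the song ids, and the function names `normalize_title` and `find_song` are made up for illustration, and we assume the speech engine already returns Devanagari text. The only real API used is Python's standard `unicodedata.normalize`.

```python
import unicodedata


def normalize_title(title: str) -> str:
    """Canonicalize a Devanagari title: NFC-normalize and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFC", title).split())


# Hypothetical catalogue: Hindi title -> song id (the ids are invented).
SONGS = {
    normalize_title("ओ मेरे दिल के चैन"): "song-001",
    normalize_title("गाता रहे मेरा दिल"): "song-002",
}


def find_song(recognized_text: str):
    """Return the song id for the recognized Hindi text, or None if absent."""
    return SONGS.get(normalize_title(recognized_text))


if __name__ == "__main__":
    # Assumed output of a speech-to-text engine configured for Hindi.
    print(find_song("ओ मेरे दिल के चैन"))
```

Because the key is the Hindi spelling, there is only one string to match against, however the title might have been romanized elsewhere.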

The challenge, of course, is getting a database of film song titles entered in Hindi. Nearly all song databases have English transliterated titles and, may we add, no two of them spell the same song the same way. An inheritance from English-led and US-led software is that, from YouTube to the home-grown Gaana, nearly all the song titles are in English.

We are happy to report that, fortunately, a bit of innovation and tons of persistence can solve this problem (we may not have a huge cash chest at Pariksha, but we are certainly not short on tech coolness). One of our engineers figured out a way to use existing open-source tools to build Hindi equivalents of the titles.
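We are not describing the actual tools here, so the following is only a toy sketch of the general idea: map each romanized word, in whatever spelling it arrives, to a single Devanagari word. The table `ROMAN_TO_HINDI`, the variant spellings in it (such as “chein”), and the function `to_hindi_title` are all hypothetical; a real pipeline would need a far larger vocabulary or a proper transliteration tool.

```python
# Hypothetical word table: several romanized spellings -> one Devanagari word.
ROMAN_TO_HINDI = {
    "o": "ओ",
    "mere": "मेरे",
    "dil": "दिल",
    "ke": "के",
    "key": "के",      # variant spelling of the same word
    "chain": "चैन",
    "chein": "चैन",   # invented variant, for illustration only
}


def to_hindi_title(roman_title: str):
    """Convert a romanized title word by word; return None if any word is unknown."""
    hindi_words = [ROMAN_TO_HINDI.get(word) for word in roman_title.lower().split()]
    if None in hindi_words:
        return None  # leave unknown titles for manual review
    return " ".join(hindi_words)


if __name__ == "__main__":
    # Two different romanizations collapse to the same Hindi title.
    print(to_hindi_title("O mere dil ke chain"))
    print(to_hindi_title("O Mere Dil Key Chein"))
```

Returning `None` for unknown words, rather than guessing, keeps bad conversions out of the search index at the cost of some manual cleanup.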

The results are spectacular, to say the least. Consider, for example, the song search below, comparing voice search with Hindi titles vs. text search with English phrases:

[Image comparison: “Text Search With English Phrases” vs. “Voice Search with Hindi Titles”]


We need to do a bit more work on the Hindi song titles and improve the error handling in the search, and then this should be ready for public use. Now consider the scenario we described earlier: imagine slumping in a car after a long day with no energy to type a search. All you have to do is say the song and, voilà, the app will play it on your phone, earphones, or connected Bluetooth speaker. Dare we say, it will not be long before this is a reality!


4 thoughts on “Key learnings from building a “voice search” for Hindi film songs”

    1. The author name was not being displayed. It has been fixed now. Thanks for pointing out. Glad you liked the piece.

  1. This is cool. Voice to text will work so well in Hindi and other Indian languages as they are phonetically aligned whereas English is not.

    I was just wondering whether there is a market for the very difficult problem of melody extraction. This problem is complex because the human auditory system is a bit complex 🙂 One cannot simply work on fundamental frequencies and pitch to get the melody portion out.

    Not sure whether a melody-based classification/categorization of music will have a market. If yes, one can try it out as a starter.

  2. My experience and learning is that all tech that is useful to a large audience can find a market. It takes the right time, maturity of use-case and end-user population to scale.

    Voice tech has been around for a decade plus, but only with smartphones is it becoming economically viable.

    Mood classification is the next frontier for music apps. Someone will find a way to make it big 🙂
