Talk to Me: A Survey of Voice-Based Mobile Tech

Posted by PrabhasPokharel on Aug 24, 2009

The precursors to mobile phones were walkie-talkies, and the first generation of mobile phone networks supported only voice communications. With second-generation networks and a happy accident came SMS. Mobile data services arrived later, first as GPRS (an extension of second-generation networks, often called 2.5G) and then with third-generation networks.

Most applications using mobile phones these days tend to use these newer channels of communication, SMS and data. But even though we sometimes forget, voice is still a part of mobile phone communications. This article profiles interesting ways in which voice technology is being used for social work all around the world.

Voice transmission has a singular advantage over SMS and data transmissions: it carries human, spoken language directly. Users of many literacy levels can use voice technology with keypad and voice navigation, and applications can be run in local languages. Users can issue commands and requests in their natural language, and thus communicate more accurately. The problems, unfortunately, are twofold. First, on the receiving end, voice data is much harder to process automatically than text or other data. It requires considerable technical effort (or a lot of person-power) to parse and separate voice data (and even then accuracy isn’t perfect), and searching voice data remains a nigh-impossible feat. Second, airtime costs tend to run higher than text message costs.

Yet there are a few projects that use the talking capabilities of mobile phones in interesting ways to deliver information.

Question-Answering Services

Two of the projects deal with providing a very simple service: answers to people's questions.

Question Box provides a question-answering service in India and Uganda. In India, boxes are installed in slums and villages that connect users to operators that will answer questions. In Uganda, users can call in with any mobile phone to have operators answer questions. The operators have access to a repository of previously asked questions (and answers), the Internet if available, and in Uganda, a custom-built offline search engine and database built specifically for the project.

Avaaj Otalo provides an audio community forum for farmers in rural Gujarat. Working with an organization that already had a popular radio program, Otalo provides a call-in number where farmers can ask questions, and answer some too! Navigating a VoiceXML-programmed menu with the keypad or one-word commands, users are also able to listen to archives of the radio program.
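The keypad-driven menu described above can be pictured as a small tree of prompts. The following is a minimal sketch in Python rather than VoiceXML, and it is not Avaaj Otalo's actual menu: the prompts, the audio file names, and the `navigate` function are all hypothetical, invented only to show how a caller's key presses walk a fixed menu without any speech processing.

```python
# Hypothetical menu tree. Each node has a prompt and either child options
# keyed by keypad digit or a final action. All labels and file names are
# invented for illustration; none come from Avaaj Otalo.
MENU = {
    "prompt": "Press 1 to ask a question, 2 to hear answers, 3 for radio archives.",
    "options": {
        "1": {"prompt": "Record your question after the beep.", "action": "record_question"},
        "2": {"prompt": "Playing recent answers.", "action": "play_answers"},
        "3": {
            "prompt": "Press 1 for the latest program, 2 for last week's.",
            "options": {
                "1": {"prompt": "Playing the latest program.", "action": "play:latest.wav"},
                "2": {"prompt": "Playing last week's program.", "action": "play:last_week.wav"},
            },
        },
    },
}


def navigate(menu, keys):
    """Follow a sequence of keypad presses.

    Returns the list of prompts the caller would hear and the action
    reached at the end (None if the caller stopped on a menu node).
    """
    node = menu
    prompts = [node["prompt"]]
    for key in keys:
        options = node.get("options", {})
        if key not in options:
            # Unrecognized key: repeat the current prompt and stay put.
            prompts.append(node["prompt"])
            continue
        node = options[key]
        prompts.append(node["prompt"])
    return prompts, node.get("action")


if __name__ == "__main__":
    heard, action = navigate(MENU, ["3", "1"])
    print(action)
```

Because the caller only ever chooses among numbered options, the system never has to interpret free-form speech; the organization of the content lives entirely in the menu.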

Question Box avoids having to process users' questions automatically by putting a human listener in the loop, while Avaaj Otalo avoids it by organizing its collection of audio prompts under an externally organized menu. Neither program has yet solved the problem of cost, however; both subsidize the cost of querying for their users. Otalo operates with a toll-free number. Question Box provides the phones to call from in India, and in Uganda, Grameen Community Knowledge Workers provide the mobile phones from which calls are made.

Wikipedia and News on the phone

MobilED, operating in South African schools, developed a program that delivered Wikipedia over mobile phones. Users texted in a query and were called back, with a speech synthesizer reading them the text of the requested Wikipedia entry. Users could also upload voice-based edits to articles, or create audio entries if nothing existed on a topic. Again, queries were text-based and thus easy to parse, and only the information delivery was based on voice. The cost issue was more severe, however, and MobilED eventually abandoned the project in favor of cheaper, data-based cell phone technologies.

Freedom Fone, a Knight-funded project based in Zimbabwe, is working on providing news using an audio channel. In an environment where the press is highly repressed, and access to news is scant, Freedom Fone plans to implement a solution where users can either call in or text in, and be called back with the latest news information. The cost structure has yet to be determined.

Recreating the Web, or Wikis over audio

Perhaps the most ambitious project that uses the voice platform is IBM’s Spoken Web (also known as the World Wide Telecom Web) Project. The idea here is to create a parallel of the World Wide Web, but all on the audio platform. The project has a concept of voice-sites that are linked to specific phone numbers, and has already built a system called VoiceGen to create VoiceXML content using audio input. The group had an initial deployment of a subset of this technology in the form of VoiKiosks, which allowed users to listen to information from different NGOs and to upload professional advertisements, and which saw high rates of usage.

MIT CSAIL is also developing audio wikis, or “local repositories of audio information,” that could be created and edited using audio, essentially recreating what wikis are on the Internet on a mobile-accessible audio platform. The system is not fully developed yet, but deployment is planned for India, in conjunction with Microsoft Research India.

What are the barriers to voice services?

Voice has some benefits that you just can’t run away from, especially when it comes to low-literacy consumers. The question then is why voice technologies are not more of a player in the world today. The difficulty of automation seems serious, but as Question Box, Avaaj Otalo, and IBM’s projects have shown, it is not an obstacle that can’t be overcome.

Cost seems to be the more serious issue: the real bottleneck to deploying sustainable voice tech. MobilED moved on from its voice-based program to others because of high cost. South African voice costs are indeed high; the International Telecommunication Union reported that the 2008 cost of one minute of on-network airtime during peak hours was US $0.59 by Purchasing Power Parity. On the other hand, costs in India are low; the same minute costs $0.07 in India (Data from MobileActive’s MobileData page). We hope that the number of voice applications coming out of India is an indication of these low costs, and that some of these projects soon cross the cost barrier.
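To put those two per-minute figures side by side, here is a small worked calculation in Python. The rates are the ones quoted above; the three-minute call length and the function names are assumptions chosen purely for illustration.

```python
# Peak-hour, on-network airtime per minute in US$ (PPP, 2008),
# as quoted in the article.
COST_PER_MINUTE = {"South Africa": 0.59, "India": 0.07}


def call_cost(country, minutes):
    """Cost of a call of the given length, rounded to cents."""
    return round(COST_PER_MINUTE[country] * minutes, 2)


def cost_ratio(country_a, country_b):
    """How many times more expensive a minute is in country_a than in country_b."""
    return COST_PER_MINUTE[country_a] / COST_PER_MINUTE[country_b]


if __name__ == "__main__":
    minutes = 3  # assumed length of one information call, for illustration only
    for country in COST_PER_MINUTE:
        print(country, call_cost(country, minutes))
    print(round(cost_ratio("South Africa", "India"), 1))
```

On these figures a minute of voice in South Africa costs a little over eight times what it does in India, so a service that is affordable to subsidize in one market may not be in the other.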

Photo courtesy of gopal1035 on Flickr, CC BY 2.0.

Prabhas Pokharel is Project Lead for the Mobile Media Toolkit. He tweets at @prabhasp.

Sound Barriers - using VUI

Very interesting article to relate to research I am currently doing into Voice User Interfaces. My research is trying to look more generally at mass market adoption, but the importance of voice in emerging markets such as those you point out here is not unexplored.

This research will be available free to MobileActive members when we finally publish in October 09.  Currently doing a consumer survey, please take part.  It will help us understand what mobile designers, developers and marketeers need to do to make Voice in the user interface work for customers.

