Wednesday, May 14, 2014

Speech Recognition - Other challenges

As we saw in our last post, one of the major challenges of Speech Recognition is Speaker Variability.
But there are others...

And the fact that there are still so many big challenges in this field is what drives our interest in it. Speech Recognition is, on the one hand, a technology with huge potential for growth thanks to its wide range of possible uses and, on the other hand, something about which there is still a lot to learn and investigate.

So, what are these other main challenges?

> Quality of input devices (typically microphones): a microphone whose sensitivity is too low or too high may cause problems for the software deciphering the message. A microphone that is too sensitive, for example, may capture unintended sounds (such as other people speaking in the same room) and thus compromise the whole Speech Recognition process. Related to this, the possibility of controlling our TV by voice instructions makes sense to all of us today. "Increase volume", "next channel" and "shut down TV" are basic instructions that we can imagine ourselves giving to the TV in our living room. But let's think: will the TV respond only to pre-programmed "instructors"? Will it recognize our voice over other noise in the same room? Will everyone be able to give instructions? To make it work, will the ASR device have to be in the remote control rather than in the TV set? These are all questions that are still the subject of a lot of research.

> Meaning and context of what is being said: although this topic has to do with the sound itself, it is related less to the variability of the voice and more to the intended meaning of whoever is speaking. Words like "there" and "their" or "whole" and "hole" are pronounced the same way, and pairs like "leave" and "live" sound very similar, yet they have totally different meanings. No Speech Recognition technology is yet able to take this step and identify the speaker's meaning, and our research tells us that this is a big step that probably won't be taken soon.
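
To make the difficulty concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the word lists and counts are made up for this example (they do not come from any real corpus or product), and the helper name `pick_spelling` is hypothetical. Since the sound of "there" and "their" is identical, the only thing the code can do is guess the spelling from the previous word.

```python
# Toy illustration only: every count below is invented for this example.
HOMOPHONES = {
    "there": ["there", "their"],
    "their": ["there", "their"],
    "whole": ["whole", "hole"],
    "hole": ["whole", "hole"],
}

# Hypothetical counts of how often a (previous word, candidate) pair is seen.
BIGRAM_COUNTS = {
    ("over", "there"): 50,
    ("over", "their"): 5,
    ("lost", "their"): 40,
    ("lost", "there"): 1,
    ("the", "whole"): 60,
    ("the", "hole"): 20,
    ("a", "hole"): 30,
    ("a", "whole"): 25,
}


def pick_spelling(previous_word, heard_word):
    """Return the candidate spelling seen most often after previous_word."""
    # Words with no known sound-alike are returned unchanged.
    candidates = HOMOPHONES.get(heard_word, [heard_word])
    return max(
        candidates,
        key=lambda word: BIGRAM_COUNTS.get((previous_word, word), 0),
    )


if __name__ == "__main__":
    print(pick_spelling("lost", "there"))
    print(pick_spelling("over", "their"))
    print(pick_spelling("the", "hole"))
```

Even in this tiny example the limits show quickly: after "the", both "whole" and "hole" are perfectly reasonable, so a guess based on one neighbouring word says nothing about what the speaker actually meant.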


To emphasize the idea again, some of the people involved in studies in this field question whether, given that there are so many problems associated with it, all the research should continue. The majority, though, see it as we do: the potential of Speech Recognition technology and the early stage we are still at make us believe that it will be the solution to many problems and that, therefore, research should and will continue.
