
The Twelve Days of Unified Communications – The Ninth Day – Interface Enhancing

On the ninth day of UC the industry gave to me interface enhancing,
eight CFOs bilking,
overuse of power dimming,
applications plug ‘n playing,
five phone rings,
the voicemail market girds,
an AT lens,
what SMB loves,
And a clear definition of UC.

It’s all about the user experience, and what is closer to the user than the application or device user interface? In UC, one of the sexier technologies used in user interface design is speech recognition. Since it is one of my primary research focal points, I’m a big fan. In fact, I finally caved and bought a Blackberry Pearl this year just for voice-activated dialing (VAD). (I know. I’m a little slow on these things sometimes. It’s like the shoemaker not having any shoes.) So, when one of the vendors I asked about unified communications wishes wished for better speech recognition as an interface in mobile devices, I jumped on it. Therefore, wish number nine is that automatic speech recognition (ASR) and UC vendors continue to overcome reliability issues for ASR used in unified communications applications, make those applications even simpler, and find even more useful ways to incorporate both ASR and text-to-speech (TTS) into UC application design.

It’s not that speech recognition is bad. In one of my recent blogs, "Speech Technology in Top 10 Technology Flops? I Think Not", I stated that I believe the technology has come light years from where it was even five or ten years ago. However, of all the technologies used within unified communications, it has one of the toughest rows to hoe, as it faces a huge number of obstacles on the way to performing flawlessly every time. And perform flawlessly every time it does not yet do. Take, for example, the simple act of using speech recognition as a navigational aid in a mobile phone. It seems pretty simple. However, take that interface into a moving car with the windows down, the radio on, kids screaming in the back and road noise, and it can be spotty at best and useless at worst. When writing this I was reminded that one of the things that often comes out of this mom’s mouth is “Be quiet, I’m trying to use voice dialing”.

Slightly off the topic of UC, but in line with improving the user interface, I read a recent press release from Sensory, Inc. that gave me one of those “why hasn’t someone done this before” moments related to voice user interfaces (VUIs). Sensory’s press release on its new TimeSet function read as follows: “With the advent of TimeSet from Sensory, consumer devices can now be programmed using natural speech input. TimeSet is unique in that it doesn't require any user or device training, or an unusual sequence of spoken commands or discrete input of digits. A person can now simply say "twelve-thirty-seven PM", or "six AM." Any device, such as an alarm clock, microwave oven, thermostat, coffee maker, or DVR outfitted with an RSC-4x chip can be programmed quickly and accurately without the hassle of holding down hidden or multiple buttons.”

That is what I’m talking about. Completely brain dead, single-digit IQ user interfaces using speech. Of course, the seven-year-olds that we goad into setting our DVR clocks will have nothing to do now.

Within unified communications we could make some immediate improvements to VUIs. For example, when I hit the VAD button on my Blackberry (I have Cingular service, by the way), I hear “Say a command”, so I say, “call so and so”, and I hear back, “which number”. This is the aggravating part. I sometimes say "cell" and sometimes say "mobile", and it only recognizes one. It’s the same thing with ‘work’ vs. ‘business’. How difficult is it to have both variations in the grammar, rather than play an error prompt? Similarly, I’m responsible for remembering how I put a contact’s name into Outlook to begin with. If, for example, you have hundreds or thousands of names in contacts like I do, it’s hard to remember whether you put your uncle’s name in as John and Marie Smith, or John Smith, or Mr. John Smith. But if you say it the wrong way, it’s not recognized. It seems like a sensible thing to fix.
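
To make the "both variations in the grammar" idea concrete, here is a minimal sketch in Python. It is not how the Blackberry, Cingular, or any real recognizer is built: it works on text the recognizer has already produced, and the synonym table, the title list, and every function name are hypothetical choices made for illustration only.

```python
import re

# Hypothetical synonym table: several spoken labels map to one canonical phone type.
LABEL_SYNONYMS = {
    "cell": "mobile",
    "mobile": "mobile",
    "work": "work",
    "business": "work",
    "home": "home",
}

# Hypothetical list of courtesy titles to ignore when matching names.
TITLES = {"mr", "mrs", "ms", "dr"}


def canonical_label(spoken):
    """Return the canonical phone type for a spoken label, or None if unknown."""
    return LABEL_SYNONYMS.get(spoken.strip().lower())


def _tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def name_variants(stored_name):
    """Generate the spoken forms that should all match one stored contact name."""
    tokens = _tokens(stored_name)
    variants = {" ".join(tokens)}              # exactly as stored
    core = [t for t in tokens if t not in TITLES]
    variants.add(" ".join(core))               # stored form without a title
    if "and" in core and len(core) >= 4:       # e.g. "John and Marie Smith"
        surname = core[-1]
        for first in core[:-1]:
            if first != "and":
                variants.add(f"{first} {surname}")
    return variants


def build_name_index(contacts):
    """Map every accepted spoken variant to the stored contact names it matches."""
    index = {}
    for stored in contacts:
        for variant in name_variants(stored):
            index.setdefault(variant, set()).add(stored)
    return index


def lookup(index, spoken):
    """Return the set of stored contacts that match what the caller said."""
    return index.get(" ".join(_tokens(spoken)), set())


if __name__ == "__main__":
    index = build_name_index(["John and Marie Smith", "Mr. Alan Jones"])
    print(lookup(index, "John Smith"))
    print(lookup(index, "Alan Jones"))
    print(canonical_label("cell"), canonical_label("business"))
```

The point of the sketch is only that accepting "cell" and "mobile", or "John Smith" and "John and Marie Smith", is a small table lookup once the words are recognized. The hard part of speech recognition is elsewhere, which is why the error prompt feels so avoidable.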

I know there are interesting things emerging right now with speech technologies within UC. For example, right on the heels of announcing TimeSet, Sensory also announced a partnership with Foxlink Group to create Bluetooth products using speech as the interface. The whole area of voice search and navigation shows tremendous promise, as does the use of speech-to-text and text-to-speech within IM, SMS and voice messaging. I’m hoping 2008 brings many more advancements and enhancements. Perhaps we will incorporate speech into conferencing and use speech analytics in meeting note taking, or use speech recognition to keep track of the speakers in a room or in a car. There is so much room for creativity. 2008 could be a fun year for speech technologies.

About

This page contains a single entry from the blog posted on December 22, 2007 11:44 PM.
