Here is an update on a UK startup that I’ve been following for a year now, and that a few of you may already know about. VoxIQ is well worth checking into: it provides an enabling technology for the contact center market that combines speech technologies and knowledge databases to help live agents deliver the best possible customer service and satisfaction, and it does it really fast. The key to VoxIQ’s uniqueness is the combination of speech recognition and knowledge databases. Many companies use speech recognition or speech analytics to improve agent performance, or knowledge-based systems (KBS) to assist agents, but VoxIQ uses them in tandem to provide dynamic database information to the agent in real time as the conversation with the caller progresses.
The VoxIQ software does this by listening to the conversation between the agent and caller, detecting keywords, and establishing and maintaining knowledge of the context of the call as it moves along. Essentially, it models each stage of the application as a small-vocabulary context based on the current state of the application. Based on what it gleans from the conversation, the system accesses the appropriate databases and then displays relevant data for the agent, either by populating fields in the customer record or by pulling up context-sensitive information for discussion with the caller. This enables agents either to talk in greater depth with the caller or to provide the exact information that the caller needs. The VoxIQ software doesn’t just do it once, though. It does it continuously, searching on different context-driven keywords as the conversation continues.
The other unique part is the way that ASR is used. Because the speech recognizer focuses only on a limited number of keywords that are pertinent to the context of a given part of the call, the application can recognize faster than we can speak, which leaves it time to search for topics related to the context and populate the agent’s screen while the conversation is still going on. Once the information for one context has been mined, VoxIQ recognizes that the context has been fulfilled and replaces it with the next context, dropping all the words of the previous one. Next, VoxIQ searches a different set of keywords related to the next part of the conversation, and so on. This method avoids the limitation of other systems, which require a large vocabulary in order to support a continuous speech application. In contrast, VoxIQ only requires a small number of words per context segment. This improves both speed and accuracy, in that the ASR listens for a few words to determine the proper context and then listens for different words as the context changes.
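To make the small-vocabulary idea concrete, here is a minimal sketch in Python. It is not VoxIQ’s code: it works on text transcripts rather than audio, and the context names, keyword sets, and the `ContextTracker` class are all made up for illustration. It only shows the general pattern of listening for a handful of words per context and swapping the word list when the context changes.

```python
# Hypothetical contexts and keyword lists, invented for this sketch.
CONTEXT_KEYWORDS = {
    "identify_product": {"printer", "laptop", "monitor"},
    "describe_problem": {"jammed", "blank", "noise"},
}
CONTEXT_ORDER = ["identify_product", "describe_problem"]


def spot_keywords(utterance, vocabulary):
    """Return the vocabulary words found in the utterance, in spoken order."""
    words = [w.strip(".,!?\"'").lower() for w in utterance.split()]
    return [w for w in words if w in vocabulary]


class ContextTracker:
    """Listens only for the keywords of the context currently in focus."""

    def __init__(self):
        self.index = 0

    @property
    def current(self):
        return CONTEXT_ORDER[self.index]

    def hear(self, utterance):
        # Only the small vocabulary of the current context is searched.
        return spot_keywords(utterance, CONTEXT_KEYWORDS[self.current])

    def advance(self):
        # The next context replaces the previous one and its word list.
        if self.index < len(CONTEXT_ORDER) - 1:
            self.index += 1


tracker = ContextTracker()
first_hits = tracker.hear("My printer is jammed.")
tracker.advance()
second_hits = tracker.hear("My printer is jammed.")
```

In this toy version the same sentence yields different hits depending on which context is in focus, which is the point: each pass only has to match against a few words.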
A further refinement that VoxIQ has made, and that they claim is not found anywhere else, is real-time re-recognition of continuous speech using two ASR engines. This allows the system to find uttered keywords that are relevant to contexts other than the one currently in focus. The system can access two sets of information at one time so that an agent won’t have to ask again for information already (perhaps inadvertently) provided by the caller. ASR1 is used for the ‘current in focus context’ while the speech is recorded, and the recording is then passed to ASR2 for re-recognition against the other allowable contexts in the process. Anything recognized outside the ‘current in focus context’ is retained so that the agent doesn’t ask questions about information already provided. This all works well because it is processed faster than we can speak, allowing VoxIQ to conduct re-recognition in a number of contexts as well as in the current context in focus.
For example, a computer company might use VoxIQ in product support to home in on the equipment type, problem and fix for the customer. Suppose a customer calls in and says, in response to the agent asking what the problem is, “I have a printer that seems to be jammed, even though I thought I put the cartridge in properly, so I called because I think it’s still under warranty.” ASR1 listens and picks out “printer”, “cartridge” and “jammed”, which are in context for defining the customer’s problem, and then pulls up the information for printers and troubleshooting and presents it to the agent. ASR2 re-recognizes the caller input and picks up “warranty”, which is out of context with the agent’s initial question but may be needed later, notes it as being part of a separate context, and stores it for later use. When the call progresses to the point that warranty is being discussed, the agent already has the information on their screen, and doesn’t have to ask the caller for it.
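The printer example can be sketched as two passes over the same input. Again, this is only an illustration under loud assumptions: it runs on a text transcript rather than recorded speech, and the context names, keyword sets, and the `two_pass` function are hypothetical, not anything VoxIQ has published. The first pass stands in for ASR1 (in-focus context only), and the second stands in for ASR2 (every other allowable context, with the hits stored for later).

```python
# Hypothetical contexts and keyword lists, invented for this sketch.
CONTEXT_KEYWORDS = {
    "problem": {"printer", "cartridge", "jammed"},
    "warranty": {"warranty"},
}


def normalize(utterance):
    """Lowercase the utterance and strip surrounding punctuation from each word."""
    return [w.strip(".,!?\"'").lower() for w in utterance.split()]


def two_pass(utterance, in_focus, stored):
    """Return in-focus keyword hits; stash out-of-focus hits in `stored`."""
    words = normalize(utterance)

    # Pass 1 (stand-in for ASR1): only the in-focus vocabulary.
    in_focus_hits = [w for w in words if w in CONTEXT_KEYWORDS[in_focus]]

    # Pass 2 (stand-in for ASR2): re-check the same input against the
    # vocabularies of all other contexts and keep anything found.
    for context, vocabulary in CONTEXT_KEYWORDS.items():
        if context == in_focus:
            continue
        hits = [w for w in words if w in vocabulary]
        if hits:
            stored.setdefault(context, []).extend(hits)

    return in_focus_hits


stored = {}
caller = (
    "I have a printer that seems to be jammed, even though I thought "
    "I put the cartridge in properly, so I called because I think "
    "it's still under warranty."
)
hits = two_pass(caller, "problem", stored)
```

After the call to `two_pass`, `hits` holds the problem keywords for the agent’s current screen, while `stored` holds the warranty mention, ready for when the conversation gets to that context.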
I thought this was a really interesting twist on the use of speech recognition and KBS for helping agents do a better job. The agent doesn’t have to mess around so much searching for information, and the screen is populated for them, which saves time and frustration and makes the agent look good! No more jotting notes down during the call in the hope that you’ll input them all later or remember all the details. Moreover, it does it in a tidy fashion, a few keywords at a time, eliminating the need for a huge vocabulary.
VoxIQ intends to extend these capabilities to markets other than the contact center, such as entertainment. For example, VoxIQ can listen to sports broadcasts and provide sportscasters with the relevant information about players and statistics that they need to have at their fingertips during broadcasts. This is particularly useful when a broadcaster mentions someone from another team, or from a past team, who is not part of the set statistics provided for the game that they are watching.
Interactive gaming is another target area for VoxIQ. Imagine being able to talk to your game piece and have critical information revealed to you fast enough to allow you to make split second decisions as to your next move. Though I’m not an online gamer, I can see how this would be quite addictive.
In any case, this technology shows a lot of promise. If you haven’t checked these guys out you should. Since I couldn’t turn this blog into a white paper on how it works, if you are interested they have a white paper you can request on their web site that details all the good stuff.
