vs. Natural Language
Many speech interfaces, including the native NaturallySpeaking
commands, are aimed at allowing you to speak to the computer the same way
you speak to another person. True natural language understanding is a major
area of computer science research, but it is a difficult problem and remains
many years from practical use. Many of today's speech interfaces instead use
a pseudo-natural language approach that, rather than truly understanding
speech, provides several ways to say a given command.
There are three serious drawbacks to the pseudo-natural language approach:
1. The programs don't cover all the ways to say a given command.
When people are left to figure out command wording for themselves, they
often use wording that's not included.
When the computer doesn't respond to a command, there are several
possibilities for what went wrong — the computer may not have interpreted
your words correctly, or those words may not be correct wording for that
particular command. Having several possibilities for what went wrong makes
it difficult to know what to do next. If the computer didn't interpret your
words correctly, you should repeat the command. If the words aren't correct
for that particular command, you should try another wording.
Having multiple wording possibilities for commands also makes it
difficult to provide full, usable documentation; users are advised to guess
rather than look up commands because the on-line facility to look up a command
from the full command list is slow and awkward.
This makes speech recognition software frustrating to use.
In contrast, the structured grammar approach used by UC provides
rules and words that make it easy to learn commands.
2. Having many ways to word commands means the computer must listen
for many different possibilities, which slows the computer's response time.
Having synonymous ways to word commands also means you must choose one way,
which slows your response time.
This makes using a computer slower and more difficult than it needs to be.
3. Synonymous commands make it impossible to combine several computer
steps into one command. On a typical computer using the keyboard and mouse,
you often must carry out many steps to accomplish
a single task such as finding a particular file. This is because the keyboard
and mouse have real estate limitations — a finite number of keys on the
keyboard, and a finite amount of space on the screen used for mouse choices.
In theory, speech doesn't have a real estate problem — there are many words
and word combinations available. The pseudo-natural language approach, however,
squanders this potential.
If you have an average of 5 ways to say each of only 20 commands
and you'd like to be able to combine any 2 of these commands, the computer
must listen for 100 x 95, or 9,500 possible combinations. Allowing for three-command
combinations of the same 20 commands (100 x 95 x 90) adds 855,000 combinations.
Four-command combinations (100 x 95 x 90 x 85) add about 72.7 million more combinations.
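The arithmetic above can be checked with a short Python sketch. The function name ordered_combinations and its parameters are hypothetical, introduced only for illustration. The sketch assumes that combined commands are ordered and that no command is repeated within a combination, which is what the 100 x 95 x 90 pattern implies.

```python
from math import prod


def ordered_combinations(num_commands: int, phrasings: int, length: int) -> int:
    """Count the spoken sequences the computer must listen for.

    Assumes `length` distinct commands are combined in order and that
    each command can be worded in `phrasings` different ways.
    """
    # First slot: all commands x phrasings; each later slot has one
    # fewer command to choose from (20 x 5, then 19 x 5, then 18 x 5, ...).
    return prod((num_commands - i) * phrasings for i in range(length))


if __name__ == "__main__":
    for length in (2, 3, 4):
        synonyms = ordered_combinations(20, 5, length)
        # Assumption for comparison only: exactly one wording per command.
        single = ordered_combinations(20, 1, length)
        print(f"{length}-command combinations: {synonyms:,} vs. {single:,}")
```

With 5 wordings per command this gives 9,500, 855,000 and 72,675,000 combinations. With a single wording per command (a simplifying assumption, not a description of how UC is built) the same 20 commands give 380, 6,840 and 116,280, so each added wording per command multiplies the count by a factor that grows with the length of the combination.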
This generally limits pseudo-natural language systems to commands
that mimic individual keyboard and mouse steps. In contrast, the structured
grammar approach used by UC makes it possible to combine commands, which
greatly speeds computing.
An independent study (www.cs.cmu.edu/~usi/papers/HLT04.pdf)
by researchers at Carnegie Mellon University found that 74% of users prefer
a structured rather than natural language approach to speech recognition.