Category Archives: Speech Recognition

Guest Blog: Remembering Touch Commands

So you’ve mastered Touch commands. In fact, you’ve mastered them so well that you’ve created dozens, and now you can’t remember them all. To access them, you can say “UC List Touch”, but that draws your focus away from the screen you are working on.

If you incorporate your Touch commands into your Custom Files, you will always have them at your fingertips — or tongue tips, as it were.

Some of you are already using Custom Files (e.g., “Custom 1 Guide”, “Custom 2 Guide”). For those of you who aren’t, these are handy guides that can come up on the right side of your screen, providing a “cheat sheet” while you continue to work. They take up some space on your screen, but they are frequently worth it, especially if, like me, you have a hard time remembering all those great Touch commands!

Right now I am using Custom Files for the two programs I use most frequently, Outlook and Firefox. In each of these programs, I have created a number of Touch commands to move the mouse and click at various spots on the screen. I decided that I wanted to have easy access to my touch commands so I could maximize my use of this powerful tool. I also wanted to reorganize my touch commands by program (when you add a new Touch command, it is added chronologically to the bottom of your list).

Here’s how to do it:

  1. Say “UC List Touch” to call up your list of touch commands.
  2. Say “All Copy to Excel”, which will copy all of your touch commands into an Excel spreadsheet. It will also copy all of the number coordinates that go with each touch command you’ve created, but don’t worry about that; the next steps take care of it.
  3. Select your touch commands by saying “40 Downs” or some number that will capture them (if you have a lot of touch commands, you might need to say “100 Downs”, for example).
  4. Say “All Copy to Notepad”. This will copy your touch commands into Notepad, which will remove all of the formatting from Excel (you don’t want the lines that separate each Excel cell).
  5. You now have a clean list of all of your touch commands that you can organize. Rearrange them by program, or any way that you would like. Once you have them in an order that makes sense to you, you can easily copy and paste them into your custom files.
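
If you would rather script the clean-up in steps 3 and 4 than do it by voice, here is a minimal Python sketch of the same idea. It assumes the copied list is tab-separated text with the command name in the first column and the coordinates after it. That layout, the sample command names, and the function name `command_names` are all assumptions made for illustration, not something Utter Command documents.

```python
def command_names(exported_text: str) -> list[str]:
    """Keep only the first tab-separated field of each non-empty line.

    Assumes (hypothetically) that each row looks like: name<TAB>x<TAB>y
    """
    names = []
    for line in exported_text.splitlines():
        name = line.split("\t")[0].strip()
        if name:  # skip blank rows
            names.append(name)
    return names


# Hypothetical sample rows; real command names and coordinates will differ.
sample = "Inbox Touch\t120\t88\nSend Touch\t40\t60\n\nReload Touch\t300\t52\n"

# Print a clean, alphabetized list ready to paste into a custom file.
for name in sorted(command_names(sample)):
    print(name)
```

Alphabetical order is only a starting point. Grouping by program still has to be done by hand, since nothing in the list says which program a command belongs to.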

Now your touch commands are part of your cheat sheet. So the next time you are working in a program and you want to remember all of those handy dandy touch commands you created for it, just call up your custom file and there they are.

ONE BIG CATCH: Since the Custom File does take up a portion of your screen on the right side, some of your touch commands may be “off” since part of your screen is now filled by the Custom File itself. So you can look at the file to remind yourself of your touch commands, then close the file and continue. It’s still easier than scrolling through the list you see when you call up UC List Touch.

– Big Talker

Tip: Finding a command

Here’s a very quick tip.

If you know the name of a command, or even part of it, and want to look it up in the Utter Command documentation, say

   – “UC Index” to bring up the Utter Command Index
   – “Find Open” to put the cursor in the find dialog box
   – type a keyword you want to look for, for instance “Wait”, “Drag” or “Before”
   – “Enter” to find the first instance
   – if necessary, “Enter” again to find subsequent instances

Once you find what you’re looking for, use the reference number to call up the full lesson on the command, e.g. “UC Lesson 4 Point 5”. This is also a good way to see the consistent patterns in the Utter Command speech command set.

Tell me what you think – reply here or let me know at info@ this website address.

Happy new year!

Trying out Dragon Search for the iPhone

Dragon Search is a nice app. Here’s how it works: open the app, hit one button, speak the phrase you want to search for. By default the app stops listening and starts the search when you pause so you don’t have to hit another button when you’re done.

The app comes up quickly, which from a practical standpoint is extremely important. And in my experience so far the search has been fast. There’s also a button you can push to cancel out of the search. The big plus of this application is the different search channels: Google, iTunes, Twitter, Wikipedia, and YouTube. You can search for something, like green apples, and the results will come up in the channel you used last. Once you’ve done a search you can switch channels easily to see results across channels.

I have a couple of practical suggestions.

1. The history list is just three items long — I’d like a much longer scrolling history list. Google Voice Search has a long scrolling list that includes dates. I would’ve liked to have seen Nuance improve on that.

2. I’d also like to be able to add my own channel.

I’ll also take the opportunity to repeat what I said a couple of days ago. I appreciate the progress on speech apps — don’t get me wrong. But speech on the iPhone is still not what I really want, which is system-level speech control of a mobile device that would give me the option to use speech for anything. These new apps are steps in the right direction — making the iPhone more hands-free. But there’s still a long way to go.

Trying out Dragon Dictation for the iPhone

I’ve been trying out the Dragon Dictation iPhone app. It’s still not what I really want, which is system-level speech control of a mobile device that would give me the option to use speech for anything. But it’s a step in the right direction of making the iPhone more hands-free.

Here’s how Dragon Dictation for the iPhone works: open the app, hit one button, speak up to 30 seconds of dictation, then hit another button to say you’re done. Your dictation shows up on the screen a few seconds later. Behind the scenes the audio file you’ve dictated is sent to a server, put through a speech-recognition engine, and the results sent back to your screen. Now you can add to your text by dictating again, or hit an actions button that gives you three choices: send what you’ve written to your e-mail app, send it to your text app, or copy it to the clipboard so you can paste it someplace else.

The recognition is usually fairly accurate in quiet environments. Not surprisingly, you get a lot of errors in noisy environments. To be fair, the built-in microphone on a mobile device is not optimal for speech recognition, and the app does pretty well given these constraints.

Here’s a practical suggestion that should be easy to implement: Add a decibel meter so people can see exactly how much background noise there is at any given time. This would make people more aware of background noise so they could set their expectations accordingly.

The interface for correcting errors is reasonable. Tap on a word and there are sometimes alternates available or you can delete it. Tap the keyboard button and you can use the regular system keyboard to clean things up.

I have two interface suggestions:

1. You can’t use the regular system copy and paste without going into the keyboard mode. You should be able to. I suspect this is fairly easy to fix.

2. There is no speech facility for correcting errors. I think there’s a practical fix here as well.

First, some background. Full dictation on a mobile device is tricky. Full dictation speech engines take a lot of horsepower. Dragon Dictation sidesteps the problem by sending the dictation over the network to a server running a speech engine. The trade-off is it’s difficult to give the user close control of the text — you must dictate in batches and wait briefly to see the results. This makes it more difficult to offer ways to correct using speech. But I think there is a good solution already in use on another platform.

Although it’s difficult to implement most speech commands given the server setup, the “Resume With” command that’s part of the Dragon NaturallySpeaking desktop speech application is a different animal. This command lets you start over at any point in the phrase you last dictated by picking up the last couple of words that will remain the same and dictating the rest over again.

This would make Dragon Dictation much more useful for people who are trying to be as hands-free as possible. It would also lower the frustration of misrecognitions and subtly teach people to dictate better.
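
To make the “Resume With” idea concrete, here is a minimal Python sketch of how a batch-style app might apply it to text that has already come back from the server. It is not Nuance’s implementation. The function name `resume_with`, the two-word anchor, and the plain whitespace word matching are assumptions chosen for illustration.

```python
def resume_with(previous: str, redictated: str, anchor_words: int = 2) -> str:
    """Replace the tail of `previous`, starting where the first few words
    of `redictated` last appear in it. Matching is case-insensitive."""
    prev_words = previous.split()
    new_words = redictated.split()
    anchor = [w.lower() for w in new_words[:anchor_words]]
    if not anchor:
        return previous
    # Search backwards so the most recent occurrence of the anchor wins.
    for start in range(len(prev_words) - len(anchor), -1, -1):
        window = [w.lower() for w in prev_words[start:start + len(anchor)]]
        if window == anchor:
            return " ".join(prev_words[:start] + new_words)
    # Anchor not found: leave the text alone rather than guess.
    return previous


fixed = resume_with(
    "Please send the report to Jon by Friday",
    "report to John by Monday",
)
print(fixed)  # Please send the report to John by Monday
```

Because the correction is just another batch of dictation plus a text splice, it fits the send-and-wait model. A real version would also need to handle punctuation attached to words, which this sketch ignores.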

It’s nice to see progress on mobile speech. I’m looking forward to more.

Speeding Web navigation: single-step deep menu access

Utter Command speech-enables the Firefox Mouseless Browsing extension, which puts a unique number on every clickable item on a Web page. UC lets you click every item on a page, including links, by saying the number plus the word “Go”, for instance “7 Go”.

This works pretty well, but it gets even better when you discover that an item doesn’t have to be visible for you to click it.

This lets you click items that are off-screen. Better yet, it lets you click items on drop-down menus without having to first drop down the menu. This lets you use a single step to get to any menu item in a Web application once you know the number.

For instance, to insert a horizontal line in a Google Document you can click the “Insert” menu, then click the menu item “Horizontal Line”. There’s no direct keyboard shortcut for horizontal line, so it’s usually a two-step task.

Using numbers you can say “7 Go” to drop down the Insert menu, then “84 Go” to click Horizontal Line. But if, like me, you add horizontal lines often enough to remember the number, you can cut straight to the chase and say “84 Go” anytime you want a horizontal line.

Tip: What to do when dictation isn’t recognized as text

Occasionally the Dragon NaturallySpeaking speech engine will get mixed up about whether or not the program or field in focus is something you should be able to type text into. When this happens you’ll see lots of question marks in the recognition box.

The problem is usually easy to fix — move the focus out of whatever program this is happening in, then back in. Here’s a quick way to do that — the UC command “Notepad Open · Notepad Close”.

Tip: Faster correction

Here’s a very quick tip that’ll speed the correction process.

If you start spelling in the Dragon NaturallySpeaking Spell Correction dialog box, but realize too late that you spoke too soon because the correct answer was one of the choices after all, all is not lost. Say the UC command “Line Delete” to get the original choices back.

Keep it simple: Keys don’t need pseudonyms

I have a pet peeve in the Keep it Simple, Stupid department.

Keyboard shortcut labels containing Enter should say “Enter”, which is what’s written on the key, rather than the antiquated “Return”. Having the label match the key is better both for people who press keys and people who speak keys.

Thunderbird/File has an example of what not to do — two labels containing “Return”. It’s a small thing, but using any amount of brainpower for this type of translation is unnecessary.

What are your speech pet peeves? Tell me about them – reply here or let me know at info@ this website address.
