Sunday, June 15, 2014

Experimenting with Dragon NaturallySpeaking

This blog post is my attempt to experiment with the newest version of Dragon NaturallySpeaking. I previously fiddled around a bit with it back in college. This time, however, I want to give it a real shot. There are three fundamental motivations for this experiment. The first is that I am simply curious about playing with new and interesting technology, and the ability to speak to your computer and have it respond in a useful way is interesting. The second is to improve my speaking skills. I think that dictation will help me control my speech. The third is to increase my productivity. Having to work on a computer all day has taken a lot of time and energy away from things as simple as maintaining my house. While I'm folding clothes or washing dishes, I feel like I could be using my voice to do work.

Installation and setup

The initial installation of Dragon NaturallySpeaking was complicated by some confusing profile options. Initially I set it up so that it would take remote recording devices. I did this so that I could use my cell phone to record audio files that can be imported and transcribed later. However, this option by default seemed to disable the microphone and made it impossible to enable. After a quick search of the forum for the problem, I was able to fix this by setting up a new profile.

However, my troubles didn't end there. The headset I bought comes with the option to use an infrared or Bluetooth connection. I want to use the Bluetooth connection because it does not require line of sight and has a longer range. This would help when I'm in another room, or when I just don't have visual access to the computer. The headset didn't come with an installation disc, and just by plugging it in all the drivers were installed correctly, or so it appeared. However, the headset now connects to the machine, but when I test it using a simple program like Sound Recorder, no audio is recorded. I'm not sure what the problem is.

Initial Impressions

My negative impressions have to do mostly with the foreignness of the interface and the speed of some of the commands. The foreignness has to do with how ingrained typing is in my work process. Something as simple as having to form an entire thought before starting to express it is something that I'm having to get used to. Having a suspension for excellent concrete lead to say. When typing, the number of keystrokes acts like a sort of pace car. I am not a practiced public speaker, and I'm not used to having to form ideas while speaking in meetings; in day-to-day conversation, thoughts are concise and limited in scope. And because the software uses the context of the sentence to guess what words you meant to say when there's some kind of error, it's best to say the entire sentence at once and not speak haltingly. Slurring your words can also affect the results negatively. Also, the commands are a little strange. However, I don't really have anything to compare this to, because this is the first time I've ever tried to use speech recognition software in a concerted way.

The speed of the software can also be frustrating at times. When executing a command, or trying to perform some keyboard input, the software can take a little bit of time to process it. I'm also a little worried about running out of memory and crashing when I get to larger documents. I've heard from other users of the software that this can happen, and that they get around it by having two computers running at the same time when working on large documents. I would really prefer that not be the case.

However, the positive aspects of this new input method are already apparent. Having to think about the entire thought before beginning to express it has made me control my speech. I can already see it as a sort of exercise, accessing parts of my brain that I normally do not. It's a rather interesting way to work with your computer. It's hard to describe.

The performance is much better than I expected. It seems that the longer the sentence I try to say, the better the algorithm performs. There have been relatively few dictation errors, and I attribute these mostly to my own inexperience with the practice of dictation rather than any inherent flaw in the software. I can only imagine that as I use the software more, the error rate will go down and the speed at which I compose will go up.

Future usage and Outlook

In the future I'm hoping that I'll be able to use the Bluetooth headset and become proficient enough with the software that I don't have to be looking at the screen the entire time. I would prefer to be able to walk around, or do the chores I talked about earlier, and only check on the dictation every once in a while to make sure that everything is going okay. I will also be trying to install the software and use it on Linux. There are a couple of projects out there that are designed specifically to allow you to run the same commands and software under Wine. I'll probably post a video of how that works out.

I think it's obvious that I'll be able to at least generate the first version of a document much quicker. The speed at which I'm able to speak and have the software interpret correctly is extremely high. I've heard other people say that they get something like 230 or 250 words a minute using this dictation software. I haven't actually tested mine, but I think it's somewhere around 80 or 90 words a minute. It also feels more natural to just say the thought, rather than having to pound out each individual character on the keyboard.

I have high hopes.

