Technology

Summary. This page offers an overview of voice-to-text software, also known as speech recognition, and reviews a few of the common programs.

History. Speech recognition is a technology that was developed over several decades, from the 1970s to the year 2000. When speech recognition software first became available, it had many limitations:

  • It was necessary to spend considerable time training the computer to recognize a specific person’s voice.
  • It was necessary to train the computer to become familiar with the vocabulary commonly used by the person speaking.
  • Expensive microphones were needed to achieve the audio quality required for the computer to recognize speech.
  • It was necessary to speak close to the microphone but not too close, so microphone adjustment was very important.
  • Speech recognition software was very expensive, costing as much as $200.00.
  • Even after training the computer and spending considerable money on software and a specialized microphone, the accuracy of speech recognition programs was very poor for many years.

Over the years, a few companies rose to the top and became recognized as leaders in speech recognition. Dragon NaturallySpeaking by Nuance and ViaVoice by IBM became the two most prominent programs in use. Over time, Dragon NaturallySpeaking came to be considered one of the best programs available. However, it cost about $100 to $200, depending on the version purchased.

Shortly after the year 2000, Microsoft began including speech recognition capabilities in its Microsoft Office product line. The speech recognition software was installed only if the person installing Office manually chose to install all features. The included software from Microsoft was very powerful, free, didn’t require special microphones, required very little training, and produced a high degree of accuracy. For this reason, products like Dragon NaturallySpeaking were only practical for specialized uses, such as doctors’ offices where medical terminology might be used. Since most people could get by just fine with the free software from Microsoft, other products became less commonly used.

In January 2008, MacSpeech unveiled its Dictate product, which replaced iListen, a product that some people considered very expensive, not very accurate, and in need of extensive training. MacSpeech Dictate is an effort to address the problems with iListen. It comes with a new user interface and is a big leap forward compared to iListen. Unlike Microsoft’s speech recognition, the software does not allow casual editing with the mouse or keyboard. According to the documentation, all movement through your document and all editing should be done with voice commands. This can become rather tedious. Uncommon words must be spelled out letter by letter using the spelling mode. A relatively short training time and relatively good accuracy are among the good qualities of the product. Unfortunately, the product sells for $200 (or about $160 at Amazon). This is a high price compared with other products that allow combined voice and keyboard editing. Another glitch is that the notepad program included with Dictate does not handle keyboard entry properly: when you type in it, the text you type goes from right to left. The phone support technicians said this will not be fixed in the future.

Perhaps the most exciting development in speech recognition came in November 2008, when Google began offering an application for the iPhone that interprets the speech of anyone talking, even from across the room, converts it into text, and then searches the Internet for the word or phrase spoken. What’s amazing about the Google product, in addition to the fact that it’s absolutely free, is that it’s very accurate and requires no special microphone or training.

Aside from a few alterations, this web page was created entirely using the Microsoft speech-to-text technology.

As new products and software become available, they will be presented and reviewed here.
