How to Convert Audio Files Into Text

By Noel Lawrence

Whether you are transcribing a legal deposition or an interview for a newspaper, converting spoken-word audio into a text document is an age-old problem. Transcription services are expensive and not always prompt. Audio-to-text conversion software often produces documents so inaccurate that correcting them takes longer than transcribing manually. Though there is no magic bullet, a little technical savvy can streamline, if not altogether eliminate, the labor-intensive aspects of transcription.

Things You'll Need

  • Computer
  • Audio-editing software
  • Voice-recognition software
  • Headset
  • Audio-conversion software (optional)

Step 1

Acquire audio-editing software. Pro Tools and GoldWave are two popular applications, but plenty of shareware and freeware applications will do the trick.

Step 2

Open your audio file in your chosen editing software. The application will display the audio file as a visual waveform that lets you “scrub” back and forth across the recording without losing your place. If you cannot open the file, you may need to acquire audio-conversion software to convert it into a file format the application can recognize. Consult the link in the Resources section if you need assistance with this issue.
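If a file refuses to open, it can help to check whether it is a plain WAV file before looking for conversion software. The following is a minimal sketch, assuming Python is installed; the function name describe_wav is made up for illustration, and the check only covers uncompressed WAV files, not MP3 or other formats.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a WAV file.

    Returns None when Python's standard wave module cannot read the file,
    which suggests it is not an uncompressed WAV and may need converting.
    """
    try:
        with wave.open(path, "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
            return wav.getnchannels(), rate, frames / rate
    except (wave.Error, EOFError):
        # wave.Error: not a readable WAV; EOFError: empty or truncated file
        return None
```

Calling describe_wav on a recording (for example a hypothetical "interview.wav") returns its channel count, sample rate and length in seconds. A result of None is a hint, not proof, that the file needs conversion.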

Step 3

Acquire a computer headset and plug it into the appropriate audio jack of your computer.

Step 4

With the audio file open in your editing software, play a few seconds of the recording and memorize the words.

Step 5

Repeat the words into your voice-recognition application. At first, your software may not transcribe your speech with complete accuracy. Don’t panic. Voice-recognition applications are able to “learn” your speech habits over time until they recognize your words with about 99 percent accuracy. Consult the instructions for your chosen application on how to help it recognize what you are saying.

Step 6

Repeat steps 4 and 5 until you finish your transcription. Try to gauge how many seconds of spoken-word audio you can memorize for transcription. The more speech you can memorize, the faster you can transcribe. Don’t get impatient. Unless you have a photographic memory, doing more than 30 seconds at a time may result in mistakes or omissions.
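To plan the passes through a long recording, the segment boundaries can be worked out ahead of time. The sketch below is only an illustration: the names segment_bounds and as_timestamp are hypothetical, and the 30-second default simply mirrors the limit suggested above.

```python
import math


def segment_bounds(total_seconds, chunk_seconds=30):
    """Split a recording of total_seconds into (start, end) pairs.

    Every segment is chunk_seconds long except possibly the last one.
    """
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and chunk_seconds > 0")
    count = math.ceil(total_seconds / chunk_seconds)
    return [
        (i * chunk_seconds, min((i + 1) * chunk_seconds, total_seconds))
        for i in range(count)
    ]


def as_timestamp(seconds):
    """Format a number of seconds as MM:SS for finding the spot in the waveform."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Example: a 75-second recording yields three segments to memorize and dictate.
for start, end in segment_bounds(75):
    print(as_timestamp(start), "to", as_timestamp(end))
```

If 30 seconds proves too much to memorize, pass a smaller chunk_seconds value, such as 15, to get shorter segments.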

Tips & Warnings

  • Mac users will need to purchase a USB headset for dictation. Cheaper mini-jack headsets work on PCs only.