About the Transcriber Project
Despite significant progress in speech recognition algorithms over the last few decades, most real-world speech recognition work is still done by people. The basic tool for this work is the transcriber.
Typists and journalists use transcribers to convert speech recordings (phonograms) into text documents. Such recordings may come from meetings, interviews, lectures, court sessions, conferences and so on. Foreign language teachers and their students use transcribers for listening comprehension practice. Philips has produced dedicated hardware transcribers for many years. Hardware transcribers use real recorders, as in the following picture.
Software transcribers use the multimedia capabilities of personal computers. Most software transcribers currently on the market are built on simple text editors such as WordPad; only the most advanced programs of this kind use Microsoft Word. In essence, a simple transcriber is a text editor supplemented with sound player capabilities. The most advanced transcribers use sound labels (a special type of hyperlink) in text documents to start playback from an arbitrary time mark. Usually such a time mark corresponds to a specific event, such as "meeting start", "appearance of a meeting participant" and so on.
Professional typists type 180 or more characters per minute and as a rule do not use a mouse. Instead they control the transcriber with a foot control, as in the following picture.
The typical categories of transcriber users are
Atypical transcriber users could, for example, store playlists directly in Microsoft Word documents.
This project integrates two great technologies: Microsoft Word and Windows Media Player (referred to below simply as the "player"). The template "AhWMPlayer2.dot" (the "Program") turns Microsoft Word into a fully functional digital transcriber: an audiotext editor in which professional typists can type while listening to phonograms and controlling playback. Audiolabels give direct access to any part of any phonogram.
The Program has been tested with
The package contains the following files:
The following file(s) may optionally be included in the demo version.
The demo version of the Program is freeware. An electronic dongle is not required. The main limitation of the demo version is the lack of some important features of professional transcribers: foot control and the special tempocorrection DSP plugin are not supported.
To set up the Program, perform the following actions:
1. Unzip the archive.
2. Copy the files "AhWMPlayer2.dot", "AhWMPlayer2.ini" and "AhPlayer2Eng.chm" to the Microsoft Office Startup folder (usually "C:\Program Files\Microsoft Office\OFFICE11\Startup").
3. If you need foot control support, copy the file "AhPlayer_FC.dll" to the system folder (usually "c:\windows\system32").
4. Launch or relaunch Microsoft Word.
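For users who install the Program on several machines, the copy steps above could be scripted. The following is only a minimal sketch in Python, not part of the package: the function name install_program and its parameters are hypothetical, the archive is assumed to be unzipped already, and the destination folders must be supplied by the caller because they differ between Office and Windows versions.

```python
import shutil
from pathlib import Path

# File names are taken from the setup steps above.
STARTUP_FILES = ["AhWMPlayer2.dot", "AhWMPlayer2.ini", "AhPlayer2Eng.chm"]
FOOT_CONTROL_DLL = "AhPlayer_FC.dll"


def install_program(package_dir, startup_dir, system_dir=None):
    """Copy the Program files out of an already unzipped package.

    package_dir: folder holding the unzipped archive (step 1).
    startup_dir: the Microsoft Office Startup folder (step 2).
    system_dir:  the system folder; pass it only when foot control
                 support is wanted (step 3).
    Returns the list of destination paths that were written.
    (Hypothetical helper; the package itself ships no installer.)
    """
    package_dir = Path(package_dir)
    jobs = [(name, Path(startup_dir)) for name in STARTUP_FILES]
    if system_dir is not None:
        jobs.append((FOOT_CONTROL_DLL, Path(system_dir)))

    # Check everything first so that a missing file does not leave a
    # half-finished installation behind.
    missing = [name for name, _ in jobs if not (package_dir / name).is_file()]
    if missing:
        raise FileNotFoundError("missing from package: " + ", ".join(missing))

    copied = []
    for name, target_dir in jobs:
        target_dir.mkdir(parents=True, exist_ok=True)
        copied.append(Path(shutil.copy2(package_dir / name, target_dir / name)))
    return copied
```

Writing to the Office Startup folder or the system folder normally requires administrator rights, and step 4 (launching or relaunching Microsoft Word) still has to be done by hand.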
The following message appears when Microsoft Word launches, because the Program uses the Windows Media Player ActiveX control.
Just press the "OK" button to continue.
The Program is supplied "as is"; no technical support is provided. The author will be glad to receive your feedback by e-mail.
Tempocorrection algorithms have been used in professional software transcribers for about the last twenty years. Lately even musicians have been using them; see http://www.ronimusic.com/.
Windows Media Player supports tempocorrection directly through the menu item "View\Enhancements\Play Speed Settings...".
Moreover, the capabilities of Windows Media Player can be extended with special DSP plugins. The current version of the Program uses the player's built-in tempocorrection support.
Windows Media Player v. 9 supports tempocorrection for the .wma, .wmv, .wm, .mp3, and .asf media file formats. However, tempocorrection may not be available when playing streaming or progressively downloaded media.
Unfortunately, tempocorrection is not supported for the WAV format.
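The format restriction above can be checked before a phonogram is opened. The following Python sketch is an illustration only and is not how the Program itself works: the helper name supports_tempocorrection is hypothetical, and the check looks at the file extension alone, so it cannot tell whether a file is being streamed or progressively downloaded.

```python
from pathlib import PureWindowsPath

# Extensions listed above as supported by Windows Media Player v. 9.
TEMPO_FORMATS = {".wma", ".wmv", ".wm", ".mp3", ".asf"}


def supports_tempocorrection(filename: str) -> bool:
    """Return True if the file extension is in the supported list.

    Hypothetical helper: the comparison is case-insensitive, and
    Windows-style paths are accepted on any platform.
    """
    return PureWindowsPath(filename).suffix.lower() in TEMPO_FORMATS
```

For example, supports_tempocorrection("interview.MP3") is true, while supports_tempocorrection("lecture.wav") is false, so a WAV phonogram would have to be converted to one of the listed formats before its playback speed can be changed.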
The Program has not been tested with streaming audio, including Internet radio stations.
The basic audiolabel information items are listed in the following table.
Because audiolabels must stand out from ordinary document text and must be easy to change, use and delete within the document, the best choice is to manage them with special Microsoft Word styles.
As stated above, special Microsoft Word styles are the best choice for managing audiolabels. Let's introduce the style conventions used for managing them.
If the style AhSoundLink or AhSoundText is absent at the moment of insertion (for example, because the user has deleted it), the Program automatically creates the default style definitions, which are shown in the following pictures.
The following lines illustrate the audiolabel structure format
Some examples of audiotext (or sound) documents are shown below.
Sound document "Detochkin.doc"
Sound document "Dialog131.doc"