Prosodylab-Aligner

The Prosodylab-Aligner is a set of Python and shell scripts, developed in our lab by Kyle Gorman, for automatically aligning text to speech audio using Hidden Markov Models. It is designed to be as easy to use as possible, especially with data elicited in a laboratory setting. While it ships with pre-trained North American English monophone models based on data collected in our lab, it also supports training on arbitrary data.

NB: Training a new acoustic model does not require any time-aligned training data. To train a new acoustic model, you need at least one hour of audio, accompanying word-level transcriptions, and a pronunciation dictionary. Optimal time alignments are learned during model training. Ideally, each audio file and its transcript correspond to a single utterance/sentence or breath group, but in practice shorter or longer files can also be used for training.
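Before training, it can help to check a data set against the requirements above. The following is a minimal sketch, not part of the aligner itself: it assumes a flat directory of WAV files in which each recording has a same-named transcript file next to it. The function names, the `.lab` transcript extension, and the directory layout are assumptions made for illustration; check the aligner's own documentation for the layout it expects. The pronunciation dictionary is not checked here.

```python
import wave
from pathlib import Path

# "At least one hour of audio", as stated in the text above.
MIN_SECONDS = 3600.0


def wav_seconds(path):
    """Return the duration of one WAV file in seconds."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()


def check_training_dir(data_dir, transcript_ext=".lab"):
    """Summarize a flat directory of WAV files and same-named transcripts.

    Returns (total_seconds, missing), where `missing` lists the WAV files
    that have no transcript file next to them. The transcript extension
    is an assumption and can be changed by the caller.
    """
    total = 0.0
    missing = []
    for wav in sorted(Path(data_dir).glob("*.wav")):
        total += wav_seconds(wav)
        if not wav.with_suffix(transcript_ext).exists():
            missing.append(wav.name)
    return total, missing


def enough_audio(total_seconds, minimum=MIN_SECONDS):
    """True if the total duration meets the one-hour guideline."""
    return total_seconds >= minimum
```

Calling `check_training_dir` on a data directory returns the total audio duration in seconds and the list of recordings that lack a transcript; `enough_audio` then compares that total against the one-hour guideline.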

Where to get it

The aligner is available from GitHub (click the “Download” button to get the aligner, or clone/fork the repository).

System Requirements & Installation

Prosodylab-Aligner has been tested on Mac OS X and should work on Linux. The installation instructions are being updated; for the time being, you’ll find instructions for installation on Mac here:

Installation on Macs

If you’ve used the aligner to train models for a new language, you can share the dictionary and your model at this git repository: prosodylab-alignermodels. See the README.md for more information. It will help others to align more data from that language.

Please stay tuned for updates in the coming week, since we’re still finalizing the documentation of the new features and changes (as of January 6, 2014). In the meantime, if you’d like to use the old aligner (compatible with the video tutorials below), you can still install it here:

Prosodylab-Aligner, Previous Version

Video Tutorial (based on old aligner)

Erin Olson (McGill ’11, now doing her Ph.D. in Linguistics at MIT), one of the developers of the aligner, has posted a video tutorial here:

Bug Report

Please submit bugs to Kyle Gorman at OHSU.

How to Cite

If you use this tool, we would appreciate it if you cite the following paper:

Gorman, Kyle, Jonathan Howell and Michael Wagner. 2011. Prosodylab-Aligner: A Tool for Forced Alignment of Laboratory Speech. Canadian Acoustics. 39.3. 192–193. [preprint] [paper]

Alternatives

You may also find the Penn Phonetics Lab forced aligner useful if you are studying American English. Unlike Prosodylab-Aligner, it does not support training. The newest sources are on SourceForge. Forced-alignment capability is also built into recent versions of Praat.

Acknowledgements

Work on the aligner was funded by the following grants to Michael Wagner:

FQRSC Nouvelle Chercheur NP-132516
SSHRC Digging into Data Challenge Grant 869-2009-0004
SSHRC Canada Research Chair