Our tools are highly accurate: they give reliable results that match or outperform current state-of-the-art systems and are quantitatively similar to results from human annotators. They were developed over more than five years of collaboration between linguistics and computer science researchers.
All tools are free to use, released under the MIT license, and available at our lab's repository. You can use them in your own research, or modify and extend them.
Researchers can use these tools in two different ways. Each tool is integrated with Praat as a plugin, allowing straightforward single-file processing in a graphical interface. In addition, all tools have a simple command-line interface for batch processing large numbers of files.
Automation of phonetic measurement is an integral part of working with large datasets in Laboratory Phonology. A challenge for such systems is to match the reliability of human annotators. We introduce four systems for reliable, automatic measurement of phonetic features that are often of interest to speech production researchers: word duration, vowel duration, voice onset time, and formant estimation and tracking.
The systems are designed to work with variable-length inputs and do not need to be synchronized with the onset or offset of the phonetic interval of interest. The signal processing and the acoustic signal representation were tailored for each task individually, allowing the systems to exhibit the degree of precision and reliability needed to support research in Laboratory Phonology. To create systems that would produce reliable output, we used state-of-the-art deep neural network (DNN) models and trained them on corpora that were manually annotated at Northwestern University and Ohio State University. All tools are currently available for Mac and Linux, and will soon be available for Windows.
Our GitHub repository gives you access to our tools and information on setup and use.