Project workflow

The different steps in the phonetic mining and archiving of our audio data.

Lars Hinrichs https://larshinrichs.site (The University of Texas at Austin)https://utexas.edu
10-15-2020

Data processing steps

The whole project produces 5 assets for each recorded sound file (including the audio itself) which need to be stored and staged somewhere.

Folders for the five asset types in our working directory on box.com.

Figure 1: Folders for the five asset types in our working directory on box.com.

Get audio and metadata

The RA downloads existing interview files and the available metadata about the speaker.

For these older speech files (the gramophone recordings, or the “Willard Collection”), we use the metadata that Elizabeth Strong collected for us and placed on the https://archive.texasenglish.org page for us. For other recordings, we have a spreadsheet in our vaults such as this one, which the RA can look up.

Naming conventions

RAs must assign a unique speaker ID according to our own conventions. See the 4th slide below.

RA renames the audio with the unique ID and uploads it to a folder labeled “no. 1” in our shared project folder on Box.

Transcription

The RA then produces an orthographic transcript. The transcription guidelines we follow, which are probably not relevant to the data management issues we’re discussing, are available here. We have used different methods of producing transcripts, but the dominant method is transcription in Praat. See also www.praat.org.

Forced alignment and vowel analysis

The audio file and orthographic transcript are uploaded for semi-automatic processing in DARLA. DARLA processes server-side and e-mails five different files to the student. Of these, we retain the force-aligned TextGrid (contains tier with phonetic transcription, timestamped at milisecond level for each speech sound) and the CSV file with harmonic formant measurements2 at 5 points across the duration of each vowel sound.

The RA uploads these two files into dedicated folders, labeled “3” and “4”, in the shared Box drive.

Post-processing in Praat

Finally, an RA (usually a different one) loads the audio in Praat one more time together with the force-aligned TextGrid from DARLA and, using a custom Praat script, produces another dataset with 21 instead of 5 harmonic formant measurements for each vowel. This file is then again uploaded in a dedicated folder labeled “5”. This file looks the same as the 5-point measurements CSV file, it only has more columns.

Analysis

This is the work that the other PIs and I on this grant do. I am mentioning this here because analysis usually happens in a Github repo (an RStudio project that is version controled with Github). Early-stage outputs of an exploratory type can be gleaned on this blog.


  1. Please let me know your Google ID so I can add you to the site viewers.↩︎

  2. Strictly speaking, these should be called “estimates,” but most linguists say measurements.↩︎

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Hinrichs (2020, Oct. 15). Originalitätsverdacht | Bohmann, Bohmann & Hinrichs: Project workflow. Retrieved from https://actuation.netlify.app/posts/workflow/

BibTeX citation

@misc{hinrichs2020project,
  author = {Hinrichs, Lars},
  title = {Originalitätsverdacht | Bohmann, Bohmann & Hinrichs: Project workflow},
  url = {https://actuation.netlify.app/posts/workflow/},
  year = {2020}
}