Friday, September 26, 12-2 p.m.
Dynamic Units in Speech Production

Observation of the physical characteristics of speech indicates that the human movements used to produce speech are temporally patterned and that neighboring movements largely overlap one another in time. This description stands in sharp contrast to a traditional phonological description of spoken language as composed of a small number of discrete and concatenated atemporal units. These apparently contrasting views can be reconciled in large measure by adopting an approach to phonological representation in which the atomic building blocks of words are dynamical. Over the past decade, Browman and Goldstein have pursued such an approach by investigating the hypothesis that articulatory gestures are fundamental units in speech production. This colloquium will present work done with Louis Goldstein, Elliot Saltzman, and others that has generated evidence for dynamical units in speech production and will discuss the ramifications of this approach for understanding how surface variability in performance can arise given underlyingly invariant or stable phonological units.

In recent experimental work (Goldstein, Pouplier, Chen, Saltzman & Byrd, in prep.), evidence for gestural action units has been obtained through an investigation of the kinematics of speech errors produced when a simple phrase is repeated. Because of the intrusive and/or gradient character of many gestural errors, we argue that an examination of articulatory data is crucial to describing and interpreting speech errors with respect to phonological representation.

Next, assuming gestures as dynamical units in speech production, we must consider how invariant representations can be maintained given the lack of invariance in their public manifestation. Evidence has demonstrated that surface variability in the execution of articulatory gestures arises from many sources, including competition among gestures (e.g., the influence of adjacent context) and prosodic structure. Recent work by others and by us (e.g., Byrd & Saltzman, 1998, 2003) has suggested that gestural attractor dynamics are not fixed but rather vary as a function of a gesture’s position in the word and phrase. We consider the question of how phrasal structure gives rise to the observed patterning of low-level, inherently temporal, action units. Our approach provides a theoretical reconciliation of what in the past has been an inconsistency in the manner in which prosodic structure and segmental structure have been accommodated in the dynamical approach. Namely, we suggest that both “segmental” and “suprasegmental” structure have inherently temporal properties that are crucially coordinated in speech planning.

[Work supported by NIH.]