Institute for Sensory Research, Department of Bioengineering and Neuroscience, Syracuse University, Syracuse, NY 15244-5290, USA;
Department of Physical Medicine and Rehabilitation, Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins University and Good Samaritan Hospital, Baltimore, MD 21239, USA
* corresponding author, karen_hiiemae{at}isr.syr.edu
Abstract (I) Introduction (II) The Hyolingual Complex (1) THE FUNCTIONAL ANATOMY OF THE HYOLINGUAL COMPLEX (2) THE KINETIC CHAIN (III) Measuring Tongue Movements (IV) The Moving Tongue (1) METHODS OF DATA ACQUISITION (2) ELECTROPALATOGRAPHY (EPG) (3) ELECTROMAGNETIC ARTICULOMETER (EMA) (4) APPLIED DIAGNOSTIC CINERADIOGRAPHY (CFG) AND VIDEOFLUOROGRAPHY (VFG) Data reduction Lateral projection motion recording (5) X-RAY MICROBEAM (XRMB) (6) ULTRASONOGRAPHY (US) (7) MAGNETIC RESONANCE IMAGING (MRI) (8) SUMMARY (V) Tongue Movements in Feeding and Speech (1) STAGE I TRANSPORT (2) PROCESSING (3) STAGE II TRANSPORT (4) BOLUS FORMATION AND DEGLUTITION (5) TONGUE SHAPES IN FEEDING (VI) Tongue Movements in Speaking (VII) Tongue Shapes in Speaking/Vocalization (VIII) The Hyolingual Musculature (IX) Modeling the Tongue (X) Directions for Future Research Acknowledgments References
| Abstract |
|---|
Key words. Tongue, regional anatomy, eating, speech, deglutition
| (I) Introduction |
|---|
The mammalian tongue has vital functions in feeding: It plays a major role in ingestion, as in licking, lapping, and browsing; and it moves food distally through the oral cavity from the incisors to the post-canines for chewing, and then to the pharynx for bolus formation and swallowing. In dogs, the tongue has a thermoregulatory function in panting. Tongue position relative to the posterior pharyngeal wall is important in respiration. Chemoreceptors and mechanoreceptors in the tongue surface sense the nature and mechanical properties of ingested food, and prevent the ingestion of noxious substances. In addition, tongue shape and position in the oral cavity influence the shape and dimensions of the airway between the palate and the tongue surface in mammals with Type I tongues (Doran and Baggett, 1971). Given the dominant role of speech in human interactions, an overwhelming proportion of the research on tongue movement focuses on its role in speech, specifically vowel and consonant production. [Nor is it surprising that speech research uses an internationally recognized alphabet of notations and abbreviations that are difficult for the oral biologist to follow. Where possible, those specialized usages have been avoided here.] Until recently, studies of the patterns of tongue movement in feeding were focused on non-human mammals (see Hiiemae and Crompton, 1985; Hiiemae, 2000). Since 1992 (Palmer et al., 1992), there have been four reports on tongue, hyoid, and jaw movements in human subjects (complete sequences from initial ingestion to terminal swallow) when consuming foods of different initial consistencies (Palmer et al., 1997; Palmer, 1998; Hiiemae and Palmer, 1999; Hiiemae et al., 2002). In the same period, there have been significant developments in the approaches to tongue behavior in speech.
Doran (1975) and Doran and Baggett (1971) identified two types of mammalian tongues: Type I is the spatulate fleshy tongue found in almost all mammals. It can protrude up to 50% over its resting length and is capable of fairly complex movements. Type II tongues are the highly flexible whip-like organs found in anteaters and other myrmecophagous mammals. [For reviews of tetrapod feeding mechanisms, including a chapter on mammals, see Schwenk, 2000.] Although differing in their details, all Type I tongues share common characteristics and general architectures. Non-human mammals (e.g., opossums, rats, mice, rabbits, cats, tenrec, and a range of primates, but particularly the macaque) have been used in studies of feeding. Cats, rabbits, guinea pigs, and rats have also been used in studies investigating the neural control of that rhythmic activity. Clearly, tongue movements in speech can be studied only in humans.
There is almost no common ground between the feeding/physiology literature and that focusing on speech, but a bridge may be appearing. MacNeilage (1998) hypothesizes that the movements of the tongue and jaw in speech (which he terms cyclicities) evolved from their movements in feeding. This idea has its supporters and detractors but is superficially very appealing. No one has yet attempted, as far as we know, to test it experimentally. This idea is particularly relevant if one is interested in the evolution of speech, since, as should become clear within the body of this review, human tongue behavior in feeding builds on the patterns of movement in the hyolingual complex observed in other mammals. It is, therefore, reasonable to hypothesize that the matrix of tongue movements during human speech was derived from the wide variety of tongue movements found in suckling and feeding, although this view is controversial.
| (II) The Hyolingual Complex |
|---|
Tongue behavior cannot be divorced from hyoid movement, which is directly linked to motion of the mandible. The length and angulation of the floor of the mouth on which the tongue body rides are dictated by that linkage. Movements of the tongue surface can occur independently of hyomandibular movement within a limited range of jaw motion. [This relationship can be described as one in which a tapered sausage is attached to a mobile surface (the oral floor formed by the hyomandibular muscles) so that the sausage can change its shape as the floor moves.]
In a trenchant review of the muscles of the mandible, Last (1954) lays out the principles of these functional relationships. Last does not call these linkages a kinetic chain, but the thrust is just that. This concept can be represented as a series of linked muscle groups acting on two mobile skeletal elements, the hyoid and mandible (Fig. 1). Their relative positions determine the length and orientation of the floor of the mouth, and so the gross vertical and antero-posterior position of the tongue body relative to the hard palate. This concept is important, because many studies on the jaw musculature in the dental and related literature focus only on the adductors (i.e., temporalis, masseter, medial pterygoid). Some refer to the digastric as the primary abductor of the mandible (see Miller, 1991). By and large, the hyoid complex is ignored. The speech literature is quite different, in that the focus is on the relative position and movement of the articulators, emphasizing the tongue surface, palate, and lips (Fig. 2; also see Folkins and Kuehn, 1982). Folkins and Kuehn advance the concept of bidirectionality, in which they recognize that movement in one part of the system affects all the others. The literature on the anatomy of the tongue (e.g., Lowe, 1981) dismisses the hyoid muscles as belonging to the floor of the mouth. It is essential to emphasize that global tongue position and so its movements in feeding and speech are directly correlated with the length and orientation (position) of the floor of the mouth (the base of the tongue body), i.e., hyoid position.
(2) THE KINETIC CHAIN
The muscular linkages among the mandible/lower jaw, the hyoid, the cranial base (see Fig. 1), and sternum have the following properties:
First, during feeding, the hyoid is in continuous motion, so that the relationship between it and the lower jaw changes constantly. Hyoid movement is linked to that of the opening and closing of the jaws (the masticatory/chewing cycle) and therefore to activity in the mandibular adductors (Figs. 1, 2, 3, 4). Hyoid motion results from change in the relative positions of, and distance between, the hyoid and the mandibular symphysis, which, in turn, depends on mandibular position relative to the cranium. Analysis of the experimental data shows that the hyoid can travel upward and forward toward a slowly opening jaw, and that it can be pulled sharply backward from a jaw held in a wide gape (Hiiemae et al., 2002; also see Carlsöö, 1956; Pancherz et al., 1986). It is also clear that the geometry of the relationship between the hyoid and the mandible in man is such that their relative positions are affected by the direction and amplitude of jaw movement (Folkins and Kuehn, 1982). It has been argued (Thexton and McGarrick, 1988, 1989) that if cinefluorographic (CFG) or videofluorographic (VFG) data are examined with the lower (mandibular) occlusal plane as the reference plane, then hyoid and tongue movements confined within the tongue body can be analyzed without the distorting effect of jaw movement. This is simply not the case for man, where the jaw and tongue are relatively much shorter than in other mammals (e.g., opossum, cat, and macaque; see Hiiemae and Crompton, 1985; Hiiemae, 2000) and the hyoid with the larynx lies below the posterior tongue rather than behind it.
Third, the complex biomechanics of the hyolingual apparatus have not been thoroughly studied. An issue here is hyoid movement and the correlated position of the oropharyngeal surface of the tongue. It cannot be assumed that the two exactly parallel each other. The tongue can bunch, heap, and twist, increasing its vertical dimension while shortening its postero-anterior axis.
| (III) Measuring Tongue Movements |
|---|
There are two continuing problems with investigations of tongue motion: first, the 2D representation of 3D events when standard imaging techniques are used (see Stone, 1990); and second, the speed with which these events can be recorded relative to their actual time course. These issues are discussed below in the context of each of the major data acquisition methods currently in use.
| (IV) The Moving Tongue |
|---|
(2) ELECTROPALATOGRAPHY (EPG)
Electropalatography (EPG) uses intra-oral sensors. Subjects wear an individualized thin (5 mm thick) plastic base-plate over the hard palate anchored to the maxillary teeth. Sensors that respond to tongue contact are embedded in the device; their number varies, typically 32, 64, or 128 (Folkins and Kuehn, 1982), with 62 in the EPG3 device (Hardcastle et al., 1991). [The commercially available EPG instrument (Kay Elemetrics Palatometer, 6300 Lincoln Park, NJ, USA) has 96 sensors.] The device has limitations, since the actual movement of the tongue is not measured, only the points of contact between its surface and the hard palate.
Although attempts to use EPG to measure tongue-palate contacts in feeding on solid foods failed (Heath, personal communication; Heath et al., 1980), Jack and Gibbon (1995) successfully measured tongue-palate contacts during the consumption of milk (liquid), yogurt (thick and creamy, but semi-liquid), and jelly. Chi-Fishman and Stone (1996) argue that EPG can be successfully used to study swallowing. However, the greater value of this method in speech research, when used in conjunction with other methods, is clear from Stone and Lundberg (1996), who compared the data obtained with the results of ultrasound in a study of tongue shape relative to palate. Similarly, the glossometer used by Flege (1988) had intra-oral sensors (2 x 3 x 6 mm) embedded in a thin (3 mm) plastic pseudopalate (comparable with the EPG device). Each sensor assembly has an LED and paired phototransistor. During data acquisition, the LEDs are pulsed in rapid succession, sending a beam of infrared light downward in a plane perpendicular to the occlusal plane of the teeth, so that the light is reflected from the tongue surface. The method cannot be used if anything is between the tongue and the palate, thus making it unacceptable for studies of feeding.
(3) ELECTROMAGNETIC ARTICULOMETER (EMA)
Another intra-oral technique, the electromagnetic articulometer (EMA), designed for use in the transduction of articulatory movements during speech production, relies on the attachment of tiny transmitter coils (4 x 4 mm base with a thickness of 2.5 mm) to the tongue surface, lips, and velum (see Fig. 1 in Perkell et al., 1992). Coils of that size would rapidly detach if used during feeding on foods other than liquids, since the tongue surface twists toward the post-canine teeth in every chewing cycle (see Fig. 5). This device requires scrupulous calibration and much manipulation of the data obtained. Recently, Kaburagi and Honda (2001) have used an EMA system to obtain articulatory data to test their dynamic model of the tongue. An equivalent electromagnetic system (the Sirognathograph; see Hiiemae et al., 1996; Kazazoglu et al., 1994) accurately records jaw movement in 3D but cannot be used for the tongue, given the problem of intra-oral transducers when feeding. Other devices, such as strain gauges (Muller and Abbs, 1979), used to measure force or displacement of the lips and mandible, are viable tools for some speech research but, again, are unsuitable for feeding studies, because such methods cannot be applied to the tongue (Folkins and Kuehn, 1982).
(4) APPLIED DIAGNOSTIC CINERADIOGRAPHY (CFG) AND VIDEOFLUOROGRAPHY (VFG)
This technique (CFG) became available in the 1950s and was used for the earliest studies of human swallowing (e.g., Ardran and Kemp, 1955). Perkell (1969) performed an exhaustive analysis of tongue movements in a single male subject while recording 13 nonsense utterances (each with an unstressed followed by a stressed syllable in combinations of 7 vowels and 6 consonants as well as a single short sentence). The first clinical cameras were slow (25-30 frames per sec), and they also used 35-mm film, which had to be laboriously analyzed with special equipment. Radiation exposure for human subjects soon became a concern. After an initial flurry of activity, human studies (CFG) effectively ceased, only to resume in the late 1980s, when videofluorography (VFG), which requires much lower radiation levels, became a standard radiological diagnostic tool. Hiiemae (1967, 1968; Hiiemae and Ardran, 1968) used a 35-mm CFG diagnostic machine to analyze patterns of mandibular motion in rats. That pioneering study was followed by a series with opossums and then other non-human mammals (see Hiiemae and Crompton, 1985; Hiiemae, 2000). The duration of single masticatory cycles in humans ranges from about 450 to 1000 msec, with swallowing cycles the longest. A cycle 600 msec long recorded at 30 fps would include about 18 frames of film, or 36 interlaced videofields. Chewing cycles in small mammals are much faster, i.e., on the order of 250-350 ms; 9 frames or 18 videofields are inadequate for the study of such movements. Many of the early records (see Hiiemae and Palmer, 2001) were jerky and difficult to interpret. If such rapid motion was to be investigated, recording speed had to increase. Cinefluorographic facilities for animal studies were installed, first at the Yale Peabody Museum and then at the Museum of Comparative Zoology at Harvard. Those dedicated systems, filming at 100 fps, provided the basis for a series of studies in which the complete feeding process in a wide variety of mammals, including the role of the tongue in food transport, was described (Hiiemae et al., 1978; Hiiemae and Crompton, 1985, et seq.). Those mammalian studies formed the basis for the Process Model of Feeding in humans [discussed below (Hiiemae and Palmer, 1999)].
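The frame-count arithmetic above can be sketched in a few lines. This is an illustrative calculation only; the function name `frames_per_cycle` and its parameters are hypothetical, and it assumes that an interlaced recording yields two videofields per frame.

```python
def frames_per_cycle(cycle_ms: int, fps: int, interlaced: bool = False) -> int:
    """Count the complete frames (or interlaced videofields) that span one cycle.

    Assumption: an interlaced recording provides two fields per frame.
    """
    rate = fps * 2 if interlaced else fps   # images per second
    return (cycle_ms * rate) // 1000        # whole images within the cycle


# Human cycle of 600 msec at 30 fps, as frames and as videofields.
human_frames = frames_per_cycle(600, 30)
human_fields = frames_per_cycle(600, 30, interlaced=True)

# Small-mammal cycle of about 300 msec at 30 fps and at 100 fps.
small_slow = frames_per_cycle(300, 30)
small_fast = frames_per_cycle(300, 100)
```

At 100 fps, the same 300-msec cycle is sampled by roughly three times as many images as at 30 fps, which is the motivation for the dedicated high-speed systems described in the text.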
Those early efforts highlighted the need for reproducible standardized reference points within a complex system which has all its parts in motion. The first markers for the measurement of jaw movement were simple amalgam fillings on the buccal surface of canines or molars which appeared as black dots in the films. To examine tongue surface motion in animals (opossums, cats, hyraces, and macaques), investigators used a hypodermic needle to insert small metal pellets just under the gustatory mucosa in anesthetized animals (see Hiiemae and Crompton, 1985, for specific references). Our recent human studies (e.g., Palmer et al., 1997) have used small lead discs (4 x 0.4 mm) cemented to upper and lower teeth, and to the tongue. Similar markers have also been used by Kuehn (1976), Tomura et al. (1981), and Stone and Lele (1992); also see Gay et al. (1994). Gold pellet tongue markers were used at the Microbeam Facility at the University of Wisconsin (Hamlet, 1989; Westbury et al., 2000; Tasko et al., 2002). However, with that technique, the only images were of actual marker positions (see below).
Data reduction
To plot movements of markers over time in lateral projection radiographs (Figs. 3A, 3B), one must establish the Cartesian coordinates for each marker and then manipulate them to give its position relative to a reference plane within the orofacial complex (Fig. 2B). We have traditionally used a palatal reference with the X axis defined as the line between upper canine and molar markers (representative of the occlusal plane of both the upper post-canines and of the hard palate). This choice was dictated by the functional relationship between the tongue surface and hard palate in feeding. It works equally well for speaking (Fig. 3), since that also depends on the changing relationship between the tongue and hard-palate articulators. The mandibular plane is defined by the line between lower canine and lower molar markers, and is perpendicular to the sagittal plane. This reference plane was used in some of the animal studies as a means of examining tongue movement in isolation (see Thexton and McGarrick, 1988, 1989). [Details of the data reduction methods used in VFG studies on non-human mammals can be found in the references in Hiiemae and Crompton (1985) and Hiiemae (2000).]
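A minimal sketch of such a coordinate manipulation follows. It is not the authors' data reduction software; the function name, the choice of the upper canine marker as origin, and the direction of the positive axes are all assumptions made for illustration. Only the definition of the X axis as the line between the upper canine and molar markers comes from the text.

```python
import math


def to_palatal_frame(marker, upper_canine, upper_molar):
    """Express a digitized (x, y) marker position relative to a palatal reference.

    Assumptions (not from the source): the origin is the upper canine marker,
    +X points from the canine toward the molar marker, and +Y is +X rotated
    by 90 degrees in the digitizing coordinate system.
    """
    cx, cy = upper_canine
    mx, my = upper_molar
    dx, dy = mx - cx, my - cy
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("canine and molar markers coincide")
    ux, uy = dx / length, dy / length       # unit vector along the palatal X axis
    rx, ry = marker[0] - cx, marker[1] - cy  # marker relative to the origin
    x = rx * ux + ry * uy                    # projection onto the X axis
    y = -rx * uy + ry * ux                   # projection onto the perpendicular
    return x, y


# Hypothetical digitized positions for one frame (arbitrary image units).
tongue_xy = to_palatal_frame((2.0, 5.0), upper_canine=(0.0, 0.0), upper_molar=(0.0, 10.0))
```

Applying the transformation frame by frame, with the canine and molar markers re-digitized in each frame, expresses every marker relative to the palate, so head movement between frames does not appear as marker movement.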
Lateral projection motion recording
Lateral projection motion recording of the orofacial complex provides a 2D image of 3D events (Hiiemae et al., 2002). This issue has been discussed by Stone (1990). However, many of the animal studies used a conventional 16-mm cinecamera, synchronized to the fluoroscopic camera, to record the animals in frontal view to provide a measure of medio-lateral jaw motion and to identify active and balancing sides in chewing.
In practice, research with human subjects can use one of two VFG projections: The lateral projection allows movements in the vertical and horizontal planes to be measured; the postero-anterior (P-A) projection, medio-lateral and vertical movements. (It should be noted that our human subjects research review boards approved protocols [Institutional Review Board, IRB] allowing us a lifetime total of 5 min of VFG recording per normal subject.) However, the rate of data acquisition is still 30 fps. This creates a problem when VFG is being recorded with other signals, such as EMG. Each videoframe is acquired over the entire 33.33-ms period as the videocamera tracks across and down the screen. Digital data, usually acquired at minimally 500 Hz, must be manipulated to reconcile with the VFG frame period. This means that an EMG event can be identified with a specific frame but not precisely where it occurs within the frame (see Palmer et al., 1992). High-speed digital cameras are now available but are not yet used for routine diagnostic VFG testing and so are not available for experimental purposes.
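The reconciliation of digital samples with video frames can be illustrated as follows. This is a sketch under stated assumptions: a nominal rate of exactly 30 fps, a 500-Hz digital signal, and a shared time zero for both recordings. The function names are hypothetical.

```python
def frame_of_sample(sample_index: int, sample_rate_hz: int = 500, fps: int = 30) -> int:
    """Return the index of the video frame during which a digital sample was taken.

    Assumes both recordings start at the same instant; integer arithmetic
    avoids floating-point rounding at frame boundaries.
    """
    return (sample_index * fps) // sample_rate_hz


def frame_window_ms(frame_index: int, fps: int = 30) -> tuple:
    """Return the (start, end) time in ms of the period over which a frame is acquired."""
    period = 1000.0 / fps
    return frame_index * period, (frame_index + 1) * period


# An EMG event at sample 17 (34 ms after the start) falls in frame 1,
# whose acquisition period runs from about 33.33 to 66.67 ms.
event_frame = frame_of_sample(17)
window = frame_window_ms(event_frame)
```

At these rates each frame spans 16 or 17 digital samples, so the event can be assigned to a frame but not to a position within the image, as noted above.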
(5) X-RAY MICROBEAM (XRMB)
An important data resource for tongue movement studies was created by the development of the x-ray microbeam. Invented in the early 1970s (Fujimura et al., 1973; Kiritani et al., 1977), this technology uses much lower levels of radiation than VFG. The limitation is that it images only the position of the gold pellets glued to the tongue and teeth. Additional instrumentation is needed to capture tongue surface information; for example, a sagittal ultrasound recording of the same utterance was recorded for each subject immediately after the microbeam record was obtained and matched to the pellet positions (see Stone, 1991). A large database (58 subjects) recorded by means of this instrument is now publicly available (Westbury, 1994). It has been used by Westbury et al. (2000) and Tasko et al. (2002) to examine tongue kinematics during speech and swallowing, respectively. Tasko et al. found so much variability in pellet trajectories among 12 subjects that it was remarkably difficult to develop a generalized description of tongue kinematics in liquid swallows.
(6) ULTRASONOGRAPHY (US)
Ultrasound (US) images soft tissue in real time (Sonies et al., 1981; Keller and Ostry, 1983; Stone et al., 1983; Stone and Shawker, 1986; Stone and Lundberg, 1996). It has several advantages over VFG: (a) There is no ionizing radiation, and (b) midline submental transducer placement minimizes masking of the tongue by the hard tissues (mandible and teeth). Recordings can be made at a 30-fps frame rate (30 Hz). Submental recordings show the changing shape of the tongue surface, although the presence of air under the anterior tongue and its lateral margins can prevent their imaging. To quantify tongue surface movements, investigators have used a marker pellet technique (Shawker et al., 1983, 1985). A major disadvantage of US is the absence of spatial information on the relationship between the visualized tongue surface and the rest of the vocal tract. Moreover, during the rapid pharyngeal portion of the swallow, posterior tongue motion is faster than the available frame rate. No one appears to have used US for complete masticatory sequences.
Ultrasound is widely used in speech studies. It was used to fill in the tongue profiles in the microbeam data (Stone, 1991). Combined with electropalatography (EPG) and jaw motion recording, the interactions of the tongue, palate, and mandible have been explored in speech production (Stone and Vatikiotis-Bateson, 1995). The use of US in studies of feeding has been largely confined to the analysis of tongue movement in the liquid swallow (Shawker et al., 1983; Stone and Shawker, 1986; Chi-Fishman and Stone, 1996). Imai et al. (1995) imaged the tongue in real time in normal subjects who ate six foods of very different consistencies and were able to report on the tongue's role in turning the food (toward the occlusal surface of the teeth on the active side; see Fig. 5), mixing it with saliva, sorting unsuitable particles (presumably too big to be swallowed), and contributing to bolus formation. They report that vertical motion of the tongue had two phases: sorting and bolus formation.
Stone and Lundberg (1996) generated elegant 3D models of tongue surface configuration for a substantial range of vowels and consonants (Fig. 6). They found that four classes of tongue shape were sufficient to account for and categorize all the sounds they imaged. The single female subject was asked to produce vowel and consonant sounds and sustain them for 15 sec to encompass the 10-second recording time needed. Although not a normal behavioral pattern, it was necessary if good experimental records were to be obtained. To develop the 3D images/models, the investigators reconstructed the data using essential parameters from the recording system and sophisticated software.
(7) MAGNETIC RESONANCE IMAGING (MRI)
MRI is a newer method for examining soft tissues for diagnostic and research purposes (see Lufkin et al., 1986). Readers are referred to the papers cited below for the details of the methods (signal generation, signal acquisition, and data reduction) used in each specific study. MRI has serious limitations as a research tool for studies of speech (phoneme production) or deglutition. First, the subject is supine, a particular problem for studies of feeding. Second, MRI data acquisition is slow when compared with the duration of normal feeding and speaking events, and especially with the pharyngeal transit time for a liquid bolus. The rate of data acquisition problem can be ameliorated for short speech productions: The subject is asked to repeat the utterance several times, and images are obtained with the use of a timed trigger at various stages of the utterance (gated data acquisition). The data are then pooled to reconstruct the tongue shape for that utterance. The best example of this is reported by Stone et al. (2001a), whose single subject was asked to repeat each of 6 consonant-vowel (C-V) combination syllables 96 times in succession to allow for 32 repetitions for each of three MRI slices. This subject's heroic effort did provide the basis for an evaluation of the method for the delineation of tongue surface shapes. However, the authors report that the study clearly demonstrated the potential problems with this method in any clinical context. The number of repetitions needed per slice continues to decrease with improved MRI methods. Stone et al. (2001b) report data using 13 and 4 repetitions for each slice. However, the biggest single problem remains the mandatory supine position.
Gilbert et al. (1998) used echoplanar MRI to examine tongue behavior (lingual tissue deformation) in swallowing by supine subjects who took 5 mL of water into their mouths through a plastic tube, swallowing the whole volume on command. Their results confirmed what is known from previous VFG and US studies (which had subjects seated upright). The MRI study, however, did produce time-varying geometric maps of the subsurface lingual tissue. Dry (saliva) swallows were examined by Napadow et al. (1999) to obtain data on the intra-lingual deformation of the tongue using eight normal human subjects. They developed a model for intra-lingual strain during these swallows. However, the somewhat global areas for strain in their figures appear to have little correlation with the known anatomy and intrinsic structure of the human tongue. Rather, they provide the basis for a novel approach to testing the muscular hydrostat model (Kier and Smith, 1985).
(8) SUMMARY
VFG remains the gold standard for the study of orofacial and pharyngeal behaviors in feeding. The authors are the only investigators to have used it for an extended speech passage (Hiiemae et al., 2002). That initial exploratory experiment could usefully be repeated with a design to allow for the dissection of jaw movement in the context of the phonemes produced. For speech [and feeding] studies, VFG has the disadvantage that the 2D image collapses valuable 3D data. It is clear that the other methodologies (US and MRI) can offer both the speech language community and the oral biology community methods by which the former, in particular, may be able to investigate appropriate and narrowly defined questions. For feeding studies, the current MRI data provide a demonstration of possibilities rather than any novel insights.
| (V) Tongue Movements in Feeding and Speech |
|---|
When the spatial domains used by the jaw, hyoid, and tongue markers in feeding and speech are compared (Fig. 4), there are clear differences. All markers show larger ranges of motion in feeding than in speech, at least in the sagittal plane. Tongue-palate contact is also less in speech. Our lateral projection images may mask movements of the lateral margins of the tongue, because of tooth radiopacity. These lateral tongue-palate contacts are important for certain phonemes (Stone and Lundberg, 1996). Gibbs and Messerman (1972) assert that the amplitude of jaw movement in speech is much smaller than in feeding, and this is confirmed by our data. When the centroid positions of the jaw in feeding and speech were compared, the difference was quite small (average, 1.2 mm). The centroid positions of anterior and posterior tongue markers also differed by only 1.1 and 0.8 mm, respectively. For the hyoid bone, however, the centroid position for speech was 10.2 mm antero-inferior to that for feeding (see Fig. 4). Analysis of the data presented in Hiiemae et al. (2002) shows that jaw and tongue marker movements in speech occur within the sagittal domains used for feeding, but that hyoid domains are significantly different. The data shown in Fig. 4 collapse temporal data from long sequences (more than 30 sec) to give the spatial domains (centroids). Centroid analysis is limited in that it omits consideration of the time domain. Future studies (planned and under way) will address the dynamics of the system.
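The centroid comparison can be sketched as below. The sketch assumes that a centroid is the mean of a marker's (x, y) positions over all frames of a sequence, which is one plausible reading of the text; the function names and the sample coordinates are invented for illustration and are not data from the study.

```python
import math


def centroid(points):
    """Mean (x, y) position of a marker over all frames of a sequence."""
    n = len(points)
    if n == 0:
        raise ValueError("no marker positions supplied")
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def centroid_shift(points_a, points_b):
    """Distance between the centroids of two sequences, in the input units."""
    (ax, ay), (bx, by) = centroid(points_a), centroid(points_b)
    return math.hypot(bx - ax, by - ay)


# Invented marker positions (mm) for two short sequences.
feeding = [(0.0, 0.0), (2.0, 2.0)]
speech = [(3.0, 4.0), (5.0, 6.0)]
shift_mm = centroid_shift(feeding, speech)
```

Because each sequence is reduced to a single point, the order and timing of the frames play no part in the result, which is the limitation noted above.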
The Process Model of Feeding (Palmer et al., 1997; Hiiemae and Palmer, 1999) describes four main sequential stages: Stage I transport, in which ingested food is moved from the incisal area to the post-canine teeth (premolars, molars) for processing; Processing, in which the food is reduced; Stage II transport, in which triturated food is moved through the fauces for bolus formation; and, last, bolus formation and deglutition. Specific jaw and tongue movements are associated with each stage.
(1) STAGE I TRANSPORT
Our experiments use pre-cut, standard weights/volumes of the test foods. Subjects deposit the food onto their tongues by using their fingers or by pulling the food off a cocktail stick with their anterior teeth. At the time of ingestion, the jaws are maximally open and the lips apart. As soon as the food is deposited, the bite is cradled on the anterior-middle tongue surface, and the posterior oral tongue is heaped. The tongue surface is rapidly depressed to the level of the mandibular occlusal plane as the hyoid and tongue body are pulled sharply backward and somewhat downward. This hyoid movement has two results: First, the oropharynx is almost closed (at least in lateral projection); and second, the bite is carried bodily backward on the retracting tongue. As the jaws start closing, the tongue starts to rise. The bite, pulled back to the level of the last molars, is carried forward and upward toward the first upper molars as the jaws approach minimum gape. There is usually a further lower-amplitude open-close movement before the bite is finally positioned on the mandibular occlusal plane of the presumptive active side by a twisting tongue movement (Fig. 5). We are describing the retraction of the tongue-hyoid-jaw complex in this behavior as pull-back.
(2) PROCESSING
Tongue movements occur in both the sagittal and coronal planes. In the sagittal plane, the hyoid, and with it the tongue surface, cycles (Palmer et al., 1997). As shown in Fig. 7, the anterior tongue marker orbits so that it moves from a downward position at maximum gape, upward and backward as the mandible moves up in the closing stroke. The tongue marker reaches its most backward position during closing, continuing to rise to reach its most palatal position just after the teeth reach occlusion. In the macaque, this upward movement was suspended for a few moments as the teeth reached occlusion at the end of the power stroke (Hiiemae et al., 1995). [Informally, we hypothesized that this pause explained the rarity of tongue biting during feeding!] During the intercuspal phase and as the jaws start to open, the tongue continues to cycle forward and then downward. As shown in Fig. 7, the orbit of the tongue surface cycle rises as processing proceeds, bringing the tongue surface progressively closer to the palate, culminating in palatal contact in the swallow. This cycling has the effect of moving chewed food progressively anteriorly. Intermittently, the tongue tip is elevated and used to collect this food from the anterior surface of the hard palate; as the jaws begin to separate, that bolus is then returned to the molar region, often by the pull-back mechanism, as the jaw reaches the following maximum gape. Sagittal tongue cycling is found in all mammals studied with VFG. The amplitude of the vertical component is greatest in man, but the pattern is common across all mammalian groups studied (Hiiemae and Palmer, 2001).
The relatively tight linkage between jaw-hyoid and tongue movements seen in processing often loosens after the first swallow in the sequence. The amplitude of jaw movement decreases and becomes irregular. At the same time, the tongue twists and turns. This period of clearance is used for the tongue to clear fragments of food from the vestibules of the cheeks and the floor of the mouth. Often one or more boli are formed during clearance, or a second processing sequence, and are then swallowed. Multiple swallows are normal in feeding sequences, particularly with harder or fibrous foods. Each sequence ends with a terminal swallow.
(3) STAGE II TRANSPORT
Stage II transport is defined as the movement of material through the pillars of the fauces or the posterior oral seal (Dua et al., 1997). This movement marks the start of the liquid swallow and the beginning of bolus formation in the oropharynx (Hiiemae and Palmer, 1999). The mechanism is simple: The tongue rises, with the tip and anterior surface coming into contact with the anterior hard palate. This contact then spreads posteriorly, squeezing the food distally behind the contact (as in finger compression on a toothpaste tube). Note that the tongue itself does not move backward; rather, points on the tongue sequentially move upward to come into contact with the palate. This mechanism is called squeeze-back and was first described in the opossum as squeeze-wedge (see Hiiemae and Crompton, 1985). There is one very important difference between Stage II in non-human mammals and in man. In the former, the incipient bolus passes through the fauces during the late (fast) opening and early (fast) closing phases of the jaw movement cycle, whereas in man it occurs in early opening. This subtle difference may affect interpretations of neurophysiological data on swallowing control mechanisms.
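As a purely illustrative aside, the squeeze-back mechanism can be sketched as a wave of palatal contact that travels posteriorly while each point on the tongue moves only vertically. The function names, the number of tongue points, and the unit palate height in this minimal sketch are assumptions introduced for illustration; they are not measurements from the studies cited.

```python
# Hypothetical illustration of squeeze-back: every name and number here is
# an assumption for the sketch, not data from the cited studies.

def squeeze_back(n_points=6, palate_height=1.0):
    """Return successive frames of tongue-point heights.

    Index 0 is the most anterior tongue point. In each frame one more
    point rises to the palate; no point ever changes its antero-posterior
    index, i.e. the tongue itself does not move backward.
    """
    heights = [0.0] * n_points
    frames = []
    for i in range(n_points):
        heights[i] = palate_height  # point i moves upward into palatal contact
        frames.append(list(heights))
    return frames


def contact_front(frame, palate_height=1.0):
    """Index of the most posterior point in palatal contact, or -1 if none."""
    front = -1
    for i, height in enumerate(frame):
        if height >= palate_height:
            front = i
    return front


frames = squeeze_back()
fronts = [contact_front(frame) for frame in frames]
print(fronts)  # the contact front advances one point per frame
```

In this sketch the contact front advances posteriorly frame by frame although each point has moved only upward, which is the sense in which food is squeezed distally without the tongue itself moving backward.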
(4) BOLUS FORMATION AND DEGLUTITION
The liquid swallow is the most intensively studied feeding behavior. The typical paradigm calls for a subject to hold a bolus of liquid in the mouth and swallow on command (Dodds et al., 1990). In this context, most subjects will form a bolus between the surface of the tongue and the palate, but some will hold the liquid at the floor of the mouth (respectively, the so-called tipper- and dipper-type swallows). In the swallow-ready position, the tongue perimeter forms a seal around the bolus anteriorly and laterally on each side. A posterior seal formed between the surface of the tongue and the palate at the junction of the hard and soft palates prevents premature passage of liquid into the pharynx. The tongue accommodates larger boluses by forming a deeper cavity (Kahrilas et al., 1993). When the command to swallow is given, the anterior area of tongue-palate contact expands posteriorly, squeezing the bolus toward the pharynx, and the back of the tongue drops, eliminating the posterior oral seal. These motions comprise the oral stage of swallowing. Note that the tongue motion is nearly identical to the squeeze-back mechanism of Stage II transport (Palmer et al., 1992).
Once the bolus passes into the pharynx, the pharyngeal stage of swallowing is immediately initiated (Dodds et al., 1990). The larynx folds shut, and the velopharyngeal isthmus closes. The pharyngeal surface of the tongue pushes posteriorly (so-called tongue base retraction), making contact with the contracting pharyngeal walls. This action pushes the bolus through the pharynx and the upper esophageal sphincter, which opens actively at the onset of the pharyngeal stage. Bolus propulsion is assisted by elevation of the pharynx and larynx as well as by sequential (cephalo-caudal) contraction of the pharyngeal constrictor muscles. Bolus propulsion by the tongue is most effective with large bolus volumes, but the pharyngeal constrictors have a larger role for small volumes (Kahrilas et al., 1993).
The swallowing of semi-solid and chewed solid foods is quite different (Palmer et al., 1992; Palmer, 1998; Hiiemae and Palmer, 1999). As discussed above, triturated food is pushed/propelled into the pharynx by the tongue during Stage II transport cycles. A bolus accumulates in the oropharynx during multiple transport cycles (oropharyngeal aggregation time, which may last up to about 10 or 12 sec in healthy individuals). When the swallow is finally triggered, the pattern is very much like that described for liquids: The tongue surface sweeps remaining food from the oral cavity into the pharynx (squeeze-back), and the pharyngeal surface of the tongue pushes backward to propel food through the pharynx (tongue base retraction).
Chi-Fishman and Sonies (2000) studied rapid sequential swallowing of liquid. They report drink and swallow cycles with repeated sequences of tongue propulsion. Some of their subjects merged two successive boluses in the hypopharynx before the onset of a pharyngeal response, while holding the larynx closed continuously to prevent aspiration. These sequential swallows of liquid resembled swallows of triturated solid food, in that the bolus was formed in the pharynx before swallow onset.
(5) TONGUE SHAPES IN FEEDING
The drawings included in Fig. 5 illustrate the appearance of the tongue at various stages in feeding (Abd-el-Malek, 1955). The depression of the anterior surface and the heaped posterior surface of the tongue we have recorded in Stage I transport are shown. Similarly, the lower pair of drawings shows the twisting movement used to place and then maintain food on the occlusal plane. These shapes represent the changes in gross tongue-surface morphology seen in the lateral and antero-posterior VFG recordings. They also show two important features of tongue movement in feeding: First, the lateral margins of the tongue can move independently of the mid-body; second, the anterior and middle segments can move independently to produce anterior hollowing synchronously with posterior heaping. What Abd-el-Malek was unable to do was measure dimensional changes within the tongue. Expansion and contraction of the tongue surface, measured by changes in the relative positions of tongue markers in protrusion and retraction, have been reported in the macaque (Hiiemae et al., 1995). There is agreement in the human literature (see below) that such differential segmental behavior occurs in man (Stone, personal communication).
| (VI) Tongue Movements in Speaking |
|---|
|
|
|---|
Hiiemae et al. (2002) did not specifically address the movements of the tongue surface in speaking, but the relatively compact spatial domains in Fig. 4B show movement within a more restricted space than for feeding (Fig. 4A). Since we can find no evidence of significant medio-lateral hyoid movement in man or, for that matter, in other mammals (Anapol, 1988), it is probably reasonable to assume that the speaking domains in Fig. 4B represent the actual sagittal range of tongue and hyoid markers, and of jaw movement. It must be noted that: (a) the teeth do not come into full occlusion during the reading of the Grandfather Passage, as evidenced by the spatial domains for the jaw marker; and (b) the anterior tongue marker makes almost no palatal contact except at the anterior margin of its range of movement.
The absence of published descriptions of global tongue movement in speaking is explained when the wide anatomical range of positions of the oral articulators in speech is considered (see Fig. 5-1 in Daniloff, 1973). Point-tracking techniques (EMA, XRMB) provide data on movements of the jaws, lips, and tongue. We are looking at a functional complex where the events of interest are both transitory and localized within the larger oral cavity and the oropharynx. It is therefore not at all surprising that the focus has been on tongue shape in phoneme production rather than on synchronous tongue, hyoid, and jaw movement patterns.
However, although he did not focus on the tongue movements in speaking that made possible the articulator interaction he was analyzing, some of Perkell's (1969) figures (especially his 3.2-3.4 and 3.15) imply a movement trajectory. His Fig. 3.15 is particularly interesting, since it shows an orbital movement of a posterior tongue marker when the subject uttered the /hák¤/ sounds. A following paper (Perkell et al., 1992) examined the velocity and acceleration of the lips in persons uttering a range of consonants and vowels in combinations.
| (VII) Tongue Shapes in Speaking/Vocalization |
|---|
|
|
|---|
Rather, we are focusing on those shapes given the hypothesis in the Introduction which suggests that tongue shapes in speech are consistent with, if not derived from, those seen in feeding. Using a novel 3-D ultrasound machine coupled with EPG, Stone and Lundberg (1996) were able to reconstruct the tongue surface in three dimensions when their subject was sounding 12 vowels and 6 consonants in American English. The EPG data complemented the ultrasound images by recording the extent and placement of tongue-palate contacts. This was important because analysis of the data showed the lateral margin contacts between the tongue and the palatal gingiva of the maxillary dental arcade. As Stone reported (1990), some tongue positions are stabilized by palatal contacts. After reconstructing the tongue-surface shapes for all 18 sounds, Stone and Lundberg concluded that the shapes could be grouped into four categories, one of which (two-point displacement) was seen only with the consonant ell; the other three were seen with both vowels and consonants. In their first category, front-raising, the anterior and middle tongue are raised, with the formation of a midline groove posteriorly extending into the oropharyngeal surface (Fig. 6). A complete midline groove with elevated lateral tongue margins characterizes their second category. This shape was associated with low vowels. Shape three is described as back-raising and is essentially the reciprocal of the first: The posterior and middle tongue are elevated, often with the appearance of an anterior groove or dimple. In two-point displacement, the tongue has an elevated anterior and posterior segment with a small central groove in the middle segment. Stone and Lundberg (1996) make a convincing case for the muscular hydrostat approach to tongue shape, pointing out that the local expansions and contractions reflect the redistribution of tongue substance to form the shapes they identify.
Are there similarities between tongue shapes in feeding and those in sound production? Clearly, there are (compare Figs. 5 and 6). Combinations of heaping and hollowing occur in both behaviors. Interestingly, Stone and Lundberg's front-raising shape is highly reminiscent of the tongue shape in squeeze-back, where the front of the tongue is raised and the posterior and oropharyngeal surface is grooved. The only tongue shape/movement not seen in speech is the twisting movement used to control food position and to retrieve food fragments from the vestibules and floor of the mouth in clearance (Fig. 5). However, the deformations of the tongue surface used in speech are more complicated than those in swallowing; it is the movements in food processing (including clearance) that show the full range of possible tongue shapes.
| (VIII) The Hyolingual Musculature |
|---|
|
|
|---|