Age-related changes in European Portuguese vowel acoustics

This study addresses the effects of age and gender on the acoustics of European Portuguese oral vowels, given the conflicting findings reported in prior research. Fundamental frequency (F0), formant frequencies (F1 and F2) and duration of vowels produced by a group of 113 adults, aged between 35 and 97 years, were measured. Vowel space area (VSA) according to gender and age was also analysed. The results revealed that the most consistent age-related effect was an increase in vowel duration in both genders. For women, F0 decreases above the [50-64] age group; for men, the data suggest a slight drop over the age range [35-64] and then an increase at older ages. That is, F0 tends to converge between genders as age increases. In general, there is no evidence that F1 and F2 frequencies lower as age increases. Furthermore, there were no changes in VSA with ageing. These results provide a baseline for establishing normal patterns of ageing in vowel acoustics among Portuguese adults.


Introduction
From young adulthood to old age, the speech production mechanism undergoes several anatomical and physiological changes. Moreover, there are substantial gender differences in the extent and timing of the ageing process [1,2,3]. Numerous studies have evaluated the effects of ageing on the acoustic properties of speech (e.g. [1,3,4]). Most of them have focused on fundamental frequency (F0) and have shown a decrease in F0 with ageing in women [5,6,7,3,8,9]; for men there is less agreement across studies, with some indicating that F0 decreases significantly above 60 [4,10], and others suggesting an F0 drop in men over the age range 30-50 and then an increase in F0 in older age [1,7,2,3,11,8,12,5].
Other studies have reported on age-related changes to formants (mostly F1 and F2), particularly in the production of vowels. The results are quite variable, and the most consistent effect is an age-dependent lowering of formant frequencies [1,13,14]. Another finding is a greater centralization of the vowel space in older speakers (which should result in movement towards the centroid of the formant space) [15,16,3,8]. In some cases the changes have been identified as occurring only on particular vowels, and some studies have shown a gender-vowel interaction in formant frequencies [3,5,17].
It has often been noted that older adults use slower speaking rates [1,3,18]. That is, vocal ageing implies a decrease in the number of syllables and phonemes per second, which leads to the increase of segment duration [19,1,3,17].
In contrast to other languages, only a few studies have been concerned with the effects of ageing on F0 [20,21,22], duration and formant frequencies [21,22] of European Portuguese (EP) vowels. Given that those previous studies used different methods and analysis procedures, it is hard at this time to draw solid conclusions on the effects of age and gender on EP vowel acoustics. Our previous study [22], with speakers between 60 and 90, indicated a slight age-related F0 decrease in women and a trend towards an increase in men. Vowel duration was shown to increase significantly with ageing. Age differences in vowel formant frequency were also observed, mainly in women. The comparison between our data and the results of previous studies on acoustic correlates of EP vowels from young adults [23] suggested a trend towards the centralization of the vowel space.
The purpose of this article is to analyse the effects of age and gender on duration, F0 and formant frequencies (F1, F2) for EP oral vowels. This study extends previous research by reporting data from four adult age groups, covering the age range of 35 to 97, which is essential to provide a more complete view of age-related changes in EP vowel acoustics. The speech stimuli were carefully chosen to allow easy and accurate formant measurement, and the greater constancy of the speech stimuli across speakers throughout the life span also facilitates comparisons and reduces variability. Since there is a paucity of literature on EP vowel acoustics [23,24,22,25], this study also provides valuable insights for an accurate description of these sounds.
This study also examines the relationship between age, gender and Vowel Space Area (VSA). VSA is used to model possible reduction in the articulatory capability of speakers. Such reduction is observed as a compression of the area of the vowel space. The main hypothesis is that young speakers have a better articulation capability than older speakers [26,27].
[u] in stressed position and the vowels [ɨ] and [ɐ] in unstressed position. Each vowel was produced in a disyllabic sequence, mostly CV.CV (C-consonant, V-vowel) (e.g. "pato", "duck"), where C was a voiced/voiceless stop consonant (

Recording Protocol
Recordings took place in quiet rooms, using an AKG C535 EB cardioid condenser microphone connected to an external 16-bit sound system (PreSonus AudioBox USB), at a sampling rate of 44100 Hz. The sentences were randomized and presented individually on the computer screen with the SpeechRecorder software [28], using pictures together with the orthographic word. Participants read the sentences at a comfortable pitch and loudness level, after familiarizing themselves with the sentences. Each carrier sentence was repeated 3 times. Thus, each participant produced 12 repetitions of each vowel, for a total of 108 productions per speaker (113 participants × 36 words × 3 repetitions = 12204 recordings).

Segmentation of the Data Set
The recorded data were first automatically segmented at phoneme level using WebMAUS [29] and then imported into Praat [30], so that three trained analysts could manually check the accuracy of all phoneme boundaries by finding the first and last positive zero crossings of the quasi-periodic waveform associated with the vowel. Recordings presenting clipping or other recording artifacts (e.g. noise, cough), or in which the speaker produced unusual hoarseness or vocal fry, were excluded. Words that participants misread were not analysed. Furthermore, due to vowel reduction affecting unstressed vowels in EP, the vowel [ɨ] was often elided [31]. A total of 758 recordings were not analysed (approximately 6.2% of trials).
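The token counts reported in the recording and segmentation steps can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative, and the division of 108 productions by 9 oral vowels assumes every vowel appears in the same number of words.

```python
# Token accounting from the counts reported in the text.
participants = 113
words = 36
repetitions = 3
oral_vowels = 9  # assumption: each vowel occurs in the same number of words

per_speaker = words * repetitions        # productions per speaker
per_vowel = per_speaker // oral_vowels   # repetitions of each vowel per speaker
total = participants * per_speaker       # total recordings

excluded = 758                           # recordings not analysed
retained = total - excluded              # vowel tokens passed on to analysis
excluded_pct = 100 * excluded / total    # share of trials excluded

print(per_speaker, per_vowel, total, retained, round(excluded_pct, 1))
```

The retained count agrees with the 11446 vowel tokens mentioned in the acoustic measurements.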

Acoustic measurements
Acoustic parameters of the vowels were automatically extracted from the data set using Praat scripts. The F0 of the vowels was estimated with the cross-correlation algorithm, which is especially suitable for measuring short vowels [23]. The median F0 value was taken from the central 40% of each target vowel, which minimizes the impact of flanking consonants on F0; in addition, taking the median rather than the mean reduces the effect of F0 measurement errors [23]. The pitch range for the analysis was set to 60-400 Hz for men and 120-400 Hz for women. If the analysis failed on any of a speaker's vowel tokens, which only occurred for women, a new analysis was automatically performed using a pitch floor of 75 Hz (this occurred in only 28 of the 11446 vowel tokens, almost all from an 80-year-old woman).
The Burg LPC algorithm was used to compute values for F1 and F2 over the central 40% of the vowel. A procedure adapted from [23], and previously used in [24,22], was applied to optimize the formant ceiling for each vowel of each speaker. The first two formants were determined 201 times for each vowel: for all ceilings between 4500 and 6500 Hz in steps of 10 Hz (for women), and for all ceilings between 4000 and 6000 Hz in steps of 10 Hz (for men). The chosen ceiling was the one that yielded the lowest variation (for more details see [23]). Thus, for each vowel produced by each speaker there is only one optimal ceiling.
The duration measurements were computed from the label files with reference to the beginning and ending points of each vowel. Vowels shorter than 20 ms were excluded (8 vowels). Since an important goal for this database is to provide normative data for adult speakers, values more than 2.5 standard deviations from the speaker's mean for F0, and from the gender × vowel mean for F1 and F2, were treated as outliers and excluded from this analysis [5,32]. This procedure resulted in 532 outliers being excluded from the study. The VSA is defined as the area of the polygon based on the mean value for each oral vowel, adapted from [26,33,34].
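Three of the steps above are simple enough to sketch in code: the central 40% analysis window, the grid of 201 candidate formant ceilings, and the 2.5 standard deviation outlier rule. The following Python sketch is an illustration under stated assumptions, not the Praat scripts used in the study. The function names are hypothetical, the use of the sample standard deviation (ddof=1) is an assumption, and the grouping of values (per speaker for F0, per gender × vowel for F1 and F2) is left to the caller.

```python
import numpy as np

def central_window(start, end, fraction=0.4):
    """Return (t0, t1) bounding the central `fraction` of a segment."""
    margin = (end - start) * (1.0 - fraction) / 2.0
    return start + margin, end - margin

def ceiling_grid(low_hz, high_hz, step_hz=10):
    """All candidate formant ceilings from low_hz to high_hz inclusive."""
    return np.arange(low_hz, high_hz + step_hz, step_hz)

def within_sd(values, n_sd=2.5):
    """Boolean mask: True where a value lies within n_sd SDs of the group mean.

    Assumption: sample standard deviation (ddof=1); the text does not say which.
    """
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    sd = values.std(ddof=1)
    return np.abs(values - mean) <= n_sd * sd

# A 100 ms vowel starting at t = 0 s is analysed between 30 ms and 70 ms.
t0, t1 = central_window(0.0, 0.1)

# Both grids contain 201 ceilings, matching the 201 analyses per vowel.
female_ceilings = ceiling_grid(4500, 6500)
male_ceilings = ceiling_grid(4000, 6000)

# Hypothetical F0 values (Hz) for one group; the last one is flagged as an outlier.
f0_values = [100.0] * 19 + [200.0]
keep = within_sd(f0_values)
median_f0 = np.median(np.asarray(f0_values)[keep])
```

Choosing the ceiling with the lowest variation would then amount to evaluating each grid value and taking the minimum of the variation measure defined in [23], which is not reproduced here.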

Statistical Analysis
The statistical analysis was conducted with the SPSS software package (SPSS 25.0, SPSS Inc., Chicago, IL, USA). The values of F0, F1, F2 and duration were calculated for all productions and, subsequently, the median across repetitions was computed for each vowel and speaker. The VSA was calculated in Hz² for each speaker. For each dependent variable (F0, F1, F2, and duration), a three-way mixed analysis of variance (ANOVA) was applied, with vowel as a within-subject factor and with gender and age as between-subject factors. For VSA, a two-way ANOVA was applied, with gender and age as between-subject factors. The ANOVA assumptions of residual normality and homogeneity of variance were validated (except homogeneity of variance for VSA). Regarding the sphericity assumption, the Huynh-Feldt epsilon correction was used. In all statistical analyses, the level of significance was p<0.05.
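As an illustration of how a polygon-based VSA can be obtained from formant means, the sketch below applies the standard shoelace formula to (F2, F1) vertices. This is a generic polygon-area computation, not necessarily the exact procedure adapted from [26,33,34]. The function name is hypothetical, the vertices are assumed to be listed in order around the polygon, and the formant values in the example are made-up round numbers rather than data from this study.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` is a list of (F2, F1) pairs in Hz, listed in order around the
    polygon (assumption); the result is in Hz squared.
    """
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

# Hypothetical corner-vowel means as (F2, F1) in Hz; not values from this study.
corner_vowels = [(2200.0, 300.0), (1400.0, 800.0), (800.0, 350.0)]
vsa = polygon_area(corner_vowels)
```

With more than three vowels, the order in which the vertices are listed determines the polygon, so that order would need to be fixed across speakers for the areas to be comparable.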

Results
This section presents the detailed results of the acoustic measurements and statistical analysis aimed at investigating the effects of age, gender and vowel on duration, F0, formants, and VSA. Table 1 summarizes the average values for all these parameters; each number is an average of the 9 oral vowels under analysis by gender and age group.

Vowel Duration
The results of EP vowel duration measurements by age and vowel are displayed in Figure 1. As can be seen in Figure 1, unlike the stressed vowels, the unstressed vowels ([ɨ] and [ɐ]) did not show relevant changes of duration with age.
Findings also showed a significant age by gender interaction for F0 (F(3;103)=3.2; p=0.028), indicating that the differences among age groups varied by gender. In male speakers, F0 decreased until the age group [50-64] and started to increase after that age, with a more pronounced increase in the group ≥ 80, which presented the highest mean value of F0. The opposite tendency was observed for female speakers, where F0 increased until the age group [50-64] and started to decrease sharply after this age. The age group ≥ 80 presented the highest mean value of F0. As seen in Figure 2, for unstressed vowels F0 decreases very markedly with age.
Figures 3 b) and c) show the mean F1 and F2 values of the EP vowels for the four age groups, by gender. As expected, women presented higher F1 and F2 frequencies than men. Although F2 values were slightly lower for the vowels from the ≥ 80 speakers (cf.
Apparently, the size of the F2 space was larger for women than for men: the F2s of /u/ were similar for both genders, whereas /i/ values were very different.

Discussion
The current study examined similarities and differences in how EP vowel acoustics change with ageing in male and female speakers. The findings suggest that some characteristics seem to change with age, mainly duration and F0, and that all acoustic parameters except duration are gender dependent. Vowel duration increased significantly with ageing in both genders, which is in agreement with the literature [19,35,3,18]. This may be related to the slowing of nerve conduction velocity and to the changes observed in the respiratory and central nervous systems [1].
F0 tends to converge between genders as age increases. The decrease in F0 in women is consistent with the data available in other languages [5,6,7,3,8,9] and in EP [20,21], and has been attributed to the endocrinological changes that occur after menopause [7,36,1,37]. For men there is less agreement across studies, and our results tend to confirm the trend that F0 decreases until middle age and increases again at an advanced age [1,7,2,3,11,8]. For EP, the few available data did not show significant changes with age [21]. The increase of F0 in males may be associated with muscle atrophy, or with an increase in the stiffness of vocal fold tissue with ageing [36,37,1].
It is clear from our results that vowel formants do not systematically decrease with age, in contrast with some previous reports [1,14,13]. Some vowels presented a different pattern of formant frequency variation with age and gender. Age-related changes in F1 and F2 might be related to specific articulatory adjustments of the older speakers during speech, rather than to generalized processes such as lengthening of the vocal tract [13,5]. Age-related changes in VSA, although not significant, show a slight decrease, mainly for males, which supports a trend towards the centralization of the vowel space with ageing. That is, vowel articulation becomes more centralized in older speakers [15,16,26]. Moreover, the smaller polygon of the older males tends to indicate that they have a worse articulation capability [26,27].
Finally, the present study finds several general properties of Portuguese vowels [23] that they share with other languages: there are significant gender differences in F0 and formants for all vowels; high vowels have a higher F0 than low vowels, i.e. they exhibit intrinsic F0; and back vowels present higher F1 than their front counterparts. The F1 difference between front and back vowels decreases with ageing (see Figure 3 a)). Thus, the age group ≥ 80 presents a smaller front-back distinction than all the other age groups.

Conclusions
This study adds to the growing body of data on the effects of age on the acoustic properties of speech, providing information on vowel acoustics from adults who speak a language other than English. In that sense, it might help to better understand crosslinguistic similarities and language-particular features of vowel ageing. Moreover, these normative data for EP vowel acoustics are important as a reference for the clinical assessment and treatment of different speech disorders that are often age-related, and as a source of information for speech technologies.
Several features of this research are notable. A new database was devised, containing all EP oral vowels in similar word contexts; vowels were produced by a large sample of healthy adults in four age groups; they were collected using standardized recording procedures; data segmentation was manually checked by experts; and analyses were conducted for several acoustic parameters (duration, F0, F1, F2 and VSA).
This work is the starting point for a broader life span study, involving a large number of EP speakers, from infancy to old age. The relation between vowel acoustics and articulatory changes with ageing should be addressed using advanced instrumental techniques, such as ultrasonography and magnetic resonance imaging.