Effects of Age on Speech-in-Noise Identification: Subjective Ratings of Hearing Difficulties and Encoding of Fundamental Frequency in Older Adults

Atta Heidari; Abdollah Moossavi; Fariba Yadegari; Enayatollah Bakhshi; Mohsen Ahadi

doi:10.7874/jao.2017.00304

Original Article

Journal of Audiology and Otology 2018;22(3):134-139.

Published online: May 4, 2018

Atta Heidari¹, Abdollah Moossavi², Fariba Yadegari³, Enayatollah Bakhshi⁴, Mohsen Ahadi⁵

¹Department of Audiology, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

²Department of Otolaryngology and Head and Neck Surgery, School of Medicine, Iran University of Medical Sciences, Tehran, Iran

³Department of Speech Therapy, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

⁴Department of Biostatistics, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

⁵Department of Audiology, Rehabilitation Research Center, School of Rehabilitation Sciences, Iran University of Medical Sciences, Tehran, Iran

Address for correspondence Abdollah Moossavi, MD Department of Otolaryngology and Head and Neck Surgery, Iran University of Medical Sciences, Tehran 1985713834, Iran Tel +98 21 22180066 Fax +98 21 22180109 E-mail moossavi.a@iums.ac.ir

Received November 18, 2017 Revised January 24, 2018 Accepted March 16, 2018

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background and Objectives

Numerous studies have reported that speech perception in noise deteriorates in the elderly, even in those with normal hearing. The aim of this study was to investigate the effects of age on speech-in-noise identification using the speech-in-noise (SIN) test, on subjective ratings of hearing difficulties using the speech, spatial, and qualities of hearing scale (SSQ) questionnaire, and on encoding of the fundamental frequency (F₀) using the speech auditory brainstem response (Speech ABR), and to compare the results of elderly people with those of young people.

Subjects and Methods

The present study was conducted on 32 elderly people aged over 60 years (17 male and 15 female; mean age 68.9, standard deviation=6.33) with normal peripheral hearing and 32 young subjects (16 male and 16 female) aged 18-25 years.

Results

SIN test scores at signal-to-noise ratios of 0 and -10 dB, as well as scores on the Iranian version of the SSQ questionnaire, were lower in elderly people than in young people (p<0.001). The F₀ amplitude in Speech ABR was also lower in elderly people than in young people (p<0.001).

Conclusions

Speech processing appears to be poorer in older people than in young people despite their normal peripheral auditory thresholds. This decline may result in weaker perception of speech and poorer segregation of speech from competing sources.

Keywords: Aging · Fundamental frequency (F₀) · Speech-in-noise perception.

Introduction

Numerous studies have shown a decline in speech perception in the presence of competing noise among older adults, even those with normal hearing. Difficulty in speech perception in older people can have various causes, some of which are not completely understood [1]. However, it can be attributed to deficits in the peripheral auditory system, central auditory processing, and cognition [2]. Audiometric thresholds do not adequately reflect the speech perception performance of elderly people, because pure-tone audiometry measures the detection of tones in quiet rather than the understanding of speech.

To identify a specific stream of sound in a mixture of other sounds, the information entering the auditory system has to be segregated and grouped based on its characteristics [3]. The process of segregating and grouping different sound sources was first described by Bregman [4] in 1990 as a phenomenon called auditory scene analysis. Segregation processes are automatic or primitive and operate before attention (top-down control) is engaged. Considerable evidence has demonstrated that bottom-up sensory processes segregate sound sources at pre-attentive stages (i.e., before top-down processes take effect). This indicates that auditory stream segregation is performed before stimulus selection [5-7]. Segregation of a mixture of sounds from different sources occurs automatically at lower levels of the auditory system, facilitating the selection of external sources for better speech processing and understanding [5].

The auditory system segregates different sounds, including speech, on the basis of at least two basic elements (i.e., temporal and frequency characteristics). Simultaneously presented sound streams are segregated based on their frequency content and harmonic relations, which leads to the separation of each sound stream and source. Finally, independent perceptual representations of the streams are formed in the central auditory system. Data from young people with normal hearing have shown that the fundamental frequency (F₀) and low-frequency harmonics of auditory stimuli such as speech are crucial for pitch perception and perceptual segregation and, therefore, for facilitating speech perception in a noisy environment [4,8]. A possible deficit in frequency representation in older adults is supported by reduced frequency following responses to tone bursts, as well as increased frequency discrimination difference limens in older adults compared to young people [9].

Based on these points, correct and automatic extraction of the fundamental frequency and its consequences (detection of pitch, formant properties, vowels, and their harmonic relationships) can be considered the first step in speech perception, especially in noisy and crowded environments [1,8,10]. Reduced pitch perception due to a decline in neural processing in the brainstem and subcortical regions is one of the problems that impairs speech perception in noisy environments in elderly people [1,11].

To investigate speech processing, various tools and tests can be used. In this study, two behavioral measures of speech identification were employed: the speech-in-noise (SIN) test and the Iranian version of the speech, spatial, and qualities of hearing scale (SSQ) questionnaire. The speech auditory brainstem response (ABR) test was also employed to investigate brainstem encoding of F₀.

For behavioral investigation of speech identification in the presence of noise, the Iranian version of the adult temporal acuity test was employed; the speech stimuli comprised four lists of fifty monosyllabic words, which were presented to the right ear with continuous noise at signal-to-noise ratios (SNRs) of 0 and -10 dB [12,13].

SSQ is a questionnaire developed to measure a listener’s self-reported ability to hear in a variety of everyday situations. In particular, SSQ is a promising tool for assessing the difficulties that listeners may have in understanding auditory signals (speech and non-speech) in challenging and realistic conditions that involve issues such as reverberation, the spatial positions of sounds, and different types of masking [14]. The Iranian version of this questionnaire, which contains 47 statements, is organized in three sections (speech perception, spatial hearing, and qualities of hearing); it was translated from the original version, and its reliability and validity have been confirmed [15].

Speech ABR is a suitable test for evaluating subcortical electrophysiological auditory processing of speech [16]. It provides a clear relation between the stimulus and the response of the nervous system and evaluates the quality of fundamental frequency encoding; to some degree, this encoding can also be quantified at the brainstem level [17]. Previous studies have indicated a relationship between SIN processing and the spectral and temporal constituents of Speech ABR in children [17], adults, and elderly people [1]. Variations in this test can be regarded as an objective sign of changes in nervous system function.

This study examined the results of behavioral SIN testing, self-assessment of speech processing with the Iranian version of the SSQ questionnaire, and the F₀ amplitude of Speech ABR, along with their correlations, in elderly people with normal hearing who complained of speech perception difficulties, and compared them with those of young people with normal hearing and no speech perception problems.

Subjects and Methods

Subjects

The present descriptive-analytical study was conducted from January to March 2017 at Rofaideh Hospital, Tehran, Iran, on 32 elderly people over 60 years old (17 male and 15 female) with a mean age of 68.9 [standard deviation (SD)=6.33] and 32 young adults aged 18-25 years (16 male and 16 female) with a mean age of 21.43 (SD=1.74), all with normal hearing thresholds. This study was approved by the Ethics Committee of University of Social Welfare and Rehabilitation Sciences. Written consent was obtained from the subjects after the method had been fully explained and they had been assured that they would be informed of the results.

The elderly people were recruited from Yas senior nursing home and health centers of Tehran Municipality. All participants were randomly selected from those meeting the inclusion criteria: right-handedness (according to the Persian version of the Edinburgh Handedness Inventory questionnaire), being monolingual (with Persian as the mother tongue), having normal external auditory canals with intact tympanic membranes, and no history of ear diseases, epilepsy, head trauma or accident, brain surgeries, or nervous system medications. The pure tone average of all subjects in the range of 500-4,000 Hz was better than or equal to 25 dB HL in both ears; the threshold at each of the four frequencies was better than or equal to 40 dB HL, with a maximum mean difference of 5 dB HL between the two ears at each frequency. Tympanometry and acoustic reflex testing were conducted to confirm normal middle ear function. The Mini-Mental State Examination was used to screen for normal cognitive function in the elderly. The same criteria were applied in selecting young volunteer participants from students of University of Social Welfare and Rehabilitation Sciences, Tehran. The pure tone average for young participants was better than 25 dB HL in the frequency range of 500-4,000 Hz in both ears.
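
The audiometric inclusion criteria above can be expressed as a simple screening check. The following is a minimal sketch, not the authors' procedure: it assumes the four test frequencies are 500, 1,000, 2,000, and 4,000 Hz (the text gives only the range), and it reads the between-ear criterion as a limit on the difference at each frequency. The function and parameter names are hypothetical.

```python
from statistics import mean

# Assumption: the "four frequencies" are the octave frequencies in 500-4,000 Hz.
FREQS_HZ = (500, 1000, 2000, 4000)


def meets_hearing_criteria(right, left, pta_max=25, single_max=40, asym_max=5):
    """Check the audiometric criteria for one participant.

    right / left: dict mapping frequency in Hz to threshold in dB HL.
    Returns True only if, in both ears, the pure tone average is <= pta_max
    and every threshold is <= single_max, and the between-ear difference at
    each frequency is <= asym_max dB.
    """
    for ear in (right, left):
        levels = [ear[f] for f in FREQS_HZ]
        if mean(levels) > pta_max:
            return False
        if max(levels) > single_max:
            return False
    return all(abs(right[f] - left[f]) <= asym_max for f in FREQS_HZ)
```

A lower (better) threshold in dB HL means more sensitive hearing, so "better than or equal to 25 dB HL" becomes a `<=` comparison.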

Stimuli

All participants underwent the SIN test and completed the SSQ questionnaire. To investigate SIN perception, four standard 50-word lists were presented along with continuous noise at two SNRs, 0 and -10 dB. Then, the Iranian version of the SSQ questionnaire was completed by the participants. This questionnaire includes 47 statements covering three aspects: speech perception, spatial hearing, and hearing quality. Participants rated their ability for each statement on a horizontal scale from 0 to 10, on which 0 and 10 represented minimum and maximum ability, respectively.
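
To make the two SNR conditions concrete, the sketch below shows one common way to scale a noise signal so that a speech-plus-noise mixture has a chosen SNR in dB, using the RMS definition SNR = 20·log10(RMS_speech / RMS_noise). This only illustrates the definition; how the levels were set in the actual test material is not described in the text, and all function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def snr_db(speech, noise):
    """SNR in dB from the RMS levels of the speech and the noise."""
    return 20 * math.log10(rms(speech) / rms(noise))


def noise_gain_for_snr(speech, noise, target_snr_db):
    """Gain to apply to the noise so that snr_db(speech, gain * noise) equals the target."""
    return rms(speech) / (rms(noise) * 10 ** (target_snr_db / 20))


def mix_at_snr(speech, noise, target_snr_db):
    """Return speech plus scaled noise at the requested SNR (equal-length inputs)."""
    gain = noise_gain_for_snr(speech, noise, target_snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]
```

At 0 dB the noise has the same RMS level as the speech; at -10 dB the noise RMS is about 3.16 times (10^(10/20)) the speech RMS.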

Procedures

Speech ABR was recorded from each older and younger participant with the Bio-Logic Navigator Pro System, with the noninverting, inverting, and ground electrodes placed at Cz, on the right earlobe, and on the forehead, respectively. During recording, electrode impedance was kept below 5 kΩ and inter-electrode impedance below 1.5 kΩ. Stimuli were presented through ER-3A insert earphones (Etymotic Research, Elk Grove Village, IL, USA). The stimulus was a 40 ms synthesized syllable /da/ provided with the BioMARK module. This syllable contained an initial noise burst and a formant transition between the stop consonant and a steady-state vowel, with a fundamental frequency (F₀) that rose linearly from 103 Hz to 125 Hz; voicing began at 5 ms, with an onset release burst during the first 10 ms. The first formant frequency (F₁) increased linearly from 220 Hz to 720 Hz, while the second formant (F₂) decreased from 1,700 Hz to 1,240 Hz over the duration of the stimulus. The third formant (F₃) fell slightly from 2,580 Hz to 2,500 Hz, while the fourth (F₄) and fifth (F₅) formants remained constant at 3,600 Hz and 4,500 Hz, respectively. The stimulus was presented using the custom stimulus option in Biologic AEP software (version 7.0) with alternating polarity at a presentation rate of 10.9 per second. Stimulus intensity was 80 dB SPL, calibrated with a 2-cm³ coupler (DB-0138), a Bruel & Kjaer Type 2203 sound level meter, and a 1-inch microphone. An online filter of 100-2,000 Hz, a sampling rate of 1,024, and a time window of 85.33 ms (including a 15 ms pre-stimulus period) were used. All stimuli were presented to the right ear according to current standards, and individual traces exceeding ±23.8 μV were excluded from the average. A total of 4,000 artifact-free responses (two sub-averages of 2,000 sweeps) were obtained.

The test was carried out in a quiet condition, with the participant reclining on a comfortable chair with eyes closed in a soundproof room with low light and low electrical and environmental noise. Spectral analysis of the responses was performed in MATLAB, version R2013a (The MathWorks, Inc., Natick, MA, USA).
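
The averaging step described above (discarding individual traces whose amplitude exceeds the rejection level, then averaging the remaining sweeps) can be sketched as follows. This is an illustration only, not the Bio-Logic software's implementation; `average_sweeps` and `reject_level` are hypothetical names, and the threshold is taken to be in the same unit as the recorded samples.

```python
def average_sweeps(sweeps, reject_level=23.8):
    """Average equal-length sweeps after amplitude-based artifact rejection.

    sweeps: list of sweeps, each a list of samples in the recording unit.
    reject_level: a sweep is discarded if any sample exceeds +/- this value.
    Returns (average_waveform, number_of_sweeps_kept).
    """
    kept = [s for s in sweeps if max(abs(v) for v in s) <= reject_level]
    if not kept:
        raise ValueError("all sweeps were rejected as artifacts")
    n = len(kept)
    # zip(*kept) walks through the sweeps sample by sample.
    average = [sum(column) / n for column in zip(*kept)]
    return average, n


# Toy example: the third sweep contains an artifact and is dropped.
demo_average, demo_kept = average_sweeps([[1, 2, 3], [3, 2, 1], [100, 0, 0]])
```

Applying the same function separately to two halves of the accepted sweeps would give two sub-averages whose similarity indicates how repeatable the response is.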

Statistical analysis

Statistical analysis was performed with SPSS (version 16, SPSS Inc., Chicago, IL, USA). Normality of the data was examined with the Kolmogorov-Smirnov test. Multivariable analysis was used to compare the results of the SIN test, the SSQ questionnaire, and the F₀ amplitude between the groups. The correlation of the behavioral test results with F₀ amplitude was calculated using the Pearson correlation coefficient.

Results

On account of the normal distribution of the data, the multivariable analysis was applied for comparing the results of younger and older people.

Speech-in-noise test

Table 1 shows the results of the SIN test at the two SNRs (0 and -10 dB) for both groups. In both cases, the score of the elderly people was lower than that of the younger subjects, and this difference was statistically significant (p<0.001). For the younger adults, the mean scores at 0 and -10 dB SNR were 68.18% (SD=7.54%) and 39.06% (SD=7.03%), respectively. For the older adults, the mean scores at 0 and -10 dB SNR were 51.56% (SD=6.84%) and 23.93% (SD=5.82%), respectively.
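
As a rough check on the size of these group differences, the reported means, SDs, and group sizes are enough to compute a two-sample t statistic and a standardized effect size. The sketch below uses Welch's t and Cohen's d, which are not the multivariable analysis used in the study; it is only an illustration based on the summary values in Table 1, and the function names are hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's two-sample t statistic computed from summary statistics."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_variance)


# Summary values for the SIN test, young vs. elderly (n = 32 per group).
t_snr0 = welch_t(68.18, 7.54, 32, 51.56, 6.84, 32)
d_snr0 = cohens_d(68.18, 7.54, 32, 51.56, 6.84, 32)
t_snr_minus10 = welch_t(39.06, 7.03, 32, 23.93, 5.82, 32)
d_snr_minus10 = cohens_d(39.06, 7.03, 32, 23.93, 5.82, 32)
```

With group differences of roughly 15 to 17 percentage points and SDs of about 6 to 8 points, both statistics come out large, in line with the reported p<0.001.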

The Iranian version of SSQ

As shown in Table 2, younger subjects obtained higher scores than older adults on the overall SSQ and on the speech, spatial, and qualities subscales. For the younger adults, the mean overall SSQ, speech, spatial, and qualities subscale scores were 8.78 (SD=0.65), 8.82 (SD=0.62), 8.77 (SD=0.49), and 9.04 (SD=0.47), respectively. The older adults obtained 7.07 (SD=0.35), 7.11 (SD=0.32), 7.04 (SD=0.41), and 7.10 (SD=0.48), respectively, on the same measures. These differences between the two groups were statistically significant (p<0.001).

Spectral analysis of speech ABR

Spectral analysis was applied to measure the precision and magnitude of neural phase locking at fundamental frequency F₀, first formant frequency F₁, and higher frequencies of the first formant HF (Table 3). Analyses of the spectral domain of responses indicated that fundamental frequency (F₀) encoding amplitude was lower for the elderly people [6.98 (SD=2.8)] as compared with young people [10.38 (SD=3.12)]. This difference was statistically significant (p<0.001).
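
The idea of reading a response amplitude at F₀ from the spectrum can be illustrated with a direct Fourier projection. The sketch below is not the analysis script used in the study: the sampling rate, the synthetic test signal, and the choice of the 103-125 Hz band (the F₀ range of the stimulus) as the analysis band are assumptions made for illustration, and the function names are hypothetical.

```python
import cmath
import math


def amplitude_at(signal, fs, freq):
    """Amplitude of the component at freq (Hz) via a single-frequency Fourier projection."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs) for k, x in enumerate(signal))
    return 2 * abs(acc) / n


def mean_band_amplitude(signal, fs, f_lo=103.0, f_hi=125.0, step=1.0):
    """Mean spectral amplitude over [f_lo, f_hi] Hz, sampled every step Hz."""
    count = int(round((f_hi - f_lo) / step)) + 1
    freqs = [f_lo + i * step for i in range(count)]
    return sum(amplitude_at(signal, fs, f) for f in freqs) / len(freqs)


# Synthetic check: a unit-amplitude 110 Hz sine, 0.1 s at an assumed 12 kHz rate.
FS_HZ = 12000
test_signal = [math.sin(2 * math.pi * 110 * k / FS_HZ) for k in range(1200)]
amp_110 = amplitude_at(test_signal, FS_HZ, 110.0)
band_mean = mean_band_amplitude(test_signal, FS_HZ)
```

Because the synthetic window holds a whole number of cycles, the projection at 110 Hz recovers the sine's amplitude. A real response window rarely does, so leakage into neighboring frequencies is one reason to average over a band rather than read a single frequency.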

Moreover, the results of the SIN behavioral test at SNRs of 0 dB (r=0.366, p<0.003) and -10 dB (r=0.299, p<0.016) and the total score of the SSQ questionnaire (r=0.342, p<0.006) showed statistically significant positive correlations with F₀ amplitude.
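
For readers who want to relate a correlation coefficient to its significance level, the sketch below computes Pearson's r, converts it to a t statistic with n - 2 degrees of freedom, and integrates the Student t density numerically to get a two-tailed p value. It assumes that the correlations were computed over all 64 participants pooled, which the text does not state; the function names are hypothetical, and the numerical integration stands in for a statistics package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value from the Student t density (Simpson's rule; steps must be even)."""
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)

    def pdf(v):
        return const * (1 + v * v / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = total * h / 3  # area under the density between 0 and |t|
    return max(0.0, 1 - 2 * central_area)


# Assumption: n = 64 (both groups pooled), so df = 62.
N_POOLED = 64
t_snr0 = t_from_r(0.366, N_POOLED)
p_snr0 = two_tailed_p(t_snr0, N_POOLED - 2)
```

Under the pooled-sample assumption, the reported coefficients of about 0.3 to 0.37 correspond to t statistics between roughly 2.5 and 3.1, which is consistent with significance at the reported levels even though the correlations themselves are modest in size.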

Discussion and conclusion

In the SIN test, the mean score of the older adults was lower than that of the young subjects at both SNRs. The mean score of young people was 68.18% and 39.06% at SNRs of 0 and -10 dB, respectively. These results are in agreement with the results of Omidvar, et al. [13] (66.2% and 33% in the same order). The scores of the elderly people were 51.56% and 23.93% at SNRs of 0 and -10 dB, respectively, which is in accordance with the studies of Jafari, et al. [18] and Stuart and Phillips [19].

Comparison of the groups on the SSQ questionnaire showed that the total score of the elderly people (7.07) was lower than that of the young subjects (8.78). This trend was also observed in the mean scores of the three subscales of the questionnaire: speech perception, spatial hearing, and hearing quality. This decrease can indicate the impact of age on communicative abilities. The scores of this study are in agreement with the results of Singh and Pichora-Fuller [20], in which older adults and young people obtained scores of 7.7 and 8.8, respectively.

In the Speech ABR test, the spectral amplitude of the fundamental frequency (F₀) was measured in microvolts. The amplitude was lower in elderly people than in young subjects. This trend is in agreement with the results of Anderson, et al. [1] and Vongpaisal and Pichora-Fuller [21].

Anderson, et al. [1] showed that older adults need improved encoding of the fundamental frequency at the subcortical level for better understanding of speech. In short, the better the encoding of the fundamental frequency in these people, the fewer problems they will face in speech perception in the presence of noise.

In this study, only the effect of aging was addressed, as the elderly participants had normal hearing thresholds and no cognitive problems. Therefore, hearing loss and cognitive problems had the least possible impact on the study results. However, as the elderly participants reported poor speech perception in noisy conditions, the obtained results indicate an effect of aging on the auditory processing that supports speech perception in noise.

Older adults are unable to benefit from voicing cues as effectively as younger adults in an informational masking task [22,23] and this affects their ability to process pitch cues. This deficit may interfere with following a single stream among the competing voices.

The decline in the ability to use pitch cues may arise from age-related decreases in γ-aminobutyric acid (GABA) inhibition. Reduction in GABA has been found in the inferior colliculus and dorsal cochlear nucleus of rats [24,25]. Downregulation of inhibitory function may also lead to degradation of subcortical temporal resolution [26] by decreasing the selectivity for pertinent acoustic features in the stimulus [27-29]. It is possible that a deficit in GABA inhibition is partly responsible for weaker F₀ encoding and less stable or precise neural timing in the older adult group. While a decrease in GABAergic inhibition may contribute to age-related deficits in subcortical encoding of pitch and timing, the primary purpose of this study was to examine the aspects of subcortical processing that are important for SIN perception in older adults rather than to assess the effects of aging on subcortical responses. Previous findings have demonstrated smaller representation of F₀ in children and young adults with poor SIN perception [1,17,30], so our finding of these effects in an older population could be indicative of a fundamental and age-independent mechanism of auditory processing.

The results of the SIN test and the SSQ questionnaire showed that normal hearing thresholds in elderly people do not guarantee speech perception in noisy environments as good as that of young people with normal hearing. Moreover, the significant correlation of the behavioral and self-assessment results with F₀ amplitude in this study points to the impact of reduced F₀ encoding on speech processing. The fundamental frequency has a key role in the segregation of simultaneous speech sounds and identification of the speaker. A better ability to encode F₀ enables a listener to segregate simultaneous sounds more easily, which is of crucial importance for speech perception in noisy environments. The decrease in F₀ amplitude can be associated with weaker brainstem neural processing in elderly people. Such a reduction may weaken perception of the target speech and impair its segregation from background noise, leaving older adults unable to follow one talker and, consequently, a conversation.

Acknowledgments

This study was part of a Ph.D. thesis coded as IR.USWR.REC.1396.302; University of Social Welfare and Rehabilitation Sciences, Tehran, Iran.

Notes

Conflicts of interest: The authors have no financial conflicts of interest.

Table 1.

Distribution of speech in noise test per group

SNR (dB)	Group	No	Mean (%)	SD	p-value
0	Young	32	68.18	7.54	<0.001
	Elderly	32	51.56	6.84
-10	Young	32	39.06	7.03	<0.001
	Elderly	32	23.93	5.82

SNR: signal-to-noise ratio, SD: standard deviation

Table 2.

Distribution of Iranian version of SSQ per group

Item	Group	No	Mean (scores)	SD	p-value
Speech perception	Young	32	8.82	0.62	<0.001
	Elderly	32	7.11	0.32
Spatial hearing	Young	32	8.77	0.49	<0.001
	Elderly	32	7.04	0.41
Hearing quality	Young	32	9.04	0.47	<0.001
	Elderly	32	7.10	0.48
Total score	Young	32	8.78	0.65	<0.001
	Elderly	32	7.07	0.35

SSQ: speech, spatial, and qualities of hearing scale, SD: standard deviation

Table 3.

Distribution of the spectral magnitudes per group

Spectral magnitudes	Group	No	Mean (μV)	SD	p-value
F₀	Young	32	10.38	3.12	<0.001
	Elderly	32	6.98	2.80
F₁	Young	32	8.77	0.49	<0.001
	Elderly	32	7.04	0.41
HF	Young	32	9.04	0.47	<0.001
	Elderly	32	7.10	0.48

SD: standard deviation, F₀: fundamental frequency, F₁: first formant frequency, HF: higher frequencies of the first formant

REFERENCES

1. Anderson S, Parbery-Clark A, Yi HG, Kraus N. A neural basis of speech-in-noise perception in older adults. Ear Hear 2011;32:750–7.

2. Getzmann S, Wascher E, Falkenstein M. What does successful speech-in-noise perception in aging depend on? Electrophysiological correlates of high and low performance in older adults. Neuropsychologia 2015;70:43–57.

3. Divenyi P. Speech separation by humans and machines. New York, NY: Springer Science & Business Media;2004.

4. Bregman AS. Auditory scene analysis: the perceptual organization of sound. Cambridge, MA: MIT Press;1994.

5. Talebi H, Moossavi A, Lotfi Y, Faghihzadeh S. Effects of vowel auditory training on concurrent speech segregation in hearing impaired children. Ann Otol Rhinol Laryngol 2015;124:13–20.

6. Winkler I, Kushnerenko E, Horváth J, Ceponiene R, Fellman V, Huotilainen M, et al. Newborn infants can organize the auditory world. Proc Natl Acad Sci U S A 2003;100:11812–5.

7. Hulse SH, MacDougall-Shackleton SA, Wisniewski AB. Auditory scene analysis by songbirds: stream segregation of birdsong by European starlings (Sturnus vulgaris). J Comp Psychol 1997;111:3–13.

8. Pichora-Fuller K, MacDonald E. Auditory temporal processing deficits in older listeners: from a review to a future view of Presbycusis. In: Dau T, Buchholz JM, Harte JM, Christiansen TU, editors. Auditory Signal Processing in Hearing Impaired Listeners. Proceedings of the 1st International Symposium on Auditory and Audiological Research (ISAAR); 2007 Aug 29-31; Helsingør, Denmark. Denmark: Centertryk A/S;2008. p. 291–300.

9. Clinard CG, Tremblay KL, Krishnan AR. Aging alters the perception and physiological representation of frequency: evidence from human frequency-following response recordings. Hear Res 2010;264:48–55.

10. Vander Werff KR, Burns KS. Brain stem responses to speech in younger and older adults. Ear Hear 2011;32:168–80.

11. Bidelman GM, Villafuerte JW, Moreno S, Alain C. Age-related changes in the subcortical-cortical encoding and categorical perception of speech. Neurobiol Aging 2014;35:2526–40.

12. Omidvar S, Jafari Z, Tahaei AA. Evaluating the results of Persian version of the temporal resolution test in adults. Audiol 2012;21:38–45.

13. Omidvar S, Jafari Z, Tahaei AA, Salehi M. Comparison of auditory temporal resolution between monolingual Persian and bilingual Turkish-Persian individuals. Int J Audiol 2013;52:236–41.

14. Gatehouse S, Noble W. The speech, spatial and qualities of hearing scale (SSQ). Int J Audiol 2004;43:85–99.

15. Lotfi Y, Nazeri AR, Asgari A, Moosavi A, Bakhshi E. Iranian version of Speech, Spatial, and Qualities of Hearing Scale: a psychometric study. Acta Med Iran 2016;54:756–64.

16. Anderson S, Kraus N. Objective neural indices of speech-in-noise perception. Trends Amplif 2010;14:73–83.

17. Anderson S, Skoe E, Chandrasekaran B, Kraus N. Neural timing is linked to speech perception in noise. J Neurosci 2010;30:4922–6.

18. Jafari Z, Omidvar S, Jafarlou F, Kamali M. The effect of age on speech temporal resolution among elderly people. Adv Cogn Sci 2011;13:55–64.

19. Stuart A, Phillips DP. Word recognition in continuous and interrupted broadband noise by young normal-hearing, older normal-hearing, and presbyacusic listeners. Ear Hear 1996;17:478–89.

20. Singh G, Pichora-Fuller MK. Older adults’ performance on the speech, spatial, and qualities of hearing scale (SSQ): test-retest reliability and a comparison of interview and self-administration methods. Int J Audiol 2010;49:733–40.

21. Vongpaisal T, Pichora-Fuller MK. Effect of age on F₀ difference limen and concurrent vowel identification. J Speech Lang Hear Res 2007;50:1139–56.

22. Helfer KS, Freyman RL. Aging and speech-on-speech masking. Ear Hear 2008;29:87–98.

23. Huang Y, Xu L, Wu X, Li L. The effect of voice cuing on releasing speech from informational masking disappears in older adults. Ear Hear 2010;31:579–83.

24. Caspary DM, Milbrandt JC, Helfert RH. Central auditory aging: GABA changes in the inferior colliculus. Exp Gerontol 1995;30:349–60.

25. Caspary DM, Schatteman TA, Hughes LF. Age-related changes in the inhibitory response properties of dorsal cochlear nucleus output neurons: role of inhibitory inputs. J Neurosci 2005;25:10952–9.

26. Caspary DM, Ling L, Turner JG, Hughes LF. Inhibitory neurotransmission, plasticity and aging in the mammalian central auditory system. J Exp Biol 2008;211(Pt 11):1781–91.

27. Burger RM, Pollak GD. Analysis of the role of inhibition in shaping responses to sinusoidally amplitude-modulated signals in the inferior colliculus. J Neurophysiol 1998;80:1686–701.

28. Edwards CJ, Leary CJ, Rose GJ. Mechanisms of long-interval selectivity in midbrain auditory neurons: roles of excitation, inhibition, and plasticity. J Neurophysiol 2008;100:3407–16.

29. Hall JC. GABAergic inhibition shapes frequency tuning and modifies response properties in the auditory midbrain of the leopard frog. J Comp Physiol A 1999;185:479–91.

30. Song JH, Skoe E, Banai K, Kraus N. Perception of speech in noise: neural correlates. J Cogn Neurosci 2011;23:2268–79.