Research Article: New Research, Disorders of the Nervous System

Deciphering Compromised Speech-in-Noise Intelligibility in Older Listeners: The Role of Cochlear Synaptopathy

Markus Garrett, Viacheslav Vasilkov, Manfred Mauermann, Pauline Devolder, John L. Wilson, Leslie Gonzales, Kenneth S. Henry and Sarah Verhulst
eNeuro 9 January 2025, 12 (2) ENEURO.0182-24.2024; https://doi.org/10.1523/ENEURO.0182-24.2024
Markus Garrett
1Medizinische Physik and Cluster of Excellence “Hearing4all”, Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg 26129, Germany
Viacheslav Vasilkov
2Hearing Technology @ WAVES, Department of Information Technology, Ghent University, Zwijnaarde 9052, Belgium
Manfred Mauermann
1Medizinische Physik and Cluster of Excellence “Hearing4all”, Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Oldenburg 26129, Germany
Pauline Devolder
2Hearing Technology @ WAVES, Department of Information Technology, Ghent University, Zwijnaarde 9052, Belgium
John L. Wilson
3Department of Otolaryngology, University of Rochester, Rochester, New York 14642
4Department of Neuroscience, University of Rochester, Rochester, New York 14642
Leslie Gonzales
4Department of Neuroscience, University of Rochester, Rochester, New York 14642
Kenneth S. Henry
3Department of Otolaryngology, University of Rochester, Rochester, New York 14642
4Department of Neuroscience, University of Rochester, Rochester, New York 14642
5Department of Biomedical Engineering, University of Rochester, Rochester, New York 14627
Sarah Verhulst
2Hearing Technology @ WAVES, Department of Information Technology, Ghent University, Zwijnaarde 9052, Belgium

Abstract

Speech intelligibility declines with age and sensorineural hearing damage (SNHL). However, it remains unclear whether cochlear synaptopathy (CS), a recently discovered form of SNHL, significantly contributes to this decline. CS refers to damage to the auditory-nerve synapses that innervate the inner hair cells, and no established diagnostic test for it is currently available. Furthermore, age-related hearing damage can comprise several components (e.g., hair cell damage, CS), each of which can play a role in impaired sound perception. To explore the link between cochlear damage and speech intelligibility deficits, this study examines the role of CS in word recognition among older listeners. We first validated an envelope-following response (EFR) marker for CS using a Budgerigar model. We then applied this marker in human experiments, while restricting the speech material’s frequency content to ensure that both the EFR and the behavioral tasks engaged similar cochlear frequency regions. Following this approach, we identified the relative contributions of hearing sensitivity and CS to speech intelligibility in two age-matched (65-year-old) groups with clinically normal (n = 15, 8 females) or impaired audiograms (n = 13, 8 females). Compared to a young normal-hearing control group (n = 13, 7 females), the older groups demonstrated lower EFR responses and impaired speech reception thresholds. We conclude that age-related CS reduces supra-threshold temporal envelope coding, with subsequent speech coding deficits in noise that cannot be explained by hearing sensitivity alone.

  • cochlear synaptopathy
  • envelope-following response
  • outer hair cell damage
  • reception threshold
  • sensorineural hearing loss
  • speech-in-noise
  • speech
  • speech intelligibility

Significance Statement

Temporal bone histology reveals that cochlear synaptopathy (CS), characterized by damage to inner hair cell auditory nerve fiber synapses, precedes sensory cell damage and hearing sensitivity decline. Despite this, clinical practice primarily evaluates hearing status based on audiometric thresholds, potentially overlooking a prevalent aspect of sensorineural hearing damage due to aging, noise exposure, or ototoxic drugs, all of which can lead to CS. To address this gap, we employ a novel and sensitive EEG-based marker of CS to investigate its relationship with speech intelligibility. This study addresses a crucial unresolved issue in hearing science: whether CS significantly contributes to degraded speech intelligibility as individuals age. Our study outcomes are pivotal for identifying the appropriate targets for treatments aimed at improving impaired speech perception.

Introduction

The cochlea, deeply embedded within the temporal bone, poses challenges for direct histological assessments of sensorineural hearing loss (SNHL) in living humans. While outer hair cell (OHC) deficits are typically diagnosed through indirect methods like pure-tone audiograms or distortion-product otoacoustic emissions (DPOAEs), damage to auditory nerve fiber (ANF) synapses between inner hair cells (IHCs) and spiral ganglion cells, known as cochlear synaptopathy (CS; Kujawa and Liberman, 2009), can only be quantified via post-mortem temporal bone histology (Makary et al., 2011; Viana et al., 2015; Wu et al., 2018). Understanding how various aspects of SNHL contribute to speech recognition deficits remains a major challenge in hearing science, particularly due to the difficulties in diagnosing synaptopathy non-invasively in humans (Plack et al., 2016; Hickox et al., 2017; Kobel et al., 2017; Bramhall et al., 2019; DiNino et al., 2022). Nonetheless, studies in both animals and humans indicate that CS is a critical component of SNHL, as it often progresses with age (Sergeyenko et al., 2013; Parthasarathy and Kujawa, 2018; Wu et al., 2018) and occurs prior to OHC damage following noise exposure (Kujawa and Liberman, 2009; Fernandez et al., 2015). This suggests that the prevalence of synaptopathy may significantly exceed the World Health Organization’s estimate of 5.3% of the global population suffering from disabling hearing loss as diagnosed using the audiogram (Stevens et al., 2013; WHO, 2019).

CS is thought to contribute to reduced speech intelligibility in individuals with otherwise normal audiograms (Mepani et al., 2021) due to its association with compromised supra-threshold temporal envelope (TENV) coding (Bharadwaj et al., 2014). This connection may help explain why hearing sensitivity alone is not a reliable predictor of individual speech-intelligibility scores (Festen and Plomp, 1983; Papakonstantinou et al., 2011). In animals with histologically verified CS, compromised TENV coding can be measured using envelope-following responses (EFRs, Parthasarathy et al., 2014; Shaheen et al., 2015; Parthasarathy and Kujawa, 2018). The EFR is a non-invasive auditory-evoked potential (AEP) that reflects the phase-locked neural response of a population of peripheral and brainstem neurons to a stimulus envelope (Kraus et al., 2017). Its frequency-domain response displays peaks that correspond to both the stimulus modulation frequency and its harmonics. Because EFRs can be reliably recorded in both animals and humans, their spectral magnitude is proposed as a non-invasive marker of synaptopathy that can be used across species (e.g., Shaheen et al., 2015).

To develop effective treatments for SNHL, it is essential to determine whether CS affects speech intelligibility. This goal has led to numerous studies exploring this potential relationship in humans, as summarized by DiNino et al. (2022). However, the findings have been mixed and often inconclusive. For example, while the amplitude of wave I in the auditory brainstem response (ABR), a marker of CS in animal studies (Kujawa and Liberman, 2009; Möhrle et al., 2016), has been shown to predict speech intelligibility in some research (Bramhall et al., 2015; Liberman et al., 2016), it has not produced consistent results across all studies (Prendergast et al., 2017; Johannesen et al., 2019). Similarly, studies based on EFR markers of CS also show variability; some indicate that certain EFR markers can predict speech intelligibility in individuals with normal hearing (Mepani et al., 2021), while others do not support this finding (Grose et al., 2017; Guest et al., 2018).

Several factors may contribute to the inconsistent outcomes across studies:

  • Individual differences: Variations in the absolute strength of the ABR and EFR can reflect non-hearing-related factors in humans, such as head size (Trune et al., 1988; Mitchell et al., 1989; Plack et al., 2016). This suggests the need for a relative metric design to improve their sensitivity to SNHL aspects (Bharadwaj et al., 2015; Mehraei et al., 2016; Hickox et al., 2017; Le Prell, 2019).

  • Mixed signals: Both ABR and EFR can indicate a combination of CS and OHC deficits (Verhulst et al., 2016b; Garrett and Verhulst, 2019; Van Der Biest et al., 2023). Without histopathological data, it is challenging to interpret these responses specifically in terms of synaptopathy.

  • Sensitivity of recording paradigms: The recording methods developed in animal studies may not be sufficiently sensitive for use in humans, potentially overlooking cases of CS (Hickox et al., 2017; Bramhall et al., 2019).

  • Broadband nature of speech: While speech is a broadband signal, the EFR primarily reflects auditory TENV coding mechanisms associated with ANF activity at higher cochlear frequencies, above the phase-locking limit (Joris and Yin, 1992; Verschooten et al., 2015; Henry et al., 2016). Speech recognition also relies on frequency content below this limit, which is encoded as temporal fine structure (TFS) information by the ANFs. This TFS information serves as a critical perceptual cue (e.g., Lorenzi et al., 2006; Hopkins et al., 2008; Hopkins and Moore, 2010; Henry et al., 2016; Borjigin and Bharadwaj, 2023; Mai and Howell, 2023) and is likely not reflected well in the EFR. Thus, given that conventional EFR markers mainly assess TENV coding and its impairments, the lack of correlation between EFR (driven by TENV) and speech recognition (which involves both TFS and TENV coding) does not necessarily mean that CS does not affect speech intelligibility. It may also indicate that distinct mechanisms govern each metric independently.

This study aims to clarify factors that may have complicated the interpretation of the relationship between speech perception and AEP markers of CS. First, we focus on EFRs rather than ABRs. This choice is based on previous model simulations and data, which suggest that OHC deficits have a more pronounced effect on the ABR amplitude compared to the EFR amplitude (Verhulst et al., 2016a; Garrett and Verhulst, 2019; Vasilkov et al., 2021). Second, auditory model simulations have shown that optimizing the EFR stimulus envelope to be rectangular, rather than sinusoidal, can improve the EFR marker’s sensitivity to CS (Vasilkov et al., 2021).

Building on these findings, our study first assesses the sensitivity of an optimized EFR marker for CS in a Budgerigar model with kainic acid-induced ANF damage. We then validate its effectiveness in detecting individual differences in CS compared to conventional EFR markers. We proceed to examine the correlation between this EFR marker and speech intelligibility using both low-pass (LP)- and high-pass (HP)-filtered speech materials in human subjects. The idea behind using filtered speech stimuli is that the listener would rely on specific auditory processing cues associated with low or high-frequency hearing. In this context, the perception of the LP condition is predominantly based on available temporal fine structure (TFS) cues (Lorenzi et al., 2006), whereas the HP condition mostly relies on temporal envelope (TENV) cues. It is well known that auditory-nerve phase-locking to TFS declines with increasing frequency (i.e., 1.4 kHz in humans based on interaural-time-difference sensitivity; Joris and Verschooten, 2013), and that for frequencies beyond this limit, the auditory system can only rely on TENV processing. Our hypothesis is that the EFR marker offers more accurate predictions of speech recognition thresholds when both measures depend on high-frequency TENV mechanisms and their associated impairments. To delineate the individual contributions of CS and hearing sensitivity to speech intelligibility, we investigate the connections between objective markers of hearing and speech intelligibility in three groups: young individuals with normal hearing, and two age-matched groups of older participants: one with normal audiograms and the other with impaired audiograms. We presume that the older participants may experience age-related CS (Wu et al., 2018), potentially compounded by additional pathologies affecting OHCs.

Materials and Methods

Study Participants

Three participant groups were recruited using age and audiometric pure-tone thresholds as the selection criteria: yNH, oNH and oHI. The younger (y) subjects were between 20 and 30 years old, and older (o, OLD) subjects had ages between 60 and 70. Normal-hearing (NH) subjects had audiometric thresholds less than 20 dB HL for frequencies up to 4 kHz. Hearing-impaired (HI) subjects had sloping audiograms and thresholds that surpassed 20 dB HL at least once below 4 kHz. The grouping criterion did not account for potential individual variations in the degree of synaptopathy; this was an unknown variable at the start of the study. Using these criteria, 15 young normal-hearing (yNH: 24.5 ± 2.2 y/o, 8 females), 15 older normal-hearing (oNH: 64.2 ± 1.9 y/o, 7 females) and 14 older hearing-impaired (oHI: 65.2 ± 1.7 y/o, 7 females) subjects participated. There were no significant age differences between the oNH and oHI participant groups (p > .05). Otoscopy was performed prior to data collection to ensure that participants had no obstructions or other visible outer or middle ear pathologies. Aside from the audiogram and otoscopy, all tests were performed monaurally on the ear with the best audiometric thresholds. The experiments were approved by the ethics committee of the University of Oldenburg. Participants gave written informed consent and were paid for their participation.

Behavioral and Physiological Markers of Hearing Sensitivity

We adopted two measures to quantify hearing sensitivity: a standard clinical pure-tone audiogram to assess the behavioral hearing sensitivity, and DPOAE thresholds (THDP) to assess the OHC integrity more directly through changes in OHC-driven ear canal pressure. We collected DPOAEs at 4 kHz to quantify OHC-damage in the same frequency region as targeted by the EFR marker of CS.

DPOAE stimuli were presented over ER-2 speakers (Etymotic Research) using foam ear tips and DPOAEs were recorded using the ER10B+ OAE microphone system (Etymotic Research) and custom-made MATLAB scripts (Mauermann, 2013). Two pure tones (f1, f2) were simultaneously presented at a fixed f2/f1 ratio of 1.2 using a primary frequency-sweep method (Long et al., 2008). Frequencies were exponentially swept up (2 s/octave) over a 1/3rd octave range around the geometric mean of 4 kHz. Primary level L1 followed the Scissors paradigm (L1 = 0.4 L2 + 39; Kummer et al., 1998) given a primary L2 of 30–60 dB SPL in steps of 6 dB (in oHI listeners, L2 of 66 and 72 dB SPL were additionally collected). The distortion component (LDC) was extracted using a sharp 2-Hz-wide least-squares-fit filter and the center frequency of the measured frequency range was used to construct LDC growth functions. Individual LDC data points and their standard deviations were used in a bootstrapping procedure to fit an adapted cubic function through the LDC data points as described in Verhulst et al. (2016a). LDC growth functions typically increase monotonically or saturate with increasing L2 (Mauermann and Kollmeier, 2004; Abdala et al., 2021). We thus constrained our bootstrapping procedure to only include random LDC draws (i.e., from within the confidence interval of each mean LDC measurement point) to impose monotonic growth in each LDC(L2, b) bootstrap run. We used an automated algorithm which eliminated adjacent data points at either end of the growth function (never intermediate points) that compromised monotonicity. THDP was determined in each bootstrap run as the L2 at which the extrapolated fitting curve reached a level of −25 dB SPL (Neely et al., 2009). This bootstrapping procedure yielded the median threshold (THDP) and its standard deviation.
THDP of two yNH participants with reasonable growth functions but very low thresholds (−30.9, −5.4 dB SPL) were set to 1.5-times the interquartile range of the yNH THDP threshold distribution (−4.3 dB) to avoid high-leverage data-points. THA and THDP correlated strongly at 4 kHz (ρ(44) = 0.84; p < 0.05), corroborating earlier observations (Boege and Janssen, 2002) and confirming that both metrics assess hearing sensitivity.
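The threshold-extraction logic above can be sketched in a few lines. This is a simplified illustration only: a straight-line extrapolation stands in for the adapted cubic fit of Verhulst et al. (2016a), and the function name, Gaussian draws, and endpoint-trimming rule are our assumptions rather than the authors' implementation.

```python
import numpy as np

def dpoae_threshold(L2, ldc_mean, ldc_sd, n_boot=200, criterion=-25.0, seed=0):
    """Bootstrap estimate of the DPOAE threshold TH_DP (sketch).

    In each bootstrap run, LDC values are drawn from within the
    uncertainty of each measured point, non-monotonic endpoints are
    trimmed (never intermediate points), a growth curve is fitted, and
    TH_DP is the L2 at which the extrapolated fit reaches the criterion
    level of -25 dB SPL.
    """
    rng = np.random.default_rng(seed)
    thresholds = []
    for _ in range(n_boot):
        draw = rng.normal(ldc_mean, ldc_sd)
        lo, hi = 0, len(draw)
        # trim endpoints until the retained growth function is monotonic
        while hi - lo > 2 and np.any(np.diff(draw[lo:hi]) < 0):
            if draw[lo + 1] < draw[lo]:
                lo += 1
            elif draw[hi - 1] < draw[hi - 2]:
                hi -= 1
            else:
                break  # interior dip: endpoint trimming cannot fix it
        slope, intercept = np.polyfit(L2[lo:hi], draw[lo:hi], 1)
        thresholds.append((criterion - intercept) / slope)
    return float(np.median(thresholds)), float(np.std(thresholds))
```

For a hypothetical growth function LDC(L2) = 0.5·L2 − 35 dB SPL, for instance, the extrapolated fit crosses −25 dB SPL near L2 = 20 dB SPL.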

Audiometric thresholds (THAs) were collected for frequencies between 0.125 and 8 kHz using Sennheiser HDA200 headphones and a clinical audiometer AT900 (Auritec). Figure 1 shows audiograms of the tested ears, which were chosen based on the better of the left and right audiograms. At 4 kHz, yNHcontrol participants had a mean (M ± SD) threshold of 3.3 ± 3.5 dB HL. The older groups had thresholds of 11.3 ± 3.5 dB HL (oNH) and 36 ± 8.4 dB HL (oHI), respectively.

Figure 1.

Pure-tone hearing thresholds (in dB HL) at frequencies between 0.125 and 8 kHz. Groups were based on audiogram thresholds, DPOAE thresholds and age. yNHcontrol: young normal-hearing control group (blue), oNH: old normal-hearing (orange), oHI: old hearing-impaired (red). Thick traces represent the group mean and thin traces represent individual audiogram profiles. Note that two subjects from the recruited NH group did not meet the THDP threshold criterion for normal hearing and were therefore not included in the control group. The audiograms of these NH subjects are indicated with thin black traces. The 20 dB HL hearing threshold (THA), which was used to separate listeners into the oNH and oHI subgroups, is indicated by a gray dashed curve.

Human EFR recordings

EFRs were recorded in humans to study their relationship to speech intelligibility, and in Budgerigars to assess their sensitivity to kainic-acid-induced CS. Human EFRs were recorded to two amplitude-modulated pure-tone stimuli with the same carrier frequency (f = 4 kHz) and modulation frequency (fm = 120 Hz, starting phase φ of 3π/2). The only difference between the stimuli was their envelope shape: the SAM stimulus had a sinusoidal modulator (Eq. 1), as commonly used in animal studies of CS (e.g., Shaheen et al., 2015; Parthasarathy and Kujawa, 2018), while the modulator of the second stimulus was a rectangular wave with a duty cycle (τ) of 25% (RAM; Vasilkov et al., 2021). A modulation depth (md) of 0.95 (95%, −0.45 dB re. 100%) was applied. MATLAB’s “square” function was adopted to create the RAM stimulus, and both stimuli are formulated as follows:

SAM: x(t) = [1 + md·sin(2π·fm·t + φ)]·sin(2π·f·t), (Eq. 1)

RAM: y(t) = [2 + 2·md·m(t)]·sin(2π·f·t), (Eq. 2)

with m(t) = 2·∑_{n=0}^{d·fm−1} [u(t·fm − n + φ/2π) − u(t·fm − n + φ/2π − τ)] − 1,

and u(p) = 0 for p < 0, u(p) = 1 for p ≥ 0 (unit step function).

Stimuli were windowed using a 2.5% tapered-cosine window, had a duration (d) of 0.4 s and were repeated 1,000 times each (500 per polarity). The inter-stimulus interval consisted of a uniformly distributed random silence jitter (100 ms ± 10 ms). The SAM tone was presented at 70 dB SPL and the RAM stimulus at the same peak-to-peak amplitude, which corresponded to 68 dB SPL. A previous study compared a broad range of possible ABR and EFR markers of CS (Vasilkov et al., 2021), and we selected the most promising RAM stimulus for use in the present study. Stimuli were generated in MATLAB (R2015b) at a sampling rate of 48 kHz and calibrated using a Brüel & Kjær ear simulator type 4157 for insert earphones. A Fireface UCX sound card (RME) and TDT-HB7 headphone driver (Tucker-Davis) were used to drive the ER-2 insert earphones (Etymotic Research) using the open-source portaudio playrec ASIO codec (Humphrey, 2008).
Stimuli were presented monaurally to the test ear.
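As an illustration of Equations 1 and 2, the sketch below generates the two stimuli in Python; scipy.signal.square stands in for MATLAB's "square" function, amplitudes are in arbitrary units (calibration to dB SPL is omitted), and the function name is ours, not the authors'.

```python
import numpy as np
from scipy.signal import square, windows

def sam_ram_tones(f=4000.0, fm=120.0, md=0.95, duty=0.25,
                  dur=0.4, fs=48000, phi=3 * np.pi / 2):
    """Generate SAM and RAM tones as in Equations 1 and 2 (sketch)."""
    t = np.arange(int(dur * fs)) / fs
    carrier = np.sin(2 * np.pi * f * t)
    # SAM: sinusoidal modulator (Eq. 1)
    sam = (1 + md * np.sin(2 * np.pi * fm * t + phi)) * carrier
    # RAM: rectangular modulator m(t) in {-1, +1} with 25% duty cycle (Eq. 2)
    m = square(2 * np.pi * fm * t + phi, duty=duty)
    ram = (2 + 2 * md * m) * carrier
    # 2.5% tapered-cosine (Tukey) window, as applied to the stimuli
    win = windows.tukey(len(t), alpha=0.025)
    return sam * win, ram * win
```

With md = 0.95 the SAM envelope peaks at 1.95 and the RAM envelope at 3.9 times the carrier amplitude, which is why the RAM tone is scaled to the same peak-to-peak amplitude rather than the same RMS level.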

Recordings took place in a double-walled electrically shielded measurement booth (IAC acoustics) and participants sat in a reclining chair while watching a silent movie. EEG signals were recorded using a 64-channel cap with equidistant electrode spacing (Easycap) and active Biosemi Ag/AgCl electrodes were connected to a Biosemi amplifier. A sampling rate of 16,384 Hz and 24-bit analog-to-digital conversion were used to store the raw data traces. A common-mode-sense (CMS) active electrode was placed on the fronto-central midline and a driven-right-leg (DRL) passive electrode was placed on the tip of the nose. Reference electrodes were placed on each earlobe. Electrode offsets (DC values of the CMS signal) were kept below 25 mV.

Raw EEG recordings were extracted in Python (version 2.7.10, Anaconda 2.3.0 (64-bit), www.python.org) and MNE-Python (version 0.9.0; Gramfort et al., 2013, 2014), and all EEG recording channels were re-referenced to the offline-averaged earlobe electrodes. Data were epoched in 400 ms windows starting from the stimulus onset and baseline-corrected by the average amplitude per epoch. We only present results from the vertex channel (Cz) in this study, which is known to yield good signal strength for subcortical auditory sources (Picton, 2010). Signal processing was performed in MATLAB (R2014b). The EFR estimates for each stimulus condition and participant were computed based on the energy at the modulation frequency and its first four harmonics (h0–h4 = k·fm, k = 1…5) to account for all envelope-related energy in the EEG signal (Vasilkov et al., 2021). Equation 3 was used in a bootstrap routine to obtain the mean EFR amplitude and corresponding standard deviation (Zhu et al., 2013):

EFR = [max(W̄) − min(W̄)] / 2, (Eq. 3)

where W̄ = (1/B)·∑_{b=1}^{B} W_b,

with W_b = (1/N)·∑_{n=0}^{N−1} (F_{n,b} − NF_{n,b})·e^{iθ_{n,b}}, and F_{n,b} = NF_{n,b} = 0 if n ≠ k·fm·N/fs, for k = 1…5,

where N corresponds to the number of frequency bins in the magnitude spectrum. First, a mean spectral estimate of the EEG recording for each frequency component (n) was computed by averaging the complex discrete Fourier transform values of 1,000 randomly drawn epochs using 500 epochs of each polarity (with replacement) in each bootstrap run (b; Fn,b). Epochs were windowed using a 2% tapered-cosine window before the frequency-domain transformation. The electrophysiological noise floor (NFn,b) at frequencies h0–h4 was computed as the average magnitude of the ten frequency bins surrounding the respective frequency (five bins on each side). The noise-floor estimates were then subtracted from the signal components h0–h4 to yield peak-to-noise-floor (PtNn,b) magnitude estimates.
Figure 2A (left panels) illustrates the mean magnitude spectra and noise-floor estimates for an example human recording. All frequency components apart from the harmonic frequencies (h0–h4) were removed from the noise-floor-corrected spectrum before it was transformed back to the time domain using the inverse discrete Fourier transform and the original phase information (θn,b) of the harmonic frequencies. This procedure was repeated B = 200 times to yield 200 reconstructed time-domain estimates of the EFR waveform (Wb). The Wb waveforms were then averaged and the EFR was defined as half the peak-to-peak amplitude of the averaged reconstructed time-domain waveform. Figure 2A (right panels) shows the reconstructed EFR time-domain waveform for an example human recording and compares it against the original, filtered EEG signal on which the reconstruction procedure was applied. The metric defined in Equation 3 corresponds to the EFR peak-to-noise-floor amplitude and is further referred to as the EFR amplitude (in μV), or the EFR marker.
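The bootstrap of Equation 3 can be sketched as follows, assuming an array of baseline-corrected epochs as input; the 2% epoch window and polarity bookkeeping are omitted for brevity, and the function name and defaults are illustrative, not the authors' MATLAB code.

```python
import numpy as np

def efr_amplitude(epochs, fs=16384, fm=120.0, n_harm=5, n_boot=200, seed=0):
    """Bootstrap EFR amplitude from epoched EEG (sketch of Equation 3).

    Keeps only the modulation frequency and its first four harmonics,
    subtracts a local noise-floor estimate (five bins on each side),
    reconstructs the time-domain EFR from the retained components, and
    returns half the peak-to-peak amplitude of the mean waveform.
    """
    rng = np.random.default_rng(seed)
    n_ep, n_samp = epochs.shape
    harm_bins = [int(round(k * fm * n_samp / fs)) for k in range(1, n_harm + 1)]
    waves = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_ep, n_ep)                 # draw with replacement
        spec = np.fft.rfft(epochs[idx].mean(axis=0)) / n_samp
        clean = np.zeros_like(spec)
        for b in harm_bins:
            neigh = np.r_[spec[b - 5:b], spec[b + 1:b + 6]]
            nf = np.mean(np.abs(neigh))                   # local noise floor
            mag = max(np.abs(spec[b]) - nf, 0.0)          # peak-to-noise-floor
            clean[b] = mag * np.exp(1j * np.angle(spec[b]))
        waves.append(np.fft.irfft(clean, n_samp) * n_samp)
    w_mean = np.mean(waves, axis=0)
    return (w_mean.max() - w_mean.min()) / 2.0
```

Applied to synthetic epochs containing a 1-μV modulation-rate component in noise, this estimator recovers an EFR amplitude close to 1 μV.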

Figure 2.

Comparison between human (panel A) and Budgerigar (panel B) EFR recordings and analysis procedures for single subjects. For each species, the top row corresponds to the SAM stimulus and the bottom row to the RAM stimulus. The modulation frequencies and depths were 120 Hz (95% md) and 100 Hz (100% md) for humans and budgerigars, respectively. Carrier frequencies were 4 kHz (68 dB SPL RAM, 70 dB SPL SAM) and 2.83 kHz (75 dB SPL), respectively. Time-domain responses (right panels) show the filtered EEG recording along with the reconstructed time-domain waveform that was based on five frequency components (h0 − h4) and their respective phases. The EFR amplitude (or, EFR marker) was extracted from the reconstructed time-domain EFR following Equation 3, which was based on an iFFT of the noise-floor-corrected mean EFR magnitude (left panels). The budgerigar recordings (panel B) are shown for the same animal before or after kainic-acid (KA) administration. Post-KA spectral peaks and reconstructed EFRs were smaller than pre-KA EFRs.

Budgerigar EFR recordings and kainic-acid induced synaptopathy

EFRs were also recorded in eighteen young adult Budgerigars (Melopsittacus undulatus), approximately two years of age, before and/or after induction of CS using kainic acid (KA). The Budgerigar is a small parrot species with human-like behavioral sensitivity to many simple and complex sounds in operant conditioning studies (Dent et al., 2000; Dooling et al., 2000), and was selected based on its use in ongoing behavioral studies of CS (Wong et al., 2019). All procedures were performed at the University of Rochester and approved by the University Committee on Animal Resources. Three animals were tested before and after KA administration and monitored over time, whereas the others in the cohort either belonged to a control-only (n = 11) or KA-only group (n = 4). KA is a glutamate analog that damages cochlear afferent synapses between hair cells and auditory-nerve fibers through excitotoxicity in mammals and birds (Bledsoe et al., 1981; Pujol et al., 1985; Zheng et al., 1997; Sun et al., 2001). In Budgerigars, bilateral cochlear infusion of KA has been shown to permanently reduce ABR wave-I amplitude by up to 70% without impacting behavioral audiometric thresholds or DPOAEs generated by sensory hair cells (Henry and Abrams, 2018; Wong et al., 2019). We used the methods described in Henry and Abrams (2018) and Wong et al. (2019) to induce synaptopathy in Budgerigars. Briefly, animals were anesthetized with ketamine (5–6 mg/kg) and dexmedetomidine (0.1 mg/kg; subcutaneous bolus injection) and placed in a custom stereotaxic device. Ketamine/dexmedetomidine were administered subcutaneously for anesthetic induction, but anesthesia was maintained throughout the surgery (approximately 1–2 h) by a continuous anesthetic infusion pump (Razel Scientific; Fairfax, VT, USA). 
The middle-ear space was accessed surgically using a posterior approach to expose the basal prominence of the cochlear duct, where a 0.15-mm diameter cochleostomy was made using gentle rotating pressure on a small manual drill. Thereafter, 2.5 μL of 2-mM KA (Abcam ab144490; Cambridge, UK) in Hanks’ balanced salt solution (Sigma-Aldrich H8264; St. Louis, MO, USA) was infused into the cochleostomy over 90 s using a microinjection needle. Compound action potentials (CAPs) of the auditory nerve were recorded before and after infusion in response to clicks. Excitotoxic synaptic injury was confirmed by observing >90% CAP reduction within 10–20 min following KA exposure. The left and right ears were treated with KA during different surgical procedures four weeks apart to minimize operating time and to avoid excessive anesthetic exposure. DPOAEs were recorded using a swept-tone paradigm (see Wong et al., 2019) before and after surgeries to confirm no adverse impact of the procedures on sensory hair cells. Prior to KA exposure, wave-I amplitude of the ABR in response to 90-dB peSPL clicks was 24.56 ± 2.31 μV in animal K20 and 23.69 ± 2.60 μV in animal K25 (M ± SD). Reduction of wave I, based on ABRs recorded four or more weeks post KA (during the steady-state period), was 68.5% in animal K20 and 64.9% in animal K25. EFRs were measured at multiple points before KA exposure and at four time points after the second infusion. Repeated measurements assessed within-subject variability of responses, since synaptic injury remains relatively stable after one month following KA exposure (Sun et al., 2000; Henry and Abrams, 2018; Wong et al., 2019). Anesthesia was performed as described above using ketamine and dexmedetomidine (for recording sessions, anesthesia was only administered subcutaneously), and body temperature was maintained in the normal range for this species of 39–41°C.

Stimuli were generated in MATLAB (The MathWorks, Natick, MA, USA) at a sampling frequency of 50 kHz. Stimuli were SAM and RAM tones presented at 75 dB SPL with 10-ms cosine-squared onset and offset ramps, 300-ms duration, and a 130-ms silent interval between successive stimuli. Carrier frequency and modulation frequency were 2,830 and 100 Hz, respectively, and the polarity of the carrier signal was alternated between stimulus repetitions. Depth of modulation was 100%, and the duty cycle for RAM modulation was fixed at 25%. Stimuli were converted to analog using a data acquisition card (PCIe-6251; National Instruments, Austin, TX, USA), which also digitized response activity using the same internal clock. Stimuli were presented free-field through a loudspeaker (MC60; Polk Audio, Baltimore, MD, USA) positioned 20 cm from the animal’s head in the dorsal direction (the rostral surface of the head faced downward in the stereotaxic apparatus; thus the loudspeaker and animal were located in the same horizontal plane). Level was controlled digitally (by scaling stimulus waveforms in MATLAB) and by up to 60 dB of analog attenuation applied by a programmable attenuator (PA5; Tucker Davis Technologies, Alachua, FL, USA). Calibration was based on the output of a 1/4″ precision microphone (model 4938; Brüel & Kjær, Marlborough, MA, USA) in response to pure tones.

Electrophysiological activity was recorded differentially between a stainless steel electrode implanted at the vertex (M0.6 threaded machine screw; advanced through the skull to the level of the dura) and a platinum needle electrode (model F-E2; Natus Manufacturing, Gort, Co. Galway, Ireland) inserted at the base of the skull near the nape of the neck. A second needle electrode in the animal's back served as ground. Activity was amplified by a factor of 50,000 and filtered from 30 to 10,000 Hz (P511; Grass Instruments, West Warwick, RI, USA) prior to sampling (50 kHz) by the data acquisition card. Responses to 300 repetitions of the same stimulus (including both polarities) were averaged to produce each EFR waveform and amplitude, following the procedure described for human EFR recordings in Equation 3. Figure 2B (left panels) depicts the mean EFR spectral magnitudes and noise-floor estimates for the SAM and RAM stimuli of a Budgerigar that was monitored longitudinally. Spectra and reconstructed EFR waveforms are shown both before and after KA administration and demonstrate smaller EFR waveforms after KA-induced CS in this species (Fig. 2B, right panels).
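The spectrum-based amplitude estimation referenced here (Equation 3 for the human recordings) can be sketched as follows. This is a minimal illustration, not the exact published implementation: the function name, the number of harmonics and the noise-bin width are assumptions.

```python
import numpy as np

def efr_amplitude(avg_response, fs, f0=100.0, n_harmonics=4, n_noise_bins=5):
    """Estimate an EFR amplitude from an averaged response waveform.

    At the modulation frequency f0 and its first harmonics, the single-sided
    spectral magnitude is compared against a local noise floor estimated from
    neighbouring FFT bins; peaks exceeding the floor are summed.
    """
    n = len(avg_response)
    spec = 2.0 * np.abs(np.fft.rfft(avg_response)) / n  # single-sided magnitude
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    total = 0.0
    for k in range(1, n_harmonics + 1):
        idx = int(np.argmin(np.abs(freqs - k * f0)))
        # local noise floor: mean magnitude of bins flanking the peak
        flank = np.r_[spec[idx - n_noise_bins:idx], spec[idx + 1:idx + 1 + n_noise_bins]]
        floor = flank.mean()
        if spec[idx] > floor:  # only energy above the noise floor contributes
            total += spec[idx] - floor
    return total
```

Applied to a 1-s averaged waveform, the function returns a summed-harmonic amplitude in the same units as the input (e.g., μV).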

Human Speech Reception Thresholds (SRTs)

Speech intelligibility was assessed with the standard German matrix sentence test (Matrix test or OLSA; Wagener et al., 1999; Brand and Kollmeier, 2002), which determines the speech reception threshold (SRT) in quiet or in a fixed-level noise background. The OLSA test consists of five-word sentences (male speaker; name–verb–numeral–adjective–object) drawn from a corpus of 50 possible words. The speech-shaped background noise was generated from the sentences and matched the long-term spectrum of the speech material. The SRT for 50% correctly identified words (i.e., SRT50) was determined using a 1-up/1-down adaptive procedure with varying step size based on word scoring (20 sentences per condition).

The SRT50 was determined in quiet (SiQ) and noise (SiN) for unfiltered/broadband (BB), LP-filtered and HP-filtered audio. The speech and noise signals in the LP and HP conditions were generated by applying a 1,024th-order FIR filter with respective cut-off frequencies of 1.5 and 1.65 kHz to the OLSA test material (i.e., the BB condition). Since the adopted EFR marker predominantly captures high-frequency TENV processing, it is logical to incorporate a speech condition that similarly depends on cochlear TENV processing, such as the SRTHP.
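The LP/HP filtering of the speech and noise material can be sketched as below. The sampling rate (fs = 44100 Hz), the use of `scipy.signal.firwin` with its default Hamming window, and the function names are illustrative assumptions; only the filter order (1,024) and cut-off frequencies (1.5 and 1.65 kHz) come from the text.

```python
import numpy as np
from scipy.signal import firwin, lfilter

fs = 44100       # assumed sampling rate of the speech material
order = 1024     # 1,024th-order FIR -> 1,025 taps

# Low-pass at 1.5 kHz and high-pass at 1.65 kHz, as in the LP/HP conditions
lp_taps = firwin(order + 1, 1500.0, fs=fs)
hp_taps = firwin(order + 1, 1650.0, pass_zero=False, fs=fs)

def filter_condition(signal, taps):
    """Apply an FIR filter to a speech or noise signal (BB -> LP or HP)."""
    return lfilter(taps, 1.0, signal)
```

Note that an odd number of taps (order + 1 = 1,025) is required for the high-pass design; an even-length linear-phase FIR cannot realize a high-pass response.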

Both SiQ and SiN conditions were included in our study because we anticipate that speech processing in the presence of a fixed-level background noise may be more adversely affected by CS. Although both hearing sensitivity (Plomp, 1986) and CS-compromised TENV processing (Shaheen et al., 2015; Parthasarathy and Kujawa, 2018) can impact SiQ processing, we believe that the SiN condition may be further compromised by reduced coding redundancy due to AN deafferentation (Lopez-Poveda and Barrios, 2013). In the SiQ conditions, the speech level was varied adaptively and the level (in dB SPL) at which the 50%-correct threshold was reached was reported. In the SiN test, the initial speech level was 70 dB SPL and the noise level was kept fixed at 70 dB SPL while the speech level varied adaptively to yield the SRT. The six conditions (3 SiQ, 3 SiN) were presented in pseudo-random order. Participants completed three training runs: a SiNBB run with a fixed SNR of 5 dB, followed by a regular SiNBB condition with SRT tracking. All possible words were displayed on the screen during those runs to familiarize participants with the stimulus material. The third training run was a SiNHP condition with SRT tracking but without visual aid. During the experiment, answers were logged by the experimenter and no visual feedback was provided. Measurements were conducted in a double-walled sound-insulated booth using Sennheiser HDA200 headphones in combination with the Earbox Highpower ear 3.0 sound card (Auritec) and the Oldenburg Measurement Platform (HörTech gGmbH). The setup was calibrated using an artificial ear type 4153, microphone type 4134, preamplifier type 2669 and sound level meter type 2610 (Brüel & Kjær).
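A 1-up/1-down adaptive track of the kind used here converges on the 50%-correct point of the psychometric function. The sketch below is a simplified stand-in for the OLSA procedure: the step-size schedule, trial count, and function names are assumptions, not the published word-scoring rule.

```python
def track_srt50(respond, start_snr=0.0, n_trials=40):
    """1-up/1-down adaptive track converging on the 50%-correct point.

    `respond(snr)` returns True for a correct trial. The step size shrinks
    over trials (assumed schedule) down to a 1-dB floor, and the SRT estimate
    is the mean of the final tracked levels.
    """
    snr = start_snr
    levels = []
    for trial in range(n_trials):
        step = max(1.0, 4.0 * 0.8 ** trial)  # assumed shrinking step schedule
        if respond(snr):
            snr -= step   # correct: make the task harder
        else:
            snr += step   # incorrect: make it easier
        levels.append(snr)
    return sum(levels[-10:]) / 10.0
```

With a deterministic listener whose threshold sits at -7 dB SNR, the track oscillates around that value within a few trials.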

Statistical and post-hoc analysis

To disentangle the effects of CS and hearing sensitivity on the EFR markers and SRTs, we performed group statistics as well as a multiple regression analysis using the pooled data. THAs and THDPs were used as criteria to assign listeners to specific groups. Aside from their normal audiometric thresholds, all subjects in the yNHcontrol group were ensured to have a THDP,4 kHz ≤ 25 dB SPL, to minimize the risk of OHC damage in this group. As a result, two participants from the original yNH group were excluded from the yNHcontrol group based on their THDP,4 kHz. Their datapoints were, however, included in the multiple regression analysis across all individuals, and were identified with distinct markers in the corresponding regression figures. We performed a group analysis using the "aov_group" function from the R programming environment (R Core Team, 2019), and investigated main effects of age and hearing sensitivity using unpaired Student's t-tests between the yNHcontrol (n = 13) and oNH (n = 15) groups, and between the oNH and oHI (n = 14) groups, respectively. The "SciPy" Python package for scientific computing (Oliphant, 2007; Millman and Aivazis, 2011) and its "stats.ttest_ind" function were used for this purpose. p-values for multiple comparisons were Bonferroni-adjusted to control the family-wise error rate. The applied correction factors are given in the Results section.
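The group comparison with Bonferroni adjustment can be sketched as follows; the helper name and dictionary layout are illustrative assumptions around the `scipy.stats.ttest_ind` call named in the text.

```python
import numpy as np
from scipy import stats

def bonferroni_ttests(groups, comparisons, n_tests=None):
    """Unpaired t-tests between named groups with Bonferroni adjustment.

    `groups` maps a group name to a 1-D array of values; `comparisons` lists
    (a, b) name pairs. Adjusted p-values (p * number of tests) are capped at 1.
    """
    n_tests = n_tests or len(comparisons)
    results = {}
    for a, b in comparisons:
        t, p = stats.ttest_ind(groups[a], groups[b])
        results[(a, b)] = (t, min(p * n_tests, 1.0))
    return results
```

The `n_tests` argument allows the correction factor to be set explicitly when the family of tests is larger than the pairs passed in, as with the correction factors of 6 and 18 reported later.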

We examined linear correlations between the EFR and SRT on the pooled data (n = 44) using correlation coefficients calculated with the "SciPy" Python package and its "stats" module. All correlations refer to the Pearson correlation coefficient (r) if both variables were normally distributed (Shapiro–Wilk test); otherwise the Spearman's rank correlation coefficient (ρ) is reported. These correlations were further investigated using the "sklearn" linear regression functions in Python to analyse the residuals, and using multiple regression models to verify the contribution of age and hearing sensitivity ("lsmeans" package in R; Lenth, 2016). Additionally, we performed commonality analysis using the "yhat" package (Nimon et al., 2008). Commonality analysis combines linear regressions on the dependent variable and allows for the decomposition of the explained variance (R²) of the linear predictors into subcomponents explained by the unique and the common/shared variance of predictors and all their possible combinations (Newton and Spurrell, 1967). This technique also works in the presence of multicollinearity (Ray-Mukherjee et al., 2014).
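For two predictors, the commonality decomposition reduces to simple differences of R² values. The sketch below (function name assumed) illustrates what the "yhat" package computes in the two-predictor case: unique(X1) = R²(full) − R²(X2), unique(X2) = R²(full) − R²(X1), and common = R²(X1) + R²(X2) − R²(full).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def commonality_two_predictors(x1, x2, y):
    """Two-predictor commonality decomposition of explained variance."""
    def r2(X):
        return LinearRegression().fit(X, y).score(X, y)

    r2_1 = r2(x1.reshape(-1, 1))            # X1 alone
    r2_2 = r2(x2.reshape(-1, 1))            # X2 alone
    r2_full = r2(np.column_stack([x1, x2])) # both predictors
    return {
        "unique_x1": r2_full - r2_2,
        "unique_x2": r2_full - r2_1,
        "common": r2_1 + r2_2 - r2_full,
        "total": r2_full,
    }
```

For orthogonal predictors the common component vanishes and the unique components sum to the total R²; shared variance between collinear predictors shows up in the common term instead.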

Because hearing sensitivity, as assessed behaviorally using pure-tone thresholds is similar, but not necessarily identical, to OHC sensitivity assessed through DPOAE thresholds (Boege and Janssen, 2002), we performed additional post-hoc statistics using the THDP,4 kHz as the measure of hearing sensitivity. For the group-based analyses, we divided the cohort into groups with either normal or impaired DPOAE thresholds, reflecting THDP,4 kHz ≤ or >25 dB SPL, respectively. We also considered the pooled older group (OLD, n = 29) when investigating main effects of age-related CS in comparison to the yNHcontrol group, and the pooled normal-hearing group (NH, n = 28) when investigating main effects of hearing sensitivity in comparison to the oHI group.

Results

EFR sensitivity to CS

Panel B in Figure 2 shows the effect of KA on Budgerigar SAM and RAM EFR magnitude spectra (left) and reconstructed waveforms (right). Energy at the modulation frequency of the stimulus and its harmonics was reduced after KA administration, leading to an overall reduction in the reconstructed EFR waveform and its corresponding amplitude. EFR amplitude reductions occurred consistently across the three longitudinally monitored Budgerigars (Fig. 3A), and were attributed to a histology-verified reduction in auditory-nerve peripheral axons and cell bodies (Fig. 3B). The histology confirms that KA introduces CS in Budgerigars, and the additional DPOAE analyses in Wang et al. (2023) and Wilson et al. (2021) furthermore demonstrate the selectivity of KA for AN synapses and cells without damaging the OHCs. As Figure 3A depicts, EFR amplitude reductions occur immediately after KA administration, and recover slightly to an overall reduced amplitude over the following weeks.

Figure 3.

A, EFR amplitudes from three Budgerigars (B132, B120, B125) before or several weeks after administration of kainic acid (KA). B, Representative cross sections of the Budgerigar cochlea from a control ear (left) and from an ear exposed to 1-mM KA solution (12 weeks post exposure; right). Sections are stained with DAPI and for Myosin 7A, and are from the location 50–60% of the distance from the apex to the base (≈2-kHz cochlear frequency; see Wang et al., 2023). KA exposure causes a marked reduction of auditory-nerve (AN) peripheral axons and cell bodies in the AN ganglion, without impacting the hair-cell epithelium. C, Boxplots and individual data points of the Budgerigar SAM and RAM EFR amplitudes before or after KA administration. Connected lines correspond to data from the same animal. To account for both paired and independent data in the Budgerigar sample, we performed a partially overlapping samples t-test and reported significance as: *p < .05, **p < .01 and ***p < .001. D, Boxplots and individual data points of human SAM and RAM EFR amplitudes. Data are shown for the different test groups based on age and THAs (yNHcontrol, oNH and oHI), as well as for the pooled OLD group (oNH+oHI). Independent-samples t-tests were performed between all conditions for the SAM and RAM conditions separately, and significance was reported as: *p < .05, **p < .01 and ***p < .001 after applying a Bonferroni correction factor of 6.

The post-KA EFR amplitudes shown in Figure 3C correspond to average EFR amplitudes (μV) over the different post-KA measurement time points for each animal and are compared to control (pre-KA or non-KA) EFR amplitudes. Connected lines refer to data points stemming from the same Budgerigar, and SAM and RAM EFRs were recorded during the same session. Other points show data from control animals (n = 11) or post-KA exposure animals (n = 4; i.e., animals for which pre-exposure EFRs were not recorded). To account for the mix of paired and independent observations in the sample, we conducted a partially overlapping samples t-test (Derrick, 2017), which revealed a significant decrease in EFR amplitude from the pre-KA condition (M = 0.78, SD = 0.245) to the post-KA condition (M = 0.49, SD = 0.127) for the SAM stimulus (t(19.89) = 2.75, p = .012). The RAM condition showed an even larger significant reduction from pre-KA (M = 3.86, SD = 0.689) to post-KA (M = 1.73, SD = 0.565, t(19.89) = 7.611, p < .001). The latter observation relates to the almost five-times larger pre-KA RAM amplitudes. These recordings confirm the positive effect that the stimulus envelope shape has on the EFR signal-to-noise ratio. Single-unit AN recordings show more synchronized AN responses to faster-rising stimulus envelopes (Dreyer and Delgutte, 2006), and the AN and EFR model simulations performed in Vasilkov et al. (2021) show that this effect also impacts the neural generators of the EFRs. We conclude that the RAM EFR is a selective non-invasive marker of CS in Budgerigar and that the RAM EFR has greater sensitivity than the SAM EFR in identifying individual differences in CS. This renders the RAM EFR a suitable candidate for use in human studies, in which the intrinsic EEG signal-to-noise ratio is inherently smaller than in research animals.

Human EFR recordings: age-related deficits

Figure 3D depicts human SAM and RAM EFRs for yNHcontrol and older subjects with or without impaired audiograms. The human EFR amplitudes are in agreement with both model predictions (Vasilkov et al., 2021) and Budgerigar findings (panel C) in showing overall 3.7-times larger RAM (M = 0.239, SD = 0.076) than SAM (M = 0.065, SD = 0.022) amplitudes (μV) in the yNHcontrol group. Compared to the yNHcontrol group, older subjects showed amplitude reductions of 7% for the SAM EFR and 47% for the RAM EFR. The mean amplitudes for the older group were M = 0.061 (SD = 0.031) for SAM and M = 0.126 (SD = 0.057) for RAM. There were no significant differences between the yNHcontrol group and older listeners for the SAM EFR (p > .05). However, RAM EFR amplitudes were significantly reduced in the older group compared to the yNHcontrol group (t(40) = 5.17, p < .001). Figure 3D visualises this trend, and furthermore shows that the RAM EFRs of the oNH and oHI subgroups are significantly smaller than those of the yNHcontrol group. Together, this supports the view that the RAM EFR is more sensitive than the SAM EFR in detecting age-related changes in humans.

Humans differ from the KA Budgerigar model of CS in that the reductions in human EFRs may be influenced by factors beyond age-related CS. For example, our human cohort may have had other forms of SNHL (e.g., OHC damage) that could also have affected the RAM EFR marker. To investigate the potential influence of hearing sensitivity on the observed EFR reductions, the main effect of age can be considered against the main effect of hearing sensitivity. The yNHcontrol group had significantly larger RAM EFR amplitudes (M = 0.239, SD = 0.076) than the oNH group (M = 0.155, SD = 0.062, t(26) = 3.09, p = .005), and the oNH group had larger EFR amplitudes than the oHI group (M = 0.095, SD = 0.028, t(27) = 3.19, p = .004). However, the main effect of age (t(40) = 3.92, p < .001) was greater than that of THA differences in hearing sensitivity (t(27) = 3.19, p = .004). We performed an additional post-hoc regrouping using the THDP criterion of 25 dB SPL to separate the cohort into listeners with normal or impaired OHC integrity at 4 kHz. Within the group with THDPs <25 dB SPL, younger subjects had significantly larger EFRs than the older listeners (t(17) = 3.9, p < .001), and among the older listeners, there were no significant EFR differences between those with normal or impaired THDPs (t(27) = 1.1, p > .05). The mean THDP difference of 20.9 dB between the older subjects with normal or impaired hearing sensitivity was thus not reflected in their EFR amplitudes. Taken together, our group analyses support a predominantly age-related CS interpretation of the RAM EFR. Going further, we investigate the degree to which the RAM-EFR marker can predict speech-intelligibility declines in older listeners.

Speech reception thresholds

Individual and group speech reception thresholds (SRTs) are depicted in Figure 4 for quiet (SiQ; panel A) and stationary noise backgrounds (SiN, panel B). For each filtered condition, SRTs of the yNHcontrol group are compared to the pooled older group, as well as oNH and oHI subgroups.

Figure 4.

Speech reception thresholds (SRT) for the OLSA matrix sentence test presented in quiet (A) and in speech-shaped noise (B), for three conditions: original (BB), low-pass-filtered speech and noise material (fc = 1.5 kHz; LP), and high-pass-filtered speech and noise material (fc = 1.65 kHz; HP). SRTs are grouped by the selection groups (yNHcontrol, oNH, oHI), as well as pooled across oNH and oHI subjects into an older group (OLD). Independent t-tests were computed between the groups in each condition for the quiet and noise conditions separately, and significant differences are indicated on the figure. The p-values were Bonferroni-corrected before their significance was reported as (*) p < .05, (**) p < .01 and (***) p < .001. A Bonferroni correction factor of 18 was applied for the yNHcontrol, oNH and oHI group comparisons, and of 6 for the yNHcontrol and OLD comparisons.

A two-way (3 × 3) mixed-design ANOVA investigated the roles of participant group (yNHcontrol, oNH, oHI) and filter condition (LP, HP, BB) on the SRT. Apart from significant main effects of group (SiQ: F(2, 39) = 37.2, p < .001; SiN: F(2, 39) = 34.8, p < .001) and filter condition (SiQ: F(2, 78) = 420.3, p < .001; SiN: F(2, 78) = 707, p < .001), the interaction terms were also significant (SiQ: F(4, 82) = 24.6, p < .001; SiN: F(4, 82) = 17.9, p < .001), indicating that group SRTs were differently affected by the filtering.

Adding background noise affected the groups differently in the HP condition, while the trends observed in the BB and LP conditions remained consistent across the SiQ and SiN tests; in particular, the addition of background noise impaired the oNH group in the HP condition. While the OLD subgroups performed worse than the yNHcontrol group in both SiQ and SiN conditions (t(40) = −6.8 (SiQ) and −7.3 (SiN), p < .001), the oNH group performed equally poorly as the oHI group in the SiN condition (p > .05), but not in the SiQ condition (t(27) = −5.08, p < .001). This supports the view that processing TENV information in a stationary noise background is more affected by the ageing process than by hearing sensitivity impairments. Further examination of Figure 4 shows that the SRTHP was generally worse than the SRTLP. This effect was not influenced by the presence of background noise, suggesting that German speech intelligibility relies more on speech frequency information below 1.5 kHz.

In the LP condition, where hearing sensitivity was comparable and within the normal range for both the oNH and oHI groups, there were no significant differences in SRTLP. This supports the view that age-related differences were not the driving factor in explaining these group differences. In the HP condition, both the oNH and oHI groups had poorer SRTHP scores than the yNHcontrol group, indicating a predominant age effect. Whereas the SiN condition was similarly reduced in oNH and oHI listeners, the significantly worse performance of the oHI group compared to the oNH group in the SiQ condition underscores the additional impact of reduced hearing sensitivity at frequencies above 1.5 kHz for SiQ, but not SiN, processing.

Speech reception thresholds: individual differences

Figures 5 and 6 depict the relationship between RAM (top) or SAM (bottom) EFR amplitudes and SRTSiQ or SRTSiN, respectively. Overall, the SRT related most strongly to the RAM, not the SAM, EFR amplitude. After correcting for multiple comparisons (n = 12, p = .0042), all SRTSiQ conditions correlated significantly with the RAM EFR, while for the SRTSiN conditions, only the BB and HP conditions remained significant. This suggests that the RAM-EFR marker, which was more sensitive in detecting individual CS differences than the SAM EFR, was also more effective at predicting individual differences in speech recognition.

Figure 5.

Regression plots between SRTSiQ and the EFR amplitudes for RAM (top) and SAM stimuli (bottom). Analyses are performed for the BB (A,D), LP (B,E) and HP filtered conditions (C,F). Subjects belonging to the yNHcontrol, oNH, oHI groups are color-coded and the two yNH subjects who did not meet the THDP criterion to be included in the yNHcontrol group are marked with crosses. Correlation statistics (ρ or r) are indicated on each panel and are performed across the entire cohort (ALL), or subgroups of OLD (oNH+oHI) or NH (yNH+oNH) subjects.

Figure 6.

Regression plots between SRTSiN and the EFR amplitudes for RAM (top) and SAM stimuli (bottom). Analyses are performed for the BB (A,D), LP (B,E) and HP filtered conditions (C,F). Subjects belonging to the yNHcontrol, oNH, oHI groups are color-coded and the two yNH subjects who did not meet the THDP criterion to be included in the yNHcontrol group are marked with crosses. Correlation statistics (ρ or r) are indicated on each panel and are performed across the entire cohort (ALL), or subgroups of OLD (oNH+oHI) or NH (yNH+oNH) subjects.

Second, the EFR markers specifically targeted TENV coding mechanisms, given their 4-kHz carrier frequency, which led to a stronger prediction of the SRTHP than of the SRTLP.

Table 1 and the figure legends summarize the correlation statistics between SRT conditions and the RAM EFR for the entire cohort (ALL) as well as for the NH or OLD subgroups. When comparing correlations across subgroups of NH and OLD participants, it is evident that the RAM EFR consistently serves as a stronger predictor of SRT in the NH cohort compared to the OLD cohort. This finding suggests that age-related effects, which are the dominant factor in the NH group, exert a greater influence on SRT predictions than THA differences do in the OLD cohort. Notably, participants in the OLD group may already exhibit some degree of age-related CS, which could mask the relationship between RAM EFR and SRT.

Table 1.

Relationship between the EFR marker and SRT

Of particular interest is the NH group, who are not currently classified as clinically hearing-impaired based on their normal audiograms, yet can demonstrate both reduced RAM-EFR amplitudes and impaired SRTs. These findings highlight the importance of conducting more detailed assessments of these individuals beyond standard audiograms, especially when they report difficulties with hearing.

However, to support a hypothesis in which RAM EFR and SRTs are simultaneously influenced by an underlying CS cause, it is essential to rule out other factors that could have affected this relationship. Potential confounding factors include aspects that impact speech recognition (such as cognition) but do not affect the RAM EFR, or a predominance of other interrelated SNHL pathologies (such as OHC damage) that could drive both measures in this relationship.

The roles of hearing sensitivity and age in predicting the SRT

To factor out the mediating effect of hearing sensitivity on the observed relationships between the RAM EFR and SRT, we corrected for THA by considering the residuals of a linear regression model between THA and SRT. Figure 7 shows the relation between the SRT residuals and the RAM EFR after correcting for the mean THA across the frequencies in the 0.125–8 kHz, 0.125–1.5 kHz, and 1.5–8 kHz intervals, for the BB, LP, HP conditions, respectively. After correcting for hearing sensitivity, the SRTSiN-HP residuals remained significantly correlated for the NH subgroup (r(29) = −0.6, p < .001), and approached significance for the cohort (ρ(44) = −0.28, p = .07). We repeated this analysis using THDP,4 kHz as the correction factor, and found that the significant relationship between RAM EFR and SRTSiN-HP decreased from ρ(44) = −0.73 (p < .001) to ρ(44) = −0.34, but remained significant (p = .02). Taken together, these analyses show that the individual SRTSiN-HP differences cannot solely be explained by hearing sensitivity differences. None of the SRTBB or SRTLP residuals maintained a relationship to the RAM EFR, after correcting for hearing sensitivity. This is not surprising as the RAM EFR marker of CS reflects supra-threshold TENV processing above the phase-locking limit, and the SRTBB values were very similar to the SRTLP results which rely on low-frequency hearing mechanisms.
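The residualization step described above can be sketched as follows; the function name is an illustrative assumption around the "sklearn" regression and "SciPy" correlation calls named in the Methods.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

def threshold_corrected_correlation(srt, tha, efr):
    """Correlate EFR amplitude with SRT after partialling out hearing sensitivity.

    SRT is regressed on THA; the residuals (the SRT variance not explained by
    audibility) are then correlated with the EFR amplitude.
    """
    model = LinearRegression().fit(tha.reshape(-1, 1), srt)
    residuals = srt - model.predict(tha.reshape(-1, 1))
    r, p = stats.pearsonr(residuals, efr)
    return residuals, r, p
```

A surviving correlation between the residuals and the EFR indicates that the EFR captures SRT variance beyond what pure-tone sensitivity explains.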

Figure 7.

Residuals of a linear regression model between the SRT and THA for the different SRT conditions: BB, LP, HP. The applied THA for each of the conditions was adjusted to the frequency range of the speech material. The THA correction applied refers to the mean THA across the 0.125–8 kHz (BB), 0.125–1.5 kHz (LP), and 1.5–8 kHz (HP) intervals, respectively. Indicated correlation statistics refer to the ρ or r calculated between the THA-corrected SRTs and the RAM EFR, and were either calculated across the entire cohort (ALL) or subgroups of NH (yNH+oNH) or OLD (oNH + oHI) subjects.

Another confounding factor, namely age, could have affected the SRT and RAM EFR differently; hence we first considered its independent contribution to the RAM EFR. While a significant age effect was evident in the RAM EFR for the pooled data (ρ(44) = −0.66, p < .001), this effect was absent for the subgroup of older participants, who were age-matched within a 61–68 age bracket. At the same time, an age effect was also observed for the yNHcontrol subgroup (ρ(13) = −0.57, p = .041). These findings suggest that while a general age-related trend for CS exists, individuals within the same age decade can still exhibit varying degrees of CS.

Secondly, we investigated the independent contributions of age and TH to the SRTHP as part of the linear regression analysis reported in Table 2. Age was a strong predictor of the SRTHP in quiet and noise conditions (adjusted R² of 0.58 and 0.54, respectively), but is not an independent predictor as such. A collinearity analysis with age and the other predictor variables for the SRT (i.e., RAM EFR, THA, THDP) showed variance inflation factors above 1.5, indicating mild collinearity between the predictor variables. A Durbin-Watson test further revealed that linear regression models with age and THDP, or age and RAM EFR, had significantly correlated residuals (p < .05). Since age was neither independent of the RAM-EFR nor of the TH variables, including it in the multiple regression models would consistently dominate the results. Moreover, because age cannot be considered a marker of peripheral hearing damage, it was excluded from further analysis.
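Variance inflation factors can be computed directly from auxiliary regressions: VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing predictor j on the remaining predictors. The sketch below illustrates the computation; it is not the specific implementation used in the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def vif(X):
    """Variance inflation factor per column of the predictor matrix X."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        # R^2 of predictor j regressed on all remaining predictors
        r2 = LinearRegression().fit(others, X[:, j]).score(others, X[:, j])
        out.append(1.0 / (1.0 - r2))
    return out
```

VIFs near 1 indicate (near-)orthogonal predictors; values above commonly used cut-offs (e.g., 5) flag problematic collinearity, which is why the mild values above 1.5 reported here already warranted caution with age as a covariate.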

Table 2.

Linear regression analysis between SRT and variables RAM EFR, age and threshold

A role for the RAM-EFR marker of CS in predicting the SRTHP

Tables 2 and 3 report multiple regression and commonality analyses which considered the entire cohort (n = 44) and the variables RAM EFR, THA, THDP, or age, using the following equation for the dependent variable (SRTHP): SRTHP = β1 · X1 + β2 · X2 + ϵ. Several linear regression models of SRTSiN,HP and SRTSiQ,HP were compared: single regression models which only considered the X1 factor, and multiple regression models with and without interaction terms. Best-fitting single regression models were those with age or THA as the predictor variable. Regression models which included THA or THDP and the RAM EFR improved the model fit for the SRTSiN,HP, but not for the SRTSiQ,HP condition, and adding an interaction term did not further improve the models.
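Comparing single against multiple regression models of this form typically relies on the adjusted R², which penalizes added predictors: adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). A sketch (function name assumed):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def adjusted_r2(X, y):
    """Adjusted R^2 for an ordinary-least-squares fit of y on the columns of X."""
    n, p = X.shape
    r2 = LinearRegression().fit(X, y).score(X, y)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
```

A two-predictor model only "improves the fit" over a single-predictor model, in the sense used here, when the gain in R² outweighs the penalty for the extra degree of freedom.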

Table 3.

Linear regression and commonality analysis for SRTHP

In general, hearing sensitivity parameters were stronger predictors in the SRTSiQ,HP than SRTSiN,HP models. This underscores a strong association between SRTSiQ and hearing sensitivity (Festen and Plomp, 1983; Papakonstantinou et al., 2011). In contrast, it is well known that the SRTSiN is not well predicted by hearing sensitivity alone (Plomp, 1986; Fitzgerald et al., 2024). The SRTSiN-HP performance may thus be more influenced by supra-threshold TENV deficits linked to temporal coding impairments caused by CS (Lopez-Poveda and Barrios, 2013) or the so-called supra-threshold distortion factors (Plomp, 1986).

To further examine the independent role of the RAM EFR for speech recognition, we performed linear regression models and a commonality analysis for the dependent variables SRTSiQ,HP and SRTSiN,HP in Table 3. All models had a normal distribution of residuals, and a variance inflation factor (VIF) below 1.61 (i.e., <5, indicating sufficiently independent predictors). Homoscedasticity was not met for the SRTSiN,HP models, and the Durbin-Watson test showed dependence in the SRTSiN,HP and SRTSiN,NH conditions. Linear models with RAM EFR (X1) or THA,4 kHz (X2) were significant for all conditions, but explained less variance for the SRTSiN-HP,OLD condition where only minor age-related CS differences were expected due to the age-matching of subjects in the cohort. Even though there was a significant unique contribution of THA,4 kHz in all models, there was also a unique and significant contribution of 14.1% and 34.7% for the RAM EFR in the SRTSiN-HP and SRTSiN-HP,NH models, respectively. Repeating this analysis with the THDP,4 kHz variable of hearing sensitivity yielded unique RAM EFR contributions of 30.37% and 82.59% for the SRTSiN-HP and SRTSiN-HP,NH conditions, respectively. Especially given that the unique contribution of the RAM EFR was drastically reduced in the SRTSiQ,HP models, this further supports the conclusion that our proposed RAM EFR marker of CS is more effective at predicting the individual signal-to-noise ratio required for speech recognition in a fixed-level, stationary noise background.

Discussion

We report strong predictive power of the RAM-EFR amplitude for SRTs, with better performance for SRTHP compared to SRTBB and SRTLP conditions (Table 1, Figs. 5 and 6). Additionally, the unique contribution of the RAM EFR marker of CS to the SRTHP models, which included both hearing sensitivity and the RAM-EFR marker, was greater for the SiN condition than for the SiQ condition (Table 3). This finding aligns with the prevailing hypothesis that CS has a more pronounced effect on supra-threshold SiN processing than on SiQ processing. Overall, our results suggest that the ANF population, as assessed using a fixed-level 70-dB-SPL, 4-kHz RAM stimulus, is a strong predictor of speech recognition at similar sound levels and within comparable cochlear frequency ranges.

The quality of the adopted RAM-EFR marker of CS

To interpret the RAM-EFR as a pure marker of CS, the EFR marker would need to be fully independent of hearing sensitivity or OHC integrity. Our recordings from Budgerigars support the sensitivity of the EFR marker to CS, though they do not completely exclude the possibility that significant OHC damage may also influence its amplitude. Model simulations presented by Vasilkov et al. (2021) provide further insight, indicating that on-CF ANF responses to the RAM stimulus remain unaffected by OHC damage (see their Fig. 1). Additional simulations of human EFR generators by Vasilkov et al. (2021) and Van Der Biest et al. (2023) show that simulated OHC damage has only a minimal effect (5–10%) on the 4-kHz RAM EFR amplitude, while CS significantly impacts the response, reducing it by up to 81%. Thus, the RAM stimulus was designed to be minimally influenced by coexisting OHC damage, a conclusion supported by our human EFR recordings, which show a greater EFR amplitude difference between the yNHcontrol and the oNH group than between the oNH and oHI groups (see Fig. 3D).

We argue that when the stimulus is presented at sufficiently high levels to drive ANFs into saturation, the impact of the effective stimulus level, or supra-threshold audibility, on the EFR response is minimal. Although this effect was not systematically examined in our study, we reference the findings of Encina-Llamas et al. (2021). In their study, EFRs were recorded in response to 98-Hz-modulated SAM tones at various pure-tone frequencies and levels in young normal-hearing listeners (mean age: 24 ± 3.2 years) and older hearing-impaired listeners (mean age: 56.2 ± 12.7 years). The hearing-impaired participants had hearing thresholds of ≤20 dB HL below 4 kHz and between 20 and 45 dB HL for frequencies up to 8 kHz, which aligns closely with the audiometric profiles considered in our study.

By fitting two piecewise linear curves to the EFR magnitude growth curves between 20 and 80 dB SPL (expressed in dB per dB), Encina-Llamas et al. (2021) demonstrated that both NH and HI growth curves exhibited a compression breakpoint around 60 dB SPL (see Fig. 2E in their study). To examine this further, we replotted their original EFR data in μV to calculate growth slopes for stimulus levels above 60 dB SPL, as shown in Figure 8. Our analysis revealed an EFR growth slope of 0.002 μV per dB, which was similar for both NH and HI listeners. This indicates that EFR amplitudes increase by approximately 0.04 μV between stimulus levels of 60 and 80 dB SPL, and that differences in hearing sensitivity did not affect this process.
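The dB-to-μV conversion and supra-knee-point slope fit used in this reanalysis can be sketched as follows; the function names and knee-point handling are illustrative assumptions around the 10**(dB/20) transformation (re 1 μV) and the least-squares fit described in the text.

```python
import numpy as np

def db_to_microvolt(db):
    """Convert an EFR magnitude in dB (re 1 uV) to microvolts."""
    return 10.0 ** (np.asarray(db) / 20.0)

def growth_slope(levels_db_spl, efr_uv, knee=60.0):
    """Least-squares EFR growth slope (uV per dB) above the knee-point."""
    levels = np.asarray(levels_db_spl, dtype=float)
    efr = np.asarray(efr_uv, dtype=float)
    mask = levels >= knee
    slope, _ = np.polyfit(levels[mask], efr[mask], 1)
    return slope
```

Fitting in μV rather than dB-per-dB makes shallow supra-threshold growth, such as the 0.002-μV-per-dB slope reported here, directly comparable between NH and HI listeners.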

Figure 8.

Reanalysis of the Encina-Llamas et al. (2017) dataset reporting EFR magnitudes to a 98-Hz SAM tone of 4011 Hz. dB values were transformed into μV using a 10^(dB/20) transformation with 1 μV as the reference. Data from 13 NH (24 ± 3.2 y/o) and 7 HI (56.2 ± 12.7 y/o) listeners are shown. Only EFR amplitudes that were significantly above the noise floor are shown, and a linear fit was made across data points above the compression knee-point of 60 dB SPL. The NH cohort had audiometric thresholds below 15 dB HL for frequencies below 8 kHz; the HI cohort had thresholds ≤20 dB HL for frequencies below 4 kHz and between 20 and 45 dB HL for frequencies up to 8 kHz.

More importantly for our study, the dataset from Encina-Llamas et al. (2017) did not show significant group differences in EFR amplitude between NH and HI listeners at stimulation levels of 60 dB SPL or higher (see Table 4). Hearing sensitivity differences of up to 30 dB between NH and HI subjects for frequencies above 4 kHz thus did not significantly impact the supra-threshold EFR amplitude. This suggests that, when stimuli are presented at a fixed supra-threshold level above the knee-point of the EFR growth curve, variations in audibility are unlikely to influence the EFR amplitude, which instead reflects supra-threshold TENV coding and CS.

Table 4.

Statistical tests examining whether SAM-EFR amplitudes differ between subjects with and without impaired audiograms at different presentation levels (dB SPL)

Taken together, the model simulations (Vasilkov et al., 2021; Van Der Biest et al., 2023) and experimental findings from both Budgerigar and human studies support the conclusion that our EFR marker is sensitive to CS and is largely independent of hearing sensitivity differences when assessed at 70 dB SPL. Our human results demonstrated a clear age-related decline in EFR amplitudes, even in the absence of OHC damage. These findings align with animal research linking deficits in temporal coding at the earliest neural stages of the auditory pathway to progressive or noise-induced CS (Parthasarathy et al., 2014; Fernandez et al., 2015; Shaheen et al., 2015; Parthasarathy and Kujawa, 2018).

Context with prior studies

This study addressed and minimized several challenges that have complicated previous human research into the causal relationship between speech-in-noise intelligibility deficits and EFR markers of synaptopathy. One key issue in earlier studies has been the low sensitivity of the traditionally used SAM-based EFR metrics (Grose et al., 2017; Guest et al., 2018). In contrast, we demonstrate that the RAM-EFR is consistently larger in the same individuals and is more strongly affected by KA-induced CS than the conventional SAM-EFR (Fig. 3). This increased sensitivity makes the RAM-EFR more effective than the SAM-EFR for detecting individual differences in both CS and SRTs (Figs. 5 and 6).

Additionally, prior model simulations have shown that the 4-kHz SAM EFR is more influenced by OHC damage than the RAM EFR, highlighting the superior specificity of the RAM EFR for detecting CS (Vasilkov et al., 2021; Van Der Biest et al., 2023). By integrating these simulations with Budgerigar and human EFR recordings, we provided evidence that the two variables considered in the regression models, hearing sensitivity and the RAM-EFR marker of CS, can be treated as independent factors of SNHL.

Furthermore, by better aligning the frequency content of the EFR and speech stimuli to target auditory TENV mechanisms and their deficits, the models we presented are better suited than previous studies to isolate the independent contribution of CS to speech intelligibility in noise.

Confounding factors and study limitations

Since synaptopathy often precedes permanent OHC damage during the aging process (Sergeyenko et al., 2013; Fernandez et al., 2015; Parthasarathy and Kujawa, 2018) or following noise exposure (Kujawa and Liberman, 2009; Furman et al., 2013), markers of OHC damage may inadvertently serve as predictors of CS. Although these two pathologies are distinct in their origins and have different effects on auditory processing, they share common risk factors, such as age and noise exposure, and both are associated with reduced EFR amplitudes (Dimitrijevic et al., 2016; Parthasarathy and Kujawa, 2018; Garrett and Verhulst, 2019). This complexity underscores the importance of ensuring that a marker for CS is independent of OHC damage markers (e.g., THDP or THA) when assessing the specific role of CS in sound perception. While we provide strong evidence that the RAM-EFR marker primarily captures CS-related information, its effectiveness may still correlate with auditory threshold measures if individuals with reduced hearing sensitivity also experience CS as an early sign of SNHL.

Our multiple regression analysis revealed that SRTs are influenced by both CS and behavioral and DPOAE markers of hearing sensitivity (Table 3). In the oHI subgroup, the lack of a clear relationship between SRTSiN and the RAM EFR may be explained by the presence of OHC damage (Fig. 6C). While CS contributed to SRT deficits in all listeners within this subgroup, OHC damage likely exacerbated the age-related decline in SRT without being reflected in the EFR marker. Additionally, the group-level analysis (Fig. 4B) underscores the compounded impact of OHC damage on speech intelligibility in noise, as evidenced by significantly worse SRTSiN-HP performance in the oHI group compared to the yNHcontrol group. Lastly, our results show that the oNH group can perform as poorly on SiN processing as the oHI listeners, which has important consequences for clinical practice. Although the oNH group is currently not considered to have a hearing problem based on their audiogram, these listeners do experience speech processing deficits that should be quantified in clinical practice using either a SiN test or an EFR marker of CS that predicts their SRT scores.
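The kind of multiple regression summarized in Table 3 can be sketched as ordinary least squares on two predictors. Everything below is synthetic and illustrative: the variable names, coefficients, and sample size are assumptions for the sketch, not the study's values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 44  # illustrative sample size

# Synthetic predictors: RAM-EFR amplitude (uV) and a hearing-sensitivity marker (dB)
ram_efr = rng.uniform(0.05, 0.4, n)
threshold = rng.uniform(-5, 35, n)

# Synthetic SRTs constructed so both predictors contribute, plus noise:
# larger EFR amplitudes (less CS) give better (more negative) SRTs.
srt = -6.0 - 8.0 * ram_efr + 0.05 * threshold + rng.normal(0.0, 0.5, n)

# Ordinary least squares via a design matrix with an intercept column
X = np.column_stack([np.ones(n), ram_efr, threshold])
beta, *_ = np.linalg.lstsq(X, srt, rcond=None)

predicted = X @ beta
r_squared = 1 - np.sum((srt - predicted) ** 2) / np.sum((srt - np.mean(srt)) ** 2)
print("coefficients:", beta)
print("R^2:", round(r_squared, 2))
```

With real data, the relative sizes of the two fitted coefficients (after standardizing the predictors) indicate how much each marker contributes to the explained SRT variance.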

Dissociating general age effects from age-related CS remains challenging in human studies. Previous research involving older participants or those with significant noise-exposure histories has often reported electrophysiological evidence of CS or reduced temporal coding fidelity in the studied populations (e.g., Anderson et al., 2011, 2012; Konrad-Martin et al., 2012; Clinard and Tremblay, 2013; Schoof and Rosen, 2016; Bramhall et al., 2017; Valderrama et al., 2018; Bramhall et al., 2021). Within this context, our study confirms that age is inherently linked to the development of SNHL. However, in our age-restricted samples (OLD or yNHcontrol subgroups), age proved to be a poor predictor of the degree of CS. More broadly, age is associated with both OHC damage (Lin et al., 2011; ISO, 2017) and CS (Schmiedt et al., 1996; Makary et al., 2011; Konrad-Martin et al., 2012; Möhrle et al., 2016; Parthasarathy and Kujawa, 2018). This dual association likely contributes to the observed strong relationship between age and RAM-EFR when pooling data across groups (ρ(44) = −0.66, p < .001). Because age correlates with all facets of SNHL and its biomarkers, incorporating it into statistical models complicates the interpretation of the unique contributions of each pathology.
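A pooled rank correlation such as the age versus RAM-EFR association reported above can be computed from ranks alone. The sketch below uses synthetic data (the cohort values are illustrative, not the study's measurements), and the helper assumes no tied values.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Assumes no tied values (double argsort yields integer ranks).
    """
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(1)
# Synthetic cohort: EFR amplitude declines with age, plus measurement noise
age = rng.uniform(20, 70, 46)
ram_efr = 0.4 - 0.004 * age + rng.normal(0.0, 0.03, 46)

rho = spearman_rho(age, ram_efr)
print(f"rho = {rho:.2f}")  # strongly negative for this synthetic cohort
```

Because the correlation is computed on ranks, it captures any monotonic age-related decline, not only a linear one.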

Next, we address several aspects related to our study design. Current limitations in human experimentation prevent us from directly establishing a causal relationship between CS and speech intelligibility in living humans, necessitating reliance on indirect evidence to support our conclusions. To this end, we incorporated model simulations that examined the impact of various SNHL pathologies on the cochlear and neuronal generators of the EFR (Vasilkov et al., 2021; Van Der Biest et al., 2023). Additionally, we employed an animal model to demonstrate the sensitivity of our EFR markers to CS. The convergence of findings from model predictions and experimental results in several studies (e.g., Verhulst et al., 2018; Encina-Llamas et al., 2019; Keshishzadeh et al., 2020; Vasilkov et al., 2021; Buran et al., 2022) supports the validity of these approaches for the purposes of our study. However, it is important to acknowledge that current state-of-the-art models, while advanced, may not fully capture all cochlear and neural mechanisms underlying EFR generation. Moreover, species differences could confound direct comparisons between Budgerigar and human EFRs. Nevertheless, neural recordings from the Budgerigar inferior colliculus indicate that their auditory processing strategies are comparable to those of the mammalian midbrain (Henry et al., 2017), supporting the relevance of these findings to human research.

In our approach to isolate the contribution of CS to speech intelligibility deficits within the broader context of SNHL, we conducted a group analysis. This approach aimed to amplify the effects between the yNHcontrol group and older groups expected to exhibit age-related CS. However, pooling data from groups that differ in multiple factors can introduce additional between-group explanatory variables that were not explicitly controlled. For instance, cognitive factors such as memory and attention, which are known to decline with age and are associated with speech-in-noise comprehension (Humes et al., 2010; Humes, 2013; Yeend et al., 2017), likely account for some of the unexplained variance in the multiple regression models (Table 3). This is particularly relevant as the analyzed groups had substantial age differences. However, the matrix test was designed to mitigate memory effects by randomly generating word sequences and minimizing cognitive load through the immediate recall of only five words at a time. Additionally, EFR recordings in response to high-modulation-frequency stimuli are considered largely free from top-down attention (e.g., Varghese et al., 2015) or memory influences. The robust correlations observed between RAM-EFR amplitudes and SRTSiN-HP scores suggest that cognitive factors are not the primary driving force behind the findings, but instead interact with peripheral auditory encoding deficits (Johannesen et al., 2016). We contend that, for the purposes of this study, pooling data from carefully defined homogeneous groups is both valid and necessary to disentangle the relative contributions of OHC damage and CS to speech intelligibility deficits. Future studies which consider an age-gradient across a larger cohort of study participants may further shed light on the dynamics between age, CS and speech intelligibility.

We propose the following considerations for future study designs. Recent research on ANF coding (Henry et al., 2016; Encina-Llamas et al., 2019; Vasilkov and Verhulst, 2019) has demonstrated that for supra-threshold stimulation, TENV information can be encoded via the tails of high-spontaneous-rate ANF tuning curves in basal cochlear regions. This suggests that the neural generators of the 4-kHz EFR may encompass a broader frequency range than expected from the stimulus’ narrow-frequency profile and its basilar membrane excitation. Such a phenomenon could impact the intended alignment between the cochlear frequency regions activated by the EFR marker and those involved in processing speech stimuli, particularly under the assumption that both are governed by similar TENV mechanisms. While our study design improved this alignment by transitioning from the broadband (BB) condition to the high-pass (HP) condition, additional refinements could further enhance the interpretation of how reduced TENV coding contributes to speech intelligibility deficits. For instance, introducing high-frequency masking noise to both the speech and EFR stimuli could better isolate the frequency regions of interest and improve alignment.

Conclusion

We conclude that age-related synaptopathy is a significant hearing health concern, as both experimental and theoretical evidence demonstrate its substantial impact on auditory TENV coding and speech intelligibility, especially in noise. Sensitive diagnostic tools are essential for understanding the role of synaptopathy in impaired sound perception. The RAM-EFR marker, which proved to be robust, widely applicable, and selective for CS, was instrumental in reaching these conclusions. Given the independent contribution of synaptopathy to age-related speech perception deficits, particularly in noisy environments, future therapeutic interventions can be developed to address synaptopathy and mitigate its functional consequences.

Footnotes

  • Ghent University owns a patent (US Patent App. 17/791,985) related to the RAM-EFR methods adopted in this paper. Sarah Verhulst and Viacheslav Vasilkov are inventors.

  • This work was supported by the DFG Cluster of Excellence EXC 1077/1 “Hearing4all” (MG, MM, SV), the European Research Council (ERC) under the Horizon 2020 Research and Innovation Programme (grant agreement No 678120 RobSpear; VV, SV) and European Innovation Council (EIC-Transition EarDiTech 101058278; SV), National Institutes of Health grant R01 DC017519 (KH) and a National Institutes of Health Predoctoral National Research Service Award Fellowship (TL1 TR002000) administered by the University of Rochester Clinical and Translational Science Institute (JW).

  • The authors would like to thank the study participants as well as the Hörzentrum Oldenburg for helping with participant recruitment. Lastly, we thank Sarineh Keshishzadeh for help with the analysis scripts and data storage and labelling throughout the project, and Attila Fráter for help with the reanalysis of the Encina-Llamas data.

  • ↵*These authors contributed equally to the work.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.

References

  1. Abdala C, Ortmann AJ, Guardia YC (2021) Weakened cochlear nonlinearity during human aging and perceptual correlates. Ear Hear 42:832–845. https://doi.org/10.1097/AUD.0000000000001014
  2. Anderson S, Parbery-Clark A, White-Schwoch T, Kraus N (2012) Aging affects neural precision of speech encoding. J Neurosci 32:14156–14164. https://doi.org/10.1523/JNEUROSCI.2176-12.2012
  3. Anderson S, Parbery-Clark A, Yi H-G, Kraus N (2011) A neural basis of speech-in-noise perception in older adults. Ear Hear 32:750–757. https://doi.org/10.1097/AUD.0b013e31822229d3
  4. Bharadwaj HM, Masud S, Mehraei G, Verhulst S, Shinn-Cunningham BG (2015) Individual differences reveal correlates of hidden hearing deficits. J Neurosci 35:2161–2172. https://doi.org/10.1523/JNEUROSCI.3915-14.2015
  5. Bharadwaj HM, Verhulst S, Shaheen L, Liberman MC, Shinn-Cunningham BG (2014) Cochlear neuropathy and the coding of supra-threshold sound. Front Syst Neurosci 8:26. https://doi.org/10.3389/fnsys.2014.00026
  6. Bledsoe SC, Bobbin RP, Chihal DM (1981) Kainic acid: an evaluation of its action on cochlear potentials. Hear Res 4:109–120. https://doi.org/10.1016/0378-5955(81)90040-X
  7. Boege P, Janssen T (2002) Pure-tone threshold estimation from extrapolated distortion product otoacoustic emission I/O-functions in normal and cochlear hearing loss ears. J Acoust Soc Am 111:1810–1818. https://doi.org/10.1121/1.1460923
  8. Borjigin A, Bharadwaj HM (2023) Individual differences reveal the utility of temporal fine-structure processing for speech perception in noise. bioRxiv.
  9. Bramhall N, Beach EF, Epp B, Le Prell CG, Lopez-Poveda EA, Plack CJ, Schaette R, Verhulst S, Canlon B (2019) The search for noise-induced cochlear synaptopathy in humans: mission impossible? Hear Res 377:88–103. https://doi.org/10.1016/j.heares.2019.02.016
  10. Bramhall NF, Konrad-Martin D, McMillan GP, Griest SE (2017) Auditory brainstem response altered in humans with noise exposure despite normal outer hair cell function. Ear Hear 38:e1–e12. https://doi.org/10.1097/AUD.0000000000000370
  11. Bramhall NF, McMillan GP, Kampel SD (2021) Envelope following response measurements in young veterans are consistent with noise-induced cochlear synaptopathy. Hear Res 408:108310. https://doi.org/10.1016/j.heares.2021.108310
  12. Bramhall N, Ong B, Ko J, Parker M (2015) Speech perception ability in noise is correlated with auditory brainstem response wave I amplitude. J Am Acad Audiol 26:509–517. https://doi.org/10.3766/jaaa.14100
  13. Brand T, Kollmeier B (2002) Efficient adaptive procedures for threshold and concurrent slope estimates for psychophysics and speech intelligibility tests. J Acoust Soc Am 111:2801–2810. https://doi.org/10.1121/1.1479152
  14. Buran BN, McMillan GP, Keshishzadeh S, Verhulst S, Bramhall NF (2022) Predicting synapse counts in living humans by combining computational models with auditory physiology. J Acoust Soc Am 151:561–576. https://doi.org/10.1121/10.0009238
  15. Clinard CG, Tremblay KL (2013) Aging degrades the neural encoding of simple and complex sounds in the human brainstem. J Am Acad Audiol 24:590–599. https://doi.org/10.3766/jaaa.24.7.7
  16. Dent ML, Dooling RJ, Pierce AS (2000) Frequency discrimination in budgerigars (Melopsittacus undulatus): effects of tone duration and tonal context. J Acoust Soc Am 107:2657–2664. https://doi.org/10.1121/1.428651
  17. Derrick B (2017) How to compare the means of two samples that include paired observations and independent observations: a companion to Derrick, Russ, Toher and White (2017). Quant Methods Psychol 13:120–126. https://doi.org/10.20982/tqmp.13.2.p120
  18. Dimitrijevic A, Alsamri J, John MS, Purcell D, George S, Zeng F-G (2016) Human envelope following responses to amplitude modulation: effects of aging and modulation depth. Ear Hear 37:e322–e335. https://doi.org/10.1097/AUD.0000000000000324
  19. DiNino M, Holt LL, Shinn-Cunningham BG (2022) Cutting through the noise: noise-induced cochlear synaptopathy and individual differences in speech understanding among listeners with normal audiograms. Ear Hear 43:9–22. https://doi.org/10.1097/AUD.0000000000001147
  20. Dooling RJ, Lohr B, Dent ML (2000) Hearing in birds and reptiles. In: Comparative hearing: birds and reptiles, Springer handbook of auditory research (Dooling RJ, Fay RR, Popper AN, eds), pp 308–359. New York, NY: Springer.
  21. Dreyer A, Delgutte B (2006) Phase locking of auditory-nerve fibers to the envelopes of high-frequency sounds: implications for sound localization. J Neurophysiol 96:2327–2341. https://doi.org/10.1152/jn.00326.2006
  22. Encina-Llamas G, Dau T, Epp B (2021) On the use of envelope following responses to estimate peripheral level compression in the auditory system. Sci Rep 11:6962. https://doi.org/10.1038/s41598-021-85850-x
  23. Encina-Llamas G, Harte JM, Dau T, Shinn-Cunningham B, Epp B (2019) Investigating the effect of cochlear synaptopathy on envelope following responses using a model of the auditory nerve. J Assoc Res Otolaryngol 20:363–382. https://doi.org/10.1007/s10162-019-00721-7
  24. Encina-Llamas G, Parthasarathy A, Harte JM, Dau T, Kujawa SG, Shinn-Cunningham B, Epp B (2017) Hidden hearing loss with envelope following responses (EFRs): the off-frequency problem. Baltimore, United States.
  25. Fernandez KA, Jeffers PWC, Lall K, Liberman MC, Kujawa SG (2015) Aging after noise exposure: acceleration of cochlear synaptopathy in recovered ears. J Neurosci 35:7509–7520. https://doi.org/10.1523/JNEUROSCI.5138-14.2015
  26. Festen JM, Plomp R (1983) Relations between auditory functions in impaired hearing. J Acoust Soc Am 73:652–662. https://doi.org/10.1121/1.388957
  27. Fitzgerald MB, Ward KM, Gianakas SP, Smith ML, Blevins NH, Swanson AP (2024) Speech-in-noise assessment in the routine audiologic test battery: relationship to perceived auditory disability. Ear Hear 45:816–826. https://doi.org/10.1097/AUD.0000000000001472
  28. Furman AC, Kujawa SG, Liberman MC (2013) Noise-induced cochlear neuropathy is selective for fibers with low spontaneous rates. J Neurophysiol 110:577–586. https://doi.org/10.1152/jn.00164.2013
  29. Garrett M, Verhulst S (2019) Applicability of subcortical EEG metrics of synaptopathy to older listeners with impaired audiograms. Hear Res 380:150–165. https://doi.org/10.1016/j.heares.2019.07.001
  30. Gramfort A, et al. (2013) MEG and EEG data analysis with MNE-Python. Front Neurosci 7:267. https://doi.org/10.3389/fnins.2013.00267
  31. Gramfort A, Luessi M, Larson E, Engemann DA, Strohmeier D, Brodbeck C, Parkkonen L, Hämäläinen MS (2014) MNE software for processing MEG and EEG data. NeuroImage 86:446–460. https://doi.org/10.1016/j.neuroimage.2013.10.027
  32. Grose JH, Buss E, Hall JW (2017) Loud music exposure and cochlear synaptopathy in young adults: isolated auditory brainstem response effects but no perceptual consequences. Trends Hear 21:2331216517737417. https://doi.org/10.1177/2331216517737417
  33. Guest H, Munro KJ, Prendergast G, Millman RE, Plack CJ (2018) Impaired speech perception in noise with a normal audiogram: no evidence for cochlear synaptopathy and no relation to lifetime noise exposure. Hear Res 364:142–151. https://doi.org/10.1016/j.heares.2018.03.008
  34. Henry KS, Abrams KS (2018) Persistent auditory nerve damage following kainic acid excitotoxicity in the budgerigar (Melopsittacus undulatus). J Assoc Res Otolaryngol 19:435–449. https://doi.org/10.1007/s10162-018-0671-y
  35. Henry KS, Abrams KS, Forst J, Mender MJ, Neilans EG, Idrobo F, Carney LH (2017) Midbrain synchrony to envelope structure supports behavioral sensitivity to single-formant vowel-like sounds in noise. J Assoc Res Otolaryngol 18:165–181. https://doi.org/10.1007/s10162-016-0594-4
  36. Henry KS, Kale S, Heinz MG (2016) Distorted tonotopic coding of temporal envelope and fine structure with noise-induced hearing loss. J Neurosci 36:2227–2237. https://doi.org/10.1523/JNEUROSCI.3944-15.2016
  37. Hickox AE, Larsen E, Heinz MG, Shinobu L, Whitton JP (2017) Translational issues in cochlear synaptopathy. Hear Res 349:164–171. https://doi.org/10.1016/j.heares.2016.12.010
  38. Hopkins K, Moore BCJ (2010) The importance of temporal fine structure information in speech at different spectral regions for normal-hearing and hearing-impaired subjects. J Acoust Soc Am 127:1595–1608. https://doi.org/10.1121/1.3293003
  39. Hopkins K, Moore B, Stone M (2008) Effects of moderate cochlear hearing loss on the ability to benefit from temporal fine structure information in speech. J Acoust Soc Am 123:1140–1153. https://doi.org/10.1121/1.2824018
  40. Humes LE (2013) Understanding the speech-understanding problems of older adults. Am J Audiol 22:303–305. https://doi.org/10.1044/1059-0889(2013/12-0066)
  41. Humes LE, Kewley-Port D, Fogerty D, Kinney D (2010) Measures of hearing threshold and temporal processing across the adult lifespan. Hear Res 264:30–40. https://doi.org/10.1016/j.heares.2009.09.010
  42. Humphrey R (2008) Playrec, multi-channel MATLAB audio.
  43. ISO (2017) Acoustics: statistical distribution of hearing thresholds related to age and gender (ISO 7029). Switzerland: International Organization for Standardization.
  44. Johannesen PT, Buzo BC, Lopez-Poveda EA (2019) Evidence for age-related cochlear synaptopathy in humans unconnected to speech-in-noise intelligibility deficits. Hear Res 374:35–48. https://doi.org/10.1016/j.heares.2019.01.017
  45. Johannesen PT, Pérez-González P, Kalluri S, Blanco JL, Lopez-Poveda EA (2016) The influence of cochlear mechanical dysfunction, temporal processing deficits, and age on the intelligibility of audible speech in noise for hearing-impaired listeners. Trends Hear 20:2331216516641055. https://doi.org/10.1177/2331216516641055
  46. Joris PX, Verschooten E (2013) On the limit of neural phase locking to fine structure in humans. Adv Exp Med Biol 787:101–108. https://doi.org/10.1007/978-1-4614-1590-9_12
  47. Joris PX, Yin TC (1992) Responses to amplitude-modulated tones in the auditory nerve of the cat. J Acoust Soc Am 91:215–232. https://doi.org/10.1121/1.402757
  48. Keshishzadeh S, Garrett M, Vasilkov V, Verhulst S (2020) The derived-band envelope following response and its sensitivity to sensorineural hearing deficits. Hear Res 392:107979. https://doi.org/10.1016/j.heares.2020.107979
  49. Kobel M, Le Prell CG, Liu J, Hawks JW, Bao J (2017) Noise-induced cochlear synaptopathy: past findings and future studies. Hear Res 349:148–154. https://doi.org/10.1016/j.heares.2016.12.008
  50. Konrad-Martin D, Dille MF, McMillan G, Griest S, McDermott D, Fausti SA, Austin DF (2012) Age-related changes in the auditory brainstem response. J Am Acad Audiol 23:18–75. https://doi.org/10.3766/jaaa.23.1.3
  51. Kraus N, Anderson S, White-Schwoch T (2017) The frequency-following response: a window into human communication. In: The frequency-following response, Springer handbook of auditory research (Kraus N, Anderson S, White-Schwoch T, Fay R, Popper A, eds), Vol 61. Cham: Springer. https://doi.org/10.1007/978-3-319-47944-6_1
  52. Kujawa SG, Liberman MC (2009) Adding insult to injury: cochlear nerve degeneration after temporary noise-induced hearing loss. J Neurosci 29:14077–14085. https://doi.org/10.1523/JNEUROSCI.2845-09.2009
  53. Kummer P, Janssen T, Arnold W (1998) The level and growth behavior of the 2f1−f2 distortion product otoacoustic emission and its relationship to auditory sensitivity in normal hearing and cochlear hearing loss. J Acoust Soc Am 103:3431–3444. https://doi.org/10.1121/1.423054
  54. Lenth RV (2016) Least-squares means: the R package lsmeans. J Stat Softw 69:1–33. https://doi.org/10.18637/jss.v069.i01
  55. Le Prell CG (2019) Effects of noise exposure on auditory brainstem response and speech-in-noise tasks: a review of the literature. Int J Audiol 58:S3–S32. https://doi.org/10.1080/14992027.2018.1534010
  56. Liberman MC, Epstein MJ, Cleveland SS, Wang H, Maison SF (2016) Toward a differential diagnosis of hidden hearing loss in humans. PLoS One 11:e0162726. https://doi.org/10.1371/journal.pone.0162726
  57. Lin FR, Thorpe R, Gordon-Salant S, Ferrucci L (2011) Hearing loss prevalence and risk factors among older adults in the United States. J Gerontol A Biol Sci Med Sci 66:582–590. https://doi.org/10.1093/gerona/glr002
  58. Long GR, Talmadge CL, Lee J (2008) Measuring distortion product otoacoustic emissions using continuously sweeping primaries. J Acoust Soc Am 124:1613–1626. https://doi.org/10.1121/1.2949505
  59. Lopez-Poveda EA, Barrios P (2013) Perception of stochastically undersampled sound waveforms: a model of auditory deafferentation. Front Neurosci 7:124. https://doi.org/10.3389/fnins.2013.00124
  60. Lorenzi C, Gilbert G, Carn H, Garnier S, Moore BCJ (2006) Speech perception problems of the hearing impaired reflect inability to use temporal fine structure. Proc Natl Acad Sci U S A 103:18866–18869. https://doi.org/10.1073/pnas.0607364103
  61. Mai G, Howell P (2023) The possible role of early-stage phase-locked neural activities in speech-in-noise perception in human adults across age and hearing loss. Hear Res 427:108647. https://doi.org/10.1016/j.heares.2022.108647
  62. Makary CA, Shin J, Kujawa SG, Liberman MC, Merchant SN (2011) Age-related primary cochlear neuronal degeneration in human temporal bones. J Assoc Res Otolaryngol 12:711–717. https://doi.org/10.1007/s10162-011-0283-2
  63. Mauermann M (2013) Improving the usability of the distortion product otoacoustic emissions (DPOAE)-sweep method: an alternative artifact rejection and noise-floor estimation. J Acoust Soc Am 133:3376. https://doi.org/10.1121/1.4805803
  64. Mauermann M, Kollmeier B (2004) Distortion product otoacoustic emission (DPOAE) input/output functions and the influence of the second DPOAE source. J Acoust Soc Am 116:2199–2212. https://doi.org/10.1121/1.1791719
  65. Mehraei G, Hickox AE, Bharadwaj HM, Goldberg H, Verhulst S, Liberman MC, Shinn-Cunningham BG (2016) Auditory brainstem response latency in noise as a marker of cochlear synaptopathy. J Neurosci 36:3755–3764. https://doi.org/10.1523/JNEUROSCI.4460-15.2016
  66. Mepani AM, Verhulst S, Hancock KE, Garrett M, Vasilkov V, Bennett K, de Gruttola V, Liberman MC, Maison SF (2021) Envelope following responses predict speech-in-noise performance in normal-hearing listeners. J Neurophysiol 125:1213–1222. https://doi.org/10.1152/jn.00620.2020
  67. Millman KJ, Aivazis M (2011) Python for scientists and engineers. Comput Sci Eng 13:9–12. https://doi.org/10.1109/MCSE.2011.36
  68. Mitchell C, Phillips DS, Trune DR (1989) Variables affecting the auditory brainstem response: audiogram, age, gender and head size. Hear Res 40:75–85. https://doi.org/10.1016/0378-5955(89)90101-9
  69. Möhrle D, Ni K, Varakina K, Bing D, Lee SC, Zimmermann U, Knipper M, Rüttiger L (2016) Loss of auditory sensitivity from inner hair cell synaptopathy can be centrally compensated in the young but not old brain. Neurobiol Aging 44:173–184. https://doi.org/10.1016/j.neurobiolaging.2016.05.001
  70. Neely ST, Johnson TA, Kopun J, Dierking DM, Gorga MP (2009) Distortion-product otoacoustic emission input/output characteristics in normal-hearing and hearing-impaired human ears. J Acoust Soc Am 126:728–738. https://doi.org/10.1121/1.3158859
  71. Newton RG, Spurrell DJ (1967) A development of multiple regression for the analysis of routine data. J R Stat Soc Ser C Appl Stat 16:51–64. https://doi.org/10.2307/2985237
  72. Nimon K, Lewis M,
    3. Kane R,
    4. Haynes RM
    (2008) An R package to compute commonality coefficients in the multiple regression case: an introduction to the package and a practical example. Behav Res Methods 40:457–466. https://doi.org/10.3758/BRM.40.2.457
    OpenUrlCrossRefPubMed
  73. ↵
    1. Oliphant TE
    (2007) Python for scientific computing. Comput Sci Eng 9:10–20. https://doi.org/10.1109/MCSE.2007.58
    OpenUrlCrossRef
  74. ↵
    1. Papakonstantinou A,
    2. Strelcyk O,
    3. Dau T
    (2011) Relations between perceptual measures of temporal processing, auditory-evoked brainstem responses and speech intelligibility in noise. Hear Res 280:30–37. https://doi.org/10.1016/j.heares.2011.02.005
    OpenUrlCrossRefPubMed
  75. ↵
    1. Parthasarathy A,
    2. Datta J,
    3. Torres JAL,
    4. Hopkins C,
    5. Bartlett EL
    (2014) Age-related changes in the relationship between auditory brainstem responses and EFRs. J Assoc Res Otolaryngol JARO 15:649–661. https://doi.org/10.1007/s10162-014-0460-1 pmid:24845405
    OpenUrlCrossRefPubMed
  76. ↵
    1. Parthasarathy A,
    2. Kujawa SG
    (2018) Synaptopathy in the aging cochlea: characterizing early-neural deficits in auditory temporal envelope processing. J Neurosci Nurs 38:7108–7119. https://doi.org/10.1523/JNEUROSCI.3240-17.2018 pmid:29976623
    OpenUrlAbstract/FREE Full Text
  77. ↵
    1. Picton TW
    (2010) Human auditory evoked potentials. San Diego, CA: Plural Publishing, Inc.
  78. ↵
    1. Plack CJ,
    2. Léger A,
    3. Prendergast G,
    4. Kluk K,
    5. Guest H,
    6. Munro KJ
    (2016) Toward a diagnostic test for hidden hearing loss. Trends Hear 20:2331216516657466. https://doi.org/10.1177/2331216516657466 pmid:27604783
    OpenUrlCrossRefPubMed
  79. ↵
    1. Plomp R
    (1986) A signal-to-noise ratio model for the speech-reception threshold of the hearing impaired. J Speech Lang Hear Res 29:146–154. https://doi.org/10.1044/jshr.2902.146
    OpenUrlCrossRefPubMed
  80. ↵
    1. Prendergast G,
    2. Millman RE,
    3. Guest H,
    4. Munro KJ,
    5. Kluk K,
    6. Dewey RS,
    7. Hall DA,
    8. Heinz MG,
    9. Plack CJ
    (2017) Effects of noise exposure on young adults with normal audiograms II: behavioral measures. Hear Res 356:74–86. https://doi.org/10.1016/j.heares.2017.10.007 pmid:29126651
    OpenUrlCrossRefPubMed
  81. ↵
    1. Pujol R,
    2. Lenoir M,
    3. Robertson D,
    4. Eybalin M,
    5. Johnstone BM
    (1985) Kainic acid selectively alters auditory dendrites connected with cochlear inner hair cells. Hear Res 18:145–151. https://doi.org/10.1016/0378-5955(85)90006-1
    OpenUrlCrossRefPubMed
  82. ↵
    R Core Team (2019) R: A language and environment for statistical computing.
  83. ↵
    1. Ray-Mukherjee J,
    2. Nimon K,
    3. Mukherjee S,
    4. Morris DW,
    5. Slotow R,
    6. Hamer M
    (2014) Using commonality analysis in multiple regressions: a tool to decompose regression effects in the face of multicollinearity. Methods Ecol Evol 5:320–328. https://doi.org/10.1111/2041-210X.12166
    OpenUrlCrossRef
  84. ↵
    1. Schmiedt RA,
    2. Mills JH,
    3. Boettcher FA
    (1996) Age-related loss of activity of auditory-nerve fibers. J Neurophysiol 76:2799–2803. https://doi.org/10.1152/jn.1996.76.4.2799
    OpenUrlCrossRefPubMed
  85. ↵
    1. Schoof T,
    2. Rosen S
    (2016) The role of age-related declines in subcortical auditory processing in speech perception in noise. J Assoc Res Otolaryngol JARO 17:441–460. https://doi.org/10.1007/s10162-016-0564-x pmid:27216166
    OpenUrlCrossRefPubMed
  86. ↵
    1. Sergeyenko Y,
    2. Lall K,
    3. Liberman MC,
    4. Kujawa SG
    (2013) Age-related cochlear synaptopathy: an early-onset contributor to auditory functional decline. J Neurosci Nurs 33:13686–13694. https://doi.org/10.1523/JNEUROSCI.1783-13.2013 pmid:23966690
    OpenUrlAbstract/FREE Full Text
  87. ↵
    1. Shaheen LA,
    2. Valero MD,
    3. Liberman MC
    (2015) Towards a diagnosis of cochlear neuropathy with envelope following responses. J Assoc Res Otolaryngol JARO 16:727–745. https://doi.org/10.1007/s10162-015-0539-3 pmid:26323349
    OpenUrlCrossRefPubMed
  88. ↵
    1. Stevens G,
    2. Flaxman S,
    3. Brunskill E,
    4. Mascarenhas M,
    5. Mathers CD,
    6. Finucane M
    , Global Burden of Disease Hearing Loss Expert Group (2013) Global and regional hearing impairment prevalence: an analysis of 42 studies in 29 countries. Eur J Public Health 23:146–152. https://doi.org/10.1093/eurpub/ckr176
    OpenUrlCrossRefPubMed
  89. ↵
    1. Sun H,
    2. Hashino E,
    3. Ding DL,
    4. Salvi RJ
    (2001) Reversible and irreversible damage to cochlear afferent neurons by kainic acid excitotoxicity. J Comp Neurol 430:172–181. https://doi.org/10.1002/1096-9861(20010205)430:2<172::AID-CNE1023>3.0.CO;2-W
    OpenUrlCrossRefPubMed
  90. ↵
    1. Sun H,
    2. Salvi RJ,
    3. Ding DL,
    4. Hashino DE,
    5. Shero M,
    6. Zheng XY
    (2000) Excitotoxic effect of kainic acid on chicken otoacoustic emissions and cochlear potentials. J Acoust Soc Am 107:2136–2142. https://doi.org/10.1121/1.428495
    OpenUrlCrossRefPubMed
  91. ↵
    1. Trune DR,
    2. Mitchell C,
    3. Phillips DS
    (1988) The relative importance of head size, gender and age on the auditory brainstem response. Hear Res 32:165–174. https://doi.org/10.1016/0378-5955(88)90088-3
    OpenUrlCrossRefPubMed
  92. ↵
    1. Valderrama JT,
    2. Beach EF,
    3. Yeend I,
    4. Sharma M,
    5. Van Dun B,
    6. Dillon H
    (2018) Effects of lifetime noise exposure on the middle-age human auditory brainstem response, tinnitus and speech-in-noise intelligibility. Hear Res 365:36–48. https://doi.org/10.1016/j.heares.2018.06.003
    OpenUrlCrossRefPubMed
  93. ↵
    1. Van Der Biest H,
    2. Keshishzadeh S,
    3. Keppler H,
    4. Dhooge I,
    5. Verhulst S
    (2023) Envelope following responses for hearing diagnosis: robustness and methodological considerations. J Acoust Soc Am 153:191–208. https://doi.org/10.1121/10.0016807
    OpenUrlCrossRefPubMed
  94. ↵
    1. Varghese L,
    2. Bharadwaj HM,
    3. Shinn-Cunningham BG
    (2015) Evidence against attentional state modulating scalp-recorded auditory brainstem steady-state responses. Brain Res 1626:146–164. https://doi.org/10.1016/j.brainres.2015.06.038 pmid:26187756
    OpenUrlCrossRefPubMed
  95. ↵
    1. Vasilkov V,
    2. Garrett M,
    3. Mauermann M,
    4. Verhulst S
    (2021) Enhancing the sensitivity of the envelope-following response for cochlear synaptopathy screening in humans: the role of stimulus envelope. Hear Res 400:108132. https://doi.org/10.1016/j.heares.2020.108132
    OpenUrlCrossRefPubMed
  96. ↵
    1. Vasilkov V,
    2. Verhulst S
    (2019) Towards a differential diagnosis of cochlear synaptopathy and outer-hair-cell deficits in mixed sensorineural hearing loss pathologies. medRxiv.
  97. ↵
    1. Verhulst S,
    2. Ernst F,
    3. Garrett M,
    4. Vasilkov V
    (2018) Supra-threshold psychoacoustics and envelope-following response relations: normal-hearing, synaptopathy and cochlear gain loss. Acta Acustica United Acustica 104:800–803. https://doi.org/10.3813/AAA.919227
    OpenUrl
  98. ↵
    1. Verhulst S,
    2. Jagadeesh A,
    3. Mauermann M,
    4. Ernst F
    (2016a) Individual differences in auditory brainstem response wave characteristics: relations to different aspects of peripheral hearing loss. Trends Hear 20:1–20. https://doi.org/10.1177/2331216516672186 pmid:27837052
    OpenUrlCrossRefPubMed
  99. ↵
    1. Verhulst S,
    2. Piktel P,
    3. Jagadeesh A,
    4. Mauermann M
    (2016b) On the interplay between cochlear gain loss and temporal envelope coding deficits. Adv Exp Med Biol 894:467–475.
    OpenUrlCrossRefPubMed
  100. ↵
    1. Verschooten E,
    2. Robles L,
    3. Joris PX
    (2015) Assessment of the limits of neural phase-locking using mass potentials. J Neurosci Official J Soc Neurosci 35:2255–2268. https://doi.org/10.1523/JNEUROSCI.2979-14.2015 pmid:25653380
    OpenUrlAbstract/FREE Full Text
  101. ↵
    1. Viana LM,
    2. O’Malley JT,
    3. Burgess BJ,
    4. Jones DD,
    5. Oliveira CACP,
    6. Santos F,
    7. Merchant SN,
    8. Liberman LD,
    9. Liberman MC
    (2015) Cochlear neuropathy in human presbycusis: confocal analysis of hidden hearing loss in post-mortem tissue. Hear Res 327:78–88. https://doi.org/10.1016/j.heares.2015.04.014 pmid:26002688
    OpenUrlCrossRefPubMed
  102. ↵
    1. Wagener KC,
    2. Kühnel V,
    3. Kollmeier B
    (1999) Entwicklung und evaluation eines satztests für die deutsche sprache III: evaluation des oldenburger satztests. Zeitschrift für Audiol / Audiol Acoustics 38:4–15.
    OpenUrl
  103. ↵
    1. Wang Y,
    2. Abrams KS,
    3. Youngman M,
    4. Henry KS
    (2023) Histological correlates of auditory nerve injury from kainic acid in the budgerigar (Melopsittacus undulatus). J Assoc Res Otolaryngol 24:473–485. https://doi.org/10.1007/s10162-023-00910-5 pmid:37798548
    OpenUrlCrossRefPubMed
  104. ↵
    1. Wilson JL,
    2. Abrams KS,
    3. Henry KS
    (2021) Effects of kainic acid-induced auditory nerve damage on envelope-following responses in the budgerigar (Melopsittacus undulatus). J Assoc Res Otolaryngol 22:33–49. https://doi.org/10.1007/s10162-020-00776-x pmid:33078291
    OpenUrlCrossRefPubMed
  105. ↵
    WHO (2019) Deafness and hearing loss.
  106. ↵
    1. Wong SJ,
    2. Abrams KS,
    3. Amburgey KN,
    4. Wang Y,
    5. Henry KS
    (2019) Effects of selective auditory-nerve damage on the behavioral audiogram and temporal integration in the budgerigar. Hear Res 374:24–34. https://doi.org/10.1016/j.heares.2019.01.019 pmid:30703625
    OpenUrlCrossRefPubMed
  107. ↵
    1. Wu PZ,
    2. Liberman LD,
    3. Bennett K,
    4. de Gruttola V,
    5. O’Malley JT,
    6. Liberman MC
    (2018) Primary neural degeneration in the human cochlea: evidence for hidden hearing loss in the aging ear. Neuroscience 407:8–20. https://doi.org/10.1016/j.neuroscience.2018.07.053 pmid:30099118
    OpenUrlCrossRefPubMed
  108. ↵
    1. Yeend I,
    2. Beach EF,
    3. Sharma M,
    4. Dillon H
    (2017) The effects of noise exposure and musical training on suprathreshold auditory processing and speech perception in noise. Hear Res 353:224–236. https://doi.org/10.1016/j.heares.2017.07.006
    OpenUrlCrossRefPubMed
  109. ↵
    1. Zheng XY,
    2. Henderson D,
    3. Hu BH,
    4. McFadden SL
    (1997) Recovery of structure and function of inner ear afferent synapses following kainic acid excitotoxicity. Hear Res 105:65–76. https://doi.org/10.1016/S0378-5955(96)00188-8
    OpenUrlCrossRefPubMed
  110. ↵
    1. Zhu L,
    2. Bharadwaj H,
    3. Xia J,
    4. Shinn-Cunningham B
    (2013) A comparison of spectral magnitude and phase-locking value analyses of the frequency-following response to complex tones. J Acoust Soc Am 134:384–395. https://doi.org/10.1121/1.4807498 pmid:23862815
    OpenUrlCrossRefPubMed

Abbreviations

ABR, auditory brainstem response; AEP, auditory-evoked potential; AN(F), auditory nerve (fiber); BB, broadband; CAP, compound action potential; CF, characteristic frequency; DPOAE, distortion product otoacoustic emission; EEG, electroencephalography; EFR, envelope-following response; HI, hearing impaired; HP, high-pass filtered; IHC, inner hair cell; KA, kainic acid; LP, low-pass filtered; NH, normal hearing; OHC, outer hair cell; peSPL, peak-equivalent sound pressure level; PT, pure-tone; RAM, rectangularly amplitude-modulated; SAM, sinusoidally amplitude-modulated; SiN, speech in noise; SiQ, speech in quiet; SNHL, sensorineural hearing loss; SNR, signal-to-noise ratio; SPL, sound pressure level; SRT, speech reception threshold; TENV, temporal envelope; TFS, temporal fine structure; THA, audiometric hearing threshold; THDP, distortion-product otoacoustic emission threshold.

Synthesis

Reviewing Editor: Christine Portfors, Washington State University

Decisions are customarily a result of the Reviewing Editor and the peer reviewers coming together and discussing their recommendations until a consensus is reached. When revisions are invited, a fact-based synthesis statement explaining their decision and outlining what is needed to prepare a revision will be listed below. The following reviewer(s) agreed to reveal their identity: Zsuzsanna Kocsis.

Thank you for addressing the previous reviewer comments.

Author Response

Dear Dr. Portfors and Reviewers, We sincerely thank you for your thorough evaluation of our manuscript. As reflected in both the revised manuscript and this rebuttal letter, we have carefully addressed the reviewers' comments, resulting in significant improvements to the scientific quality, logic, and readability of our work. Implementing the necessary changes, including an updated statistical analysis, redrafting all figures, conducting an additional analysis using a public dataset, and revising the manuscript text, took slightly longer than anticipated.

We hope that, after reviewing our responses (in blue) alongside your original comments (in black) and the revised manuscript, you will agree that these changes have strengthened our study. We kindly ask you to consider this manuscript for publication in eNeuro.

With best regards,
The authors

REVIEWS AND REBUTTAL:

------------------------------------------------------------------------

Synthesis of Reviews:

Computational Neuroscience Model Code Accessibility Comments for Author (Required):

N/A

Synthesis Statement for Author (Required):

The major concern with the manuscript is that audibility was not controlled during the experiment or considered in any of the analyses. It is noted that the authors did attempt to control for hearing sensitivity by including DPOAEs in the analyses, but DPOAE amplitudes do not account for the vast differences in audibility across participants. The following concerns regarding audibility should be addressed:

While we focused on the DPOAE analysis as the most direct marker of OHC damage in the originally presented statistical analyses, we agree that audibility (as a perceptual correlate of hearing sensitivity) needs to be controlled for. We had left audiogram-based measures of hearing sensitivity out of the manuscript because they gave results similar to those of the DPOAE analyses. We have now changed the narrative to focus more on audibility and added the hearing threshold to the statistical analyses where relevant. We introduce the concept of "hearing sensitivity" more strongly upfront and explain how audiogram thresholds and DPOAE thresholds are, respectively, behavioral and physiological correlates of it.

1. Hearing thresholds at 4 kHz (i.e., the EFR carrier frequency) spanned a 50-dB range across participants. However, EFR stimulus levels were not adjusted to account for audibility.

Thus, it is not surprising that older listeners had smaller EFR amplitudes, as the sensation level would have been significantly lower for ONH and OHI compared to YNH. This would also explain the fact that the YNH participants with poor DPOAEs had higher EFR amplitudes than the older participants with poor DPOAEs. It is important to evaluate whether 4 kHz thresholds influenced EFR amplitude independent of age or DPOAE amplitude.

You make a good point. We now include an additional statistical analysis in the results section that evaluates the effect of 4-kHz hearing sensitivity on the EFR, and we present a re-analysis of an existing published EFR-vs-level dataset to show that the overall compressive growth of EFR amplitude with level minimized the effect of supra-threshold audibility differences caused by audiogram differences between the NH and HI groups. This analysis and discussion were added as a new paragraph in the discussion, "The quality of the adopted RAM-EFR marker of CS".

We made a study-design choice to present the EFR stimuli at the same overall level as the speech used in the SiN experiments to study the relationship between CS and speech intelligibility. The underlying thought was: the health of your ANF population at speech level may predict how well you can encode speech, and how well you tolerate background noise at that level. Had we corrected the EFR for audibility, we would have had to correct the speech material as well. The aim of this study was more to diagnose "how good your speech recognition in noise is at a fixed speech level in comparison to others" than to estimate "how well you can encode speech in noise when compensated for hearing sensitivity".

As shown in our correlation plots (Fig. 6), the total available ANF population captured by the 70-dB-SPL RAM-EFR stimulation predicted SiN performance well. We thus see a correlation between an EFR marker of CS and SiN in all conditions. Our additional regression analysis included both hearing sensitivity estimates (THA and THDP) and still showed a unique and significant contribution of the RAM EFR to explaining SRTSiN-HP in the entire cohort and in the subgroup of NH participants (Table 3 and Fig. 7).

Analysis added in the results section: "To investigate the potential influence of hearing sensitivity and OHC damage on an age-related CS interpretation of the observed EFR reductions, the main effect of age can be considered against the main effect of hearing sensitivity. The yNHcontrol group had significantly larger RAM EFR amplitudes (M = 0.239; SD = 0.076) than the oNH group (M = 0.155; SD = 0.062; t(26) = 3.09; p = .005), and the oNH group had larger EFRs than the oHI group (M = 0.095; SD = 0.028; t(27) = 3.19; p = .004). However, the main effect of age (t(40) = 3.92; p = .0004) was greater than that of THA (t(27) = 3.19; p = .004). We additionally performed a group analysis using the THDP criterion of 25 dB SPL to separate the cohort into those with normal or impaired OHC integrity at 4 kHz. Within the group of THDPs < 25 dB SPL, younger subjects had significantly larger EFRs than the older listeners (t(17) = 3.9; p < .001), and among the older listeners, there were no significant EFR differences between those with normal or impaired THDPs (t(27) = 1.1; p > .05)."

In the discussion section, we added the following paragraph:

The quality of the adopted RAM-EFR marker of CS

To interpret the RAM-EFR as a pure marker of CS, the EFR marker would need to be fully independent of hearing sensitivity and OHC integrity. Our recordings from budgerigars support the sensitivity of the EFR marker to CS, though they do not completely exclude the possibility that significant OHC damage may also influence its amplitude. Model simulations presented by Vasilkov et al. (2021) provide further insight, indicating that on-CF ANF responses to the RAM stimulus remain unaffected by OHC damage (see their Fig. 1).

Additional simulations of human EFR generators by Vasilkov et al. (2021) and Van Der Biest et al. (2023) show that simulated OHC damage has only a minimal effect (5-10%) on the 4-kHz RAM EFR amplitude, whereas CS significantly impacts the response, reducing it by up to 81%. Thus, the RAM stimulus was designed to be minimally influenced by coexisting OHC damage, a conclusion supported by our human EFR recordings, which show a greater EFR amplitude difference between the yNHcontrol and oNH groups than between the oNH and oHI groups (see Fig. 3D).

We argue that when the stimulus is presented at sufficiently high levels to drive ANFs into saturation, the impact of the effective stimulus level, or supra-threshold audibility, on the EFR response is minimal. Although this effect was not systematically examined in our study, we reference the findings of Encina-Llamas et al. (2021). In their study, EFRs were recorded in response to 98-Hz-modulated SAM tones at various pure-tone frequencies and levels in young normal-hearing listeners (mean age: 24 ± 3.2 years) and older hearing-impaired listeners (mean age: 56.2 ± 12.7 years). The hearing-impaired participants had hearing thresholds of ≤20 dB HL below 4 kHz and between 20 and 45 dB HL for frequencies up to 8 kHz, which aligns closely with the audiometric profiles considered in our study.

By fitting two piecewise linear curves to the EFR magnitude growth curves between 20 and 80 dB SPL (with slopes expressed in dB per dB), Encina-Llamas et al. (2021) demonstrated that both NH and HI growth curves exhibited a compression breakpoint around 60 dB SPL (see Fig. 2E in their study). To examine this further, we replotted their original EFR data in µV to calculate growth slopes for stimulus levels above 60 dB SPL, as shown in Fig. 9. Our analysis revealed an EFR growth slope of 0.002 µV per dB, which was similar for both NH and HI listeners. This indicates that EFR amplitudes increase by approximately 0.04 µV between stimulus levels of 60 and 80 dB SPL, and that differences in hearing sensitivity did not affect this process.

More importantly for our study, the dataset from Encina-Llamas et al. (2017) did not show significant group differences in EFR amplitude between NH and HI listeners at stimulation levels of 60 dB SPL or higher (see Table 4). Hearing sensitivity differences of up to 30 dB between NH and HI subjects for frequencies above 4 kHz thus did not significantly impact the supra-threshold EFR amplitude. This suggests that variations in audibility are unlikely to influence EFR amplitude in terms of supra-threshold TENV coding or CS when the stimulus is presented at a fixed, supra-threshold level above the knee-point of the EFR growth curve.

Taken together, the model simulations (Vasilkov et al., 2021; Van Der Biest et al., 2023) and experimental findings from both budgerigar and human studies support the conclusion that our EFR marker is sensitive to CS and is largely independent of hearing sensitivity differences when assessed at 70 dB SPL. Our human results demonstrated a clear age-related decline in EFR amplitudes, even in the absence of OHC damage. These findings align with animal research linking deficits in temporal coding at the earliest neural stages of the auditory pathway to progressive or noise-induced CS (Parthasarathy et al., 2014; Fernandez et al., 2015; Shaheen et al., 2015; Parthasarathy & Kujawa, 2018).

Figure 9: Reanalysis of the Encina-Llamas et al. (2017) dataset reporting EFR magnitudes to a 98-Hz SAM tone of 4011 Hz. dB values were transformed into µV using a 10^(dB/20) transformation with 1 µV as the reference. Data from 13 NH (24 ± 3.2 y/o) and 7 HI (56.2 ± 12.7 y/o) listeners are shown. Only EFR amplitudes that were significantly above the noise floor are shown, and a linear fit was made for data points above the compression knee-point of 60 dB SPL. The NH cohort had audiogram thresholds below 15 dB HL for frequencies below 8 kHz, and the HI cohort had thresholds ≤20 dB HL for frequencies below 4 kHz and between 20 and 45 dB HL for frequencies up to 8 kHz.
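The dB-to-µV transformation named in the caption is the standard amplitude (20·log10) convention. A minimal sketch, assuming the 1 µV reference stated above (the function name is ours, for illustration):

```python
def db_to_microvolt(db, ref_uv=1.0):
    """Invert an amplitude level in dB (20*log10 convention) back to µV."""
    return ref_uv * 10 ** (db / 20)

# e.g. an EFR magnitude reported as -20 dB re 1 µV corresponds to 0.1 µV
amp = db_to_microvolt(-20.0)
```

With this convention, 0 dB maps back to the 1 µV reference and every +20 dB multiplies the amplitude by 10.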

------ This is Fig. 2E from Encina-Llamas et al. (2021), for your convenience.

2. Age-related differences in hearing thresholds became more pronounced with increasing frequency, as expected. The HP speech stimuli were restricted to frequencies above 1.65 kHz, the region where hearing loss was the greatest and most disparate between groups. Thus, the most pronounced age effect would be expected for HP SRTs based on hearing thresholds alone. Indeed, this is exactly what the data show.

We see your point and agree that for the HP condition we expect both (i) a better relation between the RAM EFR and SRTHP, because both measures rely on TENV mechanisms, and (ii) a potentially greater influence of hearing sensitivity/audibility on the SRTHP results.

We take both these factors into account in the updated multiple regression analysis presented in Table 4. In models that consider both the RAM EFR and THA,4kHz or THDP,4kHz, there remains a significant and unique contribution of the RAM EFR to the SRTHP when considering the entire cohort and the subgroup of NH listeners. So even though hearing sensitivity can explain aspects of the SRT, the models improve when the RAM EFR is added, and this was especially true for the SiN condition, where synaptopathy is also expected to be most detrimental. Also in the subgroup of NH listeners (yNHcontrol + oNH), who show only little variation in hearing sensitivity, the RAM EFR is a strong predictor of the SRTHP. Age affects the RAM EFR, the SRTs and the hearing sensitivity measures alike, and can thus not be treated as an independent variable of SNHL in this analysis.

As we discuss later in the discussion section "Confounding factors and Study limitations", people with hearing sensitivity loss are also expected to suffer from CS, as CS precedes OHC damage in the age- or noise-induced model of SNHL (Sergeyenko et al., 2013; Fernandez et al., 2015; Parthasarathy & Kujawa, 2018). It is the quality of the CS marker (and its independence from OHC damage) that will ultimately determine its unique role in speech intelligibility.

3. Given the above observations, it is likely that hearing thresholds / audibility would explain a significant portion of the variation in the relationship between speech perception and EFR amplitude (e.g., Figs. 6-7). To rule this out, it is important to include hearing thresholds instead of DPOAE amplitude in the regression models. Each model should include thresholds within the region of the speech stimuli. One suggestion as follows:

a. Pure-tone average (PTA) could be included as a covariate in the models evaluating BB SRT
b. Low-frequency PTA could be included as a covariate in the models evaluating LP SRT
c. High-frequency PTA could be included as a covariate in the models evaluating HP SRT

To address this point, we both updated the multiple regression analysis in Table 4 (see explanation above) and added a new Figure 8 and analysis in the results section that regresses out the mean audiometric threshold from the SRTBB, SRTLP and SRTHP conditions.

The respective adopted mean audiogram thresholds were averaged between 0.125-8, 0.125-1.5, and 1.5-8 kHz to give the best correspondence to the frequency content used in the speech material.
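Regressing out the mean audiometric threshold amounts to fitting a simple linear model of SRT on THA, keeping the residuals, and correlating those residuals with the EFR marker. A minimal sketch of that procedure; the variable names and values below are hypothetical illustrations, not the study's data:

```python
import numpy as np
from scipy import stats

def residualize(y, x):
    """Residuals of y after removing a linear (least-squares) dependence on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Hypothetical values: mean THA (dB HL), SRT (dB SNR), RAM-EFR amplitude (µV)
th_a = np.array([5.0, 10.0, 15.0, 20.0, 30.0, 40.0])
srt = np.array([-8.0, -7.0, -6.5, -5.0, -3.0, -1.0])
efr = np.array([0.24, 0.20, 0.18, 0.15, 0.11, 0.08])

srt_resid = residualize(srt, th_a)         # SRT with hearing sensitivity factored out
rho, p = stats.spearmanr(srt_resid, efr)   # remaining EFR-SRT association
```

Because the residuals are orthogonal to the threshold predictor, any remaining correlation with the EFR cannot be attributed to a linear effect of hearing sensitivity.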

The figure shows the remaining SRT residuals against the RAM EFR amplitude, and the text was updated as follows: "To factor out the mediating effect of hearing sensitivity on the strongest observed relationships between the RAM EFR and SRT, we corrected for THA by considering the residuals of a linear regression model between THA and SRT. Figure 7 shows the relation between the SRT residuals and the RAM EFR after correcting for the mean THA over the frequencies in the 0.125-8, 0.125-1.5, and 1.5-8 kHz intervals for the BB, LP and HP conditions, respectively. After correcting for hearing sensitivity, the SRTSiN-HP residuals remained significantly correlated for the NH subgroup (r(29) = -0.6, p < .001) and approached significance for the cohort (ρ(44) = -0.28, p = .06). We repeated this analysis using THDP,4kHz as the correction factor and found that the significant relationship between the RAM EFR and SRTSiN decreased from ρ(44) = -0.73 (p < .001) to ρ(44) = -0.34, but remained significant (p = .02). After correcting for hearing sensitivity, none of the SRTBB or SRTLP residuals maintained a relationship to the RAM EFR. This is not surprising, as the RAM EFR marker of CS reflects supra-threshold TENV processing above the phase-locking limit, and the SRTBB values were very similar to the SRTLP results, which rely on low-frequency hearing mechanisms.

Especially for listeners with normal hearing sensitivity, the RAM EFR was able to explain speech recognition scores when the speech was filtered to target TENV processing mechanisms. The relationship remained strong even after correcting for hearing sensitivity, which confirms earlier findings in a cohort with normal audiograms (Mepani et al., 2021)."

In addition, both reviewers identified problems with the statistical analyses that should be addressed. Hearing thresholds should be included in the analyses as covariates to account for substantial differences in audibility across participants.

As we explained above, we have changed the narrative of the study to focus on hearing sensitivity measures that can be based either on behavioral correlates (audiometric thresholds, THA) or on physiological correlates of OHC damage (DPOAE thresholds, THDP).

Both measures are now consistently considered in all statistical analyses.

Additionally, it appears that a t-test performed on the budgerigar data included both paired and independent observations. This violates the assumption of independence and should be revised.

We looked into this issue and addressed it by adopting the method of a partially overlapping samples t-test as described in Derrick (2017): How to compare the means of two samples that include paired observations and independent observations: A companion to Derrick, Russ, Toher and White (2017). The Quantitative Methods for Psychology, 13(2), 120-126.
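Derrick's partially overlapping samples t-test pools the paired and independent observations of each condition into a single statistic. The sketch below implements the equal-variance variant (Tnew1) with the degrees-of-freedom approximation from Derrick et al. (2017); it is an illustration of the technique, not the authors' exact implementation, and the example amplitudes are hypothetical:

```python
import numpy as np
from scipy import stats

def partover_t(paired_1, paired_2, indep_1, indep_2):
    """Partially overlapping samples t-test (Derrick, 2017), Tnew1 variant.

    paired_1/paired_2: observations measured in both conditions (pairs);
    indep_1/indep_2: observations measured in only one condition each.
    """
    x1 = np.concatenate([paired_1, indep_1])
    x2 = np.concatenate([paired_2, indep_2])
    n1, n2, nc = len(x1), len(x2), len(paired_1)
    na, nb = len(indep_1), len(indep_2)
    # correlation among the paired observations only
    r = np.corrcoef(paired_1, paired_2)[0, 1] if nc > 1 else 0.0
    # pooled standard deviation across both full samples
    sp = np.sqrt(((n1 - 1) * np.var(x1, ddof=1) + (n2 - 1) * np.var(x2, ddof=1))
                 / (n1 + n2 - 2))
    t = (x1.mean() - x2.mean()) / (sp * np.sqrt(1 / n1 + 1 / n2
                                                - 2 * r * nc / (n1 * n2)))
    # degrees-of-freedom approximation (reduces to nc-1 when all data are
    # paired, and to na+nb-2 when no observations are paired)
    df = (nc - 1) + ((na + nb + nc - 1) / (na + nb + 2 * nc)) * (na + nb)
    p = 2 * stats.t.sf(abs(t), df)
    return t, df, p

# Demo with hypothetical EFR amplitudes (not the study's budgerigar data):
# three animals measured pre- and post-KA, plus two animals per condition
# measured only once.
t, df, p = partover_t(np.array([0.80, 0.70, 0.90]), np.array([0.50, 0.45, 0.55]),
                      np.array([0.75, 0.85]), np.array([0.40, 0.50]))
```

When na = nb = 0 the statistic collapses to a paired t-test, which is a convenient sanity check on any implementation.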

The text was updated as follows: "To account for the mix of paired and independent observations in the sample, we conducted a partially overlapping samples t-test (Derrick, 2017), which revealed a significant decrease in EFR amplitude from the pre-KA condition (M = 0.78, SD = 0.245) to the post-KA condition (M = 0.49, SD = 0.127) for the SAM stimulus (t(19.89) = 2.75, p = .012). The RAM condition showed an even larger significant reduction from pre-KA (M = 3.86, SD = 0.689) to post-KA (M = 1.73, SD = 0.565) (t(19.89) = 7.611, p < .001)."

Also, the way statistical results are reported in many places in the text is not sufficient. It should be revised and reported according to APA style, and discussed somewhat better. Detailed comments are below. Also, it is essential to report effect sizes along with the statistical tests.

You are right: we did not always report according to APA style and therefore at times reported incomplete results. We rechecked all the statistics and ensured that all results are now complete and in APA style. Effect sizes were not added, as they are not typically reported in addition for the statistical tests we applied (and they can be calculated from the reported mean, t, and df values).

Major issues:

1. There are a lot of paragraphs that lack clarity in phrasing.

The main authors are non-native English speakers, which affected the fluency of the writing. In addition, rewriting the manuscript over several iterations with several authors reduced its overall readability. We have improved the general flow of the manuscript text and used text-editing services to improve the grammar and clarity of the writing.

2. Statistical test results are not reported in APA style consistently, which makes them hard to interpret. Also, it is essential to report effect sizes along with the statistical tests.

We went through all the applied statistical tests again and double-checked which values need to be reported for completeness and APA style. This step greatly improves the credibility of our findings and makes the manuscript easier to read.

3. Figures are very confusing and not very easy to relate to the text most of the time.

We updated the color scheme and simplified the SRT figures so that they convey the main message of the paper more strongly. This entailed removing the post-hoc DP-based subgrouping from the main figures; it is still included in the statistical analysis, but no longer clutters the figures. We also embedded the figures better in the manuscript text. Fig. 2 in particular is now much better integrated, illustrating how we go from an EEG recording to the reconstructed EFR waveform and the EFR marker.

4. Consistency is lacking in every aspect: group names differ across the text and figures, and the statistical tests are reported in different ways.

We agree and apologize for these inconsistencies throughout the manuscript. We now report all test results in APA style and we checked for consistent labeling in the figures and text. We hope this improves the readability of the paper.

Here I will list the specific issues in detail, in the order found in the paper, with comments.

• Line 204: "The test ear was chosen based on the better of the left and right audiograms." Does this mean that all of the following measures (DPOAE, SRT, and the EFR) were only recorded by stimulating that one ear? Please clarify.

Yes. We now specifically mention this aspect at the start of the methods section: "Aside from the audiogram and otoscopy, all tests were performed monaurally on the ear with the best audiometric thresholds."

• Line 484: "Older subject showed SAM and RAM EFR amplitude reductions in the order of 6 and 47%, respectively. Only the RAM EFR reduction was significant (d.f. = 40, t=3.92, p=0.0004) in older listeners, and can be interpreted as caused by age-related cochlear synaptopathy" The results are a bit confusing, especially since they are not related to the figures. Reduction as compared to what is found here? The young control group? Please clarify.

You are right, we should have been more precise in our writing. We have updated the text as follows: "Compared to the yNHcontrol group, older subjects showed reductions in the amplitude of the SAM and RAM EFRs by 7% and 47%, respectively. The mean amplitudes for the older group were M = 0.061 (SD = 0.031) for SAM and M = 0.126 (SD = 0.057) for RAM. There were no significant differences between the yNHcontrol group and older listeners for the SAM EFR (p > .05). However, RAM EFR amplitudes were significantly reduced in the older group compared to the yNH control group (t(40) = 5.17, p < .001)."

• Lines 497-513: This analysis is rather confusing. This section refers to Figure 4D; however, no part of this is really shown in Figure 4D. I can see that the younger and older adults are marked with different colors, but the test itself is not clearly implied in this figure. Also, the authors describe a significant difference in both the DP>25 and DP<25 groups with respect to age, and they also mention the lack of difference between the older adults' RAM EFRs between the two groups, but there is no mention of the difference between the young adults in the two groups. Even with these, I failed to draw the same conclusion as the authors. Something is missing, or it should be depicted with a very different figure. If the distinction is based on DP>25 vs DP<25, why is that difference not tested? If the difference is nuanced with the age of the subjects, then that should also be accounted for.

To address this comment, we have updated the figure to show the yNHcontrol, oNH, and oHI group data as well as the pooled OLD group (oNH+oHI) data. The statistics are now indicated on the figure, and there is a better correspondence between what is shown in the figure and what is mentioned in the manuscript text.

• Line 529: "...indicating that group SRTs were differently affected by the filtering." In what way? Were post hoc tests conducted? What was the difference? The results of the ANOVA test should be interpreted in conjunction with the group SRT data presented in Fig. 4.

Post-hoc t-tests were performed, and p-values were Bonferroni-corrected for 18 multiple comparisons before assessing statistical significance. As shown in Fig. 4 and supported by the statistical analysis, the LP and HP conditions exhibit distinct trends across the groups, which aligns with the ANOVA results.

• Line 530: "The analysis was repeated for a different grouping using the yNHcontrol group and THDP = 25 dB criterion to divide the older listeners into a normal or hearing-impaired subgroups, but the outcomes remained the same." Did the THDP group also include young listeners in this case? Based on Figure 5 it did, but that makes the statistical test comparing the control group to the experimental group problematic. Please clarify.

You are correct. When we use the THDP = 25 dB criterion, there can be both younger and older listeners in the group, so it does not make sense to compare these normal and impaired THDP groups to the yNH control group. We left this analysis out of the manuscript, as the results were not different from the yNHcontrol, oNH, and oHI group comparison. We did adopt the THDP = 25 dB criterion as an additional test in conjunction with the RAM EFR analysis in Fig. 4, but here we control for the age factor and there is no problem with subjects being present in multiple groups: "We performed an additional post-hoc regrouping using the THDP criterion of 25 dB SPL to separate the cohort into those with normal or impaired OHC integrity at 4 kHz. Within the group of THDPs < 25 dB SPL, younger subjects had significantly larger EFRs than the older listeners (t(17) = 3.9, p < .001), and among the older listeners, there were no significant EFR differences between those with normal or impaired THDPs (t(27) = 1.1, p > .05)."

• The first entire paragraph of the discussion would be way better suited as a part of the introduction, in my opinion.

Good point, we have integrated it with the revised introduction section where it better belongs.
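Several of the post-hoc analyses in this response rely on Bonferroni correction (e.g., the 18 comparisons for the SRT post-hoc t-tests above). As an aside for readers, the correction simply scales each p-value by the number of tests (capped at 1.0), or equivalently compares each raw p-value against alpha divided by the number of tests. A minimal sketch with illustrative p-values:

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: scale each p-value by the number of tests.

    Returns (adjusted_p_values, corrected_alpha). A raw p-value is
    significant iff its adjusted value is below alpha, equivalently
    iff the raw value is below alpha / m.
    """
    m = len(p_values)
    adjusted = [min(p * m, 1.0) for p in p_values]
    return adjusted, alpha / m
```

With 18 comparisons, for example, a raw p of .002 survives correction (adjusted p = .036) while a raw p of .004 does not (adjusted p = .072).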

Figures:

• Figure 1: What is the line at 20 dB supposed to represent? The color scheme is not the best. The lines for the group averages could be much more obvious; for example, the young normal-hearing group's average line is barely visible, since it looks the same as the individual lines.

We improved the color scheme across the figures so that the individual lines (and the mean per group) in the audiogram of Figure 1 are clearer. The 20 dB line indicates the criterion used to classify older listeners into the oNH or oHI group, and is described in the figure caption as well as in the figure label.

• Figure 4: The letters for the panels should be uniform; one is to the right, some are to the left, within or outside the box. The font sizes should be uniform across the panels.

We have placed the labels at the same position in the subpanels and made the figure text sizes as uniform as possible. Remaining minor inconsistencies in font sizes across panels will be checked with the journal's proofreading editors.

• Fig. 4 Panel A: The x-axis should read "weeks in relation/with respect to KA administration".

The title for the dashed line for KA administration would work better on top, as it's very busy currently.

You are correct, we changed the label to "Weeks re. KA administration".

• Fig. 4 Panel C: If the connected data points are from the same animal, what is the rest? It should be stated in the figure text and explained better. The text says control animals, but it would be nice to understand what the basis for this analysis was. The SAM/RAM would be better suited in the x-axis title. It is also hard to tease out what we see in the SAM case, as all the amplitudes are pretty small. I understand wanting to show things on the same scale, but this way, it's illegible.

We have updated the manuscript text as follows:

Methods: "Three animals were tested before and after KA administration and monitored over time, whereas the others in the cohort belonged either to a control-only (n=11) or a KA-only group (n=4)." Results: "The post-KA EFR amplitudes shown in Fig. 3C correspond to average EFR amplitudes over the different post-KA measurement time points for each animal and are compared to control (pre-KA or non-KA) EFR amplitudes. Connected lines refer to data points stemming from the same budgerigar; SAM and RAM EFRs were recorded during the same session. Other points show data from control animals (n=11) or post-KA exposure animals (n=4; i.e., animals for which pre-exposure EFRs were not recorded)."

• Fig. 4 Panel D: The mean age is redundant here; I would not add it to the figure if it is stated in the text. Why is there no statistical significance indicated here? The text says there is, but it's not shown on the figure. Also, the way it's currently depicted suggests that a comparison was made between all of the groups shown in the figure, which doesn't make sense.

We now indicate the statistical significance of the post-hoc t-tests in the figure and motivate the multiple-comparisons correction we performed (six comparisons in this case) in the figure caption.

• Figure 5: The titles for the panels should be in the same place; I had to hunt for HP on the rightmost panel. It should be obvious.

We corrected this. The labels for the different conditions are now all at the same level.

• I'd suggest removing all lines that do not show a significant result, as these make the figure harder to read.

We implemented this.

• Figure text "or (HIDP, HIDP)." I believe that should read "NHdp, HIdp" instead.

You are correct, but this is no longer applicable because we now focus on the oNH and oHI groups in all figures.

• Thinking further about it, since the middle and left panels don't show much, I'd consider showing only the right panel as the main-text figure. The rest takes away from the main message.

We have greatly simplified the figure, so this should no longer be an issue.

• Since the NH DP and HI DP groups were created as a post-hoc analysis, I'd somehow signal that these are in fact the same groups, maybe with just a vertical line, but it would at least create a sort of divide between the multitude of comparisons. I'd also probably use a different background color for the right panel, to show that it is different from the left one.

We have omitted those groups from the figure and only refer to these additional post-hoc statistical analyses in the main text. The post-hoc DP grouping shows almost the same results as the regular group-based (oNH and oHI) statistics, so duplicating this information in the figure would only confuse the reader.

• Also, is there a purpose in showing the "older adult" group at all? Tests only seem to be reported for the oNH and oHI groups, so it is redundant.

We now also report the statistical analysis for the OLD group results, and we refer to these results several times in the main manuscript text, so we are keeping them.

• Figures 4 and 5 at one point specify DP>25 and DP<25 groups vs NHdp and HIdp groups. Are these the same (based on the text I am assuming yes)? If yes, why not keep the names consistent; if not, what was the reason for choosing a different grouping for the two different analyses?

No longer applicable; the labeling was indeed not consistent.

• Figure 6: Again, panel names (A-F) are in a different location, and BB-HP are at the bottom of the plot. It should be made easy for the reader to know where to look for things. What does ALL signify in the plots? It is also nice to keep the significance codes uniform; the figure is easier to read if I see them in the figure (if there is no code, I can erroneously assume the result is not significant).

We have put the figure labels in the top-right corner everywhere and indicated the statistical analysis for the entire cohort (ALL) and the subgroups (OLD, NH). The figure caption now explains this grouping better. The conditions (BB, LP, HP) are now consistently labeled at the bottom of the plot together with the stimulus type, so the figures should be easier to navigate.

• "...group are marked with blue dots." Marked with crosses? I do not see any blue dots.

• The same goes for Figure 7.

Indeed, marked with crosses; we have corrected this mistake in the caption text.

• Results ~ figures: why not show the budgerigar results first in Figure 3, if that is mentioned first in the paper?

The human results are shown first as they are discussed first in the manuscript. We now integrate references to the figures better with the main text, so the figure order should be logical.

Minor issues:

1. Abstract: "To address this disconnect between cochlear damage and speech intelligibility deficits, this study investigates to which degree CS contributes to impaired, low-cognitive-effort, speech intelligibility in older listeners." What role does the low cognitive effort play in this sentence? ("What degree" is correct, not "which".)

We removed the low-cognitive-effort aspect here and now refer to word recognition: "To explore the link between cochlear damage and speech intelligibility deficits, this study examines the role of CS for word recognition among older listeners."

2. In the abbreviations, the following items do not occur in the text; remove them, or add them to the text:
• H/M/LSR fibers - high/medium/low spontaneous rate
• IC - inferior colliculus
• peSPL - peak-equivalent sound pressure level
• Add: AEP, ABR maybe?

Thank you for pointing this out. We forgot to update the list before submitting. We went through it once more and updated it with the most important and commonly used abbreviations in the text.

3. Line 66: temporal TFS - temporal is redundant.

Correct, we removed it.

4. Line 77: period after EFRs is redundant.

Yes, we removed it.

5. Line 432: Pearson and Spearman correlation is supposed to be capitalized.

We corrected this.

6. Line 472: "We conclude that the RAM EFR is a sensitive and selective non-invasive marker of cochlear synaptopathy in budgerigar and that the RAM EFR has an improved sensitivity over the SAM EFR in identifying individual differences in CS." Improved means 'it got better than before' in my vocabulary. Maybe "better" or "more sensitive" is a better phrase for this sentence?

We changed this to "better sensitivity".

7. Line 568: "This illustrates that as the EFR marker became more sensitive to detect individual CS differences (RAM vs. SAM)..." "Became" is really not a good word for this; it indicates a change, whereas I suppose here it means that the better the EFR marker is, the lower the speech threshold is. If I misunderstand, please clarify.

We agree this was confusing, and we reformulated as follows: "This suggests that the RAM-EFR marker, which was more sensitive to detect individual CS differences, was also more effective at predicting individual differences in speech recognition."

8. Line 591: "The correlation disappeared completely when considering only older listeners (n=29)." Correlations don't disappear. No significant correlation is found when comparing this and that.

We removed the sentence.

9. Line 603: "These results indicate there is a general age-related trend for CS exists,..." The sentence has two verbs, one is sufficient.

Ok, thank you for noticing, we modified the sentence.

10. Line 625: (upto 81%) -> up to

Corrected.

11. Figure 2: Is this figure really necessary as a standalone main-text figure? Since it's described in the text, I'd either remove this figure or move it to the supplement.

Indeed, this figure is not essential. We have added the stimulus-waveform icons to the methods and analysis figures, which, together with the general stimulus description in the main manuscript text, is sufficient for the reader to get an idea of what the stimulus looks like.

12. Line 698: "The role of OHC damage in speech intelligibility declines" This title is not very meaningful. Declines in relation to what? It would be better without the word "declines"; then I would know to expect to read about the role of OHC damage.

Ok, we removed this.

13. Line 852: "With the established finding that synaptopathy is involved in speech perception declines in aging,..." the sentence doesn't make sense as it stands, the sentence has two verbs.

Indeed, this was strangely written, so we removed the sentence.

14. Line 804: "This implies that the generators of the 4-kHz EFR may also have spanned broader generator region than expected Encina-Llamas et al. (see also 2019)." "expected based on" would be the correct way to put it and the reference is incomplete the way it's written.

We modified the sentence and moved the reference elsewhere in the text.

Citation: Markus Garrett, Viacheslav Vasilkov, Manfred Mauermann, Pauline Devolder, John L. Wilson, Leslie Gonzales, Kenneth S. Henry, Sarah Verhulst. Deciphering Compromised Speech-in-Noise Intelligibility in Older Listeners: The Role of Cochlear Synaptopathy. eNeuro 9 January 2025, 12 (2) ENEURO.0182-24.2024; DOI: 10.1523/ENEURO.0182-24.2024

Keywords

  • cochlear synaptopathy
  • envelope-following response
  • outer hair cell damage
  • reception threshold
  • sensorineural hearing loss
  • speech-in-noise
  • speech
  • speech intelligibility
