Introduction

The present study tested for reduced auditory nerve and elevated brainstem activity in people with tinnitus using the auditory brainstem response (ABR), a series of voltage fluctuations or “waves” recordable from the scalp and occurring within ~10 ms of sound stimulation (Jewett 1970). There were two motivations for this study. One follows from previous suggestions that tinnitus may be related to degeneration of a subset of auditory nerve fibers (ANFs) (Bauer et al. 2007) and from recent animal data indicating that high-threshold ANFs may be particularly susceptible to degeneration following high-level acoustic exposures (Kujawa and Liberman 2009; Lin et al. 2011). Here, the amplitude of the first ABR wave, which reflects auditory nerve activity (Buchwald and Huang 1975; Møller and Jannetta 1981), was tested for reductions in tinnitus subjects compared with closely matched non-tinnitus controls. Such a reduction in the face of close threshold matching would be consistent with a selective loss of high-threshold ANFs in tinnitus subjects. The second motivation comes from functional magnetic resonance imaging (fMRI) data demonstrating elevated responses to sound in the inferior colliculi of human subjects with tinnitus (Lanting et al. 2008; Melcher et al. 2009), specifically those with an intolerance of moderate- to high-level sounds (i.e., hyperacusis; Gu et al. 2010). In contrast to fMRI, the ABR reflects activity in only a subset of brainstem neuronal populations and only those having constituent neurons that are highly synchronized to one another in their responses to sound (Melcher and Kiang 1996). Thus, measurements of ABR amplitudes after wave I were used to test whether activity is elevated in these synchronized neuronal populations, in order to complement the less specific information provided by fMRI.

Notably, previous studies that examined ABR wave amplitudes (as opposed to only latencies) and also controlled for hearing sensitivity already provide indications that wave I may be reduced and subsequent waves elevated in human subjects with tinnitus (Attias et al. 1993; Attias et al. 1996; Kehrle et al. 2008; Schaette and McAlpine 2011). Attias et al. (1996) reported greater wave III amplitude in tinnitus subjects with high-frequency hearing loss compared with threshold-, sex-, and age-matched control subjects without tinnitus. Although an earlier study by Attias et al. (1993), also on subjects with high-frequency hearing loss, stated that there were no differences in ABR between tinnitus subjects and non-tinnitus controls, the reported mean ABR amplitudes showed a clear trend toward reduced wave I. Kehrle et al. (2008), comparing tinnitus and non-tinnitus subjects with clinically normal thresholds, reported an enhanced V/I amplitude ratio in tinnitus subjects. While they did not report whether the enhanced amplitude ratio arose from a reduction in wave I, an elevation in wave V, or a combination of both, the earlier data of Attias et al. (1993) and recent report of Schaette and McAlpine (2011) indicate that reduced wave I was a factor. Although the study of Barnea et al. (1990) reported normal ABR amplitudes in tinnitus subjects with clinically normal audiograms, it compared the tinnitus data with clinical norms, not with a closely matched control group, an approach suitable for detecting gross ABR abnormalities but not subtler ones. The present study confirms and extends the published data indicating differences in ABR amplitude between tinnitus subjects and matched non-tinnitus controls. Taking into consideration the neuronal generators of the ABR, the results suggest that a neuronal pathway arising within the ventral cochlear nucleus (VCN) plays a role in tinnitus.

Methods

Study design and subjects

ABRs were compared between subjects: (1) with tinnitus (15 men, age 42 ± 6 years (mean ± standard deviation)), (2) without tinnitus and with age and hearing sensitivity similar to the tinnitus subjects (21 men, age 43 ± 7 years), and (3) without tinnitus, but younger (11 men, age 23 ± 2 years). The comparison of primary interest was between subjects in groups (1) and (2), but comparisons were also made to the younger, audiometrically healthier subjects of group (3). To avoid within-group variability in ABR amplitude related to sex (Jerger and Hall 1980; Michalewski et al. 1980) as well as systematic, inter-group differences in ABR related to sex, all subjects were of one sex (male). Finally, subjects were characterized with respect to a variety of factors, including measures of sound-level tolerance (SLT), depression, and anxiety, to allow post hoc tests of whether such variables were a factor in any inter-group differences in ABR.

Each subject underwent a behavioral testing session and one to five ABR recording sessions. Individual subject characteristics and tinnitus characteristics are provided in Tables 1 and 2.

TABLE 1 Subject characteristics
TABLE 2 Tinnitus characteristics

All but one of the tinnitus subjects (No. 213) were recruited through the Tinnitus Clinic at the Massachusetts Eye and Ear Infirmary (MEEI). The remaining tinnitus subject and all non-tinnitus subjects were recruited from advertisements in local newspapers and personal contacts. This study was approved by the institutional committees on the participation of human subjects at the Massachusetts Institute of Technology and MEEI. All subjects gave their written informed consent.

Behavioral testing

During the behavioral testing session, thresholds to tones with frequencies from 125 Hz through 16 kHz were measured. Loudness discomfort levels were assessed as described in Gu et al. (2010). In tinnitus subjects, tinnitus pitch, loudness, minimum masking level (MML), and presence of residual inhibition were assessed as described in Gu et al. (2010). All subjects completed questionnaires assessing handedness (Oldfield 1971), depression (Beck et al. 1961), anxiety (Beck et al. 1988), SLT (Tyler et al. 2003), and medication intake. Subjects with tinnitus completed two additional questionnaires: one assessing effects of tinnitus on quality of life (Tinnitus Reaction Questionnaire (TRQ)) (Wilson et al. 1991) and the other assessing tinnitus characteristics (e.g., quality of percept and location).

Electrode placement

Chlorided silver electrodes (Grass Technologies) were applied to the scalp with conducting cream (EC2, Grass Technologies), after first abrading the skin with conducting gel (Nuprep, Weaver and Co.). The electrode sites included three locations in the standard 10–20 system: vertex, F3 (left frontal), and F4 (right frontal). An electrode was also clipped to each earlobe. The electrode on the earlobe of the stimulated ear served as the reference electrode. The ground electrode was placed on the forehead or neck with no systematic difference in placement between tinnitus and control subjects. Electrode impedances were measured before, during breaks if any, and after ABR recording, and were maintained at ≤7 kΩ throughout each session.

Stimuli

Stimuli for ABR recording were digitized at a rate of 20 kHz, generated using a DAQPad (National Instruments), and presented over headphones (Sennheiser, HDA-200). Stimuli were 100 μs clicks (condensation) presented monaurally at 30, 50, 70, and 80 decibels above normal adult hearing level (dB nHL). The click spectrum, measured at the output of the headphones on an artificial ear (Larson Davis, AEC101), is shown in Figure 1. For click levels of 50, 70, and 80 dB nHL, broadband noise at 10, 30, and 40 dB nHL, respectively, was presented to the opposite ear to mask any stimulation via acoustic cross talk (Levine 1981). The 0 dB nHL reference was estimated by averaging the click thresholds of four subjects aged 23–27 who had pure-tone thresholds of ≤20 dB HL at standard audiometric frequencies; 0 dB nHL corresponded to 40 dB peak SPL (measured as in Burkard 2006). Six tinnitus, two non-tinnitus of similar age, and two young non-tinnitus subjects did not tolerate the 80 dB stimulus.
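
As a worked illustration of the level conventions above, the following minimal sketch converts click levels from dB nHL to dB peak SPL using the 40 dB offset reported here, and derives the contralateral masking-noise level as the click level minus 40 dB, which reproduces the 10, 30, and 40 dB nHL noise levels quoted for the 50, 70, and 80 dB nHL clicks. The function names are hypothetical and are not part of the original study.

```python
# Offset reported in the text: 0 dB nHL corresponded to 40 dB peak SPL.
NHL_TO_PEAK_SPL_OFFSET_DB = 40


def click_peak_spl(click_level_db_nhl):
    """Convert a click level in dB nHL to dB peak SPL (hypothetical helper)."""
    return click_level_db_nhl + NHL_TO_PEAK_SPL_OFFSET_DB


def masker_level_db_nhl(click_level_db_nhl):
    """Contralateral noise level in dB nHL for a given click level.

    The text reports masking noise only for the 50, 70, and 80 dB nHL
    clicks, so None is returned for lower click levels.
    """
    if click_level_db_nhl < 50:
        return None
    return click_level_db_nhl - 40


for level in (30, 50, 70, 80):
    print(level, "dB nHL click:", click_peak_spl(level), "dB peak SPL;",
          "masker:", masker_level_db_nhl(level))
```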

FIG. 1

Spectrum of click stimulus (thick line) and noise floor (thin line). Magnitude is in decibels relative to the maximum. Spectrum was measured at the output of the headphones.

Clicks were presented at a rate of 11/s in 4-min runs. The interval between click presentations was jittered by 10 % (9 ms). Six runs were collected at 30 dB and three runs were collected at each of the higher stimulus levels, yielding 15,840 and 7,920 total click presentations per stimulus level, respectively. Stimulus level was constant throughout a run and varied pseudo-randomly across runs of a given session. For sessions in which both ears were tested, measurements for one ear were completed before taking measurements for the other. For all except three subjects (No. 129, 148, and 168), both ears were tested.
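
The presentation counts above follow directly from the click rate and run duration. The short calculation below is only an arithmetic check of the figures stated in the text; the variable names are illustrative.

```python
CLICK_RATE_PER_S = 11        # clicks per second, as stated in the text
RUN_DURATION_S = 4 * 60      # 4-min runs

clicks_per_run = CLICK_RATE_PER_S * RUN_DURATION_S

# Six runs at 30 dB, three runs at each of the higher levels.
total_clicks_30_db = 6 * clicks_per_run
total_clicks_higher_levels = 3 * clicks_per_run

# Mean inter-click interval and the 10 % jitter, in milliseconds.
mean_interval_ms = 1000 / CLICK_RATE_PER_S
jitter_ms = 0.10 * mean_interval_ms

print(clicks_per_run, total_clicks_30_db, total_clicks_higher_levels)
print(round(mean_interval_ms, 1), round(jitter_ms, 1))
```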

Data acquisition

The head stage of the recording system (Medusa, Tucker-Davis Technologies) amplified (20× gain) and digitized (25 kHz sampling rate) the signals from the vertex, F3, and F4 electrodes—each referenced to the earlobe of the stimulated ear. The digitized signals were relayed to the base station of the recording system, which band-pass filtered the signals between 5 Hz and 5 kHz, amplified them 2,000×, and converted them to analog form. These analog signals, as well as the stimulus waveform from the DAQPad, were digitized at a rate of 25 kHz and streamed to disk using a National Instruments board (CA-1000). Throughout each session, the signal outputs of the base station were monitored visually on an oscilloscope to assess signal quality and determine if intervention was necessary (e.g., asking the subject to relax or reattaching electrodes).

Waveform averaging

For each click level and electrode pair, the streamed-to-disk electrode signals were processed as follows. Segments of data extending from 20 ms before to 20 ms after each click presentation were extracted. Next, in preparation for a first stage of data-quality assessment, the mean pre-stimulus signal was subtracted from each segment to remove any DC offset, and the segment was low-pass filtered to reduce high-frequency noise (2 kHz cutoff, Butterworth, fourth order). Each segment was then tested for a standard deviation in pre-stimulus signal of 8 μV or less and a maximum post-stimulus amplitude of 30 μV or less. These criteria were determined, by empirical trial and error in the first 10 % of subjects tested, to provide a means for screening out segments that subsequently led to noisy waveform averages. Segments not meeting these criteria were rejected from subsequent analyses.
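
The first-stage screening can be sketched as follows. This is a minimal illustration under stated assumptions, not the original analysis code: the low-pass filtering step is omitted, the population standard deviation is used, and the "maximum post-stimulus amplitude" is interpreted as the maximum absolute value. The function names are hypothetical.

```python
from statistics import mean, pstdev

PRE_STIMULUS_SD_LIMIT_UV = 8.0     # criterion stated in the text
POST_STIMULUS_PEAK_LIMIT_UV = 30.0  # criterion stated in the text


def prepare_segment(segment_uv, n_pre):
    """Subtract the mean pre-stimulus signal to remove any DC offset.

    segment_uv: samples in microvolts; the first n_pre samples precede the click.
    (The Butterworth low-pass step described in the text is not reproduced here.)
    """
    baseline = mean(segment_uv[:n_pre])
    return [v - baseline for v in segment_uv]


def passes_first_stage(segment_uv, n_pre):
    """Apply the two first-stage criteria to a baseline-corrected segment."""
    pre = segment_uv[:n_pre]
    post = segment_uv[n_pre:]
    quiet_baseline = pstdev(pre) <= PRE_STIMULUS_SD_LIMIT_UV
    # Assumption: amplitude limit applies to the absolute value.
    bounded_response = max(abs(v) for v in post) <= POST_STIMULUS_PEAK_LIMIT_UV
    return quiet_baseline and bounded_response
```

For example, a segment with a 10 μV pre-stimulus standard deviation or a 50 μV post-stimulus excursion would be rejected, whereas one with a 1 μV baseline standard deviation and a 5 μV response would be kept.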

The surviving segments were used to compute a weighted average, where the weighting for a segment was the reciprocal of the standard deviation of the pre-stimulus signal, divided by the sum of the reciprocal standard deviations over all surviving segments (e.g., see the similar approach of Elberling and Wahlgreen 1985). This average waveform was then subjected to a second stage of quality assessment that focused on a period extending from 10 ms before the stimulus to 10 ms after it. First, each waveform was corrected for signal drift by subtracting a linear fit to the pre-stimulus baseline. Then, it was tested for adherence to two more empirically determined criteria: (1) standard deviation of the pre-stimulus baseline less than or equal to 0.03 (30 dB) or 0.05 μV (50, 70, and 80 dB) and (2) positive signal values at least 30 % of the time during waves I–III. Waveforms not meeting these criteria were rejected. Overall, 253 of 999 waveforms were rejected on the basis of noisiness; the results are based on the remaining 746.
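
The weighting scheme described above, in which each segment's weight is the reciprocal of its pre-stimulus standard deviation normalized by the sum of those reciprocals, can be sketched as below. This is an illustrative implementation only: the function name is hypothetical, the population standard deviation is an assumption, and every surviving segment is assumed to have a nonzero pre-stimulus standard deviation.

```python
from statistics import pstdev


def weighted_average(segments, n_pre):
    """Average segments with weights proportional to 1 / (pre-stimulus SD).

    segments: list of equal-length sample lists; the first n_pre samples of
    each segment precede the stimulus.
    """
    inverse_sd = [1.0 / pstdev(seg[:n_pre]) for seg in segments]
    total = sum(inverse_sd)
    weights = [w / total for w in inverse_sd]  # weights sum to 1
    n_samples = len(segments[0])
    return [
        sum(w * seg[i] for w, seg in zip(weights, segments))
        for i in range(n_samples)
    ]


# Two toy segments: the quieter one (SD = 1) gets twice the weight of the
# noisier one (SD = 2), so the averaged post-stimulus sample is
# (2/3) * 4 + (1/3) * 1.
quiet = [1.0, -1.0, 4.0]
noisy = [2.0, -2.0, 1.0]
print(weighted_average([quiet, noisy], n_pre=2))
```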

Stimulus artifact removal

Because the dependence of auditory nerve responses on stimulus level is different for condensation and rarefaction clicks (Peake and Kiang 1962), the conventional method of alternating click stimulus polarity to reduce stimulus artifact in the average waveforms was not used, so as to avoid complicating interpretation of the wave I data. As a consequence of using single-polarity clicks, the ABR waveforms (mainly the onset of wave I) at 70 and 80 dB were overlapped by a stimulus artifact. The artifact was removed from the 70 and 80 dB waveforms by subtracting an estimate of the artifact alone. The estimate was obtained by (1) recording the waveform produced by an 80-dB click at electrodes applied to an inert sphere of conducting material (ground chicken), which was substituted for a subject's head in a set-up that was otherwise identical to that during actual ABR recordings and (2) scaling the waveform amplitude to match the amplitude of any artifact in the ABR recording. This scaling-to-match amplitude was performed on one of the last of the successive peaks in the artifact waveform, which immediately preceded wave I. After subtraction, any obvious residual artifact preceded wave I (Fig. 2). However, to further ensure that the final wave I amplitude values were uncontaminated, amplitude was measured from peak-to-trough so as to cancel any residual artifact that might extend throughout wave I. The validity of this approach for artifact removal was tested by (1) measuring waveforms on the inert, substitute head for 80 dB as well as lower stimulus levels and (2) treating the waveforms recorded for lower levels as if they were actual ABR recordings by scaling and subtracting the 80-dB waveform. Near-complete removal of the artifact from the lower-level waveforms confirmed the validity of the procedures.
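
The scale-and-subtract step can be illustrated with the sketch below. It assumes that the scale factor is the ratio of the recording to the artifact template at a single reference sample (one of the last artifact peaks before wave I); the exact matching procedure is not specified beyond what is described above, and the function name and toy values are hypothetical.

```python
def remove_artifact(recording, artifact_template, reference_index):
    """Subtract a scaled artifact template from an averaged ABR waveform.

    The template is scaled so that it matches the recording at
    reference_index, a sample on an artifact peak preceding wave I.
    """
    scale = recording[reference_index] / artifact_template[reference_index]
    return [r - scale * a for r, a in zip(recording, artifact_template)]


# Toy example: the recording is a neural response plus 1.5 times the template.
template = [0.0, 2.0, -1.0, 0.0, 0.0]
response = [0.0, 0.0, 0.0, 0.3, -0.2]
recording = [r + 1.5 * a for r, a in zip(response, template)]

cleaned = remove_artifact(recording, template, reference_index=2)
print(cleaned)
```

In this toy case the neural response is zero at the reference sample, so the subtraction recovers it exactly; in real recordings any overlap between response and artifact at the reference sample would bias the scale factor, which is one reason the peak-to-trough measurement of wave I is a useful additional safeguard.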

FIG. 2

Mean ABR waveforms at each stimulus level for the same tinnitus, matched non-tinnitus, and young non-tinnitus subjects and stimulated ears represented in Figures 3 and 6. The waveform deflections preceding wave I are residual stimulus artifact.

Quantification of ABR amplitude

Quantification of ABR amplitude involved the following:

  1.

    Identification of the peaks of waves I, III, and V. A “peak” was defined as a local maximum within a specified time window. The time windows were defined separately for each stimulus level based on a grand average of ABR waveforms for the younger, non-tinnitus subjects at that level. The boundaries of the time windows coincided with troughs in the average waveform and points of inflection preceding wave I and following wave V. Peaks were picked automatically. However, the time windows and waveforms were also compared visually to confirm that each window accurately encompassed the appropriate peak. The applicability of time windows defined from young, non-tinnitus data to the other subject groups is not surprising given the similarity of peak latencies across groups, illustrated by the superimposed grand average ABR waveforms for the three groups in Figure 2.

  2.

    Measurement of wave amplitudes. Wave amplitudes were measured so as to quantify the activity of their underlying wave generators. Wave I amplitude was measured peak to subsequent trough for the additional reason of artifact rejection just described. Waves III and V were measured from pre-stimulus baseline to peak, rather than peak-to-trough, so as to include the magnitude of the slow, prolonged positivity on which the more rapid ABR fluctuations ride (e.g., see Fig. 2). In some cases, particularly at the lower stimulus levels, there was no definable peak for wave I (percentage of waveforms at 30 dB, 31 %; 50 dB, 18 %; 70 dB, 2 %; and 80 dB, 0 %) or wave III (30 dB, 8 %; 50 dB, 2 %; and 70 and 80 dB, 0 %). Whenever a wave I peak was missing, there was also no obvious trough, so wave I amplitude was zero and a latency could not be assigned. When there was no wave III peak, there was still sometimes a signal elevation above baseline in the wave III time window corresponding to the slow ABR positivity. In order to capture this elevation, wave III amplitude was quantified as the difference between the average signal level during the wave III time window and the pre-stimulus baseline (the average signal level over the 10 ms preceding the stimulus) and latency was indeterminate. A peak corresponding to wave V was always distinguishable.
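
The two steps above can be sketched in code as follows. This is a simplified illustration, not the original analysis software: time windows are given as inclusive sample indices, a "peak" is taken as the largest local maximum in the window, and the "subsequent trough" for wave I is assumed to be the minimum between the peak and the end of the window. All function names are hypothetical.

```python
from statistics import mean


def find_peak(waveform, start, stop):
    """Index of the largest local maximum within [start, stop], or None."""
    last = min(stop, len(waveform) - 2)
    candidates = [
        i for i in range(max(start, 1), last + 1)
        if waveform[i - 1] < waveform[i] >= waveform[i + 1]
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda i: waveform[i])


def wave_i_amplitude(waveform, start, stop):
    """Peak-to-subsequent-trough amplitude; zero when no peak is definable."""
    peak = find_peak(waveform, start, stop)
    if peak is None:
        return 0.0
    end = min(stop, len(waveform) - 1)
    trough = min(range(peak, end + 1), key=lambda i: waveform[i])
    return waveform[peak] - waveform[trough]


def baseline_to_peak(waveform, start, stop, n_pre):
    """Baseline-to-peak amplitude, as used for waves III and V.

    Without a definable peak, the mean signal in the window relative to the
    pre-stimulus baseline is returned (the rule described for wave III).
    """
    baseline = mean(waveform[:n_pre])
    peak = find_peak(waveform, start, stop)
    if peak is None:
        return mean(waveform[start:stop + 1]) - baseline
    return waveform[peak] - baseline
```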

Finally, because neither wave amplitude nor latency varied systematically across the three electrode pairs, the amplitude and latency data for a given wave were averaged across pairs to yield the final values for each subject, ear, and stimulus level. Table 3 gives mean amplitude and latency for the tinnitus and matched, non-tinnitus groups defined at each stimulus level (see following section), and for the younger, non-tinnitus subjects.

TABLE 3 Amplitude and latency of waves I, III, and V for each stimulus level and subject group

Because wave V amplitude is commonly quantified peak-to-trough, this measurement of wave V was made for the two highest stimulus levels for comparison with our main, baseline-to-peak measurements. Specifically, amplitude was measured from peak to the trough following wave VI, which often merges into the falling edge of wave V. In contrast to the baseline-to-peak measure, the peak-to-trough measure did not differ between tinnitus and control subjects, indicating that the low-frequency components of the ABR waveform (comprising the slow, prolonged positivity on which the fast waves ride) were important to the difference in wave V amplitude reported in the “Results” (see “Discussion”).

Tinnitus/non-tinnitus matching

ABR data from tinnitus subjects were compared with those of non-tinnitus subjects from group 2 (subjects similar in age and threshold to the tinnitus subjects) by forming tinnitus and non-tinnitus datasets matched in mean threshold and age. (Sex was matched automatically since all subjects were male.) A separate match was performed at each stimulus level because the tinnitus subjects and stimulated ears providing data differed somewhat across stimulus levels: some ABR waveforms did not meet the noise criteria imposed during the analyses, and some subjects did not tolerate the 80 dB stimulus. In matching mean threshold and age, and calculating ABR wave amplitudes, results for left ear and right ear stimulation in a given subject were treated as separate data points.

Results

Reduced wave I amplitude but elevated waves III and V in male subjects with tinnitus

Figure 3 shows ABR amplitude data for tinnitus and non-tinnitus subject groups matched in mean age and threshold. Figure 3A–C shows mean amplitude of waves I, III, and V, respectively, at the four click levels presented (tinnitus, black bars and matched non-tinnitus, gray bars). Figure 3D shows mean pure-tone threshold for the stimulated ears represented in Figure 3A–C. Thresholds are plotted separately for each level because the ABR data at each level are for slightly different stimulated ears (see “Methods” and “Tinnitus/non-tinnitus matching”). At each level, the tinnitus and non-tinnitus thresholds were closely matched, as was mean age (Table 4), and sex (all subjects male).

FIG. 3

Reduced wave I amplitude but elevated waves III and V in tinnitus subjects compared with threshold-, age-, and sex-matched non-tinnitus subjects. A–C Mean amplitude of waves I, III, and V at each stimulus level. Error bars indicate ± one standard error. *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001, significant differences in rank-sum comparisons. Mean pure-tone thresholds ± one SE (D) are for the same subjects and stimulated ears as (A) to (C). The number of stimulated ears and subjects contributing to the tinnitus data at each level (30, 50, 70, and 80 dB) are: 21 (12 subjects), 22 (13), 25 (14), and 11 ears (7). Contributing to the matched non-tinnitus data are: 32 (19 subjects), 28 (17), 33 (19), and 14 ears (7). Threshold at 16 kHz is not plotted in (D) because threshold at this frequency was elevated beyond the limit of the measurement system in approximately 40 % of stimulated ears. Substituting this limit for threshold in these instances yields a minimum mean threshold at 16 kHz for the tinnitus and matched non-tinnitus cohorts of 39 and 36 dB HL, respectively.

TABLE 4 Characteristics of tinnitus and matched non-tinnitus subject groups

Despite the close matching, mean wave I amplitude was reduced in tinnitus subjects for 50, 70, and 80 dB clicks (Fig. 3A). The reduction at 80 dB was significant (rank-sum, p = 0.0007, not corrected for multiple comparisons). Because of the slight but systematic difference in the mean threshold above 8 kHz for the tinnitus vs. the non-tinnitus data (Fig. 3D, 80 dB audiograms), wave I amplitude was tested for dependence on threshold at these high frequencies by cross-correlating it with mean threshold from 9 to 14 kHz (omitting 10 kHz where the click stimulus has a spectral null). There was no significant correlation (p = 0.3; r = −0.24) indicating insensitivity of wave I amplitude to threshold above 8 kHz and making it highly unlikely that threshold differences at these high frequencies contributed significantly to the reduced wave I amplitude of the tinnitus group at 80 dB compared with the matched, non-tinnitus group. The results of the correlation analysis are, in fact, to be expected since stimulus intensity was lower and thresholds (both tinnitus and non-tinnitus) were higher above 9 kHz compared with below (Figs. 1 and 3D). Together, these factors would work to reduce cochlear stimulation above 9 kHz and thus diminish the relevance of threshold differences at these frequencies.

Despite the reduction in wave I, mean waves III and V amplitudes were increased in tinnitus subjects relative to non-tinnitus subjects (Fig. 3B, C). The elevations for wave V at 80 and 30 dB were significant (p ≤ 0.05).

To determine whether the reduction in wave I and the elevation in waves III and V were related, the amplitude of the two later waves was plotted versus wave I amplitude (Fig. 4). The vertical line in each panel of Figure 4 indicates mean wave I amplitude for the non-tinnitus data points in the plot (gray dots); the horizontal line indicates mean wave V (top row) and III (bottom row) amplitude. While only wave V amplitude at 80 dB correlated significantly with wave I amplitude (Spearman: p = 0.05, r = −0.39; all others: p ≥ 0.1), the data points for tinnitus subjects (black dots) tended to fall in the upper left quadrant of each panel and be absent from the lower right quadrant, meaning there was some tendency for the tinnitus subjects with smaller wave I amplitudes to also have larger waves V and III amplitudes.

FIG. 4

Individual data showing waves V (top) and III amplitudes (bottom) versus wave I amplitude for two stimulus levels, 80 (left) and 70 dB nHL (right). Tinnitus and matched non-tinnitus subjects are the same as those in Figure 3. Each point corresponds to ABR data for a given subject and stimulated ear. Vertical and horizontal lines indicate mean wave I amplitude and mean wave V (top row) and III (bottom row) amplitudes, respectively, for the matched non-tinnitus subjects.
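
For readers unfamiliar with the Spearman rank correlation used to relate the later-wave amplitudes to wave I amplitude, the following sketch computes the coefficient as the Pearson correlation of the ranks, with tied values given their average rank. It is a generic textbook implementation, not the software used in the study, and the example values are invented for illustration rather than taken from the data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation coefficient of two equal-length sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return covariance / (var_x * var_y) ** 0.5


# Invented values: larger "wave I" paired mostly with smaller "wave V".
wave_i = [0.10, 0.20, 0.30, 0.40]
wave_v = [0.60, 0.55, 0.40, 0.45]
print(spearman_r(wave_i, wave_v))
```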

The tendencies in Figure 4 motivated an analysis of the following amplitude ratios: V/I and III/I (Fig. 5; each point corresponds to a subject and stimulated ear; bars indicate medians). Tinnitus subjects showed significantly greater V/I and III/I amplitude ratios at 80 and 70 dB compared with the matched, non-tinnitus subjects (Fig. 5).

FIG. 5

Wave V/I (left) and III/I (right) amplitude ratios were elevated in tinnitus subjects for both 80 and 70 dB nHL clicks. Tinnitus and matched non-tinnitus subjects are the same as those in Figure 3 except that one non-tinnitus subject (No. 9) with zero wave I amplitude was excluded at 70 dB. Each data point corresponds to a given subject and stimulated ear. Bars indicate median. *p ≤ 0.05; ***p ≤ 0.001, significance of rank-sum comparisons.
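
The ratio calculation summarized in Figure 5 can be sketched as follows. Data points with zero wave I amplitude are skipped because the ratio is undefined, mirroring the exclusion noted in the caption. The function name is hypothetical and the example amplitudes are invented for illustration, not taken from the study.

```python
from statistics import median


def amplitude_ratios(later_wave_amplitudes, wave_i_amplitudes):
    """Per-ear ratio of a later wave (III or V) to wave I.

    Entries with zero wave I amplitude are excluded (ratio undefined).
    """
    return [
        later / wave_i
        for later, wave_i in zip(later_wave_amplitudes, wave_i_amplitudes)
        if wave_i > 0
    ]


# Invented amplitudes in microvolts; the last ear has no definable wave I.
wave_v = [0.50, 0.60, 0.40, 0.45]
wave_i = [0.25, 0.20, 0.40, 0.0]

ratios = amplitude_ratios(wave_v, wave_i)
print(ratios, median(ratios))
```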

Testing for effects of variables other than tinnitus

Given previous results showing dependencies of midbrain fMRI activation on SLT (Gu et al. 2010), the amplitude of ABR waves I, III, and V, as well as the amplitude ratios, III/I and V/I, were examined for effects of SLT in two ways. First, each ABR measure at each stimulus level was tested for correlations with loudness discomfort level (LDL) and with score on a questionnaire of SLT (SLTQ score). The correlation calculation at each stimulus level was based on the tinnitus and matched, non-tinnitus data contributing to Figure 3. Neither LDL nor SLTQ score showed correlations with wave I, III, or V amplitude or the amplitude ratios. A second analysis involved first categorizing the tinnitus and non-tinnitus data according to SLT category, as in Gu et al. (2010). The classification was such that subjects having both an SLTQ score of ≥0.7 and an LDL of >110 dB SPL were designated as having “normal” SLT, while others having lower SLTQ scores or LDL were classified as “abnormal.” A two-way analysis of variance (ANOVA; tinnitus category × SLT category) was then performed on the waves I, III, and V amplitudes and amplitude ratios at each stimulus level. The results confirmed statistically significant effects of tinnitus for wave I at 80 dB (p = 0.001), wave V at 80 and 30 dB (p = 0.007, 0.04, respectively), the V/I amplitude ratio at 80 and 70 dB (p = 0.003, 0.05), and the III/I ratio at 80 dB (p = 0.04). In contrast, there were no significant effects of SLT (p ≥ 0.2).
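
The SLT classification rule described above is simple enough to state as code. The sketch below assumes the LDL cutoff is expressed in dB SPL; the function name is hypothetical.

```python
def slt_category(sltq_score, ldl_db_spl):
    """Classify sound-level tolerance from SLTQ score and LDL.

    "normal" requires both an SLTQ score of at least 0.7 and an LDL above
    110 (assumed to be in dB SPL); everything else is "abnormal".
    """
    if sltq_score >= 0.7 and ldl_db_spl > 110:
        return "normal"
    return "abnormal"


print(slt_category(0.8, 115))  # both criteria met
print(slt_category(0.8, 105))  # LDL too low
print(slt_category(0.5, 115))  # SLTQ score too low
```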

To assess further whether variables other than tinnitus could account for the observed differences in ABR measures between tinnitus and non-tinnitus data, waves I, III, and V amplitudes and amplitude ratios were tested for correlations with score on the depression and anxiety questionnaires. As for the SLT analysis, the correlations were calculated for each stimulus level based on the tinnitus and non-tinnitus data in Figure 3. In each instance of a significant correlation, effects of tinnitus were compared with the correlated variable(s) (depression score and/or anxiety score) in an ANOVA. The results are as follows:

  1.

    Wave I at 80 dB was significantly correlated with depression (p = 0.005), and a two-way ANOVA (tinnitus × depression) showed effects of both depression (p = 0.03) and tinnitus (p = 0.02; no interaction, p = 0.2). Thus, it is possible that depression, in addition to tinnitus, was a factor in the tinnitus vs. non-tinnitus difference in wave I amplitude at 80 dB.

  2.

    Wave I at 70 dB showed significant correlations with depression (p = 0.02) and anxiety (p = 0.007), but a three-way ANOVA showed no effect of depression or anxiety, or of tinnitus (p ≥ 0.4 for each variable). The latter result is consistent with the lack of significant difference in the original tinnitus vs. non-tinnitus comparison.

  3.

    The V/I and III/I ratios at 70 dB showed a significant correlation with anxiety (V/I: p = 0.005, r = 0.49; III/I: p = 0.001, r = 0.54), but this was likely because anxiety tended to be higher in tinnitus subjects. Specifically, when anxiety and tinnitus were weighed against one another in a two-way ANOVA, there was a significant effect of tinnitus (V/I: p = 0.003; III/I: p = 0.04), no effect of anxiety (V/I: p = 0.3; III/I: p = 0.3), and any interaction was less than the effect of tinnitus (V/I: p = 0.05; III/I: p = 0.2). In other words, elevations in V/I and III/I amplitude ratios were more closely related to tinnitus than to anxiety.

Testing for relations to tinnitus characteristics

ABR measures differing significantly between tinnitus and matched non-tinnitus subjects were tested for correlations with tinnitus pitch, loudness, MML, or score on the TRQ. Specifically, the following measures in tinnitus subjects were cross-correlated with each of these variables: wave I amplitude at 80 dB, wave V amplitude at 30 and 80 dB, and the V/I and III/I amplitude ratios at 70 and 80 dB. Most of the correlations were not significant (p ≥ 0.2). However, there was a significant correlation between tinnitus loudness and wave I amplitude at 80 dB (Spearman: p = 0.008, r = −0.75), wave V amplitude at 30 dB (p = 0.04; r = −0.46), and wave V/I amplitude ratio at 80 dB (p = 0.04; r = 0.62). While the correlation for wave I at 80 dB was highly statistically significant, it was not corroborated by any similar trend at lower sound levels.

Reduced ABR amplitudes in matched, non-tinnitus subjects compared with young, non-tinnitus subjects

Figure 6 shows the matched, non-tinnitus data from Figure 3 alongside data for non-tinnitus subjects who, on average, were 20 years younger. The greater audiometric health of the younger non-tinnitus subjects is clear from both the higher ABR amplitudes for all waves and stimulus levels (Fig. 6A–C) and from the markedly better mean pure-tone thresholds at frequencies above 4 kHz (Fig. 6D). This comparison illustrates that the differences between tinnitus and matched non-tinnitus subjects occurred on top of pathology shared by the two groups and evident in comparison to the younger, non-tinnitus cohort.

FIG. 6

ABR amplitudes and thresholds in young, non-tinnitus subjects compared with the data for matched, non-tinnitus subjects from Figure 3. See Figure 3 caption.

Discussion

The primary result of the present study is as follows: on average, wave I was reduced in tinnitus subjects compared with closely matched non-tinnitus controls whereas waves III and V were not. In fact, waves III and V showed slight amplitude enhancements, which reached statistical significance for wave V at two of the four ABR stimulus levels. The enhancement of waves III and V relative to wave I, expressed as amplitude ratios, was also significant. While statistically significant, the differences between groups were small and likely detectable because of the close matching between tinnitus and non-tinnitus subjects—in mean hearing threshold over the frequency range of the ABR stimulus, in age, and in sex. Post hoc analyses ruled out dependencies of wave amplitudes on either SLT or anxiety. While the analyses raised the possibility that depression was a factor in wave I amplitude reduction, they also indicated that the effect of tinnitus was at least as strong. Thus, the results indicate a relationship between tinnitus and the wave reductions and enhancements documented in the present study.

Interpretations of reduced wave I in tinnitus subjects

It is well established that wave I of the ABR, both human and animal, is generated by the auditory nerve (Møller and Jannetta 1981; Buchwald and Huang 1975). Thus, the reduction in wave I for tinnitus subjects compared with non-tinnitus subjects matched in mean threshold, age, and sex indicates reduced auditory-nerve activity. Although mean threshold above 8 kHz was slightly, but systematically, poorer in tinnitus subjects, this cannot explain the reduction in wave I because of the substantial loss of hearing sensitivity in both tinnitus and matched non-tinnitus groups in this high-frequency range, combined with diminished energy in the stimulus at these frequencies.

There are several possible explanations for the reduced wave I amplitude in the tinnitus data compared with the matched, non-tinnitus data. One possibility is that there was diffuse loss of inner hair cells in the tinnitus subjects compared with their matched, non-tinnitus counterparts, which was not sufficient to manifest as an elevated mean threshold for the tinnitus subjects but was nevertheless sufficient to result in a lowered wave I amplitude. Another possibility is that inner hair cells, and also ANFs, were equally intact in tinnitus and closely matched non-tinnitus subjects, but excitability of ANFs was reduced via the lateral olivocochlear efferents, which terminate on their endings (Le Prell et al. 2003, 2005). Yet another possibility is that the inner hair cell population was equally intact in tinnitus subjects and matched, non-tinnitus subjects, but there was (1) diffuse loss of ANFs that was sufficient to manifest as a reduction in mean wave I amplitude but not a threshold elevation in tinnitus subjects and/or (2) loss of higher-threshold ANFs but not the lowest-threshold fibers determining behavioral threshold. The latter scenario has a proof of concept in recent animal work. After recovery from a temporary threshold shift, acoustically over-exposed mice showed a normal population of inner hair cells but degeneration of ANFs (Kujawa and Liberman 2009). A similar result has also been obtained in guinea pig, where it was further found that mainly higher-threshold, low- and medium-spontaneous-rate fibers had degenerated while lower-threshold, high-spontaneous-rate ANFs remained (Lin et al. 2011). Two observations suggest that the situation in these recent animal experiments applies to the human data of the present study: (1) wave I showed no difference in amplitude between tinnitus and matched non-tinnitus groups for the lowest level ABR stimulus and (2) the difference in wave I between groups was greatest for the highest stimulus level (see 30 and 80 dB data in Fig. 3A). While the precise anatomical and physiological basis for the threshold-independent reduction in wave I of tinnitus subjects cannot be stated unequivocally at present, there is little doubt that it reflects reduced ANF activity.
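
The selective-loss scenario can be illustrated with a toy calculation. The sketch below is not a model fitted to the present data: the fiber counts, thresholds, dynamic range, and piecewise-linear rate-level function are all hypothetical values chosen for illustration. It shows only that removing some high-threshold fibers can leave the summed response at a low stimulus level (30 dB) unchanged while reducing it at a high level (80 dB), with the lowest fiber threshold unaffected.

```python
# Toy illustration only: all numbers below are hypothetical, not measured values.

def fiber_rate(level_db, threshold_db, dynamic_range_db=20.0):
    """Normalized firing of one fiber: 0 below threshold, rising linearly
    over an assumed dynamic range, then saturating at 1."""
    x = (level_db - threshold_db) / dynamic_range_db
    return min(1.0, max(0.0, x))

def summed_response(level_db, thresholds_db):
    """Summed normalized activity across a fiber population (stand-in for wave I)."""
    return sum(fiber_rate(level_db, th) for th in thresholds_db)

# Hypothetical population: 60 low-threshold and 40 high-threshold fibers.
low_threshold_fibers = [10.0] * 60
high_threshold_fibers = [45.0] * 40

intact = low_threshold_fibers + high_threshold_fibers
# Assumed selective loss: half of the high-threshold fibers are removed.
selective_loss = low_threshold_fibers + high_threshold_fibers[:20]

for level in (30.0, 80.0):
    print(level, summed_response(level, intact), summed_response(level, selective_loss))
```

In this sketch, the two populations share the same lowest threshold, so a threshold measure alone would not distinguish them.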

Ventral cochlear nucleus and tinnitus

In contrast to wave I, the neuronal generators of the subsequent ABR waves are somewhat species dependent. In cat, for instance, focal lesioning within the cochlear nucleus has demonstrated that ABR waves subsequent to wave I are generated by pathways originating in the VCN (Gardi et al. 1979; Achor and Starr 1980; Fullerton and Kiang 1990; Melcher et al. 1996b). This work has further indicated that two particular parallel pathways originating in VCN generate most, if not all, of the later ABR waves. One pathway arises from the globular bushy cells (GBCs) and the other from the spherical bushy cells (SBCs) (Melcher et al. 1996a, b; Melcher and Kiang 1996). While there is no equally direct evidence regarding the generators of the later ABR waves in humans, reasonable hypotheses follow from human anatomical data indicating that both GBCs and a prominent recipient of GBC projections, the medial nucleus of the trapezoid body, are poorly represented in humans, if at all (Adams 1986; Richter et al. 1983; Moore and Moore 1971; Moore and Osen 1979). These observations, combined with the cat lesion data, suggest that the SBC pathway largely accounts for the later waves of the human ABR (Melcher and Kiang 1996). The evidence specifically suggests that wave III is generated by SBCs themselves, while wave V is generated by neurons receiving direct projections from SBCs (e.g., principal neurons of the medial superior olive, which project to the inferior colliculus; Adams 1979; Smith et al. 1993). Thus, enhancements in waves III and V in tinnitus subjects relative to threshold-matched, non-tinnitus subjects likely indicate elevated activity in SBCs and neurons receiving SBC inputs.

Since the wave III enhancements in tinnitus subjects did not reach statistical significance (nor did the wave V enhancements at two stimulus levels), it is important to recognize that even unchanged wave III or wave V amplitudes—in the face of reduced wave I—indicate a compensating mechanism at work in the SBC pathway. The reason is that waves I, III, and V are produced by serially arranged neuronal populations, beginning with ANFs which provide the main source of excitatory input to SBCs. Therefore, the III/I and V/I amplitude ratios effectively quantify the degree to which reduced auditory nerve activity is reflected in the activity of subsequent populations of the SBC pathway. The fact that these ratios were significantly elevated in the tinnitus group indicates that the reduced activity is not simply carried forward. Rather, one or more mechanisms apparently work to elevate population neural activity of the SBC pathway.
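
The arithmetic behind this argument can be made explicit with a short sketch. The amplitudes below are hypothetical values in arbitrary units, not data from the present study, and the function name is introduced here only for illustration. The point is that when wave I is reduced while waves III and V are unchanged, both ratios necessarily rise.

```python
# Illustrative arithmetic only: amplitudes are hypothetical, in arbitrary units.

def amplitude_ratios(wave_i, wave_iii, wave_v):
    """Return the III/I and V/I amplitude ratios for one set of wave amplitudes."""
    if wave_i <= 0:
        raise ValueError("wave I amplitude must be positive to form a ratio")
    return {"III/I": wave_iii / wave_i, "V/I": wave_v / wave_i}

# Hypothetical control: reference amplitudes for waves I, III, and V.
control = amplitude_ratios(wave_i=0.40, wave_iii=0.30, wave_v=0.60)

# Hypothetical tinnitus case: wave I reduced, waves III and V unchanged.
tinnitus = amplitude_ratios(wave_i=0.30, wave_iii=0.30, wave_v=0.60)

print(control)
print(tinnitus)
```

If the reduced input were simply carried forward, waves III and V would shrink in proportion to wave I and the ratios would stay the same; unchanged later waves therefore imply elevated ratios.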

Note that in attributing the wave amplitude differences between tinnitus and matched, non-tinnitus subjects to differences in the amount of activity in the neuronal populations generating each wave, we recognize that the degree to which activity is synchronized across neurons is another, theoretical explanation for the differences. However, the increased (decreased) synchronization needed to produce amplitude enhancements (reductions) would also lead to narrowing (broadening) of waves III and V (wave I), which was not seen. Thus, differences in total underlying activity appear to be the more likely explanation for the ABR amplitude differences between the tinnitus and matched, non-tinnitus subject groups.
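
The distinction between the two explanations can be sketched with a simple idealization. The following assumes, purely for illustration, that each neuron contributes a Gaussian-shaped unit waveform and that latencies across neurons are Gaussian distributed; neither assumption is taken from the present study, and all parameter values are hypothetical. Under these assumptions, the expected summed waveform is itself Gaussian, so its peak and width can be written in closed form.

```python
import math

# Toy idealization: Gaussian unit waveforms and Gaussian latency jitter are
# assumptions made for illustration, not properties established by the study.

def population_wave(n_units, unit_sigma_ms, jitter_sigma_ms, unit_peak=1.0):
    """Return (peak amplitude, full width at half maximum in ms) of the
    expected sum of n_units jittered Gaussian unit waveforms."""
    # Convolving two Gaussians gives a Gaussian whose variances add.
    total_sigma = math.hypot(unit_sigma_ms, jitter_sigma_ms)
    peak = n_units * unit_peak * unit_sigma_ms / total_sigma
    fwhm = 2.0 * math.sqrt(2.0 * math.log(2.0)) * total_sigma
    return peak, fwhm

baseline = population_wave(n_units=1000, unit_sigma_ms=0.2, jitter_sigma_ms=0.2)
more_synchronized = population_wave(n_units=1000, unit_sigma_ms=0.2, jitter_sigma_ms=0.1)
more_active = population_wave(n_units=1200, unit_sigma_ms=0.2, jitter_sigma_ms=0.2)

print("baseline          ", baseline)
print("more synchronized ", more_synchronized)
print("more active       ", more_active)
```

In this idealization, tighter synchronization raises the peak and narrows the wave, whereas more contributing activity raises the peak and leaves the width unchanged, which is the pattern the argument above relies on.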

Possible mechanisms underlying wave III/I and V/I elevations in tinnitus subjects

The greater wave III/I and V/I amplitude ratios of tinnitus subjects compared with matched non-tinnitus subjects could arise in a variety of ways including via (1) alterations in the membrane properties of neurons in the SBC pathway leading to their greater excitability, (2) diminished inhibition of the SBC pathway from descending projections, or, possibly, (3) synaptic remodeling (Kim et al. 2004; Zeng et al. 2009; or lack thereof, Kraus et al. 2011). Recent neurophysiological data in animals support the presence of increased SBC excitability following acoustic trauma, a common cause of tinnitus (Vogler et al. 2011). In particular, elevated spontaneous activity was evident in the VCN, particularly in two VCN unit types, one of which (primary-like) corresponds to SBCs.

Shared pathology in tinnitus and matched, non-tinnitus subjects: comparison to young, non-tinnitus subjects

In comparing the tinnitus subjects to the matched, non-tinnitus subjects, it is worth recognizing that the auditory periphery of the matched, non-tinnitus subjects was not normal even though their audiograms were generally within clinically normal limits. In particular, compared with the younger, audiometrically healthier non-tinnitus subjects, mean pure-tone thresholds for the matched non-tinnitus subjects were elevated at frequencies above 4 kHz and the overall amplitude of the ABR, not just wave I, was reduced (Fig. 6). Thus, the reduced wave I of the tinnitus group compared with the matched non-tinnitus group reflected reduced auditory nerve activity on top of that already manifest in the matched non-tinnitus subjects. Moreover, the enhancements in waves III and V in tinnitus subjects partly counteracted already reduced amplitudes. In other words, the differences in ABR between tinnitus and matched non-tinnitus subjects occurred against a background of substantial, shared peripheral pathology. Even though this shared pathology is not tinnitus specific, one cannot discount the possibility that it is a contributing factor to the development of tinnitus.

Comparison to previous ABR studies of tinnitus

The present results reinforce and extend the findings of the previous ABR studies reviewed in the introduction. They are in agreement with previous reports of the following in tinnitus subjects: elevated V/I amplitude ratio (Kehrle et al. 2008; Schaette and McAlpine 2011), elevated wave III amplitude (Attias et al. 1996), and reduced wave I amplitude (Attias et al. 1993; Schaette and McAlpine 2011). An absence of reduction in wave III of tinnitus subjects (Attias et al. 1993) or wave V (Attias et al. 1993; Schaette and McAlpine 2011), despite reduced wave I, has also been reported, which is tantamount to enhanced activity given the reduction in wave I (as discussed above).

The similarity of ABR findings across studies is important because it suggests that the neurophysiological processes behind the findings may be widely present in the tinnitus population at large and are not restricted to a particular tinnitus subpopulation that happened to have been disproportionately represented in a particular recruited sample of tinnitus subjects.

The major difference between the present and previous studies lies in the interpretation of the results. Attias et al. (1993, 1996) doubted the trends in their data, which have now been confirmed. Kehrle et al. (2008) offered no direct interpretation. Schaette and McAlpine (2011) used a dorsal cochlear nucleus (DCN) model to interpret their finding of a V/I amplitude elevation despite evidence that the DCN and its output pathways make little or no contribution to the ABR (Gardi et al. 1979; Achor and Starr 1980; Fullerton and Kiang 1990; Melcher et al. 1996b). Here, we attribute the enhanced V/I and III/I amplitude ratios to elevated neuronal activity in a pathway through VCN.

While most aspects of the present results mirror effects seen in at least one of the previous reports, one does not: the frank enhancement in wave V amplitude in tinnitus subjects (as distinct from an absence of reduction). The wider-band filtering of the ABR waveforms in the present study may account for this (and may also have helped reveal the enhancement in wave III). Typical ABR high-pass filter cutoffs of 100 or 200 Hz were used in the previous work, compared with the cutoff of 5 Hz used here, which was chosen to ensure that the ABR in its entirety was quantified (Fullerton et al. 1987). This filtering explanation for the differing wave V results between the present and previous studies is consistent with results for wave V measured peak-to-trough in the present study (see “Methods”). This measure, in contrast to the primary reported measure from baseline to peak, did not differ between tinnitus and control subjects and does not capture the amplitude contribution from lower frequency components of the ABR waveform.
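
The different sensitivity of the two amplitude measures to low-frequency content can be illustrated with a synthetic waveform. The sketch below is an idealization under stated assumptions: the fast biphasic component, the slow pedestal, the latencies, and the measurement windows are hypothetical and do not reproduce the actual waveforms or measurement procedure of the present study. Adding or omitting the slow pedestal stands in, loosely, for recording with a low versus a high filter cutoff.

```python
import math

# Synthetic illustration: component shapes, latencies, and windows are
# hypothetical choices, not the study's waveforms or analysis settings.

def gauss(t, center, sigma):
    return math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))

def synthetic_wave_v(times_ms, slow_amp):
    """Fast peak-then-trough component plus an optional slow pedestal."""
    return [
        0.3 * (gauss(t, 5.6, 0.25) - gauss(t, 6.4, 0.25))  # fast component
        + slow_amp * gauss(t, 6.0, 1.5)                     # slow component
        for t in times_ms
    ]

def measure(times_ms, wave):
    """Return (baseline-to-peak, peak-to-trough) amplitudes."""
    early = [v for t, v in zip(times_ms, wave) if t <= 1.0]
    baseline = sum(early) / len(early)
    peak = max(v for t, v in zip(times_ms, wave) if 5.0 <= t <= 6.0)
    trough = min(v for t, v in zip(times_ms, wave) if 6.0 <= t <= 7.0)
    return peak - baseline, peak - trough

times_ms = [i * 0.01 for i in range(1001)]  # 0 to 10 ms in 0.01 ms steps

b2p_fast_only, p2t_fast_only = measure(times_ms, synthetic_wave_v(times_ms, 0.0))
b2p_with_slow, p2t_with_slow = measure(times_ms, synthetic_wave_v(times_ms, 0.2))

print("baseline-to-peak:", b2p_fast_only, b2p_with_slow)
print("peak-to-trough:  ", p2t_fast_only, p2t_with_slow)
```

In this sketch, the slow pedestal raises the baseline-to-peak value substantially but shifts the peak and the trough by nearly the same amount, so the peak-to-trough value is almost unchanged.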

Comparison to fMRI studies of tinnitus

It might appear contradictory that the present ABR results indicate a relationship between levels of input activity to the inferior colliculus (wave V amplitude) and tinnitus whereas previous fMRI data related elevated sound-evoked responses in the inferior colliculi to abnormal SLT (hyperacusis), and not to tinnitus. The two results are, however, readily reconciled since the ABR and fMRI are sensitive to different aspects of neural activity. fMRI measures neural activity indirectly through a chain of slow, hemodynamic processes. It is therefore far less sensitive to the degree of synchronization of activity across neurons than are evoked potentials such as the ABR. While ABR wave V reflects highly synchronized activity (putatively in the SBC pathway), fMRI activation of the inferior colliculi reflects activity from numerous other neuronal populations as well—activity that may swamp any contribution from the SBC pathway. The net result is that the ABR and fMRI activation reflect activity in different neuronal populations. It is therefore understandable that the two measures could show relationships to different aspects of the tinnitus condition, the presence of tinnitus in the case of the ABR, and the abnormal SLT in the case of fMRI.

Roles for the VCN and DCN in tinnitus

It is important to recognize that the present results, while implicating the ventral division of the cochlear nucleus in tinnitus, in no way exclude a role for the dorsal division, the function of which is not reflected in ABR measurements. In fact, it is entirely possible that multiple cochlear nucleus cell populations play a role in tinnitus, both VCN populations and, as indicated by a large body of animal work, DCN populations (Brozoski et al. 2002; Kaltenbach et al. 2004). It is also possible that the cochlear nucleus neuronal populations involved in tinnitus differ among people. The latter idea is interesting in view of the inter-subject variability in V/I and III/I amplitude ratios (Fig. 5). Perhaps the tinnitus subjects with high V/I ratios have tinnitus mediated by the VCN, whereas those with a low V/I ratio, comparable to non-tinnitus controls, have tinnitus mediated via the DCN. This proposal could be tested in experiments already examining the neurophysiology of the DCN in animal models of tinnitus by adding (comparatively simple) ABR measurements.