Abstract
Birdsong, like human speech, relies critically on auditory feedback to provide information about the quality of vocalizations. Although the importance of auditory feedback to vocal learning is well established, whether and how feedback signals influence vocal premotor circuitry has remained obscure. Previous studies in singing birds have not detected changes to vocal premotor activity after perturbations of auditory feedback, leading to the hypothesis that contributions of feedback to vocal plasticity might rely on “offline” processing. Here, we recorded single and multiunit activity in the premotor nucleus HVC (proper name) of singing Bengalese finches in response to feedback perturbations that are known to drive plastic changes in song. We found that transient feedback perturbation caused reliable decreases in HVC activity at short latencies (20–80 ms). Similar changes to HVC activity occurred in awake, nonsinging finches when the bird's own song was played back with auditory perturbations that simulated those experienced by singing birds. These data indicate that neurons in avian vocal premotor circuitry are rapidly influenced by perturbations of auditory feedback and support the possibility that feedback information in HVC contributes “online” to the production and plasticity of vocalizations.
Introduction
The learning and maintenance of vocalizations in humans and songbirds relies critically on auditory feedback (Doupe and Kuhl, 1999; Konishi, 2004). Speech and song are subserved by specialized vocal premotor structures. In songbirds, the forebrain areas HVC (proper name) and RA (robust nucleus of the arcopallium) generate premotor commands for song (see Fig. 1). Ultimately, such vocal premotor regions must be shaped by auditory feedback to give rise to appropriate patterns of activity that generate speech and song. Moreover, studies in humans and songbirds indicate that vocalizations can be modulated “online” by perturbations of auditory feedback, suggesting that auditory signals have real-time access to vocal premotor circuitry (Howell and Archer, 1984; Houde and Jordan, 1998; Cynx and von Rad, 2001; Sakata and Brainard, 2006). Such signals are posited to be crucial for online control of vocalizations and vocal learning and maintenance (Doya and Sejnowski, 1998; Troyer and Doupe, 2000).
Despite these observations, the auditory feedback signals that inform premotor pathways about vocal performance remain obscure. For humans there is very limited evidence that feedback has access to vocal premotor structures during speech (McGuire et al., 1996; Hirano et al., 1997; Hashimoto and Sakai, 2003), and for songbirds, such evidence has been completely lacking (Konishi, 2004). In zebra finches, passively presented auditory stimuli can activate vocal premotor circuitry in anesthetized or sleeping birds, but auditory signals are generally attenuated or absent in awake animals (Dave et al., 1998; Schmidt and Konishi, 1998; Rauske et al., 2003; Cardin and Schmidt, 2004). Deafening does not dramatically alter singing-related activity in the song system (Hessler and Doupe, 1999), and several studies have not detected online responses to feedback perturbations in song system nuclei (Leonardo, 2004; Kozhevnikov and Fee, 2007; Prather et al., 2008). Collectively, these data have led to the suggestion that song system structures do not have access to information about the quality of feedback during vocal production and that vocal learning might rely on offline feedback processing, for example during sleep (Dave and Margoliash, 2000; Margoliash, 2003; Konishi, 2004). Knowledge of whether vocal premotor structures have online access to auditory feedback is fundamental to understanding how vocal learning proceeds (Tchernichovski et al., 2001).
Here, we investigated responses of HVC neurons in singing Bengalese finches to transient perturbations of auditory feedback. As in the zebra finch, such feedback perturbations can drive gradual changes to Bengalese finch song (Leonardo and Konishi, 1999; Leonardo, 2004; Kozhevnikov and Fee, 2007; Tumer and Brainard, 2007). However, behavioral experiments indicate a stronger reliance on auditory feedback for Bengalese finch song than for zebra finch song; deafening alters adult Bengalese finch song within days as opposed to weeks for zebra finch song (Nordeen and Nordeen, 1992; Okanoya and Yamaguchi, 1997; Woolley and Rubel, 1997). Hence, we anticipated that auditory feedback signals might be more salient in Bengalese finches. We recorded extracellular activity from HVC because it receives auditory inputs and generates motor commands for song (Fee et al., 2004; Mooney, 2004). Because of its projections (Fig. 1), the presence of feedback signals within HVC would indicate their likely availability elsewhere in the song system.
Materials and Methods
Animals
Adult Bengalese finch males (age range, 3–28 months; n = 14) were raised in our colony and selected based on song structure and amount of singing. Birds were housed with their parents until at least 60 d of age, then housed with other males on a 14 h light/10 h dark photoperiod. For testing, birds were isolated and housed individually in sound-attenuating chambers (Acoustic Systems), and food and water were provided ad libitum. All procedures were performed in accordance with established animal care protocols approved by the University of California, San Francisco Institutional Animal Care and Use Committee.
Data collection
Details of surgery are outlined in Hessler and Doupe (1999). Electrodes (1–4 MΩ) that were carried by a lightweight microdrive [California Institute of Technology (Caltech), Pasadena, CA and University of California, San Francisco (UCSF), San Francisco, CA machine shops] were stereotaxically targeted to either the right or left HVC under anesthesia using isoflurane or equithesin. No significant difference in the magnitude of neural effects was observed between the hemispheres, so data were pooled for analysis. Sound was recorded using an omnidirectional microphone (Countryman Associates), and acoustic signals were bandpass filtered between 0.3 and 9 kHz (Krohn-Hite). A computerized, song-activated recording system was used to detect and digitize song and neural activity (observer, A. Leonardo, Caltech, C. Roddey, UCSF; digitized at 32 kHz) for later offline analysis using software written in the Matlab programming language (Mathworks). Neural activity was bandpass filtered between 0.3 and 10 kHz (A-M Systems). All songs and neural activity were collected from birds singing in isolation (“undirected” song).
At the conclusion of experimentation, lesions were made at recording sites (10 μA for 20 s), and birds were killed using isoflurane and perfused transcardially with saline followed by 3.7% formalin. Brain sections were cut at 40 μm and Nissl-stained. We verified that recordings were made from HVC by the presence of characteristic activity during singing and song playback and by post hoc examination of electrode placement in histological sections.
Altered auditory feedback during singing
Experimental design.
We implemented reversible feedback manipulations instead of deafening, a common method of assessing auditory contributions, because deafening experiments rely on comparisons of neural activity from potentially different populations of neurons before and after the surgical removal of the cochlea. Consequently, deafening experiments are likely to be sensitive only to relatively gross changes to activity associated with hearing loss. In contrast, reversible perturbations of feedback in singing birds enable interleaved recordings from maintained populations under varying conditions of feedback, thereby increasing the ability to measure any feedback-dependent contribution to neural activity.
Behavioral studies in both humans and birds indicate that the influence of feedback disruption depends on its timing relative to ongoing vocalizations (Howell and Archer, 1984; Howell and Powell, 1987; Sakata and Brainard, 2006). If the magnitude or timing of feedback signals in the brain also depends on the timing of feedback, then the statistical power to detect feedback-driven changes to neural activity will be attenuated depending on the temporal variability of feedback perturbations across trials. Therefore, to minimize this temporal variability, we used a computerized system that detected specific spectral features of targeted syllables as they were being produced and superimposed the sound of a feedback element (a syllable from the male's repertoire) at a short and controlled latency. Using this design, the variation across trials in the timing of feedback perturbation was <6 ms (SD). We expected that such reproducibility of feedback disruptions would facilitate the detection of feedback signals attributable to more reproducible differences between neural activity during interleaved control and feedback trials.
To the extent that variability in the timing of feedback perturbation results in decreased sensitivity, larger sample sizes will be important to detect and characterize such signals. For our experiments, we recorded multiunit and single unit activity in HVC in response to feedback perturbation and focused our analysis on datasets with larger sample sizes. For multiunit recordings (n = 36), we restricted our analysis to experiments with at least 46 total trials (range, 46–197; mean, 97; where a trial reflects a fixed pattern of song with or without disrupted feedback). For single unit recordings, we restricted our analysis to experiments with at least 14 total trials (range, 14–118; mean, 44).
Details of feedback perturbation.
The procedure for targeting syllables for feedback perturbation was identical to that of Sakata and Brainard (2006). In brief, birds were housed individually in sound boxes for at least 24 h, during which baseline songs were recorded. We defined “syllables” as individual acoustic elements of Bengalese finch song that are separated from each other by at least 5 ms of silence (Okanoya and Yamaguchi, 1997). Over the course of a single experimental session (<1 d), syllables that were produced often in song were detected based on their pattern of spectral features. After detection, a prerecorded sound (feedback element: syllable from the male's repertoire) was played back at a short and fixed latency via a free-field speaker so that the singing bird experienced a temporally localized superposition of extraneous feedback and his own normal feedback. When presented in isolation, single syllables elicited transient short latency increases in activity, indicating that they are salient stimuli for HVC neurons. The range of intensities at which feedback elements were played (∼70–100 dB) approximates the intensity of the bird's own vocalizations during song production measured within 10 cm of the bird (Cynx and von Rad, 2001), and a bird experienced the same feedback element across experimental sessions. On randomly interleaved control trials, targeted syllables were detected, but extraneous feedback was omitted. With this experimental design, we could directly assess the real-time consequences of altering feedback on neural activity under interleaved normal and altered feedback conditions.
Across experiments, durations of targeted syllables averaged ∼53 ms (range, 42–73 ms), durations of feedback elements averaged ∼70 ms (range, 60–80 ms), and the delay between the onset of the target syllable and onset of the feedback element averaged ∼46 ms (range, 25–78 ms). Previous behavioral experiments indicated that there was not a significant difference between the effectiveness of feedback elements that matched the targeted syllable versus those that did not (Sakata and Brainard, 2006). Here, we also used feedback elements that either matched (nine experiments in three birds) or differed from (39 experiments in nine birds) the targeted syllable and found no differences in either behavioral effectiveness or effectiveness in eliciting changes to HVC activity. Hence, data were pooled for analysis.
Auditory responses.
Stimuli for auditory experiments were generated from previously recorded versions of the bird's own song (BOS) and conspecific songs (CON), and stimuli were matched for peak amplitude. One exemplar of BOS and a reversed version of the same stimulus (rBOS) as well as CON stimuli (1–4 exemplars) were played back in an interleaved manner from a speaker mounted above the bird's cage (60–80 dB at center of cage, A scale). Across experiments, inter-stimulus intervals ranged from 5 to 15 s, and stimuli were played back 10–48 times per experiment (median, n = 23). For 35 of the 55 experiments, we used the protocol for characterization of auditory responses in awake zebra finches outlined in Schmidt and Konishi (1998); chamber lights were extinguished ∼1–5 min before playback initiation to minimize movement and vocalizations in response to playbacks. These experiments lasted <20 min, and small movements were often heard during these sessions; hence, the birds likely remained awake throughout the experiment. In the remaining 20 experiments, lights remained on, and birds were observed on a video monitor to confirm that they remained awake. There were no significant differences in the selectivity of neural responses between experiments in which lights were on or off [discriminability (d′)BOS-rBOS: lights off, 3.2 ± 0.7; lights on, 3.2 ± 0.6; average values per bird ±SEM], and, hence, data were pooled.
To qualitatively compare how disruptions of the sound of the bird's own song affect HVC activity in singing versus quiescent birds, we also analyzed how HVC neurons responded to the sound of acoustic stimuli that simulated those experienced while birds sang under conditions of experimentally perturbed feedback. We played back prerecorded versions of BOS to passively listening birds and used the same automated system that was used to perturb feedback during singing to alter the acoustic stimuli at targeted times during BOS playback (by superposition of single syllables on BOS; 16 experiments in five birds). Normal and altered stimuli were randomly interleaved so that we could compare directly the responses of HVC neurons under these conditions. Trials with perturbed or unperturbed feedback were presented 12–79 times (median, n = 47).
Data analysis
All song analysis was done offline in Matlab (Mathworks). Raw neural data for each experiment were analyzed using a Bayesian spike sorting algorithm (M. S. Lewicki, Caltech; B. D. Wright, UCSF) (Lewicki, 1994, 1998). For 12 experiments, spike sorting identified single units that exhibited characteristics reported for HVC interneurons (narrow spike widths, high rates of spontaneous activity, continuous firing during singing and song playback, and firing during calls) (supplemental Fig. 1, available at www.jneurosci.org as supplemental material) (Mooney, 2000; Hahnloser et al., 2002; Rauske et al., 2003; Kozhevnikov and Fee, 2007). In a small number of cases, we also identified single units that exhibited characteristics of sparse firing HVC projection neurons (broader spike widths, little or no spontaneous activity, temporally sparse and precise firing during only one or two syllables of song, and no activity during calls) (supplemental Fig. 1, available at www.jneurosci.org as supplemental material) (Mooney, 2000; Hahnloser et al., 2002; Rauske et al., 2003; Kozhevnikov and Fee, 2007; Prather et al., 2008). In no cases were sparse firing neurons (putative projection neurons) active at the time of the fixed location of feedback perturbation in our experiments. Therefore, in Results, the 12 reported single unit experiments reflect the responses of putative interneurons. An additional 36 experiments were classified as multiunit, either because spike sorting did not reveal distinct clusters, or because there were excessive refractory period violations (>1.5% of interspike intervals <1 ms). Although single units could not be confidently extracted from these recordings, in most cases the action potential models fit by the Bayesian spike sorter were narrow, and firing patterns were continuous throughout song, with activity increasing before song initiation. 
We therefore think it is likely that these recordings also were dominated by HVC interneurons, but cannot rule out contributions of projection neurons. For these multiunit experiments, raw neural traces were thresholded to generate spike counts with thresholds set at least 2 SD (range, 2.2–4.1 SD) above background levels. For both single and multiunit recordings, feedback conditions (control or perturbed) were randomly interleaved throughout each experiment. Hence, differences in neural activity between conditions can be attributed to the effect of feedback perturbation rather than any changes over time in the population of recorded units (for multiunit experiments) or the quality of unit isolation (for single unit experiments).
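The two numerical criteria described above (a threshold set a fixed number of SDs above background, and a multiunit label when >1.5% of interspike intervals are <1 ms) can be illustrated with a minimal Python sketch. This is not the analysis code used in the study, which was written in Matlab; the function names, the use of upward threshold crossings as spike events, and the use of the population SD of a background segment are all assumptions made for illustration.

```python
import statistics


def threshold_crossings(trace, n_sd, background):
    """Indices where the trace crosses upward through
    mean(background) + n_sd * SD(background).
    Assumption: background is a spike-free segment of the same recording."""
    mu = statistics.fmean(background)
    sd = statistics.pstdev(background)
    thresh = mu + n_sd * sd
    return [i for i in range(1, len(trace))
            if trace[i - 1] < thresh <= trace[i]]


def refractory_violation_fraction(spike_times_ms, refractory_ms=1.0):
    """Fraction of interspike intervals shorter than refractory_ms.
    Assumes spike_times_ms is sorted in ascending order."""
    isis = [b - a for a, b in zip(spike_times_ms, spike_times_ms[1:])]
    if not isis:
        return 0.0
    return sum(isi < refractory_ms for isi in isis) / len(isis)


def is_multiunit(spike_times_ms, max_fraction=0.015):
    """True when refractory violations exceed the 1.5% criterion."""
    return refractory_violation_fraction(spike_times_ms) > max_fraction
```

In this sketch a recording would be treated as multiunit whenever `is_multiunit` returns `True`; the text also describes a second route to that label (no distinct clusters from spike sorting), which is not modeled here.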
For all experiments, we calculated peri-stimulus time histograms (PSTHs) by convolving spike times for each trial with a Hanning window (10 ms width at half-height), resampled the data at 1 kHz, then calculated the mean ± SEM for control and feedback trials. The findings for single and multiunit recordings were qualitatively similar. For key points, we substantiate this by presenting separate examples and quantifications for single versus multiunit recordings. Otherwise, except as noted, results are pooled across all recordings.
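As an illustration of the smoothing step, the following Python sketch convolves spike times with a Hanning (raised-cosine) kernel and samples the result at 1 kHz. It is a sketch under stated assumptions, not the original Matlab implementation: the function names are hypothetical, the kernel is normalized to unit area so the output is in spikes per second, and a 10 ms width at half-height is taken to mean a 20 ms total window (a Hanning window's full width is twice its width at half-height).

```python
import math
import statistics


def hann_kernel(dt_ms, fwhm_ms=10.0):
    """Unit-area Hanning kernel evaluated at lag dt_ms (units: 1/ms)."""
    total = 2.0 * fwhm_ms  # full window width is twice the half-height width
    if abs(dt_ms) >= total / 2.0:
        return 0.0
    return (1.0 + math.cos(2.0 * math.pi * dt_ms / total)) / total


def smoothed_rate(spike_times_ms, t_start_ms, t_stop_ms, fwhm_ms=10.0):
    """Firing rate (spikes/s) for one trial, sampled every 1 ms (1 kHz)."""
    rate = []
    for t in range(t_start_ms, t_stop_ms):
        density = sum(hann_kernel(t - s, fwhm_ms) for s in spike_times_ms)
        rate.append(1000.0 * density)  # convert 1/ms to 1/s
    return rate


def psth(trials, t_start_ms, t_stop_ms, fwhm_ms=10.0):
    """Mean and SEM across trials at each 1 ms sample."""
    rates = [smoothed_rate(tr, t_start_ms, t_stop_ms, fwhm_ms) for tr in trials]
    n = len(rates)
    means, sems = [], []
    for samples in zip(*rates):
        means.append(statistics.fmean(samples))
        sems.append(statistics.stdev(samples) / math.sqrt(n) if n > 1 else 0.0)
    return means, sems
```

Here `trials` is a list of per-trial spike-time lists (in ms, aligned to a common reference such as feedback onset), and the function would be called once for control trials and once for feedback trials.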
In addition to driving gradual changes to song structure, superposition of extraneous feedback on ongoing song can sometimes induce acute, transient changes to song tempo (Sakata and Brainard, 2006). We measured online changes to tempo by comparing the interval from the onset of the targeted syllable to the onset of the first syllable after feedback perturbation for feedback versus control trials. We divided our analysis based on whether feedback caused a significant increase in this interval (t test; p < 0.05). Experiments were classified as “sensory-motor” if this difference was significant, and as “sensory-only” if this difference was not significant (supplemental Table 1, available at www.jneurosci.org as supplemental material). Although the distribution of feedback effects on song tempo was continuous, we used this statistical categorization so that we could specifically examine the degree to which feedback effects were present in experiments where there was no measurable influence on vocal output (sensory-only experiments) and compare the nature and magnitude of those effects with what was observed in experiments where there were significant changes to vocal output (sensory-motor experiments). We additionally analyzed seven acoustic features of the syllable immediately after feedback perturbation (mean frequency, frequency slope, amplitude slope, duration, spectral entropy, amplitude entropy, spectrotemporal entropy) (Sakata and Brainard, 2006). Changes to acoustic properties of syllables were rarely observed after feedback perturbations. However, for data presented in Results, we restricted our analysis to experiments in which none of these parameters were significantly affected by feedback perturbation (t test, α = 0.01 for multiple comparisons). 
Adopting a more conservative criterion for excluding experiments (α = 0.05, such that the threshold for excluding experiments is lower) did not affect the significance of the results or data interpretation, so we present only the analyses (n = 48 experiments) using the same criterion as in our previous study (α = 0.01) (Sakata and Brainard, 2006).
We also analyzed effects of feedback perturbation on song amplitude. Because control and feedback trials were randomly interleaved, the variation in song amplitude (caused by variation in the location of the behaving bird relative to the fixed microphone) was balanced across conditions. Consequently, for all experiments, amplitude profiles for syllables produced before feedback onset were equal across control and feedback conditions, eliminating the need to normalize song amplitude. Of the 48 experiments that we analyzed, there were three (sensory-motor) in which the amplitude of the syllable after the perturbative stimulus was significantly decreased on feedback trials. There was no clear relationship between the magnitude of this amplitude decrease and change in HVC activity, although the scarcity and low magnitude of amplitude effects provided little statistical power to draw strong conclusions about whether a component of the neural changes present in sensory-motor experiments corresponds with changes to syllable amplitude. Exclusion of the three experiments had no effect on the significance of reported results.
For both singing and playback experiments, more than one sequence was tested (with perturbation of feedback) at some recording sites, and in these cases, each sequence was analyzed as a separate experiment. For feedback perturbation in singing birds, we conducted 48 experiments at 22 sites in 10 birds. For playback of BOS to quiescent birds, we conducted 55 experiments (20 with lights on and 35 with lights off) in 11 birds. For playback of versions of BOS with targeted superposition of feedback elements, we performed 16 experiments in five birds.
To calculate the significance and latency of changes in HVC activity after altered auditory feedback (AAF), we computed the d′ value between HVC activity under normal versus altered feedback conditions (Green and Swets, 1966; Mooney, 2000; Solis and Doupe, 2000). This measure takes into account both the mean difference in activity between conditions and the variability of activity across trials to provide an indication of the discriminability (d′) between the patterns of activity in each condition. Data for each trial were smoothed using a Hanning window (10 ms width at half-height) then resampled at 1 kHz. At each time point we calculated the d′ using the following formula: $d'_{\text{AAF-NORMAL}} = \frac{2\,(\mu_{\text{AAF}} - \mu_{\text{NORMAL}})}{\sqrt{\sigma^2_{\text{AAF}} + \sigma^2_{\text{NORMAL}}}}$, where $\mu_{\text{AAF}}$ and $\mu_{\text{NORMAL}}$ refer to mean HVC activity under altered and normal feedback conditions, respectively, and where $\sigma^2_{\text{AAF}}$ and $\sigma^2_{\text{NORMAL}}$ refer to the variance of HVC activity under altered and normal feedback conditions, respectively. To determine significance we used randomization tests to assess the likelihood of obtaining specific d′AAF-NORMAL values given the measured trial-by-trial responses. For this process, we randomly shuffled, without replacement, each trial into one of two groups representing the normal and altered feedback groups, while conserving the sample size for each group, and then calculated d′ values for the shuffled dataset. This process was repeated 1000×. When d′AAF-NORMAL was greater than the 99th percentile of this distribution, we categorized this difference as significant. To cross-validate our statistical methods, we analyzed differences in HVC activity during the period before feedback disruption, when motor production and auditory feedback were matched for control and feedback trials. For this period, we found that differences in activity between the two sets of trials crossed the threshold for significance 0.8% of the time. This confirms that our statistical procedure is equivalent to setting α = 0.01. 
The same procedure was used to analyze the significance and latency of changes in HVC activity after localized perturbations of the sound of BOS.
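The d′ computation and randomization test for a single time point can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions rather than the code used in the study: the function names are hypothetical, the sample (n − 1) variance is assumed, and the measured d′ is compared with the shuffled distribution in absolute value, which the text does not specify.

```python
import math
import random
import statistics


def d_prime(altered, normal):
    """d' = 2 * (mean_altered - mean_normal) / sqrt(var_altered + var_normal)."""
    diff = statistics.fmean(altered) - statistics.fmean(normal)
    pooled = statistics.variance(altered) + statistics.variance(normal)
    if pooled == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return 2.0 * diff / math.sqrt(pooled)


def shuffled_null(altered, normal, n_shuffles=1000, seed=0):
    """Sorted |d'| values after reassigning trials to the two groups at
    random, without replacement, keeping each group's sample size."""
    rng = random.Random(seed)
    pool = list(altered) + list(normal)
    n_alt = len(altered)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(pool)
        null.append(abs(d_prime(pool[:n_alt], pool[n_alt:])))
    return sorted(null)


def is_significant(altered, normal, percentile=99.0, n_shuffles=1000, seed=0):
    """True when |d'| exceeds the given percentile of the shuffled values."""
    null = shuffled_null(altered, normal, n_shuffles, seed)
    index = min(len(null) - 1, int(len(null) * percentile / 100.0))
    return abs(d_prime(altered, normal)) > null[index]
```

Here `altered` and `normal` would hold the smoothed firing rate at one time point for each feedback and control trial; applying `is_significant` at every 1 ms sample gives a significance trace from which a latency can be read off.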
For the ideal observer analysis, we measured firing rate (smoothed using a Hanning window with 10 ms width at half-height and resampled at 1 kHz) at the maximally informative time (largest d′AAF-NORMAL) during the window 20–80 ms from feedback onset (“early” window). We then computed the probability of correctly assigning trials as control or feedback at systematically varying threshold values (Dayan and Abbott, 2001).
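A threshold-based ideal observer of this kind can be sketched in Python as below. The sketch assumes that a trial is labeled "feedback" when its firing rate falls below the threshold (consistent with feedback perturbation decreasing activity) and that candidate thresholds are taken from the observed rates; both choices, and the function names, are assumptions for illustration and are not taken from the original analysis code.

```python
def fraction_correct(control, feedback, threshold):
    """Fraction of trials assigned correctly when a trial is labeled
    'feedback' if its rate is below threshold and 'control' otherwise."""
    hits = sum(r < threshold for r in feedback)
    correct_rejections = sum(r >= threshold for r in control)
    return (hits + correct_rejections) / (len(control) + len(feedback))


def best_threshold(control, feedback):
    """Scan candidate thresholds and return (threshold, fraction correct)
    for the best-performing one; ties go to the lowest threshold."""
    candidates = sorted(set(control) | set(feedback))
    candidates.append(candidates[-1] + 1.0)  # one threshold above all rates
    scored = [(fraction_correct(control, feedback, th), th) for th in candidates]
    best_score = max(score for score, _ in scored)
    best_th = min(th for score, th in scored if score == best_score)
    return best_th, best_score
```

The inputs would be the per-trial firing rates at the maximally informative time point within the early window; a fraction correct of 0.5 corresponds to chance performance for equal numbers of control and feedback trials.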
Auditory selectivity for BOS versus rBOS was quantified using the signed d′ statistic: $d'_{\text{BOS-rBOS}} = \frac{2\,(\mu_{\text{RS-BOS}} - \mu_{\text{RS-rBOS}})}{\sqrt{\sigma^2_{\text{RS-BOS}} + \sigma^2_{\text{RS-rBOS}}}}$, where $\mu_{\text{RS-BOS}}$ and $\mu_{\text{RS-rBOS}}$ refer to the mean response strength (RS) of HVC activity after playback of BOS and rBOS, respectively, and where $\sigma^2_{\text{RS-BOS}}$ and $\sigma^2_{\text{RS-rBOS}}$ refer to the variance in RS after BOS and rBOS playback, respectively. The RS is defined as the difference between the firing rate (spikes per second) during a three second baseline period before playback onset and the rate during playback. Positive d′ values signify preferential activation in response to BOS, whereas negative values signify preferential activation in response to rBOS. Similar calculations were used to characterize the selectivity for BOS versus CON.
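The response strength for a single playback trial can be sketched in Python as follows. The sign convention (playback rate minus baseline rate, so that stronger activation gives a larger RS and positive d′ values indicate a preference for BOS) is an assumption inferred from the text, and the function name and arguments are hypothetical.

```python
def response_strength(spike_times_s, onset_s, offset_s, baseline_s=3.0):
    """Firing rate during playback minus firing rate during the baseline
    window immediately before playback onset (both in spikes per second)."""
    n_base = sum(onset_s - baseline_s <= t < onset_s for t in spike_times_s)
    n_stim = sum(onset_s <= t < offset_s for t in spike_times_s)
    return n_stim / (offset_s - onset_s) - n_base / baseline_s
```

Computing this value for every BOS trial and every rBOS trial gives the two sets of RS values whose means and variances enter the selectivity formula.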
Results
HVC neurons respond to auditory feedback perturbations during singing
We used chronic recordings to characterize single and multiunit activity in HVC of singing Bengalese finches under conditions of normal and altered feedback. Single units reported here were likely to be HVC interneurons based on their waveforms and patterns of activity when birds were quiescent, engaged in singing or calling, and listening to song playback (see Materials and Methods) (supplemental Fig. 1, available at www.jneurosci.org as supplemental material). Consistent with previous studies of singing-related activity of HVC neurons in the zebra finch (McCasland, 1987; Yu and Margoliash, 1996; Hahnloser et al., 2002; Kozhevnikov and Fee, 2007), we observed that HVC neurons in the Bengalese finch increased their activity before song initiation and exhibited modulation of activity locked to the structure of song (Fig. 2). Bengalese finch song consists of discrete acoustic units called syllables that are organized into learned sequences (Clayton, 1989; Okanoya and Yamaguchi, 1997; Sakata and Brainard, 2006). To quantify singing-related neural activity at individual recording sites, we identified stereotyped sequences of syllables that were produced by birds one or more times in each song (for example, the sequences “efgg…” in Fig. 2a,b and “sabb…” in Fig. 2c,d). The neural activity recorded during each rendition of the sequence was aligned and averaged to construct a histogram reflecting variation in firing rate over the course of the sequence (Fig. 2b,d). Firing rates at all recording sites were modulated across the production of a fixed sequence of syllables. The observed patterns of singing-related neural activity accord with the known role of HVC in the premotor control of song production, but do not reveal whether auditory feedback contributes to some component of that activity.
To test for contributions of auditory feedback to ongoing patterns of singing-related activity, we used a computerized system to alter feedback while simultaneously recording neural activity in HVC of singing birds (Fig. 3). It has previously been shown that such feedback perturbations during singing can lead to gradual changes to adult song, indicating that this manipulation is salient to the nervous system and capable of engaging mechanisms of vocal plasticity in Bengalese finches as well as zebra finches (Leonardo and Konishi, 1999; Leonardo, 2004; Kozhevnikov and Fee, 2007; Tumer and Brainard, 2007). In addition to driving vocal plasticity, such feedback perturbations can cause acute changes to the tempo of ongoing song (Sakata and Brainard, 2006). Evidence from multiple sources indicates that much of HVC activity is tightly locked to the structure of song and is likely premotor in nature (McCasland, 1987; Vu et al., 1994; Yu and Margoliash, 1996; Hahnloser et al., 2002). Hence, for experiments in which feedback is altered, the question arises of whether any observed change in neural activity reflects a change in sensory feedback, motor output, or both. Any online change to HVC activity in response to perturbation of feedback potentially informs song premotor circuitry about the quality of song and contributes to vocal plasticity. However, it is also of interest to determine the degree to which information about sensory feedback can be represented independent of information about changes to vocal output. Consequently, we consider separately changes to HVC activity for experiments in which feedback perturbation had no acute effect on song (sensory-only) versus experiments in which feedback perturbation caused localized changes to song (sensory-motor).
We first examined HVC activity in sensory-only experiments where feedback perturbation had no significant acute effect on vocal production. This was the case for 25 of 48 experiments (in seven birds) where the timing and structure of the fixed sequences produced by birds were quantitatively indistinguishable between interleaved control and feedback trials (see Materials and Methods). Because trials were randomly interleaved, we can attribute differences in neural activity between conditions to the effects of feedback perturbation rather than any changes in the population of recorded neurons or single unit isolation. This set of sensory-only experiments allowed us to assess whether there were any changes to HVC activity caused by feedback alteration that specifically reflected a sensitivity to what the bird heard, independent of acute changes to vocal output.
Three examples of sensory-only experiments are shown in Figure 4. In each case, perturbation of auditory feedback caused a significant localized decrease in ongoing HVC activity at a latency of 40–60 ms. To characterize the significance of feedback-induced changes to song for each experiment, we calculated the discriminability between control and feedback trials at each point in time using the d′ statistic, which takes into account the mean and variability of neural responses (see Materials and Methods). A d′ value of 0 indicates no difference in HVC activity between feedback and control conditions and progressively larger (absolute) d′ values indicate progressively greater discriminability between activity in the two conditions. For the example in Figure 4a, the maximum discriminability occurred at 59 ms after the onset of feedback perturbation. At this latency, a separation of the firing rate distributions for individual trials was evident (Fig. 4ai–iii). Correspondingly, the d′ at this time point achieved a value of 0.91. To assess significance, we used a Monte Carlo randomization procedure (see Materials and Methods) to estimate the probability of achieving this large a d′ value by chance (Fig. 4aiv). For this time point, the measured d′ exceeded the 99th percentile of d′ values achieved under the null hypothesis (vertical dashed line) and, hence, was deemed significant. For each time point, from 150 ms before to 200 ms after onset of feedback perturbation, we similarly compared the measured d′ with the 99th percentile of the d′ values computed for that time point under the randomization procedure. The solid bar above the PSTHs (Fig. 4a, top, asterisk) indicates a period 54–63 ms after feedback onset over which feedback perturbation caused a significant decrease in HVC activity for this experiment. The magnitude of this change in activity was near the median of that observed for sensory-only experiments. 
Figure 4b illustrates one of the largest effects observed for sensory-only experiments. Here, the neural activity between control and feedback trials was significantly different for the period 38–50 ms after feedback perturbation, and the maximum d′ was 1.62. Figure 4c illustrates a third example from a single putative HVC interneuron (same unit shown in Fig. 2c, see Fig. 6b). Here, the activity between control and feedback trials was different for the period 50–55 ms after feedback perturbation, and the maximum d′ was 1.47.
Across all sensory-only experiments, such short latency, localized changes in HVC activity in response to alteration of auditory feedback were common. Figure 5a depicts another sensory-only experiment. The top panel illustrates the average amplitude waveforms (mean ± SEM) of all songs produced on control trials (blue) and of all songs produced (plus the sound of the perturbative feedback) on feedback trials (red). The close alignment of these acoustic traces both preceding and after feedback perturbation (at t = 0) illustrates that the fine structure of song produced by the bird was indeed closely matched between the different feedback conditions. Consistent with the matched vocal output, there was generally a close correspondence in the pattern of HVC activity between feedback (red) and control (blue) trials (Fig. 5a, bottom). However, there were significant localized decreases in HVC activity between 37 and 68 ms after the onset of feedback perturbation (solid bars above PSTHs). For 15 of the 25 sensory-only experiments, significant changes to HVC activity similarly occurred within the first 80 ms of feedback perturbation. For these experiments the latencies to significant changes in activity ranged from 10 to 60 ms with a mean value of 44.2 ± 3.4 ms (mean ± SEM).
To quantify the magnitude of change in HVC activity, we measured, for each experiment, the mean activity on feedback and control trials during a window extending from 20 to 80 ms after onset of feedback perturbation (Fig. 5a, bottom, early window). For the example in Figure 5a, HVC activity during this early window was decreased by 21.3% during feedback trials relative to control trials. Across all 25 sensory-only experiments, feedback alteration caused, on average, a 13.7 ± 1.9% (mean ± SEM) decrease in HVC activity during the early window, and this change was highly significant (t test, p < 0.0001) (Fig. 5b, early). For the subset of experiments in which we recorded from well-isolated units (putative interneurons; see Materials and Methods) (supplemental Fig. 1, available at www.jneurosci.org as supplemental material), we observed a similar magnitude of change [18.5 ± 1.7% (mean ± SEM) decrease in HVC activity during the early window (supplemental Fig. 2, available at www.jneurosci.org as supplemental material)]. The change in HVC activity during sensory-only experiments was transient. For a 60 ms window lasting from 80 to 140 ms after onset of feedback perturbation (Fig. 5a, bottom, “late window”), there was no significant difference in HVC activity between feedback and control trials (Fig. 5b, late). Hence, in sensory-only experiments, perturbation of feedback caused significant, transient, short latency decreases in HVC activity independent of measurable changes to vocal output.
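The window-based measure described above amounts to simple arithmetic on trial-averaged activity. The sketch below shows one way to compute the percentage change in the early (20–80 ms) and late (80–140 ms) windows. It assumes 1 ms bins and half-open windows, neither of which is stated here; the function names and the synthetic traces are hypothetical.

```python
import numpy as np

def window_mean(psth, times_ms, start_ms, stop_ms):
    """Mean of a trial-averaged activity trace within [start_ms, stop_ms)."""
    mask = (times_ms >= start_ms) & (times_ms < stop_ms)
    return psth[mask].mean()

def percent_change(feedback_psth, control_psth, times_ms, start_ms, stop_ms):
    """Percent change of feedback relative to control; negative is a decrease."""
    fb = window_mean(feedback_psth, times_ms, start_ms, stop_ms)
    ctl = window_mean(control_psth, times_ms, start_ms, stop_ms)
    return 100.0 * (fb - ctl) / ctl

# Synthetic traces in 1 ms bins, time relative to perturbation onset.
times_ms = np.arange(-150, 200)
control_psth = np.full(times_ms.shape, 100.0)
feedback_psth = control_psth.copy()
feedback_psth[(times_ms >= 40) & (times_ms < 70)] = 60.0  # transient dip

early = percent_change(feedback_psth, control_psth, times_ms, 20, 80)
late = percent_change(feedback_psth, control_psth, times_ms, 80, 140)
```

In this toy case the dip falls entirely inside the early window, so the early value is negative and the late value is zero, mirroring the transient pattern reported for sensory-only experiments.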
In sensory-motor experiments, where feedback perturbation acutely altered vocal motor output, both short and longer latency changes to HVC activity were elicited by perturbation of feedback. Motor effects occurred in 23 of 48 experiments (in six birds) and, in each case, feedback perturbation led to a localized decrease in song tempo. One example is shown in Figure 5d. In this case, superposition of an extra syllable caused a localized slowing of song, such that each subsequent syllable on feedback trials was delayed relative to the timing of the same syllables on control trials. This is apparent in the top panel of Figure 5d as a persistent rightward shift in the acoustic trace after feedback perturbation for altered feedback trials (red) versus control trials (blue). Here, there was a short latency decrease in HVC activity on feedback trials versus control trials that began 58 ms after feedback perturbation (Fig. 5d, bottom). For 13 of 23 sensory-motor experiments, such significant decreases in HVC activity occurred within the first 80 ms after the onset of feedback perturbation. For these experiments, the latencies to significant changes in activity ranged from 22 to 62 ms with a mean value of 44.1 ± 3.8 ms (mean ± SEM). The magnitude of changes in HVC activity during the early window was comparable with that observed in sensory-only experiments. Across all 23 sensory-motor experiments, HVC activity for the early window was decreased, on average, by 17.9 ± 3.4% (mean ± SEM) during feedback trials (Fig. 5e, early), and this change was highly significant (t test; p < 0.0001). Again, a similar magnitude of change was observed for the subset of well-isolated units [22.6 ± 11.6% (mean ± SEM) decrease in HVC activity during the early window (supplemental Fig. 2, available at www.jneurosci.org as supplemental material)].
Hence, both sensory-only and sensory-motor experiments demonstrated that auditory perturbations that can drive song plasticity are represented online in vocal premotor circuitry of singing Bengalese finches.
For sensory-motor experiments (in contrast to sensory-only experiments), the difference in HVC neural activity between feedback and control trials persisted. This was manifested as repeated and continuing crossings of the significance threshold for discriminability after feedback perturbation (Fig. 5d, bars above PSTHs). This persistent difference in HVC activity reflected a systematic rightward shift after feedback perturbation in the pattern of neural activity for feedback versus control trials. The parallel between the rightward shifts for acoustic and neural traces strongly suggests that the persistent differences in HVC activity observed here reflect a component of HVC activity that is tightly locked to altered vocal output. Similar patterns of change to HVC activity were observed across sensory-motor experiments. As a result, significant differences between feedback and control trials persisted into the late window (80–140 ms after feedback perturbation) for sensory-motor experiments (11.3 ± 3.3%; t test, p = 0.0020) (Fig. 5e, late).
Together, the data from the sensory-only and sensory-motor experiments suggest that short latency changes to HVC activity occurred in response to altered auditory feedback independent of subsequent acute changes to vocal output, whereas later, persistent changes to HVC activity reflected altered vocal output. We verified this using two approaches. First, we assessed differences in HVC activity after correcting for motor differences between control and feedback trials in sensory-motor experiments. We removed significant tempo differences across conditions by selecting trials from control and feedback trials such that sequence durations were matched. After this motor correction, differences in HVC activity between control and feedback trials during the early window remained significant, whereas differences in the late window that were present before the motor correction were removed (supplemental Fig. 3, available at www.jneurosci.org as supplemental material). Second, we used a correlation analysis across sensory-only and sensory-motor experiments (n = 48) to assess the relationship between the magnitude of acute changes to vocal output (percentage change to tempo) and the magnitude of changes to HVC activity (mean discriminability between feedback and control trials) in response to altered auditory feedback (supplemental Fig. 4, available at www.jneurosci.org as supplemental material). This analysis treats motor effects as continuous rather than categorical (sensory-only versus sensory-motor). For the early window, there was no relationship between the change in HVC activity elicited by feedback perturbation and the change in song tempo (r = 0.13; p = 0.3924). In contrast, for the late window, the correlation was significant (r = −0.41; p = 0.0036), indicating that larger changes in song tempo were associated with larger changes in HVC activity.
These results are consistent with an interpretation that, in response to feedback perturbation, short latency changes in HVC activity correspond to the sensory experience of the singing bird, whereas longer latency changes reflect whether or not that sensory experience is translated into an acute change in vocal production.
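The motor correction described above selects control and feedback trials with matched sequence durations, but the selection rule itself is not spelled out in this section. The sketch below shows one possible approach, offered purely as an assumption: a greedy pairing of duration-sorted trials that keeps a control and a feedback trial only when their durations differ by no more than a tolerance. The function name, the tolerance, and the example durations are hypothetical.

```python
import numpy as np

def match_durations(control_ms, feedback_ms, tolerance_ms=1.0):
    """Pair control and feedback trials with similar sequence durations.

    Returns two equal-length lists of trial indices (control, feedback).
    Trials without a partner within tolerance_ms are dropped.
    """
    control_ms = np.asarray(control_ms, dtype=float)
    feedback_ms = np.asarray(feedback_ms, dtype=float)
    control_order = np.argsort(control_ms)
    feedback_order = np.argsort(feedback_ms)
    i = j = 0
    keep_control, keep_feedback = [], []
    while i < len(control_order) and j < len(feedback_order):
        c, f = control_order[i], feedback_order[j]
        diff = feedback_ms[f] - control_ms[c]
        if abs(diff) <= tolerance_ms:
            keep_control.append(int(c))
            keep_feedback.append(int(f))
            i += 1
            j += 1
        elif diff > 0:
            i += 1  # control trial is shorter than every remaining feedback trial
        else:
            j += 1  # feedback trial is shorter than every remaining control trial
    return keep_control, keep_feedback

# Hypothetical sequence durations (ms); feedback trials are slowed on average.
control_durations = [500.0, 502.0, 505.0, 510.0]
feedback_durations = [504.5, 509.6, 515.0, 520.0]
ctl_idx, fb_idx = match_durations(control_durations, feedback_durations)
```

After such a selection, neural activity would be compared only across the retained trials, so that any remaining difference cannot be attributed to a difference in tempo.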
Short latency changes in neural activity after feedback perturbation, regardless of whether they are subsequently accompanied by alterations of vocal output, unambiguously indicate that HVC has online access to information about whether feedback was altered or normal. To quantify how reliably HVC firing rate in individual experiments discriminated between feedback and control conditions, we further analyzed d′ values for sensory-only and sensory-motor experiments. Figure 5, c and f, shows the distributions of peak (signed) d′ values from the window 20–80 ms after onset of feedback perturbation (early window) for sensory-only and sensory-motor experiments, respectively. For both sensory-only and sensory-motor experiments, the distribution of peak d′ values was significantly less than zero, indicating reduced HVC activity in the feedback versus the control condition (t test; p < 0.0001 for both). In songbirds, absolute d′ values >0.5–0.7 have been construed as indicating selectivity for one condition over another (Solis and Doupe, 2000; Mooney, 2000). For sensory-only and sensory-motor experiments, mean d′ values were −0.981 and −0.982, respectively (for the subset of single unit experiments, mean values were −0.88 and −0.74, respectively), and for 79% of individual experiments, absolute d′ values exceeded a criterion of 0.7. These data indicate that HVC activity selectively discriminated between feedback and control conditions.
To further quantify the degree to which HVC activity could discriminate between conditions, we asked how well an ideal observer could categorize individual trials as feedback or control using only the corresponding single-trial firing rate. For each experiment (n = 48), we considered the measured distribution of firing rates for feedback and control trials at the most informative time during the window 20–80 ms after feedback onset (e.g., the firing rate distributions plotted in Fig. 4aiii–ciii) (see Materials and Methods). Categorization of individual trials by an ideal observer was modeled using a simple fixed threshold; trials with firing rates below threshold were categorized as feedback and those with firing rates above threshold as control. The threshold that optimized performance (percentage of trials correctly categorized) was empirically determined for each site. On average, across all experiments, an ideal observer could correctly categorize the feedback condition on 69% of trials. Hence, even without pooling data across different HVC sites, substantial information about auditory feedback is encoded by HVC activity during singing.
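The fixed-threshold ideal observer described above can be sketched as follows: trials with firing rates below the threshold are labeled feedback, those above are labeled control, and the threshold is chosen to maximize the fraction of trials labeled correctly. The candidate thresholds (midpoints between adjacent observed rates), the function name, and the example firing rates are assumptions made for illustration.

```python
import numpy as np

def ideal_observer(feedback_rates, control_rates):
    """Return (threshold, accuracy) for the best single fixed threshold.

    Rates below threshold are classified as feedback, above as control.
    """
    feedback_rates = np.asarray(feedback_rates, dtype=float)
    control_rates = np.asarray(control_rates, dtype=float)
    values = np.unique(np.concatenate([feedback_rates, control_rates]))
    # Candidates: below all rates, between adjacent distinct rates, above all.
    candidates = np.concatenate(
        [[values[0] - 1.0], (values[:-1] + values[1:]) / 2.0, [values[-1] + 1.0]]
    )
    n_total = feedback_rates.size + control_rates.size
    best_threshold, best_accuracy = candidates[0], 0.0
    for threshold in candidates:
        correct = np.sum(feedback_rates < threshold) + np.sum(control_rates > threshold)
        accuracy = correct / n_total
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy

# Hypothetical single-trial firing rates (spikes/s) at one time point.
feedback = [60.0, 70.0, 80.0, 95.0]
control = [75.0, 90.0, 100.0, 110.0]
threshold, accuracy = ideal_observer(feedback, control)
```

Because the threshold is fit to the same trials it classifies, the resulting accuracy is an in-sample figure; a held-out or cross-validated estimate would be a different analysis from the one described here.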
HVC neurons respond robustly and selectively to the sound of the bird's own song in awake, quiescent birds
The experiments and analysis presented above indicate that, in singing birds, perturbation of the sound of the bird's song results in a decrease in HVC activity. One possibility is that a component of this decrease reflects auditory selectivity of HVC neurons distinct from premotor activity. To further test this idea, we assessed the responsiveness and selectivity of HVC neurons in a purely auditory setting, in which behaviorally relevant stimuli were played back to awake, quiescent (nonsinging) birds. We first measured responses of HVC neurons in passively listening birds to playbacks of BOS, the sound heard by birds under normal conditions of singing. HVC neurons were robustly activated by playbacks of BOS (Fig. 6a,b, left panels). Furthermore, HVC neurons exhibited strong selectivity for BOS relative to other complex auditory stimuli, including rBOS (Fig. 6a,b, right panels) and songs of other Bengalese finches (CON). We characterized this selectivity using the discriminability index d′. Neurons that respond more strongly to BOS than to other stimuli have d′ values greater than zero. HVC neurons consistently responded more strongly to BOS than to rBOS or CON, and correspondingly had positive d′ values (Fig. 6c). The mean d′ for BOS versus rBOS (d′BOS-rBOS) was 3.01 (55 experiments in 11 birds) and for BOS versus CON (d′BOS-CON) was 3.18 (13 experiments in six birds). These average d′ values are much greater than those reported for HVC neurons in awake, quiescent zebra finches (d′BOS-rBOS = 0.22–1.7) (Rauske et al., 2003; Cardin and Schmidt, 2004). Collectively, these data indicate that HVC neurons in the awake Bengalese finch are endowed with appropriate selectivity to participate in the processing of auditory feedback.
We next presented to awake, quiescent birds acoustic stimuli that qualitatively simulated those experienced while birds sang under conditions of experimentally perturbed feedback (Fig. 7a). We played back prerecorded versions of BOS to passively listening birds and altered the acoustic stimuli at targeted times during playback using the same automated system that was used to perturb feedback during singing. As reported above, HVC neurons in awake, quiescent birds responded with vigorous increases in activity to presentations of BOS. Similarly, playback of single syllables in isolation resulted in short latency, transient increases in HVC activity (supplemental Fig. 5, available at www.jneurosci.org as supplemental material). However, superimposing single syllables on playback of BOS consistently caused a transient decrease in ongoing activity (Fig. 7a,b) (16 experiments in five birds). Such decreases in activity occurred within 80 ms after the onset of auditory perturbations in 13 of 16 playback experiments. The latency to these decreases in activity in response to superposition of extra syllables was 47.9 ± 6.6 ms (mean ± SEM). This was appreciably greater than the latency to increases in activity when single syllables were played back in isolation (typically <15 ms) (supplemental Fig. 5, available at www.jneurosci.org as supplemental material), but was comparable with the latency of feedback-elicited decreases in activity in singing birds (44.2 ms for sensory-only experiments and 44.1 ms for sensory-motor experiments). Hence, both the latency and direction of change in HVC activity were similar between singing and quiescent birds in response to superposition of extraneous syllables on the sound of the bird's own song.
In contrast, the magnitude of short latency changes to HVC activity caused by superposition of syllables tended to be greater when birds were quiescent than when they were singing. Superposition of extra syllables resulted in an average decrease in HVC activity of 45.5% in quiescent birds versus 13.7 and 17.9% in singing birds for sensory-only and sensory-motor experiments, respectively. These differences could reflect state-dependent changes in auditory sensitivity, as observed in humans, nonhuman primates, and songbirds (Müller-Preuss and Ploog, 1981; Paus et al., 1996; Schmidt and Konishi, 1998; Numminen and Curio, 1999; Curio et al., 2000; Houde et al., 2002; Cardin and Schmidt, 2003, 2004a; Eliades and Wang, 2003; Rauske et al., 2003). Such differences between the effects of feedback perturbation in quiescent versus singing birds also could arise because the auditory stimuli are not equated between these conditions (attributable to differences in the amplitude, spectrum, and binaurality of self-generated versus broadcast versions of the bird's own song) or because of differential contributions of ongoing premotor activity in the singing condition. Despite these differences, in both singing and passively listening birds, HVC activity was maximal when the acoustic stimulus experienced by the bird resembled the sound of the bird's own song and decreased in response to perturbations that disrupted that sound. The qualitative similarity of short latency decreases in HVC activity in singing birds to those observed in quiescent birds supports the idea that a component of the changes observed during singing reflects sensitivity to auditory feedback.
Discussion
Although the importance of auditory feedback to song learning and production has long been recognized, the presence and nature of auditory feedback signals in vocal premotor circuitry have remained elusive. Here, we report that neural activity in the vocal premotor nucleus HVC of the Bengalese finch is affected by perturbations of auditory feedback during singing. Specifically, HVC activity consistently decreased at a short latency (20–80 ms) after the perturbation of normal feedback during ongoing song (Figs. 4, 5). The type of feedback perturbation that we used elicits both online changes to vocal production and gradual modifications to song (Howell and Archer, 1984; Howell and Powell, 1987; Leonardo and Konishi, 1999; Leonardo, 2004; Zevin et al., 2004; Sakata and Brainard, 2006; Kozhevnikov and Fee, 2007; Tumer and Brainard, 2007). However, previous studies have not found that such perturbations are registered by vocal premotor nuclei or other song system structures, and have suggested that vocal plasticity in response to these perturbations occurs offline, outside the context of singing (Leonardo, 2004; Kozhevnikov and Fee, 2007; Prather et al., 2008). In contrast, our finding of neural signals in a vocal premotor structure in response to these behaviorally effective perturbations of feedback indicates the potential online contributions of feedback signals to vocal control and learning.
Our results indicate that HVC activity is transiently decreased at a short latency after perturbation of feedback. Because such perturbations occasionally elicit acute changes to the structure of ongoing song, we cannot unambiguously assign the observed neural signals as sensory versus motor. We can, nevertheless, examine the degree to which the observed changes in HVC activity reflect the perturbation of feedback versus the presence or absence of immediate motor consequences. We found that short latency signals in HVC (20–80 ms) correlated strongly with whether or not feedback was altered, but not with the presence or magnitude of acute changes to song (Fig. 5; supplemental Figs. 3, 4, available at www.jneurosci.org as supplemental material). Moreover, at a qualitative level, similar decreases in HVC activity were observed when neurons were presented with acoustic stimuli that simulated those experienced by singing birds (Fig. 7). In contrast, longer latency changes to HVC activity (>80 ms) in singing birds significantly correlated with the magnitude of acute motor effects (supplemental Figs. 3, 4, available at www.jneurosci.org as supplemental material). These data are consistent with the interpretation that short latency feedback-driven changes in HVC activity reflect the auditory experience of the bird independent of motor consequences. However, we cannot rule out that these short latency signals reflect acute changes to song structure that are subthreshold for detection or otherwise covert. Indeed, it is unclear for any system whether a meaningful distinction can be drawn between “sensory signals” and “subthreshold motor responses” to those sensory signals. Regardless of their origin, the short latency changes in HVC activity after feedback perturbation can inform the song system about whether feedback is normal or aberrant and, therefore, about the quality of ongoing song.
Our finding of online feedback signals in premotor structures of singing birds contrasts with results from previous investigations (McCasland and Konishi, 1981; Leonardo, 2004; Kozhevnikov and Fee, 2007; Prather et al., 2008). One difference between our experiments and previous ones is the reproducibility with which feedback was perturbed across trials. In contrast to previous studies, in which the timing of feedback perturbation typically varied from one song to the next, we used a computerized system to detect specific features of targeted syllables and to reliably disrupt feedback at a precisely controlled time relative to ongoing song. Behavioral studies in both humans and birds indicate that the influence of feedback disruption depends on its timing relative to ongoing vocalizations (Howell and Archer, 1984; Howell and Powell, 1987; Sakata and Brainard, 2006). The reproducibility of our feedback perturbation likely enhanced the statistical power to detect feedback signals.
Our choice of species may also have been important. In contrast to previous studies that used zebra finches, we focused on Bengalese finches because the maintenance and production of their song is more dependent on auditory feedback (Nordeen and Nordeen, 1992; Okanoya and Yamaguchi, 1997; Woolley and Rubel, 1997; Lombardino and Nottebohm, 2000; Brainard and Doupe, 2001; Sakata and Brainard, 2006); therefore, we anticipated that auditory signals might be more salient in Bengalese finches than zebra finches. The mechanism underlying this species difference is unknown but differences in song structure could be relevant (e.g., more repeated and asymmetric syllables, more complex and variable syllable sequencing in Bengalese finch song). Regardless of the mechanism, we observed auditory responses in HVC of adult Bengalese finches that were consistent with a greater dependence on hearing in this species: in awake, quiescent Bengalese finches, HVC neurons were robustly and selectively activated by the sound of BOS (Fig. 6), whereas in awake, quiescent zebra finches, HVC neurons are generally neither robustly nor selectively activated by BOS (Schmidt and Konishi, 1998; Cardin and Schmidt, 2003, 2004; Rauske et al., 2003). The d′ values we observed (d′BOS-rBOS = 3.01) greatly exceed d′ values previously reported for HVC neurons in awake zebra finches (e.g., d′BOS-rBOS = 0.22–1.7) (Rauske et al., 2003; Cardin and Schmidt, 2004), but are comparable with the values reported for anesthetized or sleeping zebra finches (e.g., d′BOS-rBOS = 2.89–3.52) (Mooney, 2000; Rauske et al., 2003; Cardin and Schmidt, 2003, 2004a). These data indicate that the processing of auditory feedback may indeed be more salient in Bengalese finches than zebra finches.
There are several mechanisms by which feedback alteration could influence HVC activity. HVC receives excitatory input from the nucleus interfacialis of the nidopallium (NIf) (Fortune and Margoliash, 1995; Cardin and Schmidt, 2004a,b; Coleman and Mooney, 2004; Cardin et al., 2005) and nucleus uvaeformis (Uva) via the lateral lemniscal pathway (Foster and Bottjer, 1998; Coleman et al., 2007). Consequently, the changes observed in HVC activity might reflect feedback-dependent changes in NIf, Uva, or other sources of auditory input to HVC. Additionally, feedback-dependent changes in the activity of neuromodulatory systems could underlie changes in HVC activity (Li and Sakaguchi, 1997; Appeltants et al., 2000; Shea and Margoliash, 2003; Cardin and Schmidt, 2004b). For example, midbrain nuclei that send catecholaminergic projections to HVC, such as the ventral tegmental area and central gray, could modulate neural activity in premotor circuitry. Homologous neuromodulatory regions in the mammalian midbrain have been found to respond to deviations in expectancy (Schultz and Dickinson, 2000). Hence, these populations in songbirds could plausibly encode deviations of auditory feedback from its expected form and broadcast this information to HVC.
The feedback signals present within HVC could potentially contribute to hearing-dependent song plasticity as well as song control. The feedback perturbations used here are effective in driving modifications of song in both juvenile and adult songbirds (Leonardo and Konishi, 1999; Kozhevnikov and Fee, 2007; Tumer and Brainard, 2007). Hence, the signals we observed in HVC in response to these perturbations might contribute causally to behavioral change. For example, the decreased activity of putative interneurons in response to disruptions of the sound of the bird's own song could signal deviation from an accurate or expected rendition of song. Such a signal, in principle, could operate within HVC itself to alter synaptic connectivity by weakening synapses that are differentially active on trials in which feedback is disrupted. This kind of process, which has been hypothesized to contribute to reinforcement learning of song, would tend to differentially weaken premotor patterns that give rise to “worse” versus “better” versions of song (Sutton and Barto, 1998; Fiete et al., 2007). Auditory signals that reach HVC during singing also have the potential to contribute to online song control in response to self-generated feedback and to acoustic signals from other birds. Such responsiveness to external sounds could help coordinate production and avoid acoustic interference (e.g., during counter-singing and dueting).
In addition to the possibility that feedback signals operate directly within HVC to shape song, such signals might also be broadcast to the anterior forebrain pathway (AFP), a circuit critical for vocal learning (Fig. 1). HVC interneurons form inhibitory synapses on HVC neurons that project to the AFP (HVC-X neurons) (Mooney and Prather, 2005). Hence, a simple model might predict that decreased activity of interneurons after feedback perturbation would result in increased activity of HVC-X neurons. However, in vivo recordings in HVC do not suggest such a simple relationship between firing patterns in these two populations (Rosen and Mooney, 2006). This likely reflects the presence of other sources of input to HVC-X neurons (including direct auditory or neuromodulatory inputs) so that the net effect of feedback perturbation on this population is difficult to predict. Previous studies that did not detect feedback signals within the song system focused specifically on HVC-X neurons (Kozhevnikov and Fee, 2007; Prather et al., 2008) or neurons within the AFP itself (Leonardo, 2004). This raises the possibility that the feedback signals that we have characterized are only present within HVC interneurons. However, differences in sensitivity to detect such signals caused by experimental design and species differences outlined above could also account for this discrepancy. It will therefore be important for future experiments to take these differences into consideration to characterize the presence and behavioral relevance of feedback signals within the song system.
In summary, our data provide the first neurophysiological demonstration in songbirds that information derived from auditory feedback is rapidly available to vocal premotor structures during singing. Auditory feedback plays a crucial role during normal song learning and in the maintenance of adult song (Brainard and Doupe, 2000). Hence, the vocal motor pathways that produce song ultimately must be shaped by information derived from auditory feedback. Our results indicate the availability of online feedback signals that can contribute to the production and plasticity of learned vocalizations, although offline contributions to song learning and maintenance cannot be ruled out. The shared dependence of birdsong and speech on auditory feedback suggests that similar rapid, online feedback signals could inform vocal motor areas during speech production.
Footnotes
- This work was supported by National Institutes of Health (NIH) Grants T32 NS0707-25 and F32-MH068055-01 (J.T.S.), the McKnight Foundation, a Searle Scholars Award, and the NIH (M.S.B.). We thank S. C. Woolley, M. H. Kao, K. I. Nagel, and A. J. Doupe for critical readings of this manuscript; J. Wong for assistance with histology; and C. Roddey and J. Houde for programming assistance.
- Correspondence should be addressed to Dr. Jon T. Sakata, Keck Center for Integrative Neuroscience, Department of Physiology, Box 0444, University of California, San Francisco, San Francisco, CA 94143-0444. jsakata{at}phy.ucsf.edu