Abstract
Vocal learning in songbirds is mediated by a highly localized system of interconnected forebrain regions, including recurrent loops that traverse the cortex, basal ganglia, and thalamus. This brain-behavior system provides a powerful model for elucidating mechanisms of vocal learning, with implications for learning speech in human infants, as well as for advancing our understanding of skill learning in general. A long history of experiments in this area has tested neural responses to playback of different song stimuli in anesthetized birds at different stages of vocal development. These studies have demonstrated selectivity for different song types that provide neural signatures of learning. In contrast to the ease of obtaining responses to song playback in anesthetized birds, song-evoked responses in awake birds are greatly reduced or absent, indicating that behavioral state is an important determinant of neural responsivity. Song-evoked responses can be elicited during sleep as well as anesthesia, and the selectivity of responses to song playback in adult birds is highly similar between anesthetized and sleeping states, encouraging the idea that anesthesia and sleep are similar. In contrast to that idea, we report evidence that cortical responses to song playback in juvenile zebra finches (Taeniopygia guttata) differ greatly between sleep and urethane anesthesia. This finding indicates that behavioral states differ in sleep versus anesthesia and raises questions about relationships between developmental changes in sleep activity, selectivity for different song types, and the neural substrate for vocal learning.
Significance Statement
Electrophysiological recordings of spiking activity in different taxa are heavily dependent on behavioral state. Neural activity patterns are frequently similar between sleeping and anesthetized animals, which has encouraged the idea that similar states characterize sleep and anesthesia. Based on comparisons across studies from our lab, we report that activity patterns are highly dissimilar between sleep and urethane anesthesia in a cortical region of juvenile songbirds. These data argue against the idea that similar behavioral states are achieved in sleep versus anesthesia.
Introduction
Vocal learning in zebra finches serves as a powerful model for investigating mechanisms of motor skill learning during development (Doupe and Kuhl, 1999; Brainard and Doupe, 2013). Juvenile zebra finches learn the sounds used for vocal communication, and this type of skill learning, like other forms of goal-directed learning, is controlled by cortico-basal ganglia circuits (Yin and Knowlton, 2006; Graybiel, 2008; Redgrave et al., 2010; Turner and Desmurget, 2010; Cox and Witten, 2019). Similar to infants learning speech, juvenile songbirds memorize the vocal sounds of their adult tutor. They then progressively refine their own vocal behavior to imitate the tutor song (the goal behavior) during the sensorimotor stage of vocal learning. This process requires the evaluation of feedback of self-generated vocalizations against a neural representation of the goal tutor song to guide the gradual acquisition of an accurate imitation.
Neural control of vocal learning in juvenile zebra finches is vested in basal ganglia loops that emanate from the cortical nucleus LMAN (Fig. 1; Bottjer et al., 1984; Scharff and Nottebohm, 1991; Aronov et al., 2008). CORE and SHELL subregions of LMAN make parallel connections through the basal ganglia and thalamus (Johnson et al., 1995; Iyengar et al., 1999; Luo et al., 2001; Iyengar and Bottjer, 2002a; Bottjer, 2004; Gale et al., 2008; Person et al., 2008; Paterson and Bottjer, 2017). The CORE pathway mediates vocal motor production in juvenile songbirds (Bottjer et al., 1984; Scharff and Nottebohm, 1991; Aronov et al., 2008; Elliott et al., 2014; Kojima et al., 2018) and is functionally similar to sensorimotor cortico-basal ganglia loops in mammals that contribute to learning and performance (Alexander and Crutcher, 1990; Graybiel, 2008; Yin et al., 2009; Ashby et al., 2010; Redgrave et al., 2010; Thorn et al., 2010; Gremel and Costa, 2013; Kupferschmidt et al., 2017). In contrast, the SHELL pathway is involved in evaluating sensorimotor performance and is functionally similar to associative-limbic loops that traverse the basal ganglia; lesions in the SHELL pathway of juvenile birds impair the ability to imitate tutor song, but do not cause motor disruption of song production (Bottjer and Altenau, 2010). This disruption of learning but not motor performance suggests that SHELL circuitry helps to evaluate whether self-generated vocalizations match learned tutor sounds.
Studies of the mechanisms that underlie vocal learning in songbirds have a long and venerable history of examining neural responses to playback of different song types in anesthetized juvenile and adult birds (Margoliash, 1983, 1986; Volman, 1993; Lewicki and Konishi, 1995; Solis and Doupe, 1997, 1999, 2000; Adret et al., 2012). One quest in this area was to discover a population of neurons that encode the tutor song memorized by each juvenile bird. Achiro and Bottjer (2013) reported that the SHELL subregion of LMAN in juvenile anesthetized birds contains a large proportion of neurons (∼30%) that respond significantly only to playback of tutor song. This tutor-tuned population provides a target memory that is essential for matching self-generated utterances to the goal tutor song, and is present only during early stages of sensorimotor integration. The proportion of tutor-tuned neurons diminishes during development as the incidence of neurons that responds selectively to each bird’s own song increases, suggesting that tutor-tuned neurons are lost (Johnson and Bottjer, 1992, 1993, 1994) or re-tuned to provide a template of self-generated song (Volman, 1993; Zevin et al., 2004; Nick and Konishi, 2005b; Kojima and Doupe, 2007; Achiro and Bottjer, 2013). In accord with the latter idea, the emergence of selectivity for each bird’s own song is a ubiquitous signature of vocal learning across forebrain regions including cortex (HVC, LMAN, RA), basal ganglia, and thalamus (Margoliash, 1983, 1986; Margoliash and Konishi, 1985; Margoliash and Fortune, 1992; Volman, 1993; Doupe, 1997; Solis and Doupe, 1997; Person and Perkel, 2007).
Thus, selective neural responses to playback of songs in anesthetized birds has highlighted the power of such experiments for studying mechanisms of vocal learning. However, several studies have shown that behavioral state is an important determinant of neural responsivity to song playback. Song-evoked responses can be elicited in sleeping as well as anesthetized zebra finches, and responses to song playback in adult birds are highly similar between anesthetized and sleeping states (Dave et al., 1998; Dave and Margoliash, 2000; Nick and Konishi, 2001), encouraging the idea that anesthesia and sleep states are highly similar. In contrast, song-evoked responses are greatly diminished or absent in awake zebra finches (Schmidt and Konishi, 1998; Cardin and Schmidt, 2003; Rauske et al., 2003; Cardin and Schmidt, 2004a, b), which is reminiscent of the suppression of auditory responses to self-generated sounds in both vertebrate and invertebrate taxa (Suga and Shimozawa, 1974; Poulet and Hedwig, 2006, 2007; Eliades and Wang, 2008; Singla et al., 2017; see Discussion). Here, we report that responses to playback of different song types in both CORE and SHELL subregions of LMAN in sleeping juvenile birds are substantially different from those that we reported previously in urethane-anesthetized birds of the same age under identical experimental conditions (Achiro and Bottjer, 2013). This difference stands in marked contrast to reports of similar song-evoked responsivity in sleeping and anesthetized adult songbirds (Dave et al., 1998; Dave and Margoliash, 2000; Nick and Konishi, 2001), and is consistent with recent data showing that urethane anesthesia does not mimic sleep states (Mondino et al., 2021).
Materials and Methods
Subjects
All procedures were performed in accordance with the University of Southern California’s animal care committee’s regulations. Five juvenile male zebra finches (Taeniopygia guttata) were bred in our group aviaries and remained there with their natural parents until 33 d posthatch (dph). At that time, they and their father were removed from the main aviary and housed in an individual cage in the recording chamber to habituate them to the space. Experimental birds therefore received normal social-auditory experience and exposure to the tutor song (their father’s song; Böhner, 1983, 1990; Mann et al., 1991; Mann and Slater, 1995; Roper and Zann, 2006).
Electrophysiology
At 39 dph birds were anesthetized with isoflurane (1.5–1.8% inhalation) and placed in a stereotaxic apparatus. An electrode assembly consisting of eight tungsten-wire stereotrodes affixed to a movable microdrive was attached to the skull using dental cement such that the stereotrodes were implanted ∼300 mm dorsal to LMAN CORE and SHELL. Each stereotrode was a twisted pair of polyester polyamide-imide overcoated tungsten wires (25 μm in diameter, California Fine Wire Company) routed through fused silica capillary tubing (200 μm in diameter). The assembly consisted of four posterior stereotrodes and four anterior stereotrodes; a silver wire, placed between the skull and skin, served as animal ground. Following surgery each bird was housed in a small individual cage in the recording chamber adjacent to the cage with the father; the father was removed 4–6 d later.
One to 2 d following surgery, the stereotrode assembly was connected to a recording headstage (HS-16, Neuralynx) with a flexible cable connected to a commutator (PSR, Neuralynx); 15 channels of neural data were amplified, band passed between 300 and 5000 Hz (two Lynx-8 amplifiers, Neuralynx), and digitized at 32 kHz using Spike2 software (Power 1401 data acquisition interface, Cambridge Electronic Design). Audio and video were recorded coincident with neural activity: vocalizations were recorded to the 16th channel using a lavalier microphone (Sanken COS-11D) mounted in the cage, and two USB-video cameras (30 FPS, ELP Day Night Vision, X000UPN1M5, HD 1080p) were placed on opposite sides of the cage to record video files aligned to the neural activity. Two consecutive 60-min recordings were made between ∼8 and 10 P.M. starting about 1 h after lights off. Stereotrodes were manually advanced with the microdrive on consecutive days in the afternoon. The range of ages when recordings were made from LMAN CORE and/or SHELL ranged from 43 to 53 with a mean of 48.5 dph.
All birds received playback of four different songs: the bird’s own song (OWN, recorded within 24 h before each recording), the bird’s tutor song (TUT), a juvenile conspecific song (JuvCon), and an adult conspecific song (AdlCon). The latter two songs served as control stimuli for OWN song and TUT song, respectively. JuvCon songs were age-matched to the age of the experimental bird’s OWN songs. The order of stimuli within a block of four songs was random without replacement, and the interstimulus interval was 30 ± 1 s. Each song type was played back ∼50 times at an amplitude of 56–59 dB, but only playbacks that occurred during sleeping periods were used for analysis (see below).
At the end of each experiment, birds were perfused (0.7% saline followed by 10% formalin), and brains were removed and postfixed before being cryo-protected (30% sucrose solution) and frozen-sectioned in the coronal plane (50 μm thick). Sections were Nissl stained with thionin to visualize stereotrode tracks and verify recording locations. The border between CORE and SHELL subregions of LMAN was distinguished based on the density of magnocellular somata, which is low in SHELL relative to CORE.
Data analysis
A recording site was considered for analysis if it was confirmed histologically to be in either LMAN-CORE or LMAN-SHELL (excluding 50 μm on either side of the CORE/SHELL border). The evoked responses of LMAN neurons tend not to persist throughout song stimuli longer than 1 s, as reported previously (Doupe, 1997; Solis and Doupe, 1997, 1999; Kojima and Doupe, 2007; Achiro and Bottjer, 2013). Therefore, response strengths calculated for song stimuli longer than 1 s underestimate the actual response by averaging across both the early phasic response and the period of decreased response. To correct for this stimulus duration bias (e.g., longer songs underestimate true response strengths), all analyses were performed using neural data collected during the first second of song playback.
Periods of sleep were scored manually by two independent observers; as a conservative estimate, only periods ranked as sleep by both observers were used for analysis. Careful examination of the video files was used to mark sleeping periods as those in which birds were completely quiescent, displaying a regular pattern of deep rhythmic breathing with their eyes closed for at least 10 s. Sleeping periods were terminated at least 2 s before onset of large movements (e.g., wing movements) or eye-opening. As indicated above, nonsleeping periods were eliminated such that only song playbacks that occurred during sleeping periods were included for analysis; the number of playbacks ranged from 16 to 48 (average 33 playbacks per song type in SHELL and 28 in CORE).
Movement artifact in multiunit neural recordings was correlated across recording channels and was eliminated or reduced using offline common average referencing: for each recording channel, the signal across the 14 remaining recording channels was averaged and subtracted from that channel to remove movement artifact (Ludwig et al., 2009). Noise was calculated as the standard deviation of the entire (2-h) voltage recording, and minimum signal-to-noise ratio was set as three times the standard deviation; this threshold was used for spike detection. Single units were sorted from multiunit data by first automatically clustering units with KlustaKwik (KD Harris, University College London). KlustaKwik clusters were manually inspected across 18 different waveform features and further refined using MClust 4.4 (A. D. Redish, University of Minnesota). Clusters were included for analysis if < 1% of spikes had an interspike interval (ISI) < 2 ms.
We determined whether each single unit was responsive to song playback by testing for a significant change in firing rate (excitation or suppression) between baseline and each song type (Wilcoxon signed-rank test, p < 0.05). Baseline periods were defined as 1-s periods immediately before stimulus playback, with the restriction that they must fall within sleep periods. For each song playback, the firing rates during the two closest baseline periods were averaged to generate a corresponding baseline value. To compare differences in firing rates across neurons, standardized response strengths (RS) were calculated as:
Where S is the firing rate (spikes/s) during stimulus, and B is the firing rate during baseline, such that a positive value indicates an increased rate to a stimulus (excitation) and a negative value indicates a decreased rate (suppression). We refer to the standardized response strength as “response strength” (RS) throughout the text.
We report RS values in three different ways. (1) In order to assess response strengths across both excited and suppressed responses, we report absolute values of RS. (2) We report response strengths for excited and suppressed responses separately. In this measure, we calculated firing rates for all excited responses (all RS values greater than zero) and all suppressed responses (all RS values less than zero), and omitted RS values of zero (indicating the same firing rate during both stimulus and baseline) since they are neither excited nor suppressed. (3) We report significant excited and suppressed responses, i.e., including only RS values in which the firing rate during a stimulus was significantly different from baseline.
To measure song selectivity for each song type for each cell, a difference score between response strength values was calculated for “SongA” as follows: Song AΔRS = RSSongA – RSSongB. For example, positive scores obtained by subtracting response strengths to OWN, AdlCon, and TUT (“SongB” comparison songs) from JuvCon (“SongA” reference song) would indicate selectivity for JuvCon. This measure is similar to the psychometric discriminability index d’ except that responses are standardized before being subtracted, as opposed to subtracting response strengths and then dividing by the standard deviations as in d’ (see Achiro and Bottjer, 2013). This approach corrects for potential limitations of d’ scores, which are sensitive to response variability as well as response strength (Coleman and Mooney, 2004). Difference scores for song-suppressed responses were reversed in sign so that in all cases (both excited and suppressed responses) a positive difference score indicates a preference for the reference song over the comparison song while a negative difference score indicates a preference for the comparison song.
The conditions of both collecting and analyzing the data reported here are identical to the procedures used by Achiro and Bottjer (2013), including age and breeding population of birds, equipment and experimental setup, and scripts for analysis.
Statistics
We used nonparametric statistics because of non-normal distributions of data, differing numbers of significant responses between song types, and differing numbers of neurons between CORE and SHELL regions. Differences in proportions were tested using χ2 or Fisher’s exact tests, and differences in distributions were tested with Kolmogorov–Smirnov Z tests. Friedman tests were used to evaluate differences in RS between song types (as a repeated measure) within CORE and SHELL regions, whereas Kruskal–Wallis tests were used to evaluate differences between song types for significant excitatory and suppressed responses within each region (because of differing number of responses). Wilcoxon signed-rank tests were used to assess individual differences between song types for all responses, and for comparing firing rates and burst fractions during sleep versus nonsleep periods; Benjamini–Hochberg corrections were used for multiple comparisons (Benjamini and Hochberg, 1995). Mann–Whitney tests were used to assess individual differences between song types for excited versus suppressed responses, which were also corrected for multiple comparisons using Benjamini–Hochberg. All values are given as mean ± SEM unless specified otherwise.
Results
Different song types elicited different proportions of responses in both CORE and SHELL regions of LMAN
We recorded from CORE and SHELL subregions of LMAN in sleeping juvenile zebra finches (43–53 dph, mean = 48.5 dph). By this age juveniles have completed memorization of their tutor’s song and begun to practice their incipient song vocalizations. All neurons (n = 66, CORE; n = 104, SHELL) were tested with four different song types: each bird’s own song (OWN), each bird’s tutor song (TUT), an age-matched song from a juvenile conspecific (JuvCon) and an adult conspecific song (AdlCon). Approximately half of the neurons in both CORE and SHELL showed a significant change in firing rate to at least one of the song types presented (CORE: 0.53, 35/66; SHELL: 0.47, 49/104); thus, both regions showed similar levels of responsivity to song playback (χ2 = 0.57, p = 0.45). Proportions of significant playback responses varied by song type within both CORE and SHELL (CORE: χ2 = 13.8, p = 0.003; SHELL: χ2 = 12.6, p = 0.006; Fig. 2, top panel; Table 1). JuvCon song elicited the highest proportion of responses whereas TUT evoked the lowest. Individual comparisons showed that the incidence of evoked responses to JuvCon was higher than that to TUT (CORE: p = 0.003; SHELL: p = 0.01, Fisher’s exact test, Benjamini–Hochberg corrected). JuvCon song also elicited a higher proportion of responses in LMAN-CORE neurons compared with AdlCon song (p = 0.04); no other comparisons between JuvCon and other song types were significant across all responses (p > 0.09 or higher).
Although we found evidence of both excitation and suppression, the majority of cells within both CORE and SHELL showed only suppressed responses. Approximately 75% of cells in CORE and SHELL were suppressed by song playback, whereas relatively few cells responded with only excitation or a combination of excitation and suppression to different song types (Table 2). The dominance of suppressed responses was clear for all four song types, but was particularly pronounced for the two song types that elicited the highest percentage of responses, JuvCon and OWN (Table 1). We therefore examined the proportions of suppressed responses elicited by different song types (Fig. 2, bottom panel). In contrast to comparison across all playback responses (Fig. 2, top panel), the bottom panel of Figure 2 shows that only SHELL neurons showed differential suppression between song types (SHELL: χ2 = 16.0, p = 0.001; CORE: χ2 = 4.99, p = 0.173). Within SHELL neurons, JuvCon evoked a higher incidence of suppressed responses compared with both TUT and AdlCon, but not OWN (TUT: p = 0.006; AdlCon: p = 0.034, Fisher’s exact test, Benjamini–Hochberg corrected). OWN song also evoked a higher proportion of suppressed responses compared with TUT (TUT: p = 0.011; OWN vs AdlCon was marginally significant, p = 0.057; Fisher’s exact test, Benjamini–Hochberg corrected). Thus, the proportion of suppressed responses varied by song type in SHELL, but not CORE; within SHELL neurons both JuvCon and OWN elicited a high incidence of suppressed responses relative to AdlCon and (especially) TUT.
Individual neurons were not broadly tuned: almost all neurons responded to either one or two of the four song types played; CORE neurons responded to 1.26 ± 0.07 different songs on average, whereas SHELL neurons responded to 1.29 ± 0.08. Figure 3, left, shows that ∼75% of neurons in both CORE and SHELL subregions responded to only one song type (green shading); the majority of the remaining cells responded to only two song types (yellow shading); no CORE neurons and only 4% of SHELL neurons responded to three songs (blue shading). Figure 3, right side, depicts the song types to which each individual neuron responded (dark shading, suppressed responses; light shading, excited responses), confirming that a low proportion of neurons in both CORE and SHELL responded to playback of TUT song, while relatively high proportions responded to both JuvCon and OWN songs.
To summarize these data based on proportions of song-evoked responses in sleeping juvenile birds during the period of sensorimotor integration: (1) neurons in both subregions of LMAN responded in a selective fashion to song stimuli; (2) all songs were more likely to elicit suppression of firing rates rather than excitation, especially JuvCon and OWN songs; (3) SHELL neurons showed a greater tendency toward suppressed responses to JuvCon and OWN songs compared with CORE neurons. Neurons at the population level evinced a preference for juvenile songs over adult songs, regardless of whether the juvenile song was self-generated (OWN) or produced by an age-matched conspecific bird (JuvCon).
This overall pattern of results contrasts markedly with that observed in our previous study in which birds of the same age were urethane-anesthetized rather than sleeping (Achiro and Bottjer, 2013). In that study, neurons in CORE were more likely to respond to playback compared with those in SHELL (0.89 vs 0.68), and neurons in both CORE and SHELL were much more likely to show excitation: CORE neurons never showed suppressed responses whereas ∼80% of responses in SHELL neurons were excitatory and ∼20% were suppressed. In addition, our prior work revealed a large proportion of SHELL neurons that exhibited a significant response only to TUT compared with those in CORE (0.28 vs 0.04), whereas a large proportion of CORE neurons responded to TUT plus other songs compared with SHELL neurons (0.43 vs 0.15; Achiro and Bottjer, 2013). Thus, the SHELL region of LMAN in anesthetized birds contains two distinct populations of neurons during early sensorimotor integration (45 dph): a larger one that responds only to the tutor song and a separate smaller population that responds only to the bird’s own song. In general, CORE neurons in anesthetized birds of this age are much more broadly tuned than SHELL neurons and show little evidence of selective responsivity to tutor song (see Achiro and Bottjer, 2013). The current data did not replicate any of these patterns in sleeping birds (see Discussion; Table 3).
Firing rates during sleeping versus nonsleeping periods
We are confident that we measured spiking responses to song playback during periods of sleep since the use of behavioral criteria has been shown to be highly reliable (Szymczak et al., 1996; Low et al., 2008). In addition, neural measures were consistent with our behavioral scoring: spontaneous firing rates (spikes/s) during the night were lower during periods marked as sleep compared with nonsleep (Wilcoxon signed-rank tests p < 0.0001 for both CORE and SHELL), and the percent of spikes that occurred in bursts (ISIs < 5 ms) was higher during sleep periods than during nonsleep periods (Wilcoxon signed-rank tests p < 0.005 for both CORE and SHELL; Fig. 4, left panel). This pattern is consistent with that observed in thalamocortical neurons in mammals, which fire at high regular rates during waking versus low rates interspersed with bursts during sleep or anesthesia (Steriade and Llinás, 1988; Swadlow and Gusev, 2001; Weyand et al., 2001). A similar pattern has been observed in adult songbirds: spontaneous bursting frequently occurs during sleep in neurons of the motor pathway (HVC and RA; Fig. 1), but not in awake nonsinging birds, and sleep bursts are dependent on bursting activity in the thalamic nucleus Uva (uvaeform nucleus; Yu and Margoliash, 1996; Dave et al., 1998; Hahnloser et al., 2002, 2008).
Response strengths in subsets of CORE and SHELL neurons were selective for specific songs
Figure 4, right panel, shows absolute values of response strengths for all responses to each song type (including both excitatory and suppressed responses) for CORE and SHELL neurons. This measure revealed no difference in firing rates between songs in SHELL but a significant difference in CORE (Friedman test: CORE, p = 0.039; SHELL, p = 0.257); however, no individual comparisons were significant for CORE neurons despite the stronger response to JuvCon relative to other songs (p = 0.076 for JuvCon vs TUT, Wilcoxon signed-rank tests, Benjamini–Hochberg corrected). Given the relatively large proportion of significant responses to JuvCon song in both CORE and SHELL (Fig. 2), we compared absolute values of response strengths across song types for the subset of neurons that responded significantly to JuvCon (n = 19 CORE, n = 21 SHELL). Figure 4, right, middle, shows that this subpopulation in both CORE and SHELL exhibited a significantly higher firing rate to JuvCon compared with the other three song types (Friedman tests, p < 0.0001 in both CORE and SHELL; Wilcoxon signed-rank tests for JuvCon vs other song types always p < 0.004 or lower, Benjamini–Hochberg corrected). To determine whether this selective increase in firing rate was restricted to JuvCon-responsive neurons, we calculated firing rates for each subset of neurons that showed a significant response to the remaining three song types. A similar pattern was obtained for OWN-responsive, AdlCon-responsive, and TUT-responsive neurons, showing that single neurons that responded significantly to a given song type also showed a higher firing rate to that song type compared with other song stimuli. For example, OWN-responsive neurons in both CORE and SHELL had significantly higher response strengths to OWN compared with all other songs (Wilcoxon signed-rank tests for OWN vs other song types in SHELL always p < 0.001; in CORE always p < 0.005; Benjamini–Hochberg correction for multiple comparisons; Fig. 4, right, bottom). (We did not perform statistical tests for TUT-responsive or AdlCon-responsive neurons because of relatively low ns; see Table 1.)
Figure 5, top panels, presents all suppressed responses (less than zero) versus all excited responses (greater than zero). Interestingly, the trend toward stronger responses to JuvCon in CORE neurons shown in the top panel of Figure 4 was vested in excitatory responses: the firing rate to JuvCon in CORE neurons was greater compared with other songs for excited responses but not for suppressed responses (Kruskal–Wallis tests for CORE neurons: excitatory p = 0.024, suppressed p = 0.539; Mann–Whitney tests showed that individual comparisons were significant only for JuvCon vs TUT excited responses, p = 0.003, Benjamini–Hochberg correction). This pattern is consistent with that shown in Figure 2, in which differences in song responsivity were no longer significant in CORE neurons for suppressed-only responses. In contrast to CORE, suppressed, but not excited responses in SHELL neurons showed a significant overall difference in firing rates between song stimuli (Kruskal–Wallis tests for SHELL neurons: excitatory p = 0.228, suppressed p = 0.041), although no individual comparisons were significant (p > 0.17 in all cases, Benjamini–Hochberg correction for multiple comparisons). However, these differences in song-evoked firing rates were not evident when only significant responses were examined (Fig. 5, bottom panels)-statistical analyses performed on suppressed responses were not significant (Kruskal–Wallis: SHELL p = 0.97; CORE p = 0.51; excited responses were not assessed because of the small number of significant responses). The lack of any differences between significant responses to songs indicates that the enhanced firing rate seen in CORE neurons across excited responses (Fig. 5, top right panel) was because of responses that were stronger to JuvCon song but fell short of significance; when only significant excited responses were considered (Fig. 5, bottom right panel), this tendency was eliminated. Thus, significant suppression and excitation did not vary as a function of song type.
Figure 6 shows raster and poststimulus time histograms (PSTHs) for example neurons from LMAN in a bird that was 44 dph. The left panel shows a CORE neuron that showed excitation to playback of JuvCon song, and the right panel shows a SHELL neuron in which the firing rate was suppressed by JuvCon song. The unit on the left showed a consistent increase in firing rate during the first second of song playback, but not at precisely the same time point, as documented in previous studies (Doupe and Solis, 1997; Olveczky et al., 2005; Kao and Brainard, 2006; Kao et al., 2008). Thus, in accord with prior studies in anesthetized birds, firing rates in LMAN neurons are sparse, and song-evoked spikes exhibit a high level of trial-to-trial variability.
Given the prevalence of suppressed responses to JuvCon songs (Fig. 3; Tables 1, 2), we examined neural selectivity between pairs of stimuli for JuvCon-suppressed neurons by calculating the difference in response strength between song types (see Materials and Methods). Response strengths to OWN, AdlCon, and TUT were subtracted from significantly suppressed JuvCon responses for each cell. A positive difference score indicates that a neuron preferred JuvCon song over comparison songs. Figure 7 shows cumulative distributions of difference scores in CORE versus SHELL neurons for JuvCon against each of the three other song types (n = 13 CORE, n = 19 SHELL). CORE and SHELL neurons clearly showed the same degree of preference for JuvCon song (Kolmogorov–Smirnov tests, p always > 0.87). A similar pattern of selectivity in CORE versus SHELL neurons was obtained when we compared cumulative distributions of difference scores for OWN (n = 9 CORE, n = 18 SHELL) against each of the three other song types (data not shown). Furthermore, very few neurons exhibited negative selectivity scores; the preponderance of positive scores in Figure 7 shows that cells that exhibited significant suppression to JuvCon almost never showed greater suppression to any other song stimulus. For example, only one SHELL neuron and no CORE neurons showed stronger suppression to AdlCon compared with JuvCon (Fig. 7). One-sample Wilcoxon signed-rank tests to assess whether the distributions were different from zero always yielded p values of <0.0001.
Discussion
The overall pattern of song-evoked responses that we observed in this study contrasted markedly with that in our previous work (Achiro and Bottjer, 2013). Despite the fact that birds of the same age from our breeding colony were used, and all experimental procedures and analyses were the same between the two studies, song-evoked responses in sleeping birds (this study) were substantially different from those observed in our previous work using anesthetized birds (Table 3). Comparing the current results to our prior study, in the present study, the overall level of song-evoked responsivity was lower, responses were overwhelmingly suppressed instead of excited, the incidence of SHELL neurons selectively tuned to tutor song was extremely low, and neurons in both CORE and SHELL tended to show a preference for JuvCon and OWN songs. The preference for juvenile over adult songs is similar to that reported by Yuan and Bottjer (2019) for juvenile RA neurons, which receive direct input from LMAN-CORE, raising the possibility that this preference arises in the corticothalamic circuit that includes CORE.
Responses in awake versus sleeping or anesthetized states of adult animals
Neural responses to song playback in the motor pathway of the song system, including the cortical regions HVC and RA (Fig. 1), are greatly diminished or absent in awake adult male songbirds but are unmasked under anesthesia or in sleep (Dave et al., 1998; Schmidt and Konishi, 1998; Dave and Margoliash, 2000; Cardin and Schmidt, 2003, 2004a, b; Rauske et al., 2003). This pattern may reflect, at least in part, a general tendency for responses in auditory and/or sensorimotor brain regions to be suppressed during self-generated sounds (Suga and Shimozawa, 1974; Poulet and Hedwig, 2006, 2007; Eliades and Wang, 2008; Singla et al., 2017). For example, neurons in auditory cortex of marmosets show suppression during vocal production; however, responses to their self-generated vocalizations are unmasked when auditory feedback is altered by real-time frequency shifts delivered through headphones (Eliades and Wang, 2008). One idea to arise from such findings is that learned signals from motor or other nonauditory inputs can predict auditory feedback and cancel responses to corresponding auditory sounds. A variant of this idea might explain the absence of song-evoked responses in awake songbirds; for example, motor circuits or pathways for efference copy might act to suppress auditory responses in an awake state even in the absence of active vocalizing. More broadly, the tendency for responses in awake or vocalizing animals to be suppressed is consistent with the idea that behavioral state can regulate a “gate” that controls auditory input.
Qualitative comparisons showing the similarity between song-evoked responses in sleeping versus anesthetized adult male songbirds showed that sleep and anesthesia entail similar behavioral states (Dave et al., 1998; Dave and Margoliash, 2000; Nick and Konishi, 2001). In both anesthetized and sleeping adult songbirds, neurons throughout the song system are selectively tuned to each individual bird’s own song (Margoliash, 1983, 1986; Margoliash and Fortune, 1992; Vicario and Yohay, 1993; Nick and Konishi, 2001; Cardin and Schmidt, 2003; Person and Perkel, 2007). Cardin and Schmidt (2003) directly compared responses of HVC neurons in anesthetized, sleeping/drowsy, and awake adult zebra finches; responses in both anesthetized and sleeping birds were consistently selective for OWN songs, whereas responses in waking birds were highly variable and not selective for OWN. Responses to playback in awake birds reflected the level of arousal: higher levels of arousal uniformly suppressed song-evoked responses in HVC (but had no effect in primary auditory cortex). The similarity of selective responses to OWN song in sleeping and anesthetized birds encouraged the idea that similar behavioral states underlie sleep and anesthesia.
Responses in awake versus sleeping or anesthetized states of juvenile animals
Very few studies have examined responses to song playback in juvenile songbirds during sleep. Nick and Konishi (2005a) reported that multiunit responses in HVC were strongest to tutor song in awake juvenile zebra finches during early sensorimotor integration, whereas OWN was preferred over tutor songs during sleep. Selectivity for OWN song changed over development in a pattern that tracked the current motor version of each bird’s song (Nick and Konishi, 2005b). Spontaneous patterns of spiking in HVC during sleep also change over song development: both firing rate and bursting increase with age (Crandall et al., 2007).
We are not aware of any previous studies that recorded the response of LMAN neurons to song playback during sleep in juvenile songbirds. Comparison of the present results in juvenile sleeping birds with those of our prior work in anesthetized juveniles (Achiro and Bottjer, 2013) clearly shows that responses of LMAN neurons during sleep are substantially different from those recorded under urethane anesthesia in zebra finches during early sensorimotor integration. Salient differences in LMAN activity between this study and our previously published work are summarized in Table 3. Activity patterns in anesthetized birds differed between CORE and SHELL for each of the measures listed in Table 3, whereas none of these measures varied between regions in sleeping birds. Two particularly striking differences are the dominance of suppressed responses in the present study, and the lack of a prominent neuronal subpopulation that responds selectively to tutor song in SHELL as is seen in anesthetized birds. These differences raise the question of when and how the tutor-tuned SHELL neurons are used in the service of learning. Perhaps our sleep conditions were somehow not conducive to eliciting responses from tutor-selective neurons, in which case they may have an important sleep-related function under other sleep conditions. Or perhaps these neurons are actively involved in some aspect of learning during sleep but are gated off from activation via external auditory playback. Another possibility is that tutor-tuned neurons can be activated during awake states (as for HVC neurons of juvenile birds; Nick and Konishi, 2005a), particularly during singing. If tutor-tuned neurons in awake birds are activated only during singing, their activity might be difficult to identify in the context of motor-related activity (Achiro et al., 2017).
Sleep is essential for vocal learning in juvenile zebra finches (Dave and Margoliash, 2000; Derégnaucourt et al., 2005; Crandall et al., 2007; Shank and Margoliash, 2009; Margoliash and Schmidt, 2010), which brings into question the influence of developmental changes in song-evoked activity during sleep, patterns of spontaneous spiking, and maturation of EEG patterns (see below). Such changes within sensorimotor song regions may be related to substantial changes in the neural substrate for song learning (Alvarez-Buylla et al., 1988; Nordeen and Nordeen, 1988a,b; Herrmann and Arnold, 1991; Johnson and Bottjer, 1992, 1993, 1994; Nordeen et al., 1992; Livingston and Mooney, 1997; Foster and Bottjer, 1998; Iyengar et al., 1999; Kittelberger and Mooney, 1999; Livingston et al., 2000; Nixdorf-Bergweiler, 2001; Iyengar and Bottjer, 2002a, b; Bottjer, 2005; Miller-Sims and Bottjer, 2012; Garst-Orozco et al., 2014; Chung and Bottjer, 2022). For example, axonal projections that are present only during early stages of sensorimotor integration may mediate temporally restricted processes of song learning (Miller-Sims and Bottjer, 2012; Chung and Bottjer, 2022); in addition, refinement of axonal connectivity may represent either a morphologic correlate of song learning or a necessary prerequisite for acquisition of song (Iyengar and Bottjer, 2002a, b). Developmental changes in sleep activity as well as in the neural substrate are likely to be related to changing patterns of responsivity to different song types at different stages of learning. A promising area for investigation lies in the extent to which developmental changes in sleep activity, the underlying neural substrate, selectivity for different song types, and maturation of vocal motor production are interrelated.
Comparing urethane anesthesia and different sleep states
EEG patterns are not a reliable indicator of sleep in juvenile zebra finches; the amplitude of 1- to 4-Hz activity (δ, an indicator of slow-wave sleep) did not vary between sleep and wake states in zebra finches between 45–65 dph (Nick and Konishi, 2005a; Crandall et al., 2007). The cortical EEG also does not show evidence of state-dependent activity in early postnatal mammals (Gramsbergen, 1976; Frank and Heller, 1997; Blumberg et al., 2005). Even after EEG patterns differentiate (≥12 d postnatal in rodents), a long period of developmental changes ensues, which may be related to maturational changes that facilitate normal development of the nervous system (Khazipov and Luhmann, 2006; Cirelli and Tononi, 2015; Rensing et al., 2018). In any case, these developmental changes complicate efforts to judge similarity between behavioral states in sleep and anesthesia.
Given the similar patterns of song-evoked activity in sleeping and anesthetized adult songbirds, before beginning this study, we assumed that responses to song playback during sleep in LMAN of juvenile birds would replicate our previous results in anesthetized birds. Because we did not intend to study sleep-related factors we made no effort to characterize different stages of sleep in relation to playback. Despite the fact that EEG patterns do not correlate with sleep stages in juvenile animals (Gramsbergen, 1976; Frank and Heller, 1997; Blumberg et al., 2005; Nick and Konishi, 2005a; Crandall et al., 2007; Cirelli and Tononi, 2015), different states of sleep and/or ultradian rhythms may nevertheless influence song responsivity. If so, different sleep states might provide a possible alternative explanation of the stark differences we observed between song-evoked activity in LMAN of sleeping versus anesthetized juvenile zebra finches. Robust responses to song playback are observed during slow wave sleep in HVC of adult zebra finches (Nick and Konishi, 2001). We are not aware of any studies that have compared song-evoked responses during REM (rapid eye movement) versus non-REM sleep. It would be interesting to correlate responsivity to song playback with EEG patterns in adult birds, taking into account that episodes of different sleep states are quite brief (<30 s in adult budgies) and slow wave sleep decreases through the night while REM sleep increases (Canavan and Margoliash, 2020). It is not clear how informative this approach might be in young songbirds given that EEG patterns are not a reliable indicator of sleep states in juvenile animals, although it is nevertheless possible that a given song type could elicit different neural responses in sensorimotor song regions depending on EEG activity.
The similarity of selective responses to OWN songs under sleep and anesthesia in HVC neurons of adult songbirds has encouraged the idea that behavioral states are highly similar between the two conditions. However, this idea has not been extensively tested in either birds or mammals. Some studies have suggested that urethane anesthesia mimics sleep, based on alternation of EEG patterns between a slow-wave state that resembles non-REM sleep and an “activated” state with features of both REM sleep and waking (Clement et al., 2008; Pagliardini et al., 2013; Tisdale et al., 2018; Hauer et al., 2021; Silver et al., 2021). Recent work has not supported this idea, based on detailed comparisons that measured several correlates to define waking versus sleeping states, including power spectra of EEGs, synchronization between high-frequency (γ) oscillations in different brain regions, directional patterns of activation, and temporal complexity of neural oscillations (Mashour and Hudetz, 2018; Kelz and Mashour, 2019; Mashour et al., 2020). Within-subject comparisons of sleep versus urethane anesthesia in rats indicated that these EEG correlates of consciousness were significantly lower during anesthesia compared with sleep (Mondino et al., 2021). For example, normalized power of δ oscillations was higher during both “REM-like” and “non-REM-like” states of urethane anesthesia compared with their respective REM and non-REM states during sleep. These authors concluded that urethane induces a pattern of “sustained unconsciousness” dissimilar from that of sleep. Thus, it seems likely that differences in patterns of brain activity between sleep and anesthesia could underlie the different responses to song playback that we observed in LMAN of juvenile zebra finches between the current study and our previous work (Achiro and Bottjer, 2013). If so, that would suggest that urethane anesthesia is more effective at removing one or more gates of song-evoked activity compared with sleep.
Footnotes
The authors declare no competing financial interests.
This work was supported by the National Institutes of Health Grant R56 NS098744 and the National Science Foundation Grant 1940957.
C. Le Moing’s present address: Oregon Health & Science University, Portland, OR 97239.
R. Yuan’s present address: University of Oregon, Knight Campus, Eugene, OR 97403.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.