Review
Representation of speech in human auditory cortex: Is it special?
Introduction
The ease with which speech is perceived underscores the refined operations of a neural network capable of rapidly decoding complex acoustic signals and categorizing them into meaningful phonemic sequences. A number of models have been devised to explain how phonemes are extracted from the continuous stream of speech (e.g., McClelland and Elman, 1986; Church, 1987; Pisoni and Luce, 1987; Stevens, 2002). Common to all these models is the recognition that phonemic perception is a categorization task based on sound profiles derived from a multidimensional space encompassing numerous acoustic features unfolding over time (Holt and Lotto, 2010). Each feature is characterized by acoustic parameters that vary along intensity, spectral, and temporal dimensions. Increased intensity, especially in the low to mid-frequency ranges, helps to distinguish vowels from consonants (McClelland and Elman, 1986; Stevens, 2002). Distinct spectral (formant) patterns during these periods of increased intensity promote accurate vowel identification (Hillenbrand et al., 1995).
The temporal dimension of phonemic categorization has received increased attention in recent years. An influential proposal posits that speech perception occurs over several overlapping time scales (e.g., Poeppel et al., 2008; Poeppel et al., 2012; Giraud and Poeppel, 2012). Syllabic analyses occur within a time frame of about 150–300 ms and correlate with the amplitude envelope of speech. Speech comprehension remains high even when sentence fragments are time-reversed in 50 ms bins, and only becomes severely degraded when time-reversals occur at frequencies overlapping those of the speech envelope (Saberi and Perrott, 1999). Furthermore, temporal smearing of the speech envelope leads to significant degradation in the intelligibility of sentences only at frequencies commensurate with the speech envelope (Drullman et al., 1994).
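The local time-reversal manipulation of Saberi and Perrott (1999) can be sketched as follows. This is an illustrative sketch only: the bin length, sampling rate, and toy signal are assumptions for demonstration, not materials from the original study.

```python
def reverse_in_bins(samples, bin_len):
    """Time-reverse a signal locally within consecutive bins of bin_len samples.

    Short bins (e.g., ~50 ms worth of samples) leave intelligibility largely
    intact; bins long enough to disrupt the speech envelope degrade it
    severely (Saberi and Perrott, 1999).
    """
    out = []
    for start in range(0, len(samples), bin_len):
        out.extend(reversed(samples[start:start + bin_len]))
    return out

# Illustrative: at a 16 kHz sampling rate, a 50 ms bin is 800 samples.
signal = list(range(10))
print(reverse_in_bins(signal, 4))  # [3, 2, 1, 0, 7, 6, 5, 4, 9, 8]
```

Note that when `bin_len` equals the signal length, the manipulation reduces to global time reversal, the condition in which comprehension collapses.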
More refined acoustic feature analyses are performed within shorter temporal windows of integration that vary between about 20 and 80 ms. Segmentation of speech within this range is critical for phonetic feature encoding, especially for shorter duration consonants. Times at which rapid temporal and spectral changes occur are informationally rich landmarks in the speech waveform (Stevens, 1981; Stevens, 2002). Both the spectra and formant transition trajectories occurring at these landmarks are crucial for accurate identification of true consonants such as the stops (Kewley-Port, 1983; Walley and Carrell, 1983; Alexander and Kluender, 2009). Voice onset time (VOT), the time between consonant release and the onset of rhythmic vocal cord vibrations, is a classic example of rapid temporal discontinuities that help to distinguish voiced consonants (e.g., /b/, /d/, and /g/) from their unvoiced counterparts (e.g., /p/, /t/, and /k/) (e.g., Lisker and Abramson, 1964; Faulkner and Rosen, 1999). Indeed, when semantic information is lacking, listeners of time-reversed speech have significant comprehension difficulties at the shorter temporal intervals required for phonetic feature encoding (Kiss et al., 2008).
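The categorical nature of VOT perception can be caricatured as a boundary decision. The sketch below is a toy model, not the authors' analysis: the ~25 ms boundary is an illustrative value for English bilabial stops, and real perception is probabilistic near the boundary rather than a hard threshold.

```python
def classify_voicing(vot_ms, boundary_ms=25.0):
    """Toy categorical decision on voice onset time (VOT).

    English listeners tend to label bilabial stops with short VOT as voiced
    /b/ and longer VOT as voiceless /p/. The boundary value here is an
    illustrative assumption; actual category boundaries vary with place of
    articulation, speaking rate, and language.
    """
    return "voiced" if vot_ms < boundary_ms else "voiceless"

print(classify_voicing(10.0))  # voiced
print(classify_voicing(60.0))  # voiceless
```

The point of the caricature is that a continuous acoustic variable (VOT in ms) maps onto a discrete phonemic label, which is what "categorical-like" neural responses are measured against.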
Early stations in the human auditory system are exquisitely tuned to encode speech-related acoustic features. Population brainstem responses accurately represent the intensity, spectrum, and temporal envelope of speech sounds (Chandrasekaran et al., 2009, Anderson and Kraus, 2010). Magnetoencephalographic (MEG) responses reflect consonant place of articulation (POA) within 50 ms after sound onset (Tavabi et al., 2007), and within 100 ms, responses differentiate intelligible versus unintelligible speech (Obleser et al., 2006). Neural responses obtained from intracranial recordings in Heschl's gyrus (HG), the putative location of primary auditory cortex in humans (Hackett et al., 2001), demonstrate categorical-like changes to syllables that vary in their VOT in a manner that parallels perception (Steinschneider et al., 1999, Steinschneider et al., 2005). Spectrotemporal receptive fields derived from single unit activity in HG elicited by one portion of a movie soundtrack dialog can accurately predict response patterns elicited by a different portion of the same dialog (Bitterman et al., 2008). Finally, both MEG responses and responses obtained from invasive recordings within HG have shown that accurate tracking of the speech envelope degrades in parallel with the ability to perceive temporally compressed speech (Ahissar et al., 2001, Nourski et al., 2009; see also Peelle et al., 2013). These observations lend support to the conclusion that “acoustic–phonetic features of the speech signal such as voicing, spectral shape, formants or amplitude modulation are made accessible by the computations of the ascending auditory pathway and primary auditory cortex” (Obleser and Eisner, 2008, p. 16).
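The spectrotemporal receptive field (STRF) prediction reported by Bitterman et al. (2008) rests on a linear model: the predicted response is the stimulus spectrogram convolved with the STRF across frequency and time lag. A minimal sketch follows, assuming NumPy; the spectrogram and kernel here are synthetic stand-ins, not fitted to any data from the cited study.

```python
import numpy as np

def strf_predict(spectrogram, strf):
    """Predict a neural response from a spectrogram under a linear STRF model.

    spectrogram: array of shape (n_freq, n_time)
    strf: array of shape (n_freq, n_lags)
    Returns r with r[t] = sum over f, tau of strf[f, tau] * spectrogram[f, t - tau].
    """
    n_freq, n_time = spectrogram.shape
    _, n_lags = strf.shape
    r = np.zeros(n_time)
    for tau in range(n_lags):
        # shift the spectrogram by tau time bins, weight by the STRF slice at that lag
        r[tau:] += strf[:, tau] @ spectrogram[:, :n_time - tau]
    return r

rng = np.random.default_rng(0)
spec = rng.random((8, 100))      # synthetic 8-band spectrogram, 100 time bins
strf = rng.standard_normal((8, 5))  # synthetic kernel with 5 time lags
print(strf_predict(spec, strf).shape)  # (100,)
```

In practice the STRF is estimated from one stretch of stimulus and response, then validated by predicting the response to a held-out stretch, which is the logic of the Bitterman et al. result.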
An important and unresolved question is whether the representation of acoustic features of speech in the brain is based on neural processing mechanisms that are unique to humans and shaped by learning and experience with an individual's native language. The role of experience in modifying auditory cortical physiology is prominently observed during early development. The appearance of the mismatch negativity component of the event-related potential becomes restricted to native-language phonemic contrasts by 7½ months of age (Kuhl and Rivera-Gaxiola, 2008). Stronger native-language-specific responses predict enhanced language skills at two years of age. The emergence of new event-related potentials that parallel developmental milestones in speech processing provides an additional example of neural circuitry changes derived from language experience (Friederici, 2005). In adults, both gray matter volume of primary auditory cortex and the amplitude of short-latency auditory evoked potentials generated in primary auditory cortex are larger in adult musicians than in musically-naïve subjects (Schneider et al., 2002). Recordings from animal models that are complex vocal learners such as songbirds also demonstrate pronounced modifications that occur in auditory forebrain processing of sound based on developmental exposure to species-specific vocalizations (e.g., Woolley, 2012). In sum, it remains unclear how "special" or unique in mammalian physiology human primary auditory cortex is with regard to decoding the building blocks of speech.
Here, we examine this question by comparing the neural activity elicited by speech in primary auditory cortex (A1) of macaque monkeys, who are limited vocal learners, with that in HG of humans, who are obviously expert vocal learners (Petkov and Jarvis, 2012). Neural activity from human primary auditory cortex was acquired during intracranial recordings in patients undergoing surgical evaluation for medically intractable epilepsy. Measures included averaged evoked potentials (AEPs) and event-related band power (ERBP) in the high gamma (70–150 Hz) frequency range. Comparable population recordings were performed in the macaques. Measures included AEPs, the derived current source density (CSD), and multiunit activity (MUA). The focus of this report will be on clarifying the neural representation of acoustic features of speech that vary along both temporal and spectral dimensions. Some of the results represent a summary of previous studies from human and monkey primary auditory cortex. The remainder of the results represents new data that extend the previous findings. If perceptually-relevant features of speech are encoded similarly in humans and monkeys, then it is reasonable to conclude that human primary auditory cortex is not special.
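High gamma ERBP is conventionally obtained by band-limiting the signal to roughly 70–150 Hz and taking the magnitude of its analytic envelope, expressed relative to a baseline. The following is a simplified FFT-based sketch, assuming NumPy; the brick-wall filter and the synthetic test signal are illustrative choices, not the exact pipeline used for the recordings in this study.

```python
import numpy as np

def high_gamma_envelope(x, fs, lo=70.0, hi=150.0):
    """Band-limited analytic envelope via an FFT brick-wall filter.

    Keeping only positive frequencies in [lo, hi] (doubled) and inverse
    transforming yields the analytic signal of the band-passed data, so its
    magnitude is the instantaneous high gamma envelope.
    """
    n = len(x)
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    spec = np.fft.fft(x)
    band = (freqs >= lo) & (freqs <= hi)   # positive-frequency band only
    analytic = np.fft.ifft(np.where(band, 2.0 * spec, 0.0))
    return np.abs(analytic)

fs = 1000.0
t = np.arange(0, 1.0, 1.0 / fs)
# unit-amplitude 100 Hz component (inside the band) plus a 10 Hz component (outside it)
x = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 10 * t)
env = high_gamma_envelope(x, fs)
print(round(float(env.mean()), 2))  # 1.0: only the 100 Hz component survives
```

In an actual ERBP analysis this envelope would be squared and log-scaled relative to a pre-stimulus baseline, and a finite-impulse-response filter would typically replace the brick-wall spectral mask.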
Subjects
Results presented in this report represent neurophysiological data obtained from multiple male monkeys (Macaca fascicularis) that have been accumulated over many years. During this time, there have been gradual changes in methodology. The reader is referred to the cited publications for methodological details (i.e., Fig. 3, Steinschneider et al., 2003; six subjects; Fig. 8, Steinschneider and Fishman, 2011; four subjects). Methods described here refer to studies involving two monkey subjects
Monkey
Entrainment to the temporal envelope of vocalizations within auditory cortex is specific neither to humans nor to human speech. Fig. 1 demonstrates neural entrainment to the temporal envelope of three monkey vocalizations at a low best frequency (BF) location within A1. The left-hand graph in Fig. 1A depicts the frequency response function (FRF) of this site based on responses to pure tones presented at 60 dB SPL. The BF of this site is approximately 400 Hz, with a secondary peak at the 200 Hz subharmonic. FRFs based on responses to tones
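Envelope entrainment of the kind shown in Fig. 1 can be quantified in several ways; one simple metric is the peak normalized cross-correlation between the stimulus amplitude envelope and the neural response, allowing a short lag for conduction and processing delay. The sketch below assumes NumPy, and the metric, lag range, and synthetic signals are illustrative assumptions rather than the analyses used in the original recordings.

```python
import numpy as np

def entrainment_score(envelope, response, max_lag=20):
    """Peak normalized cross-correlation between a stimulus envelope and a
    neural response, searching over small positive lags (response trailing
    the stimulus). Values near 1 indicate strong envelope-following."""
    env = (envelope - envelope.mean()) / envelope.std()
    resp = (response - response.mean()) / response.std()
    return max(
        float(np.mean(env[:len(env) - lag] * resp[lag:])) if lag else float(np.mean(env * resp))
        for lag in range(max_lag)
    )

rng = np.random.default_rng(1)
env = np.abs(rng.standard_normal(500))          # synthetic stimulus envelope
resp = np.roll(env, 5) + 0.1 * rng.standard_normal(500)  # delayed, noisy copy
print(entrainment_score(env, resp) > 0.9)  # True
```

Frequency-domain measures such as coherence between envelope and response serve the same purpose and are more common in the entrainment literature.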
Summary and general conclusions
The key finding of this study is that neural representations of fundamental temporal and spectral features of speech by population responses in primary auditory cortex are remarkably similar in monkeys and humans, despite their vastly different experience with human language. Thus, it appears that plasticity-induced language learning does not significantly alter response patterns elicited by the acoustical properties of speech sounds in primary auditory cortex of humans as compared with
Contributors
All authors participated in the writing of this manuscript. Drs. Fishman and Steinschneider collected and analyzed data obtained from monkeys. Drs. Nourski and Steinschneider collected and analyzed data obtained from humans. All authors have approved the final article.
Conflict of interest
All authors state that they have no actual or potential conflict of interest.
Acknowledgments
The authors thank Drs. Joseph C. Arezzo, Charles E. Schroeder and David H. Reser, and Ms. Jeannie Hutagalung for their assistance in the monkey studies, and Drs. Hiroyuki Oya, Hiroto Kawasaki, Ariane Rhone, Christopher Kovach, John F. Brugge, and Matthew A. Howard for their assistance in the human studies. Primate studies supported by the NIH (DC-00657), and human studies supported by NIH (DC04290, UL1RR024979), Hearing Health Foundation, and the Hoover Fund.
References (143)
- et al. Response to broadband repetitive stimuli in auditory cortex of the unanesthetized rat. Hear. Res. (2006)
- et al. Context-dependent encoding in the human auditory brainstem relates to hearing speech in noise: implications for developmental dyslexia. Neuron (2009)
- Phonological parsing and lexical retrieval. Cognition (1987)
- et al. Lifelong plasticity in the rat auditory cortex: basic mechanisms and role of sensory experience. Prog. Brain Res. (2011)
- et al. Temporally dynamic frequency tuning of population responses in monkey primary auditory cortex. Hear. Res. (2009)
- Neurophysiological markers of early language acquisition: from syllables to sentences. Trends Cogn. Sci. (2005)
- A temporal sampling framework for developmental dyslexia. Trends Cogn. Sci. (2011)
- et al. Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage (2002)
- Brain mechanisms in early language acquisition. Neuron (2010)
- et al. The motor theory of speech perception revised. Cognition (1985)
- On the relation of speech to language. Trends Cogn. Sci.
- The TRACE model of speech perception. Cognit. Psychol.
- Functional anatomy of the inferior colliculus and the auditory cortex: current source density analyses of click-evoked potentials. Hear. Res.
- Genetic advances in the study of speech and language disorders. Neuron
- Click train encoding in primary and non-primary auditory cortex of anesthetized macaque monkeys. Neuroscience
- Acoustic–phonetic representations in word recognition. Cognition
- Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc. Natl. Acad. Sci. U. S. A.
- Spectral tilt change in stop consonant perception. J. Acoust. Soc. Am.
- Spectral tilt change in stop consonant perception by listeners with hearing impairment. J. Speech Lang. Hear. Res.
- Objective neural indices of speech-in-noise perception. Trends Amplif.
- Dual-pitch processing mechanisms in primate auditory cortex. J. Neurosci.
- Differential neural coding of acoustic flutter within primate auditory cortex. Nat. Neurosci.
- Discrimination in neonates of very short CVs. J. Acoust. Soc. Am.
- Processing of twitter-call fundamental frequencies in insula and auditory cortex of squirrel monkeys. Exp. Brain Res.
- Ultra-fine frequency tuning revealed in single neurons of human auditory cortex. Nature
- Acoustic invariance in speech production: evidence from measurements of the spectral characteristics of stop consonants. J. Acoust. Soc. Am.
- Perceptual invariance and onset spectra for stop consonant vowel environments. J. Acoust. Soc. Am.
- "How to milk a coat": the effects of semantic and acoustic information on phoneme categorization. J. Acoust. Soc. Am.
- Stimulus-dependent modulations of correlated high-frequency oscillations in cat visual cortex. Cereb. Cortex
- Coding of repetitive transients by auditory cortex on Heschl's gyrus. J. Neurophysiol.
- Comparing the fundamental frequencies of resolved and unresolved harmonics: evidence for two pitch mechanisms. J. Acoust. Soc. Am.
- Noncategorical perception of stop consonants differing in VOT. J. Acoust. Soc. Am.
- The role of onsets in perception of stop consonant place of articulation: effects of spectral and temporal discontinuity. J. Acoust. Soc. Am.
- Categorical speech representation in human superior temporal gyrus. Nat. Neurosci.
- Thalamocortical transformation of responses to complex auditory stimuli. Exp. Brain Res.
- Auditory thalamocortical synaptic transmission in vitro. J. Neurophysiol.
- Task reward structure shapes rapid receptive field plasticity in auditory cortex. Proc. Natl. Acad. Sci. U. S. A.
- Speech perception. Annu. Rev. Psychol.
- Emergence of neural encoding of auditory objects while listening to competing speakers. Proc. Natl. Acad. Sci. U. S. A.
- Effect of temporal envelope smearing on speech reception. J. Acoust. Soc. Am.
- Neural correlates of gap detection and auditory fusion in cat auditory cortex. Neuroreport
- Representation of spectral and temporal sound features in three cortical fields of the cat: similarities outweigh differences. J. Neurophysiol.
- Neural responses in primary auditory cortex mimic psychophysical, across-frequency-channel, gap-detection thresholds. J. Neurophysiol.
- Temporal modulation transfer functions in cat primary auditory cortex: separating stimulus effects from neural mechanisms. J. Neurophysiol.
- Linguistic experience and phonetic perception in infancy: a crosslinguistic study. Child Dev.
- Speech perception in infants. Science
- Cortical activity patterns predict speech discrimination ability. Nat. Neurosci.
- Contributions of temporal encodings of voicing, voicelessness, fundamental frequency, and amplitude variation to audio-visual and auditory speech perception. J. Acoust. Soc. Am.
- Searching for the mismatch negativity in primary auditory cortex of the awake monkey: deviance detection or stimulus specific adaptation? J. Neurosci.
- On the pitch of periodic pulses. J. Acoust. Soc. Am.