Abstract
Voices are arguably among the most relevant sounds in humans’ everyday life, and several studies have suggested the existence of voice-selective regions in the human brain. Despite two decades of research, defining the human brain regions supporting voice recognition remains challenging. Moreover, whether neural selectivity to voices is merely driven by acoustic properties specific to human voices (e.g., spectrogram, harmonicity), or whether it also reflects a higher-level categorization response, is still under debate. Here, we objectively measured rapid automatic categorization responses to human voices with fast periodic auditory stimulation (FPAS) combined with electroencephalography (EEG). Participants were tested with stimulation sequences containing heterogeneous non-vocal sounds from different categories presented at 4 Hz (i.e., four stimuli/s), with vocal sounds appearing as every third stimulus (i.e., at 4/3 ≈ 1.333 Hz). A few minutes of stimulation are sufficient to elicit robust 1.333-Hz voice-selective focal brain responses over superior temporal regions of individual participants. This response is virtually absent for sequences using frequency-scrambled sounds, but is clearly observed when voices are presented among sounds from musical instruments matched for pitch and harmonicity-to-noise ratio (HNR). Overall, our FPAS paradigm demonstrates that the human brain seamlessly categorizes human voices when compared with other sounds, including matched musical instruments, and that voice-selective responses are at least partially independent of low-level acoustic features, making FPAS a powerful and versatile tool for understanding human auditory categorization in general.
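To make the frequency-tagging logic concrete, the minimal sketch below (in Python with NumPy) simulates an EEG segment containing a response at the 4/3-Hz voice-presentation rate and quantifies it as a signal-to-noise ratio (SNR): the amplitude in the target frequency bin divided by the mean amplitude of neighboring bins, a standard frequency-tagging metric. This is a hypothetical illustration, not the analysis pipeline used in this study; the snr_at_frequency helper and all parameter values are illustrative assumptions.

import numpy as np

def snr_at_frequency(eeg, fs, target_hz, n_neighbors=10, n_skip=1):
    # Amplitude spectrum of the EEG segment
    n = len(eeg)
    amp = np.abs(np.fft.rfft(eeg)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    target = int(np.argmin(np.abs(freqs - target_hz)))
    # Noise estimate: n_neighbors bins on each side of the target,
    # skipping the bins immediately adjacent to it
    lo = amp[target - n_skip - n_neighbors : target - n_skip]
    hi = amp[target + n_skip + 1 : target + n_skip + 1 + n_neighbors]
    return amp[target] / np.mean(np.concatenate([lo, hi]))

# Hypothetical data: 60 s of single-channel EEG at 512 Hz with a simulated
# voice-selective response at the 4/3-Hz oddball rate embedded in noise
rng = np.random.default_rng(0)
fs, dur = 512, 60
t = np.arange(fs * dur) / fs
oddball = 4.0 / 3.0  # voices appear as every third stimulus in a 4-Hz stream
eeg = 0.5 * np.sin(2 * np.pi * oddball * t) + rng.standard_normal(len(t))
print(f"SNR at {oddball:.3f} Hz: {snr_at_frequency(eeg, fs, oddball):.2f}")

In this scheme, an SNR reliably above 1 at 1.333 Hz (and its harmonics) indexes a differential, voice-selective response, whereas the response at the 4-Hz base rate indexes general auditory processing common to all stimuli.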
Significance Statement
Voices are arguably among the most relevant sounds we hear in our everyday life, and several studies have corroborated the existence of regions in the human brain that respond preferentially to voices. However, whether this preference is driven by specific acoustic properties of voices or whether it rather reflects a higher-level categorization response is still under debate. We propose a new approach to objectively identify rapid automatic voice-selective responses with frequency tagging and electroencephalographic (EEG) recordings. In only 4 min of recording, we measured robust voice-selective responses independent of low-level acoustic cues, making this approach highly promising for studying auditory perception in children and clinical populations.
Footnotes
The authors declare no competing financial interests.
This work was supported in part by the European Research Council Grant Mapping the Deprived Visual System (MADVIS): Cracking Function for Prediction (Project 337573, ERC-2013-StG; to O.C.), the Belgian Excellence of Science Program (Project 30991544; partly to O.C. and B.R.), and by the Fonds de la Recherche Scientifique-Fonds National de la Recherche Scientifique (FRS-FNRS) Mandat d’Impulsion Scientifique (to O.C.). F.M.B. is a PhD student supported by FRS-FNRS Belgium, R.C.P. is a PhD student supported by FRIA, S.T. is a PhD student supported by a Louvain Cooperation Grant, and O.C. is a research associate at FRS-FNRS Belgium.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.