Research Article: New Research, Cognition and Behavior

Distinguishing Fine Structure and Summary Representation of Sound Textures from Neural Activity

Martina Berto, Emiliano Ricciardi, Pietro Pietrini, Nathan Weisz and Davide Bottari
eNeuro 29 September 2023, 10 (10) ENEURO.0026-23.2023; https://doi.org/10.1523/ENEURO.0026-23.2023
Martina Berto,1 Emiliano Ricciardi,1 Pietro Pietrini,1 Nathan Weisz,2,3 and Davide Bottari1

1Molecular Mind Lab, IMT School for Advanced Studies Lucca, Lucca, 55100, Italy
2Department of Psychology and Centre for Cognitive Neuroscience, Paris-Lodron University of Salzburg, 5020, Austria
3Neuroscience Institute, Christian Doppler University Hospital, Paracelsus Medical University, Salzburg, 5020, Austria


Figures

Figure 1.

Experimental stimuli. A, Computational texture model to extract auditory statistics. An original recording of a natural sound texture is passed through the auditory texture model (the list of presented sound textures is available as Extended Data Fig. 1-2). The model provides a mathematical formulation of the auditory system’s computations (auditory statistics) to represent the sound object. The signal is filtered with 32 audio filters to extract the analytic signal and amplitude envelope of each cochlear sub-band. Envelopes are downsampled and multiplied by a compression factor. From the compressed envelopes, a first set of statistics is computed: marginal moments (including envelope mean, variance, and skewness), autocorrelation between temporal intervals, and cross-band correlations. Compressed envelopes are then filtered with 20 modulation filters. The remaining statistics are extracted from the filtered envelopes: modulation power and cross-band correlations between envelopes filtered with the same modulation filter (C1) and between the same envelope filtered through different filters (C2). B, Schematic of sound synthesis. The white-noise sample is filtered through the auditory model (McDermott and Simoncelli, 2011) to extract its cochlear envelopes, which are then subtracted from those obtained from the original sound texture. The average statistics from the original sound textures are then imposed on the subtracted white noise envelopes. The outcome is multiplied by the fine structure of the white noise sample to preserve its local acoustic distribution (e.g., temporal structure). The result is recombined into the synthetic signal, and the procedure is iterated until a desired SNR of 20 dB is reached. C, Impact of white noise sample and imposed statistics on synthetic sounds. Two different sets of statistics are extracted from two sound textures: “frogs” and “horse trotting.” Each set of values is imposed on two different random white noise samples.
When the same statistics are imposed on different white noise samples, the outcomes are two synthetic exemplars of the same sound texture. These exemplars have the same summary statistical representation but diverge in their local features, which are inherited from the original input sound. When different statistics are imposed on the same white noise sample, the results are two synthetic exemplars that diverge in their overall summary statistics and are perceptually associated with different sound objects. The cochleograms of the 0.5-s synthetic exemplars are displayed. D, Similarity of statistics between excerpt pairs. Pairs of sound excerpts presented in the study (repeated and novel; see Fig. 2A for the experimental protocol) could be derived from different white noise samples onto which we imposed the same statistics (in coral) or from the same white noise sample with different statistics (in blue). The summary statistics similarity between these pairs of synthetic excerpts was computed by averaging the SNRs between statistics of repeated and novel sounds, measured separately for each statistical class. Boxplots show the averaged SNRs at three sound durations of interest (short, 40 ms; medium, 209 ms; long, 478 ms). When sounds were short (40 ms), statistical values were more similar for sounds derived from the same white noise sample (in blue) than for sounds derived from different ones (in coral), even though the imposed statistics differed. As duration increased (209, 478 ms), statistics progressively converged to their original values and became more dissimilar for sounds with different generative statistics (blue) than for sounds sharing the same statistics (coral), irrespective of the original white noise sample. ***p < 0.001. E, Comparing auditory statistics of 478-ms synthetic sounds.
Envelope marginal moments (mean, skewness, and variance) of all sound textures are displayed; those from three randomly selected sound excerpts are highlighted: two have the same imposed auditory statistics (in red and yellow), and one has different statistics (in blue). The bottom row displays the remaining statistics (envelope correlation, modulation power, C1, and C2). The similarity between statistical values is higher when the sounds come from the same original texture. F, Similarity between envelope pairs of short sounds. In the top panel, boxplots represent the correlation coefficients (r) measured between broadband envelopes for each pair of 40-ms sound excerpts (repeated and novel; n = 6912), divided according to experiment (local features or summary statistics). Amplitude modulations of brief excerpts are significantly more similar when sound pairs originate from the same white noise sample (summary statistics experiment) than when they do not (local features experiment), irrespective of their imposed generative statistics. ***p < 0.001. The bottom panel shows examples of the 40-ms broadband envelopes used to compute the correlation coefficients (r) above.
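The measurements the caption relies on (compressed sub-band envelopes, envelope marginal moments, and the SNR used in panel D to score similarity between statistic sets) can be sketched in a few lines. This is a minimal illustration, not the authors' code or the full 32-channel texture model of McDermott and Simoncelli (2011): the function names, the single Butterworth band standing in for a cochlear filter, and the compression exponent of 0.3 are all assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def subband_envelope(x, fs, lo, hi, compression=0.3):
    """Band-pass one cochlear-like channel (illustrative Butterworth
    filter, not the model's actual filterbank) and return its
    compressed amplitude envelope (Hilbert magnitude ** compression)."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    env = np.abs(hilbert(sosfilt(sos, x)))
    return env ** compression

def marginal_moments(env):
    """Envelope marginal moments named in the caption:
    mean, variance, and skewness."""
    mu, var = env.mean(), env.var()
    skew = ((env - mu) ** 3).mean() / env.std() ** 3
    return mu, var, skew

def snr_db(reference, estimate):
    """SNR (dB) between two statistic vectors, the kind of similarity
    score averaged per statistical class in panel D (sketch)."""
    reference = np.asarray(reference, dtype=float)
    err = reference - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(err ** 2))
```

By construction, nearly identical statistic vectors yield a high SNR and diverging ones a low SNR, which is the axis the panel D boxplots are plotted on.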

Figure 2.

Experimental procedure and results of time domain analysis. A, Experimental protocol for EEG. Triplets of sounds were presented at a fast rate (one sound every 500 ms). Two sounds were identical (repeated), while the third was different (novel) and could vary in its local features (left) or summary statistics (right) depending on the experiment (local features or summary statistics). Three logarithmically spaced sound durations (short, medium, and long: 40, 209, and 478 ms) were employed (in different sound streams) to tap into each auditory mode separately (local features vs summary statistics processing). The list of presented sound textures is available as Extended Data Figure 1-2. To ensure participants were attentive during the presentation, they performed an orthogonal task consisting of pressing a button when an infrequent target (beep) appeared. Performance accuracy was high in all experiments and durations and is displayed in Extended Data Figure 1-1. B, Grand average topographies of the differential response associated with the sound change (novel sound minus repeated sound) at significant latencies for each experiment and duration. For each latency, electrodes associated with significant clusters are displayed above as red stars on the scalp. *p < 0.025. To the right of the topographical maps, the boxplots represent objective differences between the novel and repeated sounds across all auditory statistics (averaged). The difference was computed between the statistics of the sounds presented in each run, experiment, and duration and averaged across all participants. Within each duration, medians differed between experiments at the 5% significance level: local features > summary statistics at the short duration (40 ms), and summary statistics > local features at the medium (209 ms) and long (478 ms) durations. The evoked response in the EEG agrees with the objective statistical difference measured from the sound excerpts.
C, Grand average electrical activity (negative values are plotted up) of the differential response (novel minus repeated) at significant electrodes (in red) for both short and long durations. Shaded regions show the interpolated standard error of the mean (SE) at each time point. Positive values indicate that the novel sound elicited a greater response than the repeated one. Results of cluster permutation are displayed as black bars extending through significant latencies. *p < 0.025. For the ERPs before subtraction (novel minus repeated), see Extended Data Figure 2-1.
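The significance bars in panels B and C come from a cluster-based permutation test on the novel-minus-repeated difference. The core permutation logic can be conveyed by a much-simplified sign-flip test at a single channel/time point; this sketch omits the clustering over electrodes and latencies that the article's analysis adds, and every name in it is illustrative.

```python
import numpy as np

def sign_flip_pvalue(subject_diffs, n_perm=2000, seed=0):
    """Two-sided sign-flip permutation p-value for subject-level
    difference values (novel minus repeated) at one channel/time point.
    Under the null hypothesis of no condition difference, each
    subject's difference is equally likely to be positive or negative,
    so randomly flipping signs builds the null distribution."""
    d = np.asarray(subject_diffs, dtype=float)
    rng = np.random.default_rng(seed)
    observed = d.mean()
    flips = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    null = (flips * d).mean(axis=1)  # null distribution of the mean
    return float(np.mean(np.abs(null) >= np.abs(observed)))
```

A consistent across-subject difference survives the sign flips (small p), while an inconsistent one does not; cluster-based variants apply the same idea to cluster-level statistics to control for multiple comparisons across channels and latencies.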

Figure 3.

Results of time-frequency analysis. A, Grand average difference (novel minus repeated) of total power for short and long sound durations in both experiments (local features and summary statistics) at significant channels. Rectangular regions mark the latencies and frequency ranges in which power changes differed significantly between experiments after cluster-based permutation. Significant channels are marked as red stars over the sketch of a scalp (*p < 0.05). The left panel displays results for the short duration, showing stronger power desynchronization in the 20- to 28-Hz range (high beta) for local features compared with summary statistics. The right panel shows, for the long duration, higher 4- to 12-Hz (alpha-theta) power synchronization for summary statistics compared with local features. Grand-average topographical maps at significant latencies and frequency ranges are displayed next to the corresponding power-spectrum plots. B, Average power difference between novel and repeated sounds for each frequency band range (slow, medium, and fast), averaged across all significant channels and plotted at all latencies (from 0 to 0.5 s). Significant channels are marked as red stars over the sketch of a scalp. Shaded regions show interpolated SE at each time point. *p < 0.05.
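The band contrasts above (4-12 Hz alpha-theta vs 20-28 Hz high beta) boil down to comparing spectral power within frequency bands between conditions. A minimal sketch follows; it uses Welch's method on a 1-D signal rather than the time-resolved, multi-channel decomposition a real EEG time-frequency analysis would use, and the function names are assumptions.

```python
import numpy as np
from scipy.signal import welch

def band_power(x, fs, lo, hi):
    """Total Welch-spectrum power of signal x within [lo, hi] Hz."""
    f, pxx = welch(x, fs=fs, nperseg=min(len(x), 256))
    band = (f >= lo) & (f <= hi)
    return float(pxx[band].sum())

def band_power_difference(novel, repeated, fs, lo, hi):
    """Novel-minus-repeated power difference in one frequency band,
    mirroring the contrast plotted in panel B (single channel here)."""
    return band_power(novel, fs, lo, hi) - band_power(repeated, fs, lo, hi)
```

A positive difference corresponds to synchronization (more power for the novel sound) and a negative one to desynchronization, matching the sign convention of the panel B traces.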

Extended Data

  • Extended Data Figure 1-1

Behavioral results. Related to Figure 2. A, Group-level average proportion of correct beep detections. Bar plots represent the average hit rate across all participants. Error bars represent the SEM. No significant differences were found across conditions (all p > 0.05). Download Figure 1-1, TIF file.

  • Extended Data Figure 1-2

    List of sound textures. Related to Figures 1 and 2. In local features discrimination, for each sound texture in column 1, two synthetic exemplars of the sound texture were selected. One was presented twice (repeated) and the other was presented as the third element of the triplet (novel). In summary statistics discrimination, sound textures were paired according to perceived similarity (McDermott et al., 2013). For each sound texture in column 1, one synthetic exemplar was selected and presented twice. Then, an exemplar of the texture from the corresponding row in column 2 was selected and used as the third element of the triplet (novel). Download Figure 1-2, TIF file.

  • Extended Data Figure 2-1

Auditory evoked response for repeated and novel sounds. Related to Figure 2. A, Grand-average topographies across participants of the responses to standard and oddball sounds for each experiment (local and global discrimination), displayed for short (40 ms) and long (478 ms) durations at latencies of interest. B, Grand-average ERPs across participants of the average amplitude of the central channels displayed in the legend (red circles on the sketch of a scalp). ERPs are shown for both standard and oddball sounds for each experiment and duration. Shaded regions show interpolated SEM at each time point. Download Figure 2-1, TIF file.

In this issue: eNeuro, Vol. 10, Issue 10, October 2023

Keywords

  • auditory statistics
  • computational model
  • discriminative response
  • EEG
  • sound change
  • sound details

Responses to this article

No eLetters have been published for this article.


Copyright © 2025 by the Society for Neuroscience.
eNeuro eISSN: 2373-2822
