Research Article: New Research, Cognition and Behavior

EEG Signatures of Auditory Distraction: Neural Responses to Spectral Novelty in Real-World Soundscapes

Silvia Korte, Thorge Haupt and Martin G. Bleichner
eNeuro 7 July 2025, 12 (7) ENEURO.0154-25.2025; https://doi.org/10.1523/ENEURO.0154-25.2025
Author affiliations:
1 Neurophysiology of Everyday Life Group, Department of Psychology, University of Oldenburg, Oldenburg 26129, Germany (S. Korte, T. Haupt, M. G. Bleichner)
2 Research Center for Neurosensory Science, University of Oldenburg, Oldenburg 26129, Germany (M. G. Bleichner)

Abstract

In everyday life, ambient sounds can disrupt our concentration, interfere with task performance, and contribute to mental fatigue. Even when not actively attended to, salient or changing sounds in the environment can involuntarily divert attention. Understanding how the brain responds to these real-world auditory distractions is essential for evaluating the cognitive consequences of environmental noise. In this study, we recorded electroencephalography while participants performed different tasks during prolonged exposure to a complex urban soundscape. We identified naturally occurring, acoustically salient events and analyzed the corresponding event-related potentials (ERPs). Auditory spectral novelty reliably elicited a P3a response (250–350 ms), reflecting robust attentional capture by novel environmental sounds. In contrast, the reorienting negativity (RON) window (450–600 ms) showed no consistent modulation, possibly due to the continuous and largely behaviorally irrelevant nature of the soundscape. Performance in a behavioral task was briefly disrupted following novel sounds, underscoring the functional impact of attentional capture. Noise sensitivity, measured via the Weinstein Noise Sensitivity Scale (Weinstein, 1978), was not associated with ERP amplitudes. Together, these findings demonstrate that the P3a component provides a stable neural marker of attentional shifts in naturalistic contexts and highlight the utility of spectral novelty detection as a tool for investigating auditory attention outside the laboratory.

  • auditory attention
  • auditory distraction
  • ecological validity
  • EEG
  • event-related potentials
  • neuroergonomics
  • P3a component
  • real-world soundscape
  • spectral novelty

Significance Statement

Everyday environments are filled with unpredictable sounds that can capture our attention and disrupt performance, yet most research on auditory distraction relies on highly controlled stimuli. Our study bridges this gap by identifying electroencephalography (EEG) responses to naturally occurring acoustic changes in a real-world soundscape. Using a spectral novelty algorithm, we show that the P3a component reliably tracks attentional capture in complex auditory scenes—even when the soundscape is behaviorally irrelevant. This approach not only enhances ecological validity but also demonstrates a practical method for studying auditory attention outside the lab. Our findings highlight the potential for using EEG to understand cognitive functioning in real-life environments such as offices, classrooms, or public spaces.

Introduction

In our increasingly noisy environments (Asdrubali, 2014), managing auditory attention is crucial for cognition, well-being, and brain health. Complex acoustic environments require focusing on relevant sounds while suppressing irrelevant ones (Hillyard et al., 1973; Fritz et al., 2005; Shinn-Cunningham and Best, 2008; Ahveninen et al., 2011; Choi et al., 2013; Schwartz and David, 2018). This demanding process can cause fatigue over time (Saremi et al., 2008). Understanding how the brain copes with persistent noise and individual differences in this process (Kjellberg et al., 1996) is key to mitigating cognitive and societal effects.

Studies on auditory distractibility propose a three-phase model of distraction (Escera et al., 2000; Wetzel and Schröger, 2014; Getzmann et al., 2024). In this model, distraction unfolds in three stages: an initial, automatic detection of change in the acoustic environment, an involuntary shift of attention toward the deviant sound, and a final stage where attention is voluntarily reoriented and internal predictions updated. This framework aligns with early theories of the orienting reflex, describing how unexpected sensory events interrupt behavior by signaling a mismatch with internal models (Sokolov, 1963).

Some distracting sounds are automatically filtered out before reaching awareness (Boutros and Belger, 1999), while others demand effortful suppression, draining cognitive resources, and contributing to fatigue and annoyance linked to health issues (Bidet-Caulet et al., 2007; Basner et al., 2014; Schwartz and David, 2018). However, individuals differ in their susceptibility to distraction: some are more easily captured by novel sounds, while others exhibit stronger cognitive control mechanisms (Kjellberg et al., 1996; Shepherd et al., 2016). Investigating how these differences shape neural responses to irrelevant sounds may explain why some listeners are more affected by noise than others. This variability underscores the importance of studying attention in settings that better reflect everyday auditory complexity.

Traditional event-related potential (ERP) paradigms rely on short, repetitive, and highly controlled auditory stimuli. While instrumental in advancing our understanding of auditory processing (Spong et al., 1965; Hillyard et al., 1973), they offer limited insight into how attention operates in complex environments. Real-world listening involves dynamic, overlapping streams that unfold continuously over time, which calls into question the generalizability of laboratory findings to real-world listening.

In response, some studies have adopted more ecologically valid designs by embedding naturalistic sounds, like speech or environmental noise, into continuous streams (Straetmans et al., 2021; Rosenkranz et al., 2023). These approaches preserve some real-world acoustics while maintaining electroencephalography (EEG) interpretability. Previous studies demonstrated that neural responses to repeated naturalistic stimuli are shaped by factors like task complexity and personal relevance (Korte et al., 2025). However, such studies still rely on artificial structuring of the soundscape. Fully natural soundscapes, such as a busy street, present another challenge. They are dense, unpredictable, and lack experimenter-controlled stimuli. Though listeners readily notice salient changes (Hicks and McDermott, 2024), identifying these moments objectively is non-trivial. Real-world acoustic events often overlap or are masked by background noise, much like subtle instruments in polyphonic music can be masked by louder ones (Müller, 2021). Salience depends on spectral and contextual factors, not always reflected in the waveform (Lavie, 2005; Müller, 2021).

To address this, we utilize a spectral novelty detection algorithm (Müller, 2021) to identify perceptually salient auditory events directly from the natural soundscape. It eliminates the need for artificially embedded stimuli, allowing examination of neural responses to naturally occurring, contextual sounds in complex acoustic environments. ERPs serve as a key tool for this investigation, capturing time-locked brain responses to discrete events.

The present study tests whether the three-phase model of auditory distraction (Escera et al., 2000; Wetzel and Schröger, 2014), originally developed under controlled lab conditions, can explain neural responses in complex, real-world environments. The model examines ERP components associated with each stage of distraction: the mismatch negativity (MMN), occurring around 100 ms at central and prefrontal sites, typically indexing pre-attentive deviance detection; the P3a, a frontocentral positivity around 300 ms, reflecting involuntary attention capture; and the reorienting negativity (RON), a frontocentral negativity occurring 400–600 ms after stimulus onset, marking the reallocation of attention. As our paradigm does not include a repetitive standard against which deviance can be established, the MMN component cannot be meaningfully assessed. Thus, we focus our analysis on the P3a and RON, which remain informative in this context.

By examining these later ERP components in response to spontaneous acoustic changes, our study evaluates the translatability of the three-phase model to ecologically valid auditory scenes. Our approach aims to extend attentional control models into complex acoustic environments that shape real-world listening.

Methods

Data and code availability

The stimuli, analysis scripts, and data to reproduce the findings of this paper can be found at: https://zenodo.org/records/15182196.

Participants

The present study builds on the dataset reported in Korte et al. (2025). In total, 30 individuals underwent audiometric screening (pure-tone audiometry). Twenty-three participants (13 female, 10 male) met the eligibility criterion of hearing thresholds of 20 dB hearing level (HL) or better at octave frequencies from 250 Hz to 8 kHz and were included in the final sample. Participants were between 21 and 37 years old (mean: 25.57, SD: 3.48), right-handed, had normal or corrected-to-normal vision, and reported no history of neurological, psychiatric, or psychological conditions. All participants provided written informed consent and received monetary compensation for their participation.

Procedure

Prior to EEG data acquisition, participants completed the Weinstein Noise Sensitivity Scale (WNSS; Weinstein, 1978), a 21-item inventory designed to assess individual differences in noise sensitivity. The questionnaire asks participants to rate their agreement with statements related to noise (e.g., “I wouldn’t mind living on a noisy street if the apartment I had was nice”) on a 6-point Likert scale ranging from “strongly disagree” to “strongly agree.” The total WNSS score reflects a participant’s general sensitivity to noise, with higher scores indicating greater susceptibility to noise-related annoyance and distraction.

Afterwards, participants completed six blocks of EEG recordings, each lasting 15–45 min, totaling approximately 3.5 h of recording data. Participants could take self-determined breaks between blocks. The experimental design alternated between passive listening blocks, where participants were instructed to disregard the soundscape, and active listening conditions, where they responded to specific auditory events.

Paradigm

All parts of the paradigm, apart from the transcription task, were presented using the Psychophysics Toolbox extension (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007, Version: 3) on MATLAB 2021b.

Auditory stimuli

For our analysis in this paper, we only included blocks in which the pre-recorded soundscape of a busy city street (see https://www.youtube.com/watch?v=Le_g4s6KloU, accessed 01.07.22) was played. It consisted of a variety of ambient sounds, typical of an urban area (e.g., streetcars, motorcycles, or incomprehensible speech). The street scenario had a total length of 2 h and 21 min, from which we took four segments of 45 min each. These segments had a short overlap, since the original sound file was not long enough to cover three non-overlapping hours. The sequence of segments was randomized across participants.

The soundscape was presented via two free-field loudspeakers (Sirocco S30, Cambridge Audio) positioned at ear level, at a 45° angle to the left and right, at a distance of approximately 0.5 m from the participant. Playback volume was calibrated prior to the experiment using a sound level meter placed at head position, with the average sound pressure level (SPL) set to 51 dB(A). To characterize the acoustic properties of each sound file, we computed short-term root mean square (RMS) energy using 50 ms windows (with 50% overlap) and converted this to dB SPL relative to the calibrated reference. Two brief signal artifacts were observed in File 1, where sound levels dropped below 30 dB(A) for isolated frames. These values were excluded from the SPL summary statistics to avoid skewing the results. No such artifacts were present in the other files.
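The RMS-to-SPL conversion described above can be sketched as follows. This is a Python illustration rather than the authors' MATLAB code; the 50 ms window, 50% overlap, and 51 dB(A) calibration reference come from the text, while the convention of mapping the file's overall RMS to the calibrated level is our assumption.

```python
import numpy as np

def rms_envelope_db(x, fs, win_s=0.05, overlap=0.5, cal_db=51.0):
    """Short-term RMS energy of signal x (sampled at fs) in dB SPL.

    The calibration maps the file's overall RMS level to cal_db dB(A);
    this mapping convention is an assumption, not stated in the paper.
    """
    win = int(round(win_s * fs))
    hop = int(round(win * (1 - overlap)))
    n_frames = 1 + (len(x) - win) // hop
    # overlapping 50 ms frames via a strided view (no copy)
    frames = np.lib.stride_tricks.sliding_window_view(x, win)[::hop][:n_frames]
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    ref = np.sqrt(np.mean(x ** 2))  # overall RMS corresponds to cal_db
    return cal_db + 20 * np.log10(np.maximum(rms, 1e-12) / ref)
```

Frames below 30 dB(A) would then be excluded as artifacts, e.g., `env[env >= 30.0]`.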

The cleaned analysis revealed highly consistent sound level distributions across the four segments, with mean SPLs around 49.65 dB(A). Minimum values ranged from 30.56 to 32.01 dB(A), and maximum levels from 68.54 to 68.85 dB(A) (Fig. 1 and Table 1). A separate figure illustrates the artifacts in File 1 that were excluded from analysis (Extended Data Fig. 1-1).

Figure 1.

SPL envelope of the urban soundscape (exemplary snippet) used in the experiment. Short-term RMS energy was calculated using 50 ms windows (50% overlap) and converted to dB SPL based on a calibration reference of 51 dB(A). The plot reflects cleaned data, with values below 30 dB(A) excluded to eliminate brief non-acoustic artifacts. For visualization of these excluded artifacts, see Extended Data Figure 1-1.

Figure 1-1

Sound level envelope of File 1 with artifact values included. Two brief dips below 30 dB(A) (highlighted in red) were identified as likely signal artifacts rather than valid acoustic events and were excluded from the cleaned analysis. These artifacts occurred only in File 1 and are shown here for transparency.

Table 1.

Cleaned SPL statistics for the four soundscape segments

In the original study, additional auditory stimuli (church bells) were added to the street soundscape. However, for the analysis presented in this paper, only the urban soundscape without additional auditory cues was considered.

Nonauditory task

Participants engaged in one of two nonauditory tasks, depending on the experimental block. The first was a visual search task using detailed hidden-object pictures (HOPs), similar to the well-known “Where’s Waldo?/Where’s Wally?” game. Participants searched for specific objects within complex illustrated scenes and selected them using the mouse. The number of targets was deliberately set high to ensure continuous engagement throughout the block.

The second task was a transcription task that resembled simple office work. Taken from the citizen science project “World Architecture Unlocked” on “Zooniverse,” this task required participants to transcribe handwritten details from architectural photographs (for further information, see https://www.zooniverse.org/projects/courtaulddigital/world-architecture-unlocked/about/research). Participants categorized information such as city names, architects, or building names. No prior architectural knowledge was necessary, but the task was sufficiently complex to require sustained attention. Participants were encouraged to use search engines and online maps to verify and categorize the transcribed information. The task involved reading, typing, using the mouse, and researching, making it a suitable approximation of realistic office-based cognitive tasks, while providing a higher task complexity than the HOP task. In the original experiment, the street soundscape was played during four separate blocks, each combined with either the visual search task or the transcription task. These blocks were not designed for the current analyses but were selected post hoc because they provided extended, ecologically valid exposure to a complex auditory environment while participants were engaged in nonauditory activities.

Experimental blocks

The experiment consisted of three phases, divided into four consecutive blocks, as illustrated in Figure 2. The passive phase A comprised two blocks, while the active phase and the passive phase B each included one block.

Figure 2.

Overview of experimental blocks. The order of blocks was chosen to ensure naivety concerning target sounds in passive phase A. Gray-shaded blocks are not considered in this work and are displayed only for completeness.

During the experiment, participants were either instructed that the soundscape was irrelevant or that they had to detect a specific target sound (church bell). This manipulation was intended to draw participants’ attention to the overall soundscape. In the passive listening conditions, participants were told that background sounds would be present but were irrelevant to their task and could be ignored. In the active listening condition, participants were required to detect and respond to the target sound by pressing the F4 key. The response key was chosen to avoid interference with the nonauditory task, where the keyboard was also used. All other sounds in the soundscape were not behaviorally relevant and did not require a response.

The block structure and auditory manipulations were part of a broader dataset that has been described in detail in Korte et al. (2025). The present study focuses on a subset of the experimental blocks from that study. The structure for the present study was as follows:

  • Block 1 (passive listening A, passive phase A): A 45-min sequence of the street soundscape was played while participants performed the HOP task, without responding to the sounds.

  • Block 2 (passive listening A, passive phase A): Identical to Block 1, but participants performed the transcription task.

  • Block 3 (active listening, active phase): Identical to Block 2, except participants now responded to the target sounds (church bell) in the street soundscape.

  • Block 4 (passive listening B, passive phase B): Identical to Block 2. However, participants were instructed to ignore the previously relevant church bell chimes again.

The order of blocks was fixed for all participants to ensure that they remained naive to the target sound during the passive listening phase A and to preserve the mental representation of the target sound in the passive listening B phase. Randomizing the order would have disrupted the intended transition between conditions, particularly in the final block. This design allowed for a consistent progression across participants.

Data acquisition

Description of lab setup

Participants were seated in a soundproof recording booth at a desk equipped with a screen (Samsung, SyncMaster P2470). A keyboard and a mouse were placed on the desk for task input and target response. Event markers for auditory stimuli and task events were generated using the Lab Streaming Layer (LSL) library (see https://github.com/labstreaminglayer/liblsl-Matlab, v1.14.0). Keyboard input was logged using LSL-compatible key capture software (see https://github.com/labstreaminglayer/App-Input, v1.15.0). The Lab Recorder software (see https://github.com/labstreaminglayer/App-LabRecorder, v1.14.0) ensured synchronized data recording of the EEG data, the event markers, and the keyboard capture in .xdf format. Files were organized using the Brain Imaging Data Structure format (Gorgolewski et al., 2016) with the EEG data extension (Pernet et al., 2019).

EEG system

EEG data were collected using a 24-channel EEG cap (EasyCap GmbH) with passive Ag/AgCl electrodes (channel positions: Fp1, Fp2, F7, Fz, F8, FC1, FC2, C3, Cz, C4, T7, T8, CP5, CP1, CPz, CP2, CP6, TP9, TP10, P3, Pz, P4, O1, O2). The mobile cap setup, with fewer electrodes than typical lab systems, was chosen for participant comfort during the extended recording sessions and was well tolerated, even during breaks. A mobile amplifier (SMARTING MOBI, mBrainTrain) was attached to the EEG cap, allowing participants a more natural sitting position compared to a wired EEG system. Gyroscope data from the amplifier were recorded to track head movements. Data were transmitted via Bluetooth to a desktop computer using a BlueSoleil dongle. EEG and gyroscope data were streamed to LSL via the SMARTING Streamer software (v3.4.3; mBrainTrain) and recorded at a sampling rate of 250 Hz using Lab Recorder.

Measurement procedure

Before data collection, electrode sites were cleaned with 70% alcohol and abrasive gel (Abralyt HiCl, Easycap GmbH). Electrode gel was applied to maintain impedances below 10 kΩ and impedances were monitored throughout the session. If signal quality dropped, individual electrodes were re-gelled between blocks. Re-gelling was rare, typically affecting one or two electrodes, and no full cap removal was required. Given the experiment’s length, participants were allowed an extended lunch break, scheduled to avoid interference with experimental manipulation (active phase and passive phase B).

Data analysis

All analyses were conducted in MATLAB 2021b using the EEGLAB toolbox (Delorme and Makeig, 2004; version: 2021.1).

Behavioral data

To investigate whether sound events interfered with participants’ typing behavior, we analyzed the time interval between consecutive keystrokes. Specifically, we examined whether inter-keystroke intervals (IKIs) were longer when a sound onset occurred between two keystrokes, compared to intervals without an intervening sound. The IKI was defined as the time between the first and the second keystroke. If the sound was perceived as distracting, the second keystroke was expected to be delayed, resulting in a longer interval. To ensure comparability, control intervals were selected immediately prior to sound events. Only IKIs between 200 and 600 ms were retained to exclude implausibly short latencies and those unlikely to reflect continuous typing. For each participant and condition, mean IKI values were computed separately for sound and no-sound intervals.
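The IKI comparison can be sketched as follows. This is a Python illustration; the 200–600 ms plausibility window and the choice of control intervals immediately preceding each sound event are taken from the text, while the function and variable names are ours.

```python
import numpy as np

def iki_sound_vs_control(key_times, sound_times, lo=0.2, hi=0.6):
    """Mean inter-keystroke interval (IKI) for intervals containing a sound
    onset vs. matched control intervals immediately before each sound.

    Times are in seconds. Only IKIs in [lo, hi] s are retained, per the paper.
    Returns (mean sound IKI, mean control IKI); NaN if no valid intervals.
    """
    key_times = np.sort(np.asarray(key_times, dtype=float))
    ikis = np.diff(key_times)
    sound_iki, control_iki = [], []
    for s in np.asarray(sound_times, dtype=float):
        # index of the inter-keystroke interval that contains the sound onset
        i = np.searchsorted(key_times, s) - 1
        if i < 0 or i >= len(ikis):
            continue
        if lo <= ikis[i] <= hi:
            sound_iki.append(ikis[i])
            # control: the interval immediately preceding the sound event
            if i >= 1 and lo <= ikis[i - 1] <= hi:
                control_iki.append(ikis[i - 1])
    return (np.mean(sound_iki) if sound_iki else np.nan,
            np.mean(control_iki) if control_iki else np.nan)
```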

Audio processing and feature extraction

We were interested in how listeners perceive complex street soundscapes, particularly how auditory events influence attention and neural responses. In an initial test run, two human listeners manually annotated perceptually salient events in the soundscape. Their annotations confirmed that distinct auditory objects were identifiable and corresponded to measurable brain responses when used to time-lock EEG data. However, manual annotation is both time-consuming and highly variable, depending on factors such as headphone use, attentional state, and individual listener differences.

To address this, we applied spectral novelty analysis (Müller, 2021), an algorithmic method for detecting salient sound events based on abrupt changes in spectral content—especially in higher frequencies, where transient sounds are more easily distinguished from background noise. This approach enables reproducible event detection in naturalistic soundscapes without relying on manual annotations or artificial stimuli.

We implemented the method using the open-source MATLAB functions spectral_novelty.m and simp_peak.m (available at github.com/ThorgeHaupt/Audionovelty). Audio recordings of the street scenes (sample rate: 44.1 kHz) were converted to mono by averaging stereo channels. Each signal was transformed into the time-frequency domain using a short-time Fourier transform (Hanning window size: 882 samples, hop size: 441 samples). The resulting magnitude spectrogram was logarithmically compressed (γ = 10) to enhance perceptually relevant spectral variations (Fig. 3, second plot from top, left panel).

Figure 3.

Left: example of spectral novelty decomposition for a 20 s snippet. The raw audio (top plot) is first transformed into the time-frequency domain, from which spectral change is computed (second plot). This spectral change is then normalized and smoothed (third plot). Lastly, spectral peaks are identified based on a fixed threshold, resulting in a binary vector of peaks (bottom plot). Top right: spectral representation of an example novelty event. Bottom right: resulting ERP and topographies for the P3a and RON time windows. Each trace represents one EEG channel.

To highlight spectral changes, a first-order derivative across time frames was computed. Negative values were set to zero, and a local average over a 0.5 s window was subtracted to reduce noise and emphasize meaningful fluctuations (Fig. 3, third plot from top, left panel). The resulting novelty function was normalized and resampled to 100 Hz to match the temporal resolution required for further EEG analysis.

Sound onsets were defined as local maxima in the novelty function that exceeded neighboring values and a fixed threshold of 0.1. This resulted in a binary peak vector marking moments of salient acoustic change (Fig. 3, bottom plot, left panel). These peak markers were used to time-lock EEG analyses to naturally occurring auditory events in the urban soundscape.
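The authors provide MATLAB implementations (spectral_novelty.m, simp_peak.m); the following is a Python sketch of the steps described above, using the stated parameters (Hann window of 882 samples, hop 441, γ = 10, 0.5 s local average, threshold 0.1). Details such as summing the rectified difference across frequency bins follow the general recipe in Müller (2021) and may differ from the authors' exact code.

```python
import numpy as np
from scipy.signal import stft

def spectral_novelty(x, fs=44100, n_win=882, gamma=10.0, avg_s=0.5):
    """Spectral novelty function: STFT (Hann, 882 samples, hop 441),
    logarithmic compression, half-wave-rectified temporal difference,
    local-average subtraction over ~0.5 s, and max-normalization."""
    hop = n_win // 2
    _, _, Z = stft(x, fs=fs, window='hann', nperseg=n_win, noverlap=n_win - hop)
    Y = np.log(1 + gamma * np.abs(Z))   # compress to emphasize weak components
    d = np.diff(Y, axis=1)
    d[d < 0] = 0                        # keep spectral-energy increases only
    nov = d.sum(axis=0)
    frame_rate = fs / hop               # 100 Hz with the default parameters
    L = max(1, int(round(avg_s * frame_rate)))
    local = np.convolve(nov, np.ones(L) / L, mode='same')
    nov = np.maximum(nov - local, 0)    # subtract local average, clip at zero
    return nov / nov.max() if nov.max() > 0 else nov

def pick_peaks(nov, thresh=0.1):
    """Binary vector marking local maxima that exceed both neighbors
    and a fixed threshold (0.1 in the paper)."""
    peaks = np.zeros(len(nov), dtype=bool)
    peaks[1:-1] = ((nov[1:-1] > nov[:-2]) & (nov[1:-1] > nov[2:])
                   & (nov[1:-1] > thresh))
    return peaks
```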

EEG data

The EEG was preprocessed as described in Korte et al. (2025). To ensure clarity, we briefly summarize the processing steps here. The data were first filtered between 1 and 40 Hz (default settings of the pop_eegfiltnew function). Next, bad channels were identified, using the clean_artifacts function of EEGLAB with the default settings for channel_crit_maxbad_time and subsequently stored for later interpolation. The data were then segmented into 1-s windows. Artifact rejection was performed on these windows based on a probability threshold of ±3 SD from the mean, which helped optimize independent component analysis (ICA) training. All data were then combined to compute ICA weights using the runica function in EEGLAB with the extended training mode. The ICA weights were applied to the raw EEG data.
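The ±3 SD window rejection can be illustrated with a simplified stand-in. EEGLAB's joint-probability rejection models amplitude distributions per channel; z-scoring a single per-window statistic, as below, is our simplification of that idea, not the toolbox's algorithm.

```python
import numpy as np

def reject_improbable(windows, z=3.0):
    """Return indices of 1-s windows to keep: those whose variance lies
    within z SDs of the across-window mean.

    windows: (n_windows, n_channels, n_samples) array. This is a
    simplified stand-in for EEGLAB's probability-based rejection; the
    +/-3 SD criterion is taken from the paper.
    """
    stat = windows.var(axis=(1, 2))          # one summary value per window
    zs = (stat - stat.mean()) / stat.std()   # z-score across windows
    return np.flatnonzero(np.abs(zs) <= z)
```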

Artifact rejection was performed using the ICLabel algorithm (Pion-Tonachini et al., 2019), where components classified with ≥80% probability as artifacts (e.g., eye blinks, muscle activity, heartbeats) were removed. Additionally, a manual inspection was conducted to account for possible misclassifications, as the ICLabel algorithm is primarily optimized for stationary datasets with minimal movement, whereas our setup allowed participants a degree of mobility. On average, 8 out of 24 components were removed per participant (min = 4, max = 11).

Following ICA-based artifact removal, the EEG data were further processed by applying a low-pass filter at 20 Hz and a high-pass filter at 0.5 Hz. Any previously identified bad channels were interpolated (mean = 1.4, min = 0, max = 5), and the data were re-referenced to the average of electrodes TP9 and TP10, which never had to be interpolated. EEG data were resampled to 100 Hz to match the temporal resolution of the spectral novelty function.

Events corresponding to the onset of spectral novelty peaks were identified, and their latencies were mapped to the EEG time series. Epochs were extracted from −0.2 to 0.8 s relative to sound onset. If an epoch extended beyond the available data range, zero-padding was applied to maintain uniform epoch length across trials. Baseline correction was performed using the pre-stimulus interval from −0.2 to 0 s, subtracting the mean baseline activity from each epoch.
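The epoching step can be sketched as follows: a Python illustration of cutting epochs around event onsets with zero-padding at the recording edges and per-channel baseline subtraction. The array layout and names are our assumptions.

```python
import numpy as np

def extract_epochs(eeg, fs, onsets, tmin=-0.2, tmax=0.8):
    """Cut epochs (channels x time) around event onsets, zero-padding
    epochs that run past the ends of the recording, then subtract the
    mean of the pre-stimulus baseline (tmin to 0) per channel.

    eeg: (n_channels, n_samples); onsets: event latencies in seconds.
    Returns (n_events, n_channels, n_epoch_samples).
    """
    n_ch, n_samp = eeg.shape
    pre = int(round(-tmin * fs))
    post = int(round(tmax * fs))
    epochs = []
    for t in onsets:
        c = int(round(t * fs))
        ep = np.zeros((n_ch, pre + post))
        lo, hi = max(0, c - pre), min(n_samp, c + post)
        ep[:, lo - (c - pre):hi - (c - pre)] = eeg[:, lo:hi]
        baseline = ep[:, :pre].mean(axis=1, keepdims=True)
        epochs.append(ep - baseline)
    return np.stack(epochs)
```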

To assess differences in neural processing of the soundscapes under different listening conditions, we computed a grand-average ERP and topographic maps for each block. Additionally, we investigated the relationship between neural responses and spectral novelty, testing whether the magnitude of the neural response depends on the degree of novelty of a given sound, consistent with Downar et al. (2002), who found several brain areas sensitive to stimulus novelty. Epochs were sorted according to their novelty score and assigned to 20 equally sized bins in ascending order of spectral novelty. Bins were equalized (where necessary) by randomly excluding excess trials (with a fixed random seed for reproducibility). Artifactual epochs were removed after binning, using a probability criterion of ±3 SD from the mean. A 50% overlap between neighboring bins was applied to ensure smoother transitions between novelty levels.
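The binning scheme might be sketched like this. The bin count, ascending sort, and random equalization with a fixed seed are taken from the text; the exact handling of the 50% overlap is not fully specified, so the sliding-window layout below is our assumption, and the post-binning artifact removal is omitted.

```python
import numpy as np

def novelty_bins(scores, n_bins=20, overlap=0.5, seed=0):
    """Assign epoch indices to n_bins equally sized bins of ascending
    novelty score, with fractional `overlap` between neighboring bins.
    Excess trials are dropped at random with a fixed seed.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(scores)               # indices in ascending novelty
    n = len(order)
    step_frac = 1 - overlap
    width = int(n / (1 + step_frac * (n_bins - 1)))   # trials per bin
    step = int(width * step_frac)                     # bin-start stride
    bins = [order[i * step:i * step + width] for i in range(n_bins)]
    # equalize bin sizes by randomly dropping excess trials
    size = min(len(b) for b in bins)
    return [rng.permutation(b)[:size] for b in bins]
```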

To investigate ERP responses, we focused on four frontocentral electrodes (Fz, FC1, FC2, Cz), selected based on previous literature emphasizing their sensitivity to components within the three-phase model of auditory distraction (Escera et al., 2000; Wetzel and Schröger, 2014; Getzmann et al., 2024). For each participant and condition, ERPs were averaged across these electrodes and across trials. To enable cross-subject comparison, we further computed grand-average ERPs per novelty bin across participants.

The P3a component was analyzed as the mean amplitude in the 250–350 ms time window post-onset, while the RON component was assessed in the 450–600 ms window. These windows were selected based on visual inspection of peak deflections in the grand-average waveforms and align with typical latencies reported in the literature (Escera et al., 2000; Wetzel and Schröger, 2014; Getzmann et al., 2024). We did not include the MMN in our analysis, as its elicitation typically depends on a structured sequence of frequent standard and infrequent deviant stimuli (Näätänen et al., 1993), in which deviants violate an established auditory regularity. Since our continuous, real-world street soundscape is highly dynamic by nature, no such pattern of consistent auditory regularities exists. Thus, our soundscape is not suited to investigate the MMN component.
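Mean-amplitude extraction for these windows reduces to averaging samples within the window. A Python sketch, assuming the epoch start of −0.2 s and the 100 Hz sampling rate from the preprocessing description:

```python
import numpy as np

def mean_amplitude(erp, fs, tmin_epoch=-0.2, win=(0.25, 0.35)):
    """Mean amplitude of an ERP trace in a post-onset window.

    erp: 1-D array (e.g., the average of Fz, FC1, FC2, Cz across trials),
    sampled at fs, starting at tmin_epoch seconds relative to onset.
    Defaults give the P3a window; use win=(0.45, 0.60) for the RON.
    """
    i0 = int(round((win[0] - tmin_epoch) * fs))
    i1 = int(round((win[1] - tmin_epoch) * fs))
    return float(np.mean(erp[i0:i1]))
```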

Statistical analysis

Weinstein noise sensitivity scale

In an exploratory analysis, we examined whether individual differences in noise sensitivity predict neural responses to auditory novelty. We conducted Pearson’s correlation analyses between WNSS scores and individual ERP amplitudes, averaged over all conditions and across the selected frontocentral channels. Specifically, we tested the relationship between WNSS scores and mean ERP amplitudes in two key time windows: the P3a window (250–350 ms) and the RON window (450–600 ms). The normality of the WNSS scores and ERP amplitudes was confirmed using the Shapiro–Wilk test (WNSS: p = 0.451, P3a: p = 0.366, RON: p = 0.160), justifying the use of Pearson’s correlation. Correlations were computed separately for the P3a and RON amplitudes, with significance levels set at p < 0.05.
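In Python, this analysis could look like the following sketch, using SciPy equivalents of the tests named above; the function name and the warning behavior are ours.

```python
import numpy as np
from scipy.stats import pearsonr, shapiro

def wnss_erp_correlation(wnss, amps, alpha=0.05):
    """Pearson correlation between WNSS scores and mean ERP amplitudes,
    preceded by Shapiro-Wilk normality checks on both variables."""
    for name, v in (("WNSS", wnss), ("ERP", amps)):
        _, p_norm = shapiro(v)
        if p_norm < alpha:
            # Pearson's r assumes roughly normal marginals
            print(f"warning: {name} deviates from normality (p={p_norm:.3f})")
    r, p = pearsonr(wnss, amps)
    return r, p
```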

Behavioral data

Statistical analysis was performed using the Wilcoxon signed-rank test for paired samples to compare typing speed between uninterrupted and interrupted typing within each experimental condition. This non-parametric test was chosen due to deviations from normality in the data distribution, as confirmed by the Shapiro–Wilk test. The Wilcoxon signed-rank test was applied separately for each condition to determine whether the presence of auditory interruptions significantly affected typing speed.

To account for multiple comparisons, p-values were adjusted using the Benjamini–Hochberg false discovery rate (FDR) correction. This method controls the expected proportion of false positives while maintaining statistical power.
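A minimal Python sketch of this two-step procedure, with simulated IKIs and a hand-rolled Benjamini–Hochberg adjustment (the paper used MATLAB's signrank.m and mafdr.m; sample size and data are hypothetical, and the three p-values fed to the correction are purely illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical inter-keystroke intervals (ms) for 22 participants in one condition;
# means/SDs loosely follow the descriptives reported in the Results
uninterrupted = rng.normal(327, 18, size=22)
interrupted = uninterrupted + rng.normal(23, 10, size=22)  # typing slows after a sound

# Paired, non-parametric comparison within the condition
w_stat, p_raw = stats.wilcoxon(uninterrupted, interrupted)

def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p-values for a family of tests."""
    p = np.asarray(pvals, dtype=float)
    order = np.argsort(p)
    scaled = p[order] * len(p) / (np.arange(len(p)) + 1)
    # enforce monotonicity from the largest rank downwards
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty_like(scaled)
    adjusted[order] = np.clip(scaled, 0, 1)
    return adjusted

# Three illustrative uncorrected p-values, one per condition
adjusted = fdr_bh([0.0024, 0.0035, 0.0001])
```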

To assess whether the magnitude of behavioral disruption (i.e., the difference in IKIs between interrupted and uninterrupted typing) differed between conditions, we conducted a Friedman test. This non-parametric equivalent of a repeated-measures ANOVA is suitable for comparing more than two related samples when the data may not follow a normal distribution. The Friedman test was applied to per-subject difference scores across all three experimental conditions that contained the transcription task.
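The Friedman test on per-subject difference scores can be sketched as follows (Python rather than MATLAB; the sample size and disruption scores are hypothetical stand-ins):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical per-subject disruption scores (interrupted minus uninterrupted
# IKI, in ms) for the three conditions containing the transcription task
diffs = rng.normal(23, 15, size=(22, 3))  # 22 is a hypothetical sample size

# Friedman test across the three related samples
chi2_stat, p = stats.friedmanchisquare(diffs[:, 0], diffs[:, 1], diffs[:, 2])
```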

Statistical analyses were conducted in MATLAB R2021b using the signrank.m function for Wilcoxon tests and the mafdr.m function for FDR correction.

EEG data

To assess differences in ERP amplitudes across conditions, we conducted Wilcoxon signed-rank tests, a non-parametric paired test, comparing mean ERP amplitudes within the time windows of interest for the P3a and RON.

We performed four pairwise comparisons, motivated by the study’s design:

  • Passive A + HOP versus passive A + transcription to test whether the type of nonauditory task modulates ERP amplitudes under passive listening conditions.

  • Passive A + transcription versus active + transcription to examine whether directing attention to the soundscape in the active condition influences ERP amplitudes.

  • Active + transcription versus passive B + transcription to determine whether ERP amplitudes remain modulated after the active phase or return to passive A levels.

  • Passive A + HOP versus passive B + transcription to evaluate whether ERP amplitudes in the passive B phase differ from those in the passive A phase.

To correct for multiple comparisons, we applied FDR correction using the Benjamini–Hochberg procedure.

To examine the influence of novelty intensity at the single-trial level, we used linear mixed-effects models (LMMs) with novelty score as a continuous predictor. Separate LMMs were fitted for the P3a and RON time windows. The models included fixed effects for novelty score and condition and a random intercept for subject to account for within-subject variability (Extended Data Fig. 7-1). We compared two models per time window using a likelihood ratio test (LRT):

  1. Full model: EEG amplitude ∼ novelty score + condition + (1|subject)

  2. Simpler model: EEG amplitude ∼ novelty score + (1|subject)

Model selection was based on the Akaike information criterion and the LRT to determine whether including condition improved model fit. The models were implemented using the fitlme.m function in MATLAB.
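The model-comparison step reduces to simple arithmetic once the two maximized log-likelihoods are in hand (obtained from fitlme.m in MATLAB, or from a mixed-model fit such as statsmodels' MixedLM in Python). A sketch with hypothetical log-likelihoods and parameter counts, not the study's fitted values:

```python
from scipy import stats

# Hypothetical maximized log-likelihoods of the two mixed models
# (full: novelty + condition + random intercept; simple: novelty + random intercept)
ll_full, ll_simple = -4120.5, -4121.1
k_full, k_simple = 7, 4  # illustrative numbers of estimated parameters

# Likelihood ratio test: condition contributes 3 fixed-effect parameters
lrt = 2 * (ll_full - ll_simple)
p = stats.chi2.sf(lrt, df=k_full - k_simple)

# Akaike information criterion for each model (lower is better)
aic_full = 2 * k_full - 2 * ll_full
aic_simple = 2 * k_simple - 2 * ll_simple
```

In this illustration the small log-likelihood gain does not justify the three extra parameters: the LRT is non-significant and the simpler model has the lower AIC, the same kind of outcome the model comparison in the Results yields.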

Results

Overall ERP responses to spectral novelty peaks

In a first step, we investigated the overall ERPs per condition, time-locked to all identified spectral novelty peaks. Figure 4 displays the time-series from −0.2 to 0.8 s relative to peak onsets, with corresponding topographical representations for two distinct time windows: 0.25 to 0.35 s, corresponding to the P3a, and 0.45 to 0.60 s, corresponding to the RON.

Figure 4.

ERPs time-locked to spectral novelty peaks across all participants and trials, visualized as butterfly plots (i.e., each trace represents one EEG channel). Gray shaded area in the time-series plots represents the time windows for the topographies (0.25–0.35 s and 0.45–0.60 s). N refers to the number of trials included in the average. Top left: passive listening A condition while participants engaged in the HOP task. Top right: passive listening A condition while participants engaged in the transcription task. Bottom left: active listening condition while participants engaged in the transcription task. Bottom right: passive listening B while participants engaged in the transcription task.

A distinct positive deflection at approximately 300 ms post-onset can be observed across all conditions, with the strongest amplitude around frontocentral electrodes. This pattern is consistent with the expected characteristics of the P3a component. The response is most pronounced in the passive listening A condition with the HOP, followed by the passive listening A condition with the transcription task. The amplitude of this peak appears slightly reduced in the active listening condition and lowest in the passive listening B condition.

In contrast, we did not observe a pronounced negative deflection in the expected RON time window (450–600 ms). While there are slight amplitude variations across conditions, the expected negativity is not clearly present. This suggests that the reorienting process might be weaker or less reliably elicited in the given experimental context. The summary statistics for these analyses are presented in Table 2.

Table 2.

Summary statistics for the grand ERPs per condition in time windows of interest and for frontocentral channels (mean of Fz, FC1, FC2, Cz)

Condition effects on ERP amplitudes

P3a time window (250–350 ms)

Wilcoxon signed-rank tests were conducted to compare ERP amplitudes across conditions. None of the pairwise comparisons showed a significant difference in ERP amplitude before or after correction for multiple comparisons (all p-values >0.05). Specifically, comparisons between passive A + HOP and passive A + transcription (W = 165, p = 0.2113, FDR-corrected p = 0.8453), passive A + transcription and active + transcription (W = 127, p = 0.9870), active + transcription and passive B + transcription (W = 126, p = 0.9870), and passive A + HOP and passive B + transcription (W = 133, p = 0.8329, FDR-corrected p = 0.9870) all yielded non-significant results.

These findings indicate that neither task engagement nor listening condition (passive vs active) led to significant changes in the P3a time window.

RON time window (450–600 ms)

To investigate voluntary reorientation processes, we examined ERP amplitudes within the 450–600 ms time window. The Wilcoxon signed-rank tests revealed no significant differences between conditions, even before correction for multiple comparisons (all uncorrected p-values >0.05). Specifically, comparisons between passive A + HOP and passive A + transcription (W = 147, p = 0.5057, FDR-corrected p = 0.6743), passive A + transcription and active + transcription (W = 103, p = 0.4455, FDR-corrected p = 0.6743), active + transcription and passive B + transcription (W = 138, p = 0.7089), and passive A + HOP and passive B + transcription (W = 155, p = 0.3548, FDR-corrected p = 0.6743) all yielded non-significant results.

These findings suggest that no robust effects were observed in the RON time window, regardless of task or listening condition.

EEG responses as a function of spectral novelty

P3a component

We observed a systematic increase in EEG amplitude with higher spectral novelty scores (Fig. 5). This trend is visually apparent in the grand-average ERPs across novelty bins (Fig. 6), where larger P3a amplitudes are observed in bins with higher novelty values, particularly in the passive A + HOP condition (Fig. 7).

Figure 5.

Grand-average ERPs across the 20 novelty bins, averaged over all conditions (data were averaged per condition first, then across conditions).

Figure 6.

Grand-average ERPs across 20 spectral novelty bins for each listening condition. Each subplot represents the ERP response for a specific spectral novelty bin, with Bin 1 corresponding to the lowest novelty values and Bin 20 to the highest. Gray marked windows correspond to the P3a window (early window, 0.25–0.35 s) and the RON window (late window, 0.45–0.60 s). For individual participants’ data averaged over all conditions, see Extended Data Figure 6-1.

Figure 6-1

Individual ERPs for each participant, averaged across all conditions at selected frontocentral electrodes (Fz, FC1, FC2, and Cz). Shaded regions indicate the time windows of interest: P3a (250–350 ms, light gray) and Reorienting Negativity (RON, 450–600 ms, dark gray).

Figure 7.

Development of mean ERP amplitudes across frontocentral electrodes (Fz, FC1, FC2, Cz) for each spectral novelty bin. For individual participants’ data, see Extended Data Figure 7-1.

Figure 7-1

Smoothed mean EEG amplitude in the P3a window across spectral novelty bins and averaged over all conditions, split by median amplitude in the highest novelty bins. Red lines represent participants with amplitudes above the median, blue lines represent participants below the median, and the black line shows the grand average. This figure illustrates the overall trend of increasing EEG amplitude with spectral novelty, with inter-individual variability in the magnitude of this effect.

LMM analysis confirmed a significant effect of novelty score on EEG amplitude (β = 4.53, p < 0.001), supporting the observed trend that higher novelty is associated with increased P3a amplitudes. Including “condition” as a predictor did not improve model fit (χ2(3) = −3.22, p = 1), and no reliable pairwise differences between conditions were observed. While one contrast (passive A + HOP vs active + transcription) showed a borderline p-value (p = 0.049), this result should be interpreted with caution, as it emerged from a model that did not outperform the simpler one; all other contrasts had p > 0.05.

These statistical findings align with the visual representation in Figures 5 and 6, where novelty is the primary factor modulating amplitude, and condition differences are subtle. For an illustration of individual participant trajectories across novelty bins, see Extended Data Figure 7-1, which presents ERP amplitude changes in the P3a time window for each participant, separated by median split. Similarly, the Wilcoxon signed-rank test results confirmed that condition did not significantly influence P3a amplitude.

RON component

In contrast to the P3a window, no strong amplitude changes are evident in the RON time window (Figs. 5, 6). The expected negative deflection for the RON component is not clearly present across conditions. LMM analysis confirmed the absence of an effect of novelty (β = −0.11, p = 0.823) or condition (p > 0.05 for all pairwise comparisons). Additionally, adding condition to the model did not improve fit (χ2(3) = 0.00, p = 1).

These statistical findings are visually supported by Figures 5 and 6, where no pronounced negativity is evident in the 450–600 ms window. Likewise, Wilcoxon signed-rank tests did not reveal significant differences between conditions, further reinforcing that RON amplitudes are not modulated by spectral novelty or condition.

Behavioral data

The analysis of IKIs revealed a consistent increase for interrupted typing compared to uninterrupted typing across all three experimental conditions (Fig. 8). Descriptive statistics showed that in the passive A + transcription condition, the mean IKI was 327.04 ms (SD = 18.17) for uninterrupted typing and increased to 350.03 ms (SD = 24.86) for interrupted typing. In the active + transcription condition, the mean IKI was 327.45 ms (SD = 23.48) for uninterrupted typing and 347.99 ms (SD = 37.77) for interrupted typing. Similarly, in the passive B + transcription condition, the mean IKI increased from 320.88 ms (SD = 14.30) in the uninterrupted state to 357.03 ms (SD = 49.97) in the interrupted state.

Figure 8.

Analysis of inter-keystroke intervals for uninterrupted versus interrupted typing across conditions. Gray lines connect corresponding values for each participant. Asterisks indicate significance levels (*p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001). Friedman’s test revealed no significant difference between conditions (p = 0.958).

Statistical analysis using the Wilcoxon signed-rank test confirmed that IKIs were significantly larger when typing was interrupted in all conditions. In the passive A + transcription condition, the test yielded W = 38, p = 0.0024 (uncorrected), p = 0.0035 (FDR-adjusted). For the active + transcription condition, the results were W = 42, p = 0.0035 (uncorrected), p = 0.0035 (FDR-adjusted). The effect was most pronounced in the passive B + transcription condition with W = 0, p < 0.0001 (uncorrected), p = 0.0001 (FDR-adjusted). Furthermore, the Friedman test revealed no significant differences in the magnitude of behavioral disruption across conditions, χ2(2) = 0.09, p = 0.958, indicating that the prolonged IKIs following sound events were comparable across all conditions. These findings indicate that the sound events significantly increased IKIs, suggesting a robust disruptive effect on typing, irrespective of experimental condition.

Weinstein noise sensitivity scale

Participants’ noise sensitivity was assessed using the WNSS (Weinstein, 1978). The WNSS scores were normally distributed (Shapiro–Wilk test: p = 0.451), with a mean score of 3.36 (SD = 0.54, range = 2.43–4.76). These values were previously reported in Korte et al. (2025), which examined the same sample in a different analytical context.

To provide context for these scores, we compared them to the normative value of 3.04 (SD = 0.57), indicating that our sample exhibited slightly higher noise sensitivity on average. However, the observed mean falls within one standard deviation of the norm, suggesting that the noise sensitivity distribution in our sample is broadly comparable to the general population.

To examine whether noise sensitivity was associated with neural responses to auditory novelty, we conducted Pearson’s correlation analyses between WNSS scores and individual ERP amplitudes averaged over all conditions in two key time windows: the P3a window (250–350 ms) and the RON window (450–600 ms).

The correlation analysis revealed no significant relationship between WNSS scores and P3a amplitudes (r = −0.006, p = 0.980), indicating that noise sensitivity did not predict the degree of attentional capture by novel sounds. A weak negative trend was observed between WNSS scores and RON amplitudes (r = −0.192, p = 0.393), which would suggest that individuals with higher noise sensitivity may have a less pronounced reorienting response following distraction. However, this trend did not reach statistical significance (Fig. 9). For a visualization of individual ERP waveforms across all participants, see Extended Data Figure 6-1, which depicts ERPs averaged across all conditions for the selected frontocentral channels.

Figure 9.

Scatter plots depicting Pearson’s correlation between WNSS scores and mean ERP amplitudes for the P3a (left) and RON (right) components. Each dot represents an individual participant, and red lines indicate the best-fitting linear regression.

Discussion

This study investigated how the brain processes auditory novelty in a real-world soundscape while participants engaged in different tasks and listening conditions. We found that the intensity of spectral novelty significantly modulated EEG responses, particularly in the P3a time window, indicating that stronger acoustic changes in the environment evoke more pronounced neural responses. Based on previous research, we expected these neural responses to vary with both novelty intensity and listening engagement; specifically, we anticipated that active listening would enhance ERP amplitudes compared to passive listening. However, the contrast between active and passive listening revealed only subtle differences in ERP amplitudes: whether participants were passively listening or actively attending to the soundscape made no significant difference, and EEG responses to sound events remained robust across all conditions. Below, we discuss the implications of these findings and potential explanations for the observed patterns.

Effect of spectral novelty on EEG responses

The results demonstrated that spectral novelty robustly influenced EEG amplitudes, particularly in the P3a time window. This finding aligns with previous research showing that novel auditory events elicit enhanced neural responses, reflecting attentional capture (Escera et al., 2000; Wetzel and Schröger, 2014; Getzmann et al., 2024).

The increased P3a amplitude we observed with higher novelty values supports the idea that spectral novelty acts as a key trigger for involuntary attentional shifts, consistent with the second phase of the distraction model. Interestingly, we found no strong modulation in the RON time window, suggesting that participants either did not consistently reorient attention away from novel sounds or that this process was less pronounced in a naturalistic setting. One possible explanation is that the ongoing soundscape did not provide discrete auditory events that necessitated reorienting, unlike traditional lab paradigms with isolated deviant stimuli. In our setting, novel sounds may not have required context updating or behavioral adaptation, making it unlikely that a reorienting response would be observed. Additionally, since the background audio was mostly behaviorally irrelevant (even though the active listening condition encouraged attention to the soundscape, where a reorienting response might have been expected), participants may not have engaged in an active reorienting process. This highlights the need for future research to explore whether reorientation mechanisms are suppressed when auditory distractions occur within continuous, real-world soundscapes rather than discrete, laboratory-controlled paradigms.

Our approach of using spectral novelty provides new insights into how attention dynamically fluctuates in response to environmental sound changes. The observed increase in P3a amplitude with higher novelty (Figs. 5, 7) suggests that the brain remains highly sensitive to salient changes in the acoustic environment, even when attention is directed elsewhere. This responsiveness likely reflects bottom-up attentional capture by acoustically novel events, consistent with the notion that attention can be involuntarily drawn to unexpected changes in the environment. Furthermore, our findings highlight the robustness of the P3a component, which showed similar amplitude and morphology (i.e., shape, latency) across all conditions. This is in line with previous research (Fallgatter et al., 2000; Korte et al., 2025) and renders the P3a particularly useful for real-world EEG applications, where data might be noisier, and less robust components could be overshadowed by noise. Extended Data Figure 6-1 shows the consistency of this pattern across individual participants. The stability of the P3a across different listening conditions suggests that it may serve as a reliable neural marker for attentional capture in complex auditory environments such as workplaces (Wascher et al., 2023), classrooms (Janssen et al., 2021), or public spaces (Gramann, 2024).

Notably, the use of “P3a” versus “Novelty P3” remains a topic of debate in the literature. Some studies use the terms interchangeably, and factor-analytic work by Simons et al. (2001) suggests that they reflect the same neural process. Others, however, argue for a clearer distinction: for instance, Barry et al. (2016) propose that the Novelty P3 is a temporally and functionally distinct component, occurring after the P3a and P3b, and specifically associated with orienting to novel stimuli. Our data—showing a temporally stable, frontocentral response that scales with novelty but does not clearly differentiate into subcomponents—appear more consistent with a unified P3a/Novelty P3 interpretation. Nonetheless, we acknowledge that future studies with higher temporal resolution and precise source localization may help clarify whether these components are indeed separable, particularly in naturalistic contexts.

Limited condition effects and potential explanations

Contrary to our initial hypothesis, listening mode did not significantly alter ERP responses. While the active listening condition was expected to enhance auditory processing compared to passive listening, we found only subtle differences between conditions (Fig. 6 for a visual comparison across bins and conditions). One possible explanation is that attentional resource allocation varied dynamically across tasks, but did not create large enough differences to be reflected in ERPs. The HOP task, being cognitively less demanding than transcription, may have allowed participants to allocate more resources to background sound processing (Sörqvist and Rönnberg, 2016). This could have facilitated stronger neural responses to auditory novelty, but may not have yielded sufficiently large neural differences to reach statistical significance.

Additionally, habituation effects may have contributed to the condition pattern, particularly since the passive A + HOP condition was always presented first. This initial exposure to the soundscape may have triggered stronger neural responses, reflecting heightened sensitivity to novel auditory input. While one might expect habituation to continue progressively across all blocks, it is also possible that the largest adjustment occurred early on, during the first encounter with the soundscape. In this view, the strongest novelty-related responses would be limited to the first block, with a relatively stable lower responsiveness in subsequent blocks. Such an early adjustment would be consistent with the deviance detection and attentional shift stages of the three-phase distraction model, which are known to diminish once novelty is no longer perceived as salient.

Another factor to consider is the role of self-generated sounds in the transcription task. Participants’ typing may have introduced competing auditory input that either acoustically masked or perceptually deprioritized the street soundscape. Prior research suggests that self-generated sounds are processed differently from externally generated ones (Martikainen, 2004; Bäß et al., 2008; Saupe et al., 2013) and often involve predictive mechanisms that suppress their neural representation. As a result, the soundscape may have been pushed into the perceptual background, both because it was masked by keystrokes and because attentional resources were directed toward the motor task and its auditory consequences. This may have reduced the salience of the background noise, thereby contributing to the weaker ERP responses in conditions involving transcription. Future studies could explore whether minimizing self-generated auditory input alters neural responses to environmental sounds.

Behavioral impact of novel sounds on task performance

Beyond the neural effects of spectral novelty, our results also revealed a significant behavioral impact. We observed that IKIs increased in response to novel sounds irrespective of experimental conditions, indicating that distraction effects extend beyond electrophysiological responses (Fig. 8). This result aligns with prior research showing that auditory events can momentarily disrupt ongoing cognitive-motor tasks (Conrad et al., 2012). The slowing of typing speed suggests that involuntary attentional capture, as reflected in the P3a component, translated into measurable performance decrements.

Notably, the behavioral disruption was present across conditions, further supporting the robustness of spectral novelty in capturing attention regardless of task engagement. This aligns with findings from workplace distraction studies, where unpredictable background sounds, such as sudden conversations or environmental noises, reduce productivity in cognitively demanding tasks (Kjellberg et al., 1996; Sexton and Helmreich, 2000; Conrad et al., 2012; Sonnleitner et al., 2014). The observed slowing of typing speed indicates that novel sounds impact task performance in the moments directly following the sound event. Whether this effect extends beyond the immediate keystroke remains to be investigated. Future research should explore whether the distraction persists over time or diminishes with continued exposure. Additionally, while our findings point to a general effect of novelty on behavior, it remains unclear whether varying levels of novelty intensity produce graded effects on task performance. Although this analysis was not feasible in the current dataset due to the limited number of keypress instances per participant, it represents a promising avenue for future investigation.

Individual differences in noise sensitivity and attentional modulation

While previous research has highlighted variability in how individuals respond to auditory distraction (Kjellberg et al., 1996; Shepherd et al., 2016), our analysis found no significant correlation between noise sensitivity (WNSS scores) and ERP amplitudes. This suggests that while noise sensitivity may influence subjective experiences of distraction, it does not necessarily translate to differences in early neural responses to background sounds. However, this does not preclude noise sensitivity from affecting later cognitive or behavioral stages of distraction processing.

One possibility is that noise sensitivity exerts its influence beyond early attentional capture, modulating higher-order cognitive and emotional responses to auditory distractions rather than automatic neural responses measured by ERPs. For instance, individuals with higher noise sensitivity may not show stronger P3a responses but may still perceive background noise as more disruptive, leading to greater cognitive fatigue, annoyance, or task disengagement over time.

It is also worth noting that our sample exhibited relatively average noise sensitivity scores, with no extreme outliers. This restricted variability may have limited the ability to detect significant correlations with ERP amplitudes. Future studies should aim to include individuals across a broader range of sensitivity levels to better assess whether more noise-sensitive individuals show distinct neural or behavioral response patterns.

Given this, future research should also explore whether noise sensitivity influences behavioral performance, subjective distraction ratings, or physiological measures (e.g., autonomic responses such as heart rate variability or skin conductance), which may better reflect individual differences in real-world auditory distraction. Additionally, non-linear effects should be considered, as highly noise-sensitive individuals may show disproportionate responses compared to those with lower sensitivity (Kliuchko et al., 2016).

Implications for real-world auditory attention research

Our study highlights the utility of a spectral novelty detection approach for identifying salient auditory events within a continuous, real-world soundscape. Unlike traditional paradigms that rely on pre-defined, isolated stimuli, spectral novelty is computed in relation to the surrounding acoustic context. This means that the algorithm dynamically evaluates whether a sound deviates from the local sound environment. A sound may be classified as novel in one context but not in another, depending on its spectral contrast with the preceding acoustic input. This context sensitivity allows for a more ecologically valid identification of attention-capturing events, as it mirrors the perceptual mechanisms by which human listeners extract meaningful signals from background noise.
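One common operationalization of such context-relative novelty is half-wave-rectified spectral flux: each frame's magnitude spectrum is compared with the immediately preceding acoustic context, so the same sound can score as novel in one context but not in another. The sketch below is a generic NumPy implementation of this idea, not the authors' exact algorithm, and the frame parameters are arbitrary:

```python
import numpy as np

def spectral_novelty(x, n_fft=1024, hop=512):
    """Half-wave-rectified spectral flux: positive change of each frame's
    magnitude spectrum relative to the immediately preceding frame."""
    window = np.hanning(n_fft)
    frames = np.array([x[i:i + n_fft] * window
                       for i in range(0, len(x) - n_fft, hop)])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    flux = np.maximum(mags[1:] - mags[:-1], 0).sum(axis=1)
    return flux / (flux.max() + 1e-12)  # normalized novelty curve

# A tone that abruptly changes frequency should produce a novelty peak
# at the transition, but score low in the steady-state regions:
sr = 8000
t = np.arange(sr) / sr
x = np.where(t < 0.5, np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 1200 * t))
nov = spectral_novelty(x)
```

Peaks in such a novelty curve are candidate attention-capturing events; because the reference is always the local spectral history, the detection is inherently context-sensitive in the way described above.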

By leveraging this context-aware detection of acoustic change, we move beyond discrete stimulus presentations toward a more naturalistic framework for studying auditory attention. The three-phase model of distraction (Escera et al., 2000; Wetzel and Schröger, 2014; Getzmann et al., 2024), typically investigated in tightly controlled laboratory settings, can thus be extended to complex real-world environments. Here, attentional shifts and reorienting responses may be shaped by factors such as habituation, cognitive load, and environmental complexity (Woods and Elmasian, 1986; Lavie, 2005; Gygi and Shafiro, 2011; Brockhoff et al., 2023). The successful application of spectral novelty in ERP research not only enhances ecological validity but also offers a promising tool for investigating dynamic attention in everyday auditory scenes.

Limitations and future directions

While our study provides valuable insights into auditory attention in real-world settings, certain methodological aspects warrant consideration. First, although spectral novelty served as a robust marker of auditory salience, it does not capture other factors such as semantic relevance or emotional valence, which can also strongly influence attention allocation (Kjellberg et al., 1996; Lavie, 2005; Asutay and Västfjäll, 2012; Roye et al., 2013; Holtze et al., 2021; Debnath and Wetzel, 2022). Second, the fixed order of conditions may have introduced habituation effects, as participants were always exposed to the same sequence of listening modes. Counterbalancing condition order in future studies would help disentangle potential order effects from true condition-related differences.

Furthermore, while we observed clear P3a responses to acoustic novelty, the absence of strong condition effects suggests that our task manipulations may not have been sufficiently distinct to drive measurable differences in ERP amplitude. Refining the contrast between passive and active listening may help clarify how task demands shape auditory distraction. Finally, future studies should consider investigating individual variability in noise sensitivity, as subtle differences in attentional engagement may be masked in group-level analyses of P3a and RON components.

Conclusion

In conclusion, our study provides compelling evidence that spectral novelty serves as a reliable and ecologically valid trigger of attentional processing in naturalistic soundscapes. Across a large dataset and diverse listening contexts, we found that higher novelty consistently elicited strong P3a responses, demonstrating robust neural signatures of attentional capture even when participants were engaged in unrelated tasks. Importantly, this neural response was accompanied by measurable behavioral slowing, confirming that these sound events were not only registered by the brain but also disrupted ongoing performance.

These findings highlight the value of spectral novelty detection as a powerful tool for identifying cognitively relevant sound events in real-world environments, moving beyond traditional stimulus designs. The P3a emerged as a particularly stable marker, showing consistent morphology and amplitude across conditions, positioning it as a key component for studying auditory distraction outside the lab.

While listening mode did not strongly influence ERP amplitudes, this likely reflects the adaptive nature of auditory attention rather than a lack of engagement. Similarly, the absence of a correlation with noise sensitivity underscores the idea that neural responses to distraction are more influenced by moment-to-moment context than by trait-level sensitivity.

Altogether, our results underscore that the brain remains highly responsive to acoustic novelty in real-world settings, both neurally and behaviorally, and establish spectral novelty detection as a promising approach for future research on attention, cognition, and distraction in everyday life.

Footnotes

  • The authors declare no competing financial interests.

  • We thank Daniel Küppers for his kind help with the typing speed analysis. Furthermore, we thank all members of the Neurophysiology of Everyday Life group for their support and guidance. We would also like to thank Negar Dadkhah and Amrah Gasimli for their annotation of the soundscape and the Friedrich Ebert Foundation, which always provides helpful support to S.K. This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under the Emmy-Noether Program—BL 1591/1-1—Project ID 411333557.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.

References

  1. Ahveninen J, Hämäläinen M, Jääskeläinen IP, Ahlfors SP, Huang S, Lin F-H, Raij T, Sams M, Vasios CE, Belliveau JW (2011) Attention-driven auditory cortex short-term plasticity helps segregate relevant sounds from noise. Proc Natl Acad Sci U S A 108:4182–4187. https://doi.org/10.1073/pnas.1016134108
  2. Asdrubali F (2014) New frontiers in environmental noise research. Noise Mapp 1:000010247820140001. https://doi.org/10.2478/noise-2014-0001
  3. Asutay E, Västfjäll D (2012) Perception of loudness is influenced by emotion. PLoS One 7:e38660. https://doi.org/10.1371/journal.pone.0038660
  4. Barry RJ, Steiner GZ, De Blasio FM (2016) Reinstating the novelty P3. Sci Rep 6:31200. https://doi.org/10.1038/srep31200
  5. Bäß P, Jacobsen T, Schröger E (2008) Suppression of the auditory N1 event-related potential component with unpredictable self-initiated tones: evidence for internal forward models with dynamic stimulation. Int J Psychophysiol 70:137–143. https://doi.org/10.1016/j.ijpsycho.2008.06.005
  6. Basner M, Babisch W, Davis A, Brink M, Clark C, Janssen S, Stansfeld S (2014) Auditory and non-auditory effects of noise on health. Lancet 383:1325–1332. https://doi.org/10.1016/S0140-6736(13)61613-X
  7. Bidet-Caulet A, Fischer C, Besle J, Aguera P-E, Giard M-H, Bertrand O (2007) Effects of selective attention on the electrophysiological representation of concurrent sounds in the human auditory cortex. J Neurosci 27:9252–9261. https://doi.org/10.1523/JNEUROSCI.1402-07.2007
  8. Boutros NN, Belger A (1999) Midlatency evoked potentials attenuation and augmentation reflect different aspects of sensory gating. Biol Psychiatry 45:917–922. https://doi.org/10.1016/S0006-3223(98)00253-4
  9. Brainard DH (1997) The psychophysics toolbox. Spat Vis 10:433–436. https://doi.org/10.1163/156856897X00357
  10. Brockhoff L, Vetter L, Bruchmann M, Schindler S, Moeck R, Straube T (2023) The effects of visual working memory load on detection and neural processing of task-unrelated auditory stimuli. Sci Rep 13:4342. https://doi.org/10.1038/s41598-023-31132-7
  11. Choi I, Rajaram S, Varghese LA, Shinn-Cunningham BG (2013) Quantifying attentional modulation of auditory-evoked cortical responses from single-trial electroencephalography. Front Hum Neurosci 7:115. https://doi.org/10.3389/fnhum.2013.00115
  12. Conrad C, et al. (2012) A quality improvement study on avoidable stressors and countermeasures affecting surgical motor performance and learning. Ann Surg 255:1190–1194. https://doi.org/10.1097/SLA.0b013e318250b332
  13. Debnath R, Wetzel N (2022) Processing of task-irrelevant sounds during typical everyday activities in children. Dev Psychobiol 64:e22331. https://doi.org/10.1002/dev.22331
  14. Delorme A, Makeig S (2004) EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 134:9–21. https://doi.org/10.1016/j.jneumeth.2003.10.009
  15. Downar J, Crawley AP, Mikulis DJ, Davis KD (2002) A cortical network sensitive to stimulus salience in a neutral behavioral context across multiple sensory modalities. J Neurophysiol 87:615–620. https://doi.org/10.1152/jn.00636.2001
  16. Escera C, Alho K, Schröger E, Winkler IW (2000) Involuntary attention and distractibility as evaluated with event-related brain potentials. Audiol Neurootol 5:151–166. https://doi.org/10.1159/000013877
  17. Fallgatter AJ, Eisenack SS, Neuhauser B, Aranda D, Scheuerpflug P, Herrmann MJ (2000) Stability of late event-related potentials: topographical descriptors of motor control compared with the P300 amplitude. Brain Topogr 12:255–261. https://doi.org/10.1023/A:1023403420864
  18. Fritz JB, Elhilali M, Shamma SA (2005) Differential dynamic plasticity of A1 receptive fields during multiple spectral tasks. J Neurosci 25:7623–7635. https://doi.org/10.1523/JNEUROSCI.1318-05.2005
  19. Getzmann S, Arnau S, Gajewski PD, Wascher E (2024) Auditory distraction, time perception, and the role of age: ERP evidence from a large cohort study. Neurobiol Aging 144:114–126. https://doi.org/10.1016/j.neurobiolaging.2024.09.012
  20. Gorgolewski KJ, et al. (2016) The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data 3:160044. https://doi.org/10.1038/sdata.2016.44
  21. Gramann K (2024) Mobile EEG for neurourbanism research - what could possibly go wrong? A critical review with guidelines. J Environ Psychol 96:102308. https://doi.org/10.1016/j.jenvp.2024.102308
  22. Gygi B, Shafiro V (2011) The incongruency advantage for environmental sounds presented in natural auditory scenes. J Exp Psychol Hum Percept Perform 37:551–565. https://doi.org/10.1037/a0020671
  23. Hicks JM, McDermott JH (2024) Noise schemas aid hearing in noise. Proc Natl Acad Sci U S A 121:e2408995121. https://doi.org/10.1073/pnas.2408995121
  24. Hillyard SA, Hink RF, Schwent VL, Picton TW (1973) Electrical signs of selective attention in the human brain. Science 182:177–180. https://doi.org/10.1126/science.182.4108.177
  25. Holtze B, Jaeger M, Debener S, Adiloğlu K, Mirkovic B (2021) Are they calling my name? Attention capture is reflected in the neural tracking of attended and ignored speech. Front Neurosci 15:643705. https://doi.org/10.3389/fnins.2021.643705
  26. Janssen TW, et al. (2021) Opportunities and limitations of mobile neuroimaging technologies in educational neuroscience. Mind Brain Educ 15:354–370. https://doi.org/10.1111/mbe.12302
  27. Kjellberg A, Landström U, Tesarz M, Söderberg L, Akerlund E (1996) The effects of nonphysical noise characteristics, ongoing task and noise sensitivity on annoyance and distraction due to noise at work. J Environ Psychol 16:123–136. https://doi.org/10.1006/jevp.1996.0010
  28. Kleiner M, Brainard DH, Pelli DG (2007) What's new in Psychtoolbox-3? Perception 36 ECVP abstract supplement. Available at: http://psychtoolbox.org/credits.
  29. Kliuchko M, Heinonen-Guzejev M, Vuust P, Tervaniemi M, Brattico E (2016) A window into the brain mechanisms associated with noise sensitivity. Sci Rep 6:39236. https://doi.org/10.1038/srep39236
  30. Korte S, Jaeger M, Rosenkranz M, Bleichner MG (2025) From beeps to streets: unveiling sensory input and relevance across auditory contexts. Front Neuroergon 6:1571356. https://doi.org/10.3389/fnrgo.2025.1571356
  31. Lavie N (2005) Distracted and confused? Selective attention under load. Trends Cogn Sci 9:75–82. https://doi.org/10.1016/j.tics.2004.12.004
  32. Martikainen MH (2004) Suppressed responses to self-triggered sounds in the human auditory cortex. Cereb Cortex 15:299–302. https://doi.org/10.1093/cercor/bhh131
  33. Müller M (2021) Fundamentals of music processing: using Python and Jupyter notebooks, Ed 2. Cham: Springer.
  34. Näätänen R, Paavilainen P, Tiitinen H, Jiang D, Alho K (1993) Attention and mismatch negativity. Psychophysiology 30:436–450. https://doi.org/10.1111/j.1469-8986.1993.tb02067.x
  35. Pelli DG (1997) The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat Vis 10:437–442. https://doi.org/10.1163/156856897X00366
  36. Pernet CR, Appelhoff S, Gorgolewski KJ, Flandin G, Phillips C, Delorme A, Oostenveld R (2019) EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Sci Data 6:103. https://doi.org/10.1038/s41597-019-0104-8
  37. Pion-Tonachini L, Kreutz-Delgado K, Makeig S (2019) ICLabel: an automated electroencephalographic independent component classifier, dataset, and website. Neuroimage 198:181–197. https://doi.org/10.1016/j.neuroimage.2019.05.026
  38. Rosenkranz M, Cetin T, Uslar VN, Bleichner MG (2023) Investigating the attentional focus to workplace-related soundscapes in a complex audio-visual-motor task using EEG. Front Neuroergon 3:1062227. https://doi.org/10.3389/fnrgo.2022.1062227
  39. Roye A, Jacobsen T, Schröger E (2013) Discrimination of personally significant from nonsignificant sounds: a training study. Cogn Affect Behav Neurosci 13:930–943. https://doi.org/10.3758/s13415-013-0173-7
  40. Saremi M, Rohmer O, Burgmeier A, Bonnefond A, Muzet A, Tassi P (2008) Combined effects of noise and shift work on fatigue as a function of age. Int J Occup Saf Ergon 14:387–394. https://doi.org/10.1080/10803548.2008.11076779
  41. Saupe K, Widmann A, Trujillo-Barreto NJ, Schröger E (2013) Sensorial suppression of self-generated sounds and its dependence on attention. Int J Psychophysiol 90:300–310. https://doi.org/10.1016/j.ijpsycho.2013.09.006
  42. Schwartz ZP, David SV (2018) Focal suppression of distractor sounds by selective attention in auditory cortex. Cereb Cortex 28:323–339. https://doi.org/10.1093/cercor/bhx288
  43. Sexton JB, Helmreich RL (2000) Analyzing cockpit communications: the links between language, performance, error, and workload. J Hum Perform Extr Environ 5:6. https://doi.org/10.7771/2327-2937.1007
  44. Shepherd D, Hautus MJ, Lee SY, Mulgrew J (2016) Electrophysiological approaches to noise sensitivity. J Clin Exp Neuropsychol 38:900–912. https://doi.org/10.1080/13803395.2016.1176995
  45. Shinn-Cunningham BG, Best V (2008) Selective attention in normal and impaired hearing. Trends Amplif 12:283–299. https://doi.org/10.1177/1084713808325306
  46. Simons RF, Graham FK, Miles MA, Chen X (2001) On the relationship of P3a and the Novelty-P3. Biol Psychol 56:207–218. https://doi.org/10.1016/S0301-0511(01)00078-3
  47. Sokolov EN (1963) Perception and the conditioned reflex. Oxford: Pergamon Press.
  48. Sonnleitner A, Treder MS, Simon M, Willmann S, Ewald A, Buchner A, Schrauf M (2014) EEG alpha spindles and prolonged brake reaction times during auditory distraction in an on-road driving study. Accid Anal Prev 62:110–118. https://doi.org/10.1016/j.aap.2013.08.026
  49. Sörqvist P, Dahlström Ö, Karlsson T, Rönnberg J (2016) Concentration: the neural underpinnings of how cognitive load shields against distraction. Front Hum Neurosci 10:221. https://doi.org/10.3389/fnhum.2016.00221
  50. Spong P, Haider M, Lindsley DB (1965) Selective attentiveness and cortical evoked responses to visual and auditory stimuli. Science 148:395–397. https://doi.org/10.1126/science.148.3668.395
  51. Straetmans L, Holtze B, Debener S, Jaeger M, Mirkovic B (2021) Neural tracking to go: auditory attention decoding and saliency detection with mobile EEG. J Neural Eng 18:066054. https://doi.org/10.1088/1741-2552/ac42b5
  52. Wascher E, Reiser J, Rinkenauer G, Larrá M, Dreger FA, Schneider D, Karthaus M, Getzmann S, Gutberlet M, Arnau S (2023) Neuroergonomics on the go: an evaluation of the potential of mobile EEG for workplace assessment and design. Hum Factors 65:86–106. https://doi.org/10.1177/00187208211007707
  53. Weinstein ND (1978) Individual differences in reactions to noise: a longitudinal study in a college dormitory. J Appl Psychol 63:458–466. https://doi.org/10.1037/0021-9010.63.4.458
  54. Wetzel N, Schröger E (2014) On the development of auditory distraction: a review. Psych J 3:72–91. https://doi.org/10.1002/pchj.49
  55. Woods DL, Elmasian R (1986) The habituation of event-related potentials to speech sounds and tones. Electroencephalogr Clin Neurophysiol 65:447–459. https://doi.org/10.1016/0168-5597(86)90024-9

Synthesis

Reviewing Editor: Ifat Levy, Yale School of Medicine

Decisions are customarily a result of the Reviewing Editor and the peer reviewers coming together and discussing their recommendations until a consensus is reached. When revisions are invited, a fact-based synthesis statement explaining their decision and outlining what is needed to prepare a revision will be listed below. The following reviewer(s) agreed to reveal their identity: Robert Barry.

Reviewers appreciated the paper - it is interesting and significant, the approach is sound, and the results are clearly presented. Below are the reviewers' comments.

Reviewer 1

I'd like to congratulate the authors on conducting an interesting study, for using the appropriate methods and analyses, and for presenting it in a clear and concise manner. I do not have any queries about the conduct of the research.

Reviewer 2

This is an interesting paper seeking evidence for auditory distraction markers in real-world soundscapes. The move from the lab to real world is great, and the approach here is commendable.

The description of their model of distraction processing on P.2 ("First, ...") is strongly reminiscent of Sokolov's Orienting Reflex, and that link should be used here to provide the basis for the current exposition.

There are major issues with their introduction and use of ERPs.

1. The initial detection of deviance triggering the MMN (P. 3) is not appropriate in this generalised context - it's only evoked when there's a change from some regularity - frequency, timing, etc. This is acknowledged on P.11 lines 288-292. All reference to MMN should be removed from the paper.

2. The main ERP component assessed is described throughout as "P3a". Most ERP researchers would identify this as a Novelty P3 (nP3) rather than P3a, which perpetuates an earlier discredited misidentification. There is no justification for this P3a label here, and a better non-controversial label throughout would be "P300" as used in Author et al. (2024).

Some other issues.

1. The soundscape average volume is specified, but some indication of max/min and duration/timing is required.

2. Both here (line 185) and in Author et al. (2024), the electrode list includes a repeat of CP5.

3. Substantial overlap of elements from these two papers should be reduced.

Author Response

Rebuttal Letter to the Reviewers

We would like to sincerely thank both reviewers for their thoughtful and constructive feedback on our manuscript "EEG Signatures of Auditory Distraction: Neural Responses to Spectral Novelty in Real-World Soundscapes." We appreciate the time and care taken to engage with our work and are grateful for the opportunity to revise and clarify our manuscript. In response to the comments, we have carefully revised the text, and we believe these changes have improved the clarity and overall quality of the manuscript. All modifications are clearly marked in the revised version. Below we provide a detailed, point-by-point response to each of the comments.

Reviewer 1 "I'd like to congratulate the authors on conducting an interesting study, for using the appropriate methods and analyses, and for presenting it in a clear and concise manner. I do not have any queries about the conduct of the research." Response:

We sincerely thank Reviewer 1 for their positive and encouraging comments. We are pleased that the study's methodology and presentation were well received, and we highly appreciate the acknowledgement of its relevance and clarity.

Reviewer 2 "This is an interesting paper seeking evidence for auditory distraction markers in real-world soundscapes. The move from the lab to real world is great, and the approach here is commendable." Response:

We thank Reviewer 2 for their kind words regarding the study's motivation and design, and we appreciate the recognition of the ecological approach. We also appreciate that the reviewer took the time to read our publication Author et al. (2024) to give a well-informed evaluation of our manuscript. We welcome the opportunity to respond to the detailed and thoughtful comments that follow.

1. Link to Sokolov's Orienting Reflex "The description of their model of distraction processing on P.2 [...] is strongly reminiscent of Sokolov's Orienting Reflex, and that link should be used here to provide the basis for the current exposition." Response:

We thank the reviewer for this insightful observation. We have now incorporated a reference to Sokolov's Orienting Reflex model in the introduction (p. 2, lines 44-46) and clarified how it conceptually aligns with the initial stages of distraction processing as described in the three-phase model by Escera et al. (2000). This addition reinforces the theoretical continuity between classic and contemporary frameworks of involuntary attention.

2. Use of MMN "The initial detection of deviance triggering the MMN [...] is not appropriate in this generalized context [...] All reference to MMN should be removed from the paper." Response:

We appreciate the reviewer's concern and agree that our naturalistic soundscape does not provide the structured regularity that typically gives rise to a canonical MMN. However, we respectfully retain mention of the MMN in the introduction because it is part of the theoretical framework we apply. The three-phase model of auditory distraction proposed by Escera et al. (2000) conceptualizes MMN, P3a, and RON as sequential stages of processing. Our intention is not to suggest that the MMN is present in our data, but to provide a complete and accurate account of the theoretical framework that informs our approach.

To make this distinction clearer, we have revised the introduction (pp. 3-4, lines 86-99) to explicitly state that while we refer to the MMN as part of the model, it is not expected in our paradigm and is not included in our analyses. We hope this clarifies the rationale behind its inclusion and addresses the reviewer's concern.

3. ERP Labeling "Most ERP researchers would identify this as a Novelty P3 (nP3) rather than P3a [...] A better label would be P300." Response:

We are grateful to the reviewer for raising this important point regarding ERP terminology, and we fully acknowledge that the distinction between "P3a" and "Novelty P3" has been a subject of ongoing debate in the literature. Some researchers, such as Barry et al. (2016), have argued for a temporally and functionally distinct Novelty P3 component, whereas others, including Simons et al. (2001), have found no empirical evidence supporting a clear separation and suggest that these components may reflect the same underlying neural process.

In our manuscript, we chose to use the label "P3a" primarily for two reasons. First, our work builds on the three-phase distraction model (Escera et al., 2000), which consistently refers to the early fronto-central positivity as "P3a." To remain consistent with this theoretical framework and to support comparability with previous studies applying the same model (e.g., Getzmann et al., 2024; Wetzel & Schröger, 2014), we adopted this label throughout. Second, the ERP component observed in our data exhibited characteristics that closely align with the classic definition of the P3a: a fronto-central topography, timing in the 250-300 ms window, and elicitation by behaviorally irrelevant but acoustically salient stimuli, consistent with involuntary attentional capture (Friedman et al., 2001; Polich, 2007).

While we understand the appeal of the more general label "P300," we believe that it would introduce conceptual ambiguity in this case, as it encompasses a broader family of components, including the later, parietally distributed P3b. Since our study does not involve task-relevant targets or decision-related processing, the P3a label provides greater functional specificity and helps clarify the underlying attentional mechanisms.

In response to the reviewer's helpful suggestion, we have revised the manuscript to include an explicit paragraph in the Discussion section (pp. 22-23, lines 528-537) acknowledging the ongoing debate and outlining both perspectives. We also clarify our rationale for using the P3a label in the context of our study design and theoretical grounding.

The new paragraph now reads: "Notably, the use of "P3a" versus "Novelty P3" remains a topic of debate in the literature. Some studies use the terms interchangeably, and factor-analytic work by Simons et al. (2001) suggests that they reflect the same neural process. Others, however, argue for a clearer distinction: for instance, Barry et al. (2016) propose that the Novelty P3 is a temporally and functionally distinct component, occurring after the P3a and P3b, and specifically associated with orienting to novel stimuli. Our data - showing a temporally stable, fronto-central response that scales with novelty but does not clearly differentiate into subcomponents - appear more consistent with a unified P3a/Novelty P3 interpretation. Nonetheless, we acknowledge that future studies with higher temporal resolution and precise source localization may help clarify whether these components are indeed separable, particularly in naturalistic contexts." We hope this additional clarification addresses the reviewer's concern and demonstrates that we have approached this issue thoughtfully and transparently.

4. Missing Details on soundscape characteristics "The soundscape average volume is specified, but some indication of max/min and duration/timing is required." Response:

We thank the reviewer for this helpful suggestion and fully agree that a clearer and more complete description of the soundscape improves the manuscript. In response, we analyzed the sound level envelope of each of the four sound files used in the study and calculated minimum, maximum, and average sound pressure levels based on a playback calibration of 51 dB(A). This analysis revealed highly consistent levels across files, with brief low-level artifacts in File 1. These artifacts were excluded from the summary statistics but are shown in a supplementary figure for transparency.

To improve clarity, we also consolidated all relevant information about the soundscape, including its structure, timing, playback calibration, and acoustic properties under the subsection "Auditory Stimuli". Previously, this information had been distributed across sections. Based on the reviewer's helpful feedback, we felt it was clearer to present it in one place. The revised description can now be found on pp. 5-6, lines 140-152 of the manuscript. The full SPL summary is reported in Table 1, and the corresponding visualizations are shown in Figure 1 and Extended Data Figure 1-1.
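As an illustrative sketch only (this is not the authors' analysis code; the function name `short_term_spl` and the calibration interface are assumed for the example), the windowed RMS-to-dB conversion described here, using 50 ms windows with 50 % overlap referenced to the 51 dB(A) playback calibration, could look like:

```python
import numpy as np

def short_term_spl(x, fs, ref_rms, ref_db=51.0, win_ms=50.0, overlap=0.5):
    """Short-term RMS level in dB relative to a calibrated reference.

    x       : mono audio signal (1-D array)
    fs      : sampling rate in Hz
    ref_rms : RMS value of the signal that corresponds to ref_db at playback
    ref_db  : calibrated average playback level (51 dB(A) in the study)
    """
    win = int(round(fs * win_ms / 1000))      # 50 ms analysis window
    hop = int(round(win * (1.0 - overlap)))   # 50 % overlap -> hop of half a window
    x = np.asarray(x, dtype=float)
    # One row per analysis window, stepped by the hop size
    frames = np.lib.stride_tricks.sliding_window_view(x, win)[::hop]
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Convert each window's RMS to dB relative to the calibrated reference
    return ref_db + 20.0 * np.log10(np.maximum(rms, 1e-12) / ref_rms)
```

For a steady signal whose per-window RMS equals the calibration reference, every frame evaluates to the calibrated 51 dB level; minimum, maximum, and mean SPL per file then follow directly from the returned per-window values.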

The integrated section now reads: "The street scenario had a total length of 2 hours and 21 minutes, from which we took four segments of 45 minutes each. These segments had a short overlap, since the original sound file was not long enough to cover three non-overlapping hours. The sequence of segments was randomized across participants.

The soundscape was presented via two free-field loudspeakers (Sirocco S30, Cambridge Audio, London, United Kingdom) positioned at ear level, at a 45-degree angle to the left and right with a distance of approximately 0.5 m from the participant. Playback volume was calibrated prior to the experiment using a sound level meter placed at head position, with the average sound pressure level set to 51 dB(A). To characterize the acoustic properties of each sound file, we computed short-term root mean square (RMS) energy using 50 ms windows (with 50 % overlap) and converted this to dB SPL relative to the calibrated reference. Two brief signal artifacts were observed in File 1, where sound levels dropped below 30 dB(A) for isolated frames. These values were excluded from the SPL summary statistics to avoid skewing the results. No such artifacts were present in the other files.

The cleaned analysis revealed highly consistent sound level distributions across the four segments, with mean SPLs around 49.65 dB(A). Minimum values ranged from 30.56 to 32.01 dB(A), and maximum levels from 68.54 to 68.85 dB(A) (see Figure 1 and Table 1). A separate figure illustrates the artifacts in File 1 that were excluded from analysis (Extended Data Figure 1-1)."

5. Duplicate Mention of CP5 "The electrode list includes a repeat of CP5." Response:

We thank the reviewer for catching this oversight. We have corrected the electrode list in the manuscript and removed the duplicate mention of CP5 (p. 8 line 206).

6. Overlap with Author et al. (2024) "Substantial overlap of elements from these two papers should be reduced." Response:

We thank the reviewer for pointing out the substantial overlap with our previous publication (Author et al., 2024), and we appreciate that they took the time to engage with our earlier work in such detail. We acknowledge that certain methodological descriptions are repeated between the two papers. This is primarily due to the reuse of the same dataset, which originated from a rich and multifaceted experimental paradigm. However, the two manuscripts address distinct research questions and explore different analytical approaches. For this reason, we believe it is important that each manuscript can be understood as a stand-alone contribution, without requiring the reader to refer to the earlier publication.

While we are aware that some articles refer back to an original publication when reusing data, we felt that this approach would not serve readers well in this case, given the complexity of the paradigm and the fact that the current manuscript focuses on a subset of the original design. We chose to provide a complete and transparent description of the relevant methods and context so that readers can fully understand the scope and rationale of the present analyses.

We hope the reviewer agrees that this approach is consistent with good scientific practice and ensures accessibility and transparency for all readers.

Once again, we thank both reviewers for their constructive feedback. We hope that the revisions and clarifications we have made address the raised concerns and improve the clarity and rigor of the manuscript. We remain grateful for the opportunity to contribute to the discussion on auditory attention in real-world settings.

Sincerely, The Authors

Keywords

  • auditory attention
  • auditory distraction
  • ecological validity
  • EEG
  • event-related potentials
  • neuroergonomics
  • P3a component
  • real-world soundscape
  • spectral novelty

Copyright © 2025 by the Society for Neuroscience.
eNeuro eISSN: 2373-2822
