Figure 1. Stimulus material and experimental methodology. Acoustic and visual features were extracted from the audiovisual speech material and used to quantify their cerebral tracking during audio-only and visual-only presentation. A, The stimulus material consisted of 180 audiovisual recordings of a trained actor speaking individual English sentences. For visualization, only the mouth region is shown here, but participants were presented with the entire face. From the video recordings, we extracted three features describing the dynamics of the lip aperture: the area of the lip opening (lip area), its slope (lip slope), and the width of the lip opening (lip width), collectively termed LipFeat. From the audio waveform, we extracted three acoustic features: the broadband envelope (aud env), its slope (aud slope), and a measure of dominant pitch (aud pitch), collectively termed AudFeat. B, Trial-averaged percentage of correctly reported target words (PC) in the auditory-only (A-only) and visual-only (V-only) conditions; dots represent individual participants. C, Logarithmic power spectra of the individual stimulus features. For reference, a 1/f spectrum is shown as a dashed gray line. D, Coherence between pairs of features, averaged within two predefined frequency bands (0.5–1 Hz, left; 1–3 Hz, right; for details, see Materials and Methods).
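As a point of reference for panel D, the following is a minimal sketch of how coherence between two stimulus feature time courses could be computed and averaged within the two frequency bands. It is not the authors' exact pipeline (see Materials and Methods); the sampling rate, window length, and the toy signals standing in for aud env and lip area are assumptions made purely for illustration.

```python
import numpy as np
from scipy.signal import coherence

fs = 50.0  # assumed sampling rate of the feature time series (Hz); illustrative only

# toy stand-ins for two stimulus features (e.g., aud env and lip area)
rng = np.random.default_rng(0)
aud_env = rng.standard_normal(int(60 * fs))
lip_area = aud_env + rng.standard_normal(aud_env.size)  # correlated toy signal

# magnitude-squared coherence between the two feature time courses
f, cxy = coherence(aud_env, lip_area, fs=fs, nperseg=int(8 * fs))

# average coherence within the two predefined bands shown in panel D
bands = {"0.5-1 Hz": (0.5, 1.0), "1-3 Hz": (1.0, 3.0)}
band_coherence = {name: cxy[(f >= lo) & (f <= hi)].mean()
                  for name, (lo, hi) in bands.items()}
print(band_coherence)
```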