Abstract
Natural environments contain behaviorally relevant information along many stimulus dimensions, each of which sensory systems must encode in order to guide behaviors. For example, the mammalian visual cortex encodes features of visual scenes such as spatial information related to object identity and temporal information about the motion of those objects in space. In order to reliably encode these behaviorally relevant visual features, neural representations should be robust to changes in environmental conditions. Further, information about changes in environmental conditions, such as the luminance changes that occur over the course of a day, is also important for guiding behaviors. In this study, we asked whether mouse primary visual cortex (V1) jointly represents the spatial properties of visual stimuli along with changes in the mean luminance of the visual scene. We find that while V1 neurons, in mice of either sex, encode spatial aspects of visual information in an invariant manner across luminance conditions, the V1 population response also contains a robust representation of luminance. Importantly, V1 populations encode changes in stimulus orientation and mean luminance along orthogonal axes in the neural response space, such that a change in one stimulus variable is encoded independently from the other.
Significance Statement
We recorded from neural populations in mouse V1 with two-photon imaging to examine how sensory information along multiple feature axes is distributed across the responses of diversely tuned neurons. We find that the V1 population response contains a representation of mean luminance in addition to maintaining a luminance-invariant spatial representation. These independent representations are possible because stimulus information is distributed randomly across the V1 population, such that changes in each stimulus variable are encoded along orthogonal axes in the neural response space. This study offers an example of how multidimensional sensory representations emerge from the diverse response properties of neocortical neurons.
Introduction
Sensory systems must simultaneously encode diverse features of environments in order to guide behaviors. To identify objects in a visual scene, the visual system initially transduces the spatiotemporal properties of visual inputs. But those signals vary both as a function of the objects present in a scene and with the coincident luminance conditions. Identifying an object therefore requires disentangling environmental luminance signals from object properties. Yet luminance conditions also provide behaviorally relevant information about time of day, weather conditions, and season. Environmental luminance can vary independently from the spatial properties of objects, and this independence is essential for guiding visual behaviors across environmental states. For example, the visual system should be able to recognize the same object whether illuminated by midday sun or moonlight, although the appropriate behavioral response may depend on the present environmental conditions. This suggests that an important function of the visual system is to encode spatial features and environmental luminance independently; how the visual system jointly encodes these two aspects of visual scenes has remained unclear.
The peripheral visual system adapts to changes in mean luminance in order to retain sensitivity and avoid saturation as light levels vary over 10 orders of magnitude between midnight and midday. Changes in pupil diameter, shifts in the sensitivity to photon absorption within a photoreceptor, and the transition between rod and cone phototransduction each help to maintain visual sensitivity across the range of environmental luminance intensities (Shapley and Enroth-Cugell, 1984; Nikonov et al., 1998; Field and Chichilnisky, 2007). The net effect of these complementary mechanisms is a shift in the operating point of the retinal ganglion cells (RGCs), which pool inputs from photoreceptors and project to central visual regions of the brain. In this framework, it is not apparent how RGC activity could encode mean luminance level, given that an aim of these adaptation mechanisms is to maintain invariant visual response properties within a narrow range of co-occurring luminance levels rather than encoding the larger changes in mean luminance that occur slowly over the course of a day.
There are several mechanisms by which central visual regions could encode the luminance state of the environment. One possibility is that V1 responses carry luminance information implicitly through tuning shifts that accompany luminance adaptation. In species with a foveated retina, tuning to high spatial frequencies in the center of the visual field is lost at scotopic luminance levels when inputs from densely packed cones are absent (Wikler et al., 1990). In addition, the spectral information present in the retinal code, which gives rise to color-opponent encoding under photopic conditions, is absent in the visual signal relayed by monochromatic rods under scotopic conditions (Baylor et al., 1987). Prior work has revealed shifts in chromatic contrast tuning in V1 neurons as a function of mean luminance (Rhim et al., 2017, 2021; Rhim and Nauhaus, 2023). Since these shifts in tuning are tied to mean luminance, these changes in V1 response properties are examples of implicit encoding of luminance adaptation state.
Visual areas may also have access to an explicit representation of mean luminance. Rodents and humans can discriminate the brightness of visual inputs in the absence of changes in rod and cone phototransduction (Brown et al., 2012). This work suggests that mammals have perceptual access to inputs from intrinsically photosensitive RGCs (ipRGCs), which encode the absolute luminance of visual inputs in a temporally low-pass manner (Do and Yau, 2010).
Prior work has shown that visual responses in mouse V1 to monochromatic stimuli are invariant to scotopic versus photopic luminance, despite the functional shifts in the RGC population response across adaptation states (O’Shea et al., 2025). That study found no systematic shifts in the spatiotemporal tuning or correlation structure of the V1 population response between luminance levels. It controlled for the shifts in chromatic tuning that could implicitly encode environmental luminance by restricting stimuli to the overlapping chromatic sensitivity of mouse rods and M-cones (Rhim et al., 2021).
In the present study, we asked whether, in addition to the invariant spatiotemporal representation of visual features, the mean luminance level was encoded by the V1 population. We find that the V1 population response to spatiotemporally identical stimuli reliably encodes scotopic versus photopic luminance levels. Information about changes in visual orientation and luminance is distributed randomly across the V1 population. An important feature of this encoding scheme is that changes in orientation and luminance are encoded along orthogonal axes in the neural response space. We show that the independent encoding of orientation and luminance differs from the joint encoding of other stimulus features, such as spatial and temporal frequency (TF). These results show how multiple stimulus variables can be simultaneously represented by neural populations in the sensory cortex, in a manner that preserves the independent representations of these variables.
Materials and Methods
Animal subjects
All animal procedures were approved by The University of Texas at Austin Institutional Animal Care and Use Committee.
Imaging experiments were conducted using adult Ai94(TITL-GCaMP6s)-D;CaMK2a-tTA (The Jackson Laboratory #024115) mice of both sexes (n = 15), which express the fluorescent calcium indicator GCaMP6s in forebrain excitatory neurons. All mice were 6 weeks of age or older.
Surgery
For all surgical procedures, mice were anesthetized with isoflurane (2.5% induction, 1–1.5% surgery) and given two preoperative subcutaneous injections of analgesia (5 mg/kg carprofen, 3.5 mg/kg Ethiqa) and an anti-inflammatory agent (dexamethasone, 1%). Mice were kept warm with a water-regulated pump pad. Each mouse underwent two surgical procedures. First, we placed a metal frame over the visual cortex using dental acrylic to be used for head fixation of the mouse during subsequent experiments. Second, we drilled a 1−4 mm craniotomy over V1 in the right hemisphere and sealed it with a glass window implant. Surgical procedures were always performed on a separate day from experiments. Following head frame implantation, we mapped the V1 retinotopy with intrinsic signal imaging as described previously (Juavinett et al., 2017; Rhim et al., 2017). In all experiments, we targeted the lower visual field of V1 based on the retinotopic map acquired with intrinsic signal imaging, in order to record from neurons which strongly respond to 525 nm stimuli at scotopic and photopic luminance levels (Rhim et al., 2017).
Imaging
Two-photon calcium imaging was performed with a Neurolabware microscope and ScanBox acquisition software. The scan rate varied between 10 and 15 frames/s, scaling with the number of rows in the field of view. A Chameleon Ultra laser was set to 920 nm to excite GCaMP6s. A Nikon 16× (0.8 NA 3 mm WD) or Olympus 10× (0.6 NA 3 mm WD) objective lens was used for all imaging sessions, with 900- and 1,400-µm-wide field of view, respectively. All cells were imaged between 150 and 350 µm depth, which corresponds to layer 2/3 in the mouse. The average power of the exposed laser beam while imaging was approximately 60 and 30 mW for the 10× and 16× objectives, respectively.
Putative cells were extracted from our imaging data using the suite2p pipeline in Python (Pachitariu et al., 2017). We performed additional processing using custom MATLAB code to retain only cells meeting certain criteria. Cell fluorescent traces had to have skewness greater than 2. Cells also had to show significant visually evoked responses above baseline in their trial-averaged time courses as quantified by a one-sided, two-sample t test with p < 0.001. Finally, only cells that were identified and retained for both scotopic and photopic conditions were analyzed. Cells were matched across conditions by finding ROIs matched in both location and morphology, using custom MATLAB code.
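The two inclusion criteria can be sketched in Python (the original pipeline used custom MATLAB code; the array names, shapes, and helper function below are illustrative assumptions, not the authors' implementation):

```python
import numpy as np
from scipy import stats

def select_cells(traces, evoked, baseline, alpha=1e-3, min_skew=2.0):
    """Apply the two ROI inclusion criteria described in the text.

    traces   : (n_cells, n_frames) raw fluorescence traces
    evoked   : (n_cells, n_trials) responses during the stimulus window
    baseline : (n_cells, n_trials) responses during the pre-stimulus window
    Returns a boolean mask of cells passing both criteria.
    """
    # Criterion 1: fluorescence trace skewness must exceed 2
    skew_ok = stats.skew(traces, axis=1) > min_skew
    # Criterion 2: visually evoked responses significantly above baseline
    # (one-sided, two-sample t test, p < 0.001)
    _, p = stats.ttest_ind(evoked, baseline, axis=1, alternative='greater')
    evoked_ok = p < alpha
    return skew_ok & evoked_ok
```

Only cells passing both tests, and matched across scotopic and photopic sessions, would then enter the analysis.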
Stimulus presentation
Setup
A monochrome LED projector by Texas Instruments (Keynote Photonics) with a spectral peak at 525 nm was used to generate stimuli with a 60 Hz refresh rate onto a Teflon screen which provides a near-Lambertian surface (Rhim et al., 2017). The screen was 12.5 cm high × 32 cm wide, equating to approximately 64° × 116° of visual angle. Stimuli were coded using the Psychophysics Toolbox extension in MATLAB (Rhim et al., 2021). Mice were positioned such that the perpendicular distance from the mouse's eye to the screen was 10 cm, with the screen angled at 30° from the mouse's midline. The upper edge of the screen was placed near the vertical midline of the mouse's field of view to target the lower visual field for stimulation.
All visual stimuli were presented in the lower visual field (anterior V1). The anterior portion of V1 receives retinal input from M-opsin expressing cones under photopic conditions which have a similar chromatic response profile to rods in the mouse retina (Wang et al., 2011; Chang et al., 2013; Rhim et al., 2017, 2021). The similarity in chromatic response profiles allows us to use the same chromatic stimulus (525 nm) for both scotopic and photopic light levels, varying only the luminance of the stimulus.
Natural movies
We presented two 10-s-long natural movies, with each repeat followed by a 6-s-long blank screen at mean luminance. Movie 1 shows honeybees flying in a garden (by courtesy of Ian Nauhaus, The University of Texas at Austin), and Movie 2 shows monkeys playing in snow (by courtesy of David Leopold, NIMH).
Gratings
In experiments comparing V1 responses under scotopic and photopic luminance conditions, static sinewave gratings were presented at 0.05 cycles/° for 500 ms at six orientations (0, 10, 45, 90, 100, 135°) and two contrasts (30, 100%), each followed by a 1 s blank screen at mean luminance. We presented gratings differing by only 10° in order to probe how small changes in orientation are encoded by the V1 population across luminance levels. Specifically, we tested whether the V1 population encoded this small orientation difference with equal fidelity across scotopic and photopic luminance levels. The results of this analysis have been presented previously, showing that the d′ for Δ10° has equal magnitude between scotopic and photopic luminance (O’Shea et al., 2025). For the analysis presented in Figures 2 and 7, in which we probed the V1 representation of mean luminance in response to otherwise identical stimuli, we compared the V1 population responses to gratings at each of these six orientations and two contrasts, differing only in mean luminance. For the analyses in Figures 4 and 5, in which we ask how the V1 population encodes changes in orientation along with changes in luminance, and in Figure 6A, in which we ask how it encodes changes in orientation along with changes in contrast, only responses to pairs of gratings separated by 45° were considered.
In the experiments presented in Figure 6B, drifting sinewave gratings were presented for 2,000 ms at eight orientations (0, 45, 90, 135, 180, 225, 270, 315°) and five spatial frequencies (0.02, 0.04, 0.08, 0.16, 0.32 cycles/°), each followed by a 1 s blank screen at mean luminance.
In the experiments presented in Figure 6C, drifting sinewave gratings were presented for 2,000 ms at one orientation (0°), five spatial frequencies (0.02, 0.04, 0.08, 0.16, 0.32 cycles/°), and four temporal frequencies (1, 2, 4, 8 Hz), each followed by a 1 s blank screen at mean luminance.
Light adaptation
At least 10 min prior to each experiment, the pupil was fully dilated with 1% atropine. Rod isomerization rates for each projector configuration were computed as previously described, using spectral radiance measurements taken with a PR-655 spectroradiometer (Rhim et al., 2021). In all experiments, stimuli were presented first under scotopic conditions and then under photopic conditions.
Scotopic adaptation
Scotopic luminance was achieved by lowering projector power and adding two 1% neutral density filters (Thorlabs) to the light path. Care was taken to black out any other sources of visible light during the experiment, and the experiment was supervised remotely from an adjacent room separated by a closed door and floor-to-ceiling blackout curtains. Based on our spectral radiance measurements, this configuration generated 25 Rh* rod−1 s−1 [0.016 candelas (cd)/m2]. Previous work has established that this scotopic luminance level corresponds to the nighttime environmental luminance of a rural setting and drives solely rod-mediated responses in the retina (Spitschan et al., 2016; Rhim et al., 2021). Mice viewed a blank screen at mean luminance for 30 min prior to any stimuli being played to achieve scotopic adaptation.
Photopic adaptation
We then raised the projector power and removed the neutral density filters to generate photopic luminance at 1.13 × 105 Rh* rod−1 s−1 (97 cd/m2). The photopic luminance level corresponds to midday sun and fully saturates rod-mediated responses in the mouse retina, with pupil dilated (Spitschan et al., 2016; Rhim et al., 2021; Franke et al., 2022). Mice viewed a blank screen at mean luminance for 10 min prior to any stimuli being played to achieve photopic adaptation.
Quantification and statistical analysis
Imaging
All cellular fluorescence traces were baseline normalized on a single-trial basis with the mean fluorescence value (F₀) in the 200 ms preceding stimulus onset. All visual responses were analyzed in units of percent change in fluorescence above baseline: ΔF/F = 100 × (F − F₀)/F₀.
For the luminance discrimination analysis, we compute each cell's response on a single-trial basis as the mean ΔF/F from 450 to 1,300 ms following stimulus onset, which captures the peak of the evoked fluorescence response to the 500 ms static gratings for cells expressing GCaMP6s.
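These two steps can be sketched as follows (in Python rather than the original MATLAB; the function names and the rounding of window edges to frame indices are our own assumptions):

```python
import numpy as np

def dff(trace, frame_rate, stim_onset_idx, baseline_ms=200):
    """Baseline-normalize a single-trial fluorescence trace.

    trace : 1-D fluorescence time series for one trial.
    Returns percent change in fluorescence above baseline (dF/F),
    with F0 the mean fluorescence in the 200 ms before stimulus onset.
    """
    n_base = max(1, int(round(baseline_ms / 1000 * frame_rate)))
    f0 = trace[stim_onset_idx - n_base:stim_onset_idx].mean()
    return 100.0 * (trace - f0) / f0

def trial_response(dff_trace, frame_rate, stim_onset_idx, win_ms=(450, 1300)):
    """Single-trial response: mean dF/F in a post-onset window that captures
    the peak of the GCaMP6s response to the 500 ms static gratings."""
    i0 = stim_onset_idx + int(round(win_ms[0] / 1000 * frame_rate))
    i1 = stim_onset_idx + int(round(win_ms[1] / 1000 * frame_rate))
    return dff_trace[i0:i1].mean()
```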
To explore what sensory information is present in the high-dimensional V1 population code, we project the V1 population response across stimulus conditions into a two-dimensional space for computing the discriminability index (d′). We employ the “targeted dimensionality reduction” method for projecting neural population responses into an orthonormal basis which preserves the principal axis along which the population response shifts for a change in stimulus (dU) and the axis of the largest component of trial-to-trial covariance in the population response (n1; Heller and David, 2022).
For each orientation of the grating stimulus, we create two matrices of size Ncells × Ntrials, containing the single-trial responses for all simultaneously recorded cells at scotopic and photopic luminance. The difference between the trial-averaged means of these matrices gives the first axis in the decoding space—dU.
Next, we subtract the trial-averaged response for each stimulus condition from the response matrices and concatenate this zero-centered data into a “noise matrix” of size Ncells × 2 × Ntrials. We then compute the first principal component of this noise matrix (e1), which represents the primary axis of trial-to-trial covariance in the V1 population response across stimulus conditions.
The axes dU and e1 are not necessarily orthogonal to one another. To form an orthonormal basis for dimensionality reduction, we compute the “noise axis” (n1) as the component of e1 which is orthogonal to dU. Neural data are then projected into the two-dimensional decoding space defined by the orthogonal axes dU and n1. The neural responses for each condition are now represented by a matrix of size 2 × Ntrials. In this two-dimensional space, we compute the difference in mean response to the two luminance conditions (ΔU) and the covariance across trials (Σ). The discriminability index (d′) is computed as d′ = √(ΔUᵀ Σ⁻¹ ΔU).
To validate the generalizability of this decoding subspace for capturing the structure of the V1 population response, we hold out a random subset of response trials when fitting the space and compute d′ on the projection of this held-out data into the decoding subspace. Each grating stimulus was presented for 50 trials. To fit the decoding subspace, we select neural responses from a random set of 40 trials for each stimulus and use the remaining 10 trials to compute d′. We repeat this procedure for 50 iterations and compute the average d′ across all iterations, which we report as the d′ value in our results.
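The dDR fitting and cross-validated d′ procedure can be sketched as below. This is a minimal Python version, not the authors' MATLAB implementation; the pooled-covariance form of d′, d′ = √(ΔUᵀΣ⁻¹ΔU), and the function names are assumptions on our part.

```python
import numpy as np

def fit_ddr_basis(resp_a, resp_b):
    """Fit the 2-D dDR basis (dU, n1) from two (n_cells, n_trials) matrices."""
    # dU: principal axis along which the population response shifts
    du = resp_a.mean(axis=1) - resp_b.mean(axis=1)
    du /= np.linalg.norm(du)
    # Noise matrix: zero-center each condition, concatenate (n_cells x 2*n_trials)
    noise = np.hstack([resp_a - resp_a.mean(axis=1, keepdims=True),
                       resp_b - resp_b.mean(axis=1, keepdims=True)])
    # e1: first principal component of trial-to-trial covariance
    e1 = np.linalg.svd(noise, full_matrices=False)[0][:, 0]
    # n1: component of e1 orthogonal to dU, so (dU, n1) is orthonormal
    n1 = e1 - (e1 @ du) * du
    n1 /= np.linalg.norm(n1)
    return np.vstack([du, n1])  # (2, n_cells)

def dprime(proj_a, proj_b):
    """d' between two (2, n_trials) projected response sets, pooling covariance."""
    d_mu = proj_a.mean(axis=1) - proj_b.mean(axis=1)
    sigma = 0.5 * (np.cov(proj_a) + np.cov(proj_b))
    return float(np.sqrt(d_mu @ np.linalg.solve(sigma, d_mu)))

def cv_dprime(resp_a, resp_b, n_fit=40, n_iter=50, seed=0):
    """Cross-validated d': fit the dDR basis on n_fit random trials per
    condition, compute d' on held-out trials, average over n_iter iterations."""
    rng = np.random.default_rng(seed)
    n_trials = resp_a.shape[1]
    vals = []
    for _ in range(n_iter):
        idx = rng.permutation(n_trials)
        fit, test = idx[:n_fit], idx[n_fit:]
        basis = fit_ddr_basis(resp_a[:, fit], resp_b[:, fit])
        vals.append(dprime(basis @ resp_a[:, test], basis @ resp_b[:, test]))
    return float(np.mean(vals))
```

With 50 trials per grating, `n_fit=40` leaves 10 held-out trials per iteration, matching the procedure in the text.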
An analogous procedure was used to decode luminance from V1 population responses to natural scenes. We compute each cell’s response on a single-trial basis as the mean ΔF/F in 166 ms bins. Given the slow dynamics of the calcium indicator, this allows us to compare responses to every five frames of the natural movies. With these single-trial responses across luminance levels, we perform the same luminance decoding procedure as above.
Modeling
We simulate noisy V1 responses to repeated presentations of gratings at scotopic and photopic luminance via random draws from a multivariate normal distribution using the “mvnrnd” function in MATLAB. For the example simulations in Figure 5, the transition from scotopic to photopic visual stimulation is modeled by a doubling of the mean and variance of the multivariate normal distribution. This modeled V1 population response is then projected into the dDR space to compute d′ for luminance discrimination as described for the neural data in Figure 2.
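A minimal sketch of this simulation, using numpy's multivariate_normal in place of MATLAB's mvnrnd (the doubling of mean and variance follows the text; the population parameters passed in are illustrative):

```python
import numpy as np

def simulate_luminance_pair(mu, cov, n_trials, seed=0):
    """Draw simulated single-trial V1 responses at scotopic and photopic
    luminance. The photopic condition is modeled as a doubling of the mean
    and variance of the scotopic multivariate normal distribution.
    Returns two (n_cells, n_trials) response matrices.
    """
    rng = np.random.default_rng(seed)
    mu, cov = np.asarray(mu, float), np.asarray(cov, float)
    scotopic = rng.multivariate_normal(mu, cov, size=n_trials).T
    photopic = rng.multivariate_normal(2 * mu, 2 * cov, size=n_trials).T
    return scotopic, photopic
```

The simulated matrices can then be projected into the dDR space with the same procedure applied to the neural data.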
Code accessibility
The MATLAB code to perform the analyses in this manuscript is available on Figshare (10.6084/m9.figshare.30359725). The imaging code is from the company Neurolabware and is freely available online (scanbox.org). The visual presentation code is available on GitHub (https://github.com/SNLC/ISI). We use the Suite2P postprocessing tools to extract cell masks and generate ΔF/F traces (https://github.com/MouseLand/suite2p).
Results
Mouse V1 represents spatiotemporal visual features, such as orientation, in a luminance-invariant manner (O’Shea et al., 2025). This luminance-invariant code could come about because information about luminance has been filtered out of the representation in V1. Alternatively, V1 could retain a representation of mean luminance level, but in a manner orthogonal to the representation of orientation. To determine whether the representation of mean luminance is retained in V1, we used two-photon calcium imaging to track the responses of V1 neurons to natural movies and oriented gratings across scotopic and photopic conditions (Fig. 1A).
A, Experimental design. B, Example fluorescent traces from simultaneously recorded V1 neurons responding to 0 and 45° gratings at scotopic and photopic luminance. For each cell, responses are scaled to the maximum ΔF/F across the four stimulus conditions. The solid line indicates the trial-averaged response, and the shaded region indicates ±1 standard deviation.
We found a diversity of functional characteristics in the evoked responses of V1 neurons to pairs of oriented gratings differing by 45° across luminance conditions (Fig. 1B). Some cells encode this 45° orientation shift in a luminance-invariant manner (“most informative orientation cells”). A separate population of cells encodes a change in luminance by increasing response magnitude from scotopic to photopic states, while others have smaller evoked responses at higher luminance (“most informative luminance cells”). Still other cells exhibit evoked responses that are invariant to both changes in orientation and luminance and are therefore uninformative for representing these two stimulus variables (“uninformative cells”). These response motifs indicate that the V1 population encodes the mean luminance of visual scenes in tandem with an invariant representation of orientation and suggest that information about these two stimulus variables is carried by distinct neural populations, with responses modulated in a heterogeneous manner by changes in luminance.
To quantify the discriminability of the V1 population code for scotopic versus photopic luminance, we asked how well a linear decoding procedure could separate visually evoked population responses to the same orientation of gratings at scotopic (0.1 Rh*/rod/s) versus photopic (10⁴ Rh*/rod/s) luminance levels (n = 10 mice; 18 experimental sessions; 21,792 cells; 1,210.7 ± 740.9 cells/session). We projected neural population data into a two-dimensional decoding space using a dimensionality reduction method which captures the primary axes along which population responses varied across grating presentations (“dDR”; Fig. 2A; Heller and David, 2022). Luminance level was reliably encoded by the V1 population, indicated by large values of the discriminability index (d′; mean d′ = 4.33 ± 2.07; Fig. 2B). We performed the same luminance decoding procedure for V1 responses to natural scenes and found that luminance level was also reliably encoded by the V1 responses to identical natural movies differing only in mean luminance (mean d′ = 4.18 ± 1.41; Fig. 2C).
Discriminability of the V1 code for mean luminance. A, Example of projecting neural responses into the dDR decoding space for luminance discrimination between photopic (orange) and scotopic (purple) conditions. B, Histogram of d′ values across all V1 imaging experiments for discriminating responses to gratings of the same orientation at scotopic and photopic luminance. The black arrow indicates the mean across all experiments and grating orientations. C, As in B, for discriminating responses to natural movies at scotopic and photopic luminance.
Figure 2-1
Discrimination index for scotopic vs photopic luminance. A, Histogram of d′ values across all V1 imaging experiments for discriminating responses to gratings of the same orientation at scotopic and photopic luminance, plotted separately for 30% and 100% contrast gratings. Arrows indicate the mean across all experiments and grating orientations at a single contrast. B, Scatter plot showing mean d′ for luminance discrimination as a function of V1 population size. Black dots indicate the mean across experimental stimulus pairs; error bars indicate ±0.5 standard deviation. Download Figure 2-1, TIF file.
We investigated why the magnitude of d′ for luminance varied across experiments and stimuli (Fig. 2B,C). Some of the variability in d′ values across this dataset is due to diversity in both visual stimuli and experimental population size. d′ for discriminating luminance is significantly higher for gratings at 100% contrast (mean d′ = 5.1 ± 1.9) than at 30% contrast (mean d′ = 3.6 ± 1.8; p = 0.002, two-sample t test; Extended Data Fig. 2-1A). In addition, d′ for luminance discrimination scales with neural population size, in line with prior results showing a scaling in the fidelity of sensory encoding with neural population size (Pearson’s correlation = 0.68; Extended Data Fig. 2-1B; Kafashan et al., 2021; Stringer et al., 2021).
Having determined that the V1 population encodes luminance level, we asked how this representation coexists with the luminance-invariant representation of orientation by the mouse V1 population. Orientation can be decoded using a luminance-invariant projection of the V1 population response (O’Shea et al., 2025). Here we have shown that information on the luminance state can also be decoded from the same V1 population response. We compared our data to candidate models of V1 response distributions to explore how the V1 population jointly represents these two stimulus variables.
The population-level representation of orientation and luminance is built up from the signals carried by individual neurons. The change in mean response of each neuron for a change in stimulus orientation, dμori, or luminance, dμluminance, quantifies how informative a cell’s trial-averaged response is for decoding the value of a particular stimulus variable. Cells with large changes in mean response magnitude for a change in stimulus value are the most informative for decoding (i.e., a “steep” tuning curve), while cells which have an invariant mean response across the range of stimulation are uninformative (i.e., a “flat” tuning curve). The vector of dμ values for all cells, dU, is the first axis of the decoding space and defines the principal dimension along which population activity varies with a change in the value of a stimulus variable of interest (Heller and David, 2022).
We considered two models for how luminance information is encoded by V1 cells in relation to their responses to changes in orientation. One possibility is that orientation and luminance information are distributed randomly across the V1 population, such that there is no correlation between the magnitude of dμori and dμluminance for single cells (“random encoding model”; Fig. 3A, left). The correlation between the magnitude of dμori and dμluminance determines the alignment of dUori and dUluminance, such that this encoding scheme generates orthogonal dU axes (dUori·dUluminance = 0) for orientation and luminance (Fig. 3A, right). Alternatively, dμori and dμluminance could covary for a given cell, such that largely overlapping subpopulations of V1 cells are most informative for decoding both orientation and luminance (“shared encoding model”; Fig. 3B, left). In this encoding scheme, the dU axes are parallel for the two stimulus variables (dUori·dUluminance = ±1; Fig. 3B, right). Dot products between dU axes of both −1 and +1 correspond to the shared encoding model, as the sign of this axis alignment simply reflects the relative directions of the response shifts for given changes in orientation and luminance.
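The quantity that distinguishes the two models can be made concrete with a short sketch (Python, with hypothetical response matrices): the alignment is simply the dot product of the unit-normalized dU vectors for the two stimulus variables, near 0 under random encoding and near ±1 under shared encoding.

```python
import numpy as np

def du_axis(resp_1, resp_2):
    """Unit vector of per-cell changes in trial-averaged response (dU)
    between two conditions, from (n_cells, n_trials) response matrices."""
    du = resp_1.mean(axis=1) - resp_2.mean(axis=1)
    return du / np.linalg.norm(du)

def encoding_alignment(ori_1, ori_2, lum_1, lum_2):
    """dU_ori . dU_luminance: ~0 for random encoding, ~+/-1 for shared."""
    return float(du_axis(ori_1, ori_2) @ du_axis(lum_1, lum_2))
```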
V1 population encoding models for the joint representation of orientation and luminance. A, Illustration of the random encoding model. Left, Distribution of dμori and dμluminance for model V1 cells. The most informative cells for decoding orientation and luminance are shaded in cyan and red, respectively. Overlap in the most informative subpopulations is shaded green. Inset tuning curves show the relative response magnitudes for changes in orientation and luminance corresponding to a large negative, zero, or large positive dμ values. Right, Three-dimensional representation of the modeled V1 population activity, showing the principal encoding axes for changes in orientation (cyan) and luminance (red). B, As in A, for the shared encoding model.
To compare the V1 population data to these candidate encoding schemes, we computed the change in trial-averaged response of all V1 cells to 45° changes in the orientation of 100% contrast, static gratings (dμori) and shifts from scotopic to photopic luminance (dμluminance; n = 8 mice; 12 experimental sessions; 14,320 cells; 1,193.3 ± 734.4 cells/session). dμori and dμluminance were on average uncorrelated when plotted against each other for all cells in each experimental population (mean Pearson's correlation coefficient = 0.07 ± 0.14; Figs. 4A–D, 5A). These scatter plots reveal additional, important characteristics of the joint encoding of orientation and luminance by the V1 population. First, the distribution of dμori is symmetric about zero, indicating that as many V1 neurons increase as decrease their response for a particular 45° change in orientation. In addition, an increase in luminance from scotopic to photopic levels causes bidirectional shifts in response magnitude across the V1 population; some neurons increase their responses, while other neurons decrease their responses. On average, however, V1 cells show an increase in response magnitude for photopic versus scotopic luminance, as shown previously (mean log ratio = 0.33 ± 0.79; O’Shea et al., 2025). This differs from the joint encoding of orientation and contrast, where an increase in contrast is encoded by a unidirectional increase in response magnitude across the V1 population (mean log ratio = 0.83 ± 0.75; Extended Data Fig. 6-1A).
Experimental dμori and dμluminance distributions. A–D, Example V1 population dμori and dμluminance distributions. Each scatter point gives the dμori and dμluminance magnitude for a V1 cell. Top 10% largest dμ magnitude cells are highlighted for orientation (cyan) and luminance (red) discrimination. Cells that overlap in both most informative subpopulations are highlighted in green. E, F, Example cell maps from V1 imaging experiments. The cells are highlighted according to the same color scheme as A–D.
Joint representation of orientation and luminance by the V1 population response. A, Histogram of the Pearson's correlation coefficient of dμori and dμluminance for simultaneously recorded V1 cells. B, Histogram of dUori·dUluminance for each V1 population and stimulus set. C, Scatter plot of the magnitude of dUori·dUluminance versus d′ for orientation discrimination. The arrows indicate the means of the distributions. All histograms show the average over stimulus orientation pairs for each experimental session.
In line with the response motifs observed across the V1 responses from Figure 1, these scatter plots suggest that information about changes in orientation and luminance is carried along independent dimensions of the V1 population response. As predicted by the random encoding model, there was some overlap between those neurons that are most informative about luminance and orientation (Fig. 3). We defined the “most informative” neurons for orientation or luminance discrimination as those with the top 10% of dμ magnitudes, that is, the largest trial-averaged changes in response for a change in stimulus. On average, there was 34 ± 7% overlap in the “most informative” neurons for the two discrimination tasks. These functionally distinct subpopulations are spatially intermingled across the V1 imaging plane (Fig. 4E,F).
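The overlap statistic can be computed as in this sketch (Python, with hypothetical dμ arrays; the original analysis was in MATLAB, and the handling of ties at the 10% cutoff is our own assumption):

```python
import numpy as np

def informative_overlap(dmu_a, dmu_b, frac=0.10):
    """Fractional overlap between the top-|dmu| ("most informative") cells
    for two stimulus variables (e.g., orientation and luminance)."""
    k = max(1, int(round(frac * dmu_a.size)))
    top_a = set(np.argsort(np.abs(dmu_a))[-k:])
    top_b = set(np.argsort(np.abs(dmu_b))[-k:])
    return len(top_a & top_b) / k
```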
Since dμori and dμluminance were uncorrelated for individual neurons, dUori and dUluminance, the primary axes for decoding stimulus information from the population response, tended to be orthogonal (mean dot product = 0.08 ± 0.20), consistent with the prediction of the random encoding model (Fig. 5B). While we observed some variability in the alignment of dUori and dUluminance across stimulus pairs, there was a clear correlation between d′ for orientation discrimination and the orthogonality of dUori and dUluminance across experiments and stimulus pairs, such that V1 populations with the highest fidelity population code for orientation were those that encoded orientation and luminance along orthogonal dimensions in the neural response space (Pearson's correlation coefficient = −0.39; Fig. 5C). This analysis indicates that, in keeping with the predictions of the random encoding model, the V1 population encodes orientation and luminance independently, resulting in orthogonal decoding axes for these two stimulus variables.
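If we take each decoding axis dU to be simply the unit vector along the corresponding dμ (a simplifying assumption; the study's dDR-based axes may also account for noise structure), the near-orthogonality of randomly encoded features follows directly from high dimensionality:

```python
import numpy as np

rng = np.random.default_rng(2)
n_cells = 800

def decoding_axis(d_mu):
    """Unit vector along the change in the trial-averaged population response."""
    return d_mu / np.linalg.norm(d_mu)

# Independent, randomly distributed d_mu vectors for the two features.
dU_ori = decoding_axis(rng.normal(size=n_cells))
dU_lum = decoding_axis(rng.normal(size=n_cells))

# In high dimensions, independent random unit vectors are nearly
# orthogonal: the dot product concentrates near 0 (scale ~ 1/sqrt(n)).
alignment = dU_ori @ dU_lum
```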
Next, we asked how the V1 population jointly encodes other visual stimulus properties. In particular, does the V1 population encode additional stimulus variables along orthogonal axes, as for the representation of orientation and luminance, or do these representations covary?
Given that V1 neurons exhibit contrast-independent orientation tuning, we predicted that orientation and contrast would be represented independently by the V1 population response (Fig. 6A; Sclar and Freeman, 1982). Scatter plots for dμori versus dμcontrast indicate that orientation and contrast information are carried by V1 subpopulations matching a random encoding model (Extended Data Fig. 6-1A). In keeping with this result, we find that dUcontrast and dUori tend to be orthogonal, which represents a generalization of contrast-invariant orientation tuning at the single-cell level to the mouse V1 population code (mean dot product = 0.01 ± 0.29; Fig. 6A). Importantly, unlike the representation of luminance in V1, changing contrast has a unidirectional effect on V1 response magnitudes (Extended Data Fig. 6-1A).
Joint representation of visual stimulus features by the V1 population response. A, Left, Histogram of dUori·dUcontrast. Right, Example fluorescent traces from V1 cells responding to static gratings across orientations and contrasts. The solid line indicates the trial-averaged response; the shaded region indicates ±1 standard deviation. B, As in A, for orientation and SF. C, As in A, for SF and TF. All histograms show the average over stimulus orientation pairs for each experimental session.
Figure 6-1
Experimental dμ distributions for orientation, contrast, SF, and TF. A, Example V1 population dμori and dμcontrast distributions. Each scatter point gives the dμori and dμcontrast magnitudes for one V1 cell. The cells with the top 10% largest dμ magnitudes are highlighted for orientation (cyan) and contrast (red) discrimination. Cells belonging to both most-informative subpopulations are highlighted in green. B, As in A, for dμori and dμSF. C, As in A, for dμSF and dμTF.
We found that not all visual stimulus features are encoded along orthogonal dimensions by the V1 population. The orientation tuning of mouse V1 neurons varies with spatial frequency (SF), suggesting that orientation and SF are not encoded along orthogonal dimensions in the neural response space (Fig. 6B; Ayzenshtat et al., 2016; Pattadkal et al., 2018). To test this idea, we computed decoding axes for V1 population responses to 45° changes in the orientation of drifting gratings and to 1 octave changes in the SF of gratings at a fixed orientation. Example scatter plots for dμori versus dμSF are presented in Extended Data Fig. 6-1B. dUori and dUSF were not orthogonal, but rather had dot products shifted away from zero, in contrast with the orthogonality of dUori and dUluminance (Fig. 6B).
Similarly, we computed the alignment of decoding axes for 1 octave changes in the SF and temporal frequency (TF) of gratings, since mouse V1 neurons do not encode these variables independently but rather exhibit tuning for SF–TF combinations corresponding to a particular velocity (Fig. 6C; Priebe et al., 2003; Andermann et al., 2011). Example scatter plots for dμSF versus dμTF are presented in Extended Data Figure 6-1C. dUSF and dUTF were also not orthogonal (Fig. 6C).
The representation of luminance by the V1 population could be due to a change in the gain of visually evoked responses in the retina between scotopic and photopic conditions. RGC firing rates are higher in the photopic than scotopic regime, and this difference could be inherited by V1 in the form of a one-dimensional gain shift of evoked responses across the population (Ruda et al., 2020). This kind of change in response magnitude across the V1 population is one way in which V1 might encode the luminance level. We simulate this scenario by taking a set of V1 responses to two gratings in the scotopic regime and multiplying them by a factor of two to get a pseudo-population response for the photopic condition (Extended Data Fig. 7-1A, left panel). This pseudo-population response carries information about both the orientation and luminance of the stimulus. In this scenario, equalizing response magnitudes across the two luminance conditions, by normalizing the population response to the maximum response within each condition, eliminates the luminance information in the V1 code without disrupting the representation of orientation (Extended Data Fig. 7-1A, right panel).
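This scenario can be sketched directly (synthetic responses, hypothetical names): if the photopic response is an exact scalar multiple of the scotopic response, then normalizing to the within-condition maximum makes the two conditions identical, erasing luminance information, while the difference between the two gratings survives.

```python
import numpy as np

rng = np.random.default_rng(3)
n_cells, n_trials = 200, 50

# Scotopic single-trial responses to two gratings (cells x trials).
scot_0 = rng.gamma(2.0, 1.0, (n_cells, n_trials))
scot_45 = rng.gamma(2.0, 1.0, (n_cells, n_trials))

# Pseudo-photopic responses: a pure scalar gain change (factor of 2).
phot_0, phot_45 = 2.0 * scot_0, 2.0 * scot_45

def normalize(responses):
    """Scale all responses by the maximum within one luminance condition."""
    peak = max(r.max() for r in responses)
    return [r / peak for r in responses]

norm_scot = normalize([scot_0, scot_45])
norm_phot = normalize([phot_0, phot_45])

# Luminance information is erased: the normalized conditions coincide...
assert np.allclose(norm_scot[0], norm_phot[0])
# ...but orientation information survives: the two gratings still differ.
assert not np.allclose(norm_scot[0], norm_scot[1])
```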
To further illustrate how luminance could be encoded by a change in the gain of V1 responses, consider a model two-neuron population responding to the same stimulus at two different luminance levels, with the only change in the responses across conditions being a scalar shift in the mean and variance, akin to V1 neurons inheriting a gain shift from the retina across light levels (Fig. 7A, left panel). A linear decoder can separate responses from the two conditions with high fidelity (Fig. 7A, right panel). However, if we normalize the modeled responses to the maximum response within each condition, which eliminates the change in response magnitude between luminance conditions, then the two conditions are no longer linearly discriminable (Fig. 7B). This modeling result generalizes to networks of arbitrary size.
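A minimal numerical version of this two-neuron model (a sketch under the assumption that the photopic responses are scotopic-like draws scaled by 2, using a simple mean-difference linear readout for d′ rather than the study's decoder):

```python
import numpy as np

rng = np.random.default_rng(4)
n_trials = 1000
mean = np.array([1.0, 1.5])   # hypothetical two-neuron mean response

# Photopic responses are scotopic-like draws scaled by 2, so both the
# mean and the spread of the responses increase with luminance.
scot = rng.normal(mean, 0.1, (n_trials, 2))
phot = 2.0 * rng.normal(mean, 0.1, (n_trials, 2))

def dprime(a, b):
    """Discriminability along the axis joining the two class means."""
    w = a.mean(axis=0) - b.mean(axis=0)
    w /= np.linalg.norm(w)
    pa, pb = a @ w, b @ w
    return abs(pa.mean() - pb.mean()) / np.sqrt(0.5 * (pa.var() + pb.var()))

d_raw = dprime(scot, phot)    # large: the gain shift separates conditions

# Normalizing to the maximum response within each condition removes the
# gain difference, and the two conditions become indistinguishable.
d_norm = dprime(scot / scot.max(), phot / phot.max())
```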
Luminance information is not carried by a scalar gain change in the V1 population response. A, Left, Schematic of a two-neuron model for luminance encoded by a scalar gain change in responses, which increases the mean and variance of the neural responses by a factor of 2 in the photopic (orange) relative to scotopic (purple) condition. Right, Luminance condition can be discriminated from these population responses using the dDR method. B, As in A, following normalization of responses. C, Scatter plot showing d′ for luminance in experimental V1 data before and after normalizing response magnitude in each condition. The gray region shows the predicted effect of normalization on luminance discriminability for a one-dimensional gain change in the V1 population response with luminance.
Figure 7-1
Modeling V1 responses to scotopic and photopic luminance. A, Left, Modeled fluorescent traces from V1 neurons responding to 0° and 45° gratings at scotopic and photopic luminance. The photopic responses are the scotopic responses scaled by a factor of 2. Right, Modeled fluorescent traces from V1 neurons following normalization to the maximum response across all neurons in each light condition. B, Left, Real fluorescent traces from simultaneously recorded V1 neurons responding to 0° and 45° gratings at scotopic and photopic luminance. Right, Real fluorescent traces from V1 neurons following normalization to the maximum response across all neurons in each light condition. The solid line indicates the trial-averaged response; the shaded region indicates ±1 standard deviation.
In order to determine whether the representation of luminance by the mouse V1 population is due to a scalar shift in response magnitude across the population, we applied the same normalization procedure to our real neural data. Normalizing the V1 population response within each condition does not disrupt the representation of luminance (Extended Data Fig. 7-1B). This is because the transition from scotopic to photopic luminance has heterogeneous effects on responses across cells. These changes are distributed across the V1 population and are not one-dimensional, like a shift in gain. Rather, the effect of changing luminance is high-dimensional and is therefore not eliminated by this normalization procedure. We found that d′ values for luminance discrimination are unaffected by this normalization, indicating that the representation of luminance level in V1 is not solely due to a one-dimensional gain shift fed forward from the retina (mean d′ ratio = 0.95 ± 0.09; Fig. 7C).
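The contrast with the scalar-gain model can be sketched by giving each model cell its own, bidirectional luminance gain (synthetic data, hypothetical parameters): within-condition normalization divides each condition by a single number, which cannot undo a cell-by-cell pattern of changes, so luminance discriminability survives.

```python
import numpy as np

rng = np.random.default_rng(5)
n_cells, n_trials = 100, 200

base = rng.gamma(2.0, 1.0, n_cells)       # scotopic mean, per cell
gain = rng.lognormal(0.0, 0.6, n_cells)   # heterogeneous: some cells up, some down

scot = rng.normal(base, 0.3, (n_trials, n_cells))
phot = rng.normal(base * gain, 0.3, (n_trials, n_cells))

def dprime(a, b):
    """Discriminability along the axis joining the two class means."""
    w = a.mean(axis=0) - b.mean(axis=0)
    w /= np.linalg.norm(w)
    pa, pb = a @ w, b @ w
    return abs(pa.mean() - pb.mean()) / np.sqrt(0.5 * (pa.var() + pb.var()))

d_raw = dprime(scot, phot)

# Within-condition normalization rescales each condition by one scalar,
# leaving the high-dimensional pattern of per-cell changes intact, so
# d_norm remains high: luminance is not a scalar gain code here.
d_norm = dprime(scot / scot.max(), phot / phot.max())
```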
Discussion
We have shown that the mouse V1 population encodes mean luminance and orientation in a separable manner. Information about the orientation and luminance of visual inputs is distributed randomly across the V1 population, such that the principal decoding axes for these two stimulus variables are orthogonal to one another. These results indicate that the V1 population response carries information about mean luminance, in a manner that does not interfere with using a shared, linear decoder to recover orientation information across scotopic and photopic conditions. Thus, V1 encodes visual orientation and mean luminance independently, allowing for a luminance-invariant spatial representation of visual features to coexist with a flexible representation of environmental state. The joint representation of orientation and luminance is distinct from that of orientation and contrast, which are also encoded along orthogonal axes by the V1 population, in that a change in luminance has diverse, bidirectional effects on response magnitudes across the V1 population. The encoding of orientation and luminance is also distinct from the encoding of other visual properties, such as spatial and temporal frequency, which are not encoded independently by the V1 population.
The visual system must encode not only spatiotemporal information related to the identity and motion of objects in visual scenes but also information about the current state of the environment. The mean luminance of a visual scene is a valuable measure of environmental state as it systematically shifts on a daily basis. Importantly, most RGCs do not encode absolute luminance level in a monotonic manner. Rather, luminance adaptation shifts the gain of RGC responses to retain sensitivity over a relatively narrow range of light intensities.
How then might downstream visual circuits extract information about the mean luminance of a visual scene? One possibility is that the luminance adaptation state is encoded implicitly via functional shifts in the spatiotemporal or chromatic responses of neurons in central visual areas. Functional shifts that are reliably tied to a change in mean luminance provide information about the mean luminance of the visual input. For example, species with a rod-free fovea, such as primates, lose visual sensitivity at the center of the visual field when adapted to luminance below the threshold for cone activation. This functional shift in the output of the retina alters the visual information available to downstream circuitry. In primates, a simple representation of the luminance adaptation state could be extracted from whether or not V1 neurons with receptive fields within the foveal retinotopy are visually driven. Importantly, outside of the foveal zone, the spatial selectivity of macaque V1 cells is largely invariant across scotopic and photopic conditions, as in the mouse (Duffy and Hubel, 2007; O’Shea et al., 2025). Furthermore, human psychophysical experiments have shown that the perceptual attenuation of spatial and temporal contrast sensitivity for low frequencies, first measured under photopic conditions, persists in scotopic conditions. In addition, the perceptual phenomenon of simultaneous contrast, in which squares of equal luminance appear to differ in brightness depending on the luminance of the surrounding area, persists in the scotopic regime (Fiorentini and Maffei, 1973). These results indicate that luminance-invariant spatial encoding is a general property of mammalian V1 and raise the question of how luminance could be represented by V1 neurons in the absence of changes in receptive field structure.
Our results, along with prior psychophysical experiments, indicate that changes in mean luminance are encoded by V1 despite luminance adaptation in the retina and the luminance-invariant tuning properties of downstream neurons. These results suggest that an explicit signal for luminance, perhaps originating in the ipRGCs, provides a representation of luminance to downstream visual areas. Indeed, ipRGCs project to the lateral geniculate nucleus of the thalamus (LGN), implicating this class of neurons in the thalamocortical visual circuit (Dacey et al., 2005; Brown et al., 2010). ipRGCs have overlapping spectral sensitivity with mouse rod and M-cone photoreceptors and will therefore be differentially activated by the 525 nm scotopic and photopic stimuli used in this study. Humans and mice can behaviorally discriminate the brightness of visual metamers that do not alter rod or cone activation, but do differentially activate melanopsin, the photopigment present in ipRGCs (Brown et al., 2012). Finally, mice lacking rod and cone photoreceptors can still perform a visually guided task in which they must discriminate the brightness of two displays in order to exit a water maze (Brown et al., 2012). These findings suggest that signals from ipRGCs projecting to the LGN could modulate the V1 population response across light levels.
Here we have shown that the V1 representation of mean luminance is not due to a one-dimensional shift in the gain of the population response. Rather, changing luminance has diverse effects on individual V1 neurons, with some neurons showing no change in visually evoked response magnitudes between luminance conditions, others having higher response magnitudes for scotopic relative to photopic stimuli, and still others responding more strongly in the photopic than scotopic condition. The finding that the effect of transitioning from scotopic to photopic luminance on V1 activity is high-dimensional emphasizes the importance of studying visual encoding from a neural population perspective, since the representation of luminance is distributed across V1 neurons which respond to changes in luminance in a heterogeneous manner.
Our observation that luminance can modulate V1 neuronal responses bidirectionally and with varying magnitudes may be at odds with the canonical finding that ipRGCs respond monotonically to increases in luminance (Do and Yau, 2010). Any proposed circuit mechanism for the encoding of luminance in V1 must account for this bidirectional modulation of V1 responses across the population, given the monotonic modulation of ipRGC firing rate by luminance. Interestingly, recent studies of ipRGCs in the mouse retina have identified functional clusters of ipRGCs which have distinct dynamic ranges for encoding luminance (Sondereker et al., 2020). Some classes of ipRGCs are more strongly driven at scotopic than photopic luminance, breaking from the canonical view of ipRGCs as monotonic encoders of luminance. A possible explanation for the diverse effects of luminance on V1 responses could be that projections from these functional clusters of ipRGCs remain segregated in downstream visual areas.
The visual system must convert retinal activity into a code for guiding behavior. Constructing such a code requires that the representations of certain visual features be independent of one another. For example, objects must be identified regardless of the environmental conditions in which they appear. Here we have shown that a representation of absolute luminance coexists with an invariant representation of spatial visual information in mouse V1. Further, we demonstrate how visual representations are distributed across the cortical population. By following a random coding scheme, in which there is no correlation between a neuron's change in response to a change in stimulus orientation and the change in response to a change in luminance, these two stimulus variables can be encoded independently by the V1 population.
The joint encoding of visual features by a random encoding scheme, as we have described for orientation, luminance, and contrast, is one way in which the visual system extracts behaviorally relevant information from retinal signals. There are other mechanisms for generating independent representations, such as encoding features by separate neural populations, either within or across brain regions. While V1 generates separable representations for orientation and luminance, the linked representations of SF and TF could indicate either that a separable representation for these features has not yet been unraveled at this stage in the visual hierarchy, or, more likely, that another representation is more important for driving behavior: sensitivity to speed. Understanding which visual feature representations are linked and which are encoded independently by neural populations across visual areas will reveal the process by which the visual system extracts behaviorally relevant information from retinal signals.
Footnotes
The authors declare no competing financial interests.
This work was supported by the NIH (R01-EY028657 and 5-T32-EY-021462-12).
X-X.W. and N.J.P. are senior authors.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.