Bayesian modeling of dynamic motion integration

https://doi.org/10.1016/j.jphysparis.2007.10.013

Abstract

The quality of the representation of an object’s motion is limited by the noise in the sensory input as well as by an intrinsic ambiguity due to the spatial limitation of the visual motion analyzers (aperture problem). Perceptual and oculomotor data demonstrate that motion processing of extended objects is initially dominated by the local 1D motion cues, related to the object’s edges and orthogonal to them, whereas 2D information, related to terminators (or edge-endings), progressively takes over and leads to the final correct representation of global motion. A Bayesian framework accounting for the sensory noise and general expectancies for object velocities has proven successful in explaining several experimental findings concerning early motion processing [Weiss, Y., Adelson, E., 1998. Slow and smooth: a Bayesian theory for the combination of local motion signals in human vision. MIT Technical report, A.I. Memo 1624]. In particular, these models provide a qualitative account for the initial bias induced by the 1D motion cue. However, a complete functional model, encompassing the dynamical evolution of object motion perception, including the integration of different motion cues, is still lacking. Here we outline several experimental observations concerning human smooth pursuit of moving objects, and more particularly the time course of its initiation phase, which reflects the ongoing motion integration process. In addition, we propose a recursive extension of the Bayesian model, motivated and constrained by our oculomotor data, to describe the dynamical integration of 1D and 2D motion information. We compare the model predictions for object motion tracking with human oculomotor recordings.

Introduction

Efficient object motion processing is achieved in humans and non-human primates by integrating multiple noisy local motion signals. It is useful to distinguish two types of local motion signals, ambiguous and non-ambiguous ones. Motion signals from elongated uni-dimensional (1D) contours are ambiguous when analyzed through a spatially limited aperture (see Fig. 1), similar to the receptive field of many neurons in the motion-sensitive middle-temporal (MT) cortical area (Albright, 1984). The ambiguity arises because the motion of the contour in the tangential direction is unknown, so that the observed movement is consistent with a family of possible motion directions and velocities (Fig. 1). In contrast, motion signals from local 2D features (e.g. terminators) are non-ambiguous, and psychophysical (Lorenceau and Shiffrar, 1992) and physiological (Pack and Born, 2001) studies have demonstrated that these signals can be used to reliably solve the aperture problem. However, the integration of 1D and 2D information is time-consuming, and very short presentations of moving objects may give rise to characteristic perceptual errors (Lorenceau et al., 1993) that are biased in the direction orthogonal to the contour. For example, Lorenceau and Shiffrar (1992) found that tilted lines (+20° anti-clockwise with respect to the vertical) moving to the right and down are perceived as moving upward if presented very briefly at a low contrast. For longer presentations, the perceptual bias tends to be reduced and eventually eliminated. In parallel, electrophysiological recordings have shown that direction selectivity for motion-sensitive neurons in MT changes across time, over a typical time interval of ∼60 ms (Pack and Born, 2001). Initially, MT neurons respond mostly to the direction orthogonal to the object’s contour, whereas later they encode the object’s actual motion direction.
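The aperture ambiguity described above can be made concrete with a minimal numerical sketch (not part of the original study; the edge angle and velocity values are illustrative): through a small aperture, only the velocity component along the edge normal is measurable, so an entire one-parameter family of 2D velocities produces the same measurement.

```python
import numpy as np

def aperture_measurement(v, edge_dir):
    """Through a small aperture only the velocity component along the
    edge normal is observable; motion along the edge is invisible."""
    n = np.array([-edge_dir[1], edge_dir[0]])  # unit normal to the edge
    return float(np.dot(v, n))

# Edge tilted 20 deg anti-clockwise from vertical, moving right and down
# (illustrative values echoing the Lorenceau & Shiffrar configuration).
theta = np.radians(20.0)
edge_dir = np.array([-np.sin(theta), np.cos(theta)])  # unit edge direction
v_true = np.array([1.0, -0.5])
m = aperture_measurement(v_true, edge_dir)

# The whole family v_true + t * edge_dir yields the same measurement:
# this constraint line is exactly what makes 1D motion cues ambiguous.
for t in (-2.0, 0.0, 3.0):
    assert np.isclose(aperture_measurement(v_true + t * edge_dir, edge_dir), m)
```

Any observer relying on this 1D measurement alone can only recover the constraint line, not a unique velocity; disambiguation requires 2D features such as terminators.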

Recent studies on smooth pursuit eye movements (SPEM) also provide an account of early motion processing that parallels these psychophysical findings (see also Section 2). When human subjects or monkeys are required to visually track a moving object carrying different 1D and 2D information, eye-velocity traces are transiently biased, at pursuit initiation, toward the 1D-cued direction, i.e. orthogonally to the object’s contour (Masson and Stone, 2002, Wallace et al., 2005, Born et al., 2006). Later, the edge-orthogonal 1D-bias (or tracking error) is progressively eliminated and eye-velocity converges to the object’s global motion. Typically, the resolution of motion signal ambiguity is achieved within the first 300–400 ms after the presentation of the moving stimulus.

Besides the intrinsic ambiguity resulting from the local edge direction of motion, visual motion processing is also affected by the noise embedded in the sensory input per se. These two sources of uncertainty can be well integrated within a Bayesian framework (Weiss et al., 2002, Weiss and Fleet, 2002, Stocker and Simoncelli, 2006, Perrinet et al., 2005) where the perceived motion is the solution of a statistical inference problem. In these models, the information from local 1D and 2D motions can be represented by their likelihood functions, and these functions can be derived for simple objects with the help of a few reasonable assumptions (Weiss and Fleet, 2002). Bayesian models also allow the inclusion of prior constraints, and the most common assumption used in motion models is a preference for slow speeds. The effects of priors are especially salient when signal uncertainty is high. One way to increase the uncertainty of a visual stimulus is to reduce its contrast, and in these cases perceived velocity is indeed underestimated (Thompson, 1982), thereby providing some experimental support for the slowness prior. Interestingly, Priebe and Lisberger (2004) demonstrated that increasing the spatial frequency of the moving stimuli leads to results qualitatively similar to those of a decrease in contrast (or, more generally, an increase of visual noise), namely an underestimation of perceived motion speed.
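The contrast effect on perceived speed falls out directly when both prior and likelihood are taken as Gaussians, as in the Weiss et al. formulation; a minimal sketch (the speeds and variances below are hypothetical, not fitted to data) shows how widening the likelihood shifts the MAP estimate toward the slow-speed prior:

```python
import numpy as np

def map_speed(v_measured, sigma_like, sigma_prior):
    """MAP speed estimate for a Gaussian likelihood centred on the
    measurement and a zero-mean Gaussian slow-speed prior: the product
    of two Gaussians shrinks the mean toward zero by a reliability weight."""
    w = sigma_prior**2 / (sigma_prior**2 + sigma_like**2)
    return w * v_measured

v = 10.0           # deg/s, stimulus speed (hypothetical)
sigma_prior = 1.0  # prior strongly favouring slow speeds

high_contrast = map_speed(v, sigma_like=0.5, sigma_prior=sigma_prior)  # narrow likelihood
low_contrast  = map_speed(v, sigma_like=2.0, sigma_prior=sigma_prior)  # wide likelihood

# Lowering contrast widens the likelihood, so the prior pulls the
# estimate further toward zero: speed is underestimated more.
assert low_contrast < high_contrast < v
```

The same mechanism accounts for the spatial-frequency manipulation: anything that broadens the likelihood increases the relative weight of the slowness prior.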

Up to now, Bayesian motion models have been applied to qualitatively predict, for instance, the initial bias toward 1D motion signals observed experimentally and its dependence on sensory noise (Weiss et al., 2002). We propose here to develop this theoretical framework in order to model smooth pursuit eye movements when tracking moving objects that carry multiple local cues. In particular, we will focus on the dynamical evolution of the tracking error, which reflects, in our opinion, the main characteristics of the underlying dynamical motion integration process. Our dynamic model is composed of a Bayesian kernel and an updating rule. The Bayesian kernel is fairly traditional, combining prior knowledge on speed with the current estimate to produce a robust inference of velocity. The updating rule revises the prior over time, thereby reflecting all past evidence about particular velocities. We propose that the prior initially represents a default assumption, independent of any stimulus, and is then recursively updated by using the previous posterior probability as the current prior. The recursive injection of the posterior distribution boosts the spread of information about the object’s global shape, favoring the disambiguation of 1D by 2D cues. We also propose to both constrain and validate this model by means of experimental recordings of smooth pursuit eye movements.
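The recursive scheme can be sketched with Gaussian likelihoods in 2D velocity space (a toy illustration under our own assumptions, not the authors' fitted model; all precisions and the motion geometry below are illustrative). The 1D cue constrains only the edge-normal component (a rank-one precision), the 2D cue is unambiguous but noisier, and at each step the posterior becomes the next prior:

```python
import numpy as np

theta = np.radians(20.0)
edge = np.array([-np.sin(theta), np.cos(theta)])   # unit edge direction
n = np.array([-edge[1], edge[0]])                  # unit edge normal
v_true = np.array([5.0, 0.0])                      # rightward object motion

# Likelihood precisions (inverse covariances); values are illustrative.
P_1d = 4.0 * np.outer(n, n)      # 1D cue: constrains only the normal component
P_2d = 0.5 * np.eye(2)           # 2D cue: unambiguous but noisier
mu_1d = np.dot(v_true, n) * n    # normal-component velocity signalled by the 1D cue
mu_2d = v_true

# Slow-speed prior: zero mean, recursively replaced by each posterior.
P_prior, mu_prior = 0.1 * np.eye(2), np.zeros(2)
for step in range(30):
    P_post = P_prior + P_1d + P_2d
    mu_post = np.linalg.solve(P_post, P_prior @ mu_prior
                              + P_1d @ mu_1d + P_2d @ mu_2d)
    P_prior, mu_prior = P_post, mu_post  # posterior becomes the next prior

# Early estimates are biased toward the edge normal (the dominant 1D cue);
# accumulated 2D evidence then drives convergence onto the true velocity.
assert np.linalg.norm(mu_prior - v_true) < 0.5
```

Under this sketch the estimate's time course mimics the pursuit traces described in Section 2: an initial edge-orthogonal bias that decays as recursive updates accumulate unambiguous 2D evidence.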

Section snippets

Dynamic motion integration: an oculomotor account

Humans and monkeys are perfectly able to visually track the center of a moving extended object. The general purpose of these smooth voluntary eye movements is the stabilization of the image of the moving object on the fovea. Tracking accuracy during the steady-state movement is very high regardless of the orientation of the object’s edges with respect to motion direction.

However, before the steady-state movement is achieved, significant biases can be observed. When the orientation of a moving

Smooth pursuit recording and analysis

In the first set of oculomotor experiments, we recorded smooth pursuit eye movements from three human subjects (two authors of the paper and one naïve subject) while they were tracking one of two objects. The first object was a circular Gaussian spot and the second a line whose length could be approximated as infinite, in the sense that terminators were very far in the periphery and therefore their influence was presumably very limited. These stimuli moved with various motion directions and

Results

Fig. 6 presents, for each subject and target motion direction, the estimated variance of the prior and the two independent likelihood distributions as a function of the target speed. It is important to underline that these are estimates of hidden variables which are supposed to characterise the internal inferential processes underlying motion integration. Because these variables are fully constrained by experimental data, they may provide a first general validation of the model. Fig. 6 deserves

Conclusions

Uncertainty in motion processing is reflected in the variability of the initial velocity of smooth pursuit eye movements. This type of eye movements provides also a reliable dynamic measure of the different contributions of 1D and 2D motion cues to motion integration. We have presented a simple model of motion integration dynamics, which is based on the idea of recursively updating the observer’s prior about object motion by means of recent experience. The model is quantitatively constrained by

Acknowledgements

We are deeply thankful to the patient volunteers who participated in the oculomotor experiments and in particular the naïve subject AR. Anna Montagnini was supported by a Marie Curie European Individual Fellowship.
