Acta Psychologica, Volume 138, Issue 1, September 2011, Pages 219-230

Similarity and categorization: From vision to touch

https://doi.org/10.1016/j.actpsy.2011.06.007

Abstract

Even though human perceptual development relies on combining multiple modalities, most categorization studies so far have focused on the visual modality. To better understand the mechanisms underlying multisensory categorization, we analyzed visual and haptic perceptual spaces and compared them with human categorization behavior. As stimuli we used a three-dimensional object space of complex, parametrically-defined objects. First, we gathered similarity ratings for all objects and analyzed the perceptual spaces of both modalities using multidimensional scaling. Next, we performed three different categorization tasks that are representative of everyday learning scenarios: in a fully unconstrained task, objects were freely categorized; in a semi-constrained task, exactly three groups had to be created; and in a constrained task, participants received three prototype objects and had to assign all other objects accordingly. We found that the haptic modality was on par with the visual modality both in recovering the topology of the physical space and in solving the categorization tasks. We also found that within-category similarity was consistently higher than across-category similarity for all categorization tasks, showing how perceptual spaces based on similarity can explain visual and haptic object categorization. Our results suggest that both modalities employ similar processes in forming categories of complex objects.

Highlights

• We investigate similarity rating and categorization in visual and haptic processing.
• We use complex, parametrically-defined objects in our experiments.
• The perceptual spaces for both tasks are highly similar for vision and touch.
• Haptic processing can compete with visual processing even in complex tasks.

Introduction

Categorization is one of the most fundamental processes of the human brain. Every day we are confronted with a vast amount of sensory data, which has to be categorized into meaningful entities or concepts: is the milk fresh or not, and can I therefore drink it, or not? Is the face of my friend smiling or not, and is he therefore in a good mood or not? Even though categorization has always been one of the core topics in philosophy and, later, in psychology, it was perhaps with the seminal studies by Rosch (1978) that research on perceptual categorization came into focus. Since then, a host of studies have been devoted to analyzing visual object categorization. Several factors have been shown to influence categorization behavior: typicality (How typical is the object for a certain category?); familiarity (How familiar is the object to the observer?); frequency (How often is the object represented within the category?); knowledge about the origin (for example, a fawn is the baby of a roe deer, although the fur pattern is different); and many more (Hahn & Ramscar, 2001). Although these factors influence categorization, the default factor underlying categorization behavior in many cases seems to be similarity between objects (Goldstone, 1994, Hahn and Ramscar, 2001). In one popular model for explaining categorization strategies, the exemplar model, similarity is measured between a new object that has to be categorized and the representations of all previously encountered objects (Nosofsky, 1992). In another model, the prototype model, each category is represented by a single representative object (either a known but highly prototypical member of the category, or a virtual average of the whole category), and a newly encountered object is compared to the prototypes of several categories in order to assign it to the best-matching one (Edelman, 1998, Posner and Keele, 1968, Rosch, 1978) (see Ashby & Maddox, 2005, for a comparison of the different categorization models and a more in-depth discussion of alternative influences on categorization, such as rule-based strategies).
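
To make the contrast between the two decision rules concrete, the following sketch implements both on points in an arbitrary feature space. It is a minimal illustration only, not the models as fitted in the cited studies; the exponential similarity kernel and the sensitivity parameter c are common but assumed choices.

    import numpy as np

    def exemplar_classify(x, exemplars, labels, c=1.0):
        # Exemplar rule: sum the similarity of x to every stored exemplar
        # of each category and pick the category with the most evidence.
        # The exponential similarity kernel and c are assumptions.
        labels = np.asarray(labels)
        sims = np.exp(-c * np.linalg.norm(np.asarray(exemplars) - x, axis=1))
        cats = np.unique(labels)
        evidence = [sims[labels == k].sum() for k in cats]
        return cats[int(np.argmax(evidence))]

    def prototype_classify(x, exemplars, labels):
        # Prototype rule: represent each category by the mean of its
        # members and assign x to the category with the nearest prototype.
        labels = np.asarray(labels)
        exemplars = np.asarray(exemplars)
        cats = np.unique(labels)
        protos = np.array([exemplars[labels == k].mean(axis=0) for k in cats])
        return cats[int(np.argmin(np.linalg.norm(protos - x, axis=1)))]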

One of the early findings of categorization research was that when humans visually categorize objects, they often focus on shape as the decisive feature (Rosch, Mervis, Gray, Johnson, & Boyesbraem, 1976). Shape, however, is not exclusive to the visual modality: the haptic modality is also an expert system for identifying shapes (Klatzky, Lederman, & Metzger, 1985). It therefore seems natural that shape also plays an important role in haptic object categorization, as was shown by Lederman and Klatzky (1990). Whereas children from 3 to 9 years of age rely more heavily on other object features such as texture, shape is especially important for adults in object categorization (Schwarzer, Kufer, & Wilkening, 1999).

The reason for the importance of tactile input to human shape processing is that humans are able to form a bimodal representation of object shape from the earliest days of infancy. Directly after birth, vision and haptics are both limited: the infant can only passively perceive its surroundings and the objects presented to it. However, already at this stage the palmar grasp reflex allows the infant to gather information about textures and materials in its surroundings, such as the blanket it lies on or the mother's hair. As soon as active exploration becomes possible, the infant gains access to a vastly enriched sensory impression. Grasping an object enables the infant to turn it and view it from different angles, and thus to form a visual and a haptic 3D shape representation simultaneously. In addition, object properties mainly available to the haptic modality (weight, material, temperature) can be linked to the visual appearance, allowing easy recognition from visual input alone later in life.

Following the idea of a close connection between vision and touch in the development of (perceptual) categorization, it seems clear that a better understanding of haptic categorization behavior is necessary if we want to understand how the human brain forms categories of objects. So far, however, only relatively few studies have analyzed how humans categorize objects haptically. Lederman and Klatzky (1990), for example, analyzed how humans categorize common, everyday objects, while Haag (2011) compared visual and haptic categorization of toy objects resembling miniaturized animals. While these studies have shown that humans are surprisingly good at haptic recognition of familiar objects, the drawback of such stimulus sets is that they are hard to characterize in terms of important object parameters such as shape or texture. Furthermore, in both studies participants had to first link a semantic meaning to the objects before they were able to categorize them. In contrast, Homa et al. (2009), Schwarzer et al. (1999), and Cooke et al. (2007) used fully controlled, novel stimuli, so that participants had to base their categorization behavior solely on object-intrinsic properties. The stimuli used in these three studies, however, were restricted to simpler geometric shapes, such as cylinders or building blocks. In the present study, we try to bridge the gap between highly familiar objects and novel, geometric stimuli with a unique stimulus set of fully controlled, but still natural, objects. For this purpose, we combined a biologically-plausible mathematical model describing shell growth with 3D printing techniques and generated a set of complex, shell-shaped 3D objects that are well-defined parametrically. These objects span a three-dimensional object space. With this carefully designed physical shape space, we first collected similarity ratings to investigate the topology of the perceptual spaces in vision and haptics. To better understand the link between similarity and categorization, we then performed three different categorization tasks with different degrees of freedom and analyzed the structure of the resulting categories. This was done to examine whether a change in categorization procedure would affect the visual and the haptic modality in the same or in different ways, and thus whether the same processes underlie visual and haptic object categorization. Taken together, our study seeks to increase our understanding of the mechanisms underlying multisensory categorization.
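
To illustrate how such a parametric shell space can be generated, the following sketch samples a shell surface from a logarithmic helico-spiral in the spirit of Fowler et al. (1992). The parameters growth, aperture, and pitch are illustrative stand-ins for the three dimensions varied in our object space; the actual parameterization used in the study is described in the Materials and methods.

    import numpy as np

    def shell_surface(growth=1.2, turns=6.0, aperture=0.4, pitch=0.15, n=200):
        # Sweep a circular generating curve along a helico-spiral whose
        # radius grows exponentially with the winding angle, yielding a
        # shell-shaped surface (illustrative parameterization only).
        theta = np.linspace(0.0, 2.0 * np.pi * turns, n)   # winding angle
        phi = np.linspace(0.0, 2.0 * np.pi, n)             # angle around aperture
        T, P = np.meshgrid(theta, phi)
        R = growth ** (T / (2.0 * np.pi))                  # exponential radial growth
        x = R * (1.0 + aperture * np.cos(P)) * np.cos(T)
        y = R * (1.0 + aperture * np.cos(P)) * np.sin(T)
        z = R * aperture * np.sin(P) - pitch * T * R       # vertical displacement
        return x, y, z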

For the categorization experiments, we selected three different learning scenarios humans might encounter in different contexts. For the first scenario, imagine a mother pointing towards a Golden Retriever and telling her child, "this is a dog". The child will learn the label "dog" but will come across many different kinds of dogs later on. Some may look very similar, for example, a German Shepherd; others will look very different, such as a bulldog. Still, the child will learn that all of them form one category, most likely with no further explicit feedback provided. Our "constrained" categorization experiment resembles this situation: one object per category is provided, and all subsequent objects have to be sorted accordingly.

Another scenario, one that researchers, for example, are confronted with in daily life, is the unconstrained categorization task. Researchers have to read many papers and then archive them intelligently. They therefore have to find a categorization system, such as putting papers into folders by year of publication or by the first letter of the first author's name. In this task, the researchers are free to choose their own system with personally preferred categories and category labels. A similar situation arises when children playfully sort their toys or other objects into categories.

The third categorization experiment is semi-constrained and thus lies between the constrained and the unconstrained task: here, participants have to form a prescribed number of categories (in our case, exactly three).

All tasks have in common that no explicit feedback is provided; that is, participants are not informed whether their categorization of individual objects is "correct" or not. In this sense, all three tasks follow the procedure of unsupervised learning (Love, 2002, Shepard et al., 1961). Implicit feedback, however, may be perceived by the participants: in our experiments, each categorization task was repeated until the same categories were formed twice in a row. Hence, a categorization behavior that is easier to remember would potentially speed up the experiment.
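
One way to operationalize this stopping rule (an assumed reconstruction, since the criterion is described here only verbally) is to treat each trial's grouping as a partition of the object set and to repeat the task until two consecutive partitions are identical up to relabeling, for example via the adjusted Rand index:

    from sklearn.metrics import adjusted_rand_score

    def same_partition(groups_a, groups_b):
        # True if two categorizations induce the same partition of the
        # objects, regardless of the arbitrary labels given to the groups.
        # The adjusted Rand index equals 1 exactly for identical partitions.
        return adjusted_rand_score(groups_a, groups_b) == 1.0

    # e.g. same_partition([0, 0, 1, 2], [2, 2, 0, 1]) -> True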

The categorization data as well as the similarity ratings were analyzed using multidimensional scaling (MDS). MDS is an elegant and powerful approach to visualize and study perceptual spaces that are created from similarities (Shepard et al., 1961, Torgerson, 1952). Similarity judgments, or any other measure of pairwise proximity, can be used to represent similarity relations between different entities, which are visualized in an n-dimensional space. The similarity between two objects is inversely related to the distance between them in this space, which can be understood as a topological representation of object properties. This perceptual space contains information about how many dimensions humans perceive, whether or not these dimensions correspond to the physical measures of the different entities, and how important the different physical measures are to humans. More importantly for the present study, MDS yields the topology of the visual and haptic perceptual spaces, which can then be compared to assess similarities and differences between the shape representations formed by the two modalities. In previous studies (Gaissert et al., 2008, Gaissert et al., 2010), we performed a detailed analysis of the topology of visual and haptic perceptual spaces for complex three-dimensional objects and found highly similar perceptual spaces for the two modalities. We thus concluded that similarity is processed very consistently in the two modalities. Moreover, we found that both perceptual spaces resemble the underlying physical object space surprisingly well, supporting the notion that both modalities allow for an almost veridical representation of the physical parameters of the shape space.
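
For readers who wish to reproduce this kind of analysis, a minimal sketch using scikit-learn's metric MDS is given below. The original analyses were not necessarily performed with this library, and the input file name is hypothetical.

    import numpy as np
    from sklearn.manifold import MDS

    # Hypothetical input: a symmetric n_objects x n_objects matrix of
    # averaged pairwise dissimilarity ratings with a zero diagonal.
    dissim = np.loadtxt("dissimilarities.txt")

    # Embed the objects in a 3D space whose inter-point distances
    # approximate the rated dissimilarities as closely as possible.
    mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
    coords = mds.fit_transform(dissim)
    print("stress:", mds.stress_)   # badness of fit of the embedding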

In such a perceptual space, objects that are perceived as most similar lie close together and are direct neighbors. In his book on representations in object perception, Edelman (1999) posits that for visual exploration of 3D shapes, similar objects form whole neighborhoods within the perceptual space which can, in a veridical space, support categorization (Edelman, 1998, Edelman et al., 1998, Edelman and Duvdevani-Bar, 1997). In the paper presented here, we will first try to replicate this result for visual exploration of our complex shell-shaped objects. More importantly, we will analyze whether this hypothesis also transfers to the haptic modality and, therefore, whether the link between similarity and categorization is of a similar kind in both modalities.
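
A direct way to test this neighborhood hypothesis on data like ours, sketched here under the assumption that category labels and a dissimilarity matrix are given, is to compare the mean within-category dissimilarity with the mean across-category dissimilarity:

    import numpy as np

    def within_vs_across(dissim, labels):
        # Mean within-category vs. mean across-category dissimilarity.
        # If categories form neighborhoods in the perceptual space, the
        # within-category mean should be the smaller of the two.
        labels = np.asarray(labels)
        same = labels[:, None] == labels[None, :]
        off_diag = ~np.eye(len(labels), dtype=bool)   # exclude self-pairs
        return dissim[same & off_diag].mean(), dissim[~same].mean()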

Our research extends earlier results from a visuo-haptic object categorization study (Cooke et al., 2007), which found that a perceptual space reconstructed from similarity ratings can be used to predict categorization parameters. The main limitation of that study, however, was that the parametrically-defined objects consisted of simple, geometric shapes. Furthermore, these objects varied in only two dimensions, shape and texture, both of which are very intuitive to the visual as well as to the haptic modality; hence it is perhaps not surprising that both the similarity and the categorization results did, indeed, reflect those two dimensions. Moreover, participants mostly seemed to base their categorization behavior on only one dimension, either shape or texture, instead of integrating the two.

Here we present visual and haptic object categorization of complex, parametrically-defined 3D objects which vary in three arbitrary shape dimensions. With these objects, we analyze the reconstruction quality of visual and haptic perceptual spaces based on similarity ratings. We compare three different categorization tasks: an unconstrained, a semi-constrained, and a constrained categorization task. Finally, we analyze whether neighborhoods within the perceptual spaces can account for the observed categorization behavior.

Section snippets

Material and methods

This section contains a detailed description of the stimuli used in the experiments, the experimental procedures, as well as the multidimensional scaling techniques used to analyze the data.

Similarity ratings

Participants' similarity ratings were converted to dissimilarities and averaged across the three blocks performed. To assess the consistency across participants, the average dissimilarity matrix of every participant was correlated with the dissimilarity matrix of every other participant. The consistency within the haptic modality was slightly but significantly lower than within the visual modality (visual: Rmin = 0.819, Rmax = 0.9, Rmean = 0.861, SEM = 0.003; haptic: Rmin = 0.749, Rmax = 0.893, R…
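
This consistency analysis can be reconstructed as follows (a sketch; the exact correlation procedure used is an assumption): vectorize each participant's dissimilarity matrix to its upper triangle and correlate all pairs of participants.

    import numpy as np
    from itertools import combinations

    def consistency(matrices):
        # Pairwise Pearson correlations between participants' dissimilarity
        # matrices. Only the upper triangle is used, since the matrices are
        # symmetric with a zero diagonal.
        iu = np.triu_indices(np.asarray(matrices[0]).shape[0], k=1)
        vecs = [np.asarray(m)[iu] for m in matrices]
        rs = [np.corrcoef(a, b)[0, 1] for a, b in combinations(vecs, 2)]
        return min(rs), max(rs), float(np.mean(rs))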

Discussion

A set of complex but natural 3D shapes was generated, spanning a three-dimensional object space. Using these objects, we collected similarity ratings visually and haptically and visualized the resulting perceptual spaces. Overall, we found a high correlation between visual and haptic similarity perception. Moreover, the haptic modality recovered the structure of the underlying physical object space astonishingly well. The good performance of the haptic modality is especially astonishing when compared to…

Acknowledgments

The authors would like to thank Isabelle Bülthoff as well as two anonymous reviewers for helpful comments on the manuscript.

This research was supported by a PhD stipend from the Max Planck Society. Part of this research was also supported by the World Class University (WCU) program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology (R31-2008-000-10008-0).

References (60)

  • F.G. Ashby et al. On the dangers of averaging across subjects when using multidimensional-scaling or the similarity-choice model. Psychological Science (1994)
  • I. Borg et al. Modern multidimensional scaling (2005)
  • D.H. Brainard. The psychophysics toolbox. Spatial Vision (1997)
  • E.W. Bushnell et al. Children's haptic and cross-modal recognition with familiar and unfamiliar objects. Journal of Experimental Psychology: Human Perception and Performance (1999)
  • N.R. Carlson. Physiology of behavior (2004)
  • T. Cooke et al. Characterizing perceptual differences due to haptic exploratory procedures: An MDS approach
  • T. Cooke et al. Multidimensional scaling analysis of haptic exploratory procedures
  • T.F. Cox et al. Multidimensional scaling (2001)
  • L.J. Cronbach. Essentials of psychological testing (1990)
  • F. Cutzu et al. Faithful representation of similarities among three-dimensional shapes in human vision. Proceedings of the National Academy of Sciences of the United States of America (1996)
  • P.T. Do. Learning, retention, and generalization of haptic categories (2009)
  • S. Edelman. Representation is representation of similarities. The Behavioral and Brain Sciences (1998)
  • S. Edelman. Representation and recognition in vision (1999)
  • S. Edelman et al. Effects of parametric manipulation of inter-stimulus similarity on 3D object categorization. Spatial Vision (1999)
  • S. Edelman et al. A model of visual recognition and categorization. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences (1997)
  • S. Edelman et al. Toward direct visualization of the internal shape representation space by fMRI. Psychobiology (1998)
  • D.R. Fowler et al. Modeling seashells. ACM Transactions on Computer Graphics (1992)
  • N. Gaissert et al. Analyzing perceptual representations of complex, parametrically-defined shapes using MDS. Haptics: Perception, Devices and Scenarios, Proceedings of the EuroHaptics Conference (2008)
  • N. Gaissert et al. Visual and haptic perceptual spaces show high similarity in humans. Journal of Vision (2010)
  • E.B. Goldstein. Sensation and perception (2007)