Research Approach and Philosophy
Our Approach.
Research in the VEMI Lab investigates how the encoding of spatial information from different inputs can serve as a common framework for learning about, representing, and acting in the world, and how these spatial behaviors can be supported by the development and implementation of multimodal interface technologies.
Experiments in our lab are conducted using real-world layouts, virtual environments (VEs), and augmented reality (AR). We employ a number of methodological approaches in our research, often combining techniques from Psychophysics, Experimental Psychology, and Cognitive Neuroscience with principles from Human-Computer Interaction and Human Factors Engineering. Through collaborations with colleagues at UC Santa Barbara and the University of Edinboro, we have also contributed to projects using neuroimaging techniques (fMRI) to address the neural substrates of multimodal spatial information processing.
For more on our research and specific experiments, check out our Research Interests, Publications, and Current Projects pages. To learn more about virtual and augmented reality and other specialized equipment in the lab, have a read of our Lab Resources pages.
Our Philosophy.
We adopt a research philosophy that good science should strive to be both theoretically motivated and functionally relevant. As such, our studies combine basic questions about how humans learn, represent, and act in space with applied efforts toward the development of navigational technologies to support these endeavors.
Underlying Theory.
Although all of our senses encode spatial information (with hearing, touch, and vision being the three primary spatial modalities), the vast majority of research on spatial cognition and interface design considers only the role of vision. By considering the world through the lens of one modality, we miss much about how we actually perceive it, namely through a perceptual mosaic built up from information derived from multiple modalities. In other words, we do not perceive the world through just one sense, as is often the emphasis of scientific study; we perceive it as the integrated complement of sensory information from many different inputs. Studying the information-processing characteristics and representational structure of multiple channels of sensory information is inherently more challenging and “noisy” than focusing all one’s attention on elaborating these matters for just one. To be effective, it is critical to have a multi-level appreciation of the input(s) of interest and a good understanding of the sensory translation rules between inputs. For example, you cannot simply convert the information on a visual map to a tactile map and expect it to make sense. In this instance, as in most cases involving multimodal information processing, the more you know about the physiology of the sensory organs of interest, channel dynamics, cortical projection, sensory psychophysics, perceptual biases, memory capacity, etc., the better you are able to make valid comparisons and translations between inputs. Thus, given the lab’s core interest in multimodal spatial information research, we adopt a holistic perspective on spatiality, one that focuses on common spatial information content, representation, mental computation, and behavior rather than on the specific sensory conduit of this spatial information.
Our view of spatial information processing as a common denominator between the senses has the advantage of reducing the tendency to over-emphasize the role of one sensory channel, namely vision, with respect to other spatial senses. This perspective is also congruent with our phenomenology of the world—that is, we perceive our environment as a unified multimodal experience, not as discrete unimodal events.
Extending this philosophy to our research in the lab, an important premise motivating our work is that many of the tasks traditionally performed with vision, and many of the interface technologies traditionally designed around it, can also be supported by other senses. Indeed, we argue that much of what most people consider “visual” is actually “spatial”. Take a look around you: how much of what you see would you consider purely visual? Color is certainly in this camp, but we believe that far more of the information perceived about the world is common across the senses than is sensory-specific. Spatial information is one example of a common thread that is specified, often redundantly, by multiple inputs in our nervous system. Temporal coding and emotional valence are other examples.
There are many examples of this commonality from the spatial domain. For instance, geometric properties such as 3-D structure and the relations between surfaces, lines, and edges are not tied to any one modality; indeed, such information can be equivalently represented, transformed, and acted upon from multiple input channels. Likewise, distance and direction cues can be conveyed quite accurately through multiple input sources. The same holds for the specification of relations (e.g., foreground-background, subject-object, object-object, and their associated transformations). For instance, the relation between a perceiver and their surrounding environment (i.e., an egocentric perspective), the relations between elements in the environment (i.e., an allocentric perspective), and the ability to calculate how these attributes change as the perceiver moves (i.e., spatial updating) can all be specified based on information derived from multiple modalities. It is one’s knowledge and computation of spatial structure that makes interpretation and processing of these relations possible, not sensory-specific input such as visual information, as is commonly assumed. In sum, all of the information just discussed can be similarly specified through any number of spatial inputs and thus, we argue, is best considered in terms of its underlying spatial properties rather than its sensory-specific medium of conveyance. For example, feeling or seeing the edge of your desk provides the same information content and supports the same spatial computation of “edgeness” for both touch and vision. This is not to say that there are no differences between the sensory-specific conduits of this information; such a claim would obviously be erroneous. But treating spatial information as sensory-specific (e.g., calling it visual) is equally fallacious.
Although vision may be our best conduit of spatial information, it does not have a monopoly on space. The guiding tenet of our research is that when information is matched between inputs during encoding, all spatial modalities (vision, touch, audition, and spatial language) can support a similar level of spatial behavior. This assertion is predicated on the view that separate inputs develop into a common representation in working memory (called the spatial image) which is independent of the source modality and functions equivalently in supporting action (called the Functional Equivalence hypothesis). These theories are strongly influenced by the pioneering work in this area by Jack Loomis (UCSB), Roberta Klatzky (CMU), and their colleagues.
Applications.
The practical consequence of a common spatial representation is that, assuming appropriate information is available, equivalent performance can be obtained when learning from spatial displays based on different sensory inputs. This opens the door to the development of non-visual interfaces (haptic, 3-D spatialized audio, and dynamically updated verbal information) to support spatial behavior that is traditionally considered to be mediated solely by information delivered via visual displays. Given the importance of redundant spatial processing in our nervous system, and the clear advantages to the end-user of providing redundant modes of information (e.g., soliciting attention, reducing cognitive load, increasing memory capacity, and facilitating multi-tasking, to name a few), our interest in the research and development of non-visual and multimodal displays has broad application to many users and task domains. Our primary design focus combines research from the spatial cognition and human-computer interaction fields, with the goal of specifying information requirements and design principles for multimodal spatial displays for real-time indoor navigation systems.
More about our basic and applied research programs, and the bidirectional exchange of information that motivates both, can be found on our Research Interests and Current Projects pages.