Research Interests

Research overview.
Spatial behavior is one of the most fundamental aspects of our existence, yet most people have never considered what information they use to navigate their environment or how they use this information for effective decision-making. Think about the last time you walked around the mall, toured a new city, or needed to find your gate at the airport. Assuming clear signage, these spatial activities were likely performed with little thought or effort, but how did you do it? What information was most useful and how did you actually use these cues to support your behavior? The answers to such questions are surprisingly difficult to articulate. These tasks are not a conscious process for most people; they are simply the product of an effective visually-guided perceptual-motor coupling. However, when visual information is not available, the automaticity of your behavior likely changes dramatically. For instance, now imagine that you must perform the same tasks wearing a blindfold. Most people would find the prospect of navigating complex environments like malls, cities, or airports without vision incredibly daunting, if not completely incomprehensible. How would you avoid obstacles, know what was around you, figure out where to go, recognize landmarks, read signs, stay oriented, and build up a mental representation of space as you walked around? Research in our lab studies such issues by comparing similarities and differences in spatial information processing between visual and non-visual input modalities for learning, representing, and behaving in the world.
- Multimodal spatial cognition.
Our research interests bridge several domains, theoretical and applied, but the core of our program is linked by a fundamental interest in what we call multimodal spatial cognition (MSC). MSC deals with topics such as spatial learning and navigation from different sensory inputs, the effects of multimodal and cross-modal interactions on the mental representation of space, and a comparison of spatial computations, spatial problem solving, and spatial behavior between different information sources. The traditional notion of spatial cognition, broadly defined, refers to how we think about and represent the structure, entities, and relations of space. Important in this endeavor is how we learn and act in our environment. In most cases, spatial cognition research is synonymous with visual spatial cognition, as the vast majority of work in this area deals with vision as the means of accessing, remembering, and acting in the world. This visuocentric focus is not without merit: compared to its sister senses, the visual system is exquisitely tuned for conveying spatial information by providing distal access to the environment via a large field of view, using parallel information processing, and making use of a large bandwidth “pipe” to the brain. However, to repeat one of the core conceptual refrains of this site (and our research), vision does not hold all the spatial cards; all of our senses encode spatial information to one degree or another. The aim of multimodal spatial cognition is to make comparisons between different combinations of the primary spatial senses of vision, touch, and 3-D spatialized audio (where objects are heard as coming from a specific location in space), as well as non-perceptual inputs such as spatial language (descriptive terminology such as left-right, front-back, etc.) in supporting complex spatial operations.
Although your phenomenology may suggest otherwise, vision is not necessary for successful spatial behavior. Good evidence comes from the superior navigational abilities of blind animals and humans alike, and from accurate performance by sighted folks on many common tasks done without visual support (e.g. your ability to walk from your bedroom to the bathroom in the middle of the night). When one considers that much of what is perceived through vision is spatial, and that audition and touch convey many of the same spatial properties as vision (e.g. position, direction, configuration, relation, and the like), the ability to accurately perform spatial behaviors without vision is not surprising. What remains unknown is how far the envelope can be pushed for non-visual inputs to specify information and support tasks generally performed using vision. This high-level theoretical question motivates much of our research.
Another thought experiment provides a rough illustration of some of the questions addressed by our research in multimodal spatial cognition. Imagine that you are asked to learn four different maps of equal size and complexity but each must be learned from a different spatial rendering condition. Let’s say that with the first map you must learn the space by hearing what is around you (spatial auditory information), with the second map you feel what is around you (haptic information), with the third map you see what is around you (visual information), and with the fourth map you receive verbal descriptions of your surroundings (spatial language information). Your task is to learn each map and find the best route from a given start position to a pub which is equidistant from all four starting points. Assume that all maps provide access to the requisite landmarks, distances, and turn angles to find the target and acquire metric and topological knowledge. Do you think that you would be able to find the route and learn the map equally well from each of the four presentation modes? Would you use the same exploratory strategies for each? Would your mental representation of the map be based on the modality used at learning (e.g., audition, touch, vision, or language) or would it be independent of sensory-specific information (e.g., based on amodal spatial information)? If you believe there is a fundamental difference between inputs, what is the critical information that is, or is not, available between the conditions? Could inputs that poorly convey this critical information be “augmented” or “supplemented” such that their information content would be as useful as others for supporting spatial behaviors?
Finally, thinking beyond this map example, do you think parameters like scale (large or small scale spaces), environment type (indoor or outdoor spaces), or information content (quantity and complexity of the space) affect a space's learnability, memorability, or navigability as a function of the input modality? Such questions are at the heart of much of our research in the lab. We use various techniques to address these issues, including psychophysical, cognitive, and usability paradigms incorporating physical, virtual, augmented, and mixed reality environments.
In a recent project similar in spirit to the hypothetical map experiment just discussed, we found that almost all participants subjectively reported that learning a map using vision was significantly easier and more natural than hearing verbal descriptions or feeling a tactile rendering of the space. One might reasonably conclude from this intuition that their learning would not be the same for maps rendered from these three inputs and that visual apprehension should yield the best test performance. Interestingly, even when people self-reported that they did badly using non-visual modes of environmental access, their data didn’t “agree”. Indeed, our findings from this research demonstrated that their test performance on wayfinding tasks was almost identical between all three conditions, suggesting the building up and accessing of functionally equivalent cognitive maps. These results are consistent with various other multimodal projects showing similar findings: people may be unfamiliar with spatial tasks performed without vision, and often doubt their abilities and accuracy, but in actuality are quite adept at using these information sources for carrying out all manner of tasks in an equivalent manner to vision (see the next section for a discussion).
- Functional equivalence.
Underlying our interest in multimodal spatial cognition is the theory of functional equivalence of spatial representations. The hypothesis is that when information is matched between inputs, learning from separate encoding modalities can build up into a common spatial representation in memory (called the spatial image in working memory or the cognitive map in long-term memory) which functions equivalently in supporting spatial behaviors, irrespective of the information source (although learning time may differ between inputs). A growing body of corroborating evidence, including work from our lab, demonstrates that information learned from different modalities leads to functionally equivalent test performance for a host of spatial tasks, including spatial updating, orienting, and wayfinding behavior (see our Publications and Spatial Image project page for specific research). Our theoretical interest is independent of specific combinations of inputs or tasks, focusing instead on what situations will and will not induce equivalence between input modalities and investigating the structure of the ensuing spatial representations (and associated neural substrates) mediating this behavior.
There are at least three theoretical explanations for functionally equivalent behavior:
- Separate but equal hypothesis: modality-specific representations existing in parallel.
- Recoding hypothesis: all inputs are converted into a visual representation.
- Amodal hypothesis: separate inputs lead to a common “spatial” representation not tied to any input source. Our research results favor this third explanation.
Our research on functional equivalence and the development of common spatial representations is supported by a National Institutes of Health (NIH) grant entitled “Spatial Images from Vision, Touch and Hearing in Sighted and Blind”. This work is being done in collaboration with the leading researchers in this area: J.M. Loomis (UCSB) and R.L. Klatzky (CMU). Read here for a brief abstract of our spatial images project or check out our Current Projects page to read more on our work in this area.
- Information Requirements for Environmental Learning and Navigation.
Much of the work in the spatial cognition literature involves tasks where people learn an arrangement of target locations, pre-defined routes, or small (laboratory-sized) layouts. While all manner of interesting spatial operations can be studied with these experimental stimuli and tasks, natural spatial behavior generally occurs in a very different context, e.g. in real-world settings which are spatially extended, not limited to a few experimental objects, and subject to myriad sources of ambient variability. There are a growing number of studies that use large-scale environments, either virtual or physical, but most of this work is still concerned with routes rather than learning the space as a whole (environmental learning) or spatial inference (e.g., determining shortcuts, detours, straight-line distances between off-route landmarks, etc.). One line of research in the VEMI lab investigates such issues via spatial knowledge acquisition of large-scale indoor and outdoor environments, as well as addressing the transition between these spaces. In addition to performance measures such as speed and accuracy of executing a route or finding a destination, many of our studies also address performance of more complex spatial tasks such as cognitive map development, spatial inference making, spatial updating (keeping track of your position in space as you move), and the like. Beyond test performance, we are also interested in the far less studied domain of human spatial learning behavior. This research generally uses a free exploration (open search) paradigm instead of a directed search (route-based) design. Areas of interest include: how well people learn environments as a whole (form global representations), what exploratory patterns and decision-making processes they use, what learning strategies they employ, and how they perform with spatial uncertainty. These factors all provide important insight into what and how information is used during the learning process.
We study environmental learning and wayfinding behavior between blind, blindfolded-sighted and sighted participants, with a growing interest in aging and how spatial performance changes at different age brackets. Comparison of these participant groups facilitates understanding of how access to different sources of spatial information and use of different learning strategies affect performance as a function of real or simulated sensory loss and natural lifespan development.
- Multimodal interfaces for real-time navigation systems.
A practical outcome of the building up and accessing of functionally equivalent representations is that, assuming provision of the appropriate information, different sensor technologies, multimodal interfaces, and spatial displays could support the same level of spatial behaviors as vision. Our goal is to identify a core set of sensory-independent spatial primitives for indoor and outdoor environments that support complex spatial behaviors, irrespective of the input channel. These spatial primitives are at the heart of our research and development of all navigation interfaces: visual, auditory, haptic, language-based, and multimodal. Although we are interested in the hardware and software used in such displays, our primary focus is on the content and presentation of spatial information: determining the minimal information requirements and best delivery methods supporting the highest level of environmental learning and navigation performance. These results are critical for understanding how multimodal spatial cognition is affected by the availability of different sources of information, and they are used to establish specifications for the design of visual and non-visual interfaces alike which support a similar level of performance across a range of common spatial behaviors and end-user groups.
Although we study all manner of spatial layouts, our work concentrates on indoor navigation using both real and virtual environments (most recently, we have also included augmented reality in our toolkit). Compared to outdoor travel, indoor navigation is aided by far less information from the environment and by fewer orienting cues and external aids (such as maps or GPS). As a result, spatial learning and wayfinding of indoor spaces can pose some particularly difficult challenges. Check out the Indoor Wayfinding page for more on this surprisingly vexing issue or the information on our Current Projects page for more about our funded projects and proposed solutions.
Other Research Interests in the Lab.
We are interested in all manner of topics that lie at the intersection of spatial cognition and multimodal input. Some of our other interests include: creative uses of multimodal virtual environment technology (MVET) for spatial knowledge acquisition and application, using augmented reality technology to "enhance" salient features in the environment for low-vision users, cross-modal brain plasticity, development and usability testing of assistive technology providing non-visual access to environmental information, universal design, lifespan spatial abilities, and health informatics and aging as they relate to the development of gerontechnology for spatial learning and navigation with age-related vision loss.
More about our research can be found on our Current Projects and Philosophy pages. Selected articles and accompanying comments can be downloaded from our Publications page. To get additional information about the VEMI Lab, our team, or our colleagues, check out our Lab Resources, Personnel, and Collaborators pages.