Virtual reality head-mounted displays (VR-HMDs) can immerse individuals in a variety of virtual environments while tracking head orientation to update the virtual view. VR-HMDs also allow users to explore environments while maintaining different body positions (e.g., sitting and lying down). How discrepancies between real-world body position and the virtual environment affect the perception of virtual space, or how a visual upright paired with incongruent changes in head orientation affects space perception within VR, has not been fully defined. In this study we sought to better understand how changes in head-on-body orientation (lying supine, lying prone, lying on the left side, and being upright), while a steady visual virtual upright is maintained, affect the perception of distance. We used a new psychophysical, perceptual-matching-based approach with two probe configurations (“L” and “T” shaped) to extract distance perception thresholds in the four aforementioned positions at egocentric distances of 4, 5, and 6 virtual meters. Our results indicate that changes in observer orientation with respect to gravity affect the perception of distance within a virtual environment that is maintained at a visual upright. We found significant differences between perceived distances in the upright condition and those in the prone and lying-on-left-side positions. Additionally, distance perception was affected by probe configuration. Our results add to a body of work on how changes in head-on-body orientation can affect the perception of distance, while stressing that more research is needed to fully understand how these changes with respect to gravity affect the perception of space within virtual environments.
Many studies have found that people judge distances accurately in the real world but underestimate them in virtual reality (VR). This discrepancy negatively impacts some VR applications. Direct blind walking is a popular method of measuring distance judgments in which participants view a target and then walk to it while blindfolded. To ensure that participants are comfortable with blindfolded walking, researchers often have them practice blind walking beforehand. We call this practice “pre-experiment blind walking” (PEBW). Few studies report details of their PEBW procedure, and little research has examined how PEBW might affect subsequent distance judgments. This between-participants study varied the amount of PEBW and then had participants perform distance judgments in VR. The results show that a longer PEBW leads to less distance underestimation. This work demonstrates the importance of clearly reporting PEBW procedures and suggests that a consistent procedure may be necessary to reliably compare direct blind walking studies.
Generative adversarial networks (GANs) produce high-dimensional vector spaces (latent spaces) that can interchangeably represent vectors as images. Advances have extended their ability to generate images indistinguishable from real images, such as faces, and, more importantly, to manipulate images via their inherent vector values in the latent space. This interchangeability of latent vectors makes it possible to compute not only distances in the latent space, but also human perceptual and cognitive distances between images, that is, how humans perceive and recognize images. However, it is still unclear how distance in the latent space correlates with human perception and cognition. Our studies investigated the relationship between latent vectors and human perception or cognition through psycho-visual experiments that manipulate the latent vectors of face images. In the perception study, a change perception task examined whether participants could perceive visual changes in face images before and after moving an arbitrary distance in the latent space. In the cognition study, a face recognition task examined whether participants could still recognize a face as the same after moving an arbitrary distance in the latent space. Our experiments show that the distance between face images in the latent space correlates with human perception and cognition of visual changes in face imagery, and that this relationship can be modeled with a logistic function. Using our methodology, it will be possible to convert between distance in the latent space and a metric of human perception and cognition, potentially leading to image processing that better reflects human perception and cognition.
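The abstract above reports that the link between latent-space distance and human change detection can be modeled with a logistic function. The sketch below illustrates what such a fit could look like; the data values, the two-parameter form (threshold d0 and slope k), and the use of scipy are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

# Logistic psychometric model: probability of perceiving a change
# as a function of the distance d travelled in the GAN latent space.
def logistic(d, d0, k):
    return 1.0 / (1.0 + np.exp(-k * (d - d0)))

# Hypothetical data: latent-space distances and the fraction of
# trials in which participants reported seeing a visual change.
distances = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
p_change  = np.array([0.05, 0.15, 0.40, 0.70, 0.90, 0.97])

# Fit the threshold d0 (the 50% point) and the slope k.
(d0, k), _ = curve_fit(logistic, distances, p_change, p0=[1.5, 2.0])
print(f"perceptual threshold d0 = {d0:.2f}, slope k = {k:.2f}")
```

With such a fit in hand, a latent-space distance can be mapped to a predicted detection probability, or inverted to find the distance at which a given fraction of observers would notice a change.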
Research suggests that racial and gender stereotypes can influence emotion recognition accuracy in both adults and children. Stereotypical biases have severe consequences in social life and are especially critical in domains such as education and healthcare, where virtual humans are seeing growing application. In this work, we explore potential perceptual differences in the facial emotion recognition accuracy of virtual humans of different genders, races, and ages. We use realistic 3D models of male/female, Black/White, and child/adult characters. Using blendshapes and the Facial Action Coding System, we created videos of the models displaying facial expressions of six universal emotions at varying intensities. We ran an Amazon Mechanical Turk study to collect perceptual data. The results indicate statistically significant main effects of emotion type and intensity on emotion recognition accuracy. Although overall emotion recognition accuracy was similar across model race, gender, and age groups, there were some statistically significant effects across groups for individual emotion types.
The visualization of spaces, both virtual and built, has long been an important part of the environment design process. Industry tools for visualizing occupancy have grown from simple drop-in stock photos added after design to real-time crowd simulations. However, while the treatment of visualization and collaborative design processes has long been discussed in the HCI and architecture communities, these inclusive design methods are rarely seen in architecture education (e.g., the studio) or practice, nor are they reflected in licensure requirements, leaving designers to reason about future occupants on their own. While there are strong indicators that visualization modality and rendering style affect the perception of scale and space, little has been explored regarding how we represent the human form within these design tools and practices. We present findings from a novel online interactive space planning and estimation study that examines the effects of three common building visualization modalities in the design process, crossed with three human form modalities drawn from the architecture literature. Results indicate that the type of visualization changes the number of occupants estimated, and that designers prefer manikins integrated within building models when estimating space usage, although their acceptance was equally divided between 2D and 3D. Our findings lay the foundation for new, focused design tools that integrate human form and factors at building scale.
Shinrin-yoku, also known as forest bathing, is a nature immersion practice that has been shown to have restorative effects on mental health. Recently, applications of shinrin-yoku in virtual reality (VR) have been investigated as a means of providing similar mental health benefits to people who do not have direct access to nature. These applications have shown similar health benefits, although not to the extent of real nature. The factors that make VR nature immersion effective have received little research attention to date. This paper investigates the Biophilia Hypothesis in the context of a VR-based nature immersion experience. Twenty-six participants were immersed in a computer-generated virtual natural environment that was either high in biomass (forest) or devoid of biomass (canyon) after experiencing an arithmetic stressor task. We compared multiple restorative outcomes between the high- and low-biomass groups, as well as preference ratings for real and virtual high- and low-biomass scenes among all participants. Our results call for further investigation into the data trends we observed.
Our aim is to develop a better understanding of how the Point of Release (PoR) of a ball affects the perception of animated throwing motions. We present the results of a perceptual study in which participants viewed animations of a virtual human throwing a ball, with the point of release modified to be early or late. We found that errors in overarm throws with a late PoR are detected more easily than those with an early PoR, while the opposite is true for underarm throws. The viewpoint and the distance the ball travels also affect perceived realism. The results of this research can help improve the plausibility of throwing animations in interactive applications such as games or VR.
The study of event perception emphasizes the importance of visuospatial attributes in everyday human activities and how they influence event segmentation, prediction, and retrieval. Attending to these visuospatial attributes is the first step toward event understanding, and correlating attentional measures with such attributes would therefore further our understanding of event comprehension. In this study, we focus on attentional synchrony, among other attentional measures, and analyze select film scenes through the lens of a visuospatial event model. We present the first results of an in-depth multimodal (e.g., head turns, hand actions) visuospatial analysis of 10 movie scenes correlated with visual attention (eye tracking of 32 participants per scene). With these results, we tease apart event segments of high and low attentional synchrony and describe the distribution of attention in relation to the visuospatial features. This analysis gives us an indirect measure of the attentional saliency of a scene with a particular visuospatial complexity, which ultimately directs the attentional selection of observers in a given context.
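The abstract does not specify how attentional synchrony is quantified; one common proxy in eye-tracking research is the frame-by-frame spatial dispersion of gaze points across viewers. The sketch below is purely illustrative and assumes synthetic gaze data in normalized screen coordinates.

```python
import numpy as np

# Hypothetical gaze data: (participants, frames, 2) normalized screen
# coordinates for 32 viewers watching the same 600-frame scene.
rng = np.random.default_rng(0)
gaze = rng.normal(loc=[0.5, 0.5], scale=0.1, size=(32, 600, 2))

def attentional_synchrony(gaze: np.ndarray) -> np.ndarray:
    # For each frame, measure how tightly the gaze points cluster:
    # the mean distance of viewers' gaze to the frame's centroid.
    # Lower spread means higher synchrony, so map spread to (0, 1].
    centroid = gaze.mean(axis=0, keepdims=True)                     # (1, frames, 2)
    spread = np.linalg.norm(gaze - centroid, axis=-1).mean(axis=0)  # (frames,)
    return 1.0 / (1.0 + spread)

sync = attentional_synchrony(gaze)
print(sync.shape, sync[:5])  # one synchrony value per frame
```

Thresholding such a per-frame signal is one way to separate event segments of high and low attentional synchrony, as described above.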
Translucent materials are ubiquitous in our daily lives, from organic materials such as food, liquids, or human skin, to synthetic materials like plastic or rubber. In these materials, light penetrates the surface and scatters in the medium before leaving it. While the physical phenomena responsible for translucent appearance are well known, understanding how human observers perceive this type of material is still an open problem: the appearance of translucent objects is affected by many dimensions beyond the optical properties of the material, including shape and illumination. In this work, we focus on the effect of illumination on the appearance of translucent materials. In particular, we analyze how static and dynamic illumination affect the perception of translucency. Previous studies have shown that changing the illumination conditions results in a constancy failure, especially in media with anisotropic phase functions. We extend this line of work and analyze whether motion can alleviate such constancy failure. To do so, we ran a psychophysical experiment in which users matched the optical density of a reference translucent object under both dynamic and static illumination. Surprisingly, our results suggest that in most cases light motion does not affect the perceived density of the translucent material. Our findings can have implications for material design in predictive rendering and authoring applications.
Through language, people convey not only pure semantics but also information about themselves, such as age, gender, state of mind, or health. The supralingual features that carry this information have long been a subject of research. Various procedures have been proposed to remove unneeded semantics from speech recordings in order to study supralingual information in natural speech. In this paper, we propose a new method for removing semantics based on erosion, a morphological operator, and compare its effectiveness to different state-of-the-art methods. As established methods, we consider two low-pass filters with cut-off frequencies of 450 Hz and 1150 Hz, and Brownian noise. As a newer method, we investigate a filter for spectro-temporal frequencies. To evaluate each method, appropriately processed recordings were presented to a group of participants in a perceptual experiment, and intelligibility was measured by means of the Levenshtein distance. Our results show that erosion by itself performs similarly to the established methods, while a combination of erosion and low-pass filtering outperforms all other methods.
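As a minimal sketch of the reported intelligibility measure, the snippet below computes the Levenshtein (edit) distance between a reference sentence and a participant's transcription; the length normalization into a 0-to-1 score is an assumption, since the abstract does not specify one.

```python
# Classic dynamic-programming Levenshtein (edit) distance.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution
            ))
        prev = curr
    return prev[-1]

def intelligibility(reference: str, response: str) -> float:
    # 1.0 = perfect transcription, 0.0 = nothing recovered
    # (normalization by the longer string is an assumed convention).
    d = levenshtein(reference, response)
    return 1.0 - d / max(len(reference), len(response), 1)

print(intelligibility("the cat sat on the mat", "the hat sat on a mat"))
```

The lower the residual intelligibility of the processed recordings, the more effectively a method has removed the semantic content while leaving the rest of the signal for study.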