In this study, we proposed an exergame to reduce resistance to exercise and maintain motivation in adolescents with orthostatic dysregulation. We created a 2D side-scrolling action game in Unity to be played on smartphones. The game is synchronized with exercises that can be done in a lying posture. We analyzed the effect of the game on the feelings of the participants. Our experiments showed that the use of the exergame has a positive effect on the participants’ emotions during exercise.
When learning a dexterous skill such as playing the piano, people commonly watch videos of a teacher. However, this conventional approach has downsides, such as the limited information that can be retrieved and the lack of intuitive instruction. We propose a virtual training system that visualizes differences between hands to provide intuitive feedback for skill acquisition. After synchronizing the data, we propose two visual cues: a hand-overlay visualization and a two-keyboard visualization. A pilot study confirms the superiority of the proposed methods over conventional video viewing.
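To make the feedback step concrete, the sketch below computes per-joint differences between synchronized teacher and learner hand poses, as could drive a hand-overlay visualization; this is a minimal illustration of ours, with all array shapes and names assumed, not the authors' implementation.

```python
# Minimal sketch (assumed shapes): per-joint distance between two
# time-synchronized hand-pose sequences, usable to color an overlay.
import numpy as np

def joint_differences(teacher, learner):
    """teacher, learner: (T, J, 3) arrays of synchronized 3D joint positions."""
    return np.linalg.norm(teacher - learner, axis=-1)  # (T, J) per-joint error

# Example: normalize errors into [0, 1] to drive overlay opacity or color.
errors = joint_differences(np.zeros((100, 21, 3)), np.ones((100, 21, 3)))
opacity = np.clip(errors / errors.max(), 0.0, 1.0)
```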
When an object breaks, simulating the evolution of fracture under artist control while maintaining physical realism and plausibility is a challenging problem due to the complex material properties of real-world objects. In this work, we present impurity maps as a way to guide fracture paths for both brittle and ductile fracture. We develop a novel probabilistic damage mechanics model for fracture in materials with impurities, using a random-graph formulation in conjunction with graph-based FEM. An artist-created map allows us to selectively distribute impurities in the material, weakening the object in the specific regions where imperfections are added. During simulation, the presence of impurities guides the developing cracks so that the fracture pattern closely follows the impurity map. We simulate artist-controlled fractures on different materials to demonstrate the potency of our method.
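As a simple illustration of how an artist map could seed impurities, the sketch below samples impurity sites with probability proportional to map intensity; this is a hedged example of ours, not the paper's random-graph damage formulation.

```python
# Illustrative sketch: treat a grayscale artist map as a density and sample
# impurity sites from it (the paper's actual probabilistic model differs).
import numpy as np

def sample_impurities(impurity_map, n_sites, rng=np.random.default_rng(0)):
    """impurity_map: (H, W) array in [0, 1]; brighter = weaker material."""
    p = impurity_map.ravel() / impurity_map.sum()
    idx = rng.choice(impurity_map.size, size=n_sites, p=p)
    ys, xs = np.unravel_index(idx, impurity_map.shape)
    return np.stack([xs, ys], axis=1)  # pixel coordinates of impurity sites

sites = sample_impurities(np.random.rand(256, 256), n_sites=500)
```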
This paper presents an audio-to-animation synthesis method for violin performance. This new approach provides a fine-grained violin performance animation using information on playing procedure consisting of played string, finger number, position, and bow direction. We demonstrate that our method is capable of synthesizing natural violin performance animation with fine fingering and bowing through extensive evaluation.
An increased interest in public motion capture data has allowed for the use of data-driven animation algorithms based on neural networks. While motion capture data is increasingly accessible, data sets have become too large to sort through manually. Similarity metrics quantify how different two motions are; they can be used to search databases much faster than manual searches and to train neural networks. However, the most popular similarity metrics are not informed by human perception, so motions that are not perceptually similar may still be labeled as such by these metrics. We conducted an experiment with hand motions to identify how large the differences between human perception and common similarity metrics are. Participants watched two animations of hand motions, one altered and the other unaltered, and scored their similarity on a 7-point Likert scale. In our comparisons, we found that none of the tested similarity metrics correlated with the human-judged similarity scores.
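For illustration, the sketch below pairs one common geometric similarity metric (mean joint-position error; our assumption of a representative metric, not necessarily one the study tested) with a Spearman rank correlation against Likert ratings, which is a standard way to quantify such a mismatch.

```python
# Illustrative sketch: a non-perceptual similarity metric compared against
# human Likert ratings via Spearman rank correlation (shapes are assumed).
import numpy as np
from scipy.stats import spearmanr

def joint_position_error(motion_a, motion_b):
    """motion_*: (T, J, 3) joint positions; returns a scalar distance."""
    return np.linalg.norm(motion_a - motion_b, axis=-1).mean()

# metric_scores: one distance per altered/unaltered pair;
# likert_scores: mean 7-point human rating per pair (dummy data here).
metric_scores = np.random.rand(30)
likert_scores = np.random.randint(1, 8, size=30)
rho, pval = spearmanr(metric_scores, likert_scores)
```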
The goal of our research is to control a physics-based character that learns several dynamic motor skills using conditional generative adversarial imitation learning (GAIL). We present a network-based learning algorithm that learns various motor skills, and the transitions between them, from disparate motion clips. The overall framework for our controller is composed of a control policy, which generates the character's behavior, and a discriminator, which induces the policy to produce proper motions from a user's commands. The discriminator and the policy take outputs from each other as input and improve each other's performance through an adversarial training process. Using this system, when a user commands a specific motion, the character can plan a motion to perform it from the current pose. We demonstrated the effectiveness of our approach through examples with an interactive character that learns various dynamic motor skills and follows user commands in a physics simulation.
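The adversarial coupling can be sketched as follows; this is a generic GAIL-style update in PyTorch under assumed dimensions, not the authors' conditional architecture or physics setup.

```python
# Generic GAIL sketch: discriminator separates expert from policy samples;
# the policy's reward comes from fooling it (network sizes are assumptions).
import torch, torch.nn as nn

obs_dim, act_dim = 32, 8
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
disc = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, 1))
opt_d = torch.optim.Adam(disc.parameters(), lr=3e-4)
bce = nn.BCEWithLogitsLoss()

def disc_step(expert_sa, policy_sa):
    # The discriminator learns to separate expert state-action pairs from
    # the policy's; its output shapes the policy's reward signal.
    loss = bce(disc(expert_sa), torch.ones(len(expert_sa), 1)) + \
           bce(disc(policy_sa), torch.zeros(len(policy_sa), 1))
    opt_d.zero_grad(); loss.backward(); opt_d.step()

def imitation_reward(sa):
    # The policy is rewarded for motions the discriminator deems expert-like;
    # any RL algorithm (e.g., PPO) would consume this reward to update it.
    with torch.no_grad():
        return -torch.log(1.0 - torch.sigmoid(disc(sa)) + 1e-8)
```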
We propose a new light source representation to intuitively model complicated lighting effects with simple user interactions. Our representation uses two layers: an emissive layer, which is a traditional diffuse area light source with a constant emission, and a lighting gel layer, which introduces variations to the emission. The lighting gel layer is mapped with a texture to produce colored shadows without modeling a 3D scene. The two layers are transformed independently to cast the texture with different effects. To cast lighting on a planar canvas in 2D design, the proposed light source can be created and edited directly in the 2D canvas, without switching to 3D world space.
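A minimal sketch of the two-layer evaluation follows: constant base emission modulated by an independently transformed gel texture. The function names, the affine transform, and the nearest-neighbor sampling are our assumptions, not the paper's formulation.

```python
# Illustrative sketch: emissive layer (constant emission) modulated by an
# independently transformed lighting gel texture.
import numpy as np

def gel_emission(uv, gel_texture, base_emission, transform):
    """uv: (..., 2) in [0,1]; gel_texture: (H, W, 3); transform: 2x3 affine."""
    uv_h = np.concatenate([uv, np.ones(uv.shape[:-1] + (1,))], axis=-1)
    guv = uv_h @ transform.T                      # gel layer moves independently
    h, w = gel_texture.shape[:2]
    x = np.clip((guv[..., 0] * (w - 1)).astype(int), 0, w - 1)
    y = np.clip((guv[..., 1] * (h - 1)).astype(int), 0, h - 1)
    return base_emission * gel_texture[y, x]      # modulated emission (RGB)
```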
Barrier-grid animation is a primitive animation technique that utilizes an interlaced image with a transparent striped overlay. Ombro-Cinéma, a toy developed in 1921, utilizes this technique effectively with a scroll of graphics. By rotating an attached hand-crank, it presents a scrolling animated story, the narrative of which is often tightly associated with its mechanical structure. This paper describes High-Low Tech Ombro-Cinéma, a prototype device that extends Ombro-Cinéma using an embedded system. Conductive adhesive tape strips are affixed to the back side of a scroll and used to trigger various sonic events to enhance its narrativity. Such an investigation into the reproduction of an obsolete toy with digital technology may provide an interesting opportunity to review the narrative of an obsolete device and to reconsider the effects tangible interfaces have on narratives.
Inclusive Character Creator is a speculative design research project that seeks to address some of the long-standing issues of sexism, racism, ableism, and sizeism prevalent in most 3D character creators in interactive media. The project focuses on stylized and expressive features rather than hyperrealism. It seeks to redefine what it means to start a character from a “default body,” a notion that usually results in a biased system reliant on media norms. A version 1.0 has been built, encapsulating the fundamental principles resulting from our research.
We present ProjecString, a touch-sensitive string curtain projection display that encourages novel interactions: touching, grasping, seeing through, and walking through the display. We embed capacitive-sensing conductive chains into an everyday string curtain, turning it into both a space divider and an interactive display. This take on transforming an everyday object into an interactive projection surface with a unique translucent property creates interactions that are at once immersive and isolating.
Virtual Production fulfills George Lucas's early dream of an engulfing “space-opera in the sky” (1). Epic Games' focus on realistic interactive 3D game environments with Unreal Engine has revolutionized film-making by replacing rear film projections with large-format, curved, high-resolution, immersive LED video screens, allowing backdrops to adapt in real time to the narrative needs of each scene by tracking the movement of the camera. Cinematographers and art directors are adapting to the challenges of virtual and real lighting and props, recruiting animators and new media developers who create, usually in very little time, virtual and real props and metahuman actors and characters, enhancing production value and reducing costs in unparalleled ways. This poster presents the results of the first Virtual Production class offered by the Film Animation and New Media Department at the University of Tampa. In a very short time span, students working in interdisciplinary teams have seen the possibilities of these new technologies for science fiction, fantasy, and experimental films that would otherwise have been impossible to create on very limited student budgets.
When navigating within an unfamiliar virtual environment in VR, transitions between pre-defined viewpoints are known to facilitate a user's spatial awareness. Previously, different viewpoint transition techniques have been investigated, but mainly for single-scale environments. We present a comparative study of zoom-in transition techniques, where a user's viewpoint is smoothly transitioned from a large level of scale (LoS) to a smaller LoS in a multiscale virtual environment (MVE) with a nested structure. We identify that orbiting first before zooming in is preferred over the other alternatives when transitioning to a viewpoint at a small LoS.
Previous work has explored several haptic techniques for simulating the experience of touching virtual objects. However, feedback on an object's weight has been less explored. This paper presents GravityPack, a wearable gravity display that simulates grabbing, holding, and releasing virtual objects using a liquid-based system consisting of pumps, pipes, valves, a water tank, and water packs. The system provides a wide weight range, from 110 g to 1.8 kg, traversable within 40 seconds. Additionally, we design visual feedback for the weight transition that covers the delay of the liquid transfer, and investigate its feasibility in a user study. Along with the design considerations and implementation, the paper discusses the potential of the liquid-based system and of the visualization technique for simulating weight sensations.
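As a back-of-the-envelope illustration of the liquid transfer, the sketch below derives an approximate flow rate from the reported range and transition time and computes the pumping time for a weight change; the constant-rate assumption is ours, not a GravityPack specification.

```python
# Illustrative sketch: pumping time for a target weight change, assuming a
# constant flow rate implied by the reported 110 g - 1.8 kg range in 40 s.
FLOW_RATE_G_PER_S = (1800 - 110) / 40.0  # ~42 g/s (assumed, not measured)

def transfer_time(current_g, target_g, flow_rate=FLOW_RATE_G_PER_S):
    """Seconds of pumping needed to move between two simulated weights."""
    return abs(target_g - current_g) / flow_rate

print(transfer_time(110, 500))  # e.g. ~9.2 s to simulate lifting a 500 g object
```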
We present Immersive-Labeler, an environment for the annotation of large-scale 3D point cloud scenes of urban environments. Our concept is based on the full immersion of the user in a VR-based environment that represents the 3D point cloud scene while offering adapted visual aids and intuitive interaction and navigation modalities. Through a user-centric design, we aim to improve the annotation experience and thus reduce its costs. For the preliminary evaluation of our environment, we conduct a user study (N=20) to quantify the effect of higher levels of immersion in combination with the visual aids we implemented on the annotation process. Our findings reveal that higher levels of immersion combined with object-based visual aids lead to a faster and more engaging annotation process.
We introduce MetaPo, a mobile robot with a spherical display, 360° media I/O, and robotic hands, for creating a unified model of interspace communication. MetaPo works as a portal between pairs of physical-physical, cyber-cyber, and cyber-physical spaces to provide 1) panoramic communication for multiple remote users and 2) immersive interspace migration through its mobility. The paper gives an overview of the MetaPo concept and our first prototype, including its hardware and software implementation.
In this poster, we present a live 360° panoramic-video-based empathic Mixed Reality (MR) collaboration system that shares various near-gaze non-verbal communication cues in real time, including gaze, hand pointing, gesturing, and heart-rate visualisations. The preliminary results indicate that visualising the partner's communication cues close to the gaze point allows users to stay focused, without dividing attention toward the collaborator's physical body movements, while still communicating effectively. Shared gaze visualisations coupled with deictic language are primarily used to affirm joint attention and mutual understanding, while hand pointing and gesturing serve as secondary cues. Our approach provides a new way to enable effective remote collaboration through varied empathic communication visualisations and modalities covering different task properties and spatial setups.
In everyday life, localizing a sound source entails more than the sole extraction of auditory cues to define its position in three dimensions: spatial hearing also takes into account the available visual information (e.g., cues to sound position) and resolves perceptual ambiguities through active listening behavior (e.g., exploring the auditory environment with head movements). We introduce a novel approach to sound localization in 3D, named SPHERE, which exploits a commercially available virtual reality head-mounted display system with real-time kinematic tracking to combine all of these elements: controlled positioning of a real sound source, recording of participants' responses in 3D, controlled visual stimulation, and active listening behavior. SPHERE allows accurate sampling of 3D spatial hearing abilities, and allows the contribution of active listening to be detected and quantified.
We present a compressed volumetric data structure and traversal algorithm that interactively visualizes complete terabyte-scale scientific data. Previous methods rely on heavy approximation and do not provide individual sample-level representation beyond gigabyte scales. We develop an extensible pipeline that makes the data streamable on the GPU using compact pointers and a compression algorithm based on the wavelet transform. The resulting approach renders high-resolution captures under varying sampling characteristics in real time.
Monte Carlo path tracing generates renderings by estimating the rendering equation using the Monte Carlo method. To decrease rendering time, many studies render a noisy image at the original resolution with a low sample-per-pixel count and then apply image-space denoising to produce a visually appealing output. However, the denoising process cannot accurately handle the high variance of the noisy image if the sample count is cut too harshly to finish the rendering in a shorter time. We propose a framework that instead renders the image at a reduced resolution, allowing more samples per pixel within the same time budget. The image is then robustly denoised, and the denoised result is upsampled using the original-resolution G-buffer of the scene as guidance.
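The guided upsampling step could look like the following joint-bilateral-style sketch, weighting low-resolution denoised samples by G-buffer similarity at full resolution; this is our own unoptimized illustration, and the paper's actual upsampler may differ.

```python
# Illustrative G-buffer-guided upsampling (joint-bilateral flavor).
# low: denoised low-res color (h, w, C); guide: full-res G-buffer (H, W, G).
import numpy as np

def guided_upsample(low, guide, scale, sigma_g=0.1):
    H, W = guide.shape[:2]
    out = np.zeros((H, W, low.shape[2]))
    for y in range(H):
        for x in range(W):
            ly, lx = y // scale, x // scale
            ys = np.clip([ly, ly + 1], 0, low.shape[0] - 1)
            xs = np.clip([lx, lx + 1], 0, low.shape[1] - 1)
            w_sum, c_sum = 0.0, 0.0
            for j in ys:
                for i in xs:
                    # weight each low-res sample by G-buffer similarity
                    g = guide[min(j * scale, H - 1), min(i * scale, W - 1)]
                    w = np.exp(-np.sum((guide[y, x] - g) ** 2) / sigma_g)
                    w_sum += w
                    c_sum = c_sum + w * low[j, i]
            out[y, x] = c_sum / max(w_sum, 1e-8)
    return out
```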
This work introduces DynaPix, a Krita extension that automatically generates pixelated images and surface normals from an input image. DynaPix helps pixel artists and game developers more efficiently develop 8-bit-style games and bring them to life with dynamic lighting through normal maps usable in modern game engines such as Unity. The extension offers artists a degree of flexibility and allows further refinement of the generated artwork. Powered by out-of-the-box solutions, DynaPix integrates seamlessly into the artistic workflow.
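One common heuristic for deriving normal maps from flat artwork, offered here purely as an illustration (the abstract does not disclose DynaPix's pipeline), is to treat brightness as height and build normals from image gradients:

```python
# Illustrative sketch: tangent-space normal map from a grayscale height
# interpretation of the pixel art, via Sobel gradients.
import numpy as np
from scipy import ndimage

def normal_map(height, strength=2.0):
    """height: (H, W) array in [0, 1]; returns (H, W, 3) normals in [0, 1]."""
    gx = ndimage.sobel(height, axis=1)
    gy = ndimage.sobel(height, axis=0)
    n = np.stack([-gx * strength, -gy * strength, np.ones_like(height)], axis=-1)
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    return n * 0.5 + 0.5  # pack from [-1, 1] into RGB [0, 1]
```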
Transient imaging methods analyze time-resolved light transport for applications such as range imaging, reflectance estimation, and, especially, non-line-of-sight (NLOS) imaging, which targets the reconstruction of hidden geometry from measurements of indirect diffuse reflections of laser light. Transient rendering is a key tool for developing such applications. In this work, we introduce a set of simple yet effective subpath sampling techniques targeting transient light transport simulation in occluded scenes. We analyze the usual capture setups of NLOS scenes, where the light source and camera indirectly aim at the hidden geometry through a secondary surface, and leverage that configuration to reduce the integration path space. We implement our techniques in a modified version of Mitsuba 2 adapted for transient light transport, which lets us support parallelization, polarization, and differentiable rendering.
Translucent materials are ubiquitous in the real world, from organic materials such as food or human skin, to synthetic materials like plastic or rubber. While multiple models for translucent materials exist, understanding how we perceive translucent appearance, and how it is affected by illumination and geometry, remains an open problem. In this work, we analyze how well human observers estimate the density of translucent objects for static and dynamic illumination scenarios. Interestingly, our results suggest that dynamic illumination may not be critical to assess the nature of translucent materials.
Recently, many methods have been proposed to realistically render various materials. The realism of synthetic images can be improved by rendering small-scale details on the surfaces of 3D objects. We focus on the efficient rendering of scratches on transparent objects. Although a fast rendering method using precomputed 2D BRDFs for scratched materials has been proposed, it is limited to opaque materials such as metals. We extend this method to transparent objects. On the surface of a transparent object, rays split into specular reflections and refractions; we therefore precompute bidirectional scattering distribution functions (BSDFs). As in the previous method, we use a 2D ray tracer to accelerate the precomputation. We show several examples to demonstrate the effectiveness of our method.
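Inside such a 2D ray tracer, each dielectric interaction splits a ray into reflection and refraction weighted by a Fresnel term; the sketch below shows this split with Schlick's approximation, as a generic illustration rather than the authors' precomputation code.

```python
# Illustrative sketch: split a ray at a dielectric interface into reflected
# and refracted directions with a Schlick Fresnel weight.
import numpy as np

def split_ray(d, n, eta=1.5):
    """d: unit incident direction; n: unit normal (facing the ray); eta: IOR."""
    cos_i = -np.dot(d, n)
    refl = d + 2.0 * cos_i * n
    sin2_t = (1.0 / eta) ** 2 * (1.0 - cos_i ** 2)
    if sin2_t >= 1.0:                        # total internal reflection
        return refl, None, 1.0
    cos_t = np.sqrt(1.0 - sin2_t)
    refr = d / eta + (cos_i / eta - cos_t) * n
    r0 = ((1 - eta) / (1 + eta)) ** 2        # Schlick's approximation
    F = r0 + (1 - r0) * (1 - cos_i) ** 5     # fraction of reflected energy
    return refl, refr, F
```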
In this study, we propose a stereoscopic transparent display that can be viewed with the naked eye. Existing methods for generating realistic high-quality stereoscopic images require wearable devices and the presence of a display, which degrades the sense of presence. Our method increases the sense of presence by making the stereoscopic images blend into the surrounding environment.
Using smartphones while walking is becoming a social problem. Recent works attempt to address this issue with different warning systems. However, most focus only on detecting obstacles, without considering the risk to the user. In this paper, we propose a deep-learning-based, context-aware risk prediction system using a smartphone's built-in camera, aiming to notify "smombies" via a risk-degree-based algorithm. The proposed system estimates both the risk degree of a potential obstacle and the user's status, and could also be applied to distracted driving or to assisting visually impaired people.
Inside-out optical 2D tracking of tangible objects on a surface oftentimes uses a high-resolution pattern printed on the surface. While De-Bruijn-torus patterns offer maximum information density, their orientation must be known to decode them. Determining the orientation is challenging for patterns with very fine details; traditional algorithms, such as Hough Lines, do not work reliably. We show that a convolutional neural network can reliably determine the orientation of quasi-random bitmaps with 6 × 6 pixels per block within 36 × 36 pixel images taken by a mouse sensor, with a mean error below 2°. Furthermore, our model outperformed Hough Lines in a test with arbitrarily rotated low-resolution rectangles. This implies that CNN-based rotation detection might also be applicable to more general use cases.
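A rotation-regression CNN of roughly the required input size can be sketched as below; the architecture is our assumption, not the paper's network, and predicting (cos, sin) rather than a raw angle avoids the wrap-around discontinuity at 0°/360°.

```python
# Illustrative sketch: a tiny CNN regressing orientation from 36x36 images.
import torch, torch.nn as nn

class RotNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 36 -> 18
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 18 -> 9
        )
        self.head = nn.Linear(32 * 9 * 9, 2)        # predicts (cos, sin)

    def forward(self, x):                            # x: (B, 1, 36, 36) images
        v = self.head(self.features(x).flatten(1))
        return v / v.norm(dim=1, keepdim=True)       # project onto unit circle

out = RotNet()(torch.rand(1, 1, 36, 36))
angle_rad = torch.atan2(out[0, 1], out[0, 0])        # decode the orientation
```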
Generating the texture map for a 3D human mesh from a single image is challenging. To generate a plausible texture map, the invisible parts of the texture need to be synthesized consistently with the visible part, and the texture should align semantically with the UV space of the template mesh. To overcome these challenges, we propose a novel method that incorporates SamplerNet and RefineNet. SamplerNet predicts a sampling grid that enables sampling from the given visible texture information, and RefineNet refines the sampled texture to maintain spatial alignment.
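The sampling step maps naturally onto grid-based warping; the sketch below uses PyTorch's grid_sample with a network-predicted grid, with shapes and names assumed for illustration (SamplerNet's internals are not reproduced).

```python
# Illustrative sketch: warp visible texture into UV space with a predicted
# per-pixel sampling grid.
import torch
import torch.nn.functional as F

def sample_uv_texture(visible_tex, sampling_grid):
    """visible_tex: (B, 3, H, W) partial texture from the input image;
    sampling_grid: (B, H, W, 2) in [-1, 1], predicted by a network."""
    return F.grid_sample(visible_tex, sampling_grid, mode='bilinear',
                         padding_mode='border', align_corners=False)

tex = sample_uv_texture(torch.rand(1, 3, 256, 256),
                        torch.rand(1, 256, 256, 2) * 2 - 1)
```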
We present a technique to reduce the dynamic range of an HDRI lighting environment map in an efficient, energy-preserving manner by spreading out the light of concentrated light sources. This allows us to display a reasonable approximation of the illumination of an HDRI map in a lighting reproduction system with limited dynamic range, such as a virtual production LED stage. The technique identifies regions of the HDRI map above a given pixel threshold, dilates these regions until the average pixel value within each is below the threshold, and finally replaces each dilated region's pixels with the region's average pixel value. The new HDRI map contains the same energy as the original, spreads the light as little as possible, and avoids chromatic fringing.
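The described procedure translates almost directly into code; the sketch below is a simplified version of ours (it ignores regions merging during dilation and uses a simple luminance proxy) built on scipy's connected components and binary dilation.

```python
# Sketch of the described algorithm: threshold, dilate each bright region
# until its average drops below the threshold, then flatten it to its mean.
import numpy as np
from scipy import ndimage

def spread_highlights(hdri, threshold):
    lum = hdri.mean(axis=-1)                       # simple luminance proxy
    labels, n = ndimage.label(lum > threshold)
    out = hdri.copy()
    for k in range(1, n + 1):
        region = labels == k
        # grow the region until its average (over original pixels) is legal
        while lum[region].mean() > threshold and not region.all():
            region = ndimage.binary_dilation(region)
        out[region] = hdri[region].mean(axis=0)    # energy-preserving average
    return out
```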
Recent advances in computer vision have made 3D structure-aware editing of still photographs a reality. Such computational photography applications use a depth map, automatically generated by monocular depth estimation methods, to represent the scene structure. In this work, we present a lightweight, web-based interactive depth editing and visualization tool that adapts low-level conventional image editing operations for geometric manipulation, enabling artistic control in the 3D photography workflow. Our tool provides real-time feedback on the geometry through a 3D scene visualization to make depth map editing more intuitive for artists. The tool is open-source and platform-independent to support wider adoption of 3D photography techniques in everyday digital photography.
Lens distortion processing is essential for displaying VR content on a head-mounted display (HMD) with a distorted display surface. We propose a novel lens distortion algorithm for embedded GPU systems. To minimize memory access overhead, we propose a compressed form of the lookup table, and we utilize the integrated memory architecture of edge GPU systems (e.g., NVIDIA's Jetson devices) to reduce data communication overhead between host and device. As a result, our method shows up to 1.72× higher performance than prior lookup-table-based lens distortion approaches while consuming up to 28.93% less power. Our algorithm achieves real-time performance for high-resolution images on edge GPU systems (e.g., 94 FPS for an 8K image on a Jetson NX). These results demonstrate the benefits of our approach in terms of both performance and energy.
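One way to compress such a lookup table, shown here as our own illustration of the general idea rather than the paper's encoding, is to store the distortion map at reduced resolution and reconstruct per-pixel source coordinates by bilinear interpolation:

```python
# Illustrative sketch: reconstruct a full-resolution distortion LUT from a
# subsampled one by bilinear interpolation.
import numpy as np

def decompress_lut(coarse_lut, out_h, out_w):
    """coarse_lut: (h, w, 2) subsampled source coordinates."""
    ys = np.linspace(0, coarse_lut.shape[0] - 1, out_h)
    xs = np.linspace(0, coarse_lut.shape[1] - 1, out_w)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, coarse_lut.shape[0] - 1)
    x1 = np.minimum(x0 + 1, coarse_lut.shape[1] - 1)
    fy, fx = (ys - y0)[:, None, None], (xs - x0)[None, :, None]
    return ((1 - fy) * (1 - fx) * coarse_lut[y0][:, x0]
            + (1 - fy) * fx * coarse_lut[y0][:, x1]
            + fy * (1 - fx) * coarse_lut[y1][:, x0]
            + fy * fx * coarse_lut[y1][:, x1])
```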
Triangle count is one way to measure the complexity of a synthesized image. There is often a trade-off between the realism afforded by the number of triangles and the overall rendering time. In real-time applications, this trade-off becomes more important, as we need to render detailed scenes at a fixed frame rate without sacrificing image quality. We propose a new view-dependent method that creates a more dynamic structure and is easy to parallelize. We also propose an amortized rebalancing operation that reduces long dependency chains in our data structure, prevents worst-case behavior, and in some instances improves the average case.
Our implementation is still in progress, but our new method is provably consistent and has the potential to reduce the storage needed for view-dependent methods and to remove more geometry at higher fidelity, along with other performance improvements.
This paper describes the use of a Generative Adversarial Network (GAN) in combination with the Wave Function Collapse (WFC) algorithm for procedural content generation. The goal of this system is to enable level designers to generate coherent 3D worlds with brand-new meshes generated by the GAN.
We present a method for generating arrangements of indoor furniture from human-designed furniture layout data. Our method creates arrangements that target specified diversity measures, such as the total price of all furniture in the room and the number of pieces placed. To generate realistic furniture arrangements, we train a generative adversarial network (GAN) on human-designed layouts. To target specific diversity in the arrangements, we optimize the latent space of the GAN via a quality diversity algorithm to generate a diverse collection of arrangements. Experiments show our approach discovers a set of arrangements that are similar to human-designed layouts but vary in price and number of furniture pieces.
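A quality diversity search over the latent space can be sketched in the MAP-Elites style below; the specific QD algorithm, mutation scale, and measure binning are our assumptions beyond what the abstract states.

```python
# Conceptual MAP-Elites-style sketch: keep the highest-scoring latent per
# (price bin, piece-count bin) cell while mutating archive members.
import numpy as np

def map_elites(generator, score, measures, iters=10000, grid=(10, 10), dim=64,
               rng=np.random.default_rng(0)):
    archive = {}                                    # cell -> (fitness, latent)
    for _ in range(iters):
        if archive and rng.random() < 0.5:          # mutate an existing elite
            z = list(archive.values())[rng.integers(len(archive))][1]
            z = z + 0.1 * rng.standard_normal(dim)
        else:
            z = rng.standard_normal(dim)            # fresh random latent
        layout = generator(z)
        # measures(layout) -> 2 diversity values, binned into archive cells
        cell = tuple(np.clip(measures(layout), 0, np.array(grid) - 1).astype(int))
        f = score(layout)                           # realism, e.g. from the GAN
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, z)
    return archive
```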
The accuracy of hand pose and shape recovery algorithms depends on how closely the geometric hand model resembles the user's hand. Most existing methods rely on a learned shape space, e.g., MANO, but this shape model fails to generalize to unseen hand shapes that deviate significantly from the training set. We introduce a new hand shape model, aMANO, that augments MANO with local scale adaptation, enabling the modeling of substantially different hand sizes. We use both MANO and aMANO to calibrate the shape to new users from a stream of depth images and observe the improvement of aMANO over MANO. We believe that our new hand shape model is a significant step toward improving the robustness and accuracy of existing hand tracking solutions.
We suggest a new type of subdivision scheme based on matrix dilation for generating smooth surfaces. At each iteration, the number of nodes in the mesh doubles and the direction of their weighted averaging changes. The scheme has low complexity because of its small number of coefficients (four, five, or six). Using recent techniques related to the notion of joint spectral characteristics of matrices, we determine the smoothness of the generated surfaces, which in some cases is surprisingly better than for classical schemes.
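For concreteness, a standard example of such a scheme (our illustration; the paper's actual matrix and mask are not specified in this abstract) uses the quincunx dilation matrix, whose determinant of magnitude 2 doubles the nodes each iteration while alternating the averaging direction:

```latex
% Illustrative matrix-dilation subdivision rule (quincunx example).
\[
  M = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad |\det M| = 2,
\]
\[
  v^{k+1}_{\alpha} \;=\; \sum_{\beta \in \mathbb{Z}^2} c_{\alpha - M\beta}\, v^{k}_{\beta},
  \qquad \alpha \in \mathbb{Z}^2 ,
\]
% Since |det M| = 2, each refinement step doubles the number of nodes, and
% because M combines scaling with a rotation-reflection, the direction of
% the weighted averaging changes from one iteration to the next.
```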
Stylized Novel View Synthesis is an emerging technique that combines style transfer and view synthesis. However, none of the existing works explores its applications in Virtual Reality (VR). This work devises a novel VR application for stylized novel view synthesis. We propose to replace actual 3D scene models or 360° images with stylized stereoscopic images for areas that lie outside the major play area but are still visible to the user. User study results reveal that users perceive depth from the stylized stereoscopic images and can distinguish them from flat textures. Code and other materials are available at: kuan-wei-tseng.github.io/ArtNV