In recent years, many 3D video games featuring cars have depicted collision damage to automobiles. However, the cracks peculiar to FRP (fiber-reinforced plastic) materials have not been reproduced in previous works. In this research, we therefore devise a method for expressing cracks in FRP materials. In addition, we conduct experiments to assess the realism of the visual representation of breakage as well as its processing cost, and consider whether the method can be incorporated into actual video game productions.
Soil structure describes the association and arrangement of soil granules. Structured clays, such as cracked dry ground, exhibit characteristics of solid-like materials, while destructured clays behave like granular materials. The structure's strength weakens as deviatoric plastic strain accumulates, which leads to clay degradation. However, existing graphics methods for granular materials cannot simulate this unique mechanical behavior. We present a non-associated Modified Structured Cam Clay (MSCC) model for simulating structured and destructured clays. We track the deviatoric part of the plastic Hencky strain and design two degradation functions (corresponding to the yield stress and the mean effective stress) to control the size of the yield surface. Through a non-associated flow rule, our method generates visually plausible results while allowing volume preservation.
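The abstract does not state the degradation functions explicitly. As an illustrative sketch only, structure degradation in structured-clay models is often written as an exponential decay of the preconsolidation pressure with accumulated deviatoric plastic strain; all symbols below, including the decay rate \(\xi\), are our assumptions, not the paper's:

\[
p_c\!\left(\varepsilon_q^p\right) = p_{c,\mathrm{destr}} + \left(p_{c,0} - p_{c,\mathrm{destr}}\right) e^{-\xi\, \varepsilon_q^p},
\]

where \(p_{c,0}\) is the initial (structured) preconsolidation pressure, \(p_{c,\mathrm{destr}}\) its fully destructured value, and \(\varepsilon_q^p\) the accumulated deviatoric plastic Hencky strain; the yield surface shrinks as \(\varepsilon_q^p\) grows.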
In recent years, the realism of 3DCG visual expression has increased remarkably. In this study, we propose a CG representation of tire smoke based on physical simulation. Although tire smoke appears in various video works and 3DCG games, the mechanism by which it is generated has not been fully elucidated. We calculate the generation and conduction of heat from the rotation speed of the tire and its friction with the ground, and construct a model of tire smoke generation from the viewpoint of statistical mechanics. The parameters of the mathematical model are set from actual footage and observations, and we confirm that the phenomenon of tire smoke can be reproduced.
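A minimal sketch of the frictional heat-generation step described above; the physical constants and the smoke-onset temperature are illustrative assumptions, not values from the study:

```python
# Frictional heating at the tire--ground contact patch (illustrative only).
def step_tire_temperature(T, wheel_speed, ground_speed, dt,
                          mu=1.2, normal_force=4000.0,
                          heat_capacity=9000.0, cooling_rate=0.05,
                          T_ambient=20.0):
    """Advance contact-patch temperature T (deg C) by one time step dt (s)."""
    slip_speed = abs(wheel_speed - ground_speed)   # relative sliding speed (m/s)
    q_friction = mu * normal_force * slip_speed    # frictional heating power (W)
    q_cooling = cooling_rate * (T - T_ambient) * heat_capacity  # convective loss (W)
    return T + dt * (q_friction - q_cooling) / heat_capacity

# Smoke emission could then be triggered once T exceeds a threshold:
T = 20.0
for _ in range(1000):
    T = step_tire_temperature(T, wheel_speed=30.0, ground_speed=5.0, dt=0.01)
emit_smoke = T > 200.0  # assumed smoke-onset temperature (deg C)
```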
In this poster, we present a method for generating co-speech text-to-gesture mappings for 3D digital humans. We obtained text and 2D pose data from public monologue videos, and gesture units from motion capture sequences. The method works by matching 2D poses to 3D gesture units. We trained a model via contrastive learning to improve the matching of noisy pose sequences with gesture units. To ensure diverse gesture sequences at runtime, gesture units were clustered using K-means clustering. We incorporated 2,035 gestures and 210k rules. Our method is highly adaptable and easy to control and use. Demo video: https://youtu.be/QBtGdGE1Wgk
We address the problem of 3D human motion estimation from raw MoCap optical markers. The raw markers are noisy, disordered, and unlabeled, so recovering 3D human motion from them is non-trivial. Existing works are either time-consuming or assume knowledge of the marker labels. We address these problems with an end-to-end method for 3D human motion estimation that leverages the capability of Transformers to model long-range dependencies. The method takes raw markers as input and learns joint poses with a Transformer-like architecture. Experimental results show that our method achieves better-than-centimeter-level errors.
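The abstract does not detail the architecture; the following is a minimal sketch of a Transformer that maps an unordered, unlabeled marker set to joint positions. The layer sizes, the 24-joint output, and the mean-pooled readout are our assumptions:

```python
import torch
import torch.nn as nn

class Marker2Pose(nn.Module):
    def __init__(self, n_joints=24, d_model=128):
        super().__init__()
        self.n_joints = n_joints
        self.embed = nn.Linear(3, d_model)              # per-marker xyz embedding
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_joints * 3)    # pooled features -> joints

    def forward(self, markers):                         # markers: (B, N, 3)
        tokens = self.encoder(self.embed(markers))      # (B, N, d_model)
        pooled = tokens.mean(dim=1)                     # order-invariant pooling
        return self.head(pooled).view(-1, self.n_joints, 3)

model = Marker2Pose()
joints = model(torch.randn(2, 50, 3))                   # 50 noisy, unlabeled markers
```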
We address the problem of controlling and simulating interactions between multiple physics-based characters using short unlabeled motion clips. We propose Adversarial Interaction Priors (AIP), a multi-agent generative adversarial imitation learning (MAGAIL) approach that extends recent deep reinforcement learning (RL) work on imitating single-character example motions. The main contribution of this work is extending motion imitation from a single character to interaction imitation between multiple characters. Our method uses a control policy for each character to imitate the interactive behaviors provided by short example motion clips, and associates with each character a discriminator trained on actor-specific interactive motion clips. The discriminator returns interaction rewards that measure the similarity between the generated behaviors and those demonstrated in the reference motion clips. The policies and discriminators are trained in a multi-agent adversarial reinforcement learning procedure to improve the quality of the behaviors generated by each agent. Initial results show the effectiveness of our method on the interactive task of shadowboxing between two fighters.
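A minimal sketch of the discriminator-driven interaction reward, in the standard GAIL form the abstract builds on; the network shape and the exact reward formula are assumptions:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores a state transition: close to 1 if it looks like the reference clips."""
    def __init__(self, obs_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.Sigmoid())

    def forward(self, s, s_next):
        return self.net(torch.cat([s, s_next], dim=-1))

def interaction_reward(disc, s, s_next, eps=1e-8):
    # Higher reward when the discriminator believes the transition came
    # from the actor-specific reference interaction clips.
    d = disc(s, s_next)
    return -torch.log(1.0 - d + eps)

disc = Discriminator(obs_dim=32)                 # one discriminator per character
r = interaction_reward(disc, torch.randn(1, 32), torch.randn(1, 32))
```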
We present a motion in-betweening framework that generates high-quality, physically plausible character animation given temporally sparse keyframes as soft animation constraints. More specifically, we learn imitation policies for physically simulated characters using deep reinforcement learning, where the policies can access only limited information. Once learned, the physically simulated characters can adapt to external perturbations while following the given sparse input keyframes. We demonstrate the performance of our framework on two different motion datasets and compare our results with those generated by a baseline imitation policy.
In 3D character animation, the frame rate is often reduced to mimic hand-drawn animation such as anime (Japanese animation). However, anime with a low frame rate differs from real motion because it omits excessive movement and emphasizes speed by expressing motion with a small number of impressive poses. It is difficult to reproduce such motion merely by downsampling mocap data. In this poster, we therefore propose a method for converting mocap data into anime-like 2D motion that respects techniques used at production sites. The proposed method evaluates the characteristics of the motion data using the temporal distributions of speeds and pose areas to select an appropriate sequence of viewpoints and extract effective poses for each viewpoint.
We propose a novel image search system driven by color palettes. By querying with color palettes, users can search for inspiring images, which is helpful for design exploration. Although a few existing systems accept palettes as queries, they impose constraints on the query palettes (e.g., a limited palette size) or on the search results. Our system accepts palettes of any size, with per-color weights, and returns results with enough diversity to stimulate users' design inspiration. To achieve this, we extract color themes from our database with a perceptually based model and calculate palette similarity using a weighted palette distance. Visual comparisons demonstrate that our system has more potential for supporting users' design inspiration than existing systems.
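The weighted palette distance is not specified in the abstract; one plausible baseline, sketched below, matches colors by minimum-cost assignment with per-color weights scaling the pairwise distances (Lab color space and the weighting scheme are our assumptions):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def palette_distance(colors_a, w_a, colors_b, w_b):
    """colors_*: (n, 3) Lab colors; w_*: (n,) weights summing to 1.
    Palettes may have different sizes (rectangular cost matrix)."""
    diff = colors_a[:, None, :] - colors_b[None, :, :]
    cost = np.linalg.norm(diff, axis=-1)            # pairwise color distances
    cost *= np.sqrt(np.outer(w_a, w_b))             # emphasize important colors
    rows, cols = linear_sum_assignment(cost)        # optimal color matching
    return cost[rows, cols].sum()
```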
The recent success of Machine Learning encouraged research using artificial neural networks (NNs) in computer graphics. A good example is the bidirectional texture function (BTF), a data-driven representation of surface materials that can encapsulate complex behaviors that would otherwise be too expensive to calculate for real-time applications, such as self-shadowing and interreflections.
We propose two changes to the state of the art in neural-network BTFs, specifically NeuMIP. These changes, suggested by recent work in neural scene representation and rendering, aim to improve baseline quality, memory footprint, and performance. We conduct an ablation study to evaluate the impact of each change. We test on both synthetic and real data, and provide a working implementation within the Mitsuba 2 rendering framework.
Our results show that our method outperforms the baseline on all these metrics, and they position neural BTFs within the broader field of neural scene representation.
Project website: https://traverse-research.github.io/NeuBTF/.
We propose an autonomous drone exploration system (ADES) with a lightweight, low-latency saliency prediction model for exploring unknown environments. Recent studies have applied saliency prediction to drone exploration, but these efforts are not yet sufficiently mature. ADES employs a smaller and faster saliency prediction model and adopts a novel exploration approach based on visual-inertial odometry (VIO) to solve practical problems encountered during exploration, i.e., exploring salient objects without colliding with them and without revisiting them repeatedly. The system not only achieves performance comparable to that of the state-of-the-art multiple-discontinuous-image saliency prediction network (TA-MSNet) but also enables drones to explore unknown environments more efficiently.
In this work, we present an affordable, high-frame-rate nystagmus detection aid for outpatient clinic use. Using Frenzel goggles and the high-speed video recording feature of a smartphone, high-frame-rate eye videos of patients are obtained. OpenCV morphological operations and its blob detection module are then applied, with parameters manually adjusted per video, to capture and track the movements of the eye center. Next, we correct errors caused by hand shake during recording and predict eyeball movement during blinks. Finally, a rule-based detection method detects and classifies nystagmus from the corrected tracking data. In our experiments, we collaborated with a professional ENT doctor to obtain multiple high-frame-rate eye videos of patients. Experimental results show that the procedure can provide preliminary diagnostic results for physicians to refer to, although the final diagnosis must be made by a professional ophthalmologist. The method has also been used by the ENT doctor to record results during examinations.
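A minimal sketch of the pupil-tracking step using OpenCV morphology and blob detection, as described above; the thresholds are placeholders for the per-video manual tuning the system performs:

```python
import cv2
import numpy as np

params = cv2.SimpleBlobDetector_Params()
params.filterByArea = True
params.minArea, params.maxArea = 500, 20000    # assumed pupil size range (px)
params.filterByCircularity = True
params.minCircularity = 0.5                    # pupils are roughly circular
detector = cv2.SimpleBlobDetector_create(params)

def track_pupil(frame):
    """Return the (x, y) center of the pupil in one video frame, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    kernel = np.ones((5, 5), np.uint8)
    gray = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)  # suppress glints
    keypoints = detector.detect(gray)          # dark blobs by default
    return max(keypoints, key=lambda k: k.size).pt if keypoints else None
```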
Image-based pose estimation generally requires 3D map generation, which is costly in practice. In this paper, we propose a method that estimates the pose accurately by recursively rendering 2D images from a generic 3D city mesh map and updating the pose.
Recognizing transparent objects such as windows and glassware is a challenging task in 3D reconstruction from color images. In recent years, transparent object recognition methods have focused on extracting features from the boundary regions of transparent objects. Our observation is that, unlike opaque objects, transparent objects can be characterized and located better by considering their external and internal boundaries separately. We propose a new internal-external boundary attention module in which internal and external boundary features are recognized separately. We also add an edge-body full-attention module that supervises the segmentation of transparent object bodies using semantic information from the external boundaries. We employ a contour loss to perform distance-weighted supervision on the inner and outer boundaries separately. Extensive experiments show that the proposed method outperforms existing methods on the Trans10K dataset.
In this poster, we introduce an image super-resolution technique, TexSR, for high-quality texture mapping. We first obtain upscaled textures from an existing image super-resolution (SR) method. We then perform a post-color-correction pass to restore color tones and details lost by the SR algorithm. Finally, we compress the textures with variable compression ratios to reduce the storage and memory overheads caused by the increased resolution. As a result, TexSR improves the image quality of a state-of-the-art method, Real-ESRGAN.
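The post-color-correction algorithm is not specified in the abstract; a cheap, common choice is per-channel moment matching against the original texture, sketched here purely as an illustration:

```python
import numpy as np

def match_color_moments(sr_img, ref_img):
    """Match per-channel mean/std of the upscaled texture to the original.
    sr_img: (H, W, 3) float image in [0, 1]; ref_img: original texture."""
    out = sr_img.astype(np.float64).copy()
    for c in range(3):
        mu_s, std_s = out[..., c].mean(), out[..., c].std() + 1e-8
        mu_r, std_r = ref_img[..., c].mean(), ref_img[..., c].std()
        out[..., c] = (out[..., c] - mu_s) / std_s * std_r + mu_r
    return np.clip(out, 0.0, 1.0)
```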
Recently, many image stylization techniques have been proposed for converting photographs into illustrations, paintings, cartoons, and so on. Artists sometimes use these techniques to efficiently generate background images for their own illustrations of foreground objects, such as cartoon characters. However, this approach often produces unnatural results because the levels of abstraction of the foreground and background images differ, resulting in an inconsistent atmosphere between them. In this paper, we address this problem and propose a method for adjusting the abstraction level of stylized background images. The effectiveness of our method is demonstrated by several examples with different levels of abstraction.
Content-aware image resizing can automatically retarget an image to different aspect ratios while preserving visually salient content. However, it is difficult for users to interact with the retargeting process and control the results. In this paper, we propose a language-driven diversified image retargeting (LDIR) method that allows users to control the retargeting process by providing additional textual descriptions. Taking the original image and user-provided texts as inputs, LDIR retargets the image to the desired resolution while preserving the content indicated by the texts. Following a self-play reinforcement learning pipeline, a multimodal reward function is proposed that considers both visual quality and language guidance. Preliminary experiments demonstrate that LDIR can achieve diversified, text-guided image retargeting.
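A minimal sketch of a multimodal reward mixing language guidance with a visual-quality term, in the spirit of the abstract; the use of CLIP, the cosine-similarity term, and the mixing weight alpha are our assumptions, not the paper's formulation:

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def retarget_reward(retargeted_pil, text, visual_quality, alpha=0.5):
    """Reward = alpha * language guidance + (1 - alpha) * visual quality.
    visual_quality is a precomputed score from any retargeting-quality metric."""
    image = preprocess(retargeted_pil).unsqueeze(0).to(device)
    tokens = clip.tokenize([text]).to(device)
    with torch.no_grad():
        img_f = model.encode_image(image)
        txt_f = model.encode_text(tokens)
    sim = torch.cosine_similarity(img_f, txt_f).item()  # text-image agreement
    return alpha * sim + (1 - alpha) * visual_quality
```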
This work introduces a novel method for automating object animation from a 3D mesh, with applications in AR/VR, gaming, and beyond. The 3D mesh can be a generic inflated object generated from an image or a sketch. The method uses medial-axis-based control point estimation to generate an animation path and performs deformation-based animation of the mesh. Our method eliminates user involvement and the need for expertise in rigging the mesh, selecting regions, and similar tasks. It produces meaningful, real-time animation of the 3D mesh, and experiments indicate that it is more consistent and less error-prone than existing works.
We propose procedural modeling of crystal clusters. Based on crystallography, we model single crystals and then distribute them on a base rock using a hierarchical sampling approach. Users can interactively generate various crystal clusters, including amethyst and citrine clusters. Furthermore, our approach offers a high-level specification, i.e., a target shape that the generated clusters approximate; such controllability is crucial for realizing artists' intentions. Additionally, we focus on fabrication-oriented procedural modeling: we identify feasible parameter spaces through physical experiments, i.e., by testing the removability of resin casts from silicone molds produced from the 3D-printed crystal clusters for making replicas.
This paper presents a procedural modeling method for coral groups that considers territorial conflict between species. We developed a graphical interface for controlling the territorial battle by arranging locator points that represent space available for growing coral branches. The coral skeletons are generated according to species-specific structural rules while individuals compete for the locators.
In this work, we propose a robust pipeline for creating vectorized models from LiDAR point clouds without assuming watertight polygonal surfaces. The core idea behind our method is to combine hierarchical 2D projection with pairwise intersections of clipped planes to precisely hypothesize the faces inside and outside an object while remaining robust across different scenes. Moreover, to generate concise polygon meshes efficiently, Adaptive K-means++ (AKM++) is used to smooth the 2D rasterized height map, followed by the Levenberg-Marquardt (LM) algorithm to jointly optimize the projection primitives with the planar intersections. The final model is obtained by merging the derived neighbouring facets. Our experimental results show that the proposed method obtains compact models with high fidelity and efficiency on both precise and defect-laden data compared with other state-of-the-art approaches.
Real-time visualization of large-scale surface models is still a challenging problem. When using multiple levels of detail (LOD), the main issues are popping between levels and cracks between level parts. We present a scheme (covering both mesh preprocessing and real-time rendering) that avoids both issues. Vertex interpolation between child and parent levels is used to achieve crack-free and popping-free rendering. We implemented and tested our method on a modest laptop PC and could render scanned models of multiple tens of millions of triangles at optimal visual quality and interactive frame rates.
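A minimal sketch of the interpolation idea: each fine-level vertex stores the position it collapses to in the parent level, and rendering blends between them so level transitions are continuous (the function and names below are ours, not the paper's):

```python
import numpy as np

def morph_vertices(child_pos, parent_pos, blend):
    """child_pos, parent_pos: (N, 3) arrays; blend in [0, 1].
    blend = 0 -> pure parent (coarse), blend = 1 -> pure child (fine).
    Continuously varying blend with viewing distance avoids sudden popping,
    and shared border vertices morph identically, avoiding cracks."""
    return (1.0 - blend) * parent_pos + blend * child_pos
```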
We use an existing neural style transfer model with a differentiable rasterizer for colored font design. The input of the proposed system is an existing TrueType font, and the output is a neural-style-transferred OpenType-SVG color font. Each character glyph contains a series of vector graphics path elements. A style-transferred glyph consists of dozens of Bézier curves initially distributed at random over the glyph space that gradually converge to the glyph strokes of a given character. In contrast, designing a font manually requires specialized skills: designers adjust strokes and glyph kerning for each character with a mouse, and achieving a specific style takes hours. A Chinese font contains thousands of characters, so neural style transfer techniques can greatly speed up the design of decorative fonts. In addition, we assign different colors to the curves of a style-transferred character for visual appeal. To support colors in a font, we chose the OpenType-SVG format, which allows colors to be displayed in software such as Illustrator or Inkscape. We open-source our code at https://github.com/su8691/ribbon.
We explore the application of a time-dependent machine learning framework to the art direction of volumetric simulations. We show the benefit of the time dependency inherent in the ODE-net model when used in conjunction with simulation sequences. Unlike other machine learning methods, which maintain a uniform timestep constraint during evaluation, the ODE-net framework can generate results for arbitrary time samples. We demonstrate how this non-uniform time-step evaluation can be leveraged in artistic direction tasks. Specifically, we apply the model to the retiming of volumetric simulations to showcase its ability to correctly predict arbitrary time steps. We show that, with minimal training data, the model generalizes over several simulation sequences with similar parameters.
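A minimal sketch of the property being exploited: a neural ODE can be queried at arbitrary, non-uniform time samples, which is what makes retiming possible. This uses the torchdiffeq package; the toy latent state standing in for a simulation frame is our assumption:

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # pip install torchdiffeq

class Dynamics(nn.Module):
    """Learned time derivative dy/dt = f(t, y) of the latent simulation state."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(),
                                 nn.Linear(128, dim))

    def forward(self, t, y):
        return self.net(y)

func = Dynamics()
y0 = torch.randn(1, 64)                     # initial (encoded) simulation frame
# Retiming: request states at arbitrary, non-uniformly spaced times.
t = torch.tensor([0.0, 0.13, 0.5, 0.62, 1.0])
states = odeint(func, y0, t)                # shape (len(t), 1, 64)
```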
In this work, we propose a novel Metric-K-Nearest-Neighbor (M-KNN) scheme to facilitate topology-aware learning on point clouds. Topology-aware learning is achieved by accumulating local features in a deep learning model. Recent works rely on ball queries or K-Nearest-Neighbor (KNN) search for local feature extraction from point clouds and struggle to retain topological information. M-KNN employs a generalized Minkowski distance in the KNN search algorithm for topological representation of point clouds, enabling state-of-the-art point cloud methods to perform topology-aware downstream tasks. We demonstrate the performance of M-KNN as a plug-in for point cloud classification, part segmentation, and denoising on benchmark datasets.
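A minimal sketch of a KNN query under a generalized Minkowski distance; the poster's actual metric parameters are not given, so p = 4 below is purely illustrative (p = 2 recovers the usual Euclidean KNN):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

points = np.random.rand(2048, 3)                 # a toy point cloud
p = 4                                            # assumed Minkowski exponent
nn_search = NearestNeighbors(n_neighbors=16, metric="minkowski", p=p)
nn_search.fit(points)
dists, idx = nn_search.kneighbors(points)        # per-point local neighborhoods

# Varying p reshapes the neighborhood "ball" (diamond for p=1, cube-like for
# large p), which changes which neighbors feed the local feature aggregation.
```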
People with visual impairment (PVI) are eager to push their limits in extreme sports such as alpine skiing. However, ski training is very difficult because it always requires assistance from an experienced guide. This paper explores sonification-based methods that enable PVI to train skiing on a simulator, allowing them to train without a guide. Two types of sonification feedback for PVI are proposed and studied in our experiment. The results suggest that, on average, users without any visual information can still pass through over 80% of the poles they pass with visual information.
In this poster, we present Flow Human, a no-code system that generates the conversational behavior of digital humans from text. Users only need to build the conversation flow they want to use with customers in the flow-based authoring tool we developed. Our system then automatically generates the verbal and non-verbal behavior of digital humans along the conversation flow, interacts with customers, and collects feedback. We believe this work has the potential to be deployed in various services that have so far been held back by the challenge of controlling the many factors in digital humans (e.g., conversation flow, co-speech gestures, and facial animation).
Rhythm is important for improving skills in various sports. In this study, we create a rhythm game for ski training that synchronizes music with turn timing and provides various types of feedback. We conducted a pilot study to verify its effectiveness through a comparison of three conditions.
Skill improvement in sports is an essential factor in keeping the motivation to continue. Previous studies showed that kinesthetic illusion enhances observational learning. However, such studies have only dealt with the learning of movements using a single part of the body, and whether kinesthetic illusion can be induced in observational learning of whole-body movements has not been clarified. In this study, we conducted an experiment involving human subjects and confirmed that synchronized visuo-tactile stimuli can induce kinesthetic illusion even in whole-body movements. Moreover, we also demonstrated a complete mediation model in which the synchronization of visuo-tactile stimuli influences kinesthetic illusion mediated by body ownership.
We introduce MMGrip, a handheld multimodal haptic device that simultaneously presents vibration, impact, and shear for realistic and immersive haptic feedback in response to virtual collision events. The three types of haptic stimuli mimic a damped vibration, a collision impulse, and a skin slip deformation, respectively, as occur at physical contact. Controlling the three stimuli, as well as their onset time differences, provides a vast design space for diverse and realistic haptic experiences. We present the design of MMGrip and demonstrate its application to virtual sports.
Biofeedback is a well-known form of therapy in which patients receive sensory and auditory feedback on their physiology to help them reflect on, recognize, and become aware of their own state in order to improve their cognitive and emotional functioning. In our approach, a user's self-avatar is generated based on their physiological state, specifically their heart rate and electrodermal activity. Moreover, based on self-reported anxiety symptoms, we investigate whether the biofeedback-driven self-immersive avatar can help reduce stress levels.
Color vision tests are conducted based on subjective responses; however, since subjective responses can be biased and are not uniformly applicable to all individuals, an objective method is required. In this study, we developed a quantitative and user-friendly color vision test that uses a psychophysical method based on pupillary responses to color flicker stimuli sampled from the confusion lines. The results revealed that the power spectral density of pupillary oscillations increased in correspondence with the color differences. Our system could eventually lead to a practical color vision testing method.
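A minimal sketch of the analysis step, estimating the power spectral density of the pupil signal around the flicker frequency with Welch's method; the sampling rate and flicker frequency below are assumed values, not those of the study:

```python
import numpy as np
from scipy.signal import welch

fs = 30.0                                   # assumed eye-tracker sampling rate (Hz)
flicker_hz = 1.2                            # assumed color-flicker frequency (Hz)
pupil = np.random.randn(1800)               # stand-in for 60 s of pupil-diameter data

freqs, psd = welch(pupil, fs=fs, nperseg=256)
band = (freqs > flicker_hz - 0.2) & (freqs < flicker_hz + 0.2)
response_power = psd[band].mean()           # grows with the perceived color difference
```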
Head-mounted displays (HMDs) increase immersion in the virtual world; however, they limit VR audiences' awareness of external audiences co-located in the same physical environment. In this paper, we superimpose the virtual world onto the physical environment and propose a shareable VR experience between VR audiences and external audiences via a wearable interface, enabling external audiences to observe and interact with virtual objects using hand gestures. Our system allows external audiences to explore the virtual scenario spatially and to engage in asymmetric communication with VR audiences in three different modes. Our findings suggest that these design strategies could be adopted for future shareable VR experiences.
This study focuses on font type and weight and examines their effects on the readability of long texts in VR. In the experiment, we varied the font weight of three font types, Gothic, Mincho, and Antigothic, to evaluate the readability of texts 600 characters long. Our results showed a tendency for Antigothic to be more readable and less fatiguing than Mincho and Gothic.
Mixed reality (MR) techniques are beginning to permeate our daily lives. Because MR seamlessly blends a user's real environment with useful virtual content in a natural manner, its potential applications range from information visualization, sharing, and remote work collaboration to education and training in various fields. However, due to the steep learning curve of content creation and development, it is not trivial for novice users to create MR content by themselves. In this work, we propose the “Codeless Content Creator System”, an MR system that helps novice users rapidly and conveniently create MR content using MR devices rather than relying on complicated software development tools such as Unity. A simple user study found that our system is useful and convenient for MR content creation tasks.
We propose an application that combines the benefits of augmented reality (AR) and virtual reality (VR) experiences for realistic and effective fire response training. Our analysis of recent literature reveals that there are different advantages to AR versus VR applications, and combining these experiences can generate new compound benefits in fire drills. In this study, a high-level training environment is automatically generated using three-dimensional object recognition technology, and the physical training tools are linked to the VR environment to improve immersion and training effectiveness. The combined advanced AR and VR experience demonstrates the potential use of this method in various industries.
Journaling is a well-known, evidence-based strategy for practicing better self-regulation and self-awareness in daily life. It is also an effective way to reduce the effects of negative emotions such as anxiety and depression. We explore the combination of virtual reality (VR) and a body-tracking mirror as a cognitive activity in the form of VR journaling. VR-based journaling aims to build cognitive reappraisal, self-reflection, and autonomic emotion regulation in a sustainable way. The user performs in front of the mirror, with body tracking helping them keep a daily journal. Since looking into a mirror is already a daily habit for most people, the approach does not require users to form a new habit to start journaling. Furthermore, it activates creativity by encouraging users to use their body language as a storytelling tool.
In recent years, video content that lets users experience a simulated fall through a head-mounted display (HMD) has become popular. However, the visual factors that cause the sensation of falling have not been clarified. In this study, we focused on visual stimuli known to affect the intensity of vection and examined their relationship to the sensation of falling. The results confirmed that stimuli presented in the peripheral visual field and increased spatial frequency amplified the sensation of falling. In addition, we were able to produce video content that evokes a greater sense of falling by combining multiple visual stimuli known to increase vection.
In light of the COVID-19 pandemic, wearing a mask is crucial for avoiding infectious diseases. However, wearing a mask is known to impair communication. This study addresses the communication difficulties caused by mask wearing and provides a strategy for aiding the understanding of a speaker's speech through facial animation. The facial animation is generated in real time: upper-face information is processed to detect the speaker's emotions, from which a lower-face expression is generated. In addition, the system detects the mask's shape, enabling accurate registration of the animation in the proper position. This technology can improve communication and alleviate the challenges of conversation between people wearing face masks.
In many sports, rhythmic skills are considered important. In this paper, we take juggling as an example and propose a VR system that simplifies the acquisition of a sense of rhythm. The proposed system uses temporal and spatial distortion and other functions to assist the training. A pilot study is conducted to validate the effectiveness of each function.
We propose a lens aberration correction method for holographic displays based on light-wave propagation simulation and optimization. Aberration correction is an important technology for obtaining noise-free holographic images. We optimized phase holograms using automatic differentiation and the Adam optimizer, achieving aberration-corrected images. Given an aberrated lens with a focal length of 20 mm, the optimized holographic image attains a PSNR of 32 dB.
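A minimal sketch of phase-hologram optimization by automatic differentiation with Adam; the single-FFT propagation below is a simplification, whereas the paper simulates the aberrated lens inside the wave-propagation model:

```python
import torch

target = torch.rand(256, 256)                      # target image amplitude
phase = torch.zeros(256, 256, requires_grad=True)  # phase-only hologram
opt = torch.optim.Adam([phase], lr=0.05)

for step in range(500):
    field = torch.exp(1j * phase)                  # unit-amplitude SLM field
    recon = torch.fft.fftshift(torch.fft.fft2(field))   # far-field propagation
    loss = torch.mean((recon.abs() / recon.abs().max() - target) ** 2)
    opt.zero_grad()
    loss.backward()                                # gradients flow through the FFT
    opt.step()
```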
We realized color animation of a computer-generated hologram (CGH) while retaining the features of the conventional full-parallax high-definition CGH (FPHD-CGH) method, such as a large screen size and a wide viewing area. Since an FPHD-CGH is fabricated with a laser direct-writing system, the only way to animate it is to switch structured illumination (SI), and even that method suffers from overlapping images from different frames caused by light leakage. We propose a method that suppresses this light leakage while maintaining image quality by applying erosion to the SI. Combining our method with the existing FPHD-CGH colorization method, we developed the world's first two-frame color animation prototype, which is 18 cm square and has a 30-degree viewing angle both horizontally and vertically.
We present Hanging Print, a framework for designing and fabricating shapes by weaving catenaries (i.e., hanging curves), extruding plastic filament directly in mid-air. Hanging Print has the potential to support the design of expressions unique to hanging structures and to plastic, beyond what can easily be imagined. In this paper we introduce the workflow of Hanging Print and demonstrate our works at various scales.
Preoperative models manufactured by 3D printing are in wide demand for surgical procedures. Current printing methods such as PolyJet, color jet printing (CJP), and fused deposition modeling (FDM) are either too expensive or deliver poor performance. In this paper, a novel solution is proposed. A preoperative model is first divided into internal parts, such as blood vessels, and the main body; the former are printed in color and the latter is transparent. In particular, the internal parts are printed using FDM. A housing template for the main body is generated by digital light processing (DLP) based on the preoperative model. After integrating the template and the inner parts, a transparent elastomer precursor is injected and cured inside the template, forming the transparent main body. As a demonstration, a liver model was manufactured using the proposed scheme. Compared with existing methods, our strategy can manufacture preoperative models of high quality at an affordable cost.
In this study, we developed a workflow for simulating color anisotropy in multi-color printing on an FFF 3D printer. Previous methods simulate layer marks from G-code (the control language for 3D printers) and multi-color printing realized by switching filament colors. However, there was no simulator for the case targeted in this study, in which color anisotropy exists within a single path. Since this coloring method is based on a completely different principle from existing ones, its results are difficult to predict. Our simulator tool proved very useful for rendering multi-color 3D-printed models. Furthermore, a more sophisticated simulation of object placement in real space was performed based on highly realistic images created with physically based rendering. The simulation also worked in VR space and showed an optimal preview at architectural scale.