There is increasing interest in using robots in simulation to understand and improve human-robot interaction (HRI). At the same time, the use of simulated settings to gather training data promises to help address a major data bottleneck in allowing robots to take advantage of powerful machine learning approaches. In this paper, we describe a prototype system that combines the robot operating system (ROS), the simulator Gazebo, and the Unity game engine to create human-robot interaction scenarios. A person can engage with the scenario using a monitor wall, allowing simultaneous collection of realistic sensor data and traces of human actions.
We propose MagniFinger, a fingertip-worn microscopy device that augments the limited abilities of human visual and tactile sensory systems in micrometer-scale environments. MagniFinger makes use of the finger's dexterous motor skills to achieve precise and intuitive control while allowing the user to observe the desired position simply by placing a fingertip. To implement the fingertip-sized device and its tactile display, we have built a system comprising a ball lens, an image sensor, and a thin piezoelectric actuator. Vibration-based tactile feedback is displayed based on the luminance of a magnified image, providing the user with the feeling of touching the magnified world.
Conceiving an artwork requires designers to create assets and to organize (or lay out) them in a harmonious, self-narrating story. While creativity is fundamental to both aspects, the latter can be bolstered with automated techniques. We present the first true SIMD formulation for layout generation and leverage CUDA-enabled GPUs to scan through millions of possible permutations and rank them on aesthetic appeal using weighted parameters such as symmetry, alignment, density, and size balance. The entire process happens in real time using a GPU-accelerated implementation of the replica-exchange Markov chain Monte Carlo (MCMC) method. The exploration of the design space is rapidly narrowed by performing distant jumps from poorly ranked layouts and fine-tuning the highly ranked ones. Several iterations are carried out until the desired rank or system convergence is achieved. In contrast to existing approaches, our technique generates aesthetically better layouts and runs more than two orders of magnitude faster.
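As a rough illustration of the ranking-plus-replica-exchange idea, the sketch below scores toy layouts with a weighted sum of placeholder aesthetic terms and runs a small CPU replica-exchange Markov chain in NumPy; the scoring terms, temperatures, and layout encoding are illustrative assumptions, not the authors' CUDA implementation.

```python
# Toy CPU sketch: weighted aesthetic scoring + replica-exchange search.
import numpy as np

rng = np.random.default_rng(0)

def aesthetic_score(layout, weights):
    """Toy weighted score (higher is better). Real terms such as symmetry,
    alignment, density and size balance would replace these placeholders."""
    xs, ys = layout[:, 0], layout[:, 1]
    symmetry  = -abs(xs.mean() - 0.5)        # prefer horizontally centered mass
    alignment = -np.std(ys)                  # prefer elements sharing a row
    density   = -abs((xs.max() - xs.min()) * (ys.max() - ys.min()) - 0.5)  # target coverage
    return weights @ np.array([symmetry, alignment, density])

def replica_exchange(n_items=6, n_replicas=4, iters=2000,
                     weights=np.array([1.0, 0.5, 0.25])):
    temps   = np.geomspace(0.01, 1.0, n_replicas)          # cold .. hot
    layouts = [rng.random((n_items, 2)) for _ in range(n_replicas)]
    scores  = [aesthetic_score(l, weights) for l in layouts]
    for _ in range(iters):
        for r in range(n_replicas):
            # hot replicas take distant jumps, cold replicas fine-tune
            cand = np.clip(layouts[r] + rng.normal(0, temps[r], layouts[r].shape), 0, 1)
            s = aesthetic_score(cand, weights)
            if s > scores[r] or rng.random() < np.exp((s - scores[r]) / temps[r]):
                layouts[r], scores[r] = cand, s
        # occasionally swap neighbouring replicas (the "exchange" step)
        r = rng.integers(n_replicas - 1)
        d = (1 / temps[r] - 1 / temps[r + 1]) * (scores[r + 1] - scores[r])
        if rng.random() < min(1.0, np.exp(d)):
            layouts[r], layouts[r + 1] = layouts[r + 1], layouts[r]
            scores[r], scores[r + 1] = scores[r + 1], scores[r]
    best = int(np.argmax(scores))
    return layouts[best], scores[best]
```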
When presbyopic people use digital devices, they often zoom in on the display, because it is not in focus when they move it close to their face. We previously proposed an automatic display zoom system for presbyopic people [Fang and Funahashi 2018]. However, some of the information goes off the small display after zooming in (Fig. 2(a) to (b)), so it is necessary to scroll frequently, which is a bother. On the other hand, a conventional partial zoom tool, like a magnifying glass, is also usually provided (Fig. 2(a) to (c)). The part around the zoomed area is cut off, so it is necessary to move the glass frequently as well. People sometimes want to skim through sentences and understand an overview. Although it is difficult to read blurry words (Fig. 3(a)), you can guess and read a sentence that includes blurry words when the other words are clear (Fig. 3(b)). Therefore, we reconsider the zoom-in method for presbyopic people. For example, the area being attended to is zoomed in so it can be read clearly, and the magnification of the surrounding area is gradually reduced so that all information remains on the small display even though some words are zoomed out. It is expected that, like blurred words, the unzoomed words around the clearly zoomed-in words can still be guessed and read. We propose a suitable partial zoom-in function that allows you to skim a document.
This paper proposes an intuitive tool for users to create 3D architectural models from 2D sketch input. A user only needs to draw the outline of a frontal or oblique view of a building. Our system recognizes the parts drawn in a sketch and estimates their types. The estimated information is then used to compose the corresponding 3D model. In addition, our system provides assistive tools for rapid editing. The modeling process can be iterative and incremental. To build a building complex, a user can gradually create models from one view to another. Our experiments show that the proposed interface with sketch analysis tools eases the process of 3D building modeling.
This paper describes "Scented Graphics," our artistic experiments with a scented-printing technique that utilizes an inkjet printer. By mixing water-soluble aromatic oils with inks, an inkjet printer can be used to control scent mixing with high precision at almost no extra cost. Such features are rarely provided by existing scented-printing services at the same level of flexibility and cost-effectiveness. We mimicked Swiss graphic design as the style in our experiments to facilitate a preliminary investigation of this technique. As scent mixing is a fundamental technique in olfactory art, our experiments can serve as design exemplars in a preliminary exploration of the technique.
Strokes occur among people of all races and genders. The consequences of a stroke often include significant muscle weakness on one side of the body, which must be physically exercised to attempt to restore its previous strength and mobility.
In traditional mirror therapy, the partially disabled hand or leg is hidden by a mirror. The patient sees a reflection of the healthy side of their body where the disabled limb should be, in order to stimulate the brain to operate the partially disabled hand or leg in the proper way. The patient interacts with the mirror using the unaffected hand and observes its reflection.
While there are many professional examples of successful character designs, there seems to be little academic formalization in standardizing a process to achieve consistent visual results. In this work, we present such a formal process to construct visual designs for character archetypes that are given by "verbal descriptions". This process is based on visual semiotics that are used for creating clear meaning behind design choices while still retaining a sense of aesthetic through principles of artistic design. Using this process, we have developed a set of encyclopedic references for a wide variety of psychology and literary archetypes to demonstrate the power of this approach. We also used this process successfully in a visual storytelling class.
We propose a system that automatically generates layouts for magazines that require graphical design. In this system, when images or texts are input as the content to be placed in layouts, an appropriate layout is automatically generated in consideration of content and design. The layout generation process is performed by randomized processing in accordance with a rule set of minimum conditions that must be satisfied for layouts (minimum condition rule set), and a large number of candidates are generated. The appearance, style, design, and composition of each candidate are evaluated by a learning-to-rank estimator, and the top-scoring candidates are returned to the user. Users can greatly improve the efficiency of layout creation and editing by selecting from among the automatically generated candidate layouts.
We propose a novel and intuitive method for exploring recoloring variations of vector graphics. Compared with existing methods, ours is specifically tailored for vector graphics, where color distributions are sparser and are explicitly stored using constructs like solid colors or gradients, independently of other semantic and spatial relationships. Our method tries to infer some of these relationships before formulating color transfer as a transport problem between the weighted color distributions of the reference and the target vector graphics. We enable creative exploration by providing fine-grained control over the resulting transfer, allowing users to modify relative color distributions in real time.
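As a sketch of the transport formulation, the snippet below computes an entropically regularized transport plan between two small weighted palettes in plain NumPy (a Sinkhorn loop) and maps each source color to the barycenter of its transported mass; palette extraction, gradient handling, and the interactive controls described above are omitted, and all names are illustrative.

```python
# Toy color transfer as a transport problem between weighted palettes.
import numpy as np

def sinkhorn_transport(src_colors, src_weights, ref_colors, ref_weights,
                       eps=0.05, iters=200):
    """Entropic-OT transport plan (n_src x n_ref) between two weighted palettes."""
    # Cost: squared distance between palette colors (e.g. RGB or Lab in [0, 1]).
    C = ((src_colors[:, None, :] - ref_colors[None, :, :]) ** 2).sum(-1)
    K = np.exp(-C / eps)
    u = np.ones(len(src_colors))
    for _ in range(iters):                      # Sinkhorn iterations
        v = ref_weights / (K.T @ u)
        u = src_weights / (K @ v)
    return u[:, None] * K * v[None, :]

def apply_plan(T, ref_colors):
    """Map each source palette entry to the barycenter of its transported mass."""
    return (T / T.sum(axis=1, keepdims=True)) @ ref_colors

# Toy palettes with per-color weights (e.g. proportional to covered area).
src = np.array([[0.9, 0.1, 0.1], [0.1, 0.2, 0.8]])
ref = np.array([[0.2, 0.7, 0.3], [0.8, 0.8, 0.1]])
T = sinkhorn_transport(src, np.array([0.7, 0.3]), ref, np.array([0.5, 0.5]))
new_palette = apply_plan(T, ref)                # recolored solid fills / gradient stops
```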
In this paper, we propose a method to colorize line drawings of anime characters' faces with colors from a reference image. Previous studies using reference images often fail to realize fully-automatic colorization, especially for small areas, e.g., eye colors in the resulting image may differ from the reference image. The proposed method accurately colorizes eyes in the input line drawing using automatically computed hints. The hints are round patches used to specify the positions and corresponding colors extracted from the eye areas of a reference image.
Chinese glove puppetry is a traditional folk art with a long history and wide reach. This paper presents a collaborative multi-user virtual reality system for puppetry opera. In our system, each user has a unique perspective on the shared virtual world and interacts through our virtual reality network. The system achieves human-computer interaction and realizes interaction between people. In addition, it brings an entertaining experience to users and is easy to operate for all ages. For cultural preservation, we can record a grandmaster's puppet show performance in our system. Our research not only delivers a balance of art and technology in cultural creativity, but also preserves folk art.
We propose a graph matching-based anime-colorization method from line drawings using multiple reference images. A graph structure of each frame in an input line drawing sequence helps to find correspondences of regions to be colorized between frames. However, it is difficult to find precise correspondences of whole frames only from a single reference image, because the graph structure tends to change drastically during the sequence. Therefore, our method first finds an optimal image from multiple reference images according to a cost function that represents shape similarity between nodes and compatibility of node pairs. While it is necessary to prepare several manually colored reference images, our method is still effective in reducing the effort required for colorization in anime production. We demonstrate the effectiveness of our method using actual images from our production.
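The snippet below sketches, under assumed graph and feature representations, how a reference image might be selected by the kind of cost described above, combining per-node shape similarity with pairwise (edge) compatibility; it is not the authors' matching algorithm.

```python
# Illustrative reference selection for graph-matching-based colorization.
import numpy as np

def matching_cost(frame_nodes, frame_edges, ref_nodes, ref_edges, assignment):
    """assignment[i] = index of the reference node matched to frame node i."""
    # Unary term: difference of per-region shape descriptors (area, centroid, ...).
    unary = sum(np.linalg.norm(frame_nodes[i] - ref_nodes[assignment[i]])
                for i in range(len(frame_nodes)))
    # Pairwise term: penalize frame adjacencies missing in the reference graph.
    pairwise = sum(1.0 for (i, j) in frame_edges
                   if (assignment[i], assignment[j]) not in ref_edges
                   and (assignment[j], assignment[i]) not in ref_edges)
    return unary + pairwise

def pick_reference(frame, references, match_fn):
    """Choose the reference with the lowest matching cost for this frame."""
    costs = []
    for ref in references:
        assignment = match_fn(frame, ref)        # e.g. a graph-matching solver
        costs.append(matching_cost(frame["nodes"], frame["edges"],
                                   ref["nodes"], ref["edges"], assignment))
    return int(np.argmin(costs))
```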
In this research, the authors designed an interactive spatial augmented reality system for stage performance based on UWB positioning and Bluetooth triggering technologies. The position of the actor is obtained through an antenna tag carried by the actor and signal base stations placed on the stage. Special effects can be triggered through the Bluetooth module according to the actor's position. The system has a high degree of freedom in practical applications, can present an interactive spatial augmented reality effect, and therefore provides new possibilities for the application of spatial augmented reality in stage performance. The system can bring a more immersive experience to the audience, and it also brings new possibilities for the aesthetic creation of opera.
The authors developed a VR orchestral application for an interactive music experience, allowing the virtual musical instruments in an orchestral piece to be repositioned spatially, dynamically, and interactively in VR space. This can be done in changing environments where 3D audio technology is used to restructure traditional orchestral pieces into a new music art form. User experience surveys were undertaken with two kinds of users, and the results show that the VR orchestral system developed in this paper offers distinct advantages for the musical experience.
Illimitable Space System v2 is a configurable toolbox which provides multimodal interaction and serves as a platform for artists to enhance their performance through the use of depth and colour data from a 3D capture device. Its newest iteration was presented as part of ChineseCHI in 2018.
This latest iteration of ISSv2 is powered by an open source core named OpenISS. The core allows the ISSv2 platform to be run as a distributed system. Video and depth capture are done on a computer acting as a server, with a client component displaying the applied effects and video in a web browser. This has the added benefit of allowing the artist to broadcast their performance live and opens the way for audience interaction. There are two primary motivations behind creating an open source core for the ISS. First, open source technology allows more people to participate in the development process and understand how the technology works, while spreading maintenance responsibilities to the respective parties. Second, having a core allows parts of the system to be swapped out at will without having to modify the whole system at once, which is particularly relevant with respect to capture devices.
Given a point set S ⊆ R2, reconstruction refers to the process of identifying a vector shape that best approximates the input. Although this field was pioneered in 1983 by Edelsbrunner [Edelsbrunner et al. 2006] and has been heavily studied since then, the general problem remains open, ill-posed, and challenging. Solving it is essential for a wide range of applications, from image processing, pattern recognition, and sketching to wireless networks.
Meet in Rain is a serious game on Chinese poetry. Players explore imaginary sceneries that depict Chinese poems and, as various events unfold, must complete given tasks that help them better appreciate the poems. Its visual design also mimics Chinese paintings of the era in which the poems were created. As only a few serious games exist for Chinese poetry and they mostly focus on knowledge acquisition, our work provides a rare design exemplar of a serious game designed with the intention to foster aesthetic appreciation for a cultural subject.
We propose an effective method to solve the multi-character audio-driven facial animation (ADFA) problem in an end-to-end fashion via deep neural networks. In this paper, each character's ADFA is considered a single task, and our goal is to solve the ADFA problem in a multi-task setting. To this end, we present MulTaNet for multi-task audio-driven facial animation (MTADFA), which learns a unified cross-task feature mapping from audio to vertices that captures shared information across multiple related tasks, while also learning a within-task prediction network that encodes character-dependent topological information. Extensive experiments indicate that MulTaNet generates more natural-looking and stable facial animation, and shows better generalization to unseen languages compared with previous approaches.
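A minimal PyTorch sketch of the multi-task idea, assuming a shared audio encoder plus one per-character head; layer sizes, feature formats, vertex counts, and the training loop are placeholders rather than the MulTaNet architecture.

```python
# Shared cross-task trunk + per-character prediction heads (illustrative only).
import torch
import torch.nn as nn

class MultiTaskADFA(nn.Module):
    def __init__(self, audio_dim=80, hidden=256, vertex_counts=(5023, 3931)):
        super().__init__()
        # Shared trunk: audio features -> unified cross-task representation.
        self.shared = nn.Sequential(
            nn.Linear(audio_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One head per character, predicting that character's vertex offsets.
        self.heads = nn.ModuleList([nn.Linear(hidden, n * 3) for n in vertex_counts])

    def forward(self, audio_feats, task_id):
        z = self.shared(audio_feats)                    # (B, hidden)
        out = self.heads[task_id](z)                    # (B, n_verts * 3)
        return out.view(audio_feats.shape[0], -1, 3)    # (B, n_verts, 3)

# Toy usage: one optimization step per character (task), alternating over tasks.
model = MultiTaskADFA()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for task_id, n_verts in enumerate((5023, 3931)):
    audio = torch.randn(4, 80)              # batch of per-frame audio features
    target = torch.randn(4, n_verts, 3)     # stand-in ground-truth vertex displacements
    loss = nn.functional.mse_loss(model(audio, task_id), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```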
We present an educational virtual reality (VR) puzzle game set in an archaeological context. We digitally documented the site architecture and a selection of excavated artefacts using structure-from-motion (SfM) mapping, reconstructed the site as it was during the Classic Period (AD 250-900) based on its current state and archaeological findings, and created the natural environment using procedural modeling. With this collection of resources, we created a holistic landscape of the Mayan site of Cahal Pech. The player can link the Mayan ruin's current state to its past by collecting artefacts and evidence, and discover the architectural beauty and historical richness of this site.
In this work, we solve the problem of real-time transfer of geometric style from a single glyph to the entire glyph set of a vector font. In our solution, a single glyph is defined as one or more closed Bézier paths, which are further broken down into primitives to define a set of segments. The modifications to these segments are percolated to the entire glyph set by comparing the sets of segments across glyphs using cues such as the order, direction, and spatial placement of segments. Once the target segments in the other glyphs are identified, the transformation from the style glyph is applied to the target glyphs.
Furthermore, we establish user-controlled policies for style percolation, such as mapping line-segment modifications to curve segments. This extension to the algorithm enables the user to create multiple variations of a glyph.
This work presents a combination of 3D printing with mixed reality for use in museum exhibitions or for cultural heritage. Whereas priceless artefacts are currently encased in glass, kept safe and out of reach of visitors, we present a new pipeline that allows visitors hands-on interaction with realistic 3D-printed replicas of the artefacts, which are digitally augmented to take on the genuine artefacts' appearances.
We propose a method for transforming inclined panoramic images into an upright posture, as if they had been captured upright. The method rectifies images automatically, using neither additional information nor external instructions, and it robustly brings panoramic images of both indoor and outdoor scenes upright. It makes it easier for unskilled people to produce high-quality panoramic images for widespread applications, including city navigation and real estate.
In recent years, there has been a trend toward teaching programming in elementary schools around the world. When teaching programming to lower-grade students, robots and puzzles are often used to make programming easy to understand. However, those tools are limited in the execution results they can produce. Higher-grade students often use visual programming as learning material. However, visual programming requires one computer per student, and many students must first learn how to use a computer. Therefore, we propose a programming tool using tangible blocks and AR. This makes it possible to learn programming intuitively with fewer restrictions. Our tool is operated using a smartphone and tangible blocks, without using a computer. By using AR, it is possible to create intuitive programs that can interact with reality. We asked teachers who have experience teaching programming to children to assess the usefulness of our tool for programming education in schools. One opinion was that it might be well suited for multi-person programming.
"Biodigital" is a Sci-Fi interactive story set in the year 2117 that combines VR film, immersive 3D environments, and VR data visualization. It turns data into a cinematic experience where a user is enmeshed as a character in the story. This VR storytelling tells the tale of humanity a hundred years from now. It also encourages us to think "How should we live in the future?"
We propose a projection-based augmented reality (AR) robot system that provides pervasive support for the education and safety of preschoolers via a deep learning framework. This system can utilize real-world objects as metaphors for educational tools by performing object detection based on deep learning in real-time, and it can help recognize the dangers of real-world objects that may pose risks to children. We designed the system in a simple and intuitive way to provide user-friendly interfaces and interactions for children. Children's experiences through the proposed system can improve their physical, cognitive, emotional, and thinking abilities.
Natural disasters constitute unexpected and severe threats with devastating effects on communities worldwide. Recent studies emphasize the importance of public awareness and training of first responders in disaster preparedness and response activities. This paper presents a virtual reality framework that creates a realistic 3D gaming environment with real-time and historical weather and disaster conditions. The main goals of the project are to increase public awareness about disasters using gamification techniques, and to train and evaluate emergency responders by simulating real-life scenarios. The system supports voice recognition for interacting with the virtual world, and analyzes the user's actions and voice to detect their emotional and psychological state.
Appropriately chosen user interfaces are essential parts of immersive augmented reality experiences. Regular user interfaces cannot be efficiently used for interactive, real-time augmented reality applications. In this study, a gesture-controlled educational gaming experience is described in which gesture recognition relies on deep learning methods. Our implementation is able to replace a depth-camera-based gesture recognition system with a conventional camera while ensuring the same level of recognition accuracy.
We present a collaborative immersive technology effort, InNervate AR and InNervate VR. These applications meet the need to expand on existing anatomy education platforms by implementing a more dynamic and interactive user interface. This user interface allows for exploration of the complex relationship between motor nerve deficits and their effects upon the canine anatomy's ability to produce movement. Preliminary AR user studies provided us with positive feedback in the quality of learning. The studies show that the dynamic touch interactions in AR definitely benefit students' critical reasoning and spatial visualization in learning motor nerve and muscle relationships. However, users seek a more immersive VR-based learning environment, without the distractions that an AR experience may offer. Based on this feedback, a VR version of this learning experience was created. Preliminary responses show that users are satisfied with this VR environment which allows them to manipulate and control the anatomical content with full-body interactions.
We present Vox-Cells, a volume visualization framework designed for real-time volume investigation and exploration. We seek to treat data as a first-class citizen with a 1:1 relationship between the data and its corresponding representation. CPU-GPU transfer is minimized, and novel approaches to volume construction and lighting are explored in order to maximize performance for deployment on consumer grade Virtual Reality (VR) Head-mounted displays (HMD).
Virtual and Augmented Reality (VR and AR) are two fast growing mediums, not only in the entertainment industry but also in health, education and engineering. A good VR or AR application seamlessly merges the real and virtual world, making the user feel fully immersed. Traditionally, a computer-generated object can be interacted with using controllers or hand gestures [HTC 2019; Microsoft 2019; Oculus 2019]. However, these motions can feel unnatural and do not accurately represent the motion of interacting with a real object. On the other hand, a physical object can be used to control the motion of a virtual object. At present, this can be done by tracking purely rigid motion using an external sensor [HTC 2019]. Alternatively, a sparse set of markers can be tracked, for example using a motion capture system, and the positions of these used to drive the motion of an underlying non-rigid model. However, this approach is sensitive to changes in marker position and occlusions, and often involves costly non-standard hardware [Vicon 2019]. In addition, these approaches often require a virtual model to be manually sculpted and rigged, which can be a time-consuming process. Neural networks have been shown to be successful tools in computer vision, with several key methods using networks for tracking rigid and non-rigid motion in RGB images [Andrychowicz et al. 2018; Kanazawa et al. 2018; Pumarola et al. 2018]. While these methods show potential, they are limited to using multiple RGB cameras or large, costly amounts of labelled training data.
We propose a new aerial imaging display in which autostereoscopic objects with horizontal and vertical parallax appear as if they are floating in the air. This system operates by displaying an integral photography image in which depth is reversed beforehand and by observing the image from the other side of a micro mirror array.
Monte Carlo methods for transient rendering have become a powerful instrument to generate reliable data in transient imaging applications, either for benchmarking, analysis, or as a source for data-driven approaches. However, due to the increased dimensionality of time-resolved renders, storage and data bandwidth are significant limiting constraints, where a single time-resolved render of a scene can take several hundreds of megabytes. In this work we propose a learning-based approach that makes use of deep encoder-decoder architectures to learn lower-dimensional feature vectors of time-resolved pixels. We demonstrate how our method is capable of compressing transient renders up to a factor of 32, and recover the full transient profile making use of a decoder. Additionally, we show how our learned features significantly mitigate variance on the recovered signal, addressing one of the pathological problems in transient rendering.
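A toy PyTorch autoencoder illustrating the compression idea, assuming 1024 temporal bins per transient pixel and a latent code 32x smaller; the architecture, bin count, and training details are placeholders, not the paper's network.

```python
# Learned compression of time-resolved pixels: encode to a 32x smaller code,
# decode to recover the full transient profile (illustrative sketch).
import torch
import torch.nn as nn

T_BINS, LATENT = 1024, 1024 // 32        # 32x compression per transient pixel

encoder = nn.Sequential(
    nn.Linear(T_BINS, 256), nn.ReLU(),
    nn.Linear(256, LATENT),
)
decoder = nn.Sequential(
    nn.Linear(LATENT, 256), nn.ReLU(),
    nn.Linear(256, T_BINS),
)

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
transients = torch.rand(64, T_BINS)      # stand-in for rendered transient pixels
for _ in range(100):
    recon = decoder(encoder(transients))
    loss = nn.functional.mse_loss(recon, transients)
    opt.zero_grad()
    loss.backward()
    opt.step()

# At storage time only the LATENT-sized codes need to be written out;
# the decoder reconstructs the profiles.
codes = encoder(transients)              # (64, 32) instead of (64, 1024)
```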
In stereoscopic displays, such as those used in VR/AR headsets, our two eyes are presented with different views. The disparity between the views is typically used to convey depth cues, but it could be used for other purposes. We devise a novel technique that takes advantage of binocular fusion to boost the perceived local contrast and visual quality of images. Since the technique is based on fixed tone curves, it has negligible computational cost and is well suited for real-time applications, such as VR rendering. To control the trade-off between the level of enhancement and binocular rivalry, we conduct a series of experiments that lead to a new finding, explaining the factors that dominate rivalry perception in a dichoptic presentation where two images of different contrasts are displayed. With this new finding, we demonstrate that the enhancement can be quantitatively measured and binocular rivalry is well controlled.
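As a rough illustration of dichoptic contrast manipulation with fixed tone curves, the NumPy sketch below gives one eye a contrast-boosted copy of an image and the other a contrast-reduced copy; the particular curves and the strength value are arbitrary placeholders, not those derived from the paper's experiments.

```python
# Dichoptic pair: two differently tone-mapped copies of the same image.
import numpy as np

def dichoptic_pair(img, strength=0.3):
    """img: float image in [0, 1]; returns (left, right) views."""
    mid = 0.5
    left  = np.clip(mid + (img - mid) * (1.0 + strength), 0.0, 1.0)  # boosted contrast
    right = np.clip(mid + (img - mid) * (1.0 - strength), 0.0, 1.0)  # reduced contrast
    return left, right
```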
Future auto-stereoscopic displays offer an amazing possibility of virtual reality without the need for head-mounted displays. Since, fundamentally, we only need to generate viewpoints for known observers, the classical approach of rendering all views at once is wasteful in terms of GPU resources and limits the scale of an auto-stereoscopic display. We present a technique that reduces GPU consumption on an auto-stereoscopic display by giving the display a context awareness of its observers. The technique was first applied to the Looking Glass device on the Unity3D platform. Rather than rendering 45 different views at the same time, the framework only requires, for each observer, the six views that are visible to both eyes based on the tracked eye positions. Given the current specifications of this device, the framework helps save 73% of the GPU consumption for the Looking Glass if it were to render an 8K x 8K resolution scene, and the saved GPU consumption increases as the resolution increases. This technique can be applied to reduce GPU requirements for future auto-stereoscopic displays.
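The sketch below illustrates, under assumed display geometry, how tracked eye angles could be mapped to the small subset of the 45 views that actually needs rendering; the cone width, indexing, and neighbour-blending choice are assumptions, not the device's specification.

```python
# Context-aware view selection: render only views near the tracked eyes.
import numpy as np

NUM_VIEWS = 45
VIEW_CONE_DEG = 40.0                    # assumed horizontal cone covered by all views

def views_for_eyes(eye_angles_deg):
    """eye_angles_deg: horizontal angles of tracked eyes relative to the display."""
    needed = set()
    for a in eye_angles_deg:
        t = (a + VIEW_CONE_DEG / 2) / VIEW_CONE_DEG              # 0..1 across the cone
        idx = int(np.clip(round(t * (NUM_VIEWS - 1)), 0, NUM_VIEWS - 1))
        needed.update({idx, min(idx + 1, NUM_VIEWS - 1)})        # neighbour for blending
    return sorted(needed)

# One observer: two eyes a few degrees apart -> a handful of views instead of 45.
print(views_for_eyes([-2.0, 1.5]))
```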
Due to the increased availability and accuracy of GPS sensors, the field of movement ecology has been able to benefit from larger datasets of movement data. As miniaturisation and the efficiency of electronic components have improved, additional sensors have been coupled with GPS tracking to enable features related to the animal's state at a given position to be recorded. This capability is especially relevant to understand how environmental conditions may affect movement.
Recently, material type recognition using color or light field cameras has been studied. However, visual-pattern-based approaches for material type recognition without direct acquisition of surface reflectance show limited performance. In this work, we propose IR surface reflectance estimation using an off-the-shelf ToF (Time-of-Flight) active sensor such as the Kinect, and we perform surface material type recognition based on both color and reflectance cues. A two-stream deep neural network, consisting of a convolutional neural network encoding the visual cue and a recurrent neural network encoding the reflectance characteristic, is proposed for material classification. Evaluations of IR surface reflectance estimation and material type recognition on our Color-IR Material Dataset show promising performance compared with prior approaches.
To develop a graphics project with ease and confidence, the reliability and extensibility of the underlying framework are essential. While there are existing options, e.g., pbrt-v3 [Pharr et al. 2016] and Mitsuba [Jakob 2010], they either focus on education or have not been updated for a long time. We would like to present an alternative solution named Photon.
Accurate modeling and rendering of human skin appearance has been a long-standing goal in computer graphics. Of particular importance has been the realistic modeling and rendering of layered subsurface scattering in skin, for which various bio-physical models have been proposed based on the spectral distribution of chromophores in the epidermal and dermal layers of skin [Donner and Jensen 2006; Donner et al. 2008; Jimenez et al. 2010]. However, measurement of the spectral parameters of absorption and scattering of light for such bio-physical models has been a challenge in computer graphics. Previous works have either borrowed parameters for skin type from the tissue-optics literature [Donner and Jensen 2006], or employed extensive multispectral imaging for inverse rendering of detailed spatially varying parameters for a patch of skin [Donner et al. 2008]. Closest to our approach, Jimenez et al. [2010] employed observations under uniform broadband illumination to estimate two dominant parameters (melanin and hemoglobin concentrations) for driving a qualitative appearance model for facial animation.
Light projection onto falling water produces a distinct and impressive experience, which is suitable for entertainment and advertising installations in public spaces [Barnum et al. 2010; Eitoku et al. 2006]. One of the popular and classical techniques used in illuminating water for such purposes is strobe lighting, which presents an optical illusion of levitating (or slowly falling or rising) water drops depending on the relation between the water-dropping and strobe-lighting frequencies (e.g. [Pevnick 1981; Rosenthal 1984]).
The ocean surface is highly dynamic. It moves rapidly, and thus its shading changes rapidly as well. Usually, this does not pose any problems if the shading is smooth. However, for a surface with a strong highlight or bright reflection moving rapidly, it causes inaccurate and unnatural flickering. In traditional rendering algorithms, each frame is rendered independently at a discrete time, resulting in serious temporal aliasing artifacts. In particular, for a wavy water surface, reflection vectors may not hit the light source at the sampled instant even though they actually do for part of the frame time. Removing such aliasing in real time is an active research area and many methods have been proposed [Jimenez et al. 2011]. They can improve the fidelity and efficiency of rendering. However, their focus is on spatial anti-aliasing, and most of them do not address the temporal aliasing problem, particularly the one observed in rendering the reflected image of a light source on the water surface.
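The toy example below illustrates the temporal aliasing issue: a highlight visible for only part of the frame interval either flickers (shading evaluated at a single discrete time) or contributes proportionally (shading averaged over sub-frame samples); the highlight model is a deliberately simplistic placeholder.

```python
# Single-time shading vs. integration over the frame interval.
import numpy as np

def highlight_intensity(t):
    """1.0 while the reflection vector hits the light source, else 0.0.
    Here the highlight is assumed aligned during 30% of each frame interval."""
    return 1.0 if (t % 1.0) < 0.3 else 0.0

frame_time = 4.0                                             # some frame's start time
single_sample = highlight_intensity(frame_time)              # 1.0 or 0.0: flickers
sub_samples = np.linspace(frame_time, frame_time + 1.0, 16, endpoint=False)
integrated = np.mean([highlight_intensity(t) for t in sub_samples])  # ~0.3: stable
print(single_sample, integrated)
```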
Plasmonic color generation describes structural color arising from resonant interaction between visible light and metallic nanostructures, causing selective frequencies of light to be scattered and/or absorbed [Kristensen et al. 2017; Sun and Xia 2003]. The perceived color from such metallic nanostructures is highly dependent on the viewing angle, and the color appearance can change with the color of the viewing background. Plasmonic color generation is a rapidly emerging research area with potential advantages over conventional pigment printing technology, including higher printing resolution and robustness, greater compatibility for integration and functionalization, and reduced resource requirements [Mudachathi and Tanaka 2017; Zhu et al. 2017]. Structural color from plasmonic nanostructures has already been used to improve security measures in currency notes and credit cards [Lee et al. 2018].
For most mammals and other vertebrates, the tail plays an important role for the body, providing various functions that expand mobility or acting as a limb that allows manipulation and gripping. In this work, we propose Arque, an artificial biomimicry-inspired anthropomorphic tail that allows us to alter our body momentum for assistive and haptic feedback applications. The proposed tail consists of adjacent joints with a spring-based structure to handle shearing and tangential forces and to allow the length and weight of the target tail to be managed. The internal structure of the tail is driven by four pneumatic artificial muscles, providing the actuation mechanism for the tail tip. Here, we highlight potential applications for using such a prosthetic tail as an extension of the human body to provide active momentum alteration in balancing situations, or as a device to alter body momentum for full-body haptic feedback scenarios.
A contemporary challenge involves scientific education and the connection between new technologies and the heritage of the past. CubeHarmonic (CH) joins novelty and tradition, creativity and education, science and art. It takes shape as a novel musical instrument where magnetic 3D motion tracking technology meets musical performance and composition. CH is a Rubik's cube with a note on each facet, and a chord or chord sequence on each face. The position of each facet is detected through magnetic 3D motion tracking. While scrambling the cube, the performer gets new chords and new chord sequences. CH can be used to compose, improvise, and teach music and mathematics (group theory, permutations) with colors and physical manipulation supporting abstract thinking. Furthermore, CH allows visually impaired people to enjoy Rubik's cube manipulation by using sounds instead of colors.
The concept of spatial computing has a high potential to augment our lives. We can imagine the concept of spatial computing expanding in the future, as the expanded visual space blends into our daily lives. In such a future, how will we feel and interact with another reality that overlaps with real space? We have investigated this question through a haptic interface approach. We previously developed the Synesthesia Suit [Konishi et al. 2016]. This is a full-body haptic suit that extends the VR experience of the game "Rez Infinite". It delivers the visual and musical experience of "Rez" to the whole body and provides haptic feedback. However, it is difficult to walk around a physical space in the Synesthesia Suit due to the thick cables required for its operation. It is best suited to VR gaming and was difficult to apply to spatial computing experiences.
In this paper, we propose three display methods for projection-based augmented reality. In spatial augmented reality (SAR), determining where information, objects, or contents are to be displayed is a difficult and important issue. We use deep learning models to estimate user pose and suggest ways to solve this issue based on the estimated data. Finally, each method can be applied appropriately according to the application and scenario.
We propose a potential fluid-measurement technology aimed at supporting biomechanics research in water sports using fluid simulation and motion analysis. Cellulose nanofibers are introduced into the water as tracer particles to visualize the movement of the water. An optical property of the nanofibers, called flow birefringence, makes water flows appear brighter than their surroundings when placed between right and left circularly polarized plates. We tested the capability of the technology in a water tank and succeeded in using an existing particle-tracking method, particle image velocimetry (PIV), to measure the flows from a pump in the tank.
Training in anesthetic application is a challenge in teaching dentistry, given the complexity of the procedure and the risks involved. Through an immersive virtual reality environment, the system presented here offers a playful way of learning through game elements. This serious game allows the user to practice the anesthesia technique with or without aids (tactile or visual) and to receive scores at different levels of difficulty. An important distinguishing feature is the possibility of the syringe being driven automatically (tactile aid) in order to reproduce trajectories provided by experienced instructors. Thus, the student can feel as if the instructor were guiding the learner's hand.
This study proposes a tabletop projection device that can be implemented by combining physical objects with interactive projections. Users can interact on kraft paper using everyday tools, such as marker pens, toothbrushes, colored blocks, and square wooden blocks. The input of the proposed device is a multifunction sensor, and the output is a tabletop projector. Using MagicPAPER, four types of interactions are implemented, namely drawing, gesture recognition, brushing, and building blocks. The abstract and poster discuss the design motivation and system description of MagicPAPER.
High resolution facial capture has received significant attention in computer graphics due to its application in the creation of photorealistic digital humans for various applications ranging from film and VFX to games and VR. Here, the state of the art method for high quality acquisition of facial geometry and reflectance employs polarized spherical gradient illumination [Ghosh et al. 2011; Ma et al. 2007]. The technique has had a significant impact in facial capture for film VFX, recently receiving a Technical Achievement award from the Academy of Motion Picture Arts and Sciences [Aca 2019]. However, the method imposes a few constraints due to the employment of polarized illumination, and requires the camera viewpoints to be located close to the equator of the LED sphere for appropriate diffuse-specular separation for multiview capture [Ghosh et al. 2011]. The employment of polarization for reflectance separation also reduces the amount of light available for exposures and requires double the number of photographs (in cross and parallel polarization states), increasing the capture time and the number of photographs required for each face scan.
In this poster, we propose a new haptic rendering algorithm that dynamically modulates wave parameters to convey distance, direction, and object type by utilizing neck perception and the Hapbeat-Duo, a haptic device composed of two actuators linked by a neck strap. This method is useful for various VR use cases because it provides feedback without disturbing users' movement. In our experiment, we presented haptic feedback of sine waves which were dynamically modulated according to direction and distance between a player and a target. These waves were presented to both sides of the users' necks independently. As a result, players could reach invisible targets and immediately know they had reached the targets. The proposed algorithm allows the neck to become as important a receptive part of body as eyes, ears, and hands.
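The sketch below shows one plausible form of the modulation described above, assuming each neck-side actuator receives a sine wave whose amplitude encodes distance and whose left/right balance encodes direction; the constants and mapping are illustrative, not the authors' exact algorithm.

```python
# Two-channel haptic waveform: amplitude <- distance, left/right balance <- direction.
import numpy as np

def neck_feedback(distance_m, direction_deg, base_freq=80.0, duration=0.1, sr=8000):
    """Return (left, right) waveforms for one haptic update tick."""
    t = np.arange(int(duration * sr)) / sr
    amp = np.clip(1.0 - distance_m / 10.0, 0.1, 1.0)     # closer -> stronger
    pan = np.clip(direction_deg / 90.0, -1.0, 1.0)       # -1 = left, +1 = right
    wave = np.sin(2 * np.pi * base_freq * t)
    left  = amp * (1.0 - max(pan, 0.0)) * wave
    right = amp * (1.0 + min(pan, 0.0)) * wave
    return left, right

# Target 3 m away, 45 degrees to the right: the right actuator gets the
# stronger signal, the left actuator is attenuated.
left, right = neck_feedback(3.0, 45.0)
```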
The Ouija board game is associated with a type of involuntary motion known as an ideomotor action. We sought to clarify the conditions under which this motion occurs by evaluating the effect that visual and haptic movement cues have on its occurrence. Using our lateral skin deformation device, we found that the simultaneous presentation of visual and tactile illusory motion and force produced larger ideomotor actions than when either modality was presented alone, an effect that was further potentiated by the presence of another player (an avatar).
DisplayBowl is a bowl-shaped hemispherical display for showing omnidirectional images with direction data. It provides users with a novel way of observing 360-degree video streams, which improves awareness of the surroundings when operating a remote-controlled vehicle compared to conventional flat displays and HMDs. In this paper, we present a user study, in which we asked participants to control a remote drone using an omnidirectional video stream, to compare the uniqueness and advantages of three displays: a flat panel display, a head-mounted display, and DisplayBowl.
We present ShareHaptics, a novel modular system that provides tactile and pressure feedback in mixed reality applications using a novel actuator: shape memory alloy (SMA). We apply it to the fingers, wrist, and ankle. Although it can be used for haptic feedback in a diverse set of use cases, we specifically focus on collaborative applications: ShareHaptics allows users to haptically jack in to a remote environment via a custom glove and ankle braces. We demonstrate a wide range of applications: watching sports, gaming, collaborative discussions, and skill transfer.
While humans are quite good at copying motions from others, it is difficult to do so in a dynamic sport such as skiing. Hence, we propose a virtual reality ski training system, which visualizes prerecorded expert motion in different ways and enables users to learn by copying. The system is based on a commercial indoor ski simulator, a VR headset, and two VR trackers to capture the ski's motion. Users can control their skis on the virtual ski slope and improve their skills by following a digital avatar of the expert skier replayed in front of them. We investigate 3 types of visualizations for training: Graphs to visualize the angle of feet compared to the expert, periodic copies of the expert's pose to show the spatial and temporal motion of the key movements, and a more minimal ribbon-trace of the leading skier to point out the optimized trajectory.
Common mechanical actuators for haptic feedback are generally dedicated to creating a single kind of feedback, e.g., vibrotactile only, pressure only, or shear force only [Choi and Kuchenbecker 2013; Girard et al. 2016; Pacchierotti et al. 2017]. This is at odds with the fact that highly realistic, fully immersive VR/AR sometimes requires complete and rich multi-mode haptic feedback. For instance, when rubbing your finger on a wooden desk, the fingertip simultaneously senses both the high-frequency vibration due to the roughness of the surface texture and the quasi-static pressure due to the pushing force, and your brain combines them into the feeling of a wooden desk. The lack of any of the involved physical signals may seriously degrade the realism. This may be one of the reasons why current haptic interface technology for VR/AR environments is not at the same level as visual interfaces.
In this work, we present a procedural approach to capture a variety of appearances of American Second Empire houses. To develop this procedural approach, we have identified the set of rules and similarities of Second Empire houses. Our procedural approach, therefore, captures the style differences of Second Empire houses with relatively few parameters. Using our interface, we are able to generate virtual houses in a wide variety of styles of American Second Empire architecture. We have also developed a method to break these virtual models into slices in order to 3D print them efficiently and economically. Using this approach, we have created miniatures of two landmark buildings: the Hamilton-Turner Inn in Savannah and the Enoch Pratt House in Baltimore. Note that the virtual models still provide more detail than the prints because of the limited resolution of 3D printing processes.
We present a variety of new compositing techniques using multi-plane images (MPIs) [Zhou et al. 2018] derived from footage shot with an inexpensive and portable light field video camera array. The effects include camera stabilization, foreground object removal, synthetic depth of field, and deep compositing. Traditional compositing is based around layering RGBA images to visually integrate elements into the same scene, and it often requires manual 2D and/or 3D artist intervention to achieve realism in the presence of volumetric effects such as smoke or splashing water. We leverage the newly introduced DeepView solver [Flynn et al. 2019] and a light field camera array to generate MPIs stored in the DeepEXR format for compositing with realistic spatial integration and a simple workflow that offers new creative capabilities. We demonstrate this technique by combining footage in ways that would otherwise be very challenging and time-intensive to achieve with traditional techniques, with minimal artist intervention.
Masks are heavily used in image and video processing, particularly in the context of green screen keying. Designing good masks is a difficult task that involves painting over small details of images. Usually, only a rough mask is created. We propose an algorithm to expand such a mask, using color similarity. Our approach is fast, even on 4K images, and compares favorably with standard tools used in keying.
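A small NumPy/SciPy sketch of one way to expand a rough mask by color similarity: grow the mask one pixel at a time and keep only new pixels whose color is close to the mean color already inside the mask; the threshold, metric, and iteration count are illustrative assumptions, not the paper's algorithm.

```python
# Color-similarity-guided expansion of a rough mask.
import numpy as np
from scipy.ndimage import binary_dilation

def expand_mask(image, mask, color_thresh=0.08, iterations=20):
    """image: HxWx3 float in [0, 1]; mask: HxW bool rough mask."""
    mask = mask.copy()
    for _ in range(iterations):
        mean_color = image[mask].mean(axis=0)
        ring = binary_dilation(mask) & ~mask                  # 1-pixel growth front
        close = np.linalg.norm(image[ring] - mean_color, axis=1) < color_thresh
        if not close.any():
            break
        grow = np.zeros_like(mask)
        grow[ring] = close
        mask |= grow
    return mask
```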
Cinematic scientific visualizations turn complex scientific phenomena and concepts into stunning graphics and make them easier for the general public to comprehend. Adding interactivity to cinematic scientific visualizations is highly beneficial especially for educational purposes, as it keeps the viewers engaged and promotes active learning [Cano et al. 2017]. Although there are existing software tools such as VisIt that are capable of handling large data sets and allow for interactive exploration, they are usually designed for scientists and not meant for producing cinematic visualizations for the general public. Creating aesthetically pleasing visualizations of scientific data helps to better communicate the scientific concepts, increase impact, and reach a broader audience [Borkiewicz et al. 2018]. As existing examples of visualizations that are both interactive and cinematic have mainly been produced with custom software, there is a lack of easily accessible tools for developing this type of scientific visualization.
We propose a novel method for editing vector graphics that enables users to intuitively modify complex Bézier geometry. Our method uses a Generative Adversarial Network (GAN) to automatically predict salient points for any arbitrary geometry defined by cubic Bézier curves; these points are used as handle locations for a Linear Blend Skinning transformation. Further, we bind the input geometry to a triangle mesh to decouple the complexity of the input geometry from the mesh topology. Finally, to reconstruct Bézier curves from the transformed mesh, we formulate a linear optimization problem and solve it in a performant manner to ensure real-time feedback without increasing the number of Bézier segments.
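The sketch below illustrates the deformation stage with Linear Blend Skinning, assuming handle locations are already given (in the paper they are predicted by a GAN) and using simple inverse-distance weights in place of the bound triangle mesh; the curve-refitting optimization is not shown.

```python
# Linear Blend Skinning of points sampled from the input geometry.
import numpy as np

def skinning_weights(points, handles, power=2.0):
    """Inverse-distance weights, normalized per point (stand-in for mesh-bound weights)."""
    d = np.linalg.norm(points[:, None, :] - handles[None, :, :], axis=-1) + 1e-6
    w = 1.0 / d ** power
    return w / w.sum(axis=1, keepdims=True)

def linear_blend_skinning(points, weights, transforms):
    """transforms: one 2x3 affine matrix per handle; returns deformed points."""
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)   # (N, 3)
    per_handle = np.einsum('hij,nj->nhi', transforms, homog)              # (N, H, 2)
    return np.einsum('nh,nhi->ni', weights, per_handle)                   # (N, 2)

# Two handles: one identity, one translated; points near the second handle move.
points = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
handles = np.array([[0.0, 0.0], [2.0, 0.0]])
transforms = np.array([
    [[1, 0, 0.0], [0, 1, 0.0]],       # handle 0: identity
    [[1, 0, 0.5], [0, 1, 0.3]],       # handle 1: translate by (0.5, 0.3)
], dtype=float)
deformed = linear_blend_skinning(points, skinning_weights(points, handles), transforms)
```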
Visual effects (called "VFX" in this poster) have expanded the range of expression and are necessary for enhancing the mood of the story in films and videos. A variety of software tools that support producing VFX have been developed.
A procedural art-directable workflow is developed for voxel 3D printing using existing digital effects technologies. Customised for the Stratasys J750's unique materials, the system produces large-scale prosthetic eyes as a case study for film and display work.
Time-resolved imaging has made it possible to look around corners by exploiting information from diffuse light bounces. While there have been successive improvements in the field since its conception, so far it has only been proven to work in very simple and controlled scenarios. We present a public dataset of synthetic time-resolved Non-Line-of-Sight (NLOS) scenes with varied complexity aimed at benchmarking reconstructions. It includes scenes that are common in the real world but remain a challenge for NLOS reconstruction methods due to the ambiguous nature of the higher-order diffuse bounces naturally occurring in them. With over 300 reconstructible scenes, the dataset contains an order of magnitude more scenes than is currently available. The final objective of the dataset is to boost NLOS research and take it closer to its real-world applications.
We propose a detail refinement method to enhance the visual effect of turbulence in an irrotational vortex. We restore the missing angular velocity from the particles and convert it into linear velocity to recover turbulent detail lost to numerical dissipation.
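A tiny NumPy example of the conversion mentioned above: the linear velocity induced at a particle by an angular velocity omega about a center is the cross product v = omega x r; the surrounding simulation is omitted.

```python
# Angular velocity -> induced linear velocity at a particle.
import numpy as np

def angular_to_linear(omega, particle_pos, center):
    """omega: (3,) angular velocity; returns the induced linear velocity at particle_pos."""
    r = particle_pos - center
    return np.cross(omega, r)

v = angular_to_linear(np.array([0.0, 0.0, 2.0]),   # spin about the z-axis
                      np.array([1.0, 0.0, 0.0]),    # particle 1 unit from the axis
                      np.array([0.0, 0.0, 0.0]))
# v == [0, 2, 0]: tangential velocity consistent with the vortex rotation.
```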
We have developed an approach to construct and design a new class of space-filling shapes, which we call Delaunay Lofts. Our approach is based on interpolation of a stack of planar tiles whose dual tilings are Delaunay diagrams. We construct control curves that interpolate Delaunay vertices. Voronoi decomposition of the volume using these control curves as Voronoi sites gives us lofted interpolation of original polygons in planar tiles. This, combined with the use of wallpaper symmetries allows for the design of space-filling shapes in 3-space. In the poster exhibition, we will also demonstrate 3D printed examples of the new class of shapes (See Figures 1 and 3).
Advances in technology brought about the introduction of eLearning in educational institutes. By supplementing traditional courses with eLearning materials, instructors are able to introduce new learning methods without completely deviating from standard education programs [Basogain et al. 2017]. Some of the most popular forms of eLearning include online courses [Aparicio and Bacao 2013], [Goyal 2012], video clips of lectures, and gamification of courses and materials [Plessis 2017]. This paper introduces and evaluates the performance of eLearning videos featuring anime-styled avatars (a.k.a. VTubers) speaking with vocoder-transformed audio, and compares them with traditional lecturer videos.
Modern medical science strongly depends on imaging technologies for accurate diagnosis and treatment planning. Raw medical images generally require post-processing, such as edge and contrast enhancement and noise removal, for visualization. In this paper, a clustering-based contrast enhancement technique is presented for computed tomography (CT) images.
Computing injective mappings with low distortion on meshes is an important problem due to its wide range of practical applications in computer graphics, geometric modeling, and physical simulation. Tasks such as surface parametrization or shape deformation are often reduced to minimizing non-convex and non-linear geometric energies defined over triangulated domains. These energies are commonly expressed in a finite element manner as a weighted sum of distortion densities D over the simplices S:

\min_f \sum_{s \in S} w_s \, D\big(J_s(f)\big) \quad (1)
\text{subject to} \quad \det\big(J_s(f)\big) > 0 \ \ \forall s \in S, \quad (2)
\qquad A f = b, \quad (3)

where J_s(f) is the Jacobian of f on simplex s, (2) enforces f to preserve the orientation of each simplex, and (A, b) is a linear system of the given positional constraints. The orientation constraints are particularly important in parametrization problems, since they avoid undesirable foldover artifacts in the texture, while positional constraints are widely used in shape deformation applications, such as point-to-point deformations, deformations with fixed anchors, and more.
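For concreteness, the sketch below evaluates one such distortion density for a single 2D triangle, assuming the symmetric Dirichlet energy as an example of D: it builds the Jacobian of f from rest to deformed corners, checks the orientation constraint det(J) > 0, and returns the density; the weighted sum over all simplices and the constrained minimization are not shown.

```python
# Per-triangle Jacobian and an example distortion density (symmetric Dirichlet).
import numpy as np

def triangle_jacobian(rest, deformed):
    """rest, deformed: (3, 2) arrays of triangle corner positions."""
    R = np.column_stack([rest[1] - rest[0], rest[2] - rest[0]])             # rest edges
    D = np.column_stack([deformed[1] - deformed[0], deformed[2] - deformed[0]])
    return D @ np.linalg.inv(R)                                             # J = D R^{-1}

def symmetric_dirichlet(J):
    """Example density D(J); infinite if the simplex is inverted."""
    if np.linalg.det(J) <= 0.0:                 # orientation constraint violated
        return np.inf
    Jinv = np.linalg.inv(J)
    return np.sum(J * J) + np.sum(Jinv * Jinv)  # ||J||_F^2 + ||J^{-1}||_F^2

rest     = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
deformed = np.array([[0.0, 0.0], [1.2, 0.1], [0.1, 0.9]])
print(symmetric_dirichlet(triangle_jacobian(rest, deformed)))
```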
For cytology, pathology, or histology image analysis, whether performed by computer-aided algorithms or human experts, a general issue is excluding the disturbance caused by noisy objects, especially when they appear with high similarity in shape, color, and texture to the target cells or tissues. In this paper, we introduce a novel model, based on deep learning and hand-crafted features, to reduce such noisy objects, which appear in large quantities and wide distributions in microscope images. The model experimentally reduces false positives without affecting the objects of interest for cancer detection. Moreover, it also provides more distinct images for human experts to make the final diagnosis.
Physics-based models for ocean dynamics and optical raytracing are used extensively for rendering maritime scenes in computer graphics [Darles et al. 2011]. Raytracing models can provide high-fidelity representations of an ocean image with full control of the underlying environmental conditions, sensor specifications, and viewing geometry. However, the computational expense of rendering ocean scenes can be high. This work demonstrates an alternative approach to ocean raytracing via machine learning, specifically Generative Adversarial Networks (GANs) [Goodfellow et al. 2014]. In this paper, we demonstrate that a GAN trained on several thousand small scenes produced by a raytracing model can be used to generate megapixel scenes roughly an order of magnitude faster with a consistent wave spectrum and minimal processing artifacts.
In computer graphics, stippling is a widely used non-photorealistic rendering technique. In this art of representing images with dots, one of the key problems is the placement of the dots. In general, they should be distributed evenly, yet with some randomness at the same time. Blue noise methods provide these characteristics and are used by state-of-the-art gray-scale algorithms to distribute dots. Color stippling, however, is more challenging, as each channel should have an even distribution at the same time. Existing approaches cast color stippling as a multi-class blue noise sampling problem and provide high-quality results at the cost of a very long processing time. In this paper, we propose a real-time structure-aware method for color stippling based on samples generated from an incremental Voronoi set. Our method can handle an arbitrary input color vector for stippling and produces significantly better results than previous methods, at real-time frame rates. We evaluate the perceptual quality of our stippling with a user study and its numerical performance by measuring the MSE between the image reconstructed from the stippling and the input image. The real-time performance of our method makes interactive stippling editing possible, providing the artist with an effective tool to quickly explore a wide space of color image stippling.
We present an unsupervised incremental learning method for refining hand shape and pose estimation. We propose a refiner network (RefNet) that can augment a state-of-the-art hand tracking system (BaseNet) by refining its estimations on unlabeled data. At each input depth frame, the estimations from the BaseNet are iteratively refined by RefNet using a model-fitting strategy. During this process, the RefNet adapts to the input data characteristics by incremental learning. We show that our method provides more accurate hand shape and pose estimates on both a standard dataset and real data.
This paper proposes an innovative industry practice for fractal geometry generation and rendering processes. The VFX Fractal Toolkit (VFT) aims to provide powerful, yet intuitive and artist-friendly workflows for exploring and generating vast numbers of fractals. VFT allows for node-based descriptions of fractals, implemented in SideFX Houdini. VFT is built specifically for Visual Effects (VFX) pipelines and employs standard practices. It aims to provide artists with a toolset that helps them explore fractal forms of generative art directly in VFX applications.