SIGGRAPH '24: ACM SIGGRAPH 2024 Posters

SESSION: Animation & Simulation

A Multi-modal Framework for 3D Facial Animation Control

3D facial reconstruction and animation have advanced significantly in recent decades. However, existing methods based on single-modal input struggle with controlling specific facial parts and require post-processing for natural rendering. Recent approaches integrate audio and 2D estimation results to enhance facial animations, improving the naturalness of head pose, eye, and mouth animations. In this paper, we present a multi-modal 3D facial animation framework that uses video and audio input simultaneously. We experiment with different motion-control settings and evaluate, through user studies, their ability to produce natural and precise facial animation.

Animated Ink Bleeding with Computational Fluid Dynamics

This work presents an advanced computational approach for simulating the traditional ink-on-wet-paper art technique, focusing on the dynamic interplay between ink, water, and paper using the Lattice Boltzmann Method (LBM). We model ink motion using the advection-diffusion equation coupled with the Navier-Stokes equations in a porous medium. We explore the impact of the background material’s permeability on the realism of digital ink simulations. Our results highlight this method’s potential in digital art, offering artists and animators a novel tool for creating realistic ink effects.
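
To make the coupling concrete, the following minimal sketch advances an ink-concentration field by one explicit advection-diffusion step, with diffusivity scaled by a hypothetical paper-permeability map. The paper's actual solver uses the LBM and a full Navier-Stokes coupling; this toy finite-difference version only illustrates the scalar transport.

```python
import numpy as np

def ink_step(c, u, v, kappa, dt=0.1, dx=1.0):
    """One explicit advection-diffusion step for ink concentration c.

    c     : 2D ink concentration field
    u, v  : velocity components (from a separate fluid solve)
    kappa : spatially varying diffusivity, e.g. scaled by paper permeability
    """
    # Central-difference gradients for advection (illustrative; upwinding
    # would be needed for stability at higher velocities).
    dcdx = (np.roll(c, -1, axis=1) - np.roll(c, 1, axis=1)) / (2 * dx)
    dcdy = (np.roll(c, -1, axis=0) - np.roll(c, 1, axis=0)) / (2 * dx)
    # 5-point Laplacian for diffusion.
    lap = (np.roll(c, 1, 0) + np.roll(c, -1, 0) +
           np.roll(c, 1, 1) + np.roll(c, -1, 1) - 4 * c) / dx**2
    return c + dt * (-u * dcdx - v * dcdy + kappa * lap)

# Hypothetical paper: a noisy permeability map makes ink bleed unevenly.
rng = np.random.default_rng(0)
perm = 0.05 + 0.05 * rng.random((128, 128))   # permeability-scaled diffusivity
ink = np.zeros((128, 128)); ink[60:68, 60:68] = 1.0
u = v = np.zeros((128, 128))                  # quiescent water for this demo
for _ in range(200):
    ink = ink_step(ink, u, v, perm)
```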

FluidsFormer: A Transformer-Based Approach for Continuous Fluid Interpolation

Generative AI for 2D Character Animation

In this pilot project, we teamed up with artists to develop new workflows for 2D animation while producing a short educational cartoon. We created several workflows streamlining the animation process, bringing the artists’ vision to the screen more effectively.

Non-Hermitian Absorbing Layers for Schrödinger's Smoke

This paper proposes novel open boundary conditions for incompressible Schrödinger flow (ISF) by introducing a new non-Hermitian term to the original Schrödinger’s equation, effective only within narrow-banded layers. Unlike previous work that explicitly requires auxiliary variables, our method achieves the same objective without such complexity, facilitating implementation. We demonstrate that our method retains the benefits of the original ISF while robustly dissipating velocity without noticeable reflections.
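
As an illustration of the idea, here is a minimal split-step Schrödinger solver in which the non-Hermitian term appears as a purely imaginary damping potential, nonzero only inside narrow boundary bands. The layer profile, constants, and initial wave function are illustrative assumptions; a real ISF solver would additionally perform pressure projection and wave-function normalization.

```python
import numpy as np

def absorbing_potential(n, width=16, strength=1.0):
    """Damping coefficient for the non-Hermitian term, nonzero only in
    narrow bands along the domain boundary (an assumed quadratic ramp)."""
    x = np.arange(n)
    ramp = np.maximum(0, 1 - x / width) + np.maximum(0, 1 - (n - 1 - x) / width)
    g1d = strength * ramp**2
    return g1d[:, None] + g1d[None, :]          # 2D layer profile

def split_step(psi, hbar, dt, gamma, kx, ky):
    """One update of i*hbar*dpsi/dt = -(hbar^2/2)*lap(psi) - i*gamma*psi."""
    # Kinetic step in Fourier space.
    psi = np.fft.ifft2(np.fft.fft2(psi) *
                       np.exp(-1j * hbar * dt * (kx**2 + ky**2) / 2))
    # Non-Hermitian term: pure exponential decay inside the absorbing layers.
    return psi * np.exp(-gamma * dt / hbar)

n, hbar, dt = 256, 0.1, 1 / 30
k = 2 * np.pi * np.fft.fftfreq(n)
kx, ky = np.meshgrid(k, k, indexing="ij")
gamma = absorbing_potential(n)
psi = np.exp(2j * np.pi * np.random.rand(n, n))  # toy initial wave function
for _ in range(100):
    psi = split_step(psi, hbar, dt, gamma, kx, ky)
```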

Projection-Based Handling of Contact Constraints

Revitalizing Traditional Animation: Pre-Composite Frame Interpolation as a Production Catalyst

Simulating High-Resolution 3D Smoke with 2D Turbulence Transfer

SESSION: Art & Design

Choreofil: Dancing Filament Light Bulb

ChoreoFil is a light bulb system with dancing filaments. Shape-memory alloy actuators serve as the filaments, bending like the tentacles of a sea anemone while the top of each actuator glows softly. By downsizing the previous system, the driving circuit now fits inside a light bulb. The project is currently an actuator-driven system only; once connected to networks, information, and music, it will become ambient media with an organism-like presence.

Cybermove: a Critical Practice on Virtualizing Our Daily Life

Cybermove prompts exploration of the notion of "housing". The artist used 3D scanning to virtually recreate her childhood room, packed all her belongings into two suitcases, and began her journey abroad. This digital space traveled with the artist to a foreign country, where it was unveiled and shared with visitors through a VR headset. The moving process was documented on social media, connecting with viewers' experiences of housing issues and amassing over 3 million views.

EDAVS: Emotion-Driven Audiovisual Synthesis Experience

In this paper, we propose a novel approach to the creation of immersive multimedia content. At its core, EDAVS harnesses the subtle inflexions of human speech, translating emotive cues into dynamic visual narratives and corresponding soundscapes. This amalgamation of auditory sentiment and visual splendour is built upon machine-learning algorithms capable of deep semantic and emotional analysis.

Enhancing the Digital Inheritance and Embodied Experience of Zen based on Multimodal Mixed Reality System

Zen is an important part of the world's intangible cultural heritage (ICH) but faces transmission difficulties today. We present an immersive multimodal system that lets audiences meditate and experience Zen by integrating AI, natural interaction, and mixed reality (MR). An evaluation (N = 66) shows that the system provides the audience with a high-quality sensory experience, enhancing their engagement and empathy. This approach brings new insights to the computer graphics (CG) and ICH communities.

Illusory Eyescape - Exploring the Variations of Consciousness through Generative Art and Eye-Tracking Techniques

Historically, art has been considered a form of expression uniquely reserved for humans. However, technological advances, particularly in computer graphics, have expanded the boundaries of artistic creation beyond human exclusivity. The emergence of Generative Art in recent years has not only transformed the artistic process but has also initiated extensive dialogues on the interactions between humans and machines. To delve deeper into this theme, we developed an interactive art installation titled "Illusory Eyescape." This installation integrates Generative Art and Eye-tracking Technologies to probe the fundamental nature of consciousness in humans and machines, questioning the role of art in this new epoch of human-machine synergy. This exploration broadens our comprehension of the potential amalgamations of art and technology and challenges our conventional perceptions of art.

Interactive RGB+NIR Photo Editing

Kandinsky As You Preferred

Due to the significant generative capabilities of GenAI (generative artificial intelligence), the art community has actively embraced it to create painterly content. Large text-to-image models can quickly generate aesthetically pleasing outcomes, but the process can be non-deterministic and often involves tedious trial and error as users struggle to formulate effective prompts for their desired results. This paper describes a generative approach that empowers users to work easily with a large text-to-image (TTI) model to create their preferred painterly content. The authors propose a large-model personalization method, Semantic Injection, that personalizes a large TTI model to a specific artistic style, namely Kandinsky’s paintings from his Bauhaus era, as the Artist Model. Working with a Kandinsky expert, the authors first establish a semantic descriptive guideline and a TTI dataset of Kandinsky’s style, and then apply Semantic Injection to obtain an Artist Model of Kandinsky, empowering users to create preferred Kandinsky-style content in a deterministically controllable manner.

Picture (Im-) Perfect - Exploring Imperfections in Computer Generated Rendering

Advances in computer graphics rapidly drive new innovations in animation practices aiming for ever more realism - calculating exact symmetry, perfect lines, and flawless shadows that follow the physicality of reality. However, these technical advancements mark a departure from traditional drawing techniques and physical artistry and handcraft, which often embrace originality, experimentation, and individual authorship in animation practices. This project explores hand-drawing and hand-painting techniques in the context of 3D shading and texturing, reintroducing a much-needed return to traditional artistic skills through a mixture of hand-painting and procedural texturing applied to 3D shading. Reaching beyond hyper-realism, it illustrates a new synthesis between 2D drawing and 3D shading techniques.

Starry Starry Night: An Interactive Narrative Visualization of Astrology Imagery in Tang Poetry

Astrology emerges as a distinctive and mystical facet of traditional Chinese culture, imparting a unique charm when interwoven into ancient literature such as Tang poetry. We designed an interactive narrative visualization, Starry Starry Night, to present the application of astrology imagery in Tang poems. The visualization combines author-driven storytelling and reader-driven exploration, hoping to effectively enhance readers’ understanding of astrology-related poetry and to promote the dissemination of traditional Chinese culture.

Zeitgeist - A deep learning visualization of Social Flow

Zeitgeist is a participatory artwork that forms part of an AHRC-funded research project. The interface employs artificial intelligence (AI)-based algorithms to represent mental states of creative ‘Flow’. Flow, a concept of peak performance, low stress, and heightened creative stimulation [Nakamura and Csikszentmihalyi, 2002], is measured and assessed from physiological data analysed in real time by deep-learning algorithms. Mapped onto 3D representations and displayed on a Pepper's-ghost-style hologram, participants jointly observe their individual and collective propensity for being in Flow. As a research project, the artefact is embedded in research on the effect of participatory art engagements on Flow, wellbeing, and social connectedness. A pilot with two cohorts, creatives (N=12) and scientists (N=16), demonstrates a significant effect of the participatory art intervention on Flow, mood, and social connectedness, with facilitation and environmental context as confounding factors.

SESSION: Augmented & Virtual Reality

A Real-time Visualization System of Pseudo-muscle Activity Using a VR Device

The goal of this research is to display muscle activity in real time in response to human motion input. We developed a system for real-time visualization of pseudo-muscle activity using a VR device. The user controls a CG avatar in VR space by inputting body motion with the VR device. We constructed simplified models that output muscle activity corresponding to the avatar's posture, based on muscle-analysis data estimated from motion data. The system lets users visualize the positions of active muscles on a CG avatar in VR space, presenting different muscle activities depending on the input elbow flexion/extension movement and the state of the upper limb at that time. The positions of active muscles and the magnitudes of their activity are represented by the color and expansion of the muscles on the avatar. By assigning controller input to body parts beyond just the hands, users can move those parts in virtual space and visualize their muscle activity.

Creating a Multi-Sensory Horror Experience with VR

Driving Through the Data: Extended Reality Perceptual Support System (XRPSS)

This work introduces the Extended Reality Perceptual Support System (XRPSS) that provides real-time 3D LiDAR sensor data to off-road vehicle drivers with interactive XR for enhancing in-situ spatial awareness in natural environments. The system extends to ex-situ data exploration via motion driving simulators and VR. We created a proof-of-concept system that can both visualize real-time LiDAR point clouds in the field and generate virtual experiences for the motion simulator post-drive.

Effective Visual Feedback for Virtual Grasping

The purpose of this study is to examine the effectiveness of visual feedback in virtual grasping with a bare hand. We propose two types of visual feedback that can generate pseudo-haptics and compare them with five types of existing visual feedback. In our experiment, subjects performed a pick-and-drop task in which a virtual object, a sphere with a radius of 5 cm, is grasped, moved, and released toward a target. Quantitative and qualitative evaluation by twenty subjects revealed that one type of the proposed visual feedback helped subjects perform the task most accurately, while the other type was preferred the most among all the visual feedback types.

Enhancing Radiography Education through Co-located Mixed Reality with Tracker-based X-ray Imaging Simulation

This poster proposes a co-located collaborative Mixed Reality (MR) teaching system to enhance Veterinary Radiography training. By integrating real X-ray equipment with virtual representations of horses, the system simulates radiographic procedures effectively. The cost-effective setup combines real equipment with virtual annotations, such as collimation beams, to achieve synchronized virtual objects and generate real-time radiography. This innovative approach enables instructors to demonstrate equine techniques and assess students’ proficiency more effectively. An expert review study was conducted to gather feedback and evaluate user experience related to this novel teaching approach.

Enhancing VR Customer Service Training: A System for Generating Customer Queries and Evaluating Trainee Responses

Virtual customer service training using avatars eliminates the need for physical facilities, reducing costs and enabling remote participation. However, the challenges lie in the authoring cost of preparing service-specific questions and the need for experienced instructors. To address this, we employ LLM technology to generate service-specific questions and evaluate trainees' answers, mitigating authoring costs. Because LLMs are limited to general knowledge, we integrate retrieval-augmented generation (RAG) to imbue them with specific service knowledge, which enables the generation of service-specific questions. Furthermore, we are developing a self-training system in which RAG assesses the correctness of trainee answers, enabling independent learning without an instructor. This paper discusses the issues faced and insights gained during the development of our system.
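
A heavily simplified sketch of the RAG loop described above, using sentence-transformers for retrieval. Here `llm` is a hypothetical stand-in for whatever model endpoint the system calls, and the prompts, model name, and top-k value are illustrative assumptions, not the authors' design.

```python
from sentence_transformers import SentenceTransformer, util

def llm(prompt: str) -> str:
    # Hypothetical stand-in: plug in your LLM provider here.
    raise NotImplementedError

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def generate_question(service_docs):
    """Retrieval-augmented question generation: ground the LLM in the
    service manual before asking it to write a customer query."""
    context = "\n".join(service_docs[:3])
    return llm(f"Using only this service manual:\n{context}\n"
               "Write one realistic customer question.")

def grade_answer(question, trainee_answer, service_docs, top_k=3):
    """Retrieve the most relevant manual passages, then ask the LLM to
    judge the trainee's answer against them."""
    doc_emb = embedder.encode(service_docs, convert_to_tensor=True)
    q_emb = embedder.encode(question, convert_to_tensor=True)
    idx = util.cos_sim(q_emb, doc_emb)[0].argsort(descending=True)[:top_k]
    context = "\n".join(service_docs[int(i)] for i in idx)
    return llm(f"Reference:\n{context}\nQuestion: {question}\n"
               f"Trainee answer: {trainee_answer}\n"
               "Is the answer correct? Explain briefly.")
```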

Exploring the Application of Multi-Sensory Device Interaction in Traditional Chinese Medicine Popular Science Education

The origins of traditional Chinese medicine (TCM) span thousands of years, rooted in extensive medical and life experience. Central to TCM is herbal medicine, which integrates natural science principles. However, access to fresh herbs is limited by growth conditions, regulations, and transportation costs. To address this, we propose the "TCM Herbal Education" system, which merges 3D scanning, interactive VR, and olfactory devices to teach Chinese herbal science across multiple disciplines. It offers insights into cultivation, pest control, prescriptions, and herbal aromas. By virtualizing traditional education, users interact with herbs via VR, enhancing understanding through sensory experiences. Exploratory research evaluates and refines the system toward improved cross-reality experiences.

Fitting Room: Lung Transplantation Donor-Recipient Size-matching in Virtual Reality

Multi-Role VR Training System for Film Production: Enhancing Collaboration with MetaCrew

Film production within studios has become an industry standard in mature professional filmmaking. However, most training institutions cannot afford the expensive film production equipment essential for providing beginners with collaborative training alongside directors, cinematographers, cameramen, and other film professionals. To address these challenges, we first propose the Film Production Virtual Reality Collaborative Learning (VRCL) system — MetaCrew. We then developed an interface featuring visual symbols for film operations to facilitate smooth communication and learning among beginners using professional terminology in film production. In an experiment with 24 participants, we further explored the impact of the film operation graphical user interface (GUI) design on social interaction factors within our system. Results show that our system and GUI enhance social interaction and learning outcomes among learners. Moreover, our system presents a smaller financial burden and requires less physical space.

Virtual Reality Application That Leverages the Proteus Effect to Make Blood Donation a Positive Experience

In Japan, the reluctance of young people to donate blood has become a social issue. To counter this trend, we propose a virtual reality (VR) application that turns blood donation into a positive experience by gradually changing the appearance of an avatar from a normal state to a state full of energy according to the elapsed time of the donation. In a comparison experiment between the proposed method and a method that distracts attention through stimulation, the pain of blood donation was significantly alleviated with the proposed method.

SESSION: Geometry & Modeling

Edible Caustics: Designing Caustics of Jelly via Differentiable Rendering

Neuro-Symbolic Transformation of Architectural Facades into Their Procedural Representations

We introduce a neuro-symbolic transformer model that converts flat, segmented facade structures into procedural definitions using a custom-designed split grammar. To facilitate this, we first develop a split grammar tailored for architectural facades and generate a dataset of facades alongside their procedural representations. This dataset is used to train our transformer model to convert segmented, flat facades into the procedural language of our grammar.
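
To make "split grammar" concrete, here is a toy interpreter in the spirit of such grammars: a labeled region is divided along an axis into labeled sub-regions, and rules are applied recursively. The rules and labels are illustrative; the paper's custom grammar is richer than this sketch.

```python
from dataclasses import dataclass

@dataclass
class Region:
    x: float; y: float; w: float; h: float; label: str

def split(region, axis, parts):
    """Split a labeled region into labeled sub-regions.
    parts: list of (fraction, label); fractions should sum to 1."""
    out, offset = [], 0.0
    for frac, label in parts:
        if axis == "x":
            out.append(Region(region.x + offset * region.w, region.y,
                              frac * region.w, region.h, label))
        else:
            out.append(Region(region.x, region.y + offset * region.h,
                              region.w, frac * region.h, label))
        offset += frac
    return out

# Toy procedural definition: facade -> floors -> wall/window tiles.
facade = Region(0, 0, 10, 6, "facade")
floors = split(facade, "y", [(1 / 3, "floor")] * 3)
tiles = [t for f in floors
         for t in split(f, "x", [(0.25, "wall"), (0.5, "window"), (0.25, "wall")])]
```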

O, What an Iridescent Web We Weave: Rendering Physically Inspired Spider Webs for Visual Effects

The presence of realistic spider webs can be used to establish plausible, visually appealing environments, providing a sense of immersion and authenticity to computer-generated worlds. When lit, cobwebs transform into displays of iridescent colors and patterns, resulting from the interaction between the light and the structure of the silk fibers.

We propose a straightforward, physically inspired method for rendering cobwebs using common VFX production software, replicating the iridescent visual appearance of spider webs for added realism in virtual environments.

Temporal Hierarchical Gaussian Mixture Models for Real-Time Point Cloud Streaming

SESSION: Images, Video & Computer Vision

3DCrewCap: Applying 3D Volumetric Video Capture for XR Helicopter Rescue Crew Training and Simulation

Helicopter rescue crew training scenarios are complex and hard to simulate in game engines with animated 3D digital characters. This challenge is compounded when simulating real-time photorealistic animated character sequences on XR head-mounted displays (HMDs). In this research, we present a practical early-stage 3D volumetric video capture and playback workflow for helicopter rescue crew training on XR HMDs. We break down the workflow of using Gaussian Splatting approaches to construct keyframed 3D animated models of rescue crew training actions. This application provides a practical example of the photorealistic capture and XR display of a helicopter rescue crew performing training scenarios.

Adaptive Sampling for Monte-Carlo Event Imagery Rendering

This paper presents a novel event-based camera simulation system based on physically accurate Monte Carlo path tracing with adaptive path sampling. The adaptive sampling in the proposed method is driven by the probability of event occurrence. First, our rendering system collects logarithmic luminance rather than raw luminance, reflecting the circuit characteristics of event-based cameras. We then calculate the probability that the logarithmic luminance gap exceeds the preset event threshold, i.e., how likely an event is to occur at each pixel, and sample paths adaptively by combining this rate with a previous adaptive sampling method. We demonstrate that our method achieves higher rendering quality than a baseline that allocates path samples uniformly across pixels.
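
A sketch of how such event-probability-driven sample allocation might look, treating the Monte Carlo log-luminance estimate as Gaussian. This is an assumption for illustration; the paper's exact probability model and its combination with the prior adaptive sampler are not reproduced here.

```python
import numpy as np
from scipy.stats import norm

def event_sample_rate(log_lum, log_lum_ref, noise_sigma, threshold,
                      n_min=1, n_max=64):
    """Allocate per-pixel path samples by the probability of an event firing.

    log_lum     : current estimate of per-pixel log luminance
    log_lum_ref : log luminance at the last event (reference level)
    noise_sigma : per-pixel std. dev. of the log-luminance estimate
    threshold   : event-camera contrast threshold
    """
    gap = np.abs(log_lum - log_lum_ref)
    # Probability the true log-luminance change crosses the threshold,
    # modeling the Monte Carlo estimate as Gaussian (an assumption).
    p_event = 1.0 - norm.cdf(threshold, loc=gap, scale=noise_sigma)
    samples = n_min + (n_max - n_min) * p_event
    return np.round(samples).astype(int)
```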

Classifying Texture Anomalies at First Sight

Consistent Image Registration for Multi-view Focus Bracketing in Micro-Scale Photogrammetry

Controllable Neural Reconstruction for Autonomous Driving

Neural scene reconstruction is gaining importance in autonomous driving, especially for closed-loop simulation of real-world recordings. This paper introduces an automated pipeline for training neural reconstruction models, utilizing sensor streams captured by a data collection vehicle. Subsequently, these models are deployed to replicate a virtual counterpart of the actual world. Additionally, the scene can be replayed or manipulated in a controlled manner. To achieve this, our in-house simulator is employed to augment the recreated static environment with dynamic agents, managing occlusion and lighting. The simulator’s versatility allows for various parameter adjustments, including dynamic agent behavior and weather conditions.

Crowdsourced Streetview: Integrating Real-Time Imagery Updates into Google Streetview

This paper presents Crowdsourced Streetview, a system that integrates real-time imagery updates into Google Streetview by leveraging crowdsourced images from social media platforms. Our approach utilizes advanced image alignment and feature detection algorithms to overlay user-contributed images onto existing Streetview data, achieving real-time performance with an image alignment runtime of 19 ms per image pair. Crowdsourced Streetview has the potential to provide up-to-date and real-time visual representations of locations, particularly in rapidly changing environments and during events such as natural disasters.
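
The paper does not spell out its alignment pipeline, but a typical feature-based approach of this kind can be sketched with OpenCV's ORB features and a RANSAC homography. The function calls below are standard OpenCV API; min_matches and the RANSAC threshold are illustrative parameters.

```python
import cv2
import numpy as np

def align_to_streetview(user_img, streetview_img, min_matches=12):
    """Warp a crowdsourced photo onto an existing Streetview frame."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(user_img, None)
    kp2, des2 = orb.detectAndCompute(streetview_img, None)
    if des1 is None or des2 is None:
        return None                              # no features found
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return None                              # not enough overlap to align
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = streetview_img.shape[:2]
    return cv2.warpPerspective(user_img, H, (w, h))
```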

Evaluating And Improving Disparity Maps Without Ground Truth

Image Segmentation from Shadow-Hints using Minimum Spanning Trees

Non-Line-of-Sight Imaging based on Dual Photography using Leaked EM Waves

In this study, we propose a novel method for Non-Line-of-Sight (NLoS) imaging utilizing leaked electromagnetic (EM) waves. Our approach leverages the ability to estimate the pattern projected by a hidden light source (projector) from its leaked EM waves. By estimating the projected pattern and capturing the diffuse surface of a wall illuminated by light reflected off the hidden object, we demonstrate the reconstruction of images of objects occluded from direct view. We show that leaked EM waves can be exploited to enhance NLoS imaging techniques, opening up new possibilities for reconstructing scenes beyond the direct field of view.

Reconstructionless Airborne Radiance Fields

Recreating the Sodium Vapor Matting Process

We present a Sodium Vapor Matting (SVM) system built from off-the-shelf components allowing simultaneous capture of a full-color foreground subject and an accurate holdout matte. We use a beamsplitter cube and two filters to image the sodium vapor wavelength independently of the rest of the visible spectrum. We build custom sodium vapor light sources for illuminating the background, and aim synchronized digital cinema cameras into the filtered prism to record the foreground and the matte. We process several composite shots using the system, using the matte and a clean plate to subtract visible light in the subject’s background and to hold out the alpha region from the new background.

Scribble: Auto-Generated 2D Avatars with Diverse and Inclusive Art-Direction

Sign Motion Generation by Motion Diffusion Model

Spectral Periodic Networks for Neural Rendering

T2DyVec: Leveraging Sparse Images and Controllable Text for Dynamic SVG

In this work, we introduce a controllable dynamic vector graphics generation method. While existing work mostly focuses on text-based generation of single-frame images, dynamic images, or single-frame vectors, there is a lack of research on generating dynamic vectors with complex elements and diverse styles. This is due to the unique challenges posed by dynamic vectors, which require coherent and seamless transitions of vector parameters between frames. To address these challenges, we propose T2DyVec, which leverages text prompts and sparse images as controls for vector generation. It incorporates Vector Consistency, Semantic Tracking, and VPSD to optimize the diffusion model for vector parameters, enabling the generation of coherent multi-frame dynamic vectors. This approach can significantly streamline creative workers’ workflows, facilitating generation and further editing.

Visualization of Flow Direction using Polarization Angle Changes of Cellulose Nanofiber Suspension

In this study, we propose a novel method for visualizing flow direction by exploiting the polarization angle changes of cellulose nanofiber suspensions. Utilizing a polarization camera, we calculate the polarization angle, enabling pixel-wise visualization of flow direction. We demonstrate that the boat’s shape significantly influences the flow patterns generated in its wake. Furthermore, we verify the correspondence between polarization angle and flow direction, and investigate the effects of illumination multiplexing and de-multiplexing on the proposed method. This study presents a promising technique for high-resolution flow direction visualization, with potential applications in fluid dynamics research and industrial settings.

SESSION: Interactive Techniques

3D Printing Shape-Changing Devices with Inductive Sensing

We present a novel technique that converts arbitrary 3D models into 3D printable shape-changing devices with sensing capability through the integrated helix-and-lattice structure. The structure comprises a central hollow helical channel, surrounded by lattice padding and a surface mesh, which allows the device to become elastic and deformable. By inserting a conductive steel wire into the hollow helical channel, inductive signals are generated when a current runs through the wire. When the device undergoes deformations, these inductive signals vary distinctively, reflecting the specific changes in the device’s shape. These signals, specific to each type of deformation, are interpreted as interactive user input using a trained machine-learning classifier. We also showcase three example applications, including a deformable Christmas tree, a Snake game joystick controller, and a snowman-like light control.
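
The classifier side of such a system can be sketched as follows: windowed inductance readings are summarized into simple statistical features and fed to an off-the-shelf classifier. The feature set and the deformation labels are illustrative assumptions, not the authors' exact design.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def featurize(signal):
    """Summarize one window of inductance readings into simple features."""
    return [signal.mean(), signal.std(), signal.min(), signal.max(),
            np.abs(np.diff(signal)).mean()]

def train_deformation_classifier(X_windows, y):
    """X_windows: list of 1D inductance windows; y: deformation labels,
    e.g. 0 = rest, 1 = squeeze, 2 = bend (labels are hypothetical)."""
    X = np.array([featurize(w) for w in X_windows])
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, y)
    return clf
```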

A Study on Tactile Illusions of Temperature and Touch Through Virtual Hand Illusion

Recently, a wide variety of VR content has become available, and the development of devices that engage the sense of touch is attracting attention. We investigated the sensory experience of "temperature," "smoothness," and "hardness" in virtual space. Unlike previous studies that provide haptic feedback using a glove-type device, we conducted a tactile experiment without any device by inducing a tactile illusion from visual information using the virtual hand illusion. The results suggest that visual information can evoke a tactile illusion of "temperature" and "smoothness."

BNNAction-Net: Binary Neural Network on Hands Gesture Recognitions

With the rise of wearable technology and real-time gesture recognition, lightweight and efficient models are essential. Traditional approaches struggle with computational demands and power consumption. We present BNNAction-Net, a hand gesture recognition system using Binary Neural Networks (BNNs) to reduce computational complexity. Evaluated on the EgoGesture dataset, our system simulates a real use case with a headset and frontal RGB-D cameras. Optimized with binary layers, pooling, and normalization, it achieves accuracy comparable to floating-point networks with lower resource consumption. Our findings highlight the efficiency of BNNs for wearable devices without significant accuracy loss.

EMSJUMP: Electrical Muscle Stimulation as a Wearable Training Tool for Take-off Phase of Ski Jumping

We developed a wearable Electrical Muscle Stimulation (EMS) system aimed at enhancing kinesthesia and promoting skill acquisition in ski jumping. Through experiments, it was demonstrated that the application of EMS during jumping significantly increased the angular velocity of knee extension and altered perception.

Fabricating Haptic Paper Interface

Our research focuses on developing haptic feedback mechanisms for paper interfaces. We present a comprehensive overview of our efforts and introduce approaches to incorporating haptic feedback into paper interfaces. Traditional paper interfaces primarily emphasize interactive features and often disregard haptic feedback, which is critical for enhancing user confidence in operation. Our research investigates how cardboard structures and static magnets can be utilized to incorporate haptic feedback features into paper interfaces. These techniques help to enhance the user experience associated with paper interfaces.

FactoryDecoder: Expertise-Free Digital Twin Generation and Modification Tool

Digital twins are essential visualization applications in manufacturing. However, their development requires specialized 3D engineers, which often complicates modifications during the operation stage. To address this complexity, we introduce FactoryDecoder, a development tool that lets users without 3D engineering expertise generate and modify digital twins using natural language input. FactoryDecoder converts users' descriptions of a production line into hierarchical asset codes, facilitating automated layout and simplified modification of digital twins. Furthermore, if the system finds that the 3D asset library lacks an appropriate device representation, it automatically uses a 3D mesh generator to create a new one. We evaluate the performance of large language models (LLMs) to optimize FactoryDecoder's capabilities, and preliminary user studies highlight its effectiveness.

ImmerseSketch: Transforming Creative Prompts into Vivid 3D Environments in VR

We propose ImmerseSketch, a framework designed to transform creative prompts into vivid and detailed 3D content within a Virtual Reality (VR) environment. Our aim is to inspire creative ideation and to provide creators with 3D reference content for painting. However, the computational demands of diffusion models for 3D model generation and the scarcity of high-quality 3D datasets limit the diversity and intricacy of the generated environments. To overcome these obstacles, we focus on generating initial panoramic images through diffusion models and then converting these images into rich 3D environments with the aid of depth estimation. The pilot study shows that the proposed ImmerseSketch can provide an immersive environment and assist the process of creation.

Real FPV Drone Simulator with 3D Scan Data and Visual Effects

TexEdge: Towards an Approach for Optimization of Sensor Layouts in Textile Interfaces

The Future of Interaction with Mobile Game Characters

This poster showcases the future of player-NPC interaction in gaming: more natural interaction, based on speech. It explains the implementation of verbal interaction with an NPC in a mobile game. Large Language Models (LLMs) open new ways of interacting in games, but their large size and memory footprint make deploying them on mobile a challenge. We identified a 33M-parameter small Language Model (LM), TinyStories [Eldan and Li 2023], repurposed it to be conversational, and combined it with a standard Unity Multi-Layer Perceptron (MLP) ML-Agents [Juliani et al. 2020] model. The LM drives the interaction with the user, while the MLP performs the actions. A Sentence Similarity model [Reimers and Gurevych 2019] filters user input to decide whether the NPC needs to talk to the player or to perform an action. Running all of these models locally on mobile inside a game engine like Unity is challenging in terms of optimization, but this poster demonstrates how it is possible.
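
The routing step described above, deciding between conversation and action, can be sketched with the sentence-transformers library. The model name, action phrases, and threshold below are illustrative assumptions rather than the authors' configuration.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # any small SBERT-style model

# Hypothetical canonical action phrases known to the MLP agent.
ACTIONS = ["walk to the door", "pick up the sword", "follow me"]
action_emb = model.encode(ACTIONS, convert_to_tensor=True)

def route_player_input(text, action_threshold=0.6):
    """Return ('action', index) if the input matches a known action closely
    enough, otherwise ('chat', None) to invoke the conversational LM."""
    emb = model.encode(text, convert_to_tensor=True)
    scores = util.cos_sim(emb, action_emb)[0]
    best = int(scores.argmax())
    if float(scores[best]) >= action_threshold:
        return "action", best
    return "chat", None
```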

SESSION: Rendering & Displays

A Surface-based Appearance Model for Pennaceous Feathers

The appearance of a real-world feather is the result of a complex light interaction with its multi-scale biological structure including the central shaft, branching barbs and interlocking barbules on those barbs. In this work, we propose a practical appearance model for feathers encoded as 2D textures where the overall appearance is a weighted BSDF of the implicit representations of the key biological structures. This BSDF can be applied to any surface and does not require the explicit geometrical modeling of the internal microstructures (barbs and barbules) as in previous works. Our model accounts for the particular characteristics of feather fibers such as the non-cylindrical cross-sections of barbules and the hierarchical cylindrical cross-sections of barbs. To model the relative visibility between barbs and barbules, we derive a masking term for the differential projected areas of the different components of the feather’s microgeometry, which allows us to analytically compute the masking between barbs and barbules without costly Monte Carlo integration.

Amplify AR HUD User-Experience with Real-World Sunlight Simulation in Virtual Scene

In this paper, we introduce the concept of real-world sunlight simulation in the virtual scenes of a Head-Up Display (HUD) application. We adopted solar position calculation logic [Zhang, 2020] and demonstrated its capability with experiments run on a real vehicle (Figure 1). In addition, we suggest UX approaches featuring interaction between virtual 3D objects and the simulated virtual sunlight for expanded AR experiences in HUD applications.
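
The cited solar-position logic is not reproduced here, but a standard textbook approximation of solar elevation and azimuth from latitude, longitude, day of year, and UTC time looks like the sketch below. It ignores the equation of time and atmospheric refraction, so accuracy is on the order of a degree.

```python
import math

def sun_position(lat_deg, lon_deg, day_of_year, utc_hour):
    """Approximate solar elevation and azimuth in degrees (azimuth from north)."""
    decl = math.radians(-23.44 * math.cos(math.radians(360 / 365 * (day_of_year + 10))))
    solar_time = utc_hour + lon_deg / 15.0          # ignores equation of time
    hour_angle = math.radians(15.0 * (solar_time - 12.0))
    lat = math.radians(lat_deg)
    elev = math.asin(math.sin(lat) * math.sin(decl) +
                     math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
    cos_az = ((math.sin(decl) - math.sin(lat) * math.sin(elev)) /
              (math.cos(lat) * math.cos(elev)))
    az = math.degrees(math.acos(max(-1.0, min(1.0, cos_az))))
    if hour_angle > 0:                              # afternoon: sun is to the west
        az = 360.0 - az
    return math.degrees(elev), az
```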

Denoising Monte Carlo Renders with Diffusion Models

Physically based renderings contain Monte Carlo noise, with variance that increases as the number of rays per pixel decreases. This noise, while zero-mean for good modern renderers, can have heavy tails (most notably for scenes containing specular or refractive objects). Learned methods for restoring low-fidelity renders are highly developed, because suppressing render noise means one can save compute and use fast renders with few rays per pixel. We demonstrate that a diffusion model can denoise low-fidelity renders successfully. Furthermore, our method can be conditioned on a variety of natural render information, and this conditioning helps performance. Quantitative experiments show that our method is competitive with SOTA across a range of sampling rates; qualitative evidence suggests that the image prior applied by a diffusion method strongly favors reconstructions that are like real images, with straight shadow boundaries, curved specularities, and no fireflies. In contrast, existing methods that do not rely on image foundation models struggle to generalize when pushed outside the training distribution.

Development of a stereoscopic projection mapping system using a small mobile robot

Distance-adaptive unsupervised CNN model for computer-generated holography

Convolutional neural networks (CNNs) are useful for overcoming the trade-off between generation speed and accuracy when synthesizing computer-generated holograms (CGHs). However, existing CNN methods cannot specify the propagation distance when reproducing a hologram, limiting their practical usage across various contexts. In this study, we therefore developed a CNN model that generates a CGH from a specified target image and propagation distance. The proposed method matches traditional fixed-distance methods in performance and achieves the generation accuracy and speed necessary for practical use even when the propagation distance changes.
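
A distance input becomes possible because an unsupervised reconstruction loss can propagate the predicted hologram by an arbitrary distance. Below is a minimal sketch of the standard angular spectrum method plus such an intensity loss; the authors' network architecture and exact loss are not reproduced, and the wavelength and pixel-pitch values are illustrative.

```python
import numpy as np

def asm_propagate(field, distance, wavelength, pitch):
    """Angular spectrum propagation of a complex field by `distance` meters."""
    n, m = field.shape
    fx = np.fft.fftfreq(m, d=pitch)
    fy = np.fft.fftfreq(n, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    arg = (1.0 / wavelength)**2 - FX**2 - FY**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * distance) * (arg > 0)      # evanescent waves dropped
    return np.fft.ifft2(np.fft.fft2(field) * H)

def cgh_loss(phase, target, distance, wavelength=532e-9, pitch=8e-6):
    """Unsupervised loss sketch: propagate the predicted phase-only hologram
    by the requested distance and compare intensity with the target image."""
    recon = np.abs(asm_propagate(np.exp(1j * phase), distance,
                                 wavelength, pitch))**2
    recon = recon / recon.max()
    return float(np.mean((recon - target)**2))
```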

Expansive Field-of-View Head-Mounted Display based on Dynamic Projection Mapping

In this study, we propose a method for extending the peripheral field of view of conventional head-mounted displays (HMDs) based on dynamic projection mapping. External projectors are used to dynamically project images onto peripheral screens attached to the periphery of the HMD. A wide field of view is achieved without significantly increasing the HMD weight.

Gabor Splatting for High-Quality Gigapixel Image Representations

Geometry enhanced 3D Gaussian Splatting for high quality deferred rendering

Reconstructing and rendering the real world with high visual fidelity is crucial for VR applications. The 3D Gaussian Splatting (GS) method offers high-quality novel views but struggles to produce realistic shadows under complex lighting, including both directional and point lights. We propose a novel geometry-enhanced 3D GS method in which 3D Gaussians are learned efficiently with additional normal and depth attributes. The proposed representation enhances high-fidelity novel view rendering and integrates seamlessly into commercial real-time engines. We developed a deferred rendering pipeline for rasterization, enabling real-time complex illumination effects using the high-precision depth and normal attributes learned by our 3D GS method. Our pipeline surpasses previous 3D GS renderers in accurate illumination, shadows, and directional lighting. Applied to VR applications such as live broadcasting, it generates immersive virtual environments in real time on consumer devices. Experimental results show our method delivers superior, realistic rendering at real-time speeds, benefiting numerous VR applications.
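
As a sketch of how learned normal and depth attributes can feed a deferred pass, the following shader-style function lights a G-buffer (as a rasterized 3D GS pass might produce) with one directional and one point light using Lambertian shading. The buffer layout and light parameters are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def deferred_lambert(albedo, normals, positions, light_dir, light_pos,
                     point_intensity=5.0):
    """Shade an H x W x 3 G-buffer with one directional and one point light.
    `positions` would be reconstructed from the learned depth attribute."""
    n = normals / np.clip(np.linalg.norm(normals, axis=-1, keepdims=True),
                          1e-6, None)
    # Directional light: -light_dir points from surface toward the light.
    ld = light_dir / np.linalg.norm(light_dir)
    diff_dir = np.clip((n * -ld).sum(-1, keepdims=True), 0, None)
    # Point light with inverse-square falloff.
    to_l = light_pos - positions
    dist2 = (to_l**2).sum(-1, keepdims=True)
    lp = to_l / np.sqrt(np.clip(dist2, 1e-6, None))
    diff_pt = (np.clip((n * lp).sum(-1, keepdims=True), 0, None) *
               point_intensity / np.clip(dist2, 1e-2, None))
    return albedo * (diff_dir + diff_pt)
```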

Measurement of the Imperceptible Threshold for Color Vibration Pairs Selected by using MacAdam Ellipse

We propose an efficient method for finding color vibration pairs that are imperceptible to the human eye, based on the MacAdam ellipse, an experimentally determined color-difference range indistinguishable to the human eye. We created color pairs by selecting eight colors within the sRGB color space specified by the ellipse, and conducted experiments to determine the color vibration amplitude threshold at which flicker becomes imperceptible. The experimental results provide a general guideline on acceptable amplitudes for pair selection.
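
A sketch of how such pairs might be constructed: pick two chromaticities on opposite ends of an ellipse around a base color and convert them to sRGB. The ellipse parameters below are placeholders, not MacAdam's measured values; the xyY-to-sRGB conversion uses the standard IEC 61966-2-1 matrix.

```python
import numpy as np

# Linear-sRGB matrix from CIE XYZ (IEC 61966-2-1).
XYZ_TO_RGB = np.array([[ 3.2406, -1.5372, -0.4986],
                       [-0.9689,  1.8758,  0.0415],
                       [ 0.0557, -0.2040,  1.0570]])

def xyY_to_srgb(x, y, Y=0.5):
    X = x * Y / y
    Z = (1 - x - y) * Y / y
    rgb = np.clip(XYZ_TO_RGB @ np.array([X, Y, Z]), 0, 1)
    return np.where(rgb <= 0.0031308, 12.92 * rgb,
                    1.055 * rgb**(1 / 2.4) - 0.055)      # gamma encode

def vibration_pair(center_xy, a, b, theta, phi=0.0, scale=1.0):
    """Two chromaticities on opposite ends of a MacAdam-like ellipse with
    semi-axes a, b and orientation theta (all hypothetical values here);
    phi picks the direction on the ellipse, scale probes the threshold."""
    c, s = np.cos(theta), np.sin(theta)
    local = np.array([a * np.cos(phi), b * np.sin(phi)])  # point on ellipse
    offset = scale * np.array([c * local[0] - s * local[1],
                               s * local[0] + c * local[1]])
    p1 = np.asarray(center_xy) + offset
    p2 = np.asarray(center_xy) - offset
    return xyY_to_srgb(*p1), xyY_to_srgb(*p2)
```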

PictorialAttributes: Depicting Multiple Attributes with Realistic Imaging

Traditional visualizations often use abstract graphics, limiting understanding and memorability. Existing methods for pictorial visualization are more engaging, but often create disjointed compositions. To address this, we propose PictorialAttributes, a technique utilizing LLMs and diffusion models to depict data attributes. Examples show its promise for compelling and informative pictorial visualizations.

Projecting Radiance Fields to Mesh Surfaces

Radiance fields produce high-fidelity images at high rendering speeds but are difficult to manipulate. We perform effective avatar texture transfer across different appearances by combining the benefits of radiance fields and mesh surfaces. We represent the source as a radiance field using 3D Gaussian Splatting, then project the Gaussians onto the target mesh. Our pipeline consists of Source Preconditioning, Target Vectorization, and Texture Projection. The projection completes in 1.12 s using pure CPU compute, compared with the baseline techniques of Per-Face Texture Projection and Ray Casting (31 s and 4.1 min). This lower computational requirement makes the method applicable to a broader range of devices, from low-end mobiles to high-end computers.
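
A minimal sketch of the projection step using the trimesh library: Gaussian centers are snapped to their closest points on the target mesh and accumulated into per-face colors. This nearest-point scheme is an illustrative stand-in for the paper's Texture Projection stage, not its actual algorithm.

```python
import numpy as np
import trimesh

def project_gaussians_to_mesh(mesh, centers, colors):
    """Snap Gaussian centers (N x 3) to the target mesh and return the
    snapped points plus per-face average colors as a crude projected texture."""
    closest, dist, face_idx = trimesh.proximity.closest_point(mesh, centers)
    face_color = np.zeros((len(mesh.faces), 3))
    counts = np.zeros(len(mesh.faces))
    for f, c in zip(face_idx, colors):
        face_color[f] += c
        counts[f] += 1
    mask = counts > 0
    face_color[mask] /= counts[mask, None]   # average color per covered face
    return closest, face_color
```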