Web3D '20: Proceedings of the 25th International Conference on 3D Web Technology


SESSION: Session 1

Creating IoT-ready XR-WebApps with Unity3D

The rise of IoT-ready devices is supported by well-established web concepts for communication and analytics, but in this era of tablets and smartphones, interaction still remains confined to web browsers and screen-based 2D interfaces. Transforming IoT interaction concepts into 3D for future use with head-worn XR devices is difficult because the game engines used in XR development offer little support for, and remain largely disengaged from, web technology. In this work, we present an approach to overcome this limitation by tightly integrating web technology into a 3D game engine. Our work leverages the versatility of web concepts to create immersive and scalable web applications in XR, without requiring deep expertise in XR concepts or tedious customization work. We describe the methodology and tools in detail and provide several exemplary XR applications.

Deictic Gesture Retargeting for Telepresence Avatars in Dissimilar Object and User Arrangements

An avatar-mediated 3D telepresence system aims to enable a user in a local space to interact with a remote user through a virtual avatar representing that remote user. In this system, the avatar’s movement should convey the meaning of the remote user’s movement to the local space. If the local and remote spaces have different spatial configurations, the remote user’s placement and motion should be adapted to the local space so as to preserve the motion semantics of the remote user. To this end, this paper presents a method to generate the placement and deictic gestures of the avatar. First, we develop a method to predict the probability of the avatar’s placement in the local space according to the placement of the remote user. From this, we find the optimal placement of the avatar given the configuration of the local space and the remote user’s placement. Second, we develop a simple yet effective method to retarget the remote user’s deictic gesture to his/her avatar. A user study shows that our method improves user engagement and social presence in the tested telepresence scenarios.

Visualization of Differences between Spatial Measurements and 3D Planning Data

In this paper, a system for the detection and visualization of geometric differences between 3D planning data and spatial measurements is presented. Construction processes, e.g., building construction or 3D printing, are usually forward processes without feedback loops. In our approach, an automatic feedback step adds the detected differences to the original planning data after the construction phase to obtain well-documented products. Even though both the planning data and the 3D reconstruction of the final product are 3D data, this is a challenging task due to the different structures of the 3D data (e.g., precisely defined corners and exact dimensions vs. rough geometric approximations) and different levels of abstraction (a hierarchy of construction elements such as doors, walls, or gears vs. unstructured point clouds). As a first step towards this goal, we visualize the differences between the original planning data and the 3D reconstruction based on the obtained point cloud data.

Balancing Body based and Discrete Interaction for Usability and Presence in Virtual Reality

As the major goal of virtual reality (VR) is to provide an enriched experience, motion-based interfaces involving many parts of the body, if not the whole body, are often employed. On the other hand, the more body parts that are involved and the more active the interaction becomes, the less usable and more tiring it can feel to the user, and the more space the system can require. This in turn can negatively affect the overall experience. In this paper, we explore whether it is possible to convey a rich, active VR experience while maintaining a minimum level of usability, by considering the extent of body involvement in the VR interaction. We compare six styles of interaction in terms of their usability and experience in a simple virtual tennis application: one that uses the whole body (both upper and lower body), one that uses only a discrete input device, and four others that use varying degrees of body and device involvement. Our experiment shows that, compared to full-body interaction, part-body interaction achieved comparable levels of immersion, presence, and active experience, while incurring a level of fatigue as low as that of the purely device-based interaction. Constraining the active VR interface to the upper body without sacrificing the level of experience has the added advantage of requiring less operating space. We believe that this investigation can serve as a guideline for VR interaction design that employs body-based action interfaces.

Streamlining XR Technology Into Industrial Training and Maintenance Processes

The nature of industrial manufacturing processes and the need to learn and adapt production systems to new demands require new tools to cope with the vast amount of information being generated. This research aims to design, implement, and evaluate a process to streamline industrial assets into immersive environments and harness the interconnectivity of machines, processes, and products for more informed decision making. To evaluate the effectiveness of our solution, we compare it against existing solutions. A statistical test is used to corroborate our hypotheses, and the results of the usability test indicate well-perceived learnability.

SESSION: Session 2

Representation of VR-Based Health Information for Smart City

A rich source of information relevant to smart cities and health care comes from sensors embedded in the environment, or ambient sensors. Responding strategically to sensed data can make healthcare smarter. By gaining real-time access to this information, city services can respond promptly to urgent health needs and make decisions to avoid unhealthy situations. The emergence of virtual reality and ICT-assisted urban planning helps overcome the complexity of this data through the visualization of big data. Web-based visualization technologies such as virtual reality (VR) present highly detailed, dynamic virtual environments. There is also a continuously increasing demand for the effective organization of this diversified data, its linkage with other data, and its visualization. In recent years, there has been a rapid increase in the development of health information systems, motivated by legislation intended to protect patients’ information and privacy and by government interest in reducing the cost and improving the quality of healthcare.

In this paper, we investigate the types of physiological content and the related sensors that can be included in a smart city healthcare system. We present a technical healthcare information model in XML, in a standardized format, for the implementation of a 3D virtual smart city as follows: firstly, the 3D virtual environment and healthcare information systems were confirmed to be driven by sensor information parameters expressed in XML; secondly, real-time location, direction, and health data representation for 3D health-management sensor data monitoring was conducted in XML. The identified key use case of physiological sensor-based XML is helpful because it provides a standardized format for the implementation of a 3D virtual smart city. For the implementation, X3D was used to visualize an example health information system.

Semi-Structured Visual Design of Complex Industrial 3D Training Scenes

Designing high-quality interactive 3D spaces, such as those required for professional training, is a complex, expensive, and time-consuming task. Although several approaches have been proposed to simplify the process, the result is always a trade-off between advanced capabilities and the ease of use of the content creation tool. To guarantee high visual quality of the modeling results, no simplification of the geometric detail level is possible. In this paper, we propose a method of visual content design that removes some of the structural complexity by introducing content controllers. Based on the known semantics of content objects, these controllers introduce specific editing rules and impose a specific structure on their sub-components, thus simplifying and speeding up the content creation process as well as eliminating common mistakes.

User Interests Driven Collaborative Cloud-Edge-Browser Architecture for WebBIM Visualization

Emerging mobile technology stimulates increasing demand for online visualization of Web Building Information Modeling (WebBIM). However, scheduling the network transmission of massive WebBIM scenes is challenging due to the enormous data volume, the uncertainty of users’ network conditions, and the weak computing capacity on the browser end. In this paper, we introduce edge computing into the existing cloud-browser mode of an online Web3D framework. The resulting Cloud-Edge-Browser architecture supports online visualization of gigantic WebBIM scenes. The experimental results show that our proposed method can reduce network transmission delay and improve the user experience of accessing large-scale WebBIM scenes. This study provides an innovative solution combining Web3D and edge computing.

OpenCollBench - Benchmarking of Collision Detection & Proximity Queries as a Web-Service

We present a server-based benchmark that enables a fair analysis of different collision detection and proximity query algorithms. A simple yet interactive web interface allows both expert and non-expert users to easily evaluate the performance of different collision detection algorithms in standardized or, optionally, user-definable scenarios and to identify possible bottlenecks. In contrast to the simple charts or histograms typically used to show results, we additionally propose a heatmap visualization directly on the benchmarked objects that allows the identification of critical regions at the sub-object level. An anonymous login system, in combination with a server-side scheduling algorithm, guarantees security as well as the reproducibility and comparability of the results. This makes our benchmark useful not only for end-users who want to choose the optimal collision detection method or optimize their objects with respect to collision detection, but also for researchers who want to compare their new algorithms with existing solutions.

3D Object Recognition Using X3D and Deep Learning

In this paper, a method for recognizing 3D objects using a machine learning algorithm is described. 3D object data sets consisting of geometric polygons are analyzed with Keras, a deep learning API, which learns the composition rules of the data sets. A 3D object can then be recognized by applying a composition rule to the object. Data sets for various types of objects were used in the experiments: 100 objects per shape were used to learn the rules, and different objects were used to evaluate them. Seven types of 3D objects were tested. In addition, evaluation results for different numbers of 3D objects were compared.
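The abstract does not include code; the sketch below is a rough TensorFlow.js analogue of the kind of classifier described (the paper itself uses Keras). The feature length, layer sizes, and the `xs`/`ys` tensors are illustrative assumptions, not details from the paper.

```typescript
import * as tf from '@tensorflow/tfjs';

// Hypothetical setup: each 3D object is reduced elsewhere to a fixed-length feature
// vector derived from its polygons; there are seven shape classes, as in the paper.
const FEATURES = 128;
const CLASSES = 7;
declare const xs: tf.Tensor2D; // [numObjects, FEATURES] training features (assumed prepared)
declare const ys: tf.Tensor2D; // [numObjects, CLASSES] one-hot labels (assumed prepared)

const model = tf.sequential();
model.add(tf.layers.dense({ inputShape: [FEATURES], units: 64, activation: 'relu' }));
model.add(tf.layers.dense({ units: CLASSES, activation: 'softmax' }));
model.compile({ optimizer: 'adam', loss: 'categoricalCrossentropy', metrics: ['accuracy'] });

// Learn "composition rules" from the training shapes, then evaluate on unseen objects.
await model.fit(xs, ys, { epochs: 20, validationSplit: 0.2 });
```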

SESSION: Session 3

STEP and IFC export to X3D

The STEP and IFC standardized data models are widely used in industry. Existing software tooling does not make it easy to take advantage of all the information contained in those files for common industrial processes. The X3D standard comes with a rich data model and the related technology to handle and visualize the data. In this paper, we present a methodology for performing a controlled conversion from STEP/IFC to X3D. A practical implementation is introduced, exercised, and made available to the community as a free software package.

The Scalability of X3D4 PointProperties: Benchmarks on WWW Performance

With the development of remote sensing devices, it has become more convenient for individual researchers to collect high-resolution point cloud data with their own scanners. There are plenty of online tools for researchers to exhibit their work. However, the drawback of existing tools is that they are not flexible enough for users to create 3D scenes mixing point-based and triangle-based models. X3DOM is a WebGL-based library built on the Extensible 3D (X3D) standard, enabling users to create 3D scenes with little computer graphics knowledge. Before the X3D 4.0 Specification, little attention had been paid to point cloud rendering in X3DOM. PointProperties, an appearance node newly added in X3D 4.0, provides point size attenuation and texture-color mixing effects for point geometries. In this work, we propose an X3DOM implementation of PointProperties. This implementation provides not only the features specified in the X3D 4.0 documentation but also other shading effects compatible with those of triangle-based geometries in X3DOM and other state-of-the-art point cloud visualization tools. We also evaluate the performance of some of these effects. The results show that an ordinary laptop can handle most of the examined conditions in real time.
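For readers unfamiliar with the effects involved, the three.js sketch below (not the paper's X3DOM implementation) shows the two behaviours PointProperties makes declarative: point size attenuation with distance and mixing a sprite texture with per-point colors. The `positions`, `colors`, `spriteTexture`, and `scene` objects are assumed to exist.

```typescript
import * as THREE from 'three';

declare const positions: Float32Array;      // point cloud coordinates, assumed loaded elsewhere
declare const colors: Float32Array;         // per-point RGB values
declare const spriteTexture: THREE.Texture; // point sprite, assumed loaded elsewhere
declare const scene: THREE.Scene;

const geometry = new THREE.BufferGeometry();
geometry.setAttribute('position', new THREE.Float32BufferAttribute(positions, 3));
geometry.setAttribute('color', new THREE.Float32BufferAttribute(colors, 3));

const material = new THREE.PointsMaterial({
  size: 0.05,
  sizeAttenuation: true, // points shrink with distance, analogous to attenuation in PointProperties
  map: spriteTexture,    // texture color is modulated by the per-point color
  vertexColors: true,
  transparent: true,
});

scene.add(new THREE.Points(geometry, material));
```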

Photorealism and Kinematics for Web-based CAD data

As cloud technology gains traction as a platform in the architecture, engineering, and construction (AEC) sector, so does the adoption of Web3D technologies for the visualisation of massive 3D models. However, interaction with the highly complex CAD models typical of these sectors remains a critical issue. Various efforts are found in the literature to create suitable transmission formats that do not require users to wait long periods for massive scenes to load, and to define standards for enhancing such data with interaction. However, most of the existing frameworks are either domain-specific or too general, which results in increased data preparation times and additional processing needs at the application level. This paper describes a novel system for CAD (Computer-Aided Design) data interaction built on Web3D technologies. First, we discuss the approach used to prepare CAD models for visualisation: data import, the definition of mechanical behaviours, and physically-based rendering (PBR) properties. Next, we describe how to export CAD models as an X3D scene with federated glTF nodes to increase performance and overall client interactivity. We follow up with a description of how the Denavit-Hartenberg (DH) parameters can enable the visualisation of mechanical motion characteristics directly from the design and enhance user interaction. Finally, we summarise lessons learned from this industry-based software engineering experience and identify several future research directions in this area.
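As a reference for the kinematics part, the helper below builds the classic Denavit-Hartenberg link transform from the four DH parameters; it is a generic textbook formulation, not code taken from the described system.

```typescript
// Classic DH homogeneous transform for one joint/link, returned in row-major order.
// theta: joint angle, d: link offset, a: link length, alpha: link twist.
function dhTransform(theta: number, d: number, a: number, alpha: number): number[] {
  const ct = Math.cos(theta), st = Math.sin(theta);
  const ca = Math.cos(alpha), sa = Math.sin(alpha);
  return [
    ct, -st * ca,  st * sa, a * ct,
    st,  ct * ca, -ct * sa, a * st,
     0,       sa,       ca,      d,
     0,        0,        0,      1,
  ];
}
// Multiplying the per-joint transforms in order yields the pose of the end link,
// which can then drive the corresponding X3D/glTF node transforms each frame.
```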

X3D Ontology for Querying 3D Models on the Semantic Web

The Semantic Web offers significant capabilities that transform the current Web into a global knowledge base including various cross-linked multimedia content with formal descriptions of its semantics understandable to humans and processable by computers. Content on the Semantic Web can be subject to reasoning and queries with standardized languages, methods and tools, which opens new opportunities for collaborative creation, use and exploration of web repositories. However, these opportunities have not been exploited so far by the available 3D formats and modeling tools, which limits the possibilities of search and reuse of 3D content as part of the Semantic Web. This work contributes a semantic development pipeline of the X3D Ontology, with corresponding conversion of X3D models into triple forms suitable for formal query. The ontology design reflects experience accompanying the development of the Extensible 3D (X3D) Graphics International Standard, in particular, the X3D Unified Object Model (X3DUOM). This approach combines semantic and syntactic elements of X3D models and metadata to support integration with the Semantic Web. The pipeline enables automatic generation of the X3D Ontology, thereby providing an up-to-date 3D representation with semantics during X3D specification development. By extending commonplace model conversions from other formats to X3D, the ontology presents the potential to enable integration of most forms of 3D content with the Semantic Web.
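To make the querying idea concrete, here is a hypothetical SPARQL request issued from TypeScript; the endpoint URL, the `x3do:` namespace IRI, and the property names are illustrative assumptions rather than details taken from the paper or the published ontology.

```typescript
// Hypothetical query: find shapes whose material defines a diffuse color.
const query = `
  PREFIX x3do: <https://example.org/X3dOntology#>
  SELECT ?shape ?material ?diffuseColor WHERE {
    ?shape    x3do:hasAppearance/x3do:hasMaterial ?material .
    ?material x3do:diffuseColor ?diffuseColor .
  }`;

const response = await fetch('https://example.org/sparql', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/sparql-query',
    Accept: 'application/sparql-results+json',
  },
  body: query,
});
const bindings = (await response.json()).results.bindings; // one row per matching shape
```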

Physically-based Environment and Area Lighting using Progressive Rendering in WebGL

This paper presents a progressive rendering approach that enables rendering of static 3D scenes, lit by physically-based environment and area lights. Multi-frame sampling strategies are used to approximate elaborate lighting that is refined while showing intermediate results to the user. The presented approach enables interactive yet high-quality rendering in the web and runs on a wide range of devices including low-performance hardware such as mobile devices. An open-source implementation of the described techniques using TypeScript and WebGL 2.0 is presented and provided. For evaluation, we compare our rendering results to both a path tracer and a physically-based rasterizer. Our findings show that the approach approximates the lighting and shadowing of the path-traced reference well while being faster than the compared rasterizer.
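A minimal sketch of the multi-frame accumulation idea is shown below; `renderSample`, `blendInto`, and `present` are hypothetical helpers standing in for the open-source implementation's actual rendering passes.

```typescript
// Hypothetical helpers: renderSample draws one lighting sample of the static scene,
// blendInto averages it into the accumulation target, present displays the result.
declare function renderSample(frame: number): Float32Array;
declare function blendInto(target: Float32Array, sample: Float32Array, weight: number): void;
declare function present(target: Float32Array): void;

const accumulationBuffer = new Float32Array(1920 * 1080 * 4);
const maxSamples = 64; // lighting refines while intermediate results stay visible
let frame = 0;

function renderProgressive(): void {
  const sample = renderSample(frame);
  // Running average: accum = accum * (n / (n + 1)) + sample * (1 / (n + 1))
  blendInto(accumulationBuffer, sample, 1 / (frame + 1));
  present(accumulationBuffer);
  frame += 1;
  if (frame < maxSamples) requestAnimationFrame(renderProgressive);
}
requestAnimationFrame(renderProgressive);
```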

SESSION: Session 4

Web-based Product Data Visualization and Feedback between PLM and MES

Consumers are demanding high-quality personalized products. To support the personalization of high-quality products, companies try to establish Information Technology (IT) systems. Typical IT systems are Product Lifecycle Management (PLM), used in the design department, and the Manufacturing Execution System (MES), used in the production department. Problems arise when such highly personalized products are produced using only information from 2D drawings. It is also inefficient that information about malfunctions and defects during production is sent back to the design department through manual operations. To resolve these problems, a method for exchanging 3D shapes together with malfunction and defect information between PLM and MES is proposed. 3D shapes have been exchanged using open-source, web-based PLM, MES, and CAD systems. The results show that interoperability between PLM and MES is possible with visualization support.

Nuclear Femtography on the Web with X3D

Deep inside the atomic nucleus, there are strange and colorful things afoot. This paper provides the first integrated views into the new Reference Model of proton physics and the instrumentation needed for its measurement. Our unique interdisciplinary team has developed a processing pipeline for producing a set of scientific visualizations that provide interactive 3D views into high-dimensional particle physics information. In this paper, we describe the data and the pipeline, reflecting on the key design issues and choices we made along the way. We consider lessons learned in creating visualizations for scientists and citizens across installed applications, Web3D, and Virtual Reality.

Effects of Gestural Exaggeration to User Experience in Virtual Reality

In real life, people use facial expressions, bodily gestures, and tone variations to convey information more clearly and dramatically. Such multimodal communication is effective, but it also requires significant mental and physical effort on the part of the communicator. In virtual reality (VR), such effort can be relieved not only automatically but also in a much more amplified way that would not be possible in real life. In this paper, we investigate the effects of exaggerating and highlighting gestures when communicating in VR. The exaggeration and highlighting method we consider enlarges and exaggerates the gesturing body parts. We conduct a comparative experiment in which a VR user tries to understand an avatar conveying a short passage or describing a concept word using only gestures, with the avatar’s gesturing body parts appropriately enlarged (or not). Our experiment shows not only the expected communication efficacy depending on the type of content to be conveyed, but also a significant influence on the level of concentration, immersion, and even presence.

DeepCinema: Adding Depth with X3D Image-Based Rendering

In scientific research, data visualization is crucial for human understanding and insight. Modern experiments and simulations are growing in scale and complexity; as we approach the exascale of big data, current methods of interactive 3D scientific visualization become untenable. Cinema has been proposed as an image-based solution to this problem: instead of explicit 3D geometry, the model is represented by an image database through which users navigate among pre-rendered (in situ) screenshots. However, flat images alone cannot fully express the 3D characteristics of the data. Thus, we propose the DeepCinema technique to improve the depth portrayal of image-based rendering using displacement maps and shading. We examine the perceptual efficacy of our technique with a two-alternative forced-choice user study with 60 participants. The within-subjects multi-factorial design compared user depth judgements over the displacement map, shading, and the interval of angular perspective for each image. The results show that this method would be useful for the interactive 3D visualization of big scientific data using Web3D standards.
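As a rough illustration of how displacement can add depth to pre-rendered imagery, the three.js sketch below displaces a densely tessellated plane with a depth image; it is an analogy for the technique, not the authors' Cinema pipeline, and the textures and scene are assumed to exist.

```typescript
import * as THREE from 'three';

declare const colorTexture: THREE.Texture;        // pre-rendered screenshot, assumed loaded
declare const displacementTexture: THREE.Texture; // matching depth/displacement image
declare const scene: THREE.Scene;

// Dense tessellation so the per-vertex displacement is visible.
const geometry = new THREE.PlaneGeometry(2, 2, 512, 512);
const material = new THREE.MeshStandardMaterial({
  map: colorTexture,
  displacementMap: displacementTexture,
  displacementScale: 0.2, // depth exaggeration factor, tuned per dataset
});
scene.add(new THREE.Mesh(geometry, material));
```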

System Architecture for Supporting Multiple Live Actors in Web3D Virtual Conference: Extended Abstract

For live actors to perform as 3D models with seamless graphics computing, we need to overcome two major challenges: developing a methodology to construct a clean and smooth model in real time, and designing a system architecture that supports the streaming of multiple live-actor models. Semantic image segmentation is applied to extract the entire human body, and a 3D model is then built up with the segmented image as a texture. Moreover, to enable a live actor to interact virtually with others, multi-actor models must be produced for telecommuting systems. By combining the concepts above, a system with multiple live actors can be created to conduct live group conferences in virtual reality that provide an immersive experience.

SESSION: Session 5

Unified Representation for XR Content and its Rendering Method

Virtual Reality (VR) and Augmented Reality (AR) have become familiar technologies, with related markets growing rapidly every year. Moreover, the idea of considering VR and AR as one eXtended Reality (XR) has broken the border between virtual space and real space. However, there is no formal way to create such XR content except through existing VR or AR content development platforms. These platforms require the content author to perform additional tasks, such as duplicating content for a specific user interaction environment (VR or AR) and associating the copies as one. Also, describing the content in an existing markup language (e.g., X3D, X3DOM, A-Frame) has the limitation that the content author must predefine the user interaction environment (i.e., either VR or AR). In this study, a unified XR representation is defined for describing XR content, and a method to render it is proposed. The unified XR representation extends HTML so that content authored with this representation can be harmoniously incorporated into existing web documents and can exploit resources on the World Wide Web. The XR renderer, which draws XR content on the screen, follows different procedures for VR and AR situations. Consequently, the XR content works in both user interaction environments (VR and AR). Hence, this study provides a straightforward XR content authoring method that users can access anywhere through a web browser regardless of their situational context, such as VR or AR. It facilitates XR collaboration with real objects by giving both VR and AR users access to identical content.

Towards Web3D-based Lightweight Crowd Evacuation Simulation

The heterogeneity of the appearance and behavior of the crowd is an important component of a realistic simulation. Using Web3D technology to recreate a large-scale crowd of virtual avatars via the internet is an increasingly important area in crowd studies, especially for emergency crowd evacuation. One major issue in the early development of real-time simulation is the challenge of introducing diversity and authenticity into avatars’ appearance and behavior. Another major issue is the high transmission delay on the network, which degrades simulation performance in the web browser when rendering a large number of visualization components. In this research, we propose a novel method that provides a lightweight solution while simultaneously guaranteeing the realism and diversity of the simulated crowd. We use a parameterization technique based on shape space to differentiate the avatars’ appearance. Asynchronously transmitting the rendering elements helps to reduce bandwidth pressure. A multi-level clone instancing method can generate a massive number of heterogeneous avatars in a short time. Finally, we validated our method with an online experiment to demonstrate its ability to solve large-scale crowd simulation problems over the internet.
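For context, the sketch below shows plain GPU instancing of one avatar template with three.js, which is the baseline that clone-instancing approaches build on; it is not the paper's multi-level method, and the simulation output, geometry, and scene are assumed inputs.

```typescript
import * as THREE from 'three';

declare const agentPositions: { x: number; y: number; z: number }[]; // assumed simulation output
declare const templateGeometry: THREE.BufferGeometry;                // assumed avatar template
declare const scene: THREE.Scene;

const material = new THREE.MeshLambertMaterial({ color: 0x8899aa });
const crowd = new THREE.InstancedMesh(templateGeometry, material, agentPositions.length);

const m = new THREE.Matrix4();
agentPositions.forEach((p, i) => {
  // Per-agent placement; varying scale/rotation (or material parameters) adds heterogeneity.
  m.makeTranslation(p.x, p.y, p.z);
  crowd.setMatrixAt(i, m);
});
crowd.instanceMatrix.needsUpdate = true;
scene.add(crowd);
```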

Research on Suppressing Shading from 3D Scanned Data Using a Scanned Reference Model

When a physical object is scanned with an optical 3D scanner in a normal environment, shading caused by the external lighting affects the scanned color data. This shading is not easy to separate from the original surface color of the model and can produce abnormal rendering results when the model is rendered with additional artificial lighting. Therefore, various kinds of automatic or manual processing are generally required to suppress shading. However, this processing requires information about the original physical object and the lighting environment, which is not simple to sense and record. In our research, we use a white plaster sphere as a reference model, scan it in a lighting environment as similar as possible to that of the target object, and then build a model of the external lighting environment from the normal vectors and brightness information of the reference model to calibrate the color changes caused by shading. We scanned experimental data and applied this processing to demonstrate the feasibility of our method. We found that a spherical or hemispherical reference model is effective for the shading suppression process. The approach is practical for end-user applications and could be included as a component of a standard or recommendation.

Extending X3D Realism with Audio Graphs, Acoustic Properties and 3D Spatial Sound

A fundamental requirement for both realistic modeling and immersive presence is spatial audio: correctly rendering the presentation of each object with its aural characteristics. Auditory attributes involve perceived directions, distances, and the propagation paths from complex sound sources to each listener in a potentially complex 3D scene. Many long-running efforts in both hardware and software have improved the ability to aurally render high-fidelity spatialized sound in real time. Motivated by this progress, this work proposes integrating acoustic properties associated with geometric shapes together with 3D spatial sound in the X3D Graphics standard. This combination is possible by exploiting the structure and functionality of the Web Audio API, an effective framework for processing and synthesizing audio in Web applications. The paper describes design patterns and initial implementation work for this spatial composition of audio graphs and scene graphs. Both specifications are device neutral, without dependencies on specific platforms or audio hardware. Examples for evaluation lead to useful conclusions and areas for future model development.
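To illustrate the audio-graph side of the proposal, the sketch below places a single source in 3D space with the standard Web Audio API; the positions and parameter values are arbitrary examples, and the mapping to specific X3D nodes is not taken from the paper.

```typescript
const ctx = new AudioContext();
const source = ctx.createBufferSource(); // an AudioBuffer is assumed to be assigned elsewhere

const panner = new PannerNode(ctx, {
  panningModel: 'HRTF',     // perceptual 3D panning
  distanceModel: 'inverse', // attenuation with distance
  refDistance: 1,
  maxDistance: 100,
  positionX: 2, positionY: 0, positionZ: -5, // source location in scene coordinates
});

// Audio graph: source -> spatializer -> output.
source.connect(panner).connect(ctx.destination);
source.start();

// The listener follows the viewer/camera pose, e.g. once per rendered frame.
ctx.listener.positionX.value = 0;
ctx.listener.positionY.value = 1.6;
ctx.listener.positionZ.value = 0;
```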

SESSION: Posters 1

Web-based Interactive System for Live Actors in MAR: Extended Abstract

Live Actor and Entity (LAE) is a system that displays live actors in a virtual scene, and the application enables the scene to be experienced through an immersive display. The LAE system has high potential because it allows a live actor to interact with virtual objects by using gamepad controllers, which have a full set of sensors, such as 3D orientation sensors and position sensors. With the controller, the live actor can seamlessly drag and drop objects as well as operate buttons in the 3D virtual world to perform any triggered events specially defined in the LAE system.

Electrophysiological Changes in the Virtual Reality Sickness: EEG in the VR sickness

This paper identifies a specific brain wave in which changes related to virtual reality (VR) sickness are noticeably observed, by comparing stages before and during the watching of a VR video. Although VR sickness generally induces physical symptoms such as headache, dizziness, and nausea and causes difficulties in using VR devices, there are still insufficient methods to quantify the degree of VR sickness. We divided the experiment into three conditions: baseline, VR video, and recovery. Each condition was measured with electroencephalography (EEG) to examine which frequency band responded most prominently. The absolute power of the alpha wave decreased during the VR video condition compared to the baseline, and after the VR sickness was relieved, the absolute power of the alpha wave increased during the recovery condition compared to the VR video condition.

Web3D Distance live workshop for children in Mozilla Hubs

This paper proposes the design and procedures of a live workshop for children in Mozilla Hubs. We developed a step-by-step method called “Sugoroku,” along with the spatial design and audio effect calculations. The method proved to be efficient during the kids’ workshop, and the “Sugoroku” method can be implemented in other systems.

Key factors for reducing motion sickness in 360° virtual reality scene: Extended Abstract

A 3D real-time dynamic exhibition space for I-Media-Cities audiovisual platform

As the first virtual reconstructions began to spread, the desire to create three-dimensional virtual galleries also arose. Within the framework of the I-Media-Cities Horizon 2020 project, a virtual gallery tool has been created; it generates itself dynamically and in real time from the selections performed on the audiovisual contents of the platform.

SESSION: Posters 2

Investigating the Influence of 2-D Presentation versus 3-D Rotation Presentation Formats on User Perception

When presenting a product through digital media, both designers and retailers aim to communicate with their audience in the most effective way using visual perception. Previous studies have found that a user’s perception and behavior are affected by the visual presentation. The purpose of this study is to discover how a user’s perception and evaluation of a product change depending on how the product is presented. In this study, we specifically look at the presentation formats of 2-D orthographic views and a 3-D simulated format (360° rotation). This study employs a between-subjects design with surveys and an eye-tracking test to determine whether 2-D presentation or 3-D presentation results in better user evaluation and higher approval of the product design. We found that users had a better understanding of product aesthetics, usefulness, and ease of use when viewing 2-D orthographic images of products than when viewing the 3-D 360° rotation presentation format. Marketing, design, and theoretical implications as well as future research directions are discussed.

A Framework for Interactive Exploration of Clusters in Massive Data Using 3D Scatter Plots and WebGL

This paper presents a rendering framework for the visualization of massive point datasets on the web. It includes highly interactive point rendering, cluster visualization, basic interaction methods, and importance-based labeling, while being available for both mobile and desktop browsers. The rendering style is customizable. Our evaluation indicates that the framework facilitates interactive visualization of tens of millions of raw data points, even without dynamic filtering or aggregation.

Investigating Web3D topics on StackOverflow: a preliminary study of WebGL and Three.js

Web3D developers often have to decide which technologies are best for their projects. We explore that question through the perspective of community attention and support on Stack Overflow (SO). We focus on i) WebGL, a key low-level JavaScript (JS) API used to render 3D graphics in browsers without plugins, and ii) Three.js, a higher-level JS library that builds on WebGL and is reputed to be easier and more intuitive. We considered questions from SO tagged with WebGL or Three.js and extracted all tags used on these questions. Using these, we compared the relative attention (considering the number of questions and views) and support (considering satisfactory answers and how long they take) received by concerns and technologies associated with WebGL and Three.js. Our results suggest that Three.js gets significantly more community attention but less community support than WebGL on SO.

Tactile Internet to Share VR users’ Experiences

We propose the concept of the “Tactile Internet” (Tac-I), which will be the next evolution of the Internet of Things (IoT). It enables users to share others’ experiences through the sense of touch. In our case, Tac-I allows multicasting vibrotactile sensations from one node to multiple nodes via the Internet. We developed shared vibrotactile devices: physical bracelets connected to one another through a cloud platform. We use WebRTC as the communication protocol to exchange data between tactile bracelets. Users can access the Tac-I Web using a smartphone or tablet and experience vibrotactile sensations by manipulating virtual objects in a virtual reality environment from a web browser. Multiple vibrotactile bracelets can be connected to the cloud and transmit tactile information from one node to another or to multiple nodes.
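A minimal sketch of the communication layer is shown below: an RTCDataChannel configured for low latency carrying vibration commands between bracelet clients. The channel name, message format, and the use of `navigator.vibrate` as a stand-in actuator are assumptions, and WebRTC signaling is omitted.

```typescript
const pc = new RTCPeerConnection(); // signaling (offer/answer exchange) omitted for brevity
const haptics = pc.createDataChannel('haptics', {
  ordered: false,     // stale vibration frames may be dropped
  maxRetransmits: 0,  // favour latency over reliability
});

haptics.onopen = () => {
  // Send a vibrotactile frame when the VR user touches a virtual object.
  haptics.send(JSON.stringify({ actuator: 0, intensity: 0.8, durationMs: 120 }));
};

haptics.onmessage = (event) => {
  // Frames arriving from the remote bracelet; navigator.vibrate is a coarse stand-in
  // for driving the bracelet's actual actuators.
  const frame = JSON.parse(event.data as string);
  navigator.vibrate?.(frame.durationMs);
};
```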

Towards Rapid Development of Conversational Virtual Humans using Web3D Technologies

In this paper, we propose a framework for the rapid development of conversational virtual humans (VHs) via the web. Web3D technologies, such as WebGL, have enabled the use of interactive VHs on the web. Web-based conversational VHs have great potential to improve population health. However, the development of web-based VHs requires significant time and effort, which can lead health experts to avoid VH technology. In this paper, we present a case study to highlight the challenges that drive this time and effort in the development of web-based conversational VHs. Based on our findings, we propose a framework that uses Web3D formats and has the potential to address the identified challenges, enabling the rapid development of web-based conversational VH systems.