Nobel laureate Herb Simon described design as changing existing situations into preferred ones. But Simon did not specify a medium by which we must design. As a technologist tackling challenges, I often over-rely on technology as my medium and limit my potential for impact. We can design technology, but we can also design services, processes, organizations, and much more. By understanding more broadly the media through which we can design, we can more systematically build the future in which we want to live.
For the past 40 years, research communities have embraced a culture that relies heavily on physical meetings of people from around the world: we present our most important work at conferences, we meet our peers at conferences, and we even make life-long friends at conferences. At the same time, a broad scientific consensus has emerged warning that human emissions of greenhouse gases are warming the earth. For many of us, travel to conferences may be a substantial or even dominant part of our individual contribution to climate change. A single round-trip flight from Paris to New Orleans emits the equivalent of about 2.5 tons of carbon dioxide (CO2e) per passenger, a significant fraction of the total yearly emissions of an average resident of the US or Europe. Moreover, these emissions have no near-term technological fix, since jet fuel is difficult to replace with renewable energy sources. In this talk, I want to first raise awareness of the conundrum we are in by relying so heavily on air travel for our work. I will present some of the possible solutions, ranging from small, incremental changes to radical ones. The talk focuses on one of the radical alternatives: virtual conferences. The technology for them is almost here and, for some time, I have been part of one community that organizes an annual conference in a virtual environment. Virtual conferences present many interesting challenges, some technological in nature, others going beyond technology. Creating truly immersive conference experiences that make us feel "there" requires attention to the personal and social experiences of physical conferences. Those experiences need to be recreated from the ground up in virtual spaces. But in that process, they can also be rethought to become experiences not possible in real life.
Knitting creates complex, soft fabrics with unique texture properties that can be used to create interactive objects. However, little work addresses the challenges of designing and using knitted textures computationally. We present KnitPick: a pipeline for interpreting hand-knitting texture patterns into KnitGraphs, which can be output to machine- and hand-knitting instructions. Using KnitPick, we contribute a measured and photographed data set of 472 knitted textures. Based on findings from this data set, we contribute two algorithms for manipulating KnitGraphs. KnitCarving shapes a graph while respecting a texture, and KnitPatching combines graphs with disparate textures while maintaining a consistent shape. KnitPick is the first system to bridge the gap between hand- and machine-knitting when creating complex knitted textures.
Adding electronics to textiles can be time-consuming and requires technical expertise. We introduce SensorSnaps, low-power wireless sensor nodes that seamlessly integrate into the caps of fabric snap fasteners. SensorSnaps provide a new technique to quickly and intuitively augment any location on clothing with sensing capabilities. SensorSnaps securely attach to and detach from ubiquitous commercial snap fasteners. Using inertial measurement units, SensorSnaps detect tap and rotation gestures, as well as track body motion. We optimized the power consumption so that SensorSnaps work continuously for 45 minutes and up to 4 hours in capacitive touch standby mode. We present applications in which SensorSnaps are used as gestural interfaces for a music player controller, cursor control, and a motion-tracking suit. Our user study showed that a SensorSnap could be attached in around 71 seconds, similar to attaching off-the-shelf snaps, and participants found the gestures easy to learn and perform. SensorSnaps could allow anyone to effortlessly add sophisticated sensing capabilities to ubiquitous snap fasteners.
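To make the tap-detection idea concrete, the following minimal Python sketch flags taps as short spikes in accelerometer magnitude; the thresholds, sample format, and refractory window are illustrative assumptions rather than SensorSnaps' actual gesture pipeline.

# Minimal sketch of IMU-based tap detection. Thresholds, window sizes, and the
# stream format are illustrative assumptions, not SensorSnaps' implementation.
import math

TAP_THRESHOLD_G = 2.0      # assumed spike magnitude (in g) that counts as a tap
REFRACTORY_SAMPLES = 20    # assumed dead time after a tap to avoid double counts

def detect_taps(accel_samples):
    """accel_samples: iterable of (ax, ay, az) in g at a fixed sample rate."""
    taps = []
    cooldown = 0
    for i, (ax, ay, az) in enumerate(accel_samples):
        magnitude = math.sqrt(ax * ax + ay * ay + az * az)
        if cooldown > 0:
            cooldown -= 1
            continue
        if abs(magnitude - 1.0) > TAP_THRESHOLD_G:   # deviation from resting gravity
            taps.append(i)
            cooldown = REFRACTORY_SAMPLES
    return taps

# Example: a synthetic stream with one spike around sample 5.
stream = [(0.0, 0.0, 1.0)] * 5 + [(0.1, 0.2, 3.5)] + [(0.0, 0.0, 1.0)] * 10
print(detect_taps(stream))   # -> [5]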
We present Tessutivo, a contact-based inductive sensing technique for contextual interactions on interactive fabrics. Our technique recognizes conductive objects (mainly metallic) that are commonly found in households and workplaces, such as keys, coins, and electronic devices. We built a prototype containing a six-by-six array of spiral-shaped coils made of conductive thread, sewn onto a four-layer fabric structure. We carefully designed the coil shape parameters to maximize sensitivity based on a new inductance approximation formula. Through a ten-participant study, we evaluated the performance of our sensing technique across 27 common objects, yielding 93.9% real-time accuracy for object recognition. We conclude by presenting several applications to demonstrate the unique interactions enabled by our technique.
We present a technique for fabricating soft and flexible textiles using a consumer-grade fused deposition modeling (FDM) 3D printer. By controlling the movement of the print head, the printer alternately weaves stringing fibers across a row of pillars. Owing to the structure of the fibers, which supports and strengthens the pillars from each side, the printer can produce a thin sheet of fabric in an upright position while the fibers are being woven. In addition, this technique enables users to employ materials with various colors and/or properties when designing a pattern, and to prototype an interactive object using a variety of off-the-shelf materials such as conductive filament. We also describe a technique for weaving textiles and introduce a list of parameters that enable users to design their own textile variations. Finally, we demonstrate examples showing the feasibility of our approach as well as numerous applications integrating printed textiles into a custom object design.
This work presents a novel interactive system for simple garment composition and surface patterning. Our approach makes it easier for casual users to customize machine-knitted garments, while enabling more advanced users to design their own composable templates. Our tool combines ideas from CAD software and image editing: it allows the composition of (1) parametric knitted primitives, and (2) stitch pattern layers with different resampling behaviours. By leveraging the regularity of our primitives, our tool enables interactive customization with automated layout and real-time patterning feedback. We show a variety of garments and patterns created with our tool, and highlight our ability to transfer shape and pattern customizations between users.
Developers spend a significant portion of their time searching for solutions and methods online. While numerous tools have been developed to support this exploratory process, in many cases the answers to developers' questions involve trade-offs among multiple valid options and not just a single solution. Through interviews, we discovered that developers express a desire for help with decision-making and understanding trade-offs. Through an analysis of Stack Overflow posts, we observed that many answers describe such trade-offs. These findings suggest that tools designed to help a developer capture information and make decisions about trade-offs can provide crucial benefits for both the developers and others who want to understand their design rationale. In this work, we probe this hypothesis with a prototype system named Unakite that collects, organizes, and keeps track of information about trade-offs and builds a comparison table, which can be saved as a design rationale for later use. Our evaluation results show that Unakite reduces the cost of capturing trade-off-related information by 45%, and that the resulting comparison table speeds up a subsequent developer's ability to understand the trade-offs by about a factor of three.
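As a rough illustration of capturing trade-offs into a comparison table, the sketch below stores evidence snippets per option and criterion and renders them as rows and columns; the class, field names, and example URLs are hypothetical and do not reflect Unakite's actual data model.

# Minimal sketch of a trade-off comparison table: options as rows, criteria as
# columns, captured evidence snippets in the cells. Illustrative only.
from collections import defaultdict

class ComparisonTable:
    def __init__(self):
        self.cells = defaultdict(dict)   # option -> criterion -> list of snippets

    def add_snippet(self, option, criterion, snippet, source_url):
        self.cells[option].setdefault(criterion, []).append((snippet, source_url))

    def render(self):
        criteria = sorted({c for row in self.cells.values() for c in row})
        lines = ["option | " + " | ".join(criteria)]
        for option, row in self.cells.items():
            cells = [", ".join(s for s, _ in row.get(c, [])) or "-" for c in criteria]
            lines.append(option + " | " + " | ".join(cells))
        return "\n".join(lines)

table = ComparisonTable()
table.add_snippet("Library A", "performance", "handles 10k ops/s", "https://example.com/so-answer-1")
table.add_snippet("Library B", "ease of use", "one-line setup", "https://example.com/so-answer-2")
print(table.render())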
There has been considerable research on how software can enhance programmers' productivity within their workspace. In this paper, we instead explore how software might help programmers make productive use of their time while away from their workspace. We interviewed 10 software engineers and surveyed 78 others and found that while programmers often do work while mobile, their existing mobile work practices are primarily exploratory (e.g., capturing thoughts or performing online research). In contrast, they want to be doing work that is more grounded in their existing code (e.g., code review or bug triage). Based on these findings, we introduce Mercury, a system that guides programmers in making progress on-the-go with auto-generated microtasks derived from their source code's current state. A study of Mercury with 20 programmers revealed that they could make meaningful progress with Mercury while mobile with little effort or attention. Our findings suggest an opportunity exists to support the continuation of programming tasks across devices and help programmers resume coding upon returning to their workspace.
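A minimal sketch of deriving mobile-friendly microtasks from the current state of source code follows; the heuristics (TODO comments, missing docstrings) and the Microtask fields are illustrative assumptions, not Mercury's actual task generator.

# Sketch: turn the current state of a source file into small questions a
# programmer could answer on the go. Heuristics here are illustrative only.
import ast
from dataclasses import dataclass

@dataclass
class Microtask:
    kind: str       # e.g., "answer-TODO" or "write-docstring"
    location: str   # file and line
    prompt: str     # a short question answerable away from the workspace

def microtasks_for_source(source, filename="example.py"):
    tasks = []
    # TODO comments become small "clarify intent" tasks.
    for lineno, line in enumerate(source.splitlines(), start=1):
        if "TODO" in line:
            tasks.append(Microtask("answer-TODO", f"{filename}:{lineno}",
                                   f"What should be done here? {line.strip()}"))
    # Functions without docstrings become short documentation tasks.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and ast.get_docstring(node) is None:
            tasks.append(Microtask("write-docstring", f"{filename}:{node.lineno}",
                                   f"Summarize what `{node.name}` does in one sentence."))
    return tasks

sample = "def parse(line):\n    return line.split(',')  # TODO: handle quoted fields\n"
for task in microtasks_for_source(sample):
    print(task.kind, task.location, task.prompt)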
We present X-Droid, a framework that provides Android app developers with the ability to quickly and easily produce functional prototypes. Our work is motivated by the need for such an ability and the lack of tools that provide it. Developers want to produce a functional prototype rapidly to test out potential features in real-life situations. However, current prototyping tools for mobile apps are limited to creating non-functional UI mockups that do not demonstrate actual features. With X-Droid, developers can create a new app that imports various kinds of functionality provided by other existing Android apps. In doing so, developers do not need to understand how other Android apps are implemented or have access to their source code. X-Droid provides a developer tool that enables developers to use the UIs of other Android apps and import desired functions into their prototypes. X-Droid also provides a run-time system that executes other apps' functionality in the background on off-the-shelf Android devices for seamless integration. Our evaluation shows that with the help of X-Droid, a developer imported a function from an existing Android app into a new prototype with only 51 lines of Java code, while the function itself requires 10,334 lines of Java code to implement (an approximately 200× improvement).
Instructors of hardware computing face many challenges, including maintaining awareness of student progress, allocating their time adequately between lecturing and helping individual students, and keeping students engaged even while debugging problems. Based on formative interviews with 5 electronics instructors, we found that many circuit style behaviors could help novice users prevent or efficiently debug common problems. Drawing inspiration from the software engineering practice of coding style, these circuit style behaviors consist of best practices and guidelines for implementing circuit prototypes that do not interfere with the functionality of the circuit, but help a circuit be more readable, less error-prone, and easier to debug. To examine whether these circuit style behaviors could be peripherally enforced, aid an in-person instructor's ability to facilitate a workshop, and not monopolize the instructor's attention, we developed CircuitStyle, a teaching aid for in-person hardware computing workshops. To evaluate the effectiveness of our tool, we deployed our system in an in-person maker-space workshop. The instructor appreciated CircuitStyle's ability to provide a broad understanding of the workshop's progress and the potential for our system to help instructors of various backgrounds better engage with and understand the needs of their classroom.
We propose blending the virtual and physical worlds for prototyping circuits using physical proxies. With physical proxies, real-world components (e.g. a motor or light sensor) can be used with a virtual counterpart for a circuit designed in software. We demonstrate this concept in Proxino and elucidate the new scenarios it enables for makers, such as remote collaboration with physically distributed electronic components. In a system evaluation, we compared our hybrid system's output with designs of real circuits and observed minimal differences. We then present the results of an informal study with 9 users, in which we gathered feedback on the effectiveness of our system in different working conditions (on a desktop, on a mobile device, and with a remote collaborator). We conclude by sharing lessons learned from our system and discuss directions for future research that blends physical and virtual prototyping for electronic circuits.
Smartphone augmented reality (AR) lets users interact with physical and virtual spaces simultaneously. With 3D hand tracking, smartphones become an apparatus for grabbing and moving virtual objects directly. Based on design considerations for interaction, mobility, and object appearance and physics, we implemented a prototype for portable 3D hand tracking using a smartphone, a Leap Motion controller, and a computation unit. Following an experience prototyping procedure, 12 researchers used the prototype to help explore usability issues and define the design space. We identified issues in perception (moving to the object, reaching for the object), manipulation (successfully grabbing and orienting the object), and behavioral understanding (knowing how to use the smartphone as a viewport). To overcome these issues, we designed object-based feedback and accommodation mechanisms and studied their perceptual and behavioral effects via two tasks: picking up distant objects, and assembling a virtual house from blocks. Our mechanisms enabled significantly faster and more successful user interaction than the initial prototype in picking up and manipulating stationary and moving objects, with lower cognitive load and greater user preference. The resulting system---Portal-ble---improves user intuition and aids free-hand interactions in mobile situations.
We present an optimization-based approach for Mixed Reality (MR) systems to automatically control when and where applications are shown, and how much information they display. Currently, content creators design applications, and users then manually adjust which applications are visible and how much information they show. This choice has to be adjusted every time users switch context, i.e., whenever they switch their task or environment. Since context switches happen many times a day, we believe that MR interfaces require automation to alleviate this problem. We propose a real-time approach to automate this process based on users' current cognitive load and knowledge about their task and environment. Our system adapts which applications are displayed, how much information they show, and where they are placed. We formulate this problem as a mix of rule-based decision making and combinatorial optimization, which can be solved efficiently in real time. We present a set of proof-of-concept applications showing that our approach is applicable in a wide range of scenarios. Finally, we show in a dual-task evaluation that our approach decreased secondary-task interactions by 36%.
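To illustrate the flavor of the combinatorial optimization involved, the sketch below brute-forces a level of detail (hidden, minimal, full) for each application under a cognitive-load budget; the utilities, costs, and budget are invented for illustration and are not the paper's formulation.

# Sketch: pick a level of detail per MR app so total utility is maximized
# without exceeding a cognitive-load budget. Numbers are illustrative only.
from itertools import product

apps = {
    # app: (utility per level, cognitive cost per level) for levels 0..2
    "navigation": ([0, 5, 8], [0, 1, 3]),
    "messages":   ([0, 3, 6], [0, 2, 4]),
    "calendar":   ([0, 2, 4], [0, 1, 2]),
}

def best_assignment(apps, load_budget):
    names = list(apps)
    best, best_utility = None, -1
    for levels in product(range(3), repeat=len(names)):
        utility = sum(apps[n][0][l] for n, l in zip(names, levels))
        cost = sum(apps[n][1][l] for n, l in zip(names, levels))
        if cost <= load_budget and utility > best_utility:
            best, best_utility = dict(zip(names, levels)), utility
    return best, best_utility

print(best_assignment(apps, load_budget=5))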
Remotely instructing and guiding users in physical tasks has shown promise across a wide variety of domains. While it has been the subject of many research projects, current approaches are often limited in the communication bandwidth (lacking context and spatial information) or interactivity (unidirectional, asynchronous) between the expert and the learner. Systems that use Mixed Reality for this purpose have rigid configurations for the expert and the learner. We explore the design space of bi-directional mixed-reality telepresence systems for teaching physical tasks, and present Loki, a novel system that explores the various dimensions of this space. Loki leverages video, audio, and spatial capture along with mixed-reality presentation methods to allow users to explore and annotate the local and remote environments, and to record and review their own performance as well as their peer's. The system design of Loki also enables easy transitions between different configurations within the explored design space. We validate its utility through a varied set of scenarios and a qualitative user study.
Augmented Reality (AR) has the potential to expand our capability for interacting with and comprehending our surrounding environment. However, current AR devices treat electronic appliances no differently from common non-interactive objects, which substantially limits the functionality of AR. We present InfoLED, a positioning and communication system based on indicator lights that enables appliances to transmit their location, device IDs, and status information to an AR client without changing their visual design. By leveraging human insensitivity to high-frequency brightness flickering, InfoLED transmits all of that information without disturbing its original function as an indicator light. We envision InfoLED being used in three categories of application: malfunctioning device diagnosis, appliance control, and multi-appliance configuration. We conducted three user studies, measuring the performance of the InfoLED system, the human readability of the patterns and colors displayed on the InfoLED, and users' overall preference for InfoLED. The study results showed that InfoLED works properly from a distance of up to 7 meters in indoor conditions and does not interfere with participants' ability to comprehend the high-level patterns and colors of the indicator light. Overall, study participants preferred InfoLED to an ArUco 2D barcode-based baseline system and reported lower cognitive load when using our system.
Augmented reality requires precise and instant overlay of digital information onto everyday objects. We present our work on LightAnchors, a new method for displaying spatially-anchored data. We take advantage of pervasive point lights - such as LEDs and light bulbs - for both in-view anchoring and data transmission. These lights are blinked at high speed to encode data. We built a proof-of-concept application that runs on iOS without any hardware or software modifications. We also ran a study to characterize the performance of LightAnchors and built eleven example demos to highlight the potential of our approach.
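As a rough illustration of encoding data in high-speed blinks, the following sketch sends one byte as Manchester-style bright/dim frame pairs after a short preamble and decodes it from a frame sequence; the framing and preamble are illustrative assumptions, not LightAnchors' actual protocol.

# Sketch: encode a byte into per-frame brightness states and decode it back.
# The preamble and Manchester-style framing are illustrative assumptions.
PREAMBLE = [1, 1, 1, 0]

def encode(byte_value):
    frames = list(PREAMBLE)
    for i in range(7, -1, -1):
        bit = (byte_value >> i) & 1
        frames += [1, 0] if bit else [0, 1]   # 1 = LED bright, 0 = LED dim
    return frames

def decode(frames):
    start = None
    for i in range(len(frames) - len(PREAMBLE) - 16 + 1):
        if frames[i:i + len(PREAMBLE)] == PREAMBLE:
            start = i + len(PREAMBLE)
            break
    if start is None:
        return None
    value = 0
    for k in range(8):
        pair = frames[start + 2 * k: start + 2 * k + 2]
        value = (value << 1) | (1 if pair == [1, 0] else 0)
    return value

assert decode(encode(0x5A)) == 0x5A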
An ideal Mixed Reality (MR) system would only present virtual information (e.g., a label) when it is useful to the person. However, deciding when a label is useful is challenging: it depends on a variety of factors, including the current task, previous knowledge, context, etc. In this paper, we propose a Reinforcement Learning (RL) method to learn when to show or hide an object's label given eye movement data. We demonstrate the capabilities of this approach by showing that an intelligent agent can learn cooperative policies that better support users in a visual search task than manually designed heuristics. Furthermore, we show the applicability of our approach to more realistic environments and use cases (e.g., grocery shopping). By posing MR object labeling as a model-free RL problem, we can learn policies implicitly by observing users' behavior without requiring a visual search model or data annotation.
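The following toy Q-learning sketch conveys the show/hide decision in a bandit-like setting where the state is a coarse gaze-dwell bucket; the reward model and discretization are invented for illustration and do not reflect the paper's environment or training setup.

# Toy sketch: learn when showing a label is worthwhile from simulated gaze
# behavior. States, rewards, and hyperparameters are illustrative only.
import random

N_DWELL_BUCKETS = 4          # 0 = quick glance ... 3 = long fixation
ACTIONS = [0, 1]             # 0 = hide label, 1 = show label
Q = [[0.0, 0.0] for _ in range(N_DWELL_BUCKETS)]
alpha, epsilon = 0.1, 0.1    # learning rate and exploration rate

def simulated_reward(dwell_bucket, action):
    wants_label = dwell_bucket >= 2          # assumption: long fixation implies interest
    if action == 1:
        return 1.0 if wants_label else -1.0  # clutter penalty for labels during glances
    return 0.2 if not wants_label else -0.5  # hiding a wanted label also costs

for _ in range(5000):
    s = random.randrange(N_DWELL_BUCKETS)
    a = random.choice(ACTIONS) if random.random() < epsilon else int(Q[s][1] > Q[s][0])
    r = simulated_reward(s, a)
    Q[s][a] += alpha * (r - Q[s][a])         # one-step (contextual-bandit) update

policy = ["show" if Q[s][1] > Q[s][0] else "hide" for s in range(N_DWELL_BUCKETS)]
print(policy)   # expected: ['hide', 'hide', 'show', 'show']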
Sketching is an effective communication medium that augments and enhances what can be communicated in text. We introduce Sketchforme, the first neural-network-based system that can generate complex sketches based on text descriptions specified by users. Sketchforme's key contribution is to factor complex sketch generation into layout and rendering subtasks handled by neural networks. The sketches composed by Sketchforme are expressive and realistic: our user study shows that these sketches convey descriptions better than human-generated sketches in several cases, and that 36.5% of them were identified as human-generated. We develop several interactive applications using these generated sketches, and show that Sketchforme can significantly improve language learning applications and support intelligent language-based sketching assistants.
We present a capture-time tool designed to help casual photographers orient their subject to achieve a user-specified target facial appearance. The inputs to our tool are an HDR environment map of the scene captured using a 360 camera, and a target facial appearance, selected from a gallery of common studio lighting styles. Our tool computes the optimal orientation for the subject to achieve the target lighting using a computationally efficient precomputed radiance transfer-based approach. It then tells the photographer how far to rotate about the subject. Optionally, our tool can suggest how to orient a secondary external light source (e.g. a phone screen) about the subject's face to further improve the match to the target lighting. We demonstrate the effectiveness of our approach in a variety of indoor and outdoor scenes using many different subjects to achieve a variety of looks. A user evaluation suggests that our tool reduces the mental effort required by photographers to produce well-lit portraits.
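A highly simplified sketch of the underlying orientation search appears below: it scans candidate subject yaws and scores how well the dominant scene light, expressed in the face's frame, matches a target key-light direction. Real PRT-based relighting scores full facial appearance rather than a single light direction, and all names and numbers here are illustrative assumptions.

# Sketch: brute-force the subject yaw whose lighting direction best matches a
# target style. A toy stand-in for the PRT-based optimization, not the real one.
import math

def rotate_yaw(v, yaw_deg):
    """Rotate a 3-D vector about the vertical (y) axis."""
    yaw = math.radians(yaw_deg)
    x, y, z = v
    return (x * math.cos(yaw) - z * math.sin(yaw), y,
            x * math.sin(yaw) + z * math.cos(yaw))

def best_subject_yaw(dominant_light_dir, target_dir_in_face_frame, step_deg=5):
    best_yaw, best_score = 0, -2.0
    for yaw in range(0, 360, step_deg):
        # Light direction expressed in the face frame if the subject turns by `yaw`.
        lx, ly, lz = rotate_yaw(dominant_light_dir, -yaw)
        tx, ty, tz = target_dir_in_face_frame
        score = lx * tx + ly * ty + lz * tz   # cosine similarity for unit vectors
        if score > best_score:
            best_yaw, best_score = yaw, score
    return best_yaw

# Example: sun roughly toward +x; target look: key light 45 degrees to the side, slightly above.
sun = (1.0, 0.0, 0.0)
target = (math.sin(math.radians(45)), 0.2, math.cos(math.radians(45)))
norm = math.sqrt(sum(c * c for c in target))
target = tuple(c / norm for c in target)
print(best_subject_yaw(sun, target))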
We present Videostrates, a concept and a toolkit for creating real-time collaborative video editing tools. Videostrates supports both live and recorded video composition with a declarative HTML-based notation, combining both simple and sophisticated editing tools that can be used collaboratively. Videostrates is programmable and unleashes the power of the modern web platform for video manipulation. We demonstrate its potential through three use scenarios: collaborative video editing with multiple tools and devices; orchestration of multiple live streams that are recorded and broadcast to a popular streaming platform; and programmatic creation of video using WebGL and shaders for blue screen effects. These scenarios only scratch the surface of Videostrates' potential, which opens up a design space for novel collaborative video editors with fully programmable interfaces.
A major concern for filmmakers creating 360° video is ensuring that the viewer does not miss important narrative elements because they are looking in the wrong direction. This paper introduces gated clips, which do not play the video past a gate time until a filmmaker-defined viewer gaze condition is met, such as looking at a specific region of interest (ROI). Until the condition is met, we seamlessly loop video playback using view-dependent video textures, a new variant of standard video textures that adapts the looping behavior to the portion of the scene that is within the viewer's field of view. We use our desktop GUI to edit live-action and computer-animated 360° videos. In a user study with casual viewers, participants preferred our looping videos over the standard versions and were able to see all of the looping videos' ROIs without fear of missing important narrative content.
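To convey the gating logic, the sketch below advances playback until the gate time and then loops a fixed segment until the viewer's gaze satisfies the ROI condition; the fixed loop window stands in for the paper's view-dependent video textures, and all parameters are illustrative assumptions.

# Sketch: a gate that loops playback until the viewer looks at the ROI.
class GatedPlayer:
    def __init__(self, gate_time, loop_start, roi_check):
        self.gate_time = gate_time      # time (s) at which playback may pause
        self.loop_start = loop_start    # start of the loop segment (a fixed stand-in)
        self.roi_check = roi_check      # callable(gaze_dir) -> bool
        self.gate_open = False

    def next_time(self, current_time, dt, gaze_dir):
        if not self.gate_open and self.roi_check(gaze_dir):
            self.gate_open = True
        t = current_time + dt
        if not self.gate_open and t >= self.gate_time:
            return self.loop_start      # jump back and keep looping
        return t

# Example: ROI is "looking roughly forward" (positive z in the scene frame).
player = GatedPlayer(gate_time=12.0, loop_start=9.5,
                     roi_check=lambda gaze: gaze[2] > 0.9)
t = 11.9
for gaze in [(0.0, 0.0, -1.0), (0.0, 0.0, -1.0), (0.1, 0.0, 0.99)]:
    t = player.next_time(t, dt=0.2, gaze_dir=gaze)
    print(round(t, 2))   # loops back to 9.5, then advances once the gaze hits the ROI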
While hair is an essential component of virtual humans, it is also one of the most challenging digital assets to create. Existing automatic techniques lack the generality and flexibility to create rich hair variations, while manual authoring interfaces often require considerable artistic skill and effort, especially for intricate 3D hair structures that can be difficult to navigate. We propose an interactive hair modeling system that can help create complex hairstyles in minutes or hours that would otherwise take much longer with existing tools. Modelers, including novice users, can focus on the overall hairstyles and local hair deformations, as our system intelligently suggests the desired hair parts. Our method combines the flexibility of manual authoring and the convenience of data-driven automation. Since hair contains intricate 3D structures such as buns, knots, and strands, it is inherently challenging to create using traditional 2D interfaces. Our system provides a new 3D hair authoring interface for immersive interaction in virtual reality (VR). Users can draw high-level guide strips, from which our system predicts the most plausible hairstyles via a deep neural network trained on a professionally curated dataset. Each hairstyle in our dataset is composed of multiple variations, serving as blend-shapes to fit the user drawings via global blending and local deformation. The fitted hair models are visualized as interactive suggestions that the user can select, modify, or ignore. We conducted a user study to confirm that our system can significantly reduce manual labor while improving the output quality for modeling a variety of head and facial hairstyles that are challenging to create via existing techniques.
For creative tasks, programmers face a choice: Use a GUI and sacrifice flexibility, or write code and sacrifice ergonomics?
To obtain both flexibility and ease of use, a number of systems have explored a workflow that we call output-directed programming. In this paradigm, direct manipulation of the program's graphical output corresponds to writing code in a general-purpose programming language, and edits not possible with the mouse can still be enacted through ordinary text edits to the program. Such capabilities provide hope for integrating graphical user interfaces into what are currently text-centric programming environments.
To further advance this vision, we present a variety of new output-directed techniques that extend the expressive power of Sketch-n-Sketch, an output-directed programming system for creating programs that generate vector graphics. To enable output-directed interaction at more stages of program construction, we expose intermediate execution products for manipulation and we present a mechanism for contextual drawing. Looking forward to output-directed programming beyond vector graphics, we also offer generic refactorings through the GUI, and our techniques employ a domain-agnostic provenance tracing scheme.
To demonstrate the improved expressiveness, we implement a dozen new parametric designs in Sketch-n-Sketch without text-based edits. Among these is the first demonstration of building a recursive function in an output-directed programming setting.
Living things in nature have long relied on the ability to "heal" wounds on their soft bodies to survive in the outside environment. To impart this self-healing property to everyday interfaces, we propose Self-healing UI, a soft-bodied interface that can intrinsically self-heal damage without external stimuli or glue. The key material for achieving Self-healing UI is MWCNTs-PBS, a composite of a self-healing polymer, polyborosiloxane (PBS), and a filler material, multi-walled carbon nanotubes (MWCNTs), which retains mechanical and electrical self-healability. We developed a hybrid model that combines PBS, MWCNTs-PBS, and other common soft materials, including fabric and silicone, to build interface devices with self-healing, sensing, and actuation capabilities. These devices were implemented by layer-by-layer stacking fabrication without glue or any post-processing, by leveraging the materials' inherent self-healing property between two layers. We then demonstrated sensing primitives and interactive applications that extend the design space of shape-changing interfaces with their ability to transform, conform, reconfigure, heal, and fuse, which we believe can enrich the toolbox of human-computer interaction (HCI).
We propose a paradigm called Skin-On interfaces, in which interactive devices have their own (artificial) skin, thus enabling new forms of input gestures for end-users (e.g. twist, scratch). Our work explores the design space of Skin-On interfaces by following a bio-driven approach: (1) From a sensory point of view, we study how to reproduce the look and feel of the human skin through three user studies; (2) From a gestural point of view, we explore how gestures naturally performed on skin can be transposed to Skin-On interfaces; (3) From a technical point of view, we explore and discuss different ways of fabricating interfaces that mimic human skin sensitivity and can recognize the gestures observed in the previous study; (4) We assemble the insights of our three exploratory facets to implement a series of Skin-On interfaces and we also contribute by providing a toolkit that enables easy reproduction and fabrication.
In this paper, we present M-Hair, a novel method for providing tactile feedback by stimulating only the body hair without touching the skin. It works by applying passive magnetic materials to the body hair, which are then actuated by external magnetic fields. Our user study suggests that the value of the M-Hair mechanism lies in inducing affective sensations such as pleasantness, rather than in effectively discriminating features such as shape, size, and direction. This work invites future research to use this method in applications that induce emotional responses or affective states, and as a research tool for investigating this novel sensation.
In this paper, we explore personalized morphing food that enhances traditional food with new HCI capabilities, rather than replacing the chef and authentic ingredients (e.g. flour) with an autonomous machine and heterogeneous mixtures (e.g. gel). We contribute a unique transformation mechanism for kneaded and sheeted flour-based dough, with an integrated design strategy for morphing food during two general cooking methods: dehydration (e.g. baking) or hydration (e.g. boiling in water). We also enrich the design space of morphing food by demonstrating several applications. We end by discussing hybrid cooking between the human and a design tool we developed to ensure accuracy while preserving customizability for morphing food.
Despite the increasing popularity of soft interactive devices, their fabrication remains complex and time consuming. We contribute a process for rapid do-it-yourself fabrication of soft circuits using a conventional desktop inkjet printer. It supports inkjet printing of circuits that are stretchable, ultrathin, high resolution, and integrated with a wide variety of materials used for prototyping. We introduce multi-ink functional printing on a desktop printer for realizing multi-material devices, including conductive and isolating inks. We further present DIY techniques to enhance compatibility between inks and substrates and the circuits' elasticity. This enables circuits on a wide set of materials including temporary tattoo paper, textiles, and thermoplastic. Four application cases demonstrate versatile uses for realizing stretchable devices, e-textiles, body-based and re-shapeable interfaces.
Isolation is one of the largest contributors to a lack of wellbeing, increased anxiety, and loneliness in older adults. In collaboration with elders in living facilities, we designed the Memory Music Box, a low-threshold platform to increase connectedness. The HCI community has contributed notable research in support of elders through monitoring, tracking, and memory augmentation. Despite advances in the Information and Communication Technologies (ICT) field in providing new opportunities for connection, challenges in accessibility increase the gap between elders and their loved ones. We approach this challenge by embedding innovative applications in a familiar form factor, performing design evaluations with our key target group to incorporate learnings across multiple iterations. These findings culminate in a novel design that helps elders cross technology and communication barriers. Based on these findings, we discuss how future inclusive technologies for older adults can balance ease of use, subtlety, and elements of Cognitively Sustainable Design.
Blind people frequently encounter inaccessible dynamic touchscreens in their everyday lives that are difficult, frustrating, and often impossible to use independently. Touchscreens are often the only way to control everything from coffee machines and payment terminals, to subway ticket machines and in-flight entertainment systems. Interacting with dynamic touchscreens is difficult non-visually because the visual user interfaces change, interactions often occur over multiple different screens, and it is easy to accidentally trigger interface actions while exploring the screen. To solve these problems, we introduce StateLens - a three-part reverse engineering solution that makes existing dynamic touchscreens accessible. First, StateLens reverse engineers the underlying state diagrams of existing interfaces using point-of-view videos found online or taken by users using a hybrid crowd-computer vision pipeline. Second, using the state diagrams, StateLens automatically generates conversational agents to guide blind users through specifying the tasks that the interface can perform, allowing the StateLens iOS application to provide interactive guidance and feedback so that blind users can access the interface. Finally, a set of 3D-printed accessories enable blind people to explore capacitive touchscreens without the risk of triggering accidental touches on the interface. Our technical evaluation shows that StateLens can accurately reconstruct interfaces from stationary, hand-held, and web videos; and, a user study of the complete system demonstrates that StateLens successfully enables blind users to access otherwise inaccessible dynamic touchscreens.
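As a rough sketch of how a reverse-engineered state diagram can drive guidance, the code below runs breadth-first search over screens and button labels to produce step-by-step instructions that could be spoken aloud; the example coffee-machine interface is invented for illustration, not one reconstructed by StateLens.

# Sketch: screens as nodes, labeled touch targets as edges, BFS for guidance.
from collections import deque

# state -> {button label: next state} (a hypothetical example machine)
coffee_machine = {
    "home":     {"Coffee": "size", "Tea": "tea_size"},
    "size":     {"Small": "payment", "Large": "payment", "Back": "home"},
    "tea_size": {"Small": "payment", "Back": "home"},
    "payment":  {"Pay with card": "done"},
}

def guidance(states, start, goal):
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, steps = queue.popleft()
        if state == goal:
            return steps
        for label, nxt in states.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + [f'Press "{label}"']))
    return None

print(guidance(coffee_machine, "home", "done"))
# -> ['Press "Coffee"', 'Press "Small"', 'Press "Pay with card"']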
Navigating stairs is a dangerous mobility challenge for people with low vision, who have a visual impairment that falls short of blindness. Prior research contributed systems for stair navigation that provide audio or tactile feedback, but people with low vision have usable vision and don't typically use nonvisual aids. We conducted the first exploration of augmented reality (AR) visualizations to facilitate stair navigation for people with low vision. We designed visualizations for a projection-based AR platform and smartglasses, considering the different characteristics of these platforms. For projection-based AR, we designed visual highlights that are projected directly on the stairs. In contrast, for smartglasses that have a limited vertical field of view, we designed visualizations that indicate the user's position on the stairs, without directly augmenting the stairs themselves. We evaluated our visualizations on each platform with 12 people with low vision, finding that the visualizations for projection-based AR increased participants' walking speed. Our designs on both platforms largely increased participants' self-reported psychological security.
People using white canes for navigation find it challenging to concurrently access devices such as smartphones. Building on prior research on the abandonment of specialized devices, we explore a new touch-free mode of interaction wherein a person with visual impairment can perform gestures on their existing white cane to trigger tasks on their smartphone. We present GesturePod, an easy-to-integrate device that clips onto any white cane and detects gestures performed with the cane. With GesturePod, a user can perform common tasks on their smartphone without touching it or even removing it from their pocket or bag. We discuss the challenges in building the device and our design choices. We propose a novel, efficient machine learning pipeline to train and deploy the gesture recognition model. Our in-lab study shows that GesturePod achieves 92% gesture recognition accuracy and can help perform common smartphone tasks faster. Our in-the-wild study suggests that GesturePod is a promising tool to improve smartphone access for people with visual impairments, especially in constrained outdoor scenarios.
We present FaceWidgets, a device integrated with the backside of a head-mounted display (HMD) that enables tangible interactions using physical controls. To allow for near range-to-eye interactions, our first study suggested displaying the virtual widgets at 20 cm from the eye positions, which is 9 cm from the HMD backside. We propose two novel interactions, widget canvas and palm-facing gesture, that help users avoid double vision and allow them to access the interface as needed. Our second study showed that displaying a hand reference improved performance of FaceWidgets interactions. We developed two applications of FaceWidgets, a fixed-layout 360 video player and a contextual input for smart home control. Finally, we compared four hand visualizations across the two applications in an exploratory study. Participants considered the transparent hand the most suitable and responded positively to our system.
Impact is a common effect in both daily life and virtual reality (VR) experiences, e.g., being punched, hit, or bumped. Impact force is produced instantly, which distinguishes it from other force feedback, e.g., push and pull. We propose ElastImpact to provide 2.5D instant impact on a head-mounted display (HMD) for realistic and versatile VR experiences. ElastImpact consists of three impact devices, also called impactors. Each impactor blocks an elastic band with a mechanical brake using a servo motor and extends it using a DC motor to store impact power. When the brake is released, the impact is delivered instantly. Two impactors are affixed on both sides of the head and connected with the HMD to provide impact in the normal direction toward the face (i.e., 0.5D in the z-axis). The other impactor is connected with a proxy collider in a barrel in front of the HMD and rotated by a DC motor in the tangential plane of the face to provide 2D impact (i.e., the xy-plane). Through a just-noticeable difference (JND) study, we characterized users' ability to distinguish impact forces on the head in the normal direction and in the tangential plane, separately. Based on the results, we combine normal and tangential impact as 2.5D impact, and we performed a VR experience study to verify that the proposed 2.5D impact significantly enhances realism.
We propose integrating an array of skin stretch modules with a head-mounted display (HMD) to provide two-dimensional skin stretch feedback on the user's face. Skin stretch has been found effective in inducing the perception of force (e.g. weight or inertia) and in enabling directional haptic cues. However, its potential as an HMD output for virtual reality (VR) remains to be exploited. Our exploratory study first investigated the design of shear tactors. Based on the results, we implemented Masque, an HMD prototype that actuates six shear tactors positioned on the HMD's face interface. A comfort study was conducted to ensure that the skin stretches generated by Masque are acceptable to all participants. Two subsequent perception studies examined the minimum changes in skin stretch distance and stretch angle that participants can detect. The results helped us design haptic profiles as well as our prototype applications. Finally, a user evaluation indicates that participants welcomed Masque and regarded skin stretch feedback as a worthwhile addition to HMD output.
Low-cost, smartphone-powered VR/AR headsets are becoming more popular. These basic devices - little more than plastic or cardboard shells - lack advanced features, such as controllers for the hands, limiting their interactive capability. Moreover, even high-end consumer headsets lack the ability to track the body and face. For this reason, interactive experiences like social VR are underdeveloped. We introduce MeCap, which enables commodity VR headsets to be augmented with powerful motion capture ("MoCap") and user-sensing capabilities at very low cost (under $5). Using only a pair of hemi-spherical mirrors and the existing rear-facing camera of a smartphone, MeCap provides real-time estimates of a wearer's 3D body pose, hand pose, facial expression, physical appearance and surrounding environment - capabilities which are either absent in contemporary VR/AR systems or which require specialized hardware and controllers. We evaluate the accuracy of each of our tracking features, the results of which show imminent feasibility.
We explore the use of hand gestures for authoring animations in virtual reality (VR). We first perform a gesture elicitation study to understand user preferences for a spatiotemporal, bare-handed interaction system in VR. Specifically, we focus on creating and editing dynamic, physical phenomena (e.g., particle systems, deformations, coupling), where the mapping from gestures to animation is ambiguous and indirect. We present commonly observed mid-air gestures from the study that cover a wide range of interaction techniques, from direct manipulation to abstract demonstrations. To this end, we extend existing gesture taxonomies to the rich spatiotemporal interaction space of the target domain and distill our findings into a set of guidelines that inform the design of natural user interfaces for VR animation. Finally, based on our guidelines, we develop a proof-of-concept gesture-based VR animation system, MagicalHands. Our results, as well as feedback from user evaluation, suggest that the expressive qualities of hand gestures help users animate more effectively in VR.
Designing and implementing human-robot interactions requires numerous skills, from having a rich understanding of social interactions and the capacity to articulate their subtle requirements, to the ability to then program a social robot with the many facets of such a complex interaction. Although designers are best suited to develop and implement these interactions due to their inherent understanding of the context and its requirements, these skills are a barrier to enabling designers to rapidly explore and prototype ideas: it is impractical for designers to also be experts on social interaction behaviors, and the technical challenges associated with programming a social robot are prohibitive. In this work, we introduce Synthé, which allows designers to act out, or bodystorm, multiple demonstrations of an interaction. These demonstrations are automatically captured and translated into prototypes for the design team using program synthesis. We evaluate Synthé in multiple design sessions involving pairs of designers bodystorming interactions and observing the resulting models on a robot. We build on the findings from these sessions to improve the capabilities of Synthé and demonstrate the use of these capabilities in a second design session.
We introduce shape-changing swarm robots. A swarm of self-transformable robots can both individually and collectively change their configuration to display information, actuate objects, act as tangible controllers, visualize data, and provide physical affordances. ShapeBots is a concept prototype of shape-changing swarm robots. Each robot can change its shape by leveraging small linear actuators that are thin (2.5 cm) and highly extendable (up to 20 cm) in both horizontal and vertical directions. The modular design of each actuator enables various shapes and geometries of self-transformation. We illustrate potential application scenarios and discuss how this type of interface opens up possibilities for the future of ubiquitous and distributed shape-changing interfaces.
We propose Third-Person Piloting, a novel drone manipulation interface that increases situational awareness using an interactive third-person perspective from a second, spatially coupled drone. The pilot uses a controller with a manipulatable miniature drone. Our algorithm tracks the relationship between the pilot's eye position and the miniature drone and ensures that the same spatial relationship is maintained between the two real drones in the sky. This allows the pilot to obtain various third-person perspectives by changing the orientation of the miniature drone while maintaining standard primary drone control through the conventional controller. We design and implement a working prototype with programmable drones and propose several representative operation scenarios. We gathered user feedback to obtain initial insights into our interface design from novices, advanced beginners, and experts. Our results suggest that the interactive third-person perspective provided by the second drone offers clear potential for increasing situational awareness and supporting primary drone operation.
We present GhostAR, a time-space editor for authoring and acting Human-Robot-Collaborative (HRC) tasks in-situ. Our system adopts an embodied authoring approach in Augmented Reality (AR) for spatially editing actions and programming robots through demonstrative role-playing. We propose a novel HRC workflow that externalizes the user's authoring as demonstrative and editable AR ghosts, allowing for spatially situated visual referencing, realistic animated simulation, and collaborative action guidance. We develop a dynamic time warping (DTW) based collaboration model which takes the real-time captured motion as input, maps it to the previously authored human actions, and outputs the corresponding robot actions to achieve adaptive collaboration. We emphasize in-situ authoring and rapid iteration of joint plans without an offline training process. Further, we demonstrate and evaluate the effectiveness of our workflow through HRC use cases and a three-session user study.
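The dynamic time warping component can be illustrated with a minimal sketch that aligns a live 1-D motion trace against authored reference traces and picks the closest action; the traces and nearest-action lookup are simplifications for illustration, not GhostAR's actual collaboration model.

# Sketch: classic DTW alignment cost used to match a live trace to an authored action.
def dtw_distance(a, b):
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def match_action(live_trace, authored_actions):
    """authored_actions: {action name: reference trace}; returns the best-matching name."""
    return min(authored_actions,
               key=lambda name: dtw_distance(live_trace, authored_actions[name]))

authored = {"reach_forward": [0, 1, 2, 3, 3, 3], "wave": [0, 2, 0, 2, 0, 2]}
live = [0, 0, 1, 2, 3, 3]
print(match_action(live, authored))   # -> "reach_forward"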
We introduce new ways of using a 3D printer head as a 3-axis robotic manipulator to enable advanced fabrication and usage, such as assembling separately printed parts, breaking support materials, and actuating printed objects on a build plate. To achieve these manipulations, we customize a low-cost fused deposition modeling (FDM) 3D printer that can attach and detach printed end-effectors that change the function of the printer head (e.g. grab, break, and rotate printed objects). These techniques allow the 3D printer to fabricate and assemble complete kinetic objects such as automatons without manual processing (i.e. removing support materials and assembling objects). We conclude that a small modification to a standard 3D printer allows us to fabricate and assemble objects without human intervention.
Synner is a tool that helps users generate real-looking synthetic data by visually and declaratively specifying the properties of the dataset such as each field's statistical distribution, its domain, and its relationship to other fields. It provides instant feedback on every user interaction by updating multiple visualizations of the generated dataset and even suggests data generation specifications from a few user examples and interactions. Synner visually communicates the inherent randomness of statistical data generation. Our evaluation of Synner demonstrates its effectiveness at generating realistic data when compared with Mockaroo, a popular data generation tool, and with hired developers who coded data generation scripts for a fee.
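A minimal sketch of declaratively generating such data follows: each field draws from a distribution, and dependent fields are functions of other fields plus noise; the specification style, field names, and numbers are illustrative assumptions, not Synner's notation or suggestions.

# Sketch: generate realistic-looking rows with per-field distributions and a
# dependent field. All distributions and parameters are illustrative only.
import random

random.seed(0)

def generate(n):
    rows = []
    for _ in range(n):
        age = max(18, min(80, int(random.gauss(40, 12))))        # truncated normal
        title = random.choices(["engineer", "manager", "director"],
                               weights=[0.7, 0.2, 0.1])[0]       # categorical field
        base = {"engineer": 70_000, "manager": 90_000, "director": 120_000}[title]
        salary = round(base + 800 * (age - 30) + random.gauss(0, 5_000))  # depends on age and title
        rows.append({"age": age, "title": title, "salary": salary})
    return rows

for row in generate(3):
    print(row)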
Programmers, researchers, system administrators, and data scientists often build complex workflows based on command-line applications. To give these power users the well-known benefits of GUIs, we created Bespoke, a system that synthesizes custom GUIs by observing user demonstrations of command-line apps. Bespoke unifies the two main forms of desktop human-computer interaction (command-line and GUI) via a hybrid approach that combines the flexibility and composability of the command line with the usability and discoverability of GUIs. To assess the versatility of Bespoke, we ran an open-ended study where participants used it to create their own GUIs in domains that personally motivated them. They made a diverse set of GUIs for use cases such as cloud computing management, machine learning prototyping, lecture video transcription, integrated circuit design, remote code deployment, and gaming server management. Participants reported that the benefit of these bespoke GUIs was that they exposed only the most relevant subset of options required for their specific needs. In contrast, vendor-made GUIs usually include far more panes, menus, and settings since they must accommodate a wider range of use cases.
Natural language programming is a promising approach to enable end users to instruct new tasks for intelligent agents. However, our formative study found that end users often use unclear, ambiguous, or vague concepts when instructing tasks in natural language, especially when specifying conditionals. Existing systems have limited support for letting the user teach agents new concepts or explain unclear concepts. In this paper, we describe a new multi-modal, domain-independent approach that combines natural language programming and programming-by-demonstration to allow users to first naturally describe tasks and associated conditions at a high level, and then collaborate with the agent to recursively resolve any ambiguities or vagueness through conversations and demonstrations. Users can also define new procedures and concepts by demonstrating and referring to contents within the GUIs of existing mobile apps. We demonstrate this approach in PUMICE, an end-user programmable agent that implements it. A lab study with 10 users showed its usability.
Though statistical analyses are centered on research questions and hypotheses, current statistical analysis tools are not. Users must first translate their hypotheses into specific statistical tests and then perform API calls with functions and parameters. To do so accurately requires that users have statistical expertise. To lower this barrier to valid, replicable statistical analysis, we introduce Tea, a high-level declarative language and runtime system. In Tea, users express their study design, any parametric assumptions, and their hypotheses. Tea compiles these high-level specifications into a constraint satisfaction problem that determines the set of valid statistical tests and then executes them to test the hypothesis. We evaluate Tea using a suite of statistical analyses drawn from popular tutorials. We show that Tea generally matches the choices of experts while automatically switching to non-parametric tests when parametric assumptions are not met. We simulate the effect of mistakes made by non-expert users and show that Tea automatically avoids both false negatives and false positives that could be produced by the application of incorrect statistical tests.
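To illustrate assumption-aware test selection in this spirit, the sketch below checks normality and equal-variance assumptions for two independent groups and falls back to a non-parametric test when they are violated; the thresholds and decision rule are a hand-written simplification, not Tea's constraint-based solver or language.

# Sketch: pick a statistical test for two independent groups based on whether
# parametric assumptions hold. Decision rule and alpha are illustrative only.
from scipy import stats

def compare_two_groups(a, b, alpha=0.05):
    normal_a = stats.shapiro(a).pvalue > alpha
    normal_b = stats.shapiro(b).pvalue > alpha
    if normal_a and normal_b:
        equal_var = stats.levene(a, b).pvalue > alpha
        test = "Student's t" if equal_var else "Welch's t"
        result = stats.ttest_ind(a, b, equal_var=equal_var)
    else:
        test = "Mann-Whitney U"
        result = stats.mannwhitneyu(a, b, alternative="two-sided")
    return test, result.pvalue

group_a = [5.1, 4.9, 5.6, 5.0, 5.3, 5.2, 4.8, 5.4]
group_b = [5.9, 6.1, 5.8, 6.3, 6.0, 5.7, 6.2, 6.4]
print(compare_two_groups(group_a, group_b))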
Machine learning (ML) can be hard to master, but what first trips up novices is something much more mundane: the incidental complexities of installing and configuring software development environments. Everyone has a web browser, so can we let people experiment with ML within the context of any webpage they visit? This paper's contribution is the idea that the web can serve as a contextualized prototyping environment for ML by enabling analyses to occur within the context of data on actual webpages rather than in isolated silos. We realized this idea by building Mallard, a browser extension that scaffolds acquiring and parsing web data, prototyping with pretrained ML models, and augmenting webpages with ML-driven results and interactions. To demonstrate the versatility of Mallard, we performed a case study where we used it to prototype nine ML-based browser apps, including augmenting Amazon and Twitter websites with sentiment analysis, augmenting restaurant menu websites with OCR-based search, using real-time face tracking to control a Pac-Man game, and style transfer on Google image search results. These case studies show that Mallard is capable of supporting a diverse range of hobbyist-level ML prototyping projects.
We conducted research on a display that presents digital information using bubbles. Conventional bubble displays require moving parts because they typically use air taken from outside the water to represent pixels, which makes it difficult to increase the number of pixels at low cost. We propose a liquid-surface display using pixels of bubble clusters generated by electrolysis, and present the cup-type device BubBowl, which generates a 10×10 pixel dot matrix pattern on the surface of a beverage. Our technique requires neither a gas supply from the outside nor moving parts. Using the proposed electrolysis method, a higher-resolution display can easily be realized using a PCB with a higher density of matrix electrodes. Moreover, the method is simple and practical, and can be utilized in daily life, such as for presenting information using bubbles on the surface of coffee in a cup.
We present an interactive 360-degree tabletop display system for collaborative work around a round table. Users can see 3D objects on the tabletop display from anywhere around the table without 3D glasses. The system uses a visual perceptual mechanism for smooth motion parallax in the horizontal direction with fewer projectors than previous work. A 360-degree camera mounted above the table and image recognition software detect users' positions around the table and the heights of their faces (eyes) as they move around the table in real time. These mechanisms help display correct vertical and horizontal motion parallax for different users simultaneously. Our system also supports user interaction via a tablet device for manipulating 3D objects displayed on the table. These functions support collaborative work and communication between users. We implemented a prototype system and demonstrated the collaborative features of the 360-degree tabletop display system.
We present TilePoP, a new type of pneumatically-actuated interface deployed as floor tiles which dynamically pop up by inflating into large shapes constructing proxy objects for whole-body interactions in Virtual Reality. TilePoP consists of a 2D array of stacked cube-shaped airbags designed with specific folding structures, enabling each airbag to be inflated into a physical proxy and then deflated down back to its original tile shape when not in use. TilePoP is capable of providing haptic feedback for the whole body and can even support human body weight. Thus, it allows new interaction possibilities in VR. Herein, the design and implementation of TilePoP are described in detail along with demonstrations of its applications and the results of a preliminary user evaluation conducted to understand the users' experience with TilePoP.
LeviProps are tangible structures used to create interactive mid-air experiences. They are composed of an acoustically-transparent lightweight piece of fabric and attached beads that act as levitated anchors. This combination enables real-time 6 Degrees-of-Freedom control of levitated structures which are larger and more diverse than those possible with previous acoustic manipulation techniques. LeviProps can be used as free-form interactive elements and as projection surfaces. We developed an authoring tool to support the creation of LeviProps. Our tool considers the outline of the prop and the user constraints to compute the optimum locations for the anchors (i.e. maximizing trapping forces), increasing prop stability and maximum size. The tool produces a final LeviProp design which can be fabricated following a simple procedure. This paper explains and evaluates our approach and showcases example applications, such as interactive storytelling, games and mid-air displays.
This paper presents a design space, a fabrication system, and applications for creating fluidic chambers and channels at the millimeter scale for tangible actuated interfaces. The ability to design and fabricate millifluidic chambers allows one to create high-frequency actuation, sequential control of flows, and high-resolution designs on thin-film materials. We propose a four-dimensional design space for creating these fluidic chambers, a novel heat-sealing system that enables easy and precise millifluidics fabrication, and application demonstrations of the fabricated materials for haptics, ambient devices, and robotics. As shape-change materials are increasingly integrated into the design of novel interfaces, milliMorph enriches the library of fluid-driven shape-change materials, and demonstrates new design opportunities that are unique to the millimeter scale for product and interaction design.
Users can now easily communicate digital information with the Internet of Things; in contrast, there remains a lack of support for automating physical tasks that involve legacy static objects, e.g. adjusting a desk lamp's angle for optimal brightness, turning a manual faucet on or off when washing dishes, or sliding a window to maintain a preferred indoor temperature. Automating these simple physical tasks has the potential to improve people's quality of life, which is particularly important for people with disabilities or under situational impairments.
We present Robiot, a design tool for generating mechanisms that attach to, motorize, and actuate legacy static objects to perform simple physical tasks. Users only need to take a short video of themselves manipulating an object to demonstrate an intended physical behavior. Robiot then extracts the requisite parameters and automatically generates 3D models of the enabling actuation mechanisms by performing a scene and motion analysis of the 2D video in alignment with the object's 3D model. In an hour-long design session, six participants used Robiot to actuate seven everyday objects, imbuing them with the robotic capability to automate various physical tasks.
We present StackMold, a DIY molding technique to prototype multi-material and multi-colored objects with embedded electronics. The key concept of our approach is a novel multi-stage mold buildup in which casting operations are interleaved with the assembly of the mold to form independent compartments for casting different materials. To build multi-stage molds, we contribute novel algorithms that computationally design and optimize the mold and casting procedure. By default, the multi-stage mold is fabricated in slices using a laser cutter. For regions that require more surface detail, a high-fidelity 3D-printed mold subsection can be incorporated. StackMold is an integrated end-to-end system, supporting all stages of the process: it provides a UI to specify material and detail regions of a 3D object; it generates fabrication files for the molds; and it produces a step-by-step casting instruction manual.
In this paper, we present a method to create re-programmable multi-color textures that are made from a single material only. The key idea builds on the use of photochromic inks that can switch their appearance from transparent to colored when exposed to light of a certain wavelength. By mixing cyan, magenta, and yellow (CMY) photochromic dyes into a single solution and leveraging the different absorption spectra of each dye, we can control each color channel in the solution separately. Our approach can transform single-material fabrication techniques, such as coating, into high-resolution multi-color processes. We discuss the material mixing procedure, modifications to the light source, and the algorithm to control each color channel. We then show the results from an experiment in which we evaluated the available color space and the resolution of our textures. Finally, we demonstrate our user interface that allows users to transfer virtual textures onto physical objects and show a range of application examples.
Advances in digital fabrication have simultaneously created new capabilities while reinforcing outdated workflows that constrain how, and by whom, these fabrication tools are used. In this paper, we investigate how a new class of hybrid-controlled machines can collaborate with novice and expert users alike to yield a more lucid making experience. We demonstrate these ideas through our system, Turn-by-Wire. By combining the capabilities of a traditional lathe with haptic input controllers that modulate both position and force, we detail a series of novel interaction metaphors that invite a more fluid making process spanning digital, model-centric computer control and embodied, adaptive human control. We evaluate our system through a user study and discuss how these concepts generalize to other fabrication tools.
Joints are crucial to laser cutting as they allow making three-dimensional objects; mounts are crucial because they allow embedding technical components, such as motors. Unfortunately, mounts and joints tend to fail when trying to fabricate a model on a different laser cutter or from a different material. The reason for this lies in the way mounts and joints hold objects in place, which is by forcing them into slightly smaller openings. Such "press fit" mechanisms unfortunately are susceptible to the small changes in diameter that occur when switching to a machine that removes more or less material ("kerf"), as well as to changes in stiffness, as they occur when switching to a different material. We present a software tool called springFit that resolves this problem by replacing the problematic press fit-based mounts and joints with what we call cantilever-based mounts and joints. A cantilever spring is simply a long thin piece of material that pushes against the object to be held. Unlike press fits, cantilever springs are robust against variations in kerf and material; they can even handle very high variations, simply by using longer springs. SpringFit converts models in the form of 2D cutting plans by replacing all contained mounts, notch joints, finger joints, and t-joints. In our technical evaluation, we used springFit to convert 14 models downloaded from the web.
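For intuition, a standard beam-theory relation (textbook mechanics, not a formula taken from the springFit paper) explains why longer springs tolerate larger variations. A rectangular cantilever of length L, cross-section width b, and thickness t in the bending direction, cut from a material with Young's modulus E, pushes back with

\[ F = k\,\delta, \qquad k = \frac{3EI}{L^{3}}, \qquad I = \frac{b\,t^{3}}{12} \]

Since the stiffness k falls off with L cubed, doubling the spring length reduces the force change caused by a given kerf- or material-induced change in deflection by roughly a factor of eight, which is consistent with the claim that longer springs absorb higher variations.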
We present Ondulé, an interactive design tool that allows novices to create parameterizable deformation behaviors in 3D-printable models using helical springs and embedded joints. Informed by spring theory and our empirical mechanical experiments, we introduce spring- and joint-based design techniques that support a range of parameterizable deformation behaviors, including compression, extension, twisting, bending, and various combinations. To enable users to design and add these deformations to their models, we introduce a custom design tool for Rhino. Here, users can convert selected geometries into springs, customize spring stiffness, and parameterize their design with mechanical constraints for desired behaviors. To demonstrate the feasibility of our approach and the breadth of new designs that it enables, we showcase a set of example 3D-printed applications from launching rocket toys to tangible storytelling props. We conclude with a discussion of key challenges and open research questions.
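For reference, the axial stiffness of a helical spring from standard spring theory (included here as background; the paper's own model and 3D-printed materials may add further factors) with shear modulus G, wire diameter d, mean coil diameter D, and n active coils is

\[ k = \frac{G\,d^{4}}{8\,D^{3}\,n} \]

which is why exposing wire diameter, coil diameter, and coil count as design parameters gives users direct control over how stiff a compression or extension behavior feels.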
Current VR/AR systems are unable to reproduce the physical sensation of fluid vessels, due to the shifting nature of fluid motion. To this end, we introduce SWISH, an ungrounded mixed-reality interface capable of affording users a realistic haptic sensation of fluid behaviors in vessels. The chief mechanism behind SWISH is the use of virtual reality tracking and motor actuation to actively relocate the center of gravity of a handheld vessel, emulating the shifting center of gravity of a vessel that contains fluid. In addition to solving challenges related to reliable and efficient motor actuation, our SWISH designs place an emphasis on reproducibility, scalability, and availability to the maker culture. Our virtual-to-physical coupling uses Nvidia Flex's Unity integration for virtual fluid dynamics with a 3D-printed augmented vessel containing a motorized mechanical actuation system. To evaluate the effectiveness and perceptual efficacy of SWISH, we conducted a user study with 24 participants, 7 vessel actions, and 2 virtual fluid viscosities in a virtual reality environment. In all cases, users on average reported that the SWISH bucket generates accurate tactile sensations for the fluid behavior. This opens the potential for multi-modal interactions with programmable fluids in virtual environments for chemistry education, worker training, and immersive entertainment.
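As a rough illustration of this kind of virtual-to-physical coupling (a minimal sketch with assumed function names and travel range, not SWISH's actual code), the simulated fluid's center of gravity can be computed each frame and mapped into the vessel frame as the target offset for a motorized counterweight:

import numpy as np

def fluid_center_of_gravity(particle_positions):
    """Center of gravity of equal-mass fluid particles (world coordinates)."""
    return np.mean(np.asarray(particle_positions, dtype=float), axis=0)

def weight_target_offset(cog_world, vessel_pose, axis_range=(-0.05, 0.05)):
    """Map the fluid CoG into the vessel frame and clamp it to the assumed
    travel range (in meters) of the motorized weight along the vessel's x axis.
    vessel_pose is a 4x4 world-from-vessel transform from the VR tracker."""
    cog_vessel = np.linalg.inv(vessel_pose) @ np.append(cog_world, 1.0)
    return float(np.clip(cog_vessel[0], *axis_range))

The clamped offset would then be streamed to the motor controller every frame so that the physical center of gravity follows the simulated fluid.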
Force feedback is said to be the next frontier in virtual reality (VR). Recently, with consumers pushing forward with untethered VR, researchers turned away from solutions based on bulky hardware (e.g., exoskeletons and robotic arms) and started exploring smaller portable or wearable devices. However, when it comes to rendering inertial forces, such as when moving a heavy object around or when interacting with objects with unique mass properties, current ungrounded force feedback devices are unable to provide quick weight-shifting sensations that can realistically simulate weight changes over 2D surfaces. In this paper we introduce Aero-plane, a force-feedback handheld controller based on two miniature jet propellers that can render shifting weights of up to 14 N within 0.3 seconds. Through two user studies we (1) characterize the users' ability to perceive and correctly recognize different motion paths on a virtual plane while using our device, and (2) test the level of realism and immersion of the controller when used in two VR applications (a rolling ball on a plane, and using kitchen tools of different shapes and sizes). Lastly, we present a set of applications that further explore different usage cases and alternative form factors for our device.
Creating or arranging objects at runtime is needed in many virtual reality applications, but such changes are noticed when they occur inside the user's field of view. We present Mise-Unseen, a software system that applies such scene changes covertly inside the user's field of view. Mise-Unseen leverages gaze tracking to create models of user attention, intention, and spatial memory to determine if and when to inject a change. We present seven applications of Mise-Unseen to unnoticeably modify the scene within view (i) to hide that task difficulty is adapted to the user, (ii) to adapt the experience to the user's preferences, (iii) to time the use of low fidelity effects, (iv) to detect user choice for passive haptics even when lacking physical props, (v) to sustain physical locomotion despite a lack of physical space, (vi) to reduce motion sickness during virtual locomotion, and (vii) to verify user understanding during story progression. We evaluated Mise-Unseen and our applications in a user study with 15 participants and found that while gaze data indeed supports obfuscating changes inside the field of view, changes are rendered unnoticeable by using gaze in combination with common masking techniques.
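A minimal sketch of the gating logic this implies (assumed thresholds and a deliberately simplified attention model, not Mise-Unseen's actual system): commit a pending change only during a saccade, when visual sensitivity is suppressed, or after gaze has stayed far enough from the change location for long enough.

import numpy as np

SACCADE_SPEED_DEG_S = 180.0   # assumed saccade-detection threshold
ECCENTRICITY_DEG = 25.0       # assumed "far from the change" threshold
DWELL_AWAY_S = 0.5            # assumed minimum time gaze must stay away

def angle_deg(a, b):
    return np.degrees(np.arccos(np.clip(np.dot(a, b), -1.0, 1.0)))

def should_commit(gaze_dir, prev_gaze_dir, change_dir, dt, time_away):
    """All directions are unit vectors from the head; returns (commit, time_away)."""
    in_saccade = angle_deg(gaze_dir, prev_gaze_dir) / dt > SACCADE_SPEED_DEG_S
    far_from_change = angle_deg(gaze_dir, change_dir) > ECCENTRICITY_DEG
    time_away = time_away + dt if far_from_change else 0.0
    return in_saccade or time_away > DWELL_AWAY_S, time_away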
We present Pull-Ups, a suspension kit that suggests a range of body postures and thus enables various exercise styles, letting users perceive kinesthetic force feedback as they suspend their weight through arm exertion during the interaction. Pull-Ups actuates the user's body to move up to 15 cm by pulling his or her hands using a pair of pneumatic artificial muscle groups. Our studies identified discernible levels of kinesthetic force feedback, which were then exploited for the design of kinesthetic force feedback in three physical activities: kitesurfing, paragliding, and Space Invader. Our final study on user experiences suggested that a passive suspension kit alone added substantially to users' perceptions of realism and enjoyment (all above neutral) through passive physical support, while sufficient active feedback can enhance them further. In addition, we found that both passive and active feedback from the suspension kit significantly reduced motion sickness in simulated kitesurfing and paragliding compared to when no suspension kit (and thus no feedback) was provided. This work suggests that a passive suspension kit is cost-effective as a home exercise kit, while active feedback can further improve the user experience, though at the cost of additional installation (e.g., an air compressor in our prototype).
We present PseudoBend, a haptic feedback technique that creates the illusion that a rigid device is being stretched, bent, or twisted. The method uses a single 6-DOF force sensor and a vibrotactile actuator to render grain vibrations that simulate those produced during object deformation, based on the changes in force or torque exerted on the device. Because this method does not require any moving parts aside from the vibrotactile actuator, devices designed using this method can be small and lightweight. Psychophysical studies conducted using a prototype that implements this method confirmed that the method could successfully create the illusion of deformation and could also change users' perception of stiffness by changing the virtual stiffness parameters.
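The grain-rendering idea can be sketched as follows (a hedged approximation with assumed constants, not the paper's exact renderer): emit one short vibrotactile grain each time the accumulated change in sensed force crosses a fixed step, so larger or faster deformations produce denser grains, and scaling the step with a virtual stiffness parameter changes the perceived stiffness.

class GrainRenderer:
    """Emit vibrotactile grains as the exerted force changes (illustrative sketch)."""
    def __init__(self, virtual_stiffness=1.0, base_step_newtons=0.2):
        self.step = base_step_newtons * virtual_stiffness  # force change per grain (assumed)
        self.accum = 0.0
        self.last_force = None

    def update(self, force_magnitude, play_grain):
        if self.last_force is not None:
            self.accum += abs(force_magnitude - self.last_force)
            while self.accum >= self.step:
                play_grain()            # trigger one short vibrotactile pulse
                self.accum -= self.step
        self.last_force = force_magnitude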
We introduce CapstanCrunch, a force-resisting, palm-grounded haptic controller that renders haptic feedback for touching and grasping both rigid and compliant objects in a VR environment. In contrast to previous controllers, CapstanCrunch renders human-scale forces without the use of large, high-force, power-consumptive and expensive actuators. Instead, CapstanCrunch integrates a friction-based capstan-plus-cord variable-resistance brake mechanism that is dynamically controlled by a small internal motor. The capstan mechanism magnifies the motor's force by a factor of around 40 as an output resistive force. Compared to active force control devices, it is low cost, low electrical power, robust, safe, fast and quiet, while providing high force control to user interaction. We describe the design and implementation of CapstanCrunch and demonstrate its use in a series of VR scenarios. Finally, we evaluate the performance of CapstanCrunch in two user studies and compare our controller with an active haptic controller in terms of their ability to simulate different levels of convincing object rigidity and/or compliance.
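The roughly 40x amplification is in line with the classical capstan (belt friction) relation, quoted here as standard background rather than the paper's exact analysis: with friction coefficient mu and total wrap angle beta (in radians), a small motor-side tension can resist a load-side tension of up to

\[ T_{\text{load}} = T_{\text{motor}}\, e^{\mu \beta} \]

so an amplification of about 40 corresponds to \(\mu \beta \approx 3.7\), for example a friction coefficient of about 0.3 with roughly two full wraps of cord.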
Method-independent text entry evaluation tools are often used to conduct text entry experiments and compute performance metrics, like words per minute and error rates. The input stream paradigm of Soukoreff & MacKenzie (2001, 2003) remains prevalent; it presents a string for transcription and uses a strictly serial character representation to encode the text entry process. Although an advance over prior paradigms, the input stream paradigm is unable to support many modern text entry features. To address these limitations, we present transcription sequences: for each new input, a snapshot of the entire transcribed string up to that point is captured. By comparing adjacent strings within a transcription sequence, we can compute all prior metrics, reduce artificial constraints on text entry evaluations, and introduce new metrics. We conducted a study with 18 participants who typed 1620 phrases using a laptop keyboard, on-screen keyboard, and smartphone keyboard using features such as auto-correction, word prediction, and copy/paste. We also evaluated the non-keyboard methods Dasher, gesture typing, and T9. Our results show that modern text entry methods and features can be accommodated, prior metrics can be correctly computed, and new metrics can reveal insights. We validated our algorithms using ground truth based on cursor positioning, confirming 100% accuracy. We also provide a new tool, TextTest++, to facilitate web-based evaluations.
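To illustrate the core idea (a minimal sketch, not the TextTest++ implementation), metrics can be derived by diffing adjacent snapshots in a transcription sequence, which naturally captures multi-character events such as auto-correction or paste:

import difflib

def input_events(snapshots):
    """Yield (inserted_text, deleted_text) between adjacent snapshots of the
    transcribed string; each event may span several characters."""
    for before, after in zip(snapshots, snapshots[1:]):
        ins, dele = [], []
        for op, i1, i2, j1, j2 in difflib.SequenceMatcher(a=before, b=after).get_opcodes():
            if op in ("replace", "delete"):
                dele.append(before[i1:i2])
            if op in ("replace", "insert"):
                ins.append(after[j1:j2])
        yield "".join(ins), "".join(dele)

snapshots = ["", "T", "Th", "Teh", "The "]   # final step mimics an auto-correction
print(list(input_events(snapshots)))

Timestamps kept alongside the snapshots would then support rate metrics such as words per minute, while the per-event insertions and deletions feed error-rate metrics.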
Current text correction processes on mobile touch devices are laborious: users either extensively use backspace, or navigate the cursor to the error position, make a correction, and navigate back, usually by employing multiple taps or drags over small targets. In this paper, we present three novel text correction techniques to improve the correction process: Drag-n-Drop, Drag-n-Throw, and Magic Key. All of the techniques skip error-deletion and cursor-positioning procedures, and instead allow the user to type the correction first, and then apply that correction to a previously committed error. Specifically, Drag-n-Drop allows a user to drag a correction and drop it on the error position. Drag-n-Throw lets a user drag a correction from the keyboard suggestion list and "throw" it to the approximate area of the error text, with a neural network determining the most likely error in that area. Magic Key allows a user to type a correction and tap a designated key to highlight possible error candidates, which are also determined by a neural network. The user can navigate among these candidates by directionally dragging from atop the key, and can apply the correction by simply tapping the key. We evaluated these techniques in both text correction and text composition tasks. Our results show that correction with the new techniques was faster than de facto cursor and backspace-based correction. Our techniques apply to any touch-based text entry method.
Text entry is expected to be a common task for smart glass users, which is generally performed using a touchpad on the temple or by a promising approach using eye tracking. However, each approach has its own limitations. For more efficient text entry, we present the concept of gaze-assisted typing (GAT), which uses both a touchpad and eye tracking. We initially examined GAT with a minimal eye input load, and demonstrated that the GAT technology was 51% faster than a two-step touch input typing method (i.e., M-SwipeBoard: 5.85 words per minute (wpm) and GAT: 8.87 wpm). We also compared GAT methods with varying numbers of touch gestures. The results showed that a GAT requiring five different touch gestures was the most preferred, although all GAT techniques were equally efficient. Finally, we compared GAT with touch-only typing (SwipeZone) and eye-only typing (adjustable dwell) using an eye-trackable head-worn display. The results demonstrate that the most preferred technique, GAT, was 25.4% faster than the eye-only typing and 29.4% faster than the touch-only typing (GAT: 11.04 wpm, eye-only typing: 8.81 wpm, and touch-only typing: 8.53 wpm).
When editing or reviewing a document, people directly overlay ink marks on content. For instance, they underline words or circle elements in a figure. These overlay marks often accompany in-context annotations in the form of handwritten footnotes and marginalia. People tend to put annotations close to the content that elicited them, but have to contend with the often-limited whitespace. We introduce SpaceInk, a design space of pen+touch techniques that make room for in-context annotations by dynamically reflowing documents. We identify representative techniques in this design space, spanning both new and existing ones. We evaluate them in a user study, with results that inform the design of a prototype system. Our system lets users concentrate on capturing fleeting thoughts, streamlining the overall annotation process by enabling the fluid interleaving of space-making gestures with freeform ink.
In this paper, we propose and investigate a new text entry technique using micro thumb-tip gestures. Our technique features a miniature QWERTY keyboard residing invisibly on the first segment of the user's index finger. Text entry can be carried out using the thumb-tip to tap the tip of the index finger. The keyboard layout was optimized for eyes-free input by utilizing a spatial model reflecting users' natural spatial awareness of key locations on the index finger. We present our approach to designing and optimizing the keyboard layout through a series of user studies and computer-simulated text entry tests over 1,146,484 possibilities in the design space. The outcome is a 2×3 grid with letter assignments closely conforming to the alphabetical and spatial arrangement of QWERTY. Our user evaluation showed that participants achieved an average text entry speed of 11.9 WPM and were able to type as fast as 13.3 WPM towards the end of the experiment.
SCALE provides a framework for processing load data from distributed load-sensing modules to explore force-based interaction. Force conveys not only the force vector itself but also rich information about activities, including the way of touching, object location and body motion. Our system captures these interactions in a single pipeline of load data processing. Furthermore, we have expanded the interaction area from a flat 2D surface to a 3D volume by building a mathematical framework that enables us to capture the vertical height of a touch point. These technical inventions open up broad applications, including general shape capturing and motion recognition. We have packaged the framework into a physical prototyping kit, and conducted a workshop with product designers to evaluate our system in practical scenarios.
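The planar part of such a pipeline can be illustrated with the standard center-of-pressure computation (a sketch under an assumed module layout, not the paper's full 3D framework): given load readings from modules at known positions, the touch location is the load-weighted centroid.

def center_of_pressure(loads, positions):
    """loads: per-module load readings (e.g., in newtons);
    positions: matching (x, y) module coordinates in meters."""
    total = sum(loads)
    if total <= 0:
        return None                      # nothing is pressing on the surface
    x = sum(w * px for w, (px, _) in zip(loads, positions)) / total
    y = sum(w * py for w, (_, py) in zip(loads, positions)) / total
    return x, y, total

# Four load modules at the corners of a 0.4 m square plate (assumed layout)
corners = [(0.0, 0.0), (0.4, 0.0), (0.4, 0.4), (0.0, 0.4)]
print(center_of_pressure([1.0, 3.0, 3.0, 1.0], corners))   # -> (0.3, 0.2, 8.0)

Recovering the vertical height of a touch point requires more than this planar centroid, which is where the paper's mathematical framework comes in.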
Natural haptic feedback in virtual reality (VR) is complex and challenging, due to the intricacy of the necessary stimuli and respective hardware. Pseudo-haptic feedback aims at providing haptic feedback without providing actual haptic stimuli, but by using other sensory channels (e.g., visual cues) for feedback. We combine such an approach with the additional input modality of muscle activity that is mapped to a virtual force to influence the interaction flow.
In comparison to existing approaches, as well as to no kinesthetic feedback at all, the presented solution significantly increased immersion, enjoyment, as well as the perceived quality of kinesthetic feedback.
Rapid prototyping of haptic output on 3D objects promises to enable a more widespread use of the tactile channel for ubiquitous, tangible, and wearable computing. Existing prototyping approaches, however, have limited tactile output capabilities, require advanced skills for design and fabrication, or are incompatible with curved object geometries. In this paper, we present a novel digital fabrication approach for printing custom, high-resolution controls for electro-tactile output with integrated touch sensing on interactive objects. It supports curved geometries of everyday objects. We contribute a design tool for modeling, testing, and refining tactile input and output at a high level of abstraction, based on parameterized electro-tactile controls. We further contribute an inventory of 10 parametric Tactlet controls that integrate sensing of user input with real-time electro-tactile feedback. We present two approaches for printing Tactlets on 3D objects, using conductive inkjet printing or FDM 3D printing. Empirical results from a psychophysical study and findings from two practical application cases confirm the functionality and practical feasibility of the Tactlets approach.
Mantis is a highly scalable system architecture that democratizes haptic devices by enabling designers to create accurate, multiform and accessible force feedback systems. Mantis uses brushless DC motors, custom electronic controllers, and an admittance control scheme to achieve stable, high-quality haptic rendering. It enables common desktop form factors, but also large workspaces (multiple arm lengths), multiple-arm workspaces, and mobile workspaces. It also uses accessible components and costs significantly less than typical high-fidelity force feedback solutions, which are often confined to research labs. We present our design and show that Mantis can reproduce the haptic fidelity of common robotic arms. We demonstrate its multiform ability by implementing five systems: a single desktop-sized device, a single large workspace device, a large workspace system with four points of feedback, a mobile system and a wearable one.
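Admittance control, named above, can be sketched as a generic textbook loop (assumed virtual parameters, not Mantis's controller): measure the force the user applies, pass it through a virtual mass-damper, and command the resulting velocity to the motor controllers.

def admittance_step(force, velocity, dt, virtual_mass=2.0, virtual_damping=8.0):
    """One step of a 1-DOF admittance loop. force: measured user force (N);
    velocity: current commanded velocity (m/s). Returns the new commanded
    velocity; integrating it yields the motor position setpoint."""
    acceleration = (force - virtual_damping * velocity) / virtual_mass
    return velocity + acceleration * dt

# Typically run at a high rate, e.g. v = admittance_step(f_sensed, v, dt=0.001)

Lower virtual mass and damping make the device feel lighter and more transparent; rendering a stiff virtual wall then amounts to adding a spring term into the same loop on contact.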
We present HapSense, a single-volume soft haptic I/O device with uninterrupted dual functionalities of force sensing and vibrotactile actuation. To achieve both input and output functionalities, we employ a ferroelectric electroactive polymer as the core functional material in a multilayer structure design. We introduce haptic I/O hardware that supports a tunable high-voltage driving waveform for vibrotactile actuation while sensing, in situ, changes in capacitance from contact force. Owing to the mechanically soft nature of the fabricated structure, HapSense can be embedded onto various object surfaces, including but not limited to furniture, garments, and the human body. Through a series of experiments and evaluations, we characterized the physical properties of HapSense and validated the feasibility of using soft haptic I/O with real users. We demonstrated a variety of interaction scenarios using HapSense.
We introduce a vision-based technique to recognize static hand poses and dynamic finger tapping gestures. Our approach employs a camera on the wrist, with a view of the opisthenar (back of the hand) area. We envisage such cameras being included in a wrist-worn device such as a smartwatch, fitness tracker or wristband. Indeed, selected off-the-shelf smartwatches now incorporate a built-in camera on the side for photography purposes. However, in this configuration, the fingers are occluded from the view of the camera. The oblique angle and placement of the camera make typical vision-based techniques difficult to adopt. Our alternative approach observes small movements and changes in the shape, tendons, skin and bones on the opisthenar area. We train deep neural networks to recognize both hand poses and dynamic finger tapping gestures. While this is a challenging configuration for sensing, we tested the recognition with a real-time user test and achieved a high recognition rate of 89.4% (static poses) and 67.5% (dynamic gestures). Our results further demonstrate that our approach can generalize across sessions and to new users. Namely, users can remove and replace the wrist-worn device while new users can employ a previously trained system, to a certain degree. We conclude by demonstrating three applications and suggest future avenues of work based on sensing the back of the hand.
Robust, wide-area sensing of human environments has been a long-standing research goal. We present Sozu, a new low-cost sensing system that can detect a wide range of events wirelessly, through walls and without line of sight, at whole-building scale. To achieve this in a battery-free manner, Sozu tags convert energy from activities that they sense into RF broadcasts, acting like miniature self-powered radio stations. We describe the results from a series of iterative studies, culminating in a deployment study with 30 instrumented objects. Results show that Sozu is very accurate, with true positive event detection exceeding 99%, with almost no false positives. Beyond event detection, we show that Sozu can be extended to detect richer signals, such as the state, intensity, count, and rate of events.
Computer vision is applied in an ever-expanding range of applications, many of which require custom training data to perform well. We present a novel interface for rapid collection of labeled training images to improve CV-based object detectors. LabelAR leverages the spatial tracking capabilities of an AR-enabled camera, allowing users to place persistent bounding volumes that stay centered on real-world objects. The interface then guides the user to move the camera to cover a wide variety of viewpoints. We eliminate the need for post hoc labeling of images by automatically projecting 2D bounding boxes around objects in the images as they are captured from AR-marked viewpoints. In a user study with 12 participants, LabelAR significantly outperforms existing approaches in terms of the trade-off between detection performance and collection time.
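The automatic labeling step can be illustrated with a standard pinhole projection (a sketch with assumed camera conventions, not LabelAR's code): project the eight corners of the AR-anchored bounding volume into each captured frame and take their 2D extent as the label.

import numpy as np

def project_bbox(corners_world, world_to_cam, K, image_size):
    """corners_world: (8, 3) corners of the placed bounding volume.
    world_to_cam: 4x4 extrinsics from the AR tracker; K: 3x3 intrinsics.
    Returns (xmin, ymin, xmax, ymax) in pixels, clipped to the image."""
    pts = np.hstack([np.asarray(corners_world, float), np.ones((8, 1))])
    cam = (world_to_cam @ pts.T)[:3]             # camera-frame coordinates, 3 x 8
    if np.any(cam[2] <= 0):
        return None                              # object (partly) behind the camera
    uv = (K @ (cam / cam[2]))[:2]                # perspective divide, then pixels
    w, h = image_size
    return (max(uv[0].min(), 0), max(uv[1].min(), 0),
            min(uv[0].max(), w), min(uv[1].max(), h))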
This paper presents RFTouchPads, a system of batteryless and wireless modular hardware designs of two-dimensional (2D) touch sensor pads based on the ultra-high frequency (UHF) radio-frequency identification (RFID) technology. In this system, multiple RFID IC chips are connected to an antenna in parallel. Each chip connects only one of its endpoints to the antenna; hence, the module normally turns off when it receives insufficient energy to operate. When a finger touches the circuit trace attached to the other endpoint of a chip, the finger functions as part of the antenna, turning the connected chip on, while the finger touch location is determined according to the chip's ID. Based on this principle, we propose two hardware designs, namely, StickerPad and TilePad. StickerPad is a flexible 3×3 touch-sensing pad suitable for applications on curved surfaces such as the human body. TilePad is a modular 3×3 touch-sensing pad that supports modular area expansion by tiling and provides more flexible deployment because its antenna is folded. Our implementation allows 2D touch inputs to be reliably detected 2 m away from a remote antenna of an RFID reader. The proposed batteryless, wireless, and modular hardware design enables fine-grained and less-constrained 2D touch inputs in various ubiquitous computing applications.
We introduce PrivateTalk, an on-body interaction technique that allows users to activate voice input by performing the Hand-On-Mouth gesture while speaking. The gesture is performed by partially covering the mouth with a hand from one side. PrivateTalk provides two benefits simultaneously. First, it enhances privacy by reducing the spread of voice while also concealing the lip movements from the view of other people in the environment. Second, the simple gesture removes the need for speaking wake-up words and is more accessible than a physical/software button, especially when the device is not in the user's hands. To recognize the Hand-On-Mouth gesture, we propose a novel sensing technique that leverages the difference between the signals received by the two Bluetooth earphones worn on the left and right ears. Our evaluation shows that the gesture can be accurately detected, and users consistently liked PrivateTalk and considered it intuitive and effective.
We propose "Ohmic-Sticker'', a novel force-to-motion type input device to extend capacitive touch surfaces. It realizes various types of force-sensitive inputs by simply attaching on to commercial touchpads or touchscreens. A simple force-sensitive-resistor (FSR)-based structure enables thin (less than 2 mm) form factors and battery-less operation. The applied force vector is detected as the leakage current from the corresponding touch surface electrodes by using Ohmic-Touch technology. Ohmic-Sticker can be used for adding force-sensitive interactions to touch surfaces, such as analog push buttons, TrackPoint-like devices, and full 6 DoF controllers for navigating virtual spaces. In this paper, we report a series of investigations on the design requirements of Ohmic-Sticker and some prototypes.We also evaluate the performance of Ohmic-Sticker as a pointing device.
Understanding the selection uncertainty of moving targets is a fundamental research problem in HCI. However, the few existing works in this domain mainly focus on selecting 1D moving targets with certain input devices, and the models' generalizability has not been extensively investigated. In this paper, we propose a 2D Ternary-Gaussian model to describe the selection uncertainty manifested in endpoint distributions for moving target selection. We explore and compare two candidate methods to generalize the problem space from 1D to 2D tasks, and evaluate their performance with three input modalities: mouse, stylus, and finger touch. By applying the proposed model to assist target selection, we achieved up to a 4% improvement in pointing speed and 41% in pointing accuracy compared with two state-of-the-art selection technologies. In addition, when we tested our model's ability to predict pointing errors in a realistic user interface, we observed a high fit of R² = 0.94.
We describe Tip-Tap, a wearable input technique that can be implemented without batteries using a custom RFID tag. It recognizes 2-dimensional discrete touch events by sensing the intersection between two arrays of contact points: one array along the index fingertip and the other along the thumb tip. A formative study identifies locations on the index finger that are reachable by different parts of the thumb tip, and the results determine the pattern of contact points used for the technique. Using a reconfigurable 3×3 evaluation device, a second study shows that eyes-free accuracy is 86% after a very short period, and that adding bumpy or magnetic passive haptic feedback to the contacts is not necessary. Finally, two battery-free prototypes using a new RFID tag design demonstrate how Tip-Tap can be implemented in a glove or tattoo form factor.
Head-mounted Mixed Reality (MR) systems enable touch interaction on any physical surface. However, optical methods (i.e., with cameras on the headset) have difficulty determining the touch contact accurately. We show that a finger ring with an Inertial Measurement Unit (IMU) can substantially improve the accuracy of contact sensing from 84.74% to 98.61% (F1 score), with a low latency of 10 ms. We tested different ring-wearing positions and tapping postures (e.g., with different fingers and finger parts). Results show that an IMU-based ring worn on the proximal phalanx of the index finger can accurately sense touch contact for most usable tapping postures. Participants also preferred wearing a ring for a better user experience. Our approach can be used in combination with optical touch sensing to provide robust and low-latency contact detection.
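A minimal sketch of the contact-sensing idea (assumed sample rate and thresholds, not the paper's detector): a tap produces a sharp spike in the ring IMU's acceleration magnitude, which can time the contact far more precisely than headset cameras alone.

import numpy as np

def detect_contacts(accel_g, rate_hz=400, threshold_g=2.5, refractory_s=0.05):
    """accel_g: (N, 3) accelerometer samples in g from the ring IMU.
    Returns sample indices at which a tap contact likely occurred."""
    magnitude = np.linalg.norm(accel_g, axis=1)
    contacts, last = [], -np.inf
    for i, m in enumerate(magnitude):
        if m > threshold_g and (i - last) / rate_hz > refractory_s:
            contacts.append(i)
            last = i
    return contacts

Each detected contact would then be fused with the headset's optical hand tracking to decide where on the surface the tap landed.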
Mutual capacitance-based multi-touch sensing is now a ubiquitous and high-fidelity input technology. However, due to the complexity of electrical and signal processing requirements, it remains very challenging to create interface prototypes with custom-designed multi-touch input surfaces. In this paper, we introduce Multi-Touch Kit, a technique enabling electronics novices to rapidly prototype customized capacitive multi-touch sensors. In contrast to existing techniques, it works with a commodity microcontroller and open-source software and does not require any specialized hardware. Evaluation results show that our approach enables multi-touch sensors with a high spatial and temporal resolution and can accurately detect multiple simultaneous touches. A set of application examples demonstrates the versatile uses of our approach for sensors of different scales, curvature, and materials.
Redirected walking (RDW) techniques provide a way to explore a virtual space that is larger than the available physical space by imperceptibly manipulating the virtual world view or motions. These manipulations may introduce conflicts between real and virtual cues (e.g., visual-vestibular conflicts), which can be disturbing when detectable by users. The empirically established detection thresholds of rotation manipulation for RDW still require a large physical tracking space and are therefore impractical for general-purpose Virtual Reality (VR) applications. We investigate Redirected Jumping (RDJ) as a new locomotion metaphor for redirection that partially addresses this limitation and builds on jumping as a common interaction in environments such as games. We investigated the detection rates for different curvature gains during RDJ. The probability of users detecting RDJ appears substantially lower than that of RDW, meaning designers can apply greater manipulations with RDJ than with RDW. We postulate that the substantial vertical (up/down) movement present when jumping introduces increased vestibular noise compared to normal walking, thereby supporting greater rotational manipulations. Our study suggests that combining metaphors (e.g., walking and jumping) could further reduce the physical space required for locomotion in VR. We also summarize differences in users' jumping approaches and report motion sickness measures from our study.
We explore a future in which people spend considerably more time in virtual reality, even during moments when they transition between locations in the real world. In this paper, we present DreamWalker, a VR system that enables such real-world walking while users explore and stay fully immersed inside large virtual environments in a headset. Provided with a real-world destination, DreamWalker finds a similar path in a pre-authored VR environment and guides the user while real-walking the virtual world. To keep the user from colliding with objects and people in the real world, DreamWalker's tracking system fuses GPS locations, inside-out tracking, and RGBD frames to 1) continuously and accurately position the user in the real world, 2) sense walkable paths and obstacles in real time, and 3) represent paths through a dynamically changing scene in VR to redirect the user towards the chosen destination. We demonstrate DreamWalker's versatility by enabling users to walk three paths across the large Microsoft campus while enjoying pre-authored VR worlds, supplemented with a variety of obstacle avoidance and redirection techniques. In our evaluation, 8 participants walked across campus along a 15-minute route, experiencing a lively virtual Manhattan that was full of animated cars, people, and other objects.
Many accounts and devices require only infrequent authentication by an individual, and thus authentication secrets should be both secure and memorable without much reinforcement. Inspired by people's strong visual-spatial memory, we introduce a novel system to help address this problem: the Memory Palace. The Memory Palace encodes authentication secrets as paths through a 3D virtual labyrinth navigated in the first-person perspective. We ran two experiments to iteratively design and evaluate the Memory Palace. In the first, we found that visual-spatial secrets are most memorable if navigated in a 3D first-person perspective. In the second, we comparatively evaluated the Memory Palace against Android's 9-dot pattern lock along three dimensions: memorability after one week, resilience to shoulder surfing, and speed. We found that relative to 9-dot, complexity-controlled secrets in the Memory Palace were significantly more memorable after one week, were much harder to break through shoulder surfing, and were not significantly slower to enter.
Scale-adaptive techniques for VR navigation enable users to navigate spaces larger than the available real space, while allowing precise interaction when required. However, because these techniques gradually scale displacements as the user moves (changing the user's speed), they introduce a Drift effect: a user returning to the same point in VR will not return to the same point in real space. This mismatch between the real and virtual spaces can grow over time and render the techniques unusable (i.e., users cannot reach their target locations). In this paper, we characterise and analyse the effects of Drift, highlighting its potential detrimental effects. We then propose two techniques to correct Drift and use a data-driven approach (using navigation data from real users with a specific scale-adaptive technique) to tune them, compare their performance, and choose an optimal correction technique and configuration. Our user study, applying our technique in a different environment and with two different scale-adaptive navigation techniques, shows that our correction technique can significantly reduce Drift effects and extend the lifespan of the navigation techniques (i.e., the time they can be used before Drift renders targets unreachable), while not hindering the users' experience.
The virtual reality ecosystem has gained momentum in the gaming, entertainment, and enterprise markets, but is hampered by limitations in concurrent user count, throughput, and accessibility to mass audiences. Based on our analysis of the current state of the virtual reality ecosystem and relevant aspects of traditional media, we propose a set of design hypotheses for practical and effective seated virtual reality experiences of scale. Said hypotheses manifest in the Collective Audience Virtual Reality Nexus (CAVRN), a framework and management system for large-scale (30+ user) virtual reality deployment in a theater-like physical setting. A mixed methodology study of CAVE, an experience implemented using CAVRN, generated rich insights into the proposed hypotheses. We discuss the implications of our findings on content design, audience representation, and audience interaction.
Contemporary AR/VR systems use in-air gestures or handheld controllers for interactivity. This overlooks the skin as a convenient surface for tactile, touch-driven interactions, which are generally more accurate and comfortable than free space interactions. In response, we developed ActiTouch, a new electrical method that enables precise on-skin touch segmentation by using the body as an RF waveguide. We combine this method with computer vision, enabling a system with both high tracking precision and robust touch detection. Our system requires no cumbersome instrumentation of the fingers or hands, requiring only a single wristband (e.g., smartwatch) and sensors integrated into an AR/VR headset. We quantify the accuracy of our approach through a user study and demonstrate how it can enable touchscreen-like interactions on the skin.
Eye gaze involves the coordination of eye and head movement to acquire gaze targets, but existing approaches to gaze pointing are based on eye-tracking in abstraction from head motion. We propose to leverage the synergetic movement of eye and head, and identify design principles for Eye&Head gaze interaction. We introduce three novel techniques that build on the distinction of head-supported versus eyes-only gaze, to enable dynamic coupling of gaze and pointer, hover interaction, visual exploration around pre-selections, and iterative and fast confirmation of targets. We demonstrate Eye&Head interaction on applications in virtual reality, and evaluate our techniques against baselines in pointing and confirmation studies. Our results show that Eye&Head techniques enable novel gaze behaviours that provide users with more control and flexibility in fast gaze pointing and selection.
Previous work in VR has demonstrated how individual physical objects can represent multiple virtual objects in different locations by redirecting the user's hand. We show how individual objects can represent multiple virtual objects of different sizes by resizing the user's grasp. We redirect the positions of the user's fingers by visual translation gains, inducing an illusion that can make physical objects seem larger or smaller. We present a discrimination experiment to estimate the thresholds for resizing virtual objects relative to physical objects without the user reliably noticing a difference. The results show that the size difference is easily detected when a physical object is used to represent a virtual object less than 90% of its size. When physical objects represent larger virtual objects, however, scaling is tightly coupled to the physical object's size: smaller physical objects allow more virtual resizing (up to a 50% larger virtual size). Resized Grasping considerably broadens the scope of using illusions to provide rich haptic experiences in virtual reality.
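The redirection described above can be sketched as a translation gain applied about the held object's center (an assumed formulation offered for illustration, not the paper's exact warp):

import numpy as np

def rendered_finger_position(real_finger, object_center, size_gain):
    """size_gain > 1 spreads the rendered fingers apart so a small physical
    prop can stand in for a larger virtual object; size_gain < 1 narrows
    the rendered grasp for a smaller virtual object."""
    real_finger = np.asarray(real_finger, dtype=float)
    object_center = np.asarray(object_center, dtype=float)
    return object_center + size_gain * (real_finger - object_center)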
We present Plane, Ray, and Point, a set of interaction techniques that utilizes shape constraints to enable quick and precise object alignment and manipulation in virtual reality. Users create the three types of shape constraints, Plane, Ray, and Point, by using symbolic gestures. The shape constraints are used like scaffolding and limit and guide the movement of virtual objects that collide or intersect with them. The same set of gestures can be performed with the other hand, which allows users to further control the degrees of freedom for precise and constrained manipulation. The combination of shape constraints and bimanual gestures yields a rich set of interaction techniques to support object transformation. An exploratory study conducted with 3D design experts and novice users found the techniques to be useful in 3D scene design workflows and easy to learn and use.
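The constraint behavior can be illustrated with the underlying projections (standard geometry offered as an approximation of the described snapping, not the system's implementation): an object position that meets a constraint is pulled to the nearest point on the plane, ray, or point.

import numpy as np

def project_to_plane(p, origin, normal):
    p, origin = np.asarray(p, float), np.asarray(origin, float)
    n = np.asarray(normal, float)
    n = n / np.linalg.norm(n)
    return p - np.dot(p - origin, n) * n

def project_to_ray(p, origin, direction):
    p, origin = np.asarray(p, float), np.asarray(origin, float)
    d = np.asarray(direction, float)
    d = d / np.linalg.norm(d)
    t = max(0.0, float(np.dot(p - origin, d)))   # clamp so the result stays on the ray
    return origin + t * d

def project_to_point(p, anchor):
    return np.asarray(anchor, dtype=float)        # the object snaps onto the anchor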
Vascular network reconstruction is an essential aspect of the daily practice of medical doctors working with vascular systems. Accurately representing vascular networks, not only graphically but also in a way that encompasses their structure, enables running simulations, planning medical procedures, or identifying real-life diseases, for example. A vascular network is thus reconstructed from a 3D medical image sequence via segmentation and skeletonization. Many automatic algorithms exist to do so but tend to fail in specific corner cases. On the other hand, manual methods exist as well but are tedious to use and require a lot of time. In this paper, we introduce an interactive vascular network reconstruction system called INVANER that relies on a graph-like representation of the network's structure. A general skeleton is obtained with an automatic method, and medical practitioners can manually repair the local defects where this method fails. Our system uses graph-related tools with local effects and introduces two novel tools dedicated to solving two common problems that arise when automatically extracting the centerlines of vascular structures: so-called "Kissing Vessels" and a type of phenomenon we call "Dotted Vessels."