Musical Performances in Virtual Reality with Spatial and View-Dependent Audio Descriptions for Blind and Low-Vision Users
Virtual reality (VR), inherently reliant on spatial interaction, poses significant accessibility barriers for individuals who are blind or have low vision (BLV). Traditional audio descriptions (AD) typically provide a verbal explanation of visual elements in 2D or flat video media, facilitating access for BLV audiences but failing to convey the complex spatial information essential in VR. This shortfall is especially pronounced in musical performances, where understanding the spatial arrangement of the stage setup and movements of performers is crucial. To overcome these limitations, we have developed two AD approaches—Spatial AD for a dance performance and View-dependent AD for an instrumental performance—within VR-based 360° environments. Spatial AD employs spatial audio technology to align descriptions with corresponding visuals, dynamically adjusting to follow the visuals, such as the movements of performers in the dance performance. Meanwhile, View-dependent AD adapts descriptions based on the orientation of the VR headset, activating when particular visuals enter the central view of the camera, ensuring that the description aligns with the user’s attention directed to a particular location within the VR environment. These methods are designed as enhancements to traditional AD, aiming to improve spatial orientation and immersive experiences for BLV audiences. This demonstration showcases the potential of these AD approaches to improve interaction and engagement, furthering the development of inclusive virtual environments.
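The View-dependent AD trigger described above can be sketched as a simple angular test between the headset's gaze direction and the direction to the described object. This is a minimal illustration, not the authors' implementation; all names and the field-of-view threshold are hypothetical, and it assumes the VR runtime exposes the headset's forward vector and positions in a common coordinate frame.

```python
import math

def should_trigger_view_ad(head_forward, obj_pos, head_pos, central_fov_deg=30.0):
    """Return True when the object lies within the central field of view.

    head_forward: unit vector (x, y, z) of the headset's gaze direction.
    obj_pos, head_pos: (x, y, z) positions of the described object and viewer.
    central_fov_deg: width of the central cone that activates the description.
    """
    # Vector from the viewer to the object.
    to_obj = tuple(o - h for o, h in zip(obj_pos, head_pos))
    norm = math.sqrt(sum(c * c for c in to_obj))
    if norm == 0:
        return True  # object coincides with the viewer's position
    # Cosine of the angle between gaze and object direction.
    cos_angle = sum(f * c for f, c in zip(head_forward, to_obj)) / norm
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle)) <= central_fov_deg / 2
```

In practice such a check would run once per frame, firing the description when the object first enters the central cone and suppressing re-triggers while it stays there.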
- Research Article
18
- 10.1089/cyber.2019.0409
- Feb 24, 2020
- Cyberpsychology, Behavior, and Social Networking
Virtual reality (VR) is demonstrating increasing potential for therapeutic benefit in elderly care, but it is still generally considered to be the domain of the visually unimpaired. Even where VR and augmented reality (AR) are being explored for use with low vision, it is generally with a focus on creating bespoke software and hardware. However, the properties of commercial off-the-shelf (COTS) headsets, such as high luminance, may render them accessible even to very low vision users. Using a case-study approach, we explored the differences in visual perception from baseline to pass-through AR and commercial VR applications for an elderly female (Mrs. M) with advanced age-related macular degeneration. We found notable improvements in object, face, and color recognition, particularly with higher display brightness. Furthermore, Mrs. M was able to engage fully and enthusiastically with a number of (unmodified) VR applications, providing detailed descriptions of both static and moving elements. We suggest that the high luminance available in COTS VR may support more stable fixation closer to the fovea, improving visual resolution. Furthermore, the improvements we noted in color perception support previous suggestions that increasing luminance may improve photosensitivity by reducing the uptake of limited oxygen by the rod cells. We conclude that low vision should not automatically preclude users from engaging in VR research or entertainment, and that they may be able to use well-illuminated VR applications without any special modifications.
- Research Article
- 10.1016/j.measen.2024.101097
- Mar 14, 2024
- Measurement: Sensors
Adaptive matching optimization and VR system for indoor music performance based on wireless sensors
- Research Article
- 10.33452/amri.2021.50.95
- Dec 31, 2021
- Tongyang Ŭmak
Ethnomusicologists have long recognized the integrative function of music performance. Studies of music performance examine the social histories, politics, and spiritual beliefs, among other contexts, that underscore meaning in music performance, as well as the acoustics and physical experience that shape the value and impact of a performance. To ethnomusicologists, the contexts of performance and identities of performers are as essential as the aesthetics and production of sound. Transformations in sociocultural contexts lead to transformations in perceptions and interpretations of musical performance, and understanding these is essential to ethnomusicology's ongoing development. In 2020, the COVID-19 pandemic forced industries and events dependent on the movement and gathering of people into survival mode. The Association for Asian Studies, for example, had to reconceptualize its annual conference, which historically brings together 3,000-4,000 individuals for in-person networking and academic presentations. Music performance at these conferences has largely been peripheral, engaging audiences in passing and enhancing, rather than centralizing, the significance of the overall event. Without downplaying its devastating impact worldwide, it is important to note that the pandemic has created opportunities to re-imagine the ways by which we interact and the role of music performance therein. In this paper, I contend that this new virtual reality into which we were all forced has altered perceptions regarding the use of music, the significance of hybrid music performance space, and music performance's impact and function. Here, I examine pre-recorded performances of Chindon'ya and contemporary Japanese fusion music at the 2021 AAS Virtual Annual Conference. Within this specific virtual space, music performances merged with organizers' intellectual framing and the virtual 'chuimse' of participants.
Through an examination of these performances, I will uncover the ways by which the physical location of performance worked in tandem with the virtual space to both dissolve space and time and reshape the meaning of music performance within the conference context. My goal is to demonstrate how such online music performances can eradicate perceptions of physical distance while simultaneously augmenting connectivity of participants. Such performances engage all participants—performers, producers, spectators—in an active, multisensory immersion in a space both real and imaginary.
- Research Article
2
- 10.1109/tvcg.2025.3617147
- Feb 1, 2026
- IEEE transactions on visualization and computer graphics
Effective visual accessibility in Virtual Reality (VR) is crucial for Blind and Low Vision (BLV) users. However, designing visual accessibility systems is challenging due to the complexity of 3D VR environments and the need for techniques that can be easily retrofitted into existing applications. While prior work has studied how to enhance or translate visual information, the advancement of Vision Language Models (VLMs) provides an exciting opportunity to advance the scene interpretation capability of current systems. This paper presents EnVisionVR, an accessibility tool for VR scene interpretation. Through a formative study of usability barriers, we confirmed the lack of visual accessibility features as a key barrier for BLV users of VR content and applications. In response, we used our findings from the formative study to inform the design and development of EnVisionVR, a novel visual accessibility system leveraging a VLM, voice input and multimodal feedback for scene interpretation and virtual object interaction in VR. An evaluation with 12 BLV users demonstrated that EnVisionVR significantly improved their ability to locate virtual objects, effectively supporting scene understanding and object interaction.
- Conference Article
109
- 10.1145/3313831.3376353
- Apr 21, 2020
Current Virtual Reality (VR) technologies focus on rendering visuospatial effects, and thus are inaccessible to blind or low vision users. We examine the use of a novel white cane controller that enables non-visual navigation of large virtual environments with complex architecture, such as winding paths and occluding walls and doors. The cane controller employs a lightweight three-axis brake mechanism to convey the large-scale shape of virtual objects. The multiple degrees of freedom enable users to adapt the controller to their preferred techniques and grip. In addition, surface textures are rendered with a voice coil actuator based on contact vibrations, and spatialized audio is determined based on the progression of sound through the geometry around the user. We design a scavenger hunt game that demonstrates how our device enables blind users to navigate a complex virtual environment. Seven out of eight users were able to successfully navigate the virtual room (6x6m) to locate targets while avoiding collisions. We conclude with design considerations for creating immersive non-visual VR experiences based on user preferences for cane techniques and cane material properties.
- Conference Article
- 10.54941/ahfe1006068
- Jan 1, 2025
- AHFE international
Entertainment such as live music performances and events is often held in urban centers for reasons such as transportation accessibility and venue size, resulting in regional disparities in access to entertainment. Recent years, however, have seen the emergence of virtual reality (VR)-based live performances, which could resolve these disparities: VR live performances can provide opportunities for people who are unable to attend in person due to location and time constraints. However, the quality of the experience is expected to deteriorate, because the sense of realism and other sensations of real live performances are difficult to reproduce. Few studies have compared real live performances with VR live performances and examined the effects of factors such as the sense of presence on the audience. In this study, an experiment was conducted with the aim of constructing a system to support the improvement of the sense of presence in VR live music performances. In the experiment, participants experienced an on-demand VR live performance, and we investigated the degree of realism and the differences from a real live performance. From the results, we analyzed the factors that are important for improving the realism of VR live performances; the system was then constructed and evaluated. Ten male participants (23.6 ± 1.8 years old) who had attended a live concert in the past and remembered the sensation experienced a VR music concert in a standing position for 5 minutes, and a post-experience questionnaire was administered to compare the realism of the actual concert and the VR concert. The post-experience questionnaire consisted of a subjective questionnaire and open-ended questions regarding the sense of presence and the five senses. The subjective questionnaire used a 5-point Likert scale.
The presence-related portion consisted of 13 items, drafted by the researcher based on previous studies analyzing the sense of presence. The midpoint of 3 served as the baseline for the 5-point scale, representing parity with participants' remembered real live experiences; higher values indicate a higher evaluation of that element in the VR experience. According to the results of the post-experience questionnaire, the vibration and communication elements were rated low. The VR live experience and its components in this experiment did not reproduce the vibrations from large speakers or the communication with other people, such as the performers and the surrounding audience, that are felt in a real live performance. The experience therefore tended to feel like a one-way live performance, and the quality of the experience was considered to be compromised. In addition, factor analysis of the questionnaire results showed that, among the body-perception factors with the highest contribution rate, the values for dynamism, vibration, and sense of self-presence were the most significant and thus had a large impact on the sense of presence in the live performance.
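Midpoint-anchored Likert ratings like these are typically summarized as deviations from the neutral point. A minimal sketch of such a summary follows; the function name and scoring are illustrative, not the study's actual analysis.

```python
from statistics import mean

def likert_deviation(item_responses, midpoint=3.0):
    """Mean deviation of 5-point Likert responses from the neutral midpoint.

    Positive values suggest the VR element was rated above the remembered
    real-live baseline; negative values, below it.
    """
    return mean(item_responses) - midpoint
```

Applied per item across participants, this yields a signed score per presence element, so low-rated elements such as vibration and communication show up as negative deviations.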
- Research Article
167
- 10.1162/014892600559515
- Dec 1, 2000
- Computer Music Journal
This dissertation presents research in the field of automatic music performance with a special focus on piano. A system is proposed for automatic music performance, based on artificial neural networks (ANNs). A complex, ecological-predictive ANN was designed that listens to the last played note, predicts the performance of the next note, looks three notes ahead in the score, and plays the current tone. This system was able to learn a professional pianist's performance style at the structural micro-level. In a listening test, performances by the ANN were judged clearly better than deadpan performances and slightly better than performances obtained with generative rules. The behavior of an ANN was compared with that of a symbolic rule system with respect to musical punctuation at the micro-level. The rule system mostly gave better results, but some segmentation principles of an expert musician were only generalized by the ANN. Measurements of professional pianists' performances revealed interesting properties in the articulation of notes marked staccato and legato in the score. Performances were recorded on a grand piano connected to a computer. Staccato was realized by a micropause of about 60% of the inter-onset interval (IOI), while legato was realized by keeping two keys depressed simultaneously; the relative key overlap time depended on the IOI: the larger the IOI, the shorter the relative overlap. The magnitudes of these effects changed with the pianists' coloring of their performances and with the pitch contour. These regularities were modeled in a set of rules for articulation in automatic piano music performance. Emotional coloring of performances was realized by means of macro-rules implemented in the Director Musices performance system. These macro-rules are groups of rules that were combined such that they reflected previous observations on musical expression of specific emotions. Six emotions were simulated. A listening test revealed that listeners were able to recognize the intended emotional colorings. In addition, some possible future applications are discussed in the fields of automatic music performance, music education, automatic music analysis, virtual reality, and sound synthesis.
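The articulation regularities reported above can be illustrated with a toy model. The ~60% staccato micropause figure comes from the abstract; the constant absolute legato overlap (and its 40 ms default) is an assumption made only to reproduce the reported trend that larger IOIs give smaller relative overlaps.

```python
def staccato_sounding_time(ioi_ms):
    """Staccato: a micropause of ~60% of the inter-onset interval (IOI)
    leaves the tone sounding for ~40% of it (figure from the abstract)."""
    return 0.4 * ioi_ms

def legato_relative_overlap(ioi_ms, overlap_ms=40.0):
    """Illustrative legato model: a fixed absolute key overlap (the 40 ms
    default is hypothetical, not from the study) yields a relative overlap
    that shrinks as the IOI grows, matching the reported trend."""
    return overlap_ms / ioi_ms
```

The dissertation's actual rules additionally modulate these quantities by expressive coloring and pitch contour, which this sketch omits.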
- Research Article
4
- 10.5204/mcj.1763
- Jun 1, 1999
- M/C Journal
Seeing Sound, Hearing Image
- Book Chapter
2
- 10.4324/9780429491214-11
- Jun 22, 2021
Acoustic space is fundamental to the way that we experience live music in performance, but it is a dimension of performance that is exceptionally difficult to control and to replicate between venues. While this makes some sites incredibly attractive as performance and recording venues, it also limits the opportunity for composers and performers to actively and systematically use space as a musical effect, and it presents challenges for performers of historic music, who must reimagine historic repertoire for modern concert venues that are very different spaces to those that would have been known to composers, performers and audiences in the past. This chapter explores 3D sound modelling and music performance as key – though often neglected – points of interface in Virtual Reality (VR), and how the technology of VR gaming might be used as a platform for investigating historical performance spaces and the music that was performed within them. It explores how virtual acoustics might provide new insights into historical and musicological research, and how VR-derived post-production tools might lead to new approaches to classical music production, new commercial music products and revenue streams, and the opportunities and challenges these present to audiences, performers and composers.
- Research Article
1
- 10.4312/mz.43.1.43-51
- Dec 1, 2007
- Musicological Annual
An interface may be considered a (social) situation in which information is transferred, effecting and affecting both the communicator and the recipient. There is evidence that musical performance can be argued to be a paradigm of this situation. Just as musical performance is part of a communication process formalized in music, an interface is part of a communication process formalized in a non-mechanistic virtual reality. Musical performances as well as interfaces are based on expressive behavior; by giving access, they "construct" music and virtual realities. These hypotheses are argued on the basis of experimental data on communication processes as well as theories of music and media art.
- Book Chapter
1
- 10.1007/978-3-030-23541-3_36
- Jan 1, 2019
New media technologies bring to light new techniques in representation and storytelling. In particular, in theatre performance that is delivered live to the audience, new techniques have the capacity to create more immersive experiences for audiences. In recent years, Chinese audiences have witnessed an increasing number of new representation and storytelling techniques being applied to stage productions. These technological innovations on stage are developing rapidly and hold tremendous creative potential. This paper focuses on the application of game engine-powered technologies, such as real-time rendering used in 3D mapping, interactive installations, virtual reality, and mixed reality, to stage performances, realizing new intermixings of storytelling. On the frontstage that is visible to the audience, the way of storytelling and the visual narration of the creative concept appear as the result of these technological innovations. Throughout the paper, we provide examples of various stage productions, such as theatre, music performances, and dance productions. These examples demonstrate the wide range of possible applications for enriching storytelling and engaging audiences. Yet, despite all of the above benefits, we also call for a reflection on the relationship between technology and the arts. We argue that, in spite of the availability of these technologies, it is imperative to also slow down, reassess the content we are going to create, and begin a series of creative conversations between artistic expressions and technological excursions.
- Conference Article
16
- 10.1109/vr51125.2022.00099
- Mar 1, 2022
Music videos are short films that integrate songs and imagery and are produced for artistic and promotional purposes. Modern music videos apply various media capture techniques and creative postproduction technologies to provide a myriad of stimulating and artistic approaches to audience entertainment and engagement for viewing across multiple devices. Within this domain, volumetric technologies are becoming a popular means of recording and reproducing musical performances for new audiences to access via traditional 2D screens and emergent virtual reality platforms. However, the precise impact of volumetric video in virtual reality music video entertainment has yet to be fully explored from a user's perspective. Here we show how users responded to volumetric representations of music performance in virtual reality. Our results preliminarily demonstrate how audiences are likely to respond to music videos and offer insight into how future music videos may be developed for different user types. We anticipate that our essay will serve as a formative starting point for more sophisticated, interactive music videos that can be accessed and presented via extended-reality technologies.
- Research Article
3
- 10.1515/opar-2022-0340
- Dec 22, 2023
- Open Archaeology
Music and sound cannot be experienced through writing and numbers. Writing freezes time onto paper; as a time-based medium, sound cannot be heard without temporal motion, and acoustic metrics are silent data. For a complete experience of sound, it needs to engage our bodies. Digital multimedia technologies offer powerful approaches to understanding the acoustics of the past, and this work will explore a number of those affordances. In particular, this work explores the use of apps that illustrate archaeoacoustic effects, set digitally within visual and acoustic archaeological cultures. The ways of immersing audiences through projection, acoustic simulation, field and studio recordings, and musical performance will be discussed. The use of virtual reality (VR) headsets is explored to create a sense of deep flow and presence amongst audiences, total immersion in an experiential phenomenological understanding of interacting audio and visual fields, as well as setting such results within an appropriate context. This study will examine how acoustics results at caves in Northern Spain, in various phases of Stonehenge, and at Paphos Theatre (all World Heritage Sites) can be explored using VR and multimedia technologies, evaluating the comparative advantages of the use of different technologies. It proposes that such integration of visual and sonic modelling using interactive digital technologies is effective as a non-representational theory approach to complement empirical studies, allowing understanding that goes beyond numerical analysis and binary dialectics to engage directly with the material of archaeological sites in an embodied manner, and to address the real-world complexities of acoustic ecologies and their contexts.
- Research Article
20
- 10.1186/s12888-023-05040-z
- Aug 1, 2023
- BMC Psychiatry
Background: Performance anxiety is the most frequently reported anxiety disorder among professional musicians. Typical symptoms are, on a physical level, the consequences of an increase in sympathetic tone with cardiac stress, such as accelerated heartbeat, increased blood pressure, increased respiratory rate, and tremor, up to nausea or flush reactions. These symptoms can cause emotional distress and reduced musical and artistic performance, up to impaired functioning. While anxiety disorders are preferably treated using cognitive-behavioral therapy with exposure, this approach is difficult to apply to music performance anxiety, since the presence of an audience or professional jury is required and not easily available. The use of virtual reality (VR) could therefore represent an alternative. So far, no therapy studies on music performance anxiety applying virtual reality exposure therapy have investigated the therapy outcome including cardiovascular changes as outcome parameters. Methods: This mono-center, prospective, randomized and controlled clinical trial has a pre-post design with a follow-up period of 6 months. 46 professional and semi-professional musicians will be recruited and allocated randomly to a VR exposure group or a control group receiving progressive muscle relaxation training. Both groups will be treated over 4 single sessions. Music performance anxiety will be diagnosed based on a clinical interview using ICD-10 and DSM-5 criteria for specific phobia or social anxiety. A behavioral assessment test is conducted three times (pre, post, follow-up) in VR through an audition in a concert hall. Primary outcomes are the changes in music performance anxiety measured by the German Bühnenangstfragebogen and the cardiovascular reactivity reflected by heart rate variability (HRV).
Secondary outcomes are changes in blood pressure, stress parameters such as cortisol in blood and saliva, neuropeptides, and DNA methylation. Discussion: The trial investigates the effect of VR exposure in musicians with performance anxiety, compared to a relaxation technique, on anxiety symptoms and corresponding cardiovascular parameters. We expect a reduction of anxiety but also a consecutive improvement of HRV with cardiovascular protective effects. Trial registration: This study was registered on ClinicalTrials.gov (ClinicalTrials.gov Number: NCT05735860).
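Heart rate variability, the trial's primary cardiovascular outcome, is commonly summarized with time-domain indices such as RMSSD computed from successive RR intervals. The following is a minimal sketch of that standard index, not the trial's specified analysis pipeline.

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms),
    a standard time-domain HRV index; higher values generally indicate
    greater parasympathetic (vagal) activity."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

Comparing such an index across pre, post, and follow-up assessments is one way the expected cardiovascular improvement could be quantified.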
- Research Article
5
- 10.1007/s10055-024-01014-y
- May 25, 2024
- Virtual Reality
This study analyzes the effect of using a virtual reality (VR) game as a complementary tool to improve users’ rhythmic performance and perception in a remote and self-learning environment. In recent years, remote learning has gained importance due to various everyday situations; however, the effects of using VR in such situations for individual and self-learning have yet to be evaluated. In music education, learning processes are usually heavily dependent on face-to-face communication with a teacher and are based on a formal or informal curriculum. The aim of this study is to investigate the potential of gamified VR learning and its influence on users’ rhythmic sensory and perceptual abilities. We developed a drum-playing game based on a tower defense scenario designed to improve four aspects of rhythmic perceptual skills in elementary school children with various levels of music learning experience. In this study, 14 elementary school children received Meta Quest 2 headsets for individual use in a 14-day individual training session. The results showed a significant increase in their rhythmical skills through an analysis of their rhythmic performance before and after the training sessions. In addition, the experience of playing the VR game and using the HMD setup was also assessed, highlighting some of the challenges of currently available affordable headsets for gamified learning scenarios.
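A pre/post comparison of rhythmic performance like the one described could rest on a timing-accuracy metric over recorded drum hits. The nearest-beat scoring below is a hypothetical sketch, not the study's actual measure.

```python
def mean_timing_error_ms(hit_times_ms, beat_times_ms):
    """Mean absolute onset error between drum hits and target beats,
    pairing each hit with its nearest beat (hypothetical metric)."""
    errors = [min(abs(hit - beat) for beat in beat_times_ms)
              for hit in hit_times_ms]
    return sum(errors) / len(errors)
```

A drop in this error from the pre-training to the post-training session would indicate improved rhythmic precision, one of the four skill aspects the game targets.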