Real-time immersive audio featuring facial recognition and tracking

Authors :
Atbas, Erdem
Gaydecki, Patrick
Peyton, Anthony
Publication Year :
2022
Publisher :
University of Manchester, 2022.

Abstract

Consumption of audio-visual media is increasing every day. As a result, the usage of mobile phones, tablets, and other portable devices is growing rapidly. However, in many public spaces it is unwelcome for audio to be played aloud rather than through headphones. Since headphones have their own disadvantages, a proposed solution is to deliver the desired audio to the listener using loudspeaker arrays in such a way that it causes minimal interference outside the zone of focus. Such a solution is not always practical; thus, a second solution is also proposed: to enrich the audio delivered via headphones to provide a more immersive listening experience. This thesis describes research which explored both focused delivery using a loudspeaker array and 3D sound generation using headphones. It proposes two systems that provide immersive sound localisation with either loudspeakers or headphones, minimising disruption to non-users. The first system is a real-time sound projector, referred to in this thesis as a Personal Audio Delivery system, that changes the focal point of an acoustic beam according to the triangulated coordinates of a recognised and tracked user moving within the enclosed volume of an anechoic chamber. The Personal Audio Delivery system updates the focal point at a refresh rate of 4 Hz whilst providing an average acoustic contrast of between 23.47 dB and 34.52 dB between the focal point and the rest of the anechoic chamber, combined with a triangulation accuracy of ±5 cm in a 33 m³ room. The second system is a real-time audio virtualiser, referred to in this thesis as a Virtual Audio Localisation system. It adaptively localises the desired audio in virtual space with respect to the azimuth, elevation, and distance derived from the triangulated coordinates of the designated speakers and listeners, at an update rate of 1 Hz, making the headphone listening experience more realistic. Since the Virtual Audio Localisation system is integrated with the same recognition and tracking system, it also has a triangulation accuracy of ±5 cm in a 33 m³ room.
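
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: it assumes a simple delay-based focusing scheme for the loudspeaker array, a speed of sound of 343 m/s, an azimuth measured in the horizontal x-y plane from the +x axis, and acoustic contrast defined as a ratio of RMS pressures. All function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 degrees C


def focusing_delays(speaker_positions, focal_point, c=SPEED_OF_SOUND):
    """Delay (s) per loudspeaker so that all wavefronts reach focal_point together.

    Assumption: plain delay-based focusing; the thesis may use another method.
    """
    distances = [math.dist(p, focal_point) for p in speaker_positions]
    farthest = max(distances)
    # The farthest loudspeaker fires first (zero delay); nearer ones wait.
    return [(farthest - d) / c for d in distances]


def azimuth_elevation_distance(source, listener):
    """Convert a source position to (azimuth deg, elevation deg, distance m)
    relative to the listener. Listener head orientation is ignored here."""
    dx, dy, dz = (s - l for s, l in zip(source, listener))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if distance == 0.0:
        return 0.0, 0.0, 0.0
    azimuth = math.degrees(math.atan2(dy, dx))
    elevation = math.degrees(math.asin(dz / distance))
    return azimuth, elevation, distance


def acoustic_contrast_db(p_focus_rms, p_elsewhere_rms):
    """Contrast in dB between RMS pressure at the focus and elsewhere."""
    return 20.0 * math.log10(p_focus_rms / p_elsewhere_rms)


if __name__ == "__main__":
    # Hypothetical four-element line array with 10 cm spacing.
    array = [(0.1 * i, 0.0, 1.0) for i in range(4)]
    tracked_user = (0.15, 2.0, 1.2)  # hypothetical triangulated coordinates
    print(focusing_delays(array, tracked_user))
    print(azimuth_elevation_distance((1.0, 1.0, 1.5), tracked_user))
    print(acoustic_contrast_db(10.0, 1.0))
```

Under this definition, an RMS pressure ratio of 10 between the focal point and the rest of the room corresponds to a contrast of 20 dB. In a tracked system of the kind described, the delays would be recomputed each time the tracker reports new coordinates.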

Details

Language :
English
Database :
British Library EThOS
Publication Type :
Dissertation/ Thesis
Accession number :
edsble.850605
Document Type :
Electronic Thesis or Dissertation