1,098 results on '"Stereophonic sound"'
Search Results
2. Making Ambient Music Interactive Based on Ubiquitous Computing Technologies
- Author
-
Kinoshita, Yukiko, Nakajima, Tatsuo, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Novais, Paulo, editor, Jung, Jason J., editor, Villarrubia González, Gabriel, editor, Fernández-Caballero, Antonio, editor, Navarro, Elena, editor, González, Pascual, editor, Carneiro, Davide, editor, Pinto, António, editor, Campbell, Andrew T., editor, and Durães, Dalila, editor
- Published
- 2019
- Full Text
- View/download PDF
3. Principles of microphone sound recording in the context of the creative direction of sound recording
- Subjects
Sound recording and reproduction ,Stereophonic sound ,Popular music ,Aesthetics ,law ,Rock music ,Context (language use) ,General Medicine ,Sociology ,Timbre ,Realism ,law.invention ,Purism - Abstract
The purpose of the article is to characterize the principles of sound recording with microphones in the context of the acoustic spatial features of concert halls, which are an important component in positioning the activities and creative directions of "purism", "individualism" and "realism" in sound engineering. The methodology consists of the use of analytical, historical, and cultural methods, which made it possible to identify and characterize the technological foundations of sound recording using the example of sound engineers. The scientific novelty of the work lies in the fact that for the first time in Ukrainian science the principles of microphone sound recording in the context of acoustic spatial features and creative directions of sound engineering "purism", "individualism" and "realism" were defined and characterized. Conclusions. In the work, the data on the spectral response of the frequency range, the stereophonic effect, musical and timbre balance, and the spatial impression of the acoustics of concert halls were determined. The principles of applying the multi-microphone technique in instrumental, orchestral, and rock music are revealed, and the creative potential of the directions "purism", "individualism" and "realism" in sound engineering is outlined. In terms of current cinematic trends and contemporary popular music culture, we hear and become accustomed to exaggeratedly colorful and rich, often "electronic" sound. Since the listener is the ultimate link in the entire recording industry, it is necessary to recognize landmarks in sound engineering aimed at the tastes of the majority.
- Published
- 2021
- Full Text
- View/download PDF
4. Stereo Feature Enhancement and Temporal Information Extraction Network for Automatic Music Transcription
- Author
-
Jie Shao, Yanjun She, Wen Zhang, and Yonghui Zhang
- Subjects
Computer science ,Applied Mathematics ,Speech recognition ,Feature extraction ,computer.software_genre ,Temporal database ,law.invention ,Raw audio format ,Stereophonic sound ,Feature (computer vision) ,law ,Signal Processing ,Electrical and Electronic Engineering ,Transcription (software) ,Audio signal processing ,Hidden Markov model ,computer - Abstract
Automatic music transcription (AMT), a challenging audio processing task that aims to convert raw audio into a symbolic representation, has attracted increasing attention recently. Nowadays, music recordings are usually stereo audio files. Many previous studies simply average the stereo signal to a mono signal when processing data, which sacrifices some useful information. In this paper, we design a stereo feature enhancement (SFE) module based on a self-attention mechanism to make full use of stereo information. Moreover, in recent years the temporal convolutional network (TCN) has proven effective at processing temporal data, overcoming some drawbacks of existing temporal information extraction methods such as HMM, RNN and LSTM. Inspired by this, we propose a temporal convolutional module (TCM) suited to extracting the temporal context of music. Our proposed network is validated on the MAPS dataset for music transcription, and achieves ideal performance.
- Published
- 2021
- Full Text
- View/download PDF
5. Rapt/Wrapped Listening: The Aesthetics of 'Surround Sound'
- Author
-
James Wierzbicki
- Subjects
Point (typography) ,business.industry ,media_common.quotation_subject ,Terrence Malick ,Surround Sound ,Art ,Stereo ,Surround sound ,law.invention ,Dolby 5.1 ,Stereophonic sound ,Movie theater ,law ,Aesthetics ,Perception ,In real life ,Encoding (semiotics) ,Active listening ,Fantasound ,business ,media_common - Abstract
This essay is prompted by “surround sound,” the sonic results of which have been evident in cinemas since the late 1970s and the encoding for which, in the form of Dolby 5.1 on the soundtracks of DVDs, since the turn of the century has been fairly ubiquitous. By way of background, the essay deals in turn with the physical nature of three-dimensional listening and with the history of stereophonic sound as manifest both in the cinema and on LP recordings. More to the point, the essay deals with the aesthetic differences (not just perceptual but also affective) between listening to three-dimensional sounds in real life situations and listening to re-creations of those sounds, via a Dolby system or otherwise, in the privacy and comfort of one’s home. Playing on the homophonic adjectives in its title, the essay reflects on why sometimes we give more rapt attention to artificial versions of “surround sound” than to the genuine stereophonic sound in which we are literally wrapped almost on a daily basis.
- Published
- 2021
6. Active Filter Circuits and Phase-Locked Loop (PLL)
- Author
-
S. P. Yawale and S. S. Yawale
- Subjects
Phase-locked loop ,Stereophonic sound ,Quality (physics) ,Band-pass filter ,law ,Computer science ,Low-pass filter ,Electronic engineering ,Public address system ,High-pass filter ,Active filter ,law.invention - Abstract
Active filters, the phase-locked loop (PLL), and its applications are discussed in this chapter. To acquaint the reader with the design of active filters and their applicability in instrumentation, low-pass, high-pass and band-pass filters are explained. Frequency selection in audio or music systems is of utmost importance. The selection of lower and higher cut-off frequencies decides the quality of the filters. In public address, audio, or stereophonic systems, bass and treble decide the quality of the sound, and this comes from the proper selection of low, mid and high frequencies.
- Published
- 2021
- Full Text
- View/download PDF
7. Deep learning-based stereophonic acoustic echo suppression without decorrelation
- Author
-
Andong Li, Xiaodong Li, Renhua Peng, Linjuan Cheng, and Chengshi Zheng
- Subjects
Reverberation ,Acoustics and Ultrasonics ,Computer science ,business.industry ,Microphone ,Deep learning ,Acoustics ,law.invention ,Stereophonic sound ,Recurrent neural network ,Deep Learning ,Arts and Humanities (miscellaneous) ,law ,Computer vision ,Loudspeaker ,Artificial intelligence ,Neural Networks, Computer ,Sound quality ,Least-Squares Analysis ,business ,Decorrelation ,Algorithms - Abstract
Traditional stereophonic acoustic echo cancellation algorithms need to estimate acoustic echo paths from stereo loudspeakers to a microphone, which often suffers from the nonuniqueness problem caused by a high correlation between the two far-end signals of these stereo loudspeakers. Many decorrelation methods have already been proposed to mitigate this problem. However, these methods may reduce the audio quality and/or stereophonic spatial perception. This paper proposes to use a convolutional recurrent network (CRN) to suppress the stereophonic echo components by estimating a nonlinear gain, which is then multiplied by the complex spectrum of the microphone signal to obtain the estimated near-end speech without a decorrelation procedure. The CRN includes an encoder-decoder module and two-layer gated recurrent network module, which can take advantage of the feature extraction capability of the convolutional neural networks and temporal modeling capability of recurrent neural networks simultaneously. The magnitude spectra of the two far-end signals are used as input features directly without any decorrelation preprocessing and, thus, both the audio quality and stereophonic spatial perception can be maintained. The experimental results in both the simulated and real acoustic environments show that the proposed algorithm outperforms traditional algorithms such as the normalized least-mean square and Wiener algorithms, especially in situations of low signal-to-echo ratio and high reverberation time RT60.
- Published
- 2021
8. A High-Capacity Reversible Data Hiding Scheme Using Dual-Channel Audio
- Author
-
Heng Yu, Rangding Wang, Li Dong, Diqun Yan, Yongkang Gong, and Yuzhen Lin
- Subjects
General Computer Science ,Channel (digital image) ,Cover (telecommunications) ,Steganography ,Computer science ,Speech recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,General Engineering ,dual channel ,Reversible data hiding ,Data_CODINGANDINFORMATIONTHEORY ,TK1-9971 ,Image (mathematics) ,law.invention ,Stereophonic sound ,Transmission (telecommunications) ,law ,Information hiding ,audio ,magic matrix ,Embedding ,General Materials Science ,Electrical engineering. Electronics. Nuclear engineering - Abstract
In recent years, reversible data hiding (RDH) based on dual stego covers has developed rapidly because of its high capacity and low distortion. In the image case, however, two consecutive images of the same image will draw the attention of adversaries during transmission. In this article, we propose a high-capacity RDH scheme using dual-channel audio, by exploiting the natural dual-channel property of stereo audio. Specifically, we first convert the secret message into novenary digits, which could increase the embedding capacity. Then, the magic matrix is used to embed the secret digit into a single-channel audio to generate two single-channel stego-audio signals. Finally, the two single-channel stego-audio signals are combined into a conventional dual-channel audio. Extensive experiments have demonstrated that our proposed method could significantly boost the stego quality (the SNR is improved by 16% on average) compared with the state-of-the-art methods.
- Published
- 2020
- Full Text
- View/download PDF
9. Sergei Eisenstein’s Ideas in the Context of Contemporary Cinema. Stereo Film and Stereo Sound
- Author
-
Elena A. Rusinova
- Subjects
business.industry ,Filmmaking ,media_common.quotation_subject ,Screen space ,Art ,law.invention ,Movie theater ,Stereophonic sound ,Aesthetics ,law ,Film studies ,Dream ,business ,media_common - Abstract
The text (a continuation of the article in Vestnik VGIK, No. 3 (41), 2019) deals with the ideas of S.M. Eisenstein, presented by him in the theoretical work On Stereoscopic Films, in the context of the subsequent development of cinema phonography and modern scientific and theoretical discussions on the problems of correlation of technological and aesthetic aspects of cinema art. Eisenstein’s article focuses on and analyzes new and controversial technical achievements of cinema, but the author’s thoughts reach a high level of understanding the history and prospects of the development of cinema and art in general as an expression of the organic human need for creative and artistic activity. Turning to the history of the theater as the forerunner of cinema, Eisenstein emphasizes the moment of separation of the actor and stage action from the audience. The director sees the technical possibilities of stereo cinema as a way of returning the viewer to the space of direct co-action, complete immersion in the artistic space, integration with the artistic image. Modern multichannel sound technologies have come close to fulfilling Eisenstein’s dream of drawing the viewer into the screen space and merging it with the artistic creation. But when and to what extent is the use of stereo effects justified, and how do the technological and artistic aspects of film production correlate? These are issues that are currently the subject of theoretical discourse in the framework of not only film studies, but also interdisciplinary knowledge, and they affect the development of audiovisual concepts in the practice of filmmaking.
- Published
- 2019
- Full Text
- View/download PDF
10. Highly secured image hiding technique in stereo audio signal based on complete complementary codes
- Author
-
Marwa H. El-Sherif, Noha O. Korany, and Said E. El-Khamy
- Subjects
Audio signal ,Steganography ,Computer Networks and Communications ,Computer science ,business.industry ,020207 software engineering ,02 engineering and technology ,law.invention ,Spread spectrum ,Stereophonic sound ,Hardware and Architecture ,law ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Embedding ,Computer vision ,Artificial intelligence ,business ,Software - Abstract
The recent revolution of the Internet as a collaborative medium has opened the door for people who want to share their work. Nevertheless, this may cause serious problems for privacy and copyright protection. Steganography is a powerful tool for protecting important data during transmission. It’s used to hide any secret information like text, image or audio behind a cover file. In this study, a new robust audio steganography technique based on optimum two-dimensional Complete Complementary Codes (CCC) has been adopted to encode colour image data and obtain two differently encoded versions of it. These two versions are hidden in the DWT coefficients of the two channels of a stereo audio signal, and embedding locations are determined via a 2-D chaotic map random sequence. Complete Complementary Codes (CCC) are a family of spread spectrum sequence sets that have ideal auto- and cross-correlation properties, so they have found many applications in several areas of science, with the broadest application possibilities in telecommunications. Various attacks are applied to the host audio signals and simulation results show high robustness and capacity with good quality of the extracted image.
- Published
- 2019
- Full Text
- View/download PDF
11. Music memoirs from Africa: tracking music in Binyavanga Wainaina’s memoir: One Day I Will Write About This Place
- Author
-
Savannah Lucas
- Subjects
History ,05 social sciences ,0507 social and economic geography ,Human sexuality ,06 humanities and the arts ,060202 literary studies ,050701 cultural studies ,Solidarity ,Diaspora ,law.invention ,Visual arts ,Stereophonic sound ,law ,Counterculture ,Memoir ,0602 languages and literature ,Tracking (education) ,Social Sciences (miscellaneous) - Abstract
“Music Memoirs from Africa” seeks to highlight how music can be used as a tool by contemporary authors to dynamically communicate modern and refreshed experiences of diaspora, sexuality, pan-Africa...
- Published
- 2019
- Full Text
- View/download PDF
12. A new adaptive filtering algorithm for stereophonic acoustic echo cancellation
- Author
-
Mohamed Djendi and Merouane Messini
- Subjects
010302 applied physics ,Normalization (statistics) ,Acoustics and Ultrasonics ,Channel (digital image) ,Computer science ,Echo (computing) ,Stability (learning theory) ,01 natural sciences ,law.invention ,Term (time) ,Least mean squares filter ,Stereophonic sound ,law ,0103 physical sciences ,Convergence (routing) ,010301 acoustics ,Algorithm - Abstract
This paper addresses the stereophonic acoustic echo cancellation (SAEC) problem in teleconferencing systems using an adaptive filtering algorithm. We propose a new stereophonic version of the fast normalized least mean square (FNLMS) algorithm. The proposed algorithm extends the FNLMS algorithm to the stereophonic case, with an important improvement to the prediction part, which becomes stable in the stereophonic case. The basic idea behind the proposed stereophonic fast normalized least mean square (SFNLMS) algorithm is the normalization of the predictor parameter by the input variance for each channel. Simulation results of a comparison between the proposed SFNLMS algorithm and the classical SNLMS version in terms of convergence speed and stability are presented.
- Published
- 2019
- Full Text
- View/download PDF
13. A Partitioned-Block Frequency-Domain Adaptive Kalman Filter for Stereophonic Acoustic Echo Cancellation
- Author
-
Feiran Yang, Rui Zhu, Li Yuepeng, and Shang Shidong
- Subjects
Stereophonic sound ,Computer science ,law ,Frequency domain ,Block (telecommunications) ,Echo (computing) ,Kalman filter ,Algorithm ,law.invention - Published
- 2021
- Full Text
- View/download PDF
14. Low-Complexity Acoustic Scene Classification Using Data Generation Based On Primary Ambient Extraction
- Author
-
Haocong Yang, Jiangnan Liang, Yingzi Liu, and Chuang Shi
- Subjects
Test data generation ,Computer science ,computer.software_genre ,Data modeling ,law.invention ,Stereophonic sound ,Statistical classification ,law ,Data mining ,Focus (optics) ,Quantization (image processing) ,Baseline (configuration management) ,Mobile device ,computer - Abstract
Acoustic scene classification (ASC) is an important branch of machine hearing. Since ASC systems are intended to be deployed on mobile devices, how to ensure performance under low-complexity implementation has become an attractive research problem. The state-of-the-art methods include compressing parameter precisions, reducing quantization bits, introducing sparsity constraints and so on. These methods mainly focus on model-level optimization, while explorations rarely originate from the data level. This paper introduces a line of thought from the data level, inspired by a stereo audio processing algorithm, namely primary ambient extraction (PAE), which generates additional samples through audio up-mixing. The experimental results demonstrate that the proposed method exhibits better performance than a group of ASC baseline systems without data-level optimization, not to mention that the proposed method is compatible with existing model-level optimization.
- Published
- 2021
- Full Text
- View/download PDF
15. Proportionate Adaptive Sub-Filters for Nonlinear Acoustic Echo Cancellation
- Author
-
Srikanth Burra and Asutosh Kar
- Subjects
Stereophonic sound ,Nonlinear system ,Rate of convergence ,law ,Colors of noise ,Nonlinear distortion ,Computer science ,Convergence (routing) ,Echo (computing) ,Loudspeaker ,Hardware_ARITHMETICANDLOGICSTRUCTURES ,Algorithm ,law.invention - Abstract
Electronic components such as the loudspeaker and amplifier in hands-free devices add nonlinear distortion (ND) to the echo path of stereophonic acoustic systems. The stereophonic acoustic echo cancellation (SAEC) algorithms developed to minimize the effect of acoustic echo in these systems assume a linear echo path and suffer from degraded performance in the presence of ND. However, no solutions have been proposed to address the issue of ND in SAEC. This paper introduces a sub-filter based nonlinear acoustic echo cancellation (NAEC) framework for faster convergence and enhanced steady-state performance. The proposed framework employs the functional link approach in modeling the ND and focuses on improving the rate of convergence of NAEC for better echo cancellation performance. Simulations indicate that the proposed NAEC outperforms the variants of functional link based NAEC in terms of echo return loss enhancement for speech and colored noise inputs.
- Published
- 2021
- Full Text
- View/download PDF
16. Visually Informed Binaural Audio Generation without Binaural Audios
- Author
-
Xudong Xu, Dahua Lin, Xiaogang Wang, Hang Zhou, Ziwei Liu, and Bo Dai
- Subjects
FOS: Computer and information sciences ,Sound (cs.SD) ,business.industry ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Speech recognition ,Stability (learning theory) ,Computer Science - Computer Vision and Pattern Recognition ,Pipeline (software) ,Computer Science - Sound ,Multimedia (cs.MM) ,Visualization ,law.invention ,Sound recording and reproduction ,Stereophonic sound ,Audio and Speech Processing (eess.AS) ,law ,FOS: Electrical engineering, electronic engineering, information engineering ,Artificial intelligence ,business ,Sensory cue ,Binaural recording ,Impulse response ,Computer Science - Multimedia ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Stereophonic audio, especially binaural audio, plays an essential role in immersive viewing environments. Recent research has explored generating visually guided stereophonic audios supervised by multi-channel audio collections. However, due to the requirement of professional recording devices, existing datasets are limited in scale and variety, which impedes the generalization of supervised methods in real-world scenarios. In this work, we propose PseudoBinaural, an effective pipeline that is free of binaural recordings. The key insight is to carefully build pseudo visual-stereo pairs with mono data for training. Specifically, we leverage spherical harmonic decomposition and head-related impulse response (HRIR) to identify the relationship between spatial locations and received binaural audios. Then in the visual modality, corresponding visual cues of the mono data are manually placed at sound source positions to form the pairs. Compared to fully-supervised paradigms, our binaural-recording-free pipeline shows great stability in cross-dataset evaluation and achieves comparable performance under subjective preference. Moreover, combined with binaural recordings, our method is able to further boost the performance of binaural audio generation under supervised settings. Accepted by CVPR 2021. Code, models, and demo video are available on our webpage: \
- Published
- 2021
17. Three-dimensional Nonvisual Directional Guidance for People with Visual Impairments
- Author
-
Sohyeon Park, Kyungyeon Lee, Uran Oh, and Seung A Chung
- Subjects
Modalities ,Point (typography) ,Computer science ,Visual impairment ,Visualization ,law.invention ,Stereophonic sound ,Human–computer interaction ,law ,medicine ,Laser pointer ,Audio feedback ,medicine.symptom ,Haptic technology - Abstract
Conveying directional feedback is important for individuals who are blind or have limited visual acuity. However, most studies have focused on supporting two-dimensional guidance. In this work, we investigated the effects of different nonvisual feedback conditions for providing directional guidance in a three-dimensional space. We conducted a user study with six people who are blind or have low vision to investigate the effects of stereo sound (on vs. off) and feedback modalities (beeping vs. vibration vs. beeping+vibration). Participants were asked to point a laser pointer, as quickly as possible, at a series of virtual targets that appeared randomly around them in 3D. Findings suggest that conditions with a beeping sound yield better performance in terms of task completion time and travel distance than vibration feedback without a beeping sound, which was the least preferred condition. In addition, we found that the presence of stereo sound has no significant effect on performance, although it is preferred by most participants. This work can contribute to 3D navigation for people who are blind or have limited visual acuity.
- Published
- 2021
- Full Text
- View/download PDF
18. Enhanced Audio Source Separation and Musical Component Analysis
- Author
-
Tanmay Bhagwat, Leena Ragha, Shubham Deolalkar, and Jayesh Lokhande
- Subjects
Artificial neural network ,Computer science ,business.industry ,Deep learning ,Speech recognition ,Cohesion (computer science) ,law.invention ,Stereophonic sound ,Recurrent neural network ,law ,Source separation ,Task analysis ,Artificial intelligence ,business ,Digital signal processing - Abstract
Audio source separation is a cornerstone problem for researchers engaged in Digital Signal Processing and Artificial Intelligence. Music unmixing is the task of decomposing music into its constitutive components, like yielding separated stems for the vocals, bass, drums, accompaniment, jazz, and others from a mastered song track. Due to recent progress in the field of Deep Learning, researchers have been able to devise Neural Networks that can perform this task with considerable precision. However, these models lack performance when dealing with generic musical audio despite having decent utility over specific music genres. The proposed system aims to develop a universal platform-independent software for accurate domain-specific implementation of music source separation for acute subsets of stereo audio using the Bidirectional Long Short Term Memory (BLSTM) architecture of Recurrent Neural Networks. The Deep Neural Network helps demix audio mixtures into the jazz solo and its accompaniment. In cohesion, these two models extract five independent audio stems from the original audio with reasonable accuracy. Further, the extracted accompaniment stem is processed using ConvNet Model to estimate the instrumental components. In synchronization, these three models can break down audio to its fundamental elements.
- Published
- 2020
- Full Text
- View/download PDF
19. How the War Changed Audio
- Author
-
Julian Ashbourn
- Subjects
geography ,business.product_category ,History ,geography.geographical_feature_category ,Transition (fiction) ,World War II ,law.invention ,Visual arts ,Stereophonic sound ,Tape recorder ,Spanish Civil War ,law ,business ,Sound (geography) ,Period (music) - Abstract
This chapter covers the fairly dramatic transition period from before to after the war. It is interesting to note that stereophonic sound was invented 8 years before the war but was not used until the 1950s. Early forms of recording sound are discussed as well as the introduction of the tape recorder.
- Published
- 2020
- Full Text
- View/download PDF
20. How to Do Things Properly
- Author
-
Julian Ashbourn
- Subjects
business.product_category ,Computer science ,media_common.quotation_subject ,Genius ,GeneralLiterature_MISCELLANEOUS ,Field (computer science) ,law.invention ,Classical music ,Tape recorder ,Stereophonic sound ,law ,Computer graphics (images) ,Blumlein Pair ,business ,media_common - Abstract
This chapter simply covers how to do things properly by recording true stereo sound, which was invented by Alan Blumlein in 1931, although not properly used before the 1950s. This is mostly because, until the advent of the tape recorder, there was no easy way to record stereophonic sound. Blumlein had managed it in the laboratory, but then, he was a genius. He did define clearly the methodologies for recording stereo, even though the technology was not there at the time. In the 1950s and into the very early 1960s, audio engineers in the classical music field followed these principles and made some wonderful, true stereo recordings, which we may still hear today. But then most of them were overtaken by multi-track technology, and they forgot the art. This chapter shows you how to do it again and produce true stereo recordings.
- Published
- 2020
- Full Text
- View/download PDF
21. Stereo Sound, Film Sound and the Legacy of Alan Dower Blumlein
- Author
-
Julian Ashbourn
- Subjects
geography ,Engineering ,geography.geographical_feature_category ,Microphone ,business.industry ,Electrical engineering ,Dower ,law.invention ,Stereophonic sound ,High fidelity ,law ,Blumlein Pair ,Radar ,business ,Sound (geography) - Abstract
This chapter introduces the name Alan Dower Blumlein and the tremendous flow of ideas that came from this one man within a decade. It discusses the invention of stereophonic sound, the moving coil microphone and sound for film. The sudden flourish of what became known simply as ‘High Fidelity’ is introduced, along with the resultant global market that was developed. Blumlein’s work on RADAR is also briefly discussed, as many may not know of this, or of his subsequent death in an aircraft crash while testing new variations of RADAR.
- Published
- 2020
- Full Text
- View/download PDF
22. The Big Time, with 24 Tracks Everywhere
- Author
-
Julian Ashbourn
- Subjects
geography ,geography.geographical_feature_category ,Computer science ,business.industry ,GeneralLiterature_MISCELLANEOUS ,Field (computer science) ,law.invention ,Classical music ,Stereophonic sound ,High fidelity ,Popular music ,law ,Telecommunications ,business ,Studio ,Sound (geography) - Abstract
The pressure upon recording studios to offer 24-track recording was very real, and many had to invest significantly in not only the 24-track machines but compatible audio mixers and outboard electronics. The installation of such equipment into an existing studio required an enormous effort and associated cost. This was driven mostly by the popular music fraternity, but would come to affect the classical music field as well, with consequent changes in the way classical music was recorded and the sound that record buyers heard, which did not necessarily correspond with the sound of the orchestra as heard in the concert hall. The importance of true stereo sound is also discussed.
- Published
- 2020
- Full Text
- View/download PDF
23. Why Recordings Sound Worse Now Than They Did in the 1950s and 1960s
- Author
-
Julian Ashbourn
- Subjects
Classical music ,geography ,Stereophonic sound ,geography.geographical_feature_category ,Popular music ,History ,law ,media_common.quotation_subject ,Quality (business) ,Sound (geography) ,law.invention ,Visual arts ,media_common - Abstract
It is a fact that many classical music recordings that were made in the late 1950s and early 1960s actually sound much more realistic and are of better quality than many of those made today. This is principally because they were recorded in true stereo with good quality microphones and relatively little in the signal chain. Some audio engineers continue to work this way, but not many. The art and science have largely been lost. In the popular music field, they never really adopted stereo anyway, except for some recordings made in the early 1950s, and so things have not changed much, except, of course, for the impact of digital technology. These issues are discussed.
- Published
- 2020
- Full Text
- View/download PDF
24. Classical Music Is Effectively Broken by Technology
- Author
-
Julian Ashbourn
- Subjects
geography ,geography.geographical_feature_category ,Event (computing) ,business.industry ,Computer science ,Visual arts ,law.invention ,Classical music ,Stereophonic sound ,High fidelity ,Popular music ,law ,Music industry ,business ,Sound (geography) ,Digital audio - Abstract
The effect of multi-track recording and digital sound caused engineers within the classical music field to break with tradition and start to put microphones in front of almost every instrument, in the knowledge that they could mix the audio in any way they wished after the event. They would argue that this produced the right sound of each instrument, but it did not reproduce the sound of the orchestra as the conductor and audience heard it, or how the composer intended it to be heard. This has introduced a dichotomy within the recording industry with some freelance engineers and specialist labels recording in true stereo, thus preserving the sound of the orchestra in place, and most of the larger labels recording the multi-track way, emulating those within the popular music field. This is discussed in some depth.
- Published
- 2020
- Full Text
- View/download PDF
25. Investigating Three-dimensional Directional Guidance with Nonvisual Feedback for Target Pointing Task
- Author
-
Uran Oh, Kyungyeon Lee, and SeungA Chung
- Subjects
Computer science ,business.industry ,05 social sciences ,020207 software engineering ,02 engineering and technology ,Task completion ,Spatial memory ,Task (project management) ,law.invention ,Stereophonic sound ,law ,0202 electrical engineering, electronic engineering, information engineering ,Laser pointer ,0501 psychology and cognitive sciences ,Audio feedback ,Computer vision ,Augmented reality ,Artificial intelligence ,business ,050107 human factors ,Haptic technology - Abstract
While directional guidance is essential for spatial navigation, little research has examined nonvisual cues in 3D space for individuals who are blind or have limited visual acuity. To understand the effects of different nonvisual feedback for 3D directional guidance, we conducted a user study with 12 blindfolded participants. They were asked to search for a virtual target in a 3D space with a laser pointer as quickly as possible under 6 different feedback designs varying the feedback mode (beeping vs. haptic vs. beeping+haptic) and the presence of a stereo sound. Our findings show that beeping sound feedback, with or without haptic feedback, outperforms the mode where only haptic feedback is provided. We also found that stereo sound feedback generated from a target significantly improves both the task completion time and travel distance. Our work can help people who are blind or have limited visual acuity to understand directional guidance in a 3D space.
- Published
- 2020
26. Independent Echo Path Modeling for Stereophonic Acoustic Echo Cancellation
- Author
-
Ian Liu, Cheng Luo, Gao Yi, Bin Li, and J. Zheng
- Subjects
Stereophonic sound ,law ,Computer science ,Acoustics ,Echo (computing) ,Path (graph theory) ,law.invention - Published
- 2020
27. The Method of Random Directions Optimization for Stereo Audio Source Separation
- Author
-
Oleg Golokolenko and Gerald Schuller
- Subjects
Stereophonic sound ,Computer science ,law ,business.industry ,Source separation ,Computer vision ,Artificial intelligence ,business ,law.invention - Published
- 2020
28. Stereophonic Frequency Modulation using MATLAB: An Undergraduate Research Project
- Author
-
Jordan M. L. Gilbert, Waseem Sheikh, and William M. Regula
- Subjects
Discriminator ,Computer science ,business.industry ,FM transmitter ,Acoustics ,Signal ,law.invention ,Frequency-division multiplexing ,Stereophonic sound ,symbols.namesake ,Additive white Gaussian noise ,law ,symbols ,business ,Frequency modulation ,Radio broadcasting - Abstract
Frequency modulation (FM) is used worldwide for high-fidelity broadcast radio communication. This paper presents a MATLAB implementation of a stereophonic FM transmitter and receiver. Simulations are performed to measure the performance of the stereophonic FM receiver in the presence of additive white Gaussian noise (AWGN). A stereophonic FM signal with pre-detection SNR values from 0 to 60 dB, in increments of 10 dB, is demodulated using a discriminator. Post-detection SNR values are determined, and performance characteristics are examined for above and below threshold modes of FM operation. The simulations show that the receiver reliably demodulates the message signal for pre-detection SNR values greater than 30 dB. Degradation of the signal occurs at and below 20 dB with a complete loss of signal at 0 dB (below threshold mode). A 15 dB post-detection SNR gain occurs at 60 dB pre-detection SNR.
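As a rough illustration of the discriminator-based demodulation described in this abstract, the sketch below frequency-modulates a test tone in complex baseband and recovers it from the phase step between consecutive samples. It is written in Python rather than MATLAB, it is not the authors' implementation, and it leaves out the stereo multiplex and the AWGN channel. The function names, the 200 kHz sample rate, the 75 kHz deviation, and the 1 kHz tone are assumptions chosen for the example.

```python
import cmath
import math


def fm_modulate(message, fs, deviation_hz):
    """Return complex-baseband FM samples for a message scaled to [-1, 1]."""
    phase = 0.0
    samples = []
    for m in message:
        # Instantaneous frequency is deviation_hz * m; integrating it gives phase.
        phase += 2.0 * math.pi * deviation_hz * m / fs
        samples.append(cmath.exp(1j * phase))
    return samples


def fm_discriminate(samples, fs, deviation_hz):
    """Recover the message from the phase step between consecutive samples."""
    recovered = []
    for prev, cur in zip(samples, samples[1:]):
        step = cmath.phase(cur * prev.conjugate())  # radians per sample
        recovered.append(step * fs / (2.0 * math.pi * deviation_hz))
    return recovered


fs = 200_000.0        # sample rate in Hz (assumed)
deviation = 75_000.0  # peak frequency deviation in Hz (assumed)
tone = [math.sin(2.0 * math.pi * 1_000.0 * n / fs) for n in range(2_000)]

baseband = fm_modulate(tone, fs, deviation)
recovered = fm_discriminate(baseband, fs, deviation)

# Discriminator output k corresponds to input sample k + 1.
max_error = max(abs(r - t) for r, t in zip(recovered, tone[1:]))
print(f"max reconstruction error: {max_error:.2e}")
```

The phase step must stay below π radians per sample for the recovery to be unambiguous, which holds here because the deviation is less than half the sample rate. Adding noise to `baseband` before the discriminator would be the natural next step toward the pre-detection and post-detection SNR comparison the abstract reports.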
- Published
- 2020
29. Smartphone-Controlled Multi-Channel Surround Sound System
- Author
-
Shu-Nung Yao, Chun-Ting Ke, Jie-Hong Chen, and Yu-Hsin Chang
- Subjects
geography ,geography.geographical_feature_category ,Computer science ,Speech recognition ,Ambient noise level ,Equalization (audio) ,Surround sound ,law.invention ,Stereophonic sound ,law ,Active listening ,Sound quality ,Sound (geography) ,Communication channel - Abstract
Multi-channel audio systems are widely used in modern sound devices for movie theaters and home theaters. However, most audio files are encoded into stereo tracks. When a multichannel system plays stereophonic audio, it may not produce the best possible listening perception. This study aims to upmix stereophonic sound to 5.1-channel sound, thereby enhancing the surround sound experience. The proposed system extracts the primary sound for the center audio channel and the ambient sound for the side audio channels. The frontal audio channels are reinforced by a special equalization technique. Users can use a smartphone to remotely control the proposed playback. The subjective listening results show that the spatial audio quality outperforms commercial built-in upmixing, especially in spatial hearing. Based on the pilot experiment, we also hypothesize that there is a trade-off between directional enhancement and spectral balance.
- Published
- 2020
30. HMM-based music retrieval using stereophonic feature information and framelength adaptation
- Author
-
Gerhard Rigoll, Björn Schuller, and M. Lang
- Subjects
Dynamic time warping ,Matching (statistics) ,Computer science ,Speech recognition ,Feature extraction ,Musical ,Query by humming ,law.invention ,Stereophonic sound ,law ,Music information retrieval ,Polyphony ,Singing ,ddc:004 ,Hidden Markov model - Abstract
Music retrieval methods have attracted recent interest due to the increasing size of music databases, for example on the Internet. Among the different query methods, content-based media retrieval, which analyzes intrinsic characteristics of the source, seems to offer the most intuitive access. The key melody of a song can be regarded as the major characteristic of music and leads to query by humming or singing. In this paper we turn our attention to both the features and the matching algorithm in audio music retrieval. Current approaches favor dynamic time warping for the matching process. As a reference, mostly MIDI data or humming itself is used. However, first attempts at matching humming to polyphonic audio exist. In this contribution we introduce hidden Markov models as an alternative for matching humming queries against humming itself, mobile phone ring tones, and polyphonic audio. The second objective of our research is the introduction of a new way of enhancing the melody prior to feature extraction by using stereophonic information. Further, adapting the frame length to the tempo of a musical piece throughout the extraction process helps improve similarity matching performance. The paper addresses the design of a working recognition engine and the results achieved with the methods mentioned. A test database consisting of polyphonic audio clips, ring tones, and sung user data is described in detail.
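The dynamic time warping baseline that this abstract contrasts with hidden Markov models can be sketched in a few lines. The example below is a generic textbook formulation and not the authors' system; the function name `dtw_distance` and the pitch sequences (given as hypothetical MIDI note numbers) are assumptions for illustration only.

```python
def dtw_distance(query, reference):
    """Classic dynamic time warping cost between two pitch sequences."""
    inf = float("inf")
    rows, cols = len(query), len(reference)
    # cost[i][j] is the best cumulative cost of aligning the first i query
    # values with the first j reference values.
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # query value repeated
                cost[i][j - 1],      # reference value repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[rows][cols]


# A hummed query sung at half speed still matches the reference melody.
melody = [60, 62, 64]
slow_humming = [60, 60, 62, 62, 64, 64]
print(dtw_distance(slow_humming, melody))
```

Because the alignment may repeat values, a tempo change alone adds no cost, which is why this kind of matching tolerates timing variation in hummed queries. A transposed query, by contrast, accumulates cost at every step unless pitch is normalized first.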
- Published
- 2020
31. The Application of Mid-Side Theory to Produce Analog Stereo Audio Records Using a Single Laser Beam
- Author
-
Olivier Allegre, David Whitehead, Robert Heinemann, S Orchid, and Daniel Wilson
- Subjects
Stereophonic sound ,Optics ,business.industry ,Computer science ,law ,General Engineering ,business ,Music ,Laser beams ,law.invention - Abstract
The recent resurgence in sales of vinyl music records, led by consumer demand, is outpacing production capability. This has resulted in supply delays across the sector. Thus far, manufacturing investments have focused on traditional proven methods rather than alternative technologies. This paper demonstrates for the first time the production of a stereo recording via analog methodology using a single pulsed laser beam. Using mid-side theory to combine a sum (mono) signal with a difference signal, a 532 nm Nd:YAG laser beam was used to process high-impact polystyrene (HIPS) discs. Stereo recordings were manufactured by varying the laser power to produce a difference signal and deflecting the beam with a mirror-mounted galvanometer to produce the sum signal. Upon playback on a conventional turntable, the recordings were analyzed with an oscilloscope and stereo separation was observed. To our knowledge this is the first time a stereo signal has been successfully recorded using a single laser beam. Previous literature has used a single laser beam to achieve mono signals and required significant digital pre-processing of the audio source. This new methodology requires lower investment costs than traditional pressing plants and would make volume-tailored production more affordable.
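The mid-side relationship this abstract relies on is a simple sum and difference of the two channels. The sketch below shows one common convention, with a factor of one half so that decoding is a plain add and subtract; the function names and the scaling are assumptions, and nothing here models the laser process itself.

```python
def ms_encode(left, right):
    """Convert left/right samples to mid (sum) and side (difference)."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right samples from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [0.5, -0.25, 1.0]
right = [0.25, 0.75, 1.0]

mid, side = ms_encode(left, right)
# In the abstract's setup the sum signal drives the beam deflection and the
# difference signal drives the laser power.
print(mid, side)
print(ms_decode(mid, side))
```

When both channels carry the same sample, as in the last pair, the side signal is zero and only the mono (sum) component remains.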
- Published
- 2020
32. Blaster: An Off-Grid Method for Blind and Regularized Acoustic Echoes Retrieval
- Author
-
Nancy Bertin, Diego Di Carlo, Antoine Deleforge, Clement Elvira, and Rémi Gribonval
- Subjects
Reverberation ,Computer science ,Speech recognition ,010102 general mathematics ,Grid method multiplication ,020206 networking & telecommunications ,02 engineering and technology ,computer.software_genre ,Blaster ,01 natural sciences ,law.invention ,Speech enhancement ,Stereophonic sound ,law ,0202 electrical engineering, electronic engineering, information engineering ,Source separation ,0101 mathematics ,Audio signal processing ,computer - Abstract
Acoustic echoes retrieval is a research topic that is gaining importance in many speech and audio signal processing applications such as speech enhancement, source separation, dereverberation and room geometry estimation. This work proposes a novel approach to blindly retrieve the off-grid timing of early acoustic echoes from a stereophonic recording of an unknown sound source such as speech. It builds on the recent framework of continuous dictionaries. In contrast with existing methods, the proposed approach relies on neither parameter tuning nor peak-picking techniques, because it works directly in the parameter space of interest. The accuracy and robustness of the method are assessed on challenging simulated setups with varying noise and reverberation levels and are compared to two state-of-the-art methods.
- Published
- 2020
33. Influence of interaural cross-correlation coefficient and loudness level on auditory source width at different frequency
- Author
-
Peng Wang, Zhibin Lin, and Xiaojun Qiu
- Subjects
010302 applied physics ,business.product_category ,Acoustics and Ultrasonics ,Auditory event ,Acoustics ,01 natural sciences ,law.invention ,Loudness ,Stereophonic sound ,Cross correlation coefficient ,law ,0103 physical sciences ,Curve fitting ,Hearing impaired ,business ,010301 acoustics ,Headphones ,Mathematics - Abstract
© 2019 Elsevier Ltd. Auditory source width (ASW) is the perceived width of an auditory event of a stimulus, which has complicated relationships with the interaural cross-correlation coefficient (IACC), loudness level and frequency. In this paper, the virtual acoustics pointer method is used to provide the reference signals for investigating the relationship for headphone users. It is found that increasing the loudness level increases the ASW, but to different degrees at different frequencies. The minimum ASW increment appears around 1600 Hz and the maximum occurs around 200 Hz. The average ASW increment is approximately 5.4 for a loudness increment of 10 phons. Decreasing the IACC can broaden the ASW; the maximum ASW increment occurs around 400–800 Hz and the average increment is approximately 4.8 for a decrement in IACC of 0.2. A formula is developed by curve fitting to describe the relationship among the three factors and the ASW, which can be used to predict the auditory source width for stereo sound reproduction using headphones and help hearing impaired people perceive more accurate ASW of sound.
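For readers unfamiliar with the interaural cross-correlation coefficient, the sketch below computes it in the usual way: the peak of the normalized cross-correlation between the two ear signals over lags of up to 1 ms. This is a generic definition and not the authors' measurement procedure; the function name, the 1 ms default, and the impulse test signals are assumptions, and band filtering is omitted.

```python
import math


def iacc(left, right, fs, max_lag_ms=1.0):
    """Return (coefficient, lag in samples) of the normalized correlation peak."""
    n = min(len(left), len(right))
    left, right = left[:n], right[:n]
    energy = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if energy == 0.0:
        return 0.0, 0
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    best_value, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        value = abs(acc) / energy
        if value > best_value:
            best_value, best_lag = value, lag
    return best_value, best_lag


fs = 48_000
left_ear = [0.0] * 64
right_ear = [0.0] * 64
left_ear[10] = 1.0
right_ear[12] = 1.0  # same impulse arriving two samples later

coefficient, lag = iacc(left_ear, right_ear, fs)
print(coefficient, lag)
```

A coefficient near 1 indicates nearly identical ear signals apart from a delay; lower values indicate decorrelated ear signals, which is the condition the abstract associates with a broader auditory source width.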
- Published
- 2020
34. Reversible Watermarking on Stereo Audio Signals by Exploring Inter-Channel Correlation
- Author
-
Wen Diao, Dongdong Hou, Yuanxin Wu, and Weiming Zhang
- Subjects
Channel correlation ,Stereophonic sound ,law ,Computer science ,business.industry ,Computer vision ,Artificial intelligence ,business ,Digital watermarking ,Software ,law.invention - Abstract
A new reversible watermarking algorithm on stereo audio signals is proposed in this article. By utilizing correlations between the two channels of an audio signal, the authors segment one channel based on the other according to smoothness. For each segmented sub-host sequence, they first estimate its capacity and the corresponding embedding distortion, and then select the optimal combination of sub-host sequences for embedding. Experimental results indicate that the proposed algorithm can improve the SNR (signal-to-noise ratio) for various capacities.
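To show what "reversible" means in practice, the sketch below uses difference expansion, a classic reversible embedding scheme, on one pair of integer samples (left, right). This is a different and much simpler technique than the algorithm in the abstract; it is included only because it also exploits the similarity of the two channels. The function names are hypothetical, and overflow handling for fixed-width samples is left out.

```python
def de_embed(left, right, bit):
    """Hide one bit in an integer sample pair by expanding their difference."""
    average = (left + right) // 2
    expanded = 2 * (left - right) + bit
    return average + (expanded + 1) // 2, average - expanded // 2


def de_extract(left_marked, right_marked):
    """Return the original pair and the hidden bit."""
    expanded = left_marked - right_marked
    average = (left_marked + right_marked) // 2
    bit = expanded % 2
    difference = expanded // 2
    left = average + (difference + 1) // 2
    right = average - difference // 2
    return left, right, bit


marked = de_embed(10, 7, 1)
print(marked)
print(de_extract(*marked))
```

The closer the two channel samples are, the smaller the expanded difference and the lower the embedding distortion, which is the intuition behind using inter-channel correlation.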
- Published
- 2019
35. Auditory Stimulation on Touching a Virtual Object Outside a user’s Field of View
- Author
-
Zentaro Kimura and Mie Sato
- Subjects
Stereophonic sound ,Auditory feedback ,Natural interaction ,law ,Human–computer interaction ,Virtual image ,Auditory stimulation ,Computer science ,Field of view ,Human-centered computing ,law.invention ,Impression - Abstract
We investigate the effects of auditory stimuli that improve the ease of touching a virtual object outside a user’s field of view. Our impression evaluation experiments show that a 45-degree expanded stereophonic sound contributes to the natural interaction between a user and a virtual object without using visual information.
- Published
- 2020
36. A Stereo Audio Delta-Sigma DAC with 40-kHz Bandwidth and 103-dB SNR
- Author
-
Han Yang, Jun Soo Cho, Hyunjong Kim, Yujin Park, and Suhwan Kim
- Subjects
Physics ,Stereophonic sound ,law ,Bandwidth (signal processing) ,Electronic engineering ,Digital-to-analog converter ,Wide band ,Electrical and Electronic Engineering ,Delta-sigma modulation ,Switched capacitor ,Electronic, Optical and Magnetic Materials ,law.invention ,Low noise - Published
- 2018
37. Binaural Capability of Locating Sound Sources of Information Signals
- Author
-
Mariia Volodymyrivna Vdovenko and Svetlana Andriivna Luniova
- Subjects
Computer science ,Acoustics ,media_common.quotation_subject ,General Medicine ,Pulse (music) ,Legibility ,Signal ,law.invention ,Stereophonic sound ,Interval (music) ,law ,Perception ,Loudspeaker ,Binaural recording ,media_common - Abstract
The purpose of the work done was to estimate the human binaural capability of locating stereo sound sources of information signals, especially speech, choir singing, and symphonic music as compared to pulse signals. The measurements were conducted in an average-sized hall where sound was emitted by a stereo system consisting of two loudspeakers with a base width of 4.5 m. Based on the results of analysis of existing methods of binaural sound perception modelling, the estimation was performed based on correlational processing of signals recorded using a dummy head placed at various points of the room. For the recording points selected, the maximum time delays between signals arriving at a listener’s left and right ears were calculated, and the interaural cross-correlation functions were obtained. The general functions, especially the peak shift, correlation interval, and peak sharpness (including the conditions when it splits into two individual ones), were analysed. The correlation factor values and levels were calculated. As a result, the conclusions regarding the capability of locating an imaginary sound source were made based on the cross-correlation function factor values, which simplified the application of this method in practice. Based on the results of experiments conducted and on the subjective sensations of perception, the authors have come to the conclusion that the stereophony area is much wider than was assumed earlier from the signal time delay of 1 ms at a receiving stereo pair (representing a shift of an imaginary source towards the ear perceiving the signal earlier). At the same time, it was found that signals with a speech component are much harder to locate than pulse signals, while music signals, especially symphonic music, are close to pulse signals in terms of human locating capabilities. The patterns found have allowed us to introduce adjustments to the stereophony area calculations.
Based on the research results, we suggest defining the stereophony zone border by the correlation factor value of 0.5. Given that, the interaural cross-correlation function properties and subjective perception provide for acceptable speech legibility and music transparency. The key conclusion is that this area is quite narrow for speech signals and is actually limited to the time delay of signals arriving at left and right ears – around 1 ms. Music signals have a wider stereophony area defined by the time delay between perception binaural pair components of around 10 ms. Therefore, sizes of a stereophonic sound area ought to be defined with regard to an information signal’s type. Ref. 14, fig. 3, tabl. 3.
- Published
- 2018
38. Detecting Door Events Using a Smartphone via Active Sound Sensing
- Author
-
Thilina Dissanayake, Daichi Amagata, Takuya Maekawa, and Takahiro Hara
- Subjects
Focus (computing) ,Ubiquitous computing ,Computer Networks and Communications ,Computer science ,Event (computing) ,Real-time computing ,020206 networking & telecommunications ,020207 software engineering ,02 engineering and technology ,law.invention ,Human-Computer Interaction ,Stereophonic sound ,Sine wave ,Hardware and Architecture ,law ,Pattern recognition (psychology) ,0202 electrical engineering, electronic engineering, information engineering ,Doors ,Impulse response - Abstract
Event detection of indoor objects, including doors, has a wide variety of applications, including intruder detection, HVAC control, and surveillance of independently living elderly people. Hence, this has been the focus of multiple research projects in the UbiComp research community. Herein, we propose a method to accurately detect door events in an indoor environment, without the installation and maintenance costs of using distributed ubiquitous sensors. In particular, we recognize the events of multiple doors existing in the environment via active sound probing using a disused smartphone installed in the environment. We perform event recognition by fusing the analysis of the Doppler shift caused by the moving doors with the acoustic characteristics describing the open/close states of the doors acquired via impulse response. To accurately distinguish between the events of different doors via sound probing, our method employs the time-series analysis of the Doppler shift as well as the active sound probing using directional high-frequency sine waves and stereo sound recording. In addition, by incorporating prior knowledge about the state transitions of a door object into a recognition model, we attempt to improve the accuracy of event recognition. Moreover, our method is capable of recognizing walking activities of a person related to door events in the environment, which is necessary information for applications such as HVAC control that require information about both door events and human presence.
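The size of the Doppler shift such a system has to detect can be estimated with the standard approximation for a tone reflected off a slowly moving surface, where the shift is twice the radial speed times the tone frequency divided by the speed of sound. The sketch below applies it; the 20 kHz probe tone and the 0.5 m/s door speed are hypothetical values, since the abstract does not state them, and the function names are likewise assumptions.

```python
SPEED_OF_SOUND_MPS = 343.0  # approximate value for air at room temperature


def doppler_shift_hz(tone_hz, radial_speed_mps, sound_speed_mps=SPEED_OF_SOUND_MPS):
    """Approximate shift of a tone reflected off a surface moving toward the device.

    Positive speeds mean the surface approaches; valid when the speed is far
    below the speed of sound and the speaker and microphone are co-located.
    """
    return 2.0 * radial_speed_mps * tone_hz / sound_speed_mps


def radial_speed_mps(shift_hz, tone_hz, sound_speed_mps=SPEED_OF_SOUND_MPS):
    """Invert the approximation to estimate speed from a measured shift."""
    return shift_hz * sound_speed_mps / (2.0 * tone_hz)


shift = doppler_shift_hz(20_000.0, 0.5)
print(f"shift for an approaching door: {shift:.1f} Hz")
print(f"shift for a receding door: {doppler_shift_hz(20_000.0, -0.5):.1f} Hz")
```

Shifts of this order sit close to the probe tone, so resolving them calls for a long enough analysis window around the tone frequency.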
- Published
- 2018
39. Binaural ambiguity amplifies visual bias in sound source localization
- Author
-
Emily Jo Venskytis, Leslie Balderas, and Yi Zhou
- Subjects
Acoustics and Ultrasonics ,Auditory response ,Computer science ,Speech recognition ,media_common.quotation_subject ,030229 sport sciences ,Ambiguity ,Acoustic source localization ,01 natural sciences ,law.invention ,03 medical and health sciences ,Noise ,Stereophonic sound ,0302 clinical medicine ,Arts and Humanities (miscellaneous) ,law ,0103 physical sciences ,Auditory localization ,010301 acoustics ,Binaural recording ,media_common - Abstract
Auditory spatial perception relies on more than one spatial cue. This study investigated the effects of cue congruence on auditory localization and the extent of visual bias between two binaural cues: interaural time differences (ITDs) and interaural level differences (ILDs). Interactions between these binaural cues were manipulated by stereophonic techniques. The results show that incoherent binaural information increased auditory response noise and amplified visual bias. The analysis further suggests that although ILD is not the dominant cue for low-frequency localization, it may strengthen the position estimate by combining with the dominant ITD information to minimize estimation noise.
- Published
- 2018
40. Stereophonic Music Separation Based on Non-Negative Tensor Factorization with Cepstral Distance Regularization
- Author
-
Shogo Seki, Tomoki Toda, and Kazuya Takeda
- Subjects
Cepstral distance ,Stereophonic sound ,Tensor factorization ,law ,Applied Mathematics ,Signal Processing ,Separation (statistics) ,Electrical and Electronic Engineering ,Computer Graphics and Computer-Aided Design ,Algorithm ,Regularization (mathematics) ,law.invention ,Mathematics - Published
- 2018
41. Stereophonic Acoustic Echo Suppression for Speech Interfaces for Intelligent TV Applications
- Author
-
Jungpyo Hong
- Subjects
Noise measurement ,Computer science ,Speech recognition ,020208 electrical & electronic engineering ,Echo (computing) ,020206 networking & telecommunications ,02 engineering and technology ,Filter (signal processing) ,law.invention ,Least mean squares filter ,Stereophonic sound ,Signal-to-noise ratio ,law ,Distortion ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Electrical and Electronic Engineering - Abstract
In this paper, a unified framework for using speech interfaces with intelligent televisions (TVs) is proposed. The proposed framework is primarily based on suppressing the stereophonic acoustic echo of the TV audio sounds and adaptive tracking of the time varying acoustic transfer function (ATF) of the echo sounds. In order to effectively suppress the acoustic echoes from the TV speakers, an optimal filter with a multichannel subspace approach is imposed on noisy inputs and the second order statistics of the stereophonic echoes are adaptively estimated by tracking the time varying ATF using the normalized least mean square method. Through assessing the proposed method with a database collected in real TV watching environments, noticeable improvements were achieved in the signal to noise ratio gain, Itakura–Saito distance, and speech recognition rates.
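The normalized least mean square (NLMS) update mentioned in this abstract can be illustrated with a single-channel system identification toy. The sketch below adapts a short FIR filter until it matches a known echo path; it is not the paper's multichannel subspace method, and the echo path coefficients, step size, filter length, and function names are all assumptions for the example.

```python
import random


def fir_filter(signal, taps):
    """Convolve a signal with FIR taps, assuming zeros before the first sample."""
    out = []
    for n in range(len(signal)):
        out.append(sum(h * signal[n - k] for k, h in enumerate(taps) if n - k >= 0))
    return out


def nlms_identify(far_end, microphone, num_taps, step=0.5, eps=1e-8):
    """Estimate the echo path from the far-end signal to the microphone."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, microphone):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # residual echo after cancellation
        power = sum(h * h for h in history) + eps
        weights = [w + step * error * h / power for w, h in zip(weights, history)]
        errors.append(error)
    return weights, errors


rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(3_000)]
true_path = [0.6, -0.3, 0.1]  # hypothetical echo path
microphone = fir_filter(far_end, true_path)

weights, errors = nlms_identify(far_end, microphone, num_taps=4)
print([round(w, 6) for w in weights])
```

Dividing the update by the input power is what makes the step size insensitive to the loudness of the TV audio. With two correlated loudspeaker channels the identification problem becomes harder, which is part of what stereophonic echo methods have to address.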
- Published
- 2018
42. Sensorimotor learning with stereo auditory feedback for a brain-computer interface.
- Author
-
McCreadie, Karl, Coyle, Damien, and Prasad, Girijesh
- Subjects
*SENSORIMOTOR cortex , *COMPUTER interfaces , *BRAIN imaging , *ELECTROENCEPHALOGRAPHY , *AUDITORY perception , *FEASIBILITY studies , *PILOT projects - Abstract
Motor imagery can be used to modulate sensorimotor rhythms (SMR) enabling detection of voltage fluctuations on the surface of the scalp using electroencephalographic electrodes. Feedback is essential in learning to modulate SMR for non-muscular communication using a brain-computer interface (BCI). A BCI not reliant upon the visual modality not only releases the visual channel for other uses but also offers an attractive means of communication for the physically impaired who are also blind or vision impaired. This study demonstrates the feasibility of replacing the traditional visual feedback modality with stereo auditory feedback. Results from a pilot study were used to select the most appropriate sounds for auditory feedback based on three options: broadband noise and two anechoic instrument samples. Subsequently, an SMR BCI was used to examine the effect on sensorimotor learning with broadband noise utilising a modified stereophonic presentation method. Twenty participants split into equal groups took part in ten sessions. The visual group performed best initially but did not improve over time whilst the auditory group improved as the study progressed. The results demonstrate the feasibility of using stereophonic auditory feedback with broadband noise as opposed to other auditory feedback presentation methods and sounds which are less intuitive. [ABSTRACT FROM AUTHOR]
- Published
- 2013
43. SVD-Based Adaptive QIM Watermarking on Stereo Audio Signals
- Author
-
Hong-Goo Kang, Min-Jae Hwang, JeeSok Lee, and Mi-Suk Lee
- Subjects
Signal processing ,Audio signal ,business.industry ,Computer science ,020207 software engineering ,Watermark ,02 engineering and technology ,Computer Science Applications ,Matrix decomposition ,law.invention ,Stereophonic sound ,law ,Frequency domain ,Signal Processing ,Singular value decomposition ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Digital watermarking ,Digital audio - Abstract
This paper proposes a blind digital audio watermarking algorithm that utilizes quantization index modulation (QIM) and the singular value decomposition (SVD) of stereo audio signals. Conventional SVD-based blind audio watermarking algorithms lack physical interpretation since the construction of the input matrix for the SVD is heuristically defined. However, in the proposed approach, because the SVD is directly applied to the stereo input signals, the resulting decomposed elements convey a conceptually meaningful interpretation of the original audio signal. As the proposed approach effectively utilizes the ratio of singular values, the embedded watermark is highly imperceptible and robust against volumetric scaling attacks; most QIM-based watermarking schemes are vulnerable to these types of attacks. Experimental results under well-known practical attacks, such as compression, resampling, and various types of signal processing, confirm that the proposed algorithm performs well compared to conventional audio watermarking algorithms.
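Quantization index modulation itself is easy to show on a single number: each bit selects one of two interleaved quantization grids, and the decoder picks whichever grid lies closer. The sketch below is plain scalar QIM and not the paper's SVD-based scheme; the function names, the step size of 0.5, and the sample values are assumptions.

```python
import math


def qim_embed(value, bit, step):
    """Quantize value onto the grid selected by bit (grids are offset by step / 2)."""
    offset = bit * step / 2.0
    return step * math.floor((value - offset) / step + 0.5) + offset


def qim_extract(value, step):
    """Return the bit whose grid lies closest to the received value."""
    distance_0 = abs(value - qim_embed(value, 0, step))
    distance_1 = abs(value - qim_embed(value, 1, step))
    return 0 if distance_0 <= distance_1 else 1


step = 0.5
host = [0.13, 1.7, -2.4, 3.05]
bits = [0, 1, 1, 0]

marked = [qim_embed(v, b, step) for v, b in zip(host, bits)]
noisy = [m + 0.1 for m in marked]  # disturbance smaller than step / 4

print(marked)
print([qim_extract(v, step) for v in noisy])
```

Extraction survives any disturbance smaller than a quarter of the step, but multiplying the marked values by a gain moves them off both grids. That weakness is what the abstract says is avoided by quantizing a ratio of singular values, a step not reproduced here.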
- Published
- 2018
44. The art of stereo reproduction—A test engineer's perspective
- Author
-
James Weir
- Subjects
Acoustics and Ultrasonics ,Computer science ,Reproduction (economics) ,Perspective (graphical) ,people.profession ,Test engineer ,Odds ,law.invention ,Musical acoustics ,Stereophonic sound ,Arts and Humanities (miscellaneous) ,Human–computer interaction ,law ,Psychoacoustics ,Dialog box ,people - Abstract
Stereophonic reproduction is an art form often at odds with analytical testing. The author presents a breakdown of the aspects of stereo reproduction, from musical acoustics and electroacoustics to room acoustic assessment and the psychoacoustics used in the assessment and enjoyment of this common art form. The goal is to provide a holistic overview and encourage discussion and improvements in the processes used in the production of the art, as well as the analysis and dialog of the consumers of the art.
- Published
- 2021
45. Exploring Data Sonification to Enable, Enhance, and Accelerate the Analysis of Big, Noisy, and Multi-Dimensional Data
- Author
-
B. Garcia, G. Foran, J. Cooke, W. Díaz-Merced, and J. Hannam
- Subjects
Exploit ,Computer science ,business.industry ,Big data ,Astronomy and Astrophysics ,Virtual reality ,01 natural sciences ,law.invention ,Identification (information) ,Stereophonic sound ,Software ,Space and Planetary Science ,Sonification ,Human–computer interaction ,law ,0103 physical sciences ,Data analysis ,business ,010303 astronomy & astrophysics ,010301 acoustics - Abstract
We explore the properties of sound and human sound recognition as a means to enhance and accelerate visual-only data analysis methods. The aim of this work is to enable and improve the analysis of large data sets, data requiring rapid analysis, multi-dimensional data, and signal detection in data with low signal-to-noise ratio. We present a prototype tool, StarSound, to sonify data such as astronomical transient light curves, spectra, and power spectra. Stereophonic sound is used to ‘visualise’ and localise the data under examination, and 3-D sound is discussed in conjunction with virtual reality technology, as a means to enhance analysis efficiency and efficacy, including rapid data assessment and training machine learning software. In addition, we explore the use of higher-order harmonics as a means to examine simultaneously multi-dimensional data sets. Such an approach can allow the data to be interpreted in a holistic manner and facilitates the discovery of previously unseen connections and relationships. Furthermore, we exploit the capability of the human brain for selective or focused hearing that enables the identification of desired signals in noisy data, or amidst similar or more significant signals. Finally, we provide research examples that benefit directly from data sonification. The work presented here aims to help tackle the challenges of the upcoming era of Big Data and help optimise, speed up and expand aspects of data analysis requiring human interaction.
- Published
- 2017
46. DEMODULATOR THAT CONVERTS ENCODER STEREO SIGNAL INTO L AND R AUDIO SIGNAL
- Author
-
Faisol Ahmad and Ibrahim Ashari
- Subjects
Audio signal ,Sideband ,business.industry ,Computer science ,Radio equipment ,Electrical engineering ,Process (computing) ,Signal ,law.invention ,Stereophonic sound ,Hardware_GENERAL ,law ,Demodulation ,business ,Encoder - Abstract
As technology develops, a wide range of radio equipment is readily available on the market. Inside an FM radio receiver there is a stereo decoder module, which recovers the stereo audio signals (L+R) and (L-R) from the radio signal that was previously modulated and split apart. Many stereo decoders are already built into FM radio receivers. On this premise, the problem studied here is how to design a stereo decoder for a stereo FM radio receiver that performs the demodulation and delivers the L+R audio signal to the receiver. This requires a demodulator that converts the encoded stereo signal into L and R audio signals. Testing showed that the sideband frequencies produced by the DSB-SC demodulator process on the R and L channels still differed from the expected frequency values by an average of 1.416%. This was due to the oscillator frequency of the function generator.
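How a stereo decoder gets L and R back from the encoded signal can be sketched with the pilot-tone multiplex used in FM broadcasting: L+R at baseband, a 19 kHz pilot, and L-R as a DSB-SC signal on a 38 kHz subcarrier. The example below is an idealized illustration and not the circuit described in the abstract. It assumes a perfectly synchronized 38 kHz carrier, audio held constant over a 1 ms block so that a plain average can stand in for the low-pass filter, and illustrative gain values.

```python
import math

SAMPLES_PER_PILOT_CYCLE = 16                  # sample rate is 16 x 19 kHz (assumed)
BLOCK = SAMPLES_PER_PILOT_CYCLE * 19          # 19 pilot cycles, i.e. 1 ms
AUDIO_GAIN = 0.45                             # illustrative level for L+R and L-R
PILOT_GAIN = 0.1                              # illustrative pilot level


def encode_block(left, right):
    """Build one block of the stereo multiplex for constant left/right values."""
    block = []
    for n in range(BLOCK):
        theta = 2.0 * math.pi * n / SAMPLES_PER_PILOT_CYCLE  # 19 kHz pilot phase
        block.append(
            AUDIO_GAIN * (left + right)
            + PILOT_GAIN * math.sin(theta)
            + AUDIO_GAIN * (left - right) * math.sin(2.0 * theta)  # 38 kHz DSB-SC
        )
    return block


def decode_block(block):
    """Recover left/right by averaging (L+R) and synchronously demodulating (L-R)."""
    total = 0.0
    demodulated = 0.0
    for n, sample in enumerate(block):
        theta = 2.0 * math.pi * n / SAMPLES_PER_PILOT_CYCLE
        total += sample
        demodulated += sample * 2.0 * math.sin(2.0 * theta)
    sum_lr = total / len(block) / AUDIO_GAIN
    diff_lr = demodulated / len(block) / AUDIO_GAIN
    return (sum_lr + diff_lr) / 2.0, (sum_lr - diff_lr) / 2.0


left_out, right_out = decode_block(encode_block(0.8, -0.2))
print(round(left_out, 6), round(right_out, 6))
```

If the regenerated 38 kHz carrier drifts in frequency or phase, the recovered L-R term shrinks or distorts and channel separation suffers. An oscillator offset like the one the abstract reports would show up in exactly this way.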
- Published
- 2020
47. Localization Uncertainty In Time-Amplitude Stereophonic Reproduction
- Author
-
Toon van Waterschoot, Marc Moonen, Enzo De Sena, Huseyin Hacihabiboglu, and Zoran Cvetkovic
- Subjects
Stereophony ,Acoustics and Ultrasonics ,Computer science ,recording and reproduction ,01 natural sciences ,law.invention ,030507 speech-language pathology & audiology ,03 medical and health sciences ,symbols.namesake ,Position (vector) ,law ,Audio and Speech Processing (eess.AS) ,0103 physical sciences ,Computer Science (miscellaneous) ,FOS: Electrical engineering, electronic engineering, information engineering ,Active listening ,Electrical and Electronic Engineering ,010301 acoustics ,Sweet spot ,Pearson product-moment correlation coefficient ,Computational Mathematics ,Stereophonic sound ,Amplitude ,auditory modeling ,symbols ,localization uncertainty ,panning ,0305 other medical science ,Algorithm ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
This article studies the effects of inter-channel time and level differences in stereophonic reproduction on perceived localization uncertainty, which is defined as how difficult it is for a listener to tell where a sound source is located. To this end, a computational model of localization uncertainty is proposed first. The model calculates inter-aural time and level difference cues, and compares them to those associated with free-field point-like sources. The comparison is carried out using a particular distance functional that replicates the increased uncertainty observed experimentally with inconsistent inter-aural time and level difference cues. The model is validated by formal listening tests, achieving a Pearson correlation of 0.99. The model is then used to predict localization uncertainty for stereophonic setups and a listener in central and off-central positions. Results show that amplitude methods achieve a slightly lower localization uncertainty for a listener positioned exactly in the center of the sweet spot. As soon as the listener moves away from that position, the situation reverses, with time-amplitude methods achieving a lower localization uncertainty.
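To make the two inter-channel cues concrete, the sketch below applies a level difference and a time difference to a mono signal to produce a stereo pair. It is a generic illustration under stated assumptions, not the panning laws or the perceptual model evaluated in the article: the function name is hypothetical, the gains are normalized to constant power, the delay is an integer number of samples, and positive values of both parameters are taken to favor the left channel.

```python
import math


def pan_time_amplitude(mono, icld_db, ictd_samples):
    """Return (left, right) channels with the given inter-channel differences.

    icld_db:      left level minus right level, in dB (assumed sign convention).
    ictd_samples: delay applied to the right channel, in whole samples (>= 0).
    """
    if ictd_samples < 0:
        raise ValueError("ictd_samples must be non-negative in this sketch")
    ratio = 10.0 ** (icld_db / 20.0)           # amplitude ratio g_left / g_right
    g_right = 1.0 / math.sqrt(1.0 + ratio ** 2)
    g_left = ratio * g_right                   # so g_left**2 + g_right**2 == 1
    pad = [0.0] * ictd_samples
    left = [g_left * x for x in mono] + pad    # pad keeps both channels equal length
    right = pad + [g_right * x for x in mono]  # right channel lags the left
    return left, right


# Pure amplitude panning sets ictd_samples to 0; time-amplitude panning uses both cues.
left, right = pan_time_amplitude([1.0, 0.5, 0.25], icld_db=3.0, ictd_samples=2)
```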
- Published
- 2020
48. ESResNet: Environmental Sound Classification Based on Visual Domain Models
- Author
-
Andrey Guzhov, Andreas Dengel, Jörn Hees, and Federico Raue
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Sound (cs.SD) ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Feature extraction ,Computer Science - Computer Vision and Pattern Recognition ,02 engineering and technology ,010501 environmental sciences ,computer.software_genre ,01 natural sciences ,Field (computer science) ,Computer Science - Sound ,Domain (software engineering) ,law.invention ,Machine Learning (cs.LG) ,law ,Audio and Speech Processing (eess.AS) ,0202 electrical engineering, electronic engineering, information engineering ,FOS: Electrical engineering, electronic engineering, information engineering ,0105 earth and related environmental sciences ,business.industry ,020206 networking & telecommunications ,Domain model ,Visualization ,Time–frequency analysis ,Stereophonic sound ,Spectrogram ,Data mining ,Artificial intelligence ,business ,computer ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Environmental Sound Classification (ESC) is an active research area in the audio domain and has seen a lot of progress in the past years. However, many of the existing approaches achieve high accuracy by relying on domain-specific features and architectures, making it harder to benefit from advances in other fields (e.g., the image domain). Additionally, some of the past successes have been attributed to a discrepancy in how results are evaluated (i.e., on unofficial splits of the UrbanSound8K (US8K) dataset), distorting the overall progression of the field. The contribution of this paper is twofold. First, we present a model that is inherently compatible with mono and stereo sound inputs. Our model is based on simple log-power Short-Time Fourier Transform (STFT) spectrograms and combines them with several well-known approaches from the image domain (i.e., ResNet, Siamese-like networks and attention). We investigate the influence of cross-domain pre-training and architectural changes, and evaluate our model on standard datasets. We find that our model outperforms all previously known approaches in a fair comparison by achieving accuracies of 97.0 % (ESC-10), 91.5 % (ESC-50) and 84.2 % / 85.4 % (US8K mono / stereo). Second, we provide a comprehensive overview of the actual state of the field, by differentiating several previously reported results on the US8K dataset between official or unofficial splits. For better reproducibility, our code (including any re-implementations) is made available. Comment: 8 pages, 4 figures; submitted to ICPR 2020
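The input representation named in this abstract, a log-power STFT spectrogram, can be illustrated with a short standard-library sketch. The frame length, hop size, Hann window, decibel scaling, and the small `eps` offset are assumptions chosen for readability; the abstract does not state the paper's actual settings, and a practical implementation would use an FFT rather than the direct DFT shown here.

```python
import cmath
import math


def log_power_stft(signal, frame_len=32, hop=16, eps=1e-10):
    """Return a list of frames, each holding log-power values (dB) per frequency bin."""
    # Periodic Hann window (an assumed choice of analysis window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):  # non-negative frequency bins only
            x_k = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(10.0 * math.log10(abs(x_k) ** 2 + eps))
        frames.append(row)
    return frames


# A tone centered on bin 4 of a 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spectrogram = log_power_stft(tone)
```

For stereo input, one plausible arrangement is to compute a spectrogram per channel and process the two as separate inputs, but how the paper handles the channels is not described in the abstract.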
- Published
- 2020
49. Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation
- Author
-
Hang Zhou, Dahua Lin, Xiaogang Wang, Xudong Xu, and Ziwei Liu
- Subjects
Network architecture ,Ambisonics ,Computer science ,business.industry ,020207 software engineering ,02 engineering and technology ,010501 environmental sciences ,01 natural sciences ,Spatialization ,law.invention ,Stereophonic sound ,law ,0202 electrical engineering, electronic engineering, information engineering ,Source separation ,Code (cryptography) ,Computer vision ,Pyramid (image processing) ,Artificial intelligence ,business ,Binaural recording ,0105 earth and related environmental sciences - Abstract
Stereophonic audio is an indispensable ingredient to enhance human auditory experience. Recent research has explored the usage of visual information as guidance to generate binaural or ambisonic audio from mono audio with stereo supervision. However, this fully supervised paradigm suffers from an inherent drawback: the recording of stereophonic audio usually requires delicate devices that are too expensive for wide accessibility. To overcome this challenge, we propose to leverage the vastly available mono data to facilitate the generation of stereophonic audio. Our key observation is that the task of visually indicated audio separation also maps independent audio sources to their corresponding visual positions, which shares a similar objective with stereophonic audio generation. We integrate both stereo generation and source separation into a unified framework, Sep-Stereo, by considering source separation as a particular type of audio spatialization. Specifically, a novel associative pyramid network architecture is carefully designed for audio-visual feature fusion. Extensive experiments demonstrate that our framework can improve the stereophonic audio generation results while performing accurate sound separation with a shared backbone (code, models and demo video are available at https://hangz-nju-cuhk.github.io/projects/Sep-Stereo).
- Published
- 2020
50. Design of a Tangible Programming Tool for Students with Visual Impairments and Low Vision
- Author
-
Enrico Pontelli and Emmanuel Utreras
- Subjects
Computer science ,05 social sciences ,Visual impairment ,020207 software engineering ,02 engineering and technology ,Braille ,law.invention ,Task (project management) ,Microcontroller ,Identification (information) ,Stereophonic sound ,law ,Human–computer interaction ,Arduino ,0202 electrical engineering, electronic engineering, information engineering ,medicine ,0501 psychology and cognitive sciences ,Graphics ,medicine.symptom ,050107 human factors - Abstract
This article presents the design of a tangible tool for teaching basic programming concepts to students with visual impairments and low vision. To meet the preliminary requirements of this ongoing project (tangible input, cost efficiency, and minimal maintenance), the paper describes the design of the prototype's hardware using inexpensive materials, such as Lego blocks, 3.5 mm stereo audio jack connectors, relays, resistors, and an Arduino Mega microcontroller. To identify the Lego blocks that represent the code instructions, the system provides three different methods of identification: color, 3D printed labels, and braille labels. To acquire feedback and validate this prototype, a preliminary study with nine participants with visual impairment and low vision was carried out. All participants completed the task successfully, provided feedback about the prototype, and recommended it for teaching basic programming concepts. Currently, the project is focused on the implementation of music as an output method of programs created by users. The objective is to create a programming tool completely independent of visual graphics or information.
- Published
- 2020