Author: "Wang, Hsin-Min" / Publication Year Range: This year - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wang, Hsin-Min"' showing total 21 results


1. Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing

Author: Ren, Wenze, Hung, Kuo-Hsuan, Chao, Rong, Li, YouJin, Wang, Hsin-Min, and Tsao, Yu
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: This paper addresses the prevalent issue of incorrect speech output in audio-visual speech enhancement (AVSE) systems, which is often caused by poor video quality and mismatched training and test data. We introduce a post-processing classifier (PPC) to rectify these erroneous outputs, ensuring that the enhanced speech corresponds accurately to the intended speaker. We also adopt a mixup strategy in PPC training to improve its robustness. Experimental results on the AVSE-challenge dataset show that integrating PPC into the AVSE model can significantly improve AVSE performance, and combining PPC with the AVSE model trained with permutation invariant training (PIT) yields the best performance. The proposed method substantially outperforms the baseline model by a large margin. This work highlights the potential for broader applications across various modalities and architectures, providing a promising direction for future research in this field., Comment: The 27th International Conference of the Oriental COCOSDA
Published: 2024

2. Channel-Aware Domain-Adaptive Generative Adversarial Network for Robust Speech Recognition

Author: Wang, Chien-Chun, Chen, Li-Wei, Chou, Cheng-Kang, Lee, Hung-Shin, Chen, Berlin, and Wang, Hsin-Min
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: While pre-trained automatic speech recognition (ASR) systems demonstrate impressive performance on matched domains, their performance often degrades when confronted with channel mismatch stemming from unseen recording environments and conditions. To mitigate this issue, we propose a novel channel-aware data simulation method for robust ASR training. Our method harnesses the synergistic power of channel-extractive techniques and generative adversarial networks (GANs). We first train a channel encoder capable of extracting embeddings from arbitrary audio. On top of this, channel embeddings are extracted using a minimal amount of target-domain data and used to guide a GAN-based speech synthesizer. This synthesizer generates speech that faithfully preserves the phonetic content of the input while mimicking the channel characteristics of the target domain. We evaluate our method on the challenging Hakka Across Taiwan (HAT) and Taiwanese Across Taiwan (TAT) corpora, achieving relative character error rate (CER) reductions of 20.02% and 9.64%, respectively, compared to the baselines. These results highlight the efficacy of our channel-aware data simulation method for bridging the gap between source- and target-domain acoustics., Comment: Submitted to ICASSP 2025
Published: 2024

3. Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement

Author: Ren, Wenze, Wu, Haibin, Lin, Yi-Cheng, Chen, Xuanjun, Chao, Rong, Hung, Kuo-Hsuan, Li, You-Jin, Ting, Wen-Yuan, Wang, Hsin-Min, and Tsao, Yu
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: In multichannel speech enhancement, effectively capturing spatial and spectral information across different microphones is crucial for noise reduction. Traditional methods, such as CNN or LSTM, attempt to model the temporal dynamics of full-band and sub-band spectral and spatial features. However, these approaches face limitations in fully modeling complex temporal dependencies, especially in dynamic acoustic environments. To overcome these challenges, we modify the current advanced model McNet by introducing an improved version of Mamba, a state-space model, and further propose MCMamba. MCMamba has been completely reengineered to integrate full-band and narrow-band spatial information with sub-band and full-band spectral features, providing a more comprehensive approach to modeling spatial and spectral information. Our experimental results demonstrate that MCMamba significantly improves the modeling of spatial and spectral features in multichannel speech enhancement, outperforming McNet and achieving state-of-the-art performance on the CHiME-3 dataset. Additionally, we find that Mamba performs exceptionally well in modeling spectral information.
Published: 2024

4. A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models

Author: Zezario, Ryandhimas E., Siniscalchi, Sabato M., Wang, Hsin-Min, and Tsao, Yu
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: This work investigates two strategies for zero-shot non-intrusive speech assessment leveraging large language models. First, we explore the audio analysis capabilities of GPT-4o. Second, we propose GPT-Whisper, which uses Whisper as an audio-to-text module and evaluates the naturalness of text via targeted prompt engineering. We evaluate assessment metrics predicted by GPT-4o and GPT-Whisper examining their correlations with human-based quality and intelligibility assessments, and character error rate (CER) of automatic speech recognition. Experimental results show that GPT-4o alone is not effective for audio analysis; whereas, GPT-Whisper demonstrates higher prediction, showing moderate correlation with speech quality and intelligibility, and high correlation with CER. Compared to supervised non-intrusive neural speech assessment models, namely MOS-SSL and MTI-Net, GPT-Whisper yields a notably higher Spearman's rank correlation with the CER of Whisper. These findings validate GPT-Whisper as a reliable method for accurate zero-shot speech assessment without requiring additional training data (speech data and corresponding assessment scores).
Published: 2024

5. Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages

Author: Cheng, Yao-Fei, Chen, Li-Wei, Lee, Hung-Shin, and Wang, Hsin-Min
Subjects: Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This study investigates the efficacy of data augmentation techniques for low-resource automatic speech recognition (ASR), focusing on two endangered Austronesian languages, Amis and Seediq. Recognizing the potential of self-supervised learning (SSL) in low-resource settings, we explore the impact of data volume on the continued pre-training of SSL models. We propose a novel data-selection scheme leveraging a multilingual corpus to augment the limited target language data. This scheme utilizes a language classifier to extract utterance embeddings and employs one-class classifiers to identify utterances phonetically and phonologically proximate to the target languages. Utterances are ranked and selected based on their decision scores, ensuring the inclusion of highly relevant data in the SSL-ASR pipeline. Our experimental results demonstrate the effectiveness of this approach, yielding substantial improvements in ASR performance for both Amis and Seediq. These findings underscore the feasibility and promise of data augmentation through cross-lingual transfer learning for low-resource language ASR.
Published: 2024

6. The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction

Author: Huang, Wen-Chin, Fu, Szu-Wei, Cooper, Erica, Zezario, Ryandhimas E., Toda, Tomoki, Wang, Hsin-Min, Yamagishi, Junichi, and Tsao, Yu
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We present the third edition of the VoiceMOS Challenge, a scientific initiative designed to advance research into automatic prediction of human speech ratings. There were three tracks. The first track was on predicting the quality of "zoomed-in" high-quality samples from speech synthesis systems. The second track was to predict ratings of samples from singing voice synthesis and voice conversion with a large variety of systems, listeners, and languages. The third track was semi-supervised quality prediction for noisy, clean, and enhanced speech, where a very small amount of labeled training data was provided. Among the eight teams from both academia and industry, we found that many were able to outperform the baseline systems. Successful techniques included retrieval-based methods and the use of non-self-supervised representations like spectrograms and pitch histograms. These results showed that the challenge has advanced the field of subjective speech rating prediction., Comment: Accepted to SLT2024
Published: 2024

7. Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation

Author: Wang, Chien-Chun, Chen, Li-Wei, Lee, Hung-Shin, Chen, Berlin, and Wang, Hsin-Min
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Cross-domain speech enhancement (SE) is often faced with severe challenges due to the scarcity of noise and background information in an unseen target domain, leading to a mismatch between training and test conditions. This study puts forward a novel data simulation method to address this issue, leveraging noise-extractive techniques and generative adversarial networks (GANs) with only limited target noisy speech data. Notably, our method employs a noise encoder to extract noise embeddings from target-domain data. These embeddings aptly guide the generator to synthesize utterances acoustically fitted to the target domain while authentically preserving the phonetic content of the input clean speech. Furthermore, we introduce the notion of dynamic stochastic perturbation, which can inject controlled perturbations into the noise embeddings during inference, thereby enabling the model to generalize well to unseen noise conditions. Experiments on the VoiceBank-DEMAND benchmark dataset demonstrate that our domain-adaptive SE method outperforms an existing strong baseline based on data simulation., Comment: Accepted to IEEE SLT 2024
Published: 2024

8. SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models

Author: Yin, Chun, Chi, Tai-Shih, Tsao, Yu, and Wang, Hsin-Min
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning, Computer Science - Sound
Abstract: Representations from pre-trained speech foundation models (SFMs) have shown impressive performance in many downstream tasks. However, the potential benefits of incorporating pre-trained SFM representations into speaker voice similarity assessment have not been thoroughly investigated. In this paper, we propose SVSNet+, a model that integrates pre-trained SFM representations to improve performance in assessing speaker voice similarity. Experimental results on the Voice Conversion Challenge 2018 and 2020 datasets show that SVSNet+ incorporating WavLM representations shows significant improvements compared to baseline models. In addition, while fine-tuning WavLM with a small dataset of the downstream task does not improve performance, using the same dataset to learn a weighted-sum representation of WavLM can substantially improve performance. Furthermore, when WavLM is replaced by other SFMs, SVSNet+ still outperforms the baseline models and exhibits strong generalization ability., Comment: Accepted to INTERSPEECH 2024
Published: 2024

9. Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes

Author: Hashmi, Ammarah, Shahzad, Sahibzada Adil, Lin, Chia-Wen, Tsao, Yu, and Wang, Hsin-Min
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, Computer Science - Machine Learning, Computer Science - Multimedia
Abstract: The emergence of contemporary deepfakes has attracted significant attention in machine learning research, as artificial intelligence (AI) generated synthetic media increases the incidence of misinterpretation and is difficult to distinguish from genuine content. Currently, machine learning techniques have been extensively studied for automatically detecting deepfakes. However, human perception has been less explored. Malicious deepfakes could ultimately cause public and social problems. Can we humans correctly perceive the authenticity of the content of the videos we watch? The answer is obviously uncertain; therefore, this paper aims to evaluate the human ability to discern deepfake videos through a subjective study. We present our findings by comparing human observers to five state-of-the-art audiovisual deepfake detection models. To this end, we used gamification concepts to provide 110 participants (55 native English speakers and 55 non-native English speakers) with a web-based platform where they could access a series of 40 videos (20 real and 20 fake) to determine their authenticity. Each participant performed the experiment twice with the same 40 videos in different random orders. The videos are manually selected from the FakeAVCeleb dataset. We found that all AI models performed better than humans when evaluated on the same 40 videos. The study also reveals that while deception is not impossible, humans tend to overestimate their detection capabilities. Our experimental results may help benchmark human versus machine performance, advance forensics analysis, and enable adaptive countermeasures.
Published: 2024

10. SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data

Author: Wang, Hsuan-Fu, Shih, Yi-Jen, Chang, Heng-Jui, Berry, Layne, Peng, Puyuan, Lee, Hung-yi, Wang, Hsin-Min, and Harwath, David
Subjects: Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: The recently proposed visually grounded speech model SpeechCLIP is an innovative framework that bridges speech and text through images via CLIP without relying on text transcription. On this basis, this paper introduces two extensions to SpeechCLIP. First, we apply the Continuous Integrate-and-Fire (CIF) module to replace a fixed number of CLS tokens in the cascaded architecture. Second, we propose a new hybrid architecture that merges the cascaded and parallel architectures of SpeechCLIP into a multi-task learning framework. Our experimental evaluation is performed on the Flickr8k and SpokenCOCO datasets. The results show that in the speech keyword extraction task, the CIF-based cascaded SpeechCLIP model outperforms the previous cascaded SpeechCLIP model using a fixed number of CLS tokens. Furthermore, through our hybrid architecture, cascaded task learning boosts the performance of the parallel branch in image-speech retrieval tasks., Comment: Accepted to ICASSP 2024, Self-supervision in Audio, Speech, and Beyond (SASB) workshop
Published: 2024

11. HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids

Author: Wisnu, Dyah A. M. G., Rini, Stefano, Zezario, Ryandhimas E., Wang, Hsin-Min, and Tsao, Yu
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning, Computer Science - Sound
Abstract: This paper introduces HAAQI-Net, a non-intrusive deep learning model for music audio quality assessment tailored for hearing aid users. Unlike traditional methods like the Hearing Aid Audio Quality Index (HAAQI), which rely on intrusive comparisons to a reference signal, HAAQI-Net offers a more accessible and efficient alternative. Using a bidirectional Long Short-Term Memory (BLSTM) architecture with attention mechanisms and features from the pre-trained BEATs model, HAAQI-Net predicts HAAQI scores directly from music audio clips and hearing loss patterns. Results show HAAQI-Net's effectiveness, with predicted scores achieving a Linear Correlation Coefficient (LCC) of 0.9368, a Spearman's Rank Correlation Coefficient (SRCC) of 0.9486, and a Mean Squared Error (MSE) of 0.0064, reducing inference time from 62.52 seconds to 2.54 seconds. Although effective, feature extraction via the large BEATs model incurs computational overhead. To address this, a knowledge distillation strategy creates a student distillBEATs model, distilling information from the teacher BEATs model during HAAQI-Net training, reducing required parameters. The distilled HAAQI-Net maintains strong performance with an LCC of 0.9071, an SRCC of 0.9307, and an MSE of 0.0091, while reducing parameters by 75.85% and inference time by 96.46%. This reduction enhances HAAQI-Net's efficiency and scalability, making it viable for real-world music audio quality assessment in hearing aid settings. This work also opens avenues for further research into optimizing deep learning models for specific applications, contributing to audio signal processing and quality assessment by providing insights into developing efficient and accurate models for practical applications in hearing aid technology.
Published: 2024

12. Multi-Modal Pedestrian Crossing Intention Prediction with Transformer-Based Model.

Author: Wang, Ting-Wei, Lai, Shang-Hong, Wang, Jia-Ching, Wang, Hsin-Min, Peng, Wen-Hsiao, and Yeh, Chia-Hung
Subjects: DRIVER assistance systems, COMPUTER vision, TRAFFIC safety, PREDICTION models, INFORMATION resources
Abstract: Pedestrian crossing intention prediction based on computer vision plays a pivotal role in enhancing the safety of autonomous driving and advanced driver assistance systems. In this paper, we present a novel multi-modal pedestrian crossing intention prediction framework leveraging the transformer model. By integrating diverse sources of information and leveraging the transformer's sequential modeling and parallelization capabilities, our system accurately predicts pedestrian crossing intentions. We introduce a novel representation of traffic environment data and incorporate lifted 3D human pose and head orientation data to enhance the model's understanding of pedestrian behavior. Experimental results demonstrate the state-of-the-art accuracy of our proposed system on benchmark datasets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

13. End-to-End Singing Transcription Based on CTC and HSMM Decoding with a Refined Score Representation.

Author: Deng, Tengyu, Nakamura, Eita, Nishikimi, Ryo, Yoshii, Kazuyoshi, Wang, Jia-Ching, Wang, Hsin-Min, Peng, Wen-Hsiao, and Yeh, Chia-Hung
Subjects: ARTIFICIAL neural networks, MUSIC scores, PROBLEM solving, SINGING, PRIOR learning
Abstract: This paper describes an end-to-end automatic singing transcription (AST) method that translates a music audio signal containing a vocal part into a symbolic musical score of sung notes. A common approach to sequence-to-sequence learning for this problem is to use the connectionist temporal classification (CTC), where a target score is represented as a sequence of notes with discrete pitches and note values. However, if the note value of some note is incorrectly estimated, the score times of the following notes are estimated incorrectly and the metrical structure of the estimated score collapses. To solve this problem, we propose a refined score representation using metrical positions of note onsets. To decode a musical score from the output of a deep neural network (DNN), we use a hidden semi-Markov model (HSMM) that incorporates prior knowledge about musical scores and temporal fluctuation in human performance. We show that the proposed method achieves the state-of-the-art performance and confirm the efficacy of the refined score representation and the decoding method. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

14. Estimating 3D Hand Poses and Shapes from Silhouettes.

Author: Chang, Li-Jen, Liao, Yu-Cheng, Lin, Chia-Hui, Yang-Mao, Shys-Fang, Chen, Hwann-Tzong, Wang, Jia-Ching, Wang, Hsin-Min, Peng, Wen-Hsiao, and Yeh, Chia-Hung
Subjects: SILHOUETTES, ANNOTATIONS, RECORDING & registration, FORECASTING
Abstract: We present Mask2Hand, a self-trainable method for predicting 3D hand pose and shape from a single 2D binary silhouette. Without additional manual annotations, our method uses differentiable rendering to project 3D estimations onto the 2D silhouette. A tailored loss function, applied between the rendered and input silhouettes, provides a self-guidance mechanism during end-to-end optimization, which constrains global mesh registration and hand pose estimation. Our experiments show that Mask2Hand, using only a binary mask input, achieves accuracy comparable to state-of-the-art methods requiring RGB or depth inputs on both unaligned and aligned datasets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

15. Meta Soft Prompting and Learning.

Author: Chien, Jen-Tzung, Chen, Ming-Yen, Lee, Ching-hsien, Xue, Jing-Hao, Wang, Jia-Ching, Wang, Hsin-Min, Peng, Wen-Hsiao, and Yeh, Chia-Hung
Subjects: LANGUAGE models, SOFT sets, NATURAL languages, LANGUAGE attrition, CLASSIFICATION
Abstract: Traditionally, either applying the hard prompt for sentences by handcrafting the prompt templates or directly optimizing the soft or continuous prompt may not sufficiently generalize for unseen domain data. This paper presents a parameter efficient learning for domain-agnostic soft prompt which is developed for few-shot unsupervised domain adaptation. A pre-trained language model (PLM) is frozen and utilized to extract knowledge for unseen domains in various language understanding tasks. The meta learning and optimization over a set of trainable soft tokens is performed by minimizing the cross-entropy loss for masked language model from support and query data in source and target domains, respectively, where the masked tokens for text category and random masking are predicted. The meta soft prompt is learned through a doubly-looped optimization for individual learners and a meta learner when implementing the unsupervised domain adaptation. The PLM is then closely adapted to compensate the domain shift in a target domain. The domain adaptation loss and the prompt-based classification loss are jointly minimized through meta learning. The experiments on multi-domain natural language understanding illustrate the merit of the proposed meta soft prompt in pre-trained language modeling under few-shot setting. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

16. A Lightweight Enhancement Approach for Real-Time Semantic Segmentation by Distilling Rich Knowledge from Pre-Trained Vision-Language Model.

Author: Lin, Chia-Yi, Chen, Jun-Cheng, Wu, Ja-Ling, Wang, Jia-Ching, Wang, Hsin-Min, Peng, Wen-Hsiao, and Yeh, Chia-Hung
Subjects: LEARNING strategies, SPINE
Abstract: In this work, we propose a lightweight approach to enhance real-time semantic segmentation by leveraging the pre-trained vision-language models, specifically utilizing the text encoder of Contrastive Language-Image Pretraining (CLIP) to generate rich semantic embeddings for text labels. Then, our method distills this textual knowledge into the segmentation model, integrating the image and text embeddings to align visual and textual information. Additionally, we implement learnable prompt embeddings for better class-specific semantic comprehension. We propose a two-stage training strategy for efficient learning: the segmentation backbone initially learns from fixed text embeddings and subsequently optimizes prompt embeddings to streamline the learning process. The extensive evaluations and ablation studies validate our approach's ability to effectively improve the semantic segmentation model's performance over the compared methods. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

17. Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes

Author: Hashmi, Ammarah, primary, Shahzad, Sahibzada Adil, additional, Lin, Chia-Wen, additional, Tsao, Yu, additional, and Wang, Hsin-Min, additional
Published: 2024
Full Text: View/download PDF

18. Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model

Author: Zezario, Ryandhimas E., primary, Brian Bai, Bo-Ren, additional, Fuh, Chiou-Shann, additional, Wang, Hsin-Min, additional, and Tsao, Yu, additional
Published: 2024
Full Text: View/download PDF

19. Editorial for Special Issue on Invited Papers from APSIPA ASC 2023.

Author: Wang, Jia-Ching, Wang, Hsin-Min, Peng, Wen-Hsiao, and Yeh, Chia-Hung
Abstract: Editorial for Special Issue on Invited Papers from APSIPA ASC 2023 [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. Mapping potentially inappropriate medications in older adults using the Anatomical Therapeutic Chemical (ATC) classification system.

Author: Ndai, Asinamai, Al Bahou, Julie, Morris, Earl, Wang, Hsin‐Min, Marcum, Zach, Hung, Anna, Brandt, Nicole, Steinman, Michael A., and Vouri, Scott Martin
Subjects: PROFESSIONAL practice, FEE for service (Medical fees), POLYPHARMACY, SEROTONIN uptake inhibitors, INAPPROPRIATE prescribing (Medicine), MEDICAL protocols, DRUGS, DRUG prescribing, DRUG utilization, MEDICAID, DRUG side effects, PHYSICIAN practice patterns, MEDICARE, ELDER care, OLD age
Abstract: Background: Potentially inappropriate medications (PIMs) in older adults are medications in which risks often outweigh benefits and are suggested to be avoided. Worldwide, many distinct guidelines and tools classify PIMs in older adults. Collating these guidelines and tools, mapping them to a medication classification system, and creating a crosswalk will enhance the utility of PIM guidance for research and clinical practice. Methods: We used the Anatomical Therapeutic Chemical (ATC) Classification System, a hierarchical classification system, to map PIMs from eight distinct guidelines and tools (2019 Beers Criteria, Screening Tool for Older Person's Appropriate Prescriptions [STOPP], STOPP‐Japan, German PRISCUS, European Union‐7 Potentially Inappropriate Medication [PIM] list, Centers for Medicare & Medicaid Services [CMS] High‐Risk Medication, Anticholinergic Burden Scale, and Drug Burden Index). Each PIM was mapped to ATC Level 5 (drug) and to ATC Level 4 (drug class). We then used the crosswalk (1) to compare PIMs and PIM drug classes across guidelines and tools to determine the number of PIMs that were index (drug‐induced adverse event) or marker (treatment of drug‐induced adverse event) drug of prescribing cascades, and (2) estimate the prevalence of PIM use in older adults continuously enrolled with fee‐for‐service Medicare in 2018 as use cases. Data visualization and descriptive statistics were used to assess guidelines and tools for both use cases. Results: Out of 480 unique PIMs identified, only three medications—amitriptyline, clomipramine, and imipramine and two drug classes—N06AA (tricyclic antidepressants) and N06AB (selective serotonin reuptake inhibitors), were noted in all eight guidelines and tools. Using the crosswalk, 50% of classes of index drugs and 47% of classes of marker drugs of known prescribing cascades were PIMs. Additionally, 88% of Medicare beneficiaries were dispensed ≥1 PIM across the eight guidelines and tools. 
Conclusion: We created a crosswalk of eight PIM guidelines and tools to the ATC classification system and created two use cases. Our findings could be used to expand the ease of PIM identification and harmonization for research and clinical practice purposes. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. Magnetic-Field-Assisted Electric-Field-Induced Domain Switching of a Magnetic Single Domain in a Multiferroic/Magnetoelectric Ni Nanochevron/[Pb(Mg1/3Nb2/3)O3]0.68–[PbTiO3]0.32 (PMN–PT) Layered Structure.

Author: Cheng, Chih-Cheng, Chen, Yu-Jen, Lin, Shin-Hung, Wang, Hsin-Min, Lin, Guang-Ping, and Chung, Tien-Kan
Subjects: MAGNETIC control, MAGNETIC domain, MAGNETOELECTRIC effect, MAGNETIC fields, MAGNETIZATION
Abstract: We report the magnetic-field-assisted electric-field-controlled domain switching of a magnetic single domain in a multiferroic/magnetoelectric Ni nanochevrons/[Pb(Mg1/3Nb2/3)O3]0.68–[PbTiO3]0.32 (PMN–PT) layered structure. Initially, a magnetic field was applied in the transverse direction across single-domain Ni nanochevrons to transform each of them into a two-domain state. Subsequently, an electric field was applied to the layered structure, exerting the converse magnetoelectric effect to transform/release the two-domain Ni nanochevrons into one of two possible single-domain states. Finally, the experimental results showed that approximately 50% of the single-domain Ni nanochevrons were switched permanently after applying our approach (i.e., the magnetization direction was permanently rotated by 180 degrees). These results mark important advancements for future nanoelectromagnetic systems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
