Author: "Choi, In-Jin" / Publication Year Range: This year - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Choi, In-Jin"' showing total 795 results

1. Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model

Author: Lee, Young-Jun, Lee, Dokyong, Youn, Junyoung, Oh, Kyeongjin, and Choi, Ho-Jin
Subjects: Computer Science - Computation and Language
Abstract: To increase social bonding with interlocutors, humans naturally acquire the ability to respond appropriately in a given situation by considering which conversational skill is most suitable for the response - a process we call skill-of-mind. For large language model (LLM)-based conversational agents, planning appropriate conversational skills, as humans do, is challenging due to the complexity of social dialogue, especially in interactive scenarios. To address this, we propose a skill-of-mind-annotated conversation dataset, named Multifaceted Skill-of-Mind, which includes multi-turn and multifaceted conversational skills across various interactive scenarios (e.g., long-term, counseling, task-oriented), grounded in diverse social contexts (e.g., demographics, persona, rules of thumb). This dataset consists of roughly 100K conversations. Using this dataset, we introduce a new family of skill-of-mind-infused LLMs, named Thanos, with model sizes of 1B, 3B, and 8B parameters. With extensive experiments, these models successfully demonstrate the skill-of-mind process and exhibit strong generalizability in inferring multifaceted skills across a variety of domains. Moreover, we show that Thanos significantly enhances the quality of responses generated by LLM-based conversational agents and promotes prosocial behavior in human evaluations.
Comment: Code: https://github.com/passing2961/Thanos
Published: 2024

2. Cross Spline Net and a Unified World

Author: Hu, Linwei, Choi, Ye Jin, and Nair, Vijayan N.
Subjects: Statistics - Methodology, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In today's machine learning world for tabular data, XGBoost and fully connected neural network (FCNN) are the two most popular methods due to their good model performance and convenience of use. However, they are highly complicated, hard to interpret, and can be overfitted. In this paper, we propose a new modeling framework called cross spline net (CSN) that is based on a combination of spline transformation and cross-network (Wang et al. 2017, 2021). We will show CSN is as performant and convenient to use, and is less complicated, more interpretable and robust. Moreover, the CSN framework is flexible, as the spline layer can be configured differently to yield different models. With different choices of the spline layer, we can reproduce or approximate a set of non-neural network models, including linear and spline-based statistical models, tree, rule-fit, tree-ensembles (gradient boosting trees, random forest), oblique tree/forests, multi-variate adaptive regression spline (MARS), SVM with polynomial kernel, etc. Therefore, CSN provides a unified modeling framework that puts the above set of non-neural network models under the same neural network framework. By using scalable and powerful gradient descent algorithms available in neural network libraries, CSN avoids some pitfalls (such as being ad-hoc, greedy or non-scalable) in the case-specific optimization methods used in the above non-neural network models. We will use a special type of CSN, TreeNet, to illustrate our point. We will compare TreeNet with XGBoost and FCNN to show the benefits of TreeNet. We believe CSN will provide a flexible and convenient framework for practitioners to build performant, robust and more interpretable models.
Published: 2024

3. Enhancing Speech Emotion Recognition through Segmental Average Pooling of Self-Supervised Learning Features

Author: Hyeon, Jonghwan, Oh, Yung-Hwan, and Choi, Ho-Jin
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Speech Emotion Recognition (SER) analyzes human emotions expressed through speech. Self-supervised learning (SSL) offers a promising approach to SER by learning meaningful representations from a large amount of unlabeled audio data. However, existing SSL-based methods rely on Global Average Pooling (GAP) to represent audio signals, treating speech and non-speech segments equally. This can lead to dilution of informative speech features by irrelevant non-speech information. To address this, the paper proposes Segmental Average Pooling (SAP), which selectively focuses on informative speech segments while ignoring non-speech segments. By applying both GAP and SAP to SSL features, our approach utilizes overall speech signal information from GAP and specific information from SAP, leading to improved SER performance. Experiments show state-of-the-art results on the IEMOCAP for English and superior performance on KEMDy19 for Korean datasets in both unweighted and weighted accuracies.
Published: 2024

4. Intriguing Properties of Large Language and Vision Models

Author: Lee, Young-Jun, Ko, Byungsoo, Kim, Han-Gyu, Hwang, Yechan, and Choi, Ho-Jin
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Recently, large language and vision models (LLVMs) have received significant attention and development efforts due to their remarkable generalization performance across a wide range of tasks requiring perception and cognitive abilities. A key factor behind their success is their simple architecture, which consists of a vision encoder, a projector, and a large language model (LLM). Despite their achievements in advanced reasoning tasks, their performance on fundamental perception-related tasks (e.g., MMVP) remains surprisingly low. This discrepancy raises the question of how LLVMs truly perceive images and exploit the advantages of the vision encoder. To address this, we systematically investigate this question regarding several aspects: permutation invariance, robustness, math reasoning, alignment preserving and importance, by evaluating the most common LLVM's families (i.e., LLaVA) across 10 evaluation benchmarks. Our extensive experiments reveal several intriguing properties of current LLVMs: (1) they internally process the image in a global manner, even when the order of visual patch sequences is randomly permuted; (2) they are sometimes able to solve math problems without fully perceiving detailed numerical information; (3) the cross-modal alignment is overfitted to complex reasoning tasks, thereby, causing them to lose some of the original perceptual capabilities of their vision encoder; (4) the representation space in the lower layers (<25%) plays a crucial role in determining performance and enhancing visual understanding. Lastly, based on the above observations, we suggest potential future directions for building better LLVMs and constructing more challenging evaluation benchmarks.
Comment: Code is available in https://github.com/passing2961/IP-LLVM
Published: 2024

5. Expanding Horizons: Fostering Creativity and Curiosity through Spherical Video-Based Virtual Reality in Project-Based Language Learning

Author: Chung, Sun Joo and Choi, Lee Jin
Abstract: The aim of this study is to explore how the creation of spherical video-based virtual reality (SVVR) influences students' creativity and curiosity in project-based language learning (PBLL). Technology is widely used in various instructional contexts, and due to increasing interest in VR technologies, the current study investigated how SVVR technology in PBLL might influence students' self-assessment of creativity and curiosity. Twenty-seven students participated in an SVVR-enhanced PBL course and were asked to complete two questionnaires on creativity and curiosity. Data from students' reflective journals and teacher notes were also analyzed to investigate implementations for using SVVR in language learning. The findings showed that students' self-assessment of their levels of creativity and curiosity demonstrated statistically significant development. The findings from qualitative data analysis identified English language competence as a factor that could have influenced students' performance during the course. The findings support the potential for SVVR in PBLL for motivating students to embrace new ideas and perspectives while participating in authentic problem-solving activities.
Published: 2024

6. Exploring the Impact of Online Social Platforms on Social Connectedness Among UC Berkeley Undergraduates

Author: Nanda, Heer, Gharabi, Ameneh, Wang, Elizabeth, Kolhatkar, Kayhan, Choi, Seo Jin, Le, Trang, and Chaisomboonpan, Yanisa
Abstract: With social media's prevalence in recent decades, the relationship between social media and the well-being of its users has always been under debate, with some believing that social media tends to have a negative impact and others believing that social media has a positive impact. This systematic review aims to determine how social media contributes to feelings of social isolation or connectedness among college students at the University of California, Berkeley. We conducted a university-wide survey to determine how college students feel about using social media. Using cluster analysis and the data analysis program R, we determined that most college students feel socially connected while using social media. While some past research articles support this theory, most do not discuss the ways in which college students generally tend to feel socially connected on social media platforms. Our paper explores specific factors contributing to college students’ social connectedness on various social media platforms.
Published: 2024

7. A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks

Author: Jang, Boa, Ahn, Youngbin, Choe, Eun Kyung, Yoon, Chang Ki, Choi, Hyuk Jin, and Kim, Young-Gon
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Artificial intelligence applied to retinal images offers significant potential for recognizing signs and symptoms of retinal conditions and expediting the diagnosis of eye diseases and systemic disorders. However, developing generalized artificial intelligence models for medical data often requires a large number of labeled images representing various disease signs, and most models are typically task-specific, focusing on major retinal diseases. In this study, we developed a Fundus-Specific Pretrained Model (Image+Fundus), a supervised artificial intelligence model trained to detect abnormalities in fundus images. A total of 57,803 images were used to develop this pretrained model, which achieved superior performance across various downstream tasks, indicating that our proposed model outperforms other general methods. Our Image+Fundus model offers a generalized approach to improve model performance while reducing the number of labeled datasets required. Additionally, it provides more disease-specific insights into fundus images, with visualizations generated by our model. These disease-specific foundation models are invaluable in enhancing the performance and efficiency of deep learning models in the field of fundus imaging.
Comment: 10 pages, 4 figures
Published: 2024

8. An objective isogeometric mixed finite element formulation for nonlinear elastodynamic beams with incompatible warping strains

Author: Choi, Myung-Jin, Klinkel, Sven, Klarmann, Simon, and Sauer, Roger A.
Subjects: Mathematics - Numerical Analysis
Abstract: We present a stable mixed isogeometric finite element formulation for geometrically and materially nonlinear beams in transient elastodynamics, where a Cosserat beam formulation with extensible directors is used. The extensible directors yield a linear configuration space incorporating constant in-plane cross-sectional strains. Higher-order (incompatible) strains are introduced to correct stiffness, whose additional degrees-of-freedom are eliminated by an element-wise condensation. Further, the present discretization of the initial director field leads to the objectivity of approximated strain measures, regardless of the degree of basis functions. For physical stress resultants and strains, we employ a global patch-wise approximation using B-spline basis functions, whose higher-order continuity enables the use of far fewer degrees-of-freedom, compared to element-wise approximation. For time-stepping, we employ an implicit energy-momentum consistent scheme, which exhibits superior numerical stability in comparison to standard trapezoidal and mid-point rules. Several numerical examples are presented to verify the present method.
Comment: 65 pages, 30 figures
Published: 2024

9. Does Incomplete Syntax Influence Korean Language Model? Focusing on Word Order and Case Markers

Author: Kim, Jong Myoung, Lee, Young-Jun, Han, Yong-jin, Jung, Sangkeun, and Choi, Ho-Jin
Subjects: Computer Science - Computation and Language
Abstract: Syntactic elements, such as word order and case markers, are fundamental in natural language processing. Recent studies show that syntactic information boosts language model performance and offers clues for people to understand their learning mechanisms. Unlike languages with a fixed word order such as English, Korean allows for varied word sequences, despite its canonical structure, due to case markers that indicate the functions of sentence components. This study explores whether Korean language models can accurately capture this flexibility. We note that incomplete word orders and omitted case markers frequently appear in ordinary Korean communication. To investigate this further, we introduce the Syntactically Incomplete Korean (SIKO) dataset. Through SIKO, we assessed Korean language models' flexibility with incomplete syntax and confirmed the dataset's training value. Results indicate these models reflect Korean's inherent flexibility, accurately handling incomplete inputs. Moreover, fine-tuning with SIKO enhances the ability to handle common incomplete Korean syntactic forms. The dataset's simple construction process, coupled with significant performance enhancements, solidifies its standing as an effective data augmentation technique.
Comment: COLM 2024; Code and dataset is available in https://github.com/grayapple-git/SIKO
Published: 2024

10. Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge

Author: Lee, Young-Jun, Lee, Dokyong, Youn, Junyoung, Oh, Kyeongjin, Ko, Byungsoo, Hyeon, Jonghwan, and Choi, Ho-Jin
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: Humans share a wide variety of images related to their personal experiences within conversations via instant messaging tools. However, existing works focus on (1) image-sharing behavior in singular sessions, leading to limited long-term social interaction, and (2) a lack of personalized image-sharing behavior. In this work, we introduce Stark, a large-scale long-term multi-modal conversation dataset that covers a wide range of social personas in a multi-modality format, time intervals, and images. To construct Stark automatically, we propose a novel multi-modal contextualization framework, Mcu, that generates long-term multi-modal dialogue distilled from ChatGPT and our proposed Plan-and-Execute image aligner. Using our Stark, we train a multi-modal conversation model, Ultron 7B, which demonstrates impressive visual imagination ability. Furthermore, we demonstrate the effectiveness of our dataset in human evaluation. We make our source code and dataset publicly available.
Comment: Project website: https://stark-dataset.github.io
Published: 2024

11. Time-Dependent Background Analysis in the NEON experiment for Axion-Like Particle Searches

Author: Park, Byung Ju, Choi, Jae Jin, Jeon, Eunju, Kim, Jinyu, Kim, Kyungwon, Kim, Sung Hyun, Kim, Sun Kee, Kim, Yeongduk, Ko, Young Ju, Koh, Byoung-Cheol, Ha, Chang Hyon, Lee, Seo Hyun, Lee, In Soo, Lee, Hyunseok, Lee, Hyun Su, Lee, Jaison, and Oh, Yoomin
Subjects: High Energy Physics - Experiment
Abstract: The NEON experiment, situated at the Hanbit Nuclear Power Plant, is designed to observe coherent neutrino-nucleus scattering (CEνNS) and search for dark sector particles such as axion-like particles (ALPs). Using six NaI(Tl) detector modules, data were collected during both reactor-on and reactor-off periods between April 2022 and June 2023, providing a total exposure of 1596 kg·days and 1467 kg·days, respectively. The search for ALPs leverages the difference between reactor-on and reactor-off datasets. A thorough understanding of time-dependent backgrounds, including cosmogenic activation and seasonal variations of radon contamination, is essential to the analysis. This paper presents detailed modeling of these backgrounds, identifying their contributions across different energy ranges and detector modules. Systematic uncertainties arising from energy resolution, background shape, and rate variations are considered in the final analysis. The results provide insights into the future potential of ALP searches in short-baseline reactor experiments and demonstrate the efficacy of background reduction techniques in the NEON experiment.
Comment: We have included information on the time-dependent background model for the NEON experiment in this manuscript. The manuscript will be updated later with the corrected results from the ALP search
Published: 2024

12. MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance

Author: Kim, Semin, Jeong, Myeonghun, Lee, Hyeonseung, Kim, Minchan, Choi, Byoung Jin, and Kim, Nam Soo
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence
Abstract: In this paper, we propose MakeSinger, a semi-supervised training method for singing voice synthesis (SVS) via classifier-free diffusion guidance. The challenge in SVS lies in the costly process of gathering aligned sets of text, pitch, and audio data. MakeSinger enables the training of the diffusion-based SVS model from any speech and singing voice data regardless of its labeling, thereby enhancing the quality of generated voices with a large amount of unlabeled data. At inference, our novel dual guiding mechanism gives text and pitch guidance on the reverse diffusion step by estimating the score of masked input. Experimental results show that the model trained in a semi-supervised manner outperforms other baselines trained only on the labeled data in terms of pronunciation, pitch accuracy and overall quality. Furthermore, we demonstrate that by adding Text-to-Speech (TTS) data in training, the model can synthesize the singing voices of TTS speakers even without their singing voices.
Comment: Accepted to Interspeech 2024
Published: 2024

13. Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

Author: Conde, Marcos V., Lei, Zhijun, Li, Wen, Stejerean, Cosmin, Katsavounidis, Ioannis, Timofte, Radu, Yoon, Kihwan, Gankhuyag, Ganzorig, Lv, Jiangtao, Sun, Long, Pan, Jinshan, Dong, Jiangxin, Tang, Jinhui, Li, Zhiyuan, Wei, Hao, Ge, Chenyang, Zhang, Dongyang, Liu, Tianle, Chen, Huaian, Jin, Yi, Zhou, Menghan, Yan, Yiqiang, Gao, Si, Wu, Biao, Liu, Shaoli, Zheng, Chengjian, Zhang, Diankai, Wang, Ning, Qiu, Xintao, Zhou, Yuanbo, Wu, Kongxian, Dai, Xinwei, Tang, Hui, Deng, Wei, Gao, Qingquan, Tong, Tong, Lee, Jae-Hyeon, Choi, Ui-Jin, Yan, Min, Liu, Xin, Wang, Qian, Ye, Xiaoqian, Du, Zhan, Zhang, Tiansen, Peng, Long, Guo, Jiaming, Di, Xin, Liao, Bohao, Du, Zhibo, Xia, Peize, Pei, Renjing, Wang, Yang, Cao, Yang, Zha, Zhengjun, Han, Bingnan, Yu, Hongyuan, Wu, Zhuoyuan, Wan, Cheng, Liu, Yuqing, Yu, Haodong, Li, Jizhe, Huang, Zhijuan, Huang, Yuan, Zou, Yajun, Guan, Xianyu, Jia, Qi, Zhang, Heng, Yin, Xuanwu, Zuo, Kunlong, Moon, Hyeon-Cheol, Jeong, Tae-hyun, Yang, Yoonmo, Kim, Jae-Gon, Jeong, Jinwoo, and Kim, Sunjei
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF codec, instead of JPEG. All the proposed methods improve PSNR fidelity over Lanczos interpolation, and process images under 10ms. Out of the 160 participants, 25 teams submitted their code and models. The solutions present novel designs tailored for memory-efficiency and runtime on edge devices. This survey describes the best solutions for real-time SR of compressed high-resolution images.
Comment: CVPR 2024, AI for Streaming (AIS) Workshop
Published: 2024

14. Performance enhancement of material removal using a surface-refinement model based on spatial frequency–response characteristics in magnetorheological finishing

Author: Jeon, Minwoo, Jeong, Seok-Kyeong, Yeo, Woo-Jong, Choi, Hwan-Jin, Kim, Mincheol, Bog, Min-Gab, and Lee, Wonkyun
Published: 2024

15. Association between preterm birth and asthma and atopic dermatitis in preschool children: a nationwide population-based study

Author: Cha, Jong Ho, Hwang, Jae Kyoon, Na, Jae Yoon, Ryu, Soorak, Oh, Jae-Won, and Choi, Young-Jin
Published: 2024

16. A stealthy neural recorder for the study of behaviour in primates

Author: Oh, Saehyuck, Jekal, Janghwan, Won, Jinyoung, Lim, Kyung Seob, Jeon, Chang-Yeop, Park, Junghyung, Yeo, Hyeon-Gu, Kim, Yu Gyeong, Lee, Young Hee, Ha, Leslie Jaesun, Jung, Han Hee, Yea, Junwoo, Lee, Hyeokjun, Ha, Jeongdae, Kim, Jinmo, Lee, Doyoung, Song, Soojeong, Son, Jieun, Yu, Tae Sang, Lee, Jungmin, Lee, Sanghoon, Lee, Jaehong, Kim, Bong Hoon, Choi, Ji-Woong, Rah, Jong-Cheol, Song, Young Min, Jeong, Jae-Woong, Choi, Hyung Jin, Xu, Sheng, Lee, Youngjeon, and Jang, Kyung-In
Published: 2024

17. Survival Outcomes According to NCCN Criteria for Resection Following Neoadjuvant Therapy for Patients with Localized Pancreatic Cancer

Author: Jang, Jong Keon, Byun, Jae Ho, Choi, Se Jin, Kim, Jin Hee, Lee, Seung Soo, Kim, Hyoung Jung, Yoo, Changhoon, Kim, Kyu-pyo, Hong, Seung-Mo, Seo, Dong-Wan, Hwang, Dae Wook, and Kim, Song Cheol
Published: 2024

18. Identification of Representative Wind Power Fluctuation Patterns for Water Electrolysis Device Stress Testing: A Data Mining Approach

Author: Choi, Kyong Jin, Kim, Sanghoon, Kwon, Yongchai, and Sim, Min Kyu
Published: 2024

19. Toxicological implications of storage conditions on yeast vacuole properties and activities

Author: Choi, Hyo Jin, Kim, Taehwan, Shin, Woo-Ri, Lee, Jin-Pyo, Le Ngoc Phuong, Uyen, Ahn, Ji-Young, Kim, Yang-Hoon, and Min, Jiho
Published: 2024

20. On the Feasibility of EEG-based Motor Intention Detection for Real-Time Robot Assistive Control

Author: Choi, Ho Jin, Das, Satyajeet, Peng, Shaoting, Bajcsy, Ruzena, and Figueroa, Nadia
Subjects: Computer Science - Robotics
Abstract: This paper explores the feasibility of employing EEG-based intention detection for real-time robot assistive control. We focus on predicting and distinguishing motor intentions of left/right arm movements by presenting: i) an offline data collection and training pipeline, used to train a classifier for left/right motion intention prediction, and ii) an online real-time prediction pipeline leveraging the trained classifier and integrated with an assistive robot. Central to our approach is a rich feature representation composed of the tangent space projection of time-windowed sample covariance matrices from EEG filtered signals and derivatives, allowing for a simple SVM classifier to achieve unprecedented accuracy and real-time performance. In pre-recorded real-time settings (160 Hz), a peak accuracy of 86.88% is achieved, surpassing prior works. In robot-in-the-loop settings, our system successfully detects intended motion solely from EEG data with 70% accuracy, triggering a robot to execute an assistive task. We provide a comprehensive evaluation of the proposed classifier.
Published: 2024

21. Toward TransfORmers: Revolutionizing the Solution of Mixed Integer Programs with Transformers

Author: Cooper, Joshua F., Choi, Seung Jin, and Buyuktahtakin, I. Esra
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Mathematics - Combinatorics, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: In this study, we introduce an innovative deep learning framework that employs a transformer model to address the challenges of mixed-integer programs, specifically focusing on the Capacitated Lot Sizing Problem (CLSP). Our approach, to our knowledge, is the first to utilize transformers to predict the binary variables of a mixed-integer programming (MIP) problem. Specifically, our approach harnesses the encoder-decoder transformer's ability to process sequential data, making it well-suited for predicting binary variables indicating production setup decisions in each period of the CLSP. This problem is inherently dynamic, and we need to handle sequential decision making under constraints. We present an efficient algorithm in which CLSP solutions are learned through a transformer neural network. The proposed post-processed transformer algorithm surpasses the state-of-the-art solver, CPLEX, and Long Short-Term Memory (LSTM) in solution time, optimal gap, and percent infeasibility over 240K benchmark CLSP instances tested. After the ML model is trained, conducting inference on the model reduces the MIP into a linear program (LP). This transforms the ML-based algorithm, combined with an LP solver, into a polynomial-time approximation algorithm to solve a well-known NP-Hard problem, with almost perfect solution quality.
Published: 2024

22. Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

Author: Kim, Minchan, Jeong, Myeonghun, Choi, Byoung Jin, Kim, Semin, Lee, Joun Yeop, and Kim, Nam Soo
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Sound
Abstract: We propose a novel text-to-speech (TTS) framework centered around a neural transducer. Our approach divides the whole TTS pipeline into semantic-level sequence-to-sequence (seq2seq) modeling and fine-grained acoustic modeling stages, utilizing discrete semantic tokens obtained from wav2vec2.0 embeddings. For a robust and efficient alignment modeling, we employ a neural transducer named token transducer for the semantic token prediction, benefiting from its hard monotonic alignment constraints. Subsequently, a non-autoregressive (NAR) speech generator efficiently synthesizes waveforms from these semantic tokens. Additionally, a reference speech controls temporal dynamics and acoustic conditions at each stage. This decoupled framework reduces the training complexity of TTS while allowing each stage to focus on semantic and acoustic modeling. Our experimental results on zero-shot adaptive TTS demonstrate that our model surpasses the baseline in terms of speech quality and speaker similarity, both objectively and subjectively. We also delve into the inference speed and prosody control capabilities of our approach, highlighting the potential of neural transducers in TTS frameworks.
Comment: This work has been submitted to the IEEE for possible publication
Published: 2024

23. A shape-morphing cortex-adhesive sensor for closed-loop transcranial ultrasound neurostimulation

Author: Lee, Sungjun, Kum, Jeungeun, Kim, Sumin, Jung, Hyunjin, An, Soojung, Choi, Soon Jin, Choi, Jae Hyuk, Kim, Jinseok, Yu, Ki Jun, Lee, Wonhye, Kim, Hyeok, Han, Hyung-Seop, Shin, Mikyung, Kim, Hyungmin, and Son, Donghee
Published: 2024

24. A Mask R-CNN based process monitoring system for fabricating high density ceramic parts using photo-polymerization

Author: Han, Seungjae, Choi, Seung-Kyum, and Choi, Hae-Jin
Published: 2024

25. Sihler’s staining of the anterior belly of digastric muscle for botulinum toxin injection

Author: Choi, You-Jin, Hu, Hye-Won, Kim, Soo-Bin, Lee, Ji-Hyun, Kim, Seong-Taek, and Kim, Hee-Jin
Published: 2024

26. Enhanced polarization retention and softening in [001]-oriented Pb(Mg1/3Nb2/3)-PbTiO3 single crystals through corona poling

Author: Sun, Jeong-Woo, Choi, Woo-Jin, Yu, Hye-Lim, Lee, Sang-Goo, Ryu, Jong Eun, Zate, Temesgen Tadeyos, and Jo, Wook
Published: 2024

27. Examining the Protective Impact of Peer Relationships on Negative Self-Esteem among High-Risk Adolescents: The Interplay of Gender and Ethnicity

Author: Seng, Tola, Lee, Eunju, and Choi, Mi Jin
Published: 2024

28. When and why employee avoidance crafting promotes coworker organizational citizenship behavior?

Author: Kim, Mihee, Shin, Yuhyung, and Choi, Hyung Jin
Published: 2024

29. Effect of Pravastatin on Kidney Function in Patients with Dyslipidemia and Type 2 Diabetes Mellitus: A Multicenter Prospective Observational Study

Author: Kim, Hae Jin, Hur, Kyu Yeon, Lee, Yong-Ho, Kim, Jin Taek, Lee, Yong-Kyu, Baek, Ki-Hyun, Choi, Euy Jin, Hwang, Won Min, Bang, Ki Tae, Lim, Jung Soo, Chung, Yun Jae, Jo, Sung Rae, Oh, Joon Seok, Lee, Soon Hee, Ko, Seung-Hyun, and Choi, Sung Hee
Published: 2024

30. The investigation of a digitalized projective psychological assessment: Comparison to human expert on bender gestalt test

Author: Chang, Won-Du, Kim, Byeongjun, Kim, Bogeum, Lee, Kyunghan, Kim, Yeonji, Hwang, Jueun, and Choi, Seong-Jin
Published: 2024

31. Evolution of Microcracks in Epitaxial CeO2 Thin Films on YSZ-Buffered Si

Author: Jung, Soo Young, Choi, Hyung-Jin, Lee, Jun Young, Kim, Min-Seok, Ning, Ruiguang, Han, Dong-Hun, Kim, Seong Keun, Won, Sung Ok, Lee, June Hyuk, Jang, Ji-Soo, Jang, Ho Won, and Baek, Seung-Hyub
Published: 2024

32. Cell membrane camouflaged nanoparticle strategy and its application in brain disease: a review

Author: Kim, Beomsu, Park, Byeongmin, You, Seungju, Jung, Suk Han, Lee, Soobok, Lim, Kangseok, Choi, Yeo Jin, Kim, Jong-Ho, and Lee, Sangmin
Published: 2024

33. Regulatory role of Echinochrome A in cancer-associated fibroblast-mediated lung cancer cell migration

Author: Eum, Da-Young, Lee, Chaeyoung, Tran, Cong So, Lee, Jinyoung, Park, Soon Yong, Jeong, Mi-So, Jin, Yunho, Shim, Jae Woong, Lee, Seoung Rak, Koh, Minseob, Vasileva, Elena A., Mishchenko, Natalia P., Park, Seong-Joon, Choi, Si Ho, Choi, Yoo Jin, Yun, Hwayoung, and Heo, Kyu
Published: 2024

34. Determination of the Combined Effects of Asian Herbal Medicine with Calcium and/or Vitamin D Supplements on Bone Mineral Density in Primary Osteoporosis: A Systematic Review and Meta-Analysis

Author: Park, Hee-Joo, Kim, Min-Gyeong, Yoo, Young-Seo, Lee, Boram, Choi, Yu-Jin, Son, Chang-Gue, and Lee, Eun-Jung
Published: 2024
Full Text: View/download PDF

35. Preventive effect of Lacticaseibacillus paracasei LMT18-32 on Porphyromonas gingivalis induced periodontitis

Author: Choi, Woo Jin, Cho, Seung Kee, Dong, Hye Jin, Kim, Tai Hoon, Soon, Jaejoon, Lee, Hyo Jin, Yoon, Kwang Ho, Kwak, Seongsung, and Yun, Jiae
Published: 2024
Full Text: View/download PDF

36. When the well is full, it will run over: the double-edged sword effect of corporate lobbying activities on firm performance

Author: Xiao, Shufeng, Jiménez, Alfredo, Jung, Sukyoon, Park, Byung Il, and Choi, Seong Jin
Published: 2024
Full Text: View/download PDF

37. Evaluation of prediction errors in nine intraocular lens calculation formulas using an explainable machine learning model

Author: Oh, Richul, Oh, Joo Youn, Choi, Hyuk Jin, Kim, Mee Kum, and Yoon, Chang Ho
Published: 2024
Full Text: View/download PDF

38. Non-invasive brain stimulation enhances motor and cognitive performances during dual tasks in patients with Parkinson’s disease: a systematic review and meta-analysis

Author: Lee, Hajun, Choi, Beom Jin, and Kang, Nyeonju
Published: 2024
Full Text: View/download PDF

39. Agreement in anterior segment measurements between swept-source optical coherence and dual Scheimpflug tomography devices in keratoconus eyes

Author: Lee, Yunjin, Oh, Joo Youn, Choi, Hyuk Jin, Kim, Mee Kum, and Yoon, Chang Ho
Published: 2024
Full Text: View/download PDF

40. Increased efficiency of peripheral nerve regeneration using supercritical carbon dioxide-based decellularization in acellular nerve graft

Author: Choi, Soon Jin, Han, Jeonghun, Shin, Young Ho, and Kim, Jae Kwang
Published: 2024
Full Text: View/download PDF

41. Fabric-based lamina emergent MXene-based electrode for electrophysiological monitoring

Author: Lee, Sanghyun, Ho, Dong Hae, Jekal, Janghwan, Cho, Soo Young, Choi, Young Jin, Oh, Saehyuck, Choi, Yoon Young, Lee, Taeyoon, Jang, Kyung-In, and Cho, Jeong Ho
Published: 2024
Full Text: View/download PDF

42. Evaluation of FVIII pharmacokinetic profiles in Korean hemophilia A patients assessed with myPKFiT: a retrospective chart review

Author: Park, Young-Shil, Yoo, Ki-Young, Park, Sang Kyu, Hwang, Taiju, Jung, Aeran, and Choi, Eun Jin
Published: 2024
Full Text: View/download PDF

43. Microneedle patch casting using a micromachined carbon master for enhanced drug delivery

Author: Choi, Hye Jin, Ullah, Asad, Jang, Mi Jin, Lee, Ui Seok, Shin, Min Chul, An, Sang Hyun, Kim, Dongseon, Kim, Bo Hyun, and Kim, Gyu Man
Published: 2024
Full Text: View/download PDF

44. Integrated analysis of spatial transcriptomics and CT phenotypes for unveiling the novel molecular characteristics of recurrent and non-recurrent high-grade serous ovarian cancer

Author: Ju, Hye-Yeon, Youn, Seo Yeon, Kang, Jun, Whang, Min Yeop, Choi, Youn Jin, and Han, Mi-Ryung
Published: 2024
Full Text: View/download PDF

45. Clinical data on treatment regimen and use of medication among patients with hemophilia B in Korea

Author: Park, Young Shil, Park, Ji Kyoung, Park, Jeong A, Baek, Hee Jo, Lee, Jae Hee, You, Chur Woo, Lyu, Chuhl Joo, and Choi, Eun Jin
Published: 2024
Full Text: View/download PDF

46. Effectiveness and satisfaction with virtual and donor dissections: A randomized controlled trial

Author: Yun, Young Hyun, Kwon, Hyeok Yi, Jeon, Su Kyoung, Jon, Yu Mi, Park, Min Jung, Shin, Dong Hoon, and Choi, Hyung Jin
Published: 2024
Full Text: View/download PDF

47. Sex-specific survival gene mutations are discovered as clinical predictors of clear cell renal cell carcinoma

Author: Hwang, Jia, Lee, Hye Eun, Han, Jin Seon, Choi, Moon Hyung, Hong, Sung Hoo, Kim, Sae Woong, Yang, Ji Hoon, Park, Unsang, Jung, Eun Sun, and Choi, Yeong Jin
Published: 2024
Full Text: View/download PDF

48. Integrating mixed reality preparation into acute coronary syndrome simulation for nursing students: a single-group pretest-posttest study

Author: Moon, Sun-Hee, Jeong, Hyeonjin, and Choi, Mi Jin
Published: 2024
Full Text: View/download PDF

49. Limited impact of bacterial virulence on early mortality risk factors in Acinetobacter baumannii bacteremia observed in a Galleria mellonella model

Author: Ham, Sin Young, Chun, June Young, Song, Kyoung-Ho, Kang, Chang Kyung, Park, Jeong Su, Jo, Hee Bum, Ryu, Choong-Min, Choi, Yunsang, Choi, Seong Jin, Lee, Eunyoung, Choe, Pyoeng Gyun, Moon, Song Mi, Park, Wan Beom, Bang, Jihwan, Park, Sang-Won, Park, Kyoung Un, Kim, Nam Joong, Oh, Myoung-don, Kim, Eu Suk, and Kim, Hong Bin
Published: 2024
Full Text: View/download PDF

50. Full endoscopic surgery for calcium pyrophosphate deposition disease (CPPD) in the cervical ligamentum flavum: report of two cervical myelopathy cases

Author: Choi, Seung Jin, Kang, Dong Wan D., Ham, Chang Hwa, Kim, Joo Han, and Kwon, Woo-Keun
Published: 2024
Full Text: View/download PDF
