99,429 results on '"Jee BY"'
Search Results
2. ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech
- Author
Shi, Jiatong, Tian, Jinchuan, Wu, Yihan, Jung, Jee-weon, Yip, Jia Qi, Masuyama, Yoshiki, Chen, William, Wu, Yuning, Tang, Yuxun, Baali, Massa, Alharhi, Dareen, Zhang, Dong, Deng, Ruifan, Srivastava, Tejes, Wu, Haibin, Liu, Alexander H., Raj, Bhiksha, Jin, Qin, Song, Ruihua, and Watanabe, Shinji
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
- Abstract
Neural codecs have become crucial to recent speech and audio generation research. In addition to signal compression capabilities, discrete codecs have also been found to enhance downstream training efficiency and compatibility with autoregressive language models. However, as extensive downstream applications are investigated, challenges have arisen in ensuring fair comparisons across diverse applications. To address these issues, we present a new open-source platform ESPnet-Codec, which is built on ESPnet and focuses on neural codec training and evaluation. ESPnet-Codec offers various recipes in audio, music, and speech for training and evaluation using several widely adopted codec models. Together with ESPnet-Codec, we present VERSA, a standalone evaluation toolkit, which provides a comprehensive evaluation of codec performance over 20 audio evaluation metrics. Notably, we demonstrate that ESPnet-Codec can be integrated into six ESPnet tasks, supporting diverse applications.
Comment: Accepted by SLT
- Published
- 2024
3. SpoofCeleb: Speech Deepfake Detection and SASV In The Wild
- Author
Jung, Jee-weon, Wu, Yihan, Wang, Xin, Kim, Ji-Hoon, Maiti, Soumi, Matsunaga, Yuta, Shim, Hye-jin, Tian, Jinchuan, Evans, Nicholas, Chung, Joon Son, Zhang, Wangyou, Um, Seyun, Takamichi, Shinnosuke, and Watanabe, Shinji
- Subjects
Computer Science - Sound, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
This paper introduces SpoofCeleb, a dataset designed for Speech Deepfake Detection (SDD) and Spoofing-robust Automatic Speaker Verification (SASV), utilizing source data from real-world conditions and spoofing attacks generated by Text-To-Speech (TTS) systems also trained on the same real-world data. Robust recognition systems require speech data recorded in varied acoustic environments with different levels of noise to be trained. However, existing datasets typically include clean, high-quality recordings (bona fide data) due to the requirements for TTS training; studio-quality or well-recorded read speech is typically necessary to train TTS models. Existing SDD datasets also have limited usefulness for training SASV models due to insufficient speaker diversity. We present SpoofCeleb, which leverages a fully automated pipeline that processes the VoxCeleb1 dataset, transforming it into a suitable form for TTS training. We subsequently train 23 contemporary TTS systems. The resulting SpoofCeleb dataset comprises over 2.5 million utterances from 1,251 unique speakers, collected under natural, real-world conditions. The dataset includes carefully partitioned training, validation, and evaluation sets with well-controlled experimental protocols. We provide baseline results for both SDD and SASV tasks. All data, protocols, and baselines are publicly available at https://jungjee.github.io/spoofceleb.
Comment: 9 pages, 2 figures, 8 tables
- Published
- 2024
4. Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels
- Author
Aldeneh, Zakaria, Higuchi, Takuya, Jung, Jee-weon, Chen, Li-Wei, Shum, Stephen, Abdelaziz, Ahmed Hussen, Watanabe, Shinji, Likhomanenko, Tatiana, and Theobald, Barry-John
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
- Abstract
Iterative self-training, or iterative pseudo-labeling (IPL)--using an improved model from the current iteration to provide pseudo-labels for the next iteration--has proven to be a powerful approach to enhance the quality of speaker representations. Recent applications of IPL in unsupervised speaker recognition start with representations extracted from very elaborate self-supervised methods (e.g., DINO). However, training such strong self-supervised models is not straightforward (they require hyper-parameters tuning and may not generalize to out-of-domain data) and, moreover, may not be needed at all. To this end, we show the simple, well-studied, and established i-vector generative model is enough to bootstrap the IPL process for unsupervised learning of speaker representations. We also systematically study the impact of other components on the IPL process, which includes the initial model, the encoder, augmentations, the number of clusters, and the clustering algorithm. Remarkably, we find that even with a simple and significantly weaker initial model like i-vector, IPL can still achieve speaker verification performance that rivals state-of-the-art methods.
Comment: Submitted to ICASSP 2025
- Published
- 2024
5. Text-To-Speech Synthesis In The Wild
- Author
Jung, Jee-weon, Zhang, Wangyou, Maiti, Soumi, Wu, Yihan, Wang, Xin, Kim, Ji-Hoon, Matsunaga, Yuta, Um, Seyun, Tian, Jinchuan, Shim, Hye-jin, Evans, Nicholas, Chung, Joon Son, Takamichi, Shinnosuke, and Watanabe, Shinji
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence
- Abstract
Text-to-speech (TTS) systems are traditionally trained using modest databases of studio-quality, prompted or read speech collected in benign acoustic environments such as anechoic rooms. The recent literature nonetheless shows efforts to train TTS systems using data collected in the wild. While this approach allows for the use of massive quantities of natural speech, until now, there are no common datasets. We introduce the TTS In the Wild (TITW) dataset, the result of a fully automated pipeline, in this case, applied to the VoxCeleb1 dataset commonly used for speaker recognition. We further propose two training sets. TITW-Hard is derived from the transcription, segmentation, and selection of VoxCeleb1 source data. TITW-Easy is derived from the additional application of enhancement and additional data selection based on DNSMOS. We show that a number of recent TTS models can be trained successfully using TITW-Easy, but that it remains extremely challenging to produce similar results using TITW-Hard. Both the dataset and protocols are publicly available and support the benchmarking of TTS systems trained using TITW data.
Comment: 5 pages, submitted to ICASSP 2025 as a conference paper
- Published
- 2024
6. External Steering of Vine Robots via Magnetic Actuation
- Author
Kim, Nam Gyun, Greenidge, Nikita J., Davy, Joshua, Park, Shinwoo, Chandler, James H., Ryu, Jee-Hwan, and Valdastri, Pietro
- Subjects
Computer Science - Robotics
- Abstract
This paper explores the concept of external magnetic control for vine robots to enable their high curvature steering and navigation for use in endoluminal applications. Vine robots, inspired by natural growth and locomotion strategies, present unique shape adaptation capabilities that allow passive deformation around obstacles. However, without additional steering mechanisms, they lack the ability to actively select the desired direction of growth. The principles of magnetically steered growing robots are discussed, and experimental results showcase the effectiveness of the proposed magnetic actuation approach. We present a 25 mm diameter vine robot with integrated magnetic tip capsule, including 6 Degrees of Freedom (DOF) localization and camera and demonstrate a minimum bending radius of 3.85 cm with an internal pressure of 30 kPa. Furthermore, we evaluate the robot's ability to form tight curvature through complex navigation tasks, with magnetic actuation allowing for extended free-space navigation without buckling. The suspension of the magnetic tip was also validated using the 6 DOF localization system to ensure that the shear-free nature of vine robots was preserved. Additionally, by exploiting the magnetic wrench at the tip, we showcase preliminary results of vine retraction. The findings contribute to the development of controllable vine robots for endoluminal applications, providing high tip force and shear-free navigation.
Comment: 13 pages, 10 figures
- Published
- 2024
7. The VoxCeleb Speaker Recognition Challenge: A Retrospective
- Author
Huh, Jaesung, Chung, Joon Son, Nagrani, Arsha, Brown, Andrew, Jung, Jee-weon, Garcia-Romero, Daniel, and Zisserman, Andrew
- Subjects
Computer Science - Sound, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
The VoxCeleb Speaker Recognition Challenges (VoxSRC) were a series of challenges and workshops that ran annually from 2019 to 2023. The challenges primarily evaluated the tasks of speaker recognition and diarisation under various settings including: closed and open training data; as well as supervised, self-supervised, and semi-supervised training for domain adaptation. The challenges also provided publicly available training and evaluation datasets for each task and setting, with new test sets released each year. In this paper, we provide a review of these challenges that covers: what they explored; the methods developed by the challenge participants and how these evolved; and also the current state of the field for speaker verification and diarisation. We chart the progress in performance over the five installments of the challenge on a common evaluation dataset and provide a detailed analysis of how each year's special focus affected participants' performance. This paper is aimed both at researchers who want an overview of the speaker recognition and diarisation field, and also at challenge organisers who want to benefit from the successes and avoid the mistakes of the VoxSRC challenges. We end with a discussion of the current strengths of the field and open challenges. Project page: https://mm.kaist.ac.kr/datasets/voxceleb/voxsrc/workshop.html
Comment: TASLP 2024
- Published
- 2024
8. From catch-up to frontier: The utility model as a learning device to escape the middle-income trap
- Author
Jee, Su Jung and Hötte, Kerstin
- Subjects
Economics - General Economics
- Abstract
Escaping the middle-income trap requires a country to develop indigenous technological capabilities for high value-added innovation. This study examines the role of second-tier patent systems, known as utility models (UMs), in promoting such capability acquisition in less developed countries. UMs are designed to incentivize incremental and adaptive innovation through lower novelty standards than patents, but their long-term impact on the capability acquisition process remains underexplored. Using South Korea as a case study and drawing on the characteristics of technological regimes in catching-up economies, we present three key findings: First, the country's post-catch-up frontier technologies (U.S. patents) are more impactful (highly cited) when they build on Korean domestic UMs. This suggests that UM-based imitative and adaptive learning laid the foundation for the country's globally competitive capabilities. Second, the impact of UM-based learning diminishes as the country's economy develops. Third, frontier technologies rooted in UMs contribute more to the country's own specialization than to follow-on innovations by foreign actors, compared to technologies without UM linkages. We discuss how technological regimes and industrial policies in catching-up economies interact with the UM system to bridge the catching-up (imitation- and adaptation-based) and post-catching-up (specialization- and creativity-based) phases.
- Published
- 2024
9. Making intellectual property rights work for climate technology transfer and innovation in developing countries
- Author
Jee, Su Jung, Hötte, Kerstin, Ring, Caoimhe, and Burrell, Robert
- Subjects
Economics - General Economics
- Abstract
This study investigates the controversial role of Intellectual Property Rights (IPRs) in climate technology transfer and innovation in developing countries. Using a systematic literature review and expert interviews, we assess the role of IPRs on three sources of climate technology: (1) international technology transfer, (2) adaptive innovation, and (3) indigenous innovation. Our contributions are threefold. First, patents have limited impact in any of these channels, suggesting that current debates over IPRs may be directed towards the wrong targets. Second, trademarks and utility models provide incentives for climate innovation in the countries studied. Third, drawing from the results, we develop a framework to guide policy on how IPRs can work better in the broader context of climate and trade policies, outlining distinct mechanisms to support mitigation and adaptation. Our results indicate that market mechanisms, especially trade and demand-pull policies, should be prioritised for mitigation solutions. Adaptation differs, relying more on indigenous innovation due to local needs and low demand. Institutional mechanisms, such as finance and co-development, should be prioritised to build innovation capacities for adaptation.
- Published
- 2024
10. ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale
- Author
Wang, Xin, Delgado, Hector, Tak, Hemlata, Jung, Jee-weon, Shim, Hye-jin, Todisco, Massimiliano, Kukanov, Ivan, Liu, Xuechen, Sahidullah, Md, Kinnunen, Tomi, Evans, Nicholas, Lee, Kong Aik, and Yamagishi, Junichi
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence, Computer Science - Sound
- Abstract
ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech spoofing and deepfake attacks, and the design of detection solutions. Compared to previous challenges, the ASVspoof 5 database is built from crowdsourced data collected from a vastly greater number of speakers in diverse acoustic conditions. Attacks, also crowdsourced, are generated and tested using surrogate detection models, while adversarial attacks are incorporated for the first time. New metrics support the evaluation of spoofing-robust automatic speaker verification (SASV) as well as stand-alone detection solutions, i.e., countermeasures without ASV. We describe the two challenge tracks, the new database, the evaluation metrics, baselines, and the evaluation platform, and present a summary of the results. Attacks significantly compromise the baseline systems, while submissions bring substantial improvements.
Comment: 8 pages, ASVspoof 5 Workshop (Interspeech2024 Satellite)
- Published
- 2024
11. Spb3DTracker: A Robust LiDAR-Based Person Tracker for Noisy Environment
- Author
Im, Eunsoo, Jee, Changhyun, and Lee, Jung Kwon
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Robotics
- Abstract
Person detection and tracking (PDT) has seen significant advancements with 2D camera-based systems in the autonomous vehicle field, leading to widespread adoption of these algorithms. However, growing privacy concerns have recently emerged as a major issue, prompting a shift towards LiDAR-based PDT as a viable alternative. Within this domain, "Tracking-by-Detection" (TBD) has become a prominent methodology. Despite its effectiveness, LiDAR-based PDT has not yet achieved the same level of performance as camera-based PDT. This paper examines key components of the LiDAR-based PDT framework, including detection post-processing, data association, motion modeling, and lifecycle management. Building upon these insights, we introduce SpbTrack, a robust person tracker designed for diverse environments. Our method achieves superior performance on noisy datasets and state-of-the-art results on KITTI Dataset benchmarks and custom office indoor dataset among LiDAR-based trackers.
Comment: 17 pages, 5 figures
- Published
- 2024
12. RT-Surv: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records
- Author
Park, Sangjoon, Wee, Chan Woo, Choi, Seo Hee, Kim, Kyung Hwan, Chang, Jee Suk, Yoon, Hong In, Lee, Ik Jae, Kim, Yong Bae, Cho, Jaeho, Keum, Ki Chang, Lee, Chang Geol, Byun, Hwa Kyung, and Koom, Woong Sub
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Accurate patient selection is critical in radiotherapy (RT) to prevent ineffective treatments. Traditional survival prediction models, relying on structured data, often lack precision. This study explores the potential of large language models (LLMs) to structure unstructured electronic health record (EHR) data, thereby improving survival prediction accuracy through comprehensive clinical information integration. Data from 34,276 patients treated with RT at Yonsei Cancer Center between 2013 and 2023 were analyzed, encompassing both structured and unstructured data. An open-source LLM was used to structure the unstructured EHR data via single-shot learning, with its performance compared against a domain-specific medical LLM and a smaller variant. Survival prediction models were developed using statistical, machine learning, and deep learning approaches, incorporating both structured and LLM-structured data. Clinical experts evaluated the accuracy of the LLM-structured data. The open-source LLM achieved 87.5% accuracy in structuring unstructured EHR data without additional training, significantly outperforming the domain-specific medical LLM, which reached only 35.8% accuracy. Larger LLMs were more effective, particularly in extracting clinically relevant features like general condition and disease extent, which closely correlated with patient survival. Incorporating LLM-structured clinical features into survival prediction models significantly improved accuracy, with the C-index of deep learning models increasing from 0.737 to 0.820. These models also became more interpretable by emphasizing clinically significant factors. This study shows that general-domain LLMs, even without specific medical training, can effectively structure large-scale unstructured EHR data, substantially enhancing the accuracy and interpretability of clinical predictive models.
Comment: 23 pages, 2 tables, 4 figures
- Published
- 2024
13. Grasp Failure Constraints for Fast and Reliable Pick-and-Place Using Multi-Suction-Cup Grippers
- Author
Lee, Jee-eun, Sun, Robert, Bylard, Andrew, and Sentis, Luis
- Subjects
Computer Science - Robotics
- Abstract
Multi-suction-cup grippers are frequently employed to perform pick-and-place robotic tasks, especially in industrial settings where grasping a wide range of light to heavy objects in limited amounts of time is a common requirement. However, most existing works focus on using one or two suction cups to grasp only irregularly shaped but light objects. There is a lack of research on robust manipulation of heavy objects using larger arrays of suction cups, which introduces challenges in modeling and predicting grasp failure. This paper presents a general approach to modeling grasp strength in multi-suction-cup grippers, introducing new constraints usable for trajectory planning and optimization to achieve fast and reliable pick-and-place maneuvers. The primary modeling challenge is the accurate prediction of the distribution of loads at each suction cup while grasping objects. To solve for this load distribution, we find minimum spring potential energy configurations through a simple quadratic program. This results in a computationally efficient analytical solution that can be integrated to formulate grasp failure constraints in time-optimal trajectory planning. Finally, we present experimental results to validate the efficiency and accuracy of the proposed model.
- Published
- 2024
14. The Effect of Attending Las Americas Middle School on Early High School Outcomes
- Author
Rice University, Houston Education Research Consortium (HERC), Jee Sun Lee, Camila Cigarroa Kennedy, Brian Holzman, and Aimee Chin
- Abstract
This brief evaluates the causal effect of attending Las Americas Middle School on newcomer students' early high school outcomes. Using administrative data from the Houston Independent School District (HISD) spanning the 2007-2008 through 2018-2019 school years, the study examined the academic performance, course-taking patterns, and school engagement of newcomer students who did and did not attend Las Americas. Attending Las Americas increased newcomer students' English end-of-course (EOC) exam scores and decreased students' likelihood of receiving disciplinary actions. Newcomer students who attended Las Americas fared similarly to their newcomer peers at other middle schools on all other outcomes. The brief concludes with a discussion of the limitations of the analysis, as well as potential implications for policy and practice.
- Published
- 2024
15. Inhibition of lysine acetyltransferase KAT6 in ER+HER2- metastatic breast cancer: a phase 1 trial.
- Author
Mukohara, Toru, Park, Yeon, Sommerhalder, David, Yonemori, Kan, Hamilton, Erika, Kim, Sung-Bae, Kim, Jee, Iwata, Hiroji, Yamashita, Toshinari, Layman, Rachel, Mita, Monica, Clay, Timothy, Chae, Yee, Oakman, Catherine, Yan, Fengting, Kim, Gun, Im, Seock-Ah, Lindeman, Geoffrey, Rugo, Hope, Liyanage, Marlon, Saul, Michelle, Le Corre, Christophe, Skoura, Athanasia, Liu, Li, Li, Meng, and LoRusso, Patricia
- Subjects
Humans; Female; Breast Neoplasms; Histone Acetyltransferases; Middle Aged; Receptor, ErbB-2; Receptors, Estrogen; Fulvestrant; Aged; Adult; Neoplasm Metastasis; Antineoplastic Combined Chemotherapy Protocols
- Abstract
Inhibition of histone lysine acetyltransferases (KATs) KAT6A and KAT6B has shown antitumor activity in estrogen receptor-positive (ER+) breast cancer preclinical models. PF-07248144 is a selective catalytic inhibitor of KAT6A and KAT6B. In the present study, we report the safety, pharmacokinetics (PK), pharmacodynamics, efficacy and biomarker results from the first-in-human, phase 1 dose escalation and dose expansion study (n = 107) of PF-07248144 monotherapy and fulvestrant combination in heavily pretreated ER+ human epidermal growth factor receptor-negative (HER2-) metastatic breast cancer (mBC). The primary objectives of assessing the safety and tolerability and determining the recommended dose for expansion of PF-07248144, as monotherapy and in combination with fulvestrant, were met. Secondary endpoints included characterization of PK and evaluation of antitumor activity, including objective response rate (ORR) and progression-free survival (PFS). Common treatment-related adverse events (any grade; grades 3-4) included dysgeusia (83.2%, 0%), neutropenia (59.8%, 35.5%) and anemia (48.6%, 13.1%). Exposure was approximately dose proportional. Antitumor activity was observed as monotherapy. For the PF-07248144-fulvestrant combination (n = 43), the ORR (95% confidence interval (CI)) was 30.2% (95% CI = 17.2-46.1%) and the median PFS was 10.7 (5.3-not evaluable) months. PF-07248144 demonstrated a tolerable safety profile and durable antitumor activity in heavily pretreated ER+HER2- mBC. These findings establish KAT6A and KAT6B as druggable cancer targets, provide clinical proof of concept and reveal a potential avenue to treat mBC. ClinicalTrials.gov registration: NCT04606446.
- Published
- 2024
16. Weak-Lensing Characterization of the Dark Matter in 29 Merging Clusters that Exhibit Radio Relics
- Author
Finner, Kyle, Jee, M. James, Cho, Hyejeon, Hyeonghan, Kim, Lee, Wonki, van Weeren, Reinout J., Wittman, David, and Yoon, Mijin
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics, Astrophysics - Astrophysics of Galaxies
- Abstract
We present a multiwavelength analysis of 29 merging galaxy clusters that exhibit radio relics. For each merging system, we perform a weak-lensing analysis on Subaru optical imaging. We generate high-resolution mass maps of the dark matter distributions, which are critical for discerning the merging constituents. Combining the weak-lensing detections with X-ray emission, radio emission, and galaxy redshifts, we discuss the formation of radio relics from the past collision. For each subcluster, we obtain mass estimates by fitting a multi-component NFW model with and without a concentration-mass relation. Comparing the two mass estimate techniques, we find that the concentration-mass relation underestimates (overestimates) the mass relative to fitting both parameters for high- (low-) mass subclusters. We compare the mass estimates of each subcluster to their velocity dispersion measurements and find that they preferentially lie below the expected velocity dispersion scaling relation, especially at the low-mass end (~$10^{14}\ M_\odot$). We show that the majority of the clusters that exhibit radio relics are in major mergers with a mass ratio below 1:4. We investigate the position of the mass peak relative to the galaxy luminosity peak, number density peak, and BCG locations and find that the BCG tends to better trace the mass peak position. Finally, we update a golden sample of 8 galaxy clusters that have the simplest geometries and can provide the cleanest picture of the past merger, which we recommend for further investigation to constrain the nature of dark matter and the acceleration process that leads to radio relics.
Comment: 55 pages, 36 figures, submitted to ApJS
- Published
- 2024
17. Beyond Silence: Bias Analysis through Loss and Asymmetric Approach in Audio Anti-Spoofing
- Author
Shim, Hye-jin, Sahidullah, Md, Jung, Jee-weon, Watanabe, Shinji, and Kinnunen, Tomi
- Subjects
Computer Science - Sound, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Current trends in audio anti-spoofing detection research strive to improve models' ability to generalize across unseen attacks by learning to identify a variety of spoofing artifacts. This emphasis has primarily focused on the spoof class. Recently, several studies have noted that the distribution of silence differs between the two classes, which can serve as a shortcut. In this paper, we extend class-wise interpretations beyond silence. We employ loss analysis and asymmetric methodologies to move away from traditional attack-focused and result-oriented evaluations towards a deeper examination of model behaviors. Our investigations highlight the significant differences in training dynamics between the two classes, emphasizing the need for future research to focus on robust modeling of the bonafide class.
Comment: 5 pages, 1 figure, 5 tables, ISCA Interspeech 2024 SynData4GenAI Workshop
- Published
- 2024
18. Disentangled Representation Learning for Environment-agnostic Speaker Recognition
- Author
Nam, KiHyun, Heo, Hee-Soo, Jung, Jee-weon, and Chung, Joon Son
- Subjects
Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
This work presents a framework based on feature disentanglement to learn speaker embeddings that are robust to environmental variations. Our framework utilises an auto-encoder as a disentangler, dividing the input speaker embedding into components related to the speaker and other residual information. We employ a group of objective functions to ensure that the auto-encoder's code representation - used as the refined embedding - condenses only the speaker characteristics. We show the versatility of our framework through its compatibility with any existing speaker embedding extractor, requiring no structural modifications or adaptations for integration. We validate the effectiveness of our framework by incorporating it into two popularly used embedding extractors and conducting experiments across various benchmarks. The results show a performance improvement of up to 16%. We release our code for this work at https://github.com/kaistmm/voxceleb-disentangler
Comment: Interspeech 2024. The official webpage can be found at https://mm.kaist.ac.kr/projects/voxceleb-disentangler/
- Published
- 2024
19. On existence of Sadovskii vortex patch: A touching pair of symmetric counter-rotating uniform vortex
- Author
Choi, Kyudong, Jeong, In-Jee, and Sim, Young-Jin
- Subjects
Mathematics - Analysis of PDEs, Mathematical Physics
- Abstract
The Sadovskii vortex patch is a traveling wave for the two-dimensional incompressible Euler equations consisting of an odd symmetric pair of vortex patches touching the symmetry axis. Its existence was first suggested by numerical computations of Sadovskii in [J. Appl. Math. Mech., 1971], and has gained significant interest due to its relevance in inviscid limit of planar flows via Prandtl--Batchelor theory and as the asymptotic state for vortex ring dynamics. In this work, we prove the existence of a Sadovskii vortex patch, by solving the energy maximization problem under the exact impulse condition and an upper bound on the circulation.
Comment: 42 pages, 1 figure
- Published
- 2024
20. On the Evaluation of Speech Foundation Models for Spoken Language Understanding
- Author
Arora, Siddhant, Pasad, Ankita, Chien, Chung-Ming, Han, Jionghao, Sharma, Roshan, Jung, Jee-weon, Dhamyal, Hira, Chen, William, Shon, Suwon, Lee, Hung-yi, Livescu, Karen, and Watanabe, Shinji
- Subjects
Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
The Spoken Language Understanding Evaluation (SLUE) suite of benchmark tasks was recently introduced to address the need for open resources and benchmarking of complex spoken language understanding (SLU) tasks, including both classification and sequence generation tasks, on natural speech. The benchmark has demonstrated preliminary success in using pre-trained speech foundation models (SFM) for these SLU tasks. However, the community still lacks a fine-grained understanding of the comparative utility of different SFMs. Inspired by this, we ask: which SFMs offer the most benefits for these complex SLU tasks, and what is the most effective approach for incorporating these SFMs? To answer this, we perform an extensive evaluation of multiple supervised and self-supervised SFMs using several evaluation protocols: (i) frozen SFMs with a lightweight prediction head, (ii) frozen SFMs with a complex prediction head, and (iii) fine-tuned SFMs with a lightweight prediction head. Although the supervised SFMs are pre-trained on much more speech recognition data (with labels), they do not always outperform self-supervised SFMs; the latter tend to perform at least as well as, and sometimes better than, supervised SFMs, especially on the sequence generation tasks in SLUE. While there is no universally optimal way of incorporating SFMs, the complex prediction head gives the best performance for most tasks, although it increases the inference time. We also introduce an open-source toolkit and performance leaderboard, SLUE-PERB, for these tasks and modeling strategies.
Comment: Accepted at ACL Findings 2024
- Published
- 2024
21. To what extent can ASV systems naturally defend against spoofing attacks?
- Author
Jung, Jee-weon, Wang, Xin, Evans, Nicholas, Watanabe, Shinji, Shim, Hye-jin, Tak, Hemlata, Arora, Siddhant, Yamagishi, Junichi, and Chung, Joon Son
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence
- Abstract
The current automatic speaker verification (ASV) task involves making binary decisions on two types of trials: target and non-target. However, emerging advancements in speech generation technology pose significant threats to the reliability of ASV systems. This study investigates whether ASV effortlessly acquires robustness against spoofing attacks (i.e., zero-shot capability) by systematically exploring diverse ASV systems and spoofing attacks, ranging from traditional to cutting-edge techniques. Through extensive analyses conducted on eight distinct ASV systems and 29 spoofing attack systems, we demonstrate that the evolution of ASV inherently incorporates defense mechanisms against spoofing attacks. Nevertheless, our findings also underscore that the advancement of spoofing attacks far outpaces that of ASV systems, hence necessitating further research on spoofing-robust ASV methodologies.
Comment: 5 pages, 3 figures, 3 tables, Interspeech 2024
- Published
- 2024
22. Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement
- Author
-
Zhang, Wangyou, Saijo, Kohei, Jung, Jee-weon, Li, Chenda, Watanabe, Shinji, and Qian, Yanmin
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing ,Computer Science - Sound - Abstract
Deep learning-based speech enhancement (SE) models have achieved impressive performance in the past decade. Numerous advanced architectures have been designed to deliver state-of-the-art performance; however, their scalability potential remains unrevealed. Meanwhile, the majority of research focuses on small-sized datasets with restricted diversity, leading to a plateau in performance improvement. In this paper, we aim to provide new insights for addressing the above issues by exploring the scalability of SE models in terms of architectures, model sizes, compute budgets, and dataset sizes. Our investigation involves several popular SE architectures and speech data from different domains. Experiments reveal both similarities and distinctions between the scaling effects in SE and other tasks such as speech recognition. These findings further provide insights into the under-explored SE directions, e.g., larger-scale multi-domain corpora and efficiently scalable architectures., Comment: 5 pages, 3 figures, 4 tables, Accepted by Interspeech 2024
- Published
- 2024
23. Learning-Based WiFi Fingerprint Inpainting via Generative Adversarial Networks
- Author
-
Chan, Yu, Lin, Pin-Yu, Tseng, Yu-Yun, Chen, Jen-Jee, and Tseng, Yu-Chee
- Subjects
Electrical Engineering and Systems Science - Signal Processing ,Computer Science - Machine Learning - Abstract
WiFi-based indoor positioning has been extensively studied. A fundamental issue in such solutions is the collection of WiFi fingerprints. However, due to real-world constraints, collecting complete fingerprints at all intended locations is sometimes prohibited. This work considers the WiFi fingerprint inpainting problem. This problem differs from typical image/video inpainting problems in several aspects. Unlike RGB images, WiFi field maps come in any shape, and signal data may follow certain distributions. Therefore, it is difficult to forcefully fit them into a fixed-dimensional matrix, as done with processing images in RGB format. As soon as a map is changed, it also becomes difficult to adapt it to the same model due to scale issues. Furthermore, such models are significantly constrained in situations requiring outward inpainting. Fortunately, the spatial relationships of WiFi signals and the rich information provided among channels offer ample opportunities for this generative model to accomplish inpainting. Therefore, we designed this model to not only retain the characteristic of regression models in generating fingerprints of arbitrary shapes but also to accommodate the observational outcomes from densely deployed APs. This work makes two major contributions. Firstly, we delineate the distinctions between this problem and image inpainting, highlighting potential avenues for research. Secondly, we introduce novel generative inpainting models aimed at capturing both inter-AP and intra-AP correlations while preserving latent information. Additionally, we incorporate a specially designed adversarial discriminator to enhance the quality of inpainting outcomes., Comment: ICCCN2024
- Published
- 2024
24. Trends in Dual Antiplatelet Therapy of Aspirin and Clopidogrel and Outcomes in Ischemic Stroke Patients Noneligible for POINT/CHANCE Trial Treatment.
- Author
-
Kim, Joon-Tae, Lee, Ji, Kim, Hyunsoo, Kim, Beom, Lee, Keon-Joo, Park, Jong-Moo, Kang, Kyusik, Lee, Soo, Kim, Jae, Cha, Jae-Kwan, Kim, Dae-Hyun, Park, Tai, Lee, Kyungbok, Lee, Jun, Hong, Keun-Sik, Cho, Yong-Jin, Park, Hong-Kyun, Lee, Byung-Chul, Yu, Kyung-Ho, Oh, Mi, Kim, Dong-Eog, Choi, Jay, Kwon, Jee-Hyun, Kim, Wook-Joo, Shin, Dong-Ick, Yum, Kyu, Sohn, Sung, Hong, Jeong-Ho, Lee, Sang-Hwa, Park, Man-Seok, Ryu, Wi-Sun, Park, Kwang-Yeol, Lee, Juneyoung, Saver, Jeffrey, and Bae, Hee-Joon
- Subjects
acute ischemic stroke ,aspirin ,clopidogrel ,dual antiplatelet treatment ,late‐presenting stroke ,nonminor stroke ,Humans ,Clopidogrel ,Aspirin ,Male ,Aged ,Female ,Ischemic Stroke ,Dual Anti-Platelet Therapy ,Platelet Aggregation Inhibitors ,Registries ,Middle Aged ,Treatment Outcome ,Aged ,80 and over ,Time Factors ,Japan ,Secondary Prevention ,Drug Therapy ,Combination ,Risk Factors - Abstract
BACKGROUND: Recent clinical trials established the benefit of dual antiplatelet therapy with aspirin and clopidogrel (DAPT-AC) in early-presenting patients with minor ischemic stroke. However, the impact of these trials over time on the use and outcomes of DAPT-AC among the patients with nonminor or late-presenting stroke who do not meet the eligibility criteria of these trials has not been delineated. METHODS AND RESULTS: In a multicenter stroke registry, this study examined yearly changes from April 2008 to August 2022 in DAPT-AC use for stroke patients ineligible for CHANCE/POINT (Clopidogrel in High-Risk Patients with Acute Nondisabling Cerebrovascular Events/Platelet-Oriented Inhibition in New TIA and Minor Ischemic Stroke) clinical trials due to National Institutes of Health Stroke Scale >4 or late arrival beyond 24 hours of onset. A total of 32 118 patients (age, 68.1±13.1 years; male, 58.5%) with National Institutes of Health Stroke Scale of 4 (interquartile range, 1-7) were analyzed. In 2008, DAPT-AC was used in 33.0%, other antiplatelets in 62.7%, and no antiplatelet in 4.3%. The frequency of DAPT-AC was relatively unchanged through 2013, when the CHANCE trial was published, and then increased steadily, reaching 78% in 2022, while other antiplatelets decreased to 17.8% in 2022 (Ptrend
- Published
- 2024
25. Abell 746: A Highly Disturbed Cluster Undergoing Multiple Mergers
- Author
-
Rajpurohit, K, Lovisari, L, Botteon, A, Jones, C, Forman, W, O’Sullivan, E, van Weeren, RJ, HyeongHan, K, Bonafede, A, Jee, MJ, Vazza, F, Brunetti, G, Cho, H, Domínguez-Fernández, P, Stroe, A, Finner, K, Brüggen, M, Vrtilek, JM, David, LP, Schellenberger, G, Wittman, D, Lusetti, G, Kraft, R, and De Gasperin, F
- Subjects
Astronomical Sciences ,Physical Sciences ,Astronomical and Space Sciences ,Atomic ,Molecular ,Nuclear ,Particle and Plasma Physics ,Physical Chemistry (incl. Structural) ,Astronomy & Astrophysics ,Astronomical sciences ,Particle and high energy physics ,Space sciences - Abstract
We present deep XMM-Newton, Karl G. Jansky Very Large Array, and upgraded Giant Metrewave Radio Telescope observations of Abell 746, a cluster that hosts a plethora of diffuse emission sources that provide evidence for the acceleration of relativistic particles. Our new XMM-Newton images reveal a complex morphology of the thermal gas with several substructures. We observe an asymmetric temperature distribution across the cluster: the southern regions exhibit higher temperatures, reaching ∼9 keV, while the northern regions have lower temperatures (≤4 keV), likely due to a complex merger. We find evidence of three surface brightness edges and one candidate edge, of which three are merger-driven shock fronts. Combining our new data with published LOw-Frequency ARray observations has unveiled the nature of diffuse sources in this system. The bright NW relic shows thin filaments and a high degree of polarization with aligned magnetic field vectors. We detect a density jump, aligned with the fainter relic to the north. To the south, we detect high-temperature regions, consistent with the shock-heated regions and a density jump coincident with the northern tip of the southern radio structure. Its integrated spectrum shows a high-frequency steepening. Lastly, we find that the cluster hosts large-scale radio halo emission. A comparison of the thermal and nonthermal emission reveals an anticorrelation between the bright radio and X-ray features at the center. Our findings suggest that Abell 746 is a complex system that involves multiple mergers.
- Published
- 2024
26. Direct Evidence of a Major Merger in the Perseus Cluster
- Author
-
HyeongHan, Kim, Jee, M. James, Lee, Wonki, ZuHone, John, Zhuravleva, Irina, Kang, Wooseok, and Hwang, Ho Seong
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
Although the Perseus cluster has often been regarded as an archetypical relaxed galaxy cluster, several lines of evidence including ancient, large-scale cold fronts, asymmetric plasma morphology, filamentary galaxy distribution, etc., provide a conflicting view of its dynamical state, suggesting that the cluster might have experienced a major merger. However, the absence of a clear merging companion identified to date hampers our understanding of the evolutionary track of the Perseus cluster consistent with these observational features. In this paper, through careful weak lensing analysis, we successfully identified the missing subcluster halo ($M_{200}=1.70^{+0.73}_{-0.59}\times10^{14}~M_{\odot}$) at the >5$\sigma$ level centered on NGC1264, which is located ~430 kpc west of the Perseus main cluster core. Moreover, a significant ($>3\sigma$) mass bridge, which is also traced by the cluster member galaxies, is detected between the Perseus main and sub clusters, which serves as direct evidence of gravitational interaction. With idealized numerical simulations, we demonstrate that a ~3:1 off-axis major merger can create the cold front observed ~700 kpc east of the main cluster core and also generate the observed mass bridge through multiple core crossings., Comment: The current version is a submitted manuscript
- Published
- 2024
27. On illposedness of the Hall and electron magnetohydrodynamic equations without resistivity on the whole space
- Author
-
Jeong, In-Jee and Oh, Sung-Jin
- Subjects
Mathematics - Analysis of PDEs - Abstract
It has been shown in our previous work that the incompressible and irresistive Hall- and electron-magnetohydrodynamic (MHD) equations are illposed on flat domains $M = \mathbb{R}^k \times \mathbb{T}^{3-k}$ for $0 \le k \le 2$. The data and solutions therein were assumed to be independent of one coordinate, which not only significantly simplifies the systems but also allows for a large class of steady states. In this work, we remove the assumption of independence and conclude strong illposedness for compactly supported data in $\mathbb{R}^3$. This is achieved by constructing degenerating wave packets for linearized systems around time-dependent axisymmetric magnetic fields. A few main additional ingredients are: a more systematic application of the generalized energy estimate, use of the Bogovski\v{i} operator, and a priori estimates for axisymmetric solutions to the Hall- and electron-MHD systems., Comment: 36 pages
- Published
- 2024
28. Combined Pre-Supernova Alert System with KamLAND and Super-Kamiokande
- Author
-
KamLAND, Collaborations, Super-Kamiokande, Abe, Seisho, Eizuka, Minori, Futagi, Sawako, Gando, Azusa, Gando, Yoshihito, Goto, Shun, Hachiya, Takahiko, Hata, Kazumi, Ichimura, Koichi, Ieki, Sei, Ikeda, Haruo, Inoue, Kunio, Ishidoshiro, Koji, Kamei, Yuto, Kawada, Nanami, Kishimoto, Yasuhiro, Koga, Masayuki, Kurasawa, Maho, Mitsui, Tadao, Miyake, Haruhiko, Morita, Daisuke, Nakahata, Takeshi, Nakajima, Rika, Nakamura, Kengo, Nakamura, Rikuo, Nakamura, Ryo, Nakane, Jun, Ozaki, Hideyoshi, Saito, Keita, Sakai, Taichi, Shimizu, Itaru, Shirai, Junpei, Shiraishi, Kensuke, Shoji, Ryunosuke, Suzuki, Atsuto, Takeuchi, Atsuto, Tamae, Kyoko, Watanabe, Hiroko, Watanabe, Kazuho, Yoshida, Sei, Umehara, Saori, Fushimi, Ken-Ichi, Kotera, Kenta, Urano, Yusuke, Berger, Bruce E., Fujikawa, Brian K., Larned, John G., Maricic, Jelena, Fu, Zhenghao, Smolsky, Joseph, Winslow, Lindley A., Efremenko, Yuri, Karwowski, Hugon J., Markoff, Diane M., Tornow, Werner, Dell'Oro, Stefano, O'Donnell, Thomas, Detwiler, Jason A., Enomoto, Sanshiro, Decowski, Michal P., Weerman, Kelly M., Grant, Christopher, Song, Hasung, Li, Aobo, Axani, Spencer N., Garcia, Miles, Abe, Ko, Bronner, Christophe, Hayato, Yoshinari, Hiraide, Katsuki, Hosokawa, Keishi, Ieki, Kei, Ikeda, Motoyasu, Kameda, June, Kanemura, Yuki, Kaneshima, Ryota, Kashiwagi, Yuri, Kataoka, Yousuke, Miki, Shintaro, Mine, Shunichi, Miura, Makoto, Moriyama, Shigetaka, Nakahata, Masayuki, Nakano, Yuuki, Nakayama, Shoei, Noguchi, Yohei, Sato, Kazufumi, Sekiya, Hiroyuki, Shiba, Hayato, Shimizu, Kotaro, Shiozawa, Masato, Sonoda, Yutaro, Suzuki, Yoichiro, Takeda, Atsushi, Takemoto, Yasuhiro, Tanaka, Hidekazu K., Yano, Takatomi, Han, Seungho, Kajita, Takaaki, Okumura, Kimihiro, Tashiro, Takuya, Tomiya, Takuya, Wang, Xubin, Yoshida, Shunsuke, Fernandez, Pablo, Labarga, Luis, Ospina, Nataly, Zaldivar, Bryan, Pointon, Barry W., Kearns, Edward, Raaf, Jennifer L., Wan, Linyan, Wester, Thomas, Bian, Jianming, Griskevich, Jeff, Smy, Michael B., Sobel, Henry W., 
Takhistov, Volodymyr, Yankelevich, Alejandro, Hill, James, Jang, MinCheol, Lee, Seonghak, Moon, DongHo, Park, RyeongGyoon, Bodur, Baran, Scholberg, Kate, Walter, Chris W., Beauchêne, Antoine, Drapier, Olivier, Giampaolo, Alberto, Mueller, Thomas A., Santos, Andrew D., Paganini, Pascal, Quilain, Benjamin, Rogly, Rudolph, Nakamura, Taku, Jang, Jee-Seung, Machado, Lucas N., Learned, John G., Choi, Koun, Iovine, Nadege, Cao, Son V., Anthony, Lauren H. V., Martin, Daniel G. R., Prouse, Nick W., Scott, Mark, Uchida, Yoshi, Berardi, Vincenzo, Calabria, Nicola F., Catanesi, M. G., Radicioni, Emilio, Langella, Aurora, de Rosa, Gianfranca, Collazuol, Gianmaria, Feltre, Matteo, Iacob, Fabio, Mattiazzi, Marco, Ludovici, Lucio, Gonin, Michel, Périssé, Lorenzo, Pronost, Guillaume, Fujisawa, Chiori, Horiuchi, Shogo, Kobayashi, Misaki, Liu, Yu-Ming, Maekawa, Yuto, Nishimura, Yasuhiro, Okazaki, Reo, Akutsu, Ryosuke, Friend, Megan, Hasegawa, Takuya, Ishida, Taku, Kobayashi, Takashi, Jakkapu, Mahesh, Matsubara, Tsunayuki, Nakadaira, Takeshi, Nakamura, Kenzo, Oyama, Yuichi, Sakashita, Ken, Sekiguchi, Tetsuro, Tsukamoto, Toshifumi, Yrey, Antoniosk Portocarrero, Bhuiyan, Nahid, Burton, George T., Di Lodovico, Francesca, Gao, Joanna, Goldsack, Alexander, Katori, Teppei, Migenda, Jost, Ramsden, Rory M., Xie, Zhenxiong, Zsoldos, Stephane, Suzuki, Atsumu T., Takagi, Yusuke, Takeuchi, Yasuo, Zhong, Haiwen, Feng, Jiahui, Feng, Li-Cheng, Hu, Jianrun, Hu, Zhuojun, Kawaue, Masaki, Kikawa, Tatsuya, Mori, Masamitsu, Nakaya, Tsuyoshi, Wendell, Roger A., Yasutome, Kenji, Jenkins, Sam J., McCauley, Neil K., Mehta, Pruthvi, Tarrant, Adam, Wilking, Mike J., Fukuda, Yoshiyuki, Itow, Yoshitaka, Menjo, Hiroaki, Ninomiya, Kotaro, Yoshioka, Yushi, Lagoda, Justyna, Mandal, Maitrayee, Mijakowski, Piotr, Prabhu, Yashwanth S., Zalipska, Joanna, Jia, Mo, Jiang, Junjie, Shi, Wei, Yanagisawa, Chiaki, Harada, Masayuki, Hino, Yota, Ishino, Hirokazu, Koshio, Yusuke, Nakanishi, Fumi, Sakai, Seiya, Tada, Tomoaki, Tano, 
Tomohiro, Ishizuka, Takeharu, Barr, Giles, Barrow, Daniel, Cook, Laurence, Samani, Soniya, Wark, David, Holin, Anna, Nova, Federico, Jung, Seunghyun, Yang, Byeongsu, Yang, JeongYeol, Yoo, Jonghee, Fannon, Jack E. P., Kneale, Liz, Malek, Matthew, McElwee, Jordan M., Thiesse, Matthew D., Thompson, Lee F., Wilson, Stephen T., Okazawa, Hiroko, Mohan, Lakshmi S., Kim, SooBong, Kwon, Eunhyang, Seo, Ji-Woong, Yu, Intae, Ichikawa, Atsuko K., Nakamura, Kiseki D., Tairafune, Seidai, Nishijima, Kyoshi, Eguchi, Aoi, Nakagiri, Kota, Nakajima, Yasuhiro, Shima, Shizuka, Taniuchi, Natsumi, Watanabe, Eiichiro, Yokoyama, Masashi, de Perio, Patrick, Fujita, Saki, Jesus-Valls, Cesar, Martens, Kai, Tsui, Ka M., Vagins, Mark R., Xia, Junjie, Izumiyama, Shota, Kuze, Masahiro, Matsumoto, Ryo, Terada, Kotaro, Asaka, Ryusei, Ishitsuka, Masaki, Ito, Hiroshi, Ommura, Yuga, Shigeta, Natsuki, Shinoki, Masataka, Yamauchi, Koki, Yoshida, Tsukasa, Gaur, Rhea, Gousy-Leblan, Vincent, Hartz, Mark, Konaka, Akira, Li, Xiaoyue, Chen, Shaomin, Xu, Benda, Zhang, Aiqiang, Zhang, Bin, Posiadala-Zezula, Magdalena, Boyd, Steven B., Edwards, Rory, Hadley, David, Nicholson, Matthew, O'Flaherty, Marcus, Richards, Benjamin, Ali, Ajmi, Jamieson, Blair, Amanai, Shogo, Marti-Magro, Lluis, Minamino, Akihiro, Shibayama, Ryo, and Suzuki, Serina
- Subjects
High Energy Physics - Experiment ,Astrophysics - High Energy Astrophysical Phenomena ,Physics - Instrumentation and Detectors - Abstract
Preceding a core-collapse supernova, various processes produce an increasing amount of neutrinos of all flavors characterized by mounting energies from the interior of massive stars. Among them, the electron antineutrinos are potentially detectable by terrestrial neutrino experiments such as KamLAND and Super-Kamiokande via inverse beta decay interactions. Once these pre-supernova neutrinos are observed, an early warning of the upcoming core-collapse supernova can be provided. In light of this, KamLAND and Super-Kamiokande, both located in the Kamioka mine in Japan, have been monitoring pre-supernova neutrinos since 2015 and 2021, respectively. Recently, we performed a joint study between KamLAND and Super-Kamiokande on pre-supernova neutrino detection. A pre-supernova alert system combining the KamLAND detector and the Super-Kamiokande detector was developed and put into operation, which can provide a supernova alert to the astrophysics community. Fully leveraging the complementary properties of these two detectors, the combined alert is expected to resolve a pre-supernova neutrino signal from a 15 M$_{\odot}$ star within 510 pc of the Earth, at a significance level corresponding to a false alarm rate of no more than 1 per century. For a Betelgeuse-like model with optimistic parameters, it can provide early warnings up to 12 hours in advance., Comment: Resubmitted to ApJ. 22 pages, 16 figures, for more information about the combined pre-supernova alert system, see https://www.lowbg.org/presnalarm/
- Published
- 2024
29. On the Performance of Jerk-Constrained Time-Optimal Trajectory Planning for Industrial Manipulators
- Author
-
Lee, Jee-eun, Bylard, Andrew, Sun, Robert, and Sentis, Luis
- Subjects
Computer Science - Robotics - Abstract
Jerk-constrained trajectories offer a wide range of advantages that collectively improve the performance of robotic systems, including increased energy efficiency, durability, and safety. In this paper, we present a novel approach to jerk-constrained time-optimal trajectory planning (TOTP), which follows a specified path while satisfying up to third-order constraints to ensure safety and smooth motion. One significant challenge in jerk-constrained TOTP is a non-convex formulation arising from the inclusion of third-order constraints. Approximating inequality constraints can be particularly challenging because the resulting solutions may violate the actual constraints. We address this problem by leveraging convexity within the proposed formulation to form conservative inequality constraints. We then obtain the desired trajectories by solving an $\boldsymbol n$-dimensional Sequential Linear Program (SLP) iteratively until convergence. Lastly, we evaluate in a real robot the performance of trajectories generated with and without jerk limits in terms of peak power, torque efficiency, and tracking capability.
- Published
- 2024
30. NeuroNet: A Novel Hybrid Self-Supervised Learning Framework for Sleep Stage Classification Using Single-Channel EEG
- Author
-
Lee, Cheol-Hui, Kim, Hakseung, Han, Hyun-jee, Jung, Min-Kyung, Yoon, Byung C., and Kim, Dong-Joo
- Subjects
Computer Science - Human-Computer Interaction ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning ,Electrical Engineering and Systems Science - Signal Processing - Abstract
The classification of sleep stages is a pivotal aspect of diagnosing sleep disorders and evaluating sleep quality. However, the conventional manual scoring process, conducted by clinicians, is time-consuming and prone to human bias. Recent advancements in deep learning have substantially propelled the automation of sleep stage classification. Nevertheless, challenges persist, including the need for large datasets with labels and the inherent biases in human-generated annotations. This paper introduces NeuroNet, a self-supervised learning (SSL) framework designed to effectively harness unlabeled single-channel sleep electroencephalogram (EEG) signals by integrating contrastive learning tasks and masked prediction tasks. NeuroNet demonstrates superior performance over existing SSL methodologies through extensive experimentation conducted across three polysomnography (PSG) datasets. Additionally, this study proposes a Mamba-based temporal context module to capture the relationships among diverse EEG epochs. Combining NeuroNet with the Mamba-based temporal context module has demonstrated the capability to achieve, or even surpass, the performance of the latest supervised learning methodologies, even with a limited amount of labeled data. This study is expected to establish a new benchmark in sleep stage classification, promising to guide future research and applications in the field of sleep analysis., Comment: 14 pages, 4 figures
- Published
- 2024
31. Substructures within Substructures in the Complex Post-Merging System A514 Unveiled by High-Resolution Magellan/Megacam Weak Lensing
- Author
-
Ahn, Eunmo, Jee, M. James, Lee, Wonki, Joo, Hyungjin, and ZuHone, John
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
Abell 514 (A514) at $z=0.071$ is an intriguing merging system exhibiting highly elongated (~1 Mpc) X-ray features and three large-scale (300~500 kpc) bent radio jets. To dissect this system with its multi-wavelength data, it is critical to robustly identify and quantify its dark matter (DM) substructures. We present a weak-lensing analysis of A514 using deep Magellan/Megacam observations. Combining two optical band filter imaging data obtained under optimal seeing (~0.''6) and leveraging the proximity of A514, we achieve a high source density of ~$46\mbox{arcmin}^{-2}$ or ~$\mathrm{6940Mpc^{-2}}$, which enables high-resolution mass reconstruction. We unveil the complex DM substructures of A514, which are characterized by the NW and SE subclusters separated by ~$0.7$Mpc, each exhibiting a bimodal mass distribution. The total mass of the NW subcluster is estimated to be $\mathrm{M^{NW}_{200c} = 1.08_{-0.22}^{+0.24} \times 10^{14} M_{\odot}}$ and is further resolved into the eastern ($\mathrm{M^{NW_E}_{200c} = 2.6_{-1.1}^{+1.4} \times 10^{13} M_{\odot}})$ and western ($\mathrm{M^{NW_W}_{200c} = 7.1_{-2.0}^{+2.3} \times 10^{13} M_{\odot}}$) components. The mass of the SE subcluster is $\mathrm{M^{SE}_{200c} = 1.55_{-0.26}^{+0.28} \times 10^{14} M_{\odot}}$, which is also further resolved into the northern ($\mathrm{M^{SE_N}_{200c} = 2.9_{-1.3}^{+1.8} \times 10^{13} M_{\odot}}$) and southern ($\mathrm{M^{SE_S}_{200c} = 8.5_{-2.6}^{+3.0} \times 10^{13} M_{\odot}}$) components. These four substructures coincide with the A514 brightest galaxies and are detected with significances ranging from 3.4$\sigma$ to 4.8$\sigma$. Comparison of the dark matter substructures with the X-ray distribution suggests that A514 might have experienced an off-axis collision, and the NW and SE subclusters are currently near their apocenters., Comment: 15 pages, 9 figures, Submitted to ApJ
- Published
- 2024
32. Deeper, Sharper, Faster: Application of Efficient Transformer to Galaxy Image Restoration
- Author
-
Park, Hyosun, Jo, Yongsik, Kang, Seokun, Kim, Taehwan, and Jee, M. James
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics - Abstract
The Transformer architecture has revolutionized the field of deep learning over the past several years in diverse areas, including natural language processing, code generation, image recognition, time series forecasting, etc. We propose to apply Zamir et al.'s efficient transformer to perform deconvolution and denoising to enhance astronomical images. We conducted experiments using pairs of high-quality images and their degraded versions, and our deep learning model demonstrates exceptional restoration of photometric, structural, and morphological information. When compared with the ground-truth JWST images, the enhanced versions of our HST-quality images reduce the scatter of isophotal photometry, Sersic index, and half-light radius by factors of 4.4, 3.6, and 4.7, respectively, with Pearson correlation coefficients approaching unity. The performance is observed to degrade when input images exhibit correlated noise, point-like sources, and artifacts. We anticipate that this deep learning model will prove valuable for a number of scientific applications, including precision photometry, morphological analysis, and shear calibration., Comment: 18 pages, 14 figures, 1 table, Resubmitted to ApJ after the first revision
- Published
- 2024
33. SN H0pe: The First Measurement of $H_0$ from a Multiply-Imaged Type Ia Supernova, Discovered by JWST
- Author
-
Pascale, Massimo, Frye, Brenda L., Pierel, Justin D. R., Chen, Wenlei, Kelly, Patrick L., Cohen, Seth H., Windhorst, Rogier A., Riess, Adam G., Kamieneski, Patrick S., Diego, Jose M., Meena, Ashish K., Cha, Sangjun, Oguri, Masamune, Zitrin, Adi, Jee, M. James, Foo, Nicholas, Leimbach, Reagen, Koekemoer, Anton M., Conselice, C. J., Dai, Liang, Goobar, Ariel, Siebert, Matthew R., Strolger, Lou, and Willner, S. P.
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
The first James Webb Space Telescope ({\it JWST}) Near InfraRed Camera (NIRCam) imaging in the field of the galaxy cluster PLCK G165.7+67.0 ($z=0.35$) uncovered a Type Ia supernova (SN~Ia) at $z=1.78$, called ``SN H0pe." Three different images of this one SN were detected as a result of strong gravitational lensing, each one traversing a different path in spacetime, thereby inducing a relative delay in the arrival of each image. Follow-up {\it JWST} observations of all three SN images enabled photometric and rare spectroscopic measurements of the two relative time delays. Following strict blinding protocols which oversaw a live unblinding and regulated post-unblinding changes, these two measured time delays were compared to the predictions of seven independently constructed cluster lens models to measure a value for the Hubble constant, $H_0=71.8^{+9.8}_{-7.6}$~km~s$^{-1}$~Mpc$^{-1}$. The range of admissible $H_0$ values predicted across the lens models limits further precision, reflecting the well-known degeneracies between lens model constraints and time delays. It has long been theorized that a way forward is to leverage a standard candle, however this has not been realized until now. For the first time, the lens models are evaluated by their agreement with the SN absolute magnification, breaking these degeneracies and producing our best estimate, $H_0=75.4^{+8.1}_{-5.5}$~km~s$^{-1}$~Mpc$^{-1}$. This is the first precision measurement of $H_0$ from a multiply-imaged SN~Ia, and provides a measurement in a rarely utilized redshift regime. This result agrees with other local universe measurements, yet exceeds the value of $H_0$ derived from the early Universe with $\gtrsim90\%$ confidence, increasing evidence of the Hubble tension. With the precision provided by only four more events, this approach could solidify this disagreement to $>3\sigma$., Comment: Submitted to ApJ. 22 pages, 7 Figures
- Published
- 2024
34. Map-Aware Human Pose Prediction for Robot Follow-Ahead
- Author
-
Jiang, Qingyuan, Susam, Burak, Chao, Jun-Jee, and Isler, Volkan
- Subjects
Computer Science - Robotics - Abstract
In the robot follow-ahead task, a mobile robot is tasked to maintain its relative position in front of a moving human actor while keeping the actor in sight. To accomplish this task, it is important that the robot understand the full 3D pose of the human (since the head orientation can be different than the torso) and predict future human poses so as to plan accordingly. This prediction task is especially tricky in a complex environment with junctions and multiple corridors. In this work, we address the problem of forecasting the full 3D trajectory of a human in such environments. Our main insight is to show that one can first predict the 2D trajectory and then estimate the full 3D trajectory by conditioning the estimator on the predicted 2D trajectory. With this approach, we achieve results comparable or better than the state-of-the-art methods three times faster. As part of our contribution, we present a new dataset where, in contrast to existing datasets, the human motion is in a much larger area than a single room. We also present a complete robot system that integrates our human pose forecasting network on the mobile robot to enable real-time robot follow-ahead and present results from real-world experiments in multiple buildings on campus. Our project page, including supplementary material and videos, can be found at: https://qingyuan-jiang.github.io/iros2024_poseForecasting/
- Published
- 2024
35. Accelerating Sparse Tensor Decomposition Using Adaptive Linearized Representation
- Author
-
Laukemann, Jan, Helal, Ahmed E., Anderson, S. Isaac Geronimo, Checconi, Fabio, Soh, Yongseok, Tithi, Jesmin Jahan, Ranadive, Teresa, Gravelle, Brian J, Petrini, Fabrizio, and Choi, Jee
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Data Structures and Algorithms ,Computer Science - Performance - Abstract
High-dimensional sparse data emerge in many critical application domains such as cybersecurity, healthcare, anomaly detection, and trend analysis. To quickly extract meaningful insights from massive volumes of these multi-dimensional data, scientists employ unsupervised analysis tools based on tensor decomposition (TD) methods. However, real-world sparse tensors exhibit highly irregular shapes, data distributions, and sparsity, which pose significant challenges for making efficient use of modern parallel architectures. This study breaks the prevailing assumption that compressing sparse tensors into coarse-grained structures (i.e., tensor slices or blocks) or along a particular dimension/mode (i.e., mode-specific) is more efficient than keeping them in a fine-grained, mode-agnostic form. Our novel sparse tensor representation, Adaptive Linearized Tensor Order (ALTO), encodes tensors in a compact format that can be easily streamed from memory and is amenable to both caching and parallel execution. To demonstrate the efficacy of ALTO, we accelerate popular TD methods that compute the Canonical Polyadic Decomposition (CPD) model across a range of real-world sparse tensors. Additionally, we characterize the major execution bottlenecks of TD methods on multiple generations of the latest Intel Xeon Scalable processors, including Sapphire Rapids CPUs, and introduce dynamic adaptation heuristics to automatically select the best algorithm based on the sparse tensor characteristics. Across a diverse set of real-world data sets, ALTO outperforms the state-of-the-art approaches, achieving more than an order-of-magnitude speedup over the best mode-agnostic formats. 
Compared to the best mode-specific formats, which require multiple tensor copies, ALTO achieves more than 5.1x geometric mean speedup at a fraction (25%) of their storage., Comment: We extend the results of our previous ICS paper to significantly improve the parallel performance of the Canonical Polyadic Alternating Least Squares (CP-ALS) algorithm for normally distributed data and the Canonical Polyadic Alternating Poisson Regression (CP-APR) algorithm for non-negative count data
- Published
- 2024
36. a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification
- Author
-
Shim, Hye-jin, Jung, Jee-weon, Kinnunen, Tomi, Evans, Nicholas, Bonastre, Jean-Francois, and Lapidot, Itshak
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing ,Computer Science - Machine Learning - Abstract
Spoofing detection is today a mainstream research topic. Standard metrics can be applied to evaluate the performance of isolated spoofing detection solutions and others have been proposed to support their evaluation when they are combined with speaker detection. These either have well-known deficiencies or restrict the architectural approach to combine speaker and spoof detectors. In this paper, we propose an architecture-agnostic detection cost function (a-DCF). A generalisation of the original DCF used widely for the assessment of automatic speaker verification (ASV), the a-DCF is designed for the evaluation of spoofing-robust ASV. Like the DCF, the a-DCF reflects the cost of decisions in a Bayes risk sense, with explicitly defined class priors and detection cost model. We demonstrate the merit of the a-DCF through the benchmarking evaluation of architecturally-heterogeneous spoofing-robust ASV solutions., Comment: 8 pages, submitted to Speaker Odyssey 2024
- Published
- 2024
37. GNSS Positioning using Cost Function Regulated Multilateration and Graph Neural Networks
- Author
-
Jalalirad, Amir, Belli, Davide, Major, Bence, Jee, Songwon, Shah, Himanshu, and Morrison, Will
- Subjects
Computer Science - Machine Learning ,Electrical Engineering and Systems Science - Signal Processing - Abstract
In urban environments, where line-of-sight signals from GNSS satellites are frequently blocked by high-rise objects, GNSS receivers are subject to large errors in measuring satellite ranges. Heuristic methods are commonly used to estimate these errors and reduce the impact of noisy measurements on localization accuracy. In our work, we replace these error estimation heuristics with a deep learning model based on Graph Neural Networks. Additionally, by analyzing the cost function of the multilateration process, we derive an optimal method to utilize the estimated errors. Our approach guarantees that the multilateration converges to the receiver's location as the error estimation accuracy increases. We evaluate our solution on a real-world dataset containing more than 100k GNSS epochs, collected from multiple cities with diverse characteristics. The empirical results show improvements from 40% to 80% in the horizontal localization error against recent deep learning baselines as well as classical localization approaches., Comment: Published in The Proceedings of the Institute of Navigation GNSS+ 2023
- Published
- 2024
38. Objective and Interpretable Breast Cosmesis Evaluation with Attention Guided Denoising Diffusion Anomaly Detection Model
- Author
-
Park, Sangjoon, Kim, Yong Bae, Chang, Jee Suk, Choi, Seo Hee, Chung, Hyungjin, Lee, Ik Jae, and Byun, Hwa Kyung
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
As advancements in the field of breast cancer treatment continue to progress, the assessment of post-surgical cosmetic outcomes has gained increasing significance due to its substantial impact on patients' quality of life. However, evaluating breast cosmesis presents challenges due to the inherently subjective nature of expert labeling. In this study, we present a novel automated approach, Attention-Guided Denoising Diffusion Anomaly Detection (AG-DDAD), designed to assess breast cosmesis following surgery, addressing the limitations of conventional supervised learning and existing anomaly detection models. Our approach leverages the attention mechanism of the distillation with no label (DINO) self-supervised Vision Transformer (ViT) in combination with a diffusion model to achieve high-quality image reconstruction and precise transformation of discriminative regions. By training the diffusion model on unlabeled data predominantly with normal cosmesis, we adopt an unsupervised anomaly detection perspective to automatically score the cosmesis. Real-world data experiments demonstrate the effectiveness of our method, providing visually appealing representations and quantifiable scores for cosmesis evaluation. Compared to commonly used rule-based programs, our fully automated approach eliminates the need for manual annotations and offers objective evaluation. Moreover, our anomaly detection model exhibits state-of-the-art performance, surpassing existing models in accuracy. Going beyond the scope of breast cosmesis, our research represents a significant advancement in unsupervised anomaly detection within the medical domain, thereby paving the way for future investigations.
- Published
- 2024
39. On global regularity of some bi-rotational Euler flows in $\mathbb{R}^{4}$
- Author
-
Choi, Kyudong, Jeong, In-Jee, and Lim, Deokwoo
- Subjects
Mathematics - Analysis of PDEs ,76B47, 35Q35 - Abstract
In this paper, we consider incompressible Euler flows in $ \mathbb{R}^{4} $ under bi-rotational symmetry, namely solutions that are invariant under rotations in $\mathbb{R}^{4}$ fixing either the first two or last two axes. With the additional swirl-free assumption, our first main result gives local wellposedness of Yudovich-type solutions, extending the work of Danchin [Uspekhi Mat. Nauk 62(2007), no.3, 73-94] for axisymmetric flows in $\mathbb{R}^{3}$. The second main result establishes global wellposedness under additional decay conditions near the axes and at infinity. This in particular gives global regularity of $C^{\infty}$ smooth and decaying Euler flows in $\mathbb{R}^{4}$ subject to bi-rotational symmetry without swirl., Comment: 25 pages, 2 figures
- Published
- 2024
40. Spatial Distribution of Intracluster Light versus Dark Matter in Horizon Run 5
- Author
-
Yoo, Jaewon, Park, Changbom, Sabiu, Cristiano G., Singh, Ankit, Ko, Jongwan, Lee, Jaehyun, Pichon, Christophe, Jee, M. James, Gibson, Brad K., Snaith, Owain, Kim, Juhan, Shin, Jihye, Kim, Yonghwi, and Kim, Hyowon
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
One intriguing approach for studying the dynamical evolution of galaxy clusters is to compare the spatial distributions among various components, such as dark matter, member galaxies, gas, and intracluster light (ICL). Utilizing the recently introduced Weighted Overlap Coefficient (WOC) \citep{2022ApJS..261...28Y}, we analyze the spatial distributions of components within 174 galaxy clusters ($M_{\rm tot}> 5 \times 10^{13} M_{\odot}$, $z=0.625$) at varying dynamical states in the cosmological hydrodynamical simulation Horizon Run 5. We observe that the distributions of gas and the combination of ICL with the brightest cluster galaxy (BCG) closely resemble the dark matter distribution, particularly in more relaxed clusters, characterized by the half-mass epoch. The similarity in spatial distribution between dark matter and BCG+ICL mimics the changes in the dynamical state of clusters during a major merger. Notably, at redshifts $>$ 1, BCG+ICL traced dark matter more accurately than the gas. Additionally, we examined the one-dimensional radial profiles of each component, which show that the BCG+ICL is a sensitive component revealing the dynamical state of clusters. We propose a new method that can approximately recover the dark matter profile by scaling the BCG+ICL radial profile. Furthermore, we find a recipe for tracing dark matter in unrelaxed clusters by including the most massive satellite galaxies together with BCG+ICL distribution. Combining the BCG+ICL and the gas distribution enhances the dark matter tracing ability. Our results imply that the BCG+ICL distribution is an effective tracer for the dark matter distribution, and the similarity of spatial distribution may be a useful probe of the dynamical state of a cluster., Comment: 23 pages, 12 figures, accepted for publication in ApJ
- Published
- 2024
41. TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
- Author
-
Kim, Minsu, Jung, Jee-weon, Rha, Hyeongseop, Maiti, Soumi, Arora, Siddhant, Chang, Xuankai, Watanabe, Shinji, and Ro, Yong Man
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Computer Vision and Pattern Recognition ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
The capability to jointly process multi-modal information is becoming an essential task. However, the limited number of paired multi-modal data and the large computational requirements in multi-modal learning hinder the development. We propose a novel Tri-Modal Translation (TMT) model that translates between arbitrary modalities spanning speech, image, and text. We introduce a novel viewpoint, where we interpret different modalities as different languages, and treat multi-modal translation as a well-established machine translation problem. To this end, we tokenize speech and image data into discrete tokens, which provide a unified interface across modalities and significantly decrease the computational cost. In the proposed TMT, a multi-modal encoder-decoder conducts the core translation, whereas modality-specific processing is conducted only within the tokenization and detokenization stages. We evaluate the proposed TMT on all six modality translation tasks. TMT outperforms single model counterparts consistently, demonstrating that unifying tasks is beneficial not only for practicality but also for performance.
- Published
- 2024
42. A study of the radio spectrum of Mrk 421
- Author
-
Lee, Jee Won, Lee, Sang-Sung, Hodgson, Jeffrey, Juan-Carlos, Algaba, Kim, Sang-Hyun, Cheong, Whee Yeon, Jeong, Hyeon-Woo, and Kang, Sincheol
- Subjects
Astrophysics - High Energy Astrophysical Phenomena ,Astrophysics - Astrophysics of Galaxies - Abstract
We present the results of a spectral analysis using simultaneous multifrequency (22, 43, 86, and 129 GHz) very long baseline interferometry (VLBI) observations of the Korean VLBI Network (KVN) of the BL Lac object Markarian 421 (Mrk 421). The data we used were obtained from January 2013 to June 2018. The light curves showed several flux enhancements with global decreases. To separate the variable and quiescent components in the multifrequency light curves for milliarcsecond-scale emission regions, we assumed that the quiescent radiation comes from the emission regions radiating constant optically-thin synchrotron emissions (i.e., a minimum flux density with an optically thin spectral index). The quiescent spectrum determined from the multifrequency light curves was subtracted from the total CLEAN flux density, yielding a variable component in the flux that produces the time-dependent spectrum. We found that the observed spectra were flat at 22-43 GHz, and relatively steep at 43-86 GHz, whereas the quiescent-corrected spectra are sometimes quite different from the observed spectra (e.g., sometimes inverted at 22-43 GHz). The quiescent-corrected spectral indices were much more variable than the observed spectral indices. This spectral investigation implies that the quiescent-spectrum correction can significantly affect the multifrequency spectral index of variable compact radio sources such as blazars. Therefore, the synchrotron self-absorption B-field strength (B_SSA) can be significantly affected because B_SSA is proportional to the fifth power of turnover frequency., Comment: 13 pages, 9 figures, Accepted for publication in ApJ
- Published
- 2024
43. Wellposedness of the electron MHD without resistivity for large perturbations of the uniform magnetic field
- Author
-
Jeong, In-Jee and Oh, Sung-Jin
- Subjects
Mathematics - Analysis of PDEs ,Mathematical Physics - Abstract
We prove the local wellposedness of the Cauchy problems for the electron magnetohydrodynamics equations (E-MHD) without resistivity for possibly large perturbations of nonzero uniform magnetic fields. While the local wellposedness problem for (E-MHD) has been extensively studied in the presence of resistivity (which provides dissipative effects), this seems to be the first such result without resistivity. (E-MHD) is a fluid description of plasma in small scales where the motion of electrons relative to ions is significant. Mathematically, it is a quasilinear dispersive equation with nondegenerate but nonelliptic second-order principal term. Our result significantly improves upon the straightforward adaptation of the classical work of Kenig--Ponce--Rolvung--Vega on the quasilinear ultrahyperbolic Schr\"odinger equations, as the regularity and decay assumptions on the initial data are greatly weakened to the level analogous to the recent work of Marzuola--Metcalfe--Tataru in the case of elliptic principal term. A key ingredient of our proof is a simple observation about the relationship between the size of a symbol and the operator norm of its quantization as a pseudodifferential operator when restricted to high frequencies. This allows us to localize the (non-classical) pseudodifferential renormalization operator considered by Kenig--Ponce--Rolvung--Vega, and produce instead a classical pseudodifferential renormalization operator. We furthermore incorporate the function space framework of Marzuola--Metcalfe--Tataru to the present case of nonelliptic principal term., Comment: 58 pages
- Published
- 2024
44. Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?
- Author
-
Aldeneh, Zakaria, Higuchi, Takuya, Jung, Jee-weon, Seto, Skyler, Likhomanenko, Tatiana, Shum, Stephen, Abdelaziz, Ahmed Hussen, Watanabe, Shinji, and Theobald, Barry-John
- Subjects
Computer Science - Sound ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Self-supervised features are typically used in place of filter-bank features in speaker verification models. However, these models were originally designed to ingest filter-bank features as inputs, and thus, training them on top of self-supervised features assumes that both feature types require the same amount of learning for the task. In this work, we observe that pre-trained self-supervised speech features inherently include information required for the downstream speaker verification task, and therefore, we can simplify the downstream model without sacrificing performance. To this end, we revisit the design of the downstream model for speaker verification using self-supervised features. We show that we can simplify the model to use 97.51% fewer parameters while achieving a 29.93% average improvement in performance on SUPERB. Consequently, we show that the simplified downstream model is more data efficient compared to the baseline: it achieves better performance with only 60% of the training data.
- Published
- 2024
45. Hepatocyte-specific RIG-I loss attenuates metabolic dysfunction-associated steatotic liver disease in mice via changes in mitochondrial respiration and metabolite profiles
- Author
-
Seok, Jin Kyung, Yang, Gabsik, Jee, Jung In, Kang, Han Chang, Cho, Yong-Yeon, Lee, Hye Suk, and Lee, Joo Young
- Published
- 2024
46. Twisting in Hamiltonian flows and perfect fluids
- Author
-
Drivas, Theodore D., Elgindi, Tarek M., and Jeong, In-Jee
- Published
- 2024
47. Cellulose synthesis from germinated tiger nut residue and its application in the production of a functional cookie
- Author
-
Adedeji, Olajide Emmanuel, Abiodun, Olufunmilola Adunni, Adedeji, Omotayo Gloria, Kang, Hye Jee, Istiana, Nur, Min, Ju Hyun, Ayo, Jerome Adekunle, Chinma, Chiemela Enyinnaya, and Jung, Young Hoon
- Published
- 2024
48. Agonist antibody to guanylate cyclase receptor NPR1 regulates vascular tone
- Author
-
Dunn, Michael E., Kithcart, Aaron, Kim, Jee Hae, Ho, Andre Jo-Hao, Franklin, Matthew C., Romero Hernandez, Annabel, de Hoon, Jan, Botermans, Wouter, Meyer, Jonathan, Jin, Ximei, Zhang, Dongqin, Torello, Justin, Jasewicz, Daniel, Kamat, Vishal, Garnova, Elena, Liu, Nina, Rosconi, Michael, Pan, Hao, Karnik, Satyajit, Burczynski, Michael E., Zheng, Wenjun, Rafique, Ashique, Nielsen, Jonas B., De, Tanima, Verweij, Niek, Pandit, Anita, Locke, Adam, Chalasani, Naga, Melander, Olle, Schwantes-An, Tae-Hwi, Baras, Aris, Lotta, Luca A., Musser, Bret J., Mastaitis, Jason, Devalaraja-Narashimha, Kishor B., Rankin, Andrew J., Huang, Tammy, Herman, Gary, Olson, William, Murphy, Andrew J., Yancopoulos, George D., Olenchock, Benjamin A., and Morton, Lori
- Published
- 2024
49. Effects of indoor nature density and sex differences on working memory
- Author
-
Rhee, Jee Heon, Schermer, Brian, and Lee, Kyung Hoon
- Published
- 2024
50. Recent advances in activated carbon fibers for pollutant removal
- Author
-
Joo, Jong-Hyun, Kim, Seong-Hwang, Kim, Jee Hoon, Kang, Hyun-Ju, Lee, Jeong Hoon, Jeon, Hye-Ji, Jang, Yeon Hee, Lee, Jong-Hoon, Lee, Seul-Yi, Park, Soo-Jin, and Seo, Min-Kang
- Published
- 2024