Author: "Xu, Haoran" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Xu, Haoran"' showing total 1,740 results

Start Over Author "Xu, Haoran"

1,740 results on '"Xu, Haoran"'

1. Learning to Achieve Goals with Belief State Transformers

Author: Hu, Edward S., Ahn, Kwangjun, Liu, Qinghua, Xu, Haoran, Tomar, Manan, Langford, Ada, Jayaraman, Dinesh, Lamb, Alex, and Langford, John
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: We introduce the "Belief State Transformer", a next-token predictor that takes both a prefix and suffix as inputs, with a novel objective of predicting both the next token for the prefix and the previous token for the suffix. The Belief State Transformer effectively learns to solve challenging problems that conventional forward-only transformers struggle with, in a domain-independent fashion. Key to this success is learning a compact belief state that captures all relevant information necessary for accurate predictions. Empirical ablations show that each component of the model is essential in difficult scenarios where standard Transformers fall short. For the task of story writing with known prefixes and suffixes, our approach outperforms the Fill-in-the-Middle method for reaching known goals and demonstrates improved performance even when the goals are unknown. Altogether, the Belief State Transformer enables more efficient goal-conditioned decoding, better test-time inference, and high-quality text representations on small scale problems.
Published: 2024

2. Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?

Author: Han, HyoJung, Eriguchi, Akiko, Xu, Haoran, Hoang, Hieu, Carpuat, Marine, and Khayrallah, Huda
Subjects: Computer Science - Computation and Language
Abstract: Vocabulary adaptation, which integrates new vocabulary into pre-trained language models (LMs), enables expansion to new languages and mitigates token over-fragmentation. However, existing approaches are limited by their reliance on heuristic or external embeddings. We propose VocADT, a novel method for vocabulary adaptation using adapter modules that are trained to learn the optimal linear combination of existing embeddings while keeping the model's weights fixed. VocADT offers a flexible and scalable solution without requiring external resources or language constraints. Across 11 languages-with various scripts, resource availability, and fragmentation-we demonstrate that VocADT outperforms the original Mistral model and other baselines across various multilingual tasks. We find that Latin-script languages and highly fragmented languages benefit the most from vocabulary adaptation. We further fine-tune the adapted model on the generative task of machine translation and find that vocabulary adaptation is still beneficial after fine-tuning and that VocADT is the most effective method.
Published: 2024

3. RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image

Author: Chen, Xiaoxue, Zheng, Jv, Huang, Hao, Xu, Haoran, Gu, Weihao, Chen, Kangliang, xiang, He, Gao, Huan-ang, Zhao, Hao, Zhou, Guyue, and Zhang, Yaqin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The generation of high-quality 3D car assets is essential for various applications, including video games, autonomous driving, and virtual reality. Current 3D generation methods utilizing NeRF or 3D-GS as representations for 3D objects, generate a Lambertian object under fixed lighting and lack separated modelings for material and global illumination. As a result, the generated assets are unsuitable for relighting under varying lighting conditions, limiting their applicability in downstream tasks. To address this challenge, we propose a novel relightable 3D object generative framework that automates the creation of 3D car assets, enabling the swift and accurate reconstruction of a vehicle's geometry, texture, and material properties from a single input image. Our approach begins with introducing a large-scale synthetic car dataset comprising over 1,000 high-precision 3D vehicle models. We represent 3D objects using global illumination and relightable 3D Gaussian primitives integrating with BRDF parameters. Building on this representation, we introduce a feed-forward model that takes images as input and outputs both relightable 3D Gaussians and global illumination parameters. Experimental results demonstrate that our method produces photorealistic 3D car assets that can be seamlessly integrated into road scenes with different illuminations, which offers substantial practical benefits for industrial applications.
Published: 2024

4. Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets

Author: Li, Tianjian, Xu, Haoran, Tan, Weiting, Murray, Kenton, and Khashabi, Daniel
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Data availability across domains often follows a long-tail distribution: a few domains have abundant data, while most face dat . a scarcity. This imbalance poses challenges in training language models uniformly across all domains. In our study, we focus on multilingual settings, where data sizes vary significantly between high- and low-resource languages. Common strategies to address this include upsampling low-resource languages (Temperature Sampling) or upweighting their loss (Scalarization). Although often considered equivalent, this assumption has not been proven, which motivates our study. Through both theoretical and empirical analysis, we identify the conditions under which these approaches are equivalent and when they diverge. Specifically, we demonstrate that these two methods are equivalent under full gradient descent, but this equivalence breaks down with stochastic gradient descent. Empirically, we observe that Temperature Sampling converges more quickly but is prone to overfitting. We argue that this faster convergence is likely due to the lower variance in gradient estimations, as shown theoretically. Based on these insights, we propose Cooldown, a strategy that reduces sampling temperature during training, accelerating convergence without overfitting to low-resource languages. Our method is competitive with existing data re-weighting and offers computational efficiency., Comment: 19 pages
Published: 2024

5. X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale

Author: Xu, Haoran, Murray, Kenton, Koehn, Philipp, Hoang, Hieu, Eriguchi, Akiko, and Khayrallah, Huda
Subjects: Computer Science - Computation and Language
Abstract: Large language models (LLMs) have achieved remarkable success across various NLP tasks, yet their focus has predominantly been on English due to English-centric pre-training and limited multilingual data. While some multilingual LLMs claim to support for hundreds of languages, models often fail to provide high-quality response for mid- and low-resource languages, leading to imbalanced performance heavily skewed in favor of high-resource languages like English and Chinese. In this paper, we prioritize quality over scaling number of languages, with a focus on multilingual machine translation task, and introduce X-ALMA, a model designed with a commitment to ensuring top-tier performance across 50 diverse languages, regardless of their resource levels. X-ALMA surpasses state-of-the-art open-source multilingual LLMs, such as Aya-101 and Aya-23, in every single translation direction on the FLORES and WMT'23 test datasets according to COMET-22. This is achieved by plug-and-play language-specific module architecture to prevent language conflicts during training and a carefully designed training regimen with novel optimization methods to maximize the translation performance. At the final stage of training regimen, our proposed Adaptive Rejection Preference Optimization (ARPO) surpasses existing preference optimization methods in translation tasks.
Published: 2024

6. PackMamba: Efficient Processing of Variable-Length Sequences in Mamba training

Author: Xu, Haoran, Liu, Ziqian, Fu, Rong, Su, Zhongling, Wang, Zerui, Cai, Zheng, Pei, Zhilin, and Zhang, Xingcheng
Subjects: Computer Science - Machine Learning
Abstract: With the evolution of large language models, traditional Transformer models become computationally demanding for lengthy sequences due to the quadratic growth in computation with respect to the sequence length. Mamba, emerging as a groundbreaking architecture in the field of generative AI, demonstrates remarkable proficiency in handling elongated sequences with reduced computational and memory complexity. Nevertheless, the existing training framework of Mamba presents inefficiency with variable-length sequence inputs. Either single-sequence training results in low GPU utilization, or batched processing of variable-length sequences to a maximum length incurs considerable memory and computational overhead. To address this problem, we analyze the performance of bottleneck operators in Mamba under diverse tensor shapes and proposed PackMamba, a high-throughput Mamba that efficiently handles variable-length sequences. Diving deep into state-space models (SSMs), we modify the parallel operators to avoid passing information between individual sequences while maintaining high performance. Experimental results on an NVIDIA A100 GPU demonstrate throughput exceeding the baseline single-sequence processing scheme: 3.06x speedup on the 1.4B model and 2.62x on the 2.8B model.
Published: 2024

7. Online Linear Programming with Batching

Author: Xu, Haoran, Glynn, Peter W., and Ye, Yinyu
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: We study Online Linear Programming (OLP) with batching. The planning horizon is cut into $K$ batches, and the decisions on customers arriving within a batch can be delayed to the end of their associated batch. Compared with OLP without batching, the ability to delay decisions brings better operational performance, as measured by regret. Two research questions of interest are: (1) What is a lower bound of the regret as a function of $K$? (2) What algorithms can achieve the regret lower bound? These questions have been analyzed in the literature when the distribution of the reward and the resource consumption of the customers have finite support. By contrast, this paper analyzes these questions when the conditional distribution of the reward given the resource consumption is continuous, and we show the answers are different under this setting. When there is only a single type of resource and the decision maker knows the total number of customers, we propose an algorithm with a $O(\log K)$ regret upper bound and provide a $\Omega(\log K)$ regret lower bound. We also propose algorithms with $O(\log K)$ regret upper bound for the setting in which there are multiple types of resource and the setting in which customers arrive following a Poisson process. All these regret upper and lower bounds are independent of the length of the planning horizon, and all the proposed algorithms delay decisions on customers arriving in only the first and the last batch. We also take customer impatience into consideration and establish a way of selecting an appropriate batch size.
Published: 2024

8. Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning

Author: Mao, Liyuan, Xu, Haoran, Zhan, Xianyuan, Zhang, Weinan, and Zhang, Amy
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: One important property of DIstribution Correction Estimation (DICE) methods is that the solution is the optimal stationary distribution ratio between the optimized and data collection policy. In this work, we show that DICE-based methods can be viewed as a transformation from the behavior distribution to the optimal policy distribution. Based on this, we propose a novel approach, Diffusion-DICE, that directly performs this transformation using diffusion models. We find that the optimal policy's score function can be decomposed into two terms: the behavior policy's score function and the gradient of a guidance term which depends on the optimal distribution ratio. The first term can be obtained from a diffusion model trained on the dataset and we propose an in-sample learning objective to learn the second term. Due to the multi-modality contained in the optimal policy distribution, the transformation in Diffusion-DICE may guide towards those local-optimal modes. We thus generate a few candidate actions and carefully select from them to approach global-optimum. Different from all other diffusion-based offline RL methods, the guide-then-select paradigm in Diffusion-DICE only uses in-sample actions for training and brings minimal error exploitation in the value function. We use a didatic toycase example to show how previous diffusion-based methods fail to generate optimal actions due to leveraging these errors and how Diffusion-DICE successfully avoids that. We then conduct extensive experiments on benchmark datasets to show the strong performance of Diffusion-DICE. Project page at https://ryanxhr.github.io/Diffusion-DICE/., Comment: NeurIPS 2024, first two authors contribute equally
Published: 2024

9. IE-NeRF: Inpainting Enhanced Neural Radiance Fields in the Wild

Author: Wang, Shuaixian, Xu, Haoran, Li, Yaokun, Chen, Jiwei, and Tan, Guang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present a novel approach for synthesizing realistic novel views using Neural Radiance Fields (NeRF) with uncontrolled photos in the wild. While NeRF has shown impressive results in controlled settings, it struggles with transient objects commonly found in dynamic and time-varying scenes. Our framework called \textit{Inpainting Enhanced NeRF}, or \ours, enhances the conventional NeRF by drawing inspiration from the technique of image inpainting. Specifically, our approach extends the Multi-Layer Perceptrons (MLP) of NeRF, enabling it to simultaneously generate intrinsic properties (static color, density) and extrinsic transient masks. We introduce an inpainting module that leverages the transient masks to effectively exclude occlusions, resulting in improved volume rendering quality. Additionally, we propose a new training strategy with frequency regularization to address the sparsity issue of low-frequency transient components. We evaluate our approach on internet photo collections of landmarks, demonstrating its ability to generate high-quality novel views and achieve state-of-the-art performance.
Published: 2024

10. FairDiff: Fair Segmentation with Point-Image Diffusion

Author: Li, Wenyi, Xu, Haoran, Zhang, Guiyu, Gao, Huan-ang, Gao, Mingju, Wang, Mengyu, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Fairness is an important topic for medical image analysis, driven by the challenge of unbalanced training data among diverse target groups and the societal demand for equitable medical quality. In response to this issue, our research adopts a data-driven strategy-enhancing data balance by integrating synthetic images. However, in terms of generating synthetic images, previous works either lack paired labels or fail to precisely control the boundaries of synthetic images to be aligned with those labels. To address this, we formulate the problem in a joint optimization manner, in which three networks are optimized towards the goal of empirical risk minimization and fairness maximization. On the implementation side, our solution features an innovative Point-Image Diffusion architecture, which leverages 3D point clouds for improved control over mask boundaries through a point-mask-image synthesis pipeline. This method outperforms significantly existing techniques in synthesizing scanning laser ophthalmoscopy (SLO) fundus images. By combining synthetic data with real data during the training phase using a proposed Equal Scale approach, our model achieves superior fairness segmentation performance compared to the state-of-the-art fairness learning models. Code is available at https://github.com/wenyi-li/FairDiff., Comment: Accepted to MICCAI 2024
Published: 2024

11. NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

Author: Liu, Xiaohong, Min, Xiongkuo, Zhai, Guangtao, Li, Chunyi, Kou, Tengchuan, Sun, Wei, Wu, Haoning, Gao, Yixuan, Cao, Yuqin, Zhang, Zicheng, Wu, Xiele, Timofte, Radu, Peng, Fei, Fu, Huiyuan, Ming, Anlong, Wang, Chuanming, Ma, Huadong, He, Shuai, Dou, Zifei, Chen, Shu, Zhang, Huacong, Xie, Haiyi, Wang, Chengwei, Chen, Baoying, Zeng, Jishen, Yang, Jianquan, Wang, Weigang, Fang, Xi, Lv, Xiaoxin, Yan, Jun, Zhi, Tianwu, Zhang, Yabin, Li, Yaohui, Li, Yang, Xu, Jingwen, Liu, Jianzhao, Liao, Yiting, Li, Junlin, Yu, Zihao, Lu, Yiting, Li, Xin, Motamednia, Hossein, Hosseini-Benvidi, S. Farhad, Guan, Fengbin, Mahmoudi-Aznaveh, Ahmad, Mansouri, Azadeh, Gankhuyag, Ganzorig, Yoon, Kihwan, Xu, Yifang, Fan, Haotian, Kong, Fangyuan, Zhao, Shiling, Dong, Weifeng, Yin, Haibing, Zhu, Li, Wang, Zhiling, Huang, Bingchen, Saha, Avinab, Mishra, Sandeep, Gupta, Shashank, Sureddi, Rajesh, Saha, Oindrila, Celona, Luigi, Bianco, Simone, Napoletano, Paolo, Schettini, Raimondo, Yang, Junfeng, Fu, Jing, Zhang, Wei, Cao, Wenzhi, Liu, Limei, Peng, Han, Yuan, Weijun, Li, Zhan, Cheng, Yihang, Deng, Yifan, Li, Haohui, Qu, Bowen, Li, Yao, Luo, Shuqing, Wang, Shunzhou, Gao, Wei, Lu, Zihao, Conde, Marcos V., Wang, Xinrui, Chen, Zhibo, Liao, Ruling, Ye, Yan, Wang, Qiulin, Li, Bing, Zhou, Zhaokun, Geng, Miao, Chen, Rui, Tao, Xin, Liang, Xiaoyu, Sun, Shangkun, Ma, Xingyuan, Li, Jiaze, Yang, Mengduo, Xu, Haoran, Zhou, Jie, Zhu, Shiding, Yu, Bohan, Chen, Pengfei, Xu, Xinrui, Shen, Jiabin, Duan, Zhichao, Asadi, Erfan, Liu, Jiahe, Yan, Qi, Qu, Youran, Zeng, Xiaohui, Wang, Lele, and Liao, Renjie
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Content (AIGC). The challenge is divided into the image track and the video track. The image track uses the AIGIQA-20K, which contains 20,000 AI-Generated Images (AIGIs) generated by 15 popular generative models. The image track has a total of 318 registered participants. A total of 1,646 submissions are received in the development phase, and 221 submissions are received in the test phase. Finally, 16 participating teams submitted their models and fact sheets. The video track uses the T2VQA-DB, which contains 10,000 AI-Generated Videos (AIGVs) generated by 9 popular Text-to-Video (T2V) models. A total of 196 participants have registered in the video track. A total of 991 submissions are received in the development phase, and 185 submissions are received in the test phase. Finally, 12 participating teams submitted their models and fact sheets. Some methods have achieved better results than baseline methods, and the winning methods in both tracks have demonstrated superior prediction performance on AIGC.
Published: 2024

12. NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

Author: Li, Xin, Yuan, Kun, Pei, Yajing, Lu, Yiting, Sun, Ming, Zhou, Chao, Chen, Zhibo, Timofte, Radu, Sun, Wei, Wu, Haoning, Zhang, Zicheng, Jia, Jun, Zhang, Zhichao, Cao, Linhan, Chen, Qiubo, Min, Xiongkuo, Lin, Weisi, Zhai, Guangtao, Sun, Jianhui, Wang, Tianyi, Li, Lei, Kong, Han, Wang, Wenxuan, Li, Bing, Luo, Cheng, Wang, Haiqiang, Chen, Xiangguang, Meng, Wenhui, Pan, Xiang, Shi, Huiying, Zhu, Han, Xu, Xiaozhong, Sun, Lei, Chen, Zhenzhong, Liu, Shan, Kong, Fangyuan, Fan, Haotian, Xu, Yifang, Xu, Haoran, Yang, Mengduo, Zhou, Jie, Li, Jiaze, Wen, Shijie, Xu, Mai, Li, Da, Yao, Shunyu, Du, Jiazhi, Zuo, Wangmeng, Li, Zhibo, He, Shuai, Ming, Anlong, Fu, Huiyuan, Ma, Huadong, Wu, Yong, Xue, Fie, Zhao, Guozhi, Du, Lina, Guo, Jie, Zhang, Yu, Zheng, Huimin, Chen, Junhao, Liu, Yue, Zhou, Dulan, Xu, Kele, Xu, Qisheng, Sun, Tao, Ding, Zhixiang, and Hu, Yuhang
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence
Abstract: This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i.e., Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The purpose is to build new benchmarks and advance the development of S-UGC VQA. The competition had 200 participants and 13 teams submitted valid solutions for the final testing phase. The proposed solutions achieved state-of-the-art performances for S-UGC VQA. The project can be found at https://github.com/lixinustc/KVQChallenge-CVPR-NTIRE2024., Comment: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge
Published: 2024

13. Self-enhanced mobility enables vortex pattern formation in living matter

Author: Xu, Haoran and Wu, Yilin
Subjects: Physics - Biological Physics, Condensed Matter - Soft Condensed Matter, Physics - Fluid Dynamics
Abstract: Emergence of regular spatial patterns is a hallmark in living matter ranging from subcellular organelles to developing embryos and to ecosystems. Mechanisms for the formation of ordered spatial patterns in biology often require chemical signaling that coordinates cellular behavior and differentiation. Here we discovered a novel route to large-scale regular pattern formation in living matter mediated by purely physical interactions. We found that dense bacterial living matter spontaneously developed an ordered lattice of mesoscale, fast-spinning vortices each consisting of ~10^4-10^5 motile cells; these mesoscale vortices were arranged in space over centimeter scale with apparent hexagonal order, while individual cells in the vortices moved in coordinated directions with strong polar and vortical order. Single-cell tracking and numerical simulations suggest that the phenomenon is enabled by self-enhanced mobility of individual cells in the system. Our findings demonstrate a simple physical mechanism for self-organized pattern formation in living systems and more generally, in other active matter systems near the boundary of fluidlike and solidlike behaviors.
Published: 2024
Full Text: View/download PDF

14. TOP-Net Prediction Model Using Bidirectional Long Short-term Memory and Medical-Grade Wearable Multisensor System for Tachycardia Onset: Algorithm Development Study

Author: Liu, Xiaoli, Liu, Tongbo, Zhang, Zhengbo, Kuo, Po-Chih, Xu, Haoran, Yang, Zhicheng, Lan, Ke, Li, Peiyao, Ouyang, Zhenchao, Ng, Yeuk Lam, Yan, Wei, and Li, Deyu
Subjects: Computer applications to medicine. Medical informatics, R858-859.7
Abstract: BackgroundWithout timely diagnosis and treatment, tachycardia, also called tachyarrhythmia, can cause serious complications such as heart failure, cardiac arrest, and even death. The predictive performance of conventional clinical diagnostic procedures needs improvement in order to assist physicians in detecting risk early on. ObjectiveWe aimed to develop a deep tachycardia onset prediction (TOP-Net) model based on deep learning (ie, bidirectional long short-term memory) for early tachycardia diagnosis with easily accessible data. MethodsTOP-Net leverages 2 easily accessible data sources: vital signs, including heart rate, respiratory rate, and blood oxygen saturation (SpO2) acquired continuously by wearable embedded systems, and electronic health records, containing age, gender, admission type, first care unit, and cardiovascular disease history. The model was trained with a large data set from an intensive care unit and then transferred to a real-world scenario in the general ward. In this study, 3 experiments incorporated merging patients’ personal information, temporal memory, and different feature combinations. Six metrics (area under the receiver operating characteristic curve [AUROC], sensitivity, specificity, accuracy, F1 score, and precision) were used to evaluate predictive performance. ResultsTOP-Net outperformed the baseline models on the large critical care data set (AUROC 0.796, 95% CI 0.768-0.824; sensitivity 0.753, 95% CI 0.663-0.793; specificity 0.720, 95% CI 0.645-0.758; accuracy 0.721; F1 score 0.718; precision 0.686) when predicting tachycardia onset 6 hours in advance. When predicting tachycardia onset 2 hours in advance with data acquired from our hospital using the transferred TOP-Net, the 6 metrics were 0.965, 0.955, 0.881, 0.937, 0.793, and 0.680, respectively. The best performance was achieved using comprehensive vital signs (heart rate, respiratory rate, and SpO2) statistical information. ConclusionsTOP-Net is an early tachycardia prediction model that uses 8 types of data from wearable sensors and electronic health records. When validated in clinical scenarios, the model achieved a prediction performance that outperformed baseline models 0 to 6 hours before tachycardia onset in the intensive care unit and 2 hours before tachycardia onset in the general ward. Because of the model’s implementation and use of easily accessible data from wearable sensors, the model can assist physicians with early discovery of patients at risk in general wards and houses.
Published: 2021
Full Text: View/download PDF

15. Long-Term Herbicide Mixture Exposure Increases Hypertension Risk and Aging Biomarkers Play Mediation Effects: A Nested Case-Control Study

Author: Zhou, Yilin, Shi, Jiayu, Wei, Dandan, Zhao, Mengzhen, Ma, Cuicui, Geng, Jintian, Guo, Yao, Wu, Xueyan, Xu, Haoran, Chen, Zhiwei, Huo, Wenqian, Wang, Chongjian, and Mao, Zhenxing
Published: 2024
Full Text: View/download PDF

16. Theoretical study on the Cs/Cs-O adsorbed graphene/semiconductor heterojunction anode for thermionic converters

Author: Sun, Weiting, Xu, Haoran, Qiu, Hao, and Xiao, Gang
Published: 2024
Full Text: View/download PDF

17. Streaming Sequence Transduction through Dynamic Compression

Author: Tan, Weiting, Chen, Yunmo, Chen, Tongfei, Qin, Guanghui, Xu, Haoran, Zhang, Heidi C., Van Durme, Benjamin, and Koehn, Philipp
Subjects: Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We introduce STAR (Stream Transduction with Anchor Representations), a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams. STAR dynamically segments input streams to create compressed anchor representations, achieving nearly lossless compression (12x) in Automatic Speech Recognition (ASR) and outperforming existing methods. Moreover, STAR demonstrates superior segmentation and latency-quality trade-offs in simultaneous speech-to-text tasks, optimizing latency, memory footprint, and quality.
Published: 2024

18. ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update

Author: Mao, Liyuan, Xu, Haoran, Zhang, Weinan, and Zhan, Xianyuan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: In this study, we investigate the DIstribution Correction Estimation (DICE) methods, an important line of work in offline reinforcement learning (RL) and imitation learning (IL). DICE-based methods impose state-action-level behavior constraint, which is an ideal choice for offline learning. However, they typically perform much worse than current state-of-the-art (SOTA) methods that solely use action-level behavior constraint. After revisiting DICE-based methods, we find there exist two gradient terms when learning the value function using true-gradient update: forward gradient (taken on the current state) and backward gradient (taken on the next state). Using forward gradient bears a large similarity to many offline RL methods, and thus can be regarded as applying action-level constraint. However, directly adding the backward gradient may degenerate or cancel out its effect if these two gradients have conflicting directions. To resolve this issue, we propose a simple yet effective modification that projects the backward gradient onto the normal plane of the forward gradient, resulting in an orthogonal-gradient update, a new learning rule for DICE-based methods. We conduct thorough theoretical analyses and find that the projected backward gradient brings state-level behavior regularization, which reveals the mystery of DICE-based methods: the value learning objective does try to impose state-action-level constraint, but needs to be used in a corrected way. Through toy examples and extensive experiments on complex offline RL and IL tasks, we demonstrate that DICE-based methods using orthogonal-gradient updates (O-DICE) achieve SOTA performance and great robustness., Comment: Spotlight @ ICLR 2024, first two authors contribute equally
Published: 2024

19. The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts

Author: Shen, Lingfeng, Tan, Weiting, Chen, Sihao, Chen, Yunmo, Zhang, Jingyu, Xu, Haoran, Zheng, Boyuan, Koehn, Philipp, and Khashabi, Daniel
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: As the influence of large language models (LLMs) spans across global communities, their safety challenges in multilingual settings become paramount for alignment research. This paper examines the variations in safety challenges faced by LLMs across different languages and discusses approaches to alleviating such concerns. By comparing how state-of-the-art LLMs respond to the same set of malicious prompts written in higher- vs. lower-resource languages, we observe that (1) LLMs tend to generate unsafe responses much more often when a malicious prompt is written in a lower-resource language, and (2) LLMs tend to generate more irrelevant responses to malicious prompts in lower-resource languages. To understand where the discrepancy can be attributed, we study the effect of instruction tuning with reinforcement learning from human feedback (RLHF) or supervised finetuning (SFT) on the HH-RLHF dataset. Surprisingly, while training with high-resource languages improves model alignment, training in lower-resource languages yields minimal improvement. This suggests that the bottleneck of cross-lingual alignment is rooted in the pretraining stage. Our findings highlight the challenges in cross-lingual LLM safety, and we hope they inform future research in this direction.
Published: 2024

20. Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Author: Xu, Haoran, Sharaf, Amr, Chen, Yunmo, Tan, Weiting, Shen, Lingfeng, Van Durme, Benjamin, Murray, Kenton, and Kim, Young Jin
Subjects: Computer Science - Computation and Language
Abstract: Moderate-sized large language models (LLMs) -- those with 7B or 13B parameters -- exhibit promising machine translation (MT) performance. However, even the top-performing 13B LLM-based translation models, like ALMA, does not match the performance of state-of-the-art conventional encoder-decoder translation models or larger-scale LLMs such as GPT-4. In this study, we bridge this performance gap. We first assess the shortcomings of supervised fine-tuning for LLMs in the MT task, emphasizing the quality issues present in the reference data, despite being human-generated. Then, in contrast to SFT which mimics reference translations, we introduce Contrastive Preference Optimization (CPO), a novel approach that trains models to avoid generating adequate but not perfect translations. Applying CPO to ALMA models with only 22K parallel sentences and 12M parameters yields significant improvements. The resulting model, called ALMA-R, can match or exceed the performance of the WMT competition winners and GPT-4 on WMT'21, WMT'22 and WMT'23 test datasets., Comment: Accepted at ICML 2024
Published: 2024

21. Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles

Author: Tan, Weiting, Xu, Haoran, Shen, Lingfeng, Li, Shuyue Stella, Murray, Kenton, Koehn, Philipp, Van Durme, Benjamin, and Chen, Yunmo
Subjects: Computer Science - Computation and Language
Abstract: Large language models trained primarily in a monolingual setting have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning. However, even though zero-shot translations are relatively good, there remains a discernible gap comparing their performance with the few-shot setting. In this paper, we investigate the factors contributing to this gap and find that this gap can largely be closed (for about 70%) by matching the writing styles of the target corpus. Additionally, we explore potential approaches to enhance zero-shot baselines without the need for parallel demonstration examples, providing valuable insights into how these methods contribute to improving translation metrics.
Published: 2023

22. Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models

Author: Li, Tianjian, Xu, Haoran, Koehn, Philipp, Khashabi, Daniel, and Murray, Kenton
Subjects: Computer Science - Computation and Language
Abstract: Text generation models are notoriously vulnerable to errors in the training data. With the wide-spread availability of massive amounts of web-crawled data becoming more commonplace, how can we enhance the robustness of models trained on a massive amount of noisy web-crawled text? In our work, we propose Error Norm Truncation (ENT), a robust enhancement method to the standard training objective that truncates noisy data. Compared to methods that only uses the negative log-likelihood loss to estimate data quality, our method provides a more accurate estimation by considering the distribution of non-target tokens, which is often overlooked by previous work. Through comprehensive experiments across language modeling, machine translation, and text summarization, we show that equipping text generation models with ENT improves generation quality over standard training and previous soft and hard truncation methods. Furthermore, we show that our method improves the robustness of models against two of the most detrimental types of noise in machine translation, resulting in an increase of more than 2 BLEU points over the MLE baseline when up to 50% of noise is added to the data., Comment: ICLR 2024
Published: 2023

23. A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Author: Xu, Haoran, Kim, Young Jin, Sharaf, Amr, and Awadalla, Hany Hassan
Subjects: Computer Science - Computation and Language
Abstract: Generative Large Language Models (LLMs) have achieved remarkable advancements in various NLP tasks. However, these advances have not been reflected in the translation task, especially those with moderate model sizes (i.e., 7B or 13B parameters), which still lag behind conventional supervised encoder-decoder translation models. Previous studies have attempted to improve the translation capabilities of these moderate LLMs, but their gains have been limited. In this study, we propose a novel fine-tuning approach for LLMs that is specifically designed for the translation task, eliminating the need for the abundant parallel data that traditional translation models usually depend on. Our approach consists of two fine-tuning stages: initial fine-tuning on monolingual data followed by subsequent fine-tuning on a small set of high-quality parallel data. We introduce the LLM developed through this strategy as Advanced Language Model-based trAnslator (ALMA). Based on LLaMA-2 as our underlying model, our results show that the model can achieve an average improvement of more than 12 BLEU and 12 COMET over its zero-shot performance across 10 translation directions from the WMT'21 (2 directions) and WMT'22 (8 directions) test datasets. The performance is significantly better than all prior work and even superior to the NLLB-54B model and GPT-3.5-text-davinci-003, with only 7B or 13B parameters. This method establishes the foundation for a novel training paradigm in machine translation., Comment: Accepted at ICLR 2024
Published: 2023

24. Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization

Author: Wang, Xiangsen, Xu, Haoran, Zheng, Yinan, and Zhan, Xianyuan
Subjects: Computer Science - Machine Learning, Computer Science - Multiagent Systems
Abstract: Offline reinforcement learning (RL) has received considerable attention in recent years due to its attractive capability of learning policies from offline datasets without environmental interactions. Despite some success in the single-agent setting, offline multi-agent RL (MARL) remains to be a challenge. The large joint state-action space and the coupled multi-agent behaviors pose extra complexities for offline policy optimization. Most existing offline MARL studies simply apply offline data-related regularizations on individual agents, without fully considering the multi-agent system at the global level. In this work, we present OMIGA, a new offline m ulti-agent RL algorithm with implicit global-to-local v alue regularization. OMIGA provides a principled framework to convert global-level value regularization into equivalent implicit local value regularizations and simultaneously enables in-sample learning, thus elegantly bridging multi-agent value decomposition and policy learning with offline regularizations. Based on comprehensive experiments on the offline multi-agent MuJoCo and StarCraft II micro-management tasks, we show that OMIGA achieves superior performance over the state-of-the-art offline MARL methods in almost all tasks., Comment: Accepted by the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
Published: 2023

25. Offline Reinforcement Learning with Imbalanced Datasets

Author: Jiang, Li, Cheng, Sijie, Qiu, Jielin, Xu, Haoran, Chan, Wai Kin, and Ding, Zhao
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The prevalent use of benchmarks in current offline reinforcement learning (RL) research has led to a neglect of the imbalance of real-world dataset distributions in the development of models. The real-world offline RL dataset is often imbalanced over the state space due to the challenge of exploration or safety considerations. In this paper, we specify properties of imbalanced datasets in offline RL, where the state coverage follows a power law distribution characterized by skewed policies. Theoretically and empirically, we show that typically offline RL methods based on distributional constraints, such as conservative Q-learning (CQL), are ineffective in extracting policies under the imbalanced dataset. Inspired by natural intelligence, we propose a novel offline RL method that utilizes the augmentation of CQL with a retrieval process to recall past related experiences, effectively alleviating the challenges posed by imbalanced datasets. We evaluate our method on several tasks in the context of imbalanced datasets with varying levels of imbalance, utilizing the variant of D4RL. Empirical results demonstrate the superiority of our method over other baselines.
Published: 2023

26. Interobserver variability of clinical target volume delineation in patients undergoing breast-conserving surgery without surgical clips: a pilot study on preoperative magnetic resonance simulation

Author: Jiao, Shuning, Wang, Yiqing, Ma, Jiabin, Shen, Jing, Zhang, Xi-Qian, Zhou, Bing, Sun, Xiansong, Xu, Haoran, Liu, Xia, Hu, Ke, Zhang, Fuquan, Hou, Xiaorong, and Qiu, Jie
Published: 2024
Full Text: View/download PDF

27. Feasibility of dose calculation for treatment plans using electron density maps from a novel dual-layer detector spectral CT simulator

Author: Zhu, Qizhen, Wei, Shuoyang, Wang, Zhiqun, Xu, Haoran, Zhou, Bing, Qu, Huiying, Nie, Mingming, Guo, Ning, Wang, Wenshuai, Yang, Bo, and Qiu, Jie
Published: 2024
Full Text: View/download PDF

28. Porosity Prediction Based on Ensemble Learning for Feature Selection and an Optimized GRU Improved by the PSO Algorithm

Author: Liu, Miaomiao, Xu, Haoran, Zhao, Fengda, Zhang, Qiang, Jia, Ying, and Xi, Jiahao
Published: 2024
Full Text: View/download PDF

29. Regional climate model intercomparison over the Tibetan Plateau in the GEWEX/LS4P Phase I

Author: Tang, Jianping, Xue, Yongkang, Long, Mengyuan, Ma, Mengnan, Liang, Xin-Zhong, Sugimoto, Shiori, Yang, Kun, Ji, Zhenming, Hong, Jinkyu, Kim, Jeongwon, Xu, Haoran, Zhou, Xu, Sato, Tomonori, Takahashi, Hiroshi G., Wang, Shuyu, Wang, Guiling, Chou, Sin Chan, Guo, Weidong, Yu, Miao, and Pan, Xiaoduo
Published: 2024
Full Text: View/download PDF

30. Regional climate modeling to understand Tibetan heating remote impacts on East China precipitation

Author: Xu, Haoran, Liang, Xin-Zhong, and Xue, Yongkang
Published: 2024
Full Text: View/download PDF

31. Li-storage of Li2ZnTi3O8@C-N anodes with high-performance in a wide temperature range

Author: Li, Jiayi, Zhang, Xue, Xu, Haoran, Ma, Wenzhao, and Wang, Lijuan
Published: 2024
Full Text: View/download PDF

32. PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning

Author: Li, Jianxiong, Hu, Xiao, Xu, Haoran, Liu, Jingjing, Zhan, Xianyuan, and Zhang, Ya-Qin
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Robotics
Abstract: Offline-to-online reinforcement learning (RL), by combining the benefits of offline pretraining and online finetuning, promises enhanced sample efficiency and policy performance. However, existing methods, effective as they are, suffer from suboptimal performance, limited adaptability, and unsatisfactory computational efficiency. We propose a novel framework, PROTO, which overcomes the aforementioned limitations by augmenting the standard RL objective with an iteratively evolving regularization term. Performing a trust-region-style update, PROTO yields stable initial finetuning and optimal final performance by gradually evolving the regularization term to relax the constraint strength. By adjusting only a few lines of code, PROTO can bridge any offline policy pretraining and standard off-policy RL finetuning to form a powerful offline-to-online RL pathway, birthing great adaptability to diverse methods. Simple yet elegant, PROTO imposes minimal additional computation and enables highly efficient online finetuning. Extensive experiments demonstrate that PROTO achieves superior performance over SOTA baselines, offering an adaptable and efficient offline-to-online RL framework.
Published: 2023

33. Condensing Multilingual Knowledge with Lightweight Language-Specific Modules

Author: Xu, Haoran, Tan, Weiting, Li, Shuyue Stella, Chen, Yunmo, Van Durme, Benjamin, Koehn, Philipp, and Murray, Kenton
Subjects: Computer Science - Computation and Language
Abstract: Incorporating language-specific (LS) modules is a proven method to boost performance in multilingual machine translation. This approach bears similarity to Mixture-of-Experts (MoE) because it does not inflate FLOPs. However, the scalability of this approach to hundreds of languages (experts) tends to be unmanageable due to the prohibitive number of parameters introduced by full-rank matrices in fully-connected layers. In this work, we introduce the Language-Specific Matrix Synthesis (LMS) method. This approach constructs LS modules by generating low-rank matrices from two significantly smaller matrices to approximate the full-rank matrix. Furthermore, we condense multilingual knowledge from multiple LS modules into a single shared module with the Fuse Distillation (FD) technique to improve the efficiency of inference and model serialization. We show that our LMS method significantly outperforms previous LS methods and MoE methods with the same amount of extra parameters, e.g., 1.73 BLEU points over the Switch Transformer on many-to-many multilingual machine translation. Importantly, LMS is able to have comparable translation performance with much fewer parameters., Comment: Accepted at the main conference of EMNLP 2023
Published: 2023

34. Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity

Author: Xu, Haoran, Elbayad, Maha, Murray, Kenton, Maillard, Jean, and Goswami, Vedanuj
Subjects: Computer Science - Computation and Language
Abstract: Mixture-of-experts (MoE) models that employ sparse activation have demonstrated effectiveness in significantly increasing the number of parameters while maintaining low computational requirements per token. However, recent studies have established that MoE models are inherently parameter-inefficient as the improvement in performance diminishes with an increasing number of experts. We hypothesize this parameter inefficiency is a result of all experts having equal capacity, which may not adequately meet the varying complexity requirements of different tokens or tasks. In light of this, we propose Stratified Mixture of Experts (SMoE) models, which feature a stratified structure and can assign dynamic capacity to different tokens. We demonstrate the effectiveness of SMoE on three multilingual machine translation benchmarks, containing 4, 15, and 94 language pairs, respectively. We show that SMoE outperforms multiple state-of-the-art MoE models with the same or fewer parameters., Comment: Accepted at Findings of EMNLP 2023
Published: 2023

35. Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization

Author: Xu, Haoran, Jiang, Li, Li, Jianxiong, Yang, Zhuoran, Wang, Zhaoran, Chan, Victor Wai Kin, and Zhan, Xianyuan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Most offline reinforcement learning (RL) methods suffer from the trade-off between improving the policy to surpass the behavior policy and constraining the policy to limit the deviation from the behavior policy as computing $Q$-values using out-of-distribution (OOD) actions will suffer from errors due to distributional shift. The recently proposed \textit{In-sample Learning} paradigm (i.e., IQL), which improves the policy by quantile regression using only data samples, shows great promise because it learns an optimal policy without querying the value function of any unseen actions. However, it remains unclear how this type of method handles the distributional shift in learning the value function. In this work, we make a key finding that the in-sample learning paradigm arises under the \textit{Implicit Value Regularization} (IVR) framework. This gives a deeper understanding of why the in-sample learning paradigm works, i.e., it applies implicit value regularization to the policy. Based on the IVR framework, we further propose two practical algorithms, Sparse $Q$-learning (SQL) and Exponential $Q$-learning (EQL), which adopt the same value regularization used in existing works, but in a complete in-sample manner. Compared with IQL, we find that our algorithms introduce sparsity in learning the value function, making them more robust in noisy data regimes. We also verify the effectiveness of SQL and EQL on D4RL benchmark datasets and show the benefits of in-sample learning by comparing them with CQL in small data regimes., Comment: ICLR 2023 notable top 5%
Published: 2023

36. Rock fracture identification algorithm based on the confidence score and non-maximum suppression

Author: Xu, Haoran, Tang, Shibin, Wang, Jia, Dong, Bingyan, Wang, Xiaojun, Zhao, Kui, Zhu, Yichun, and Geng, Jiabo
Published: 2024
Full Text: View/download PDF

37. Language-Aware Multilingual Machine Translation with Self-Supervised Learning

Author: Xu, Haoran, Maillard, Jean, and Goswami, Vedanuj
Subjects: Computer Science - Computation and Language
Abstract: Multilingual machine translation (MMT) benefits from cross-lingual transfer but is a challenging multitask optimization problem. This is partly because there is no clear framework to systematically learn language-specific parameters. Self-supervised learning (SSL) approaches that leverage large quantities of monolingual data (where parallel data is unavailable) have shown promise by improving translation performance as complementary tasks to the MMT task. However, jointly optimizing SSL and MMT tasks is even more challenging. In this work, we first investigate how to utilize intra-distillation to learn more *language-specific* parameters and then show the importance of these language-specific parameters. Next, we propose a novel but simple SSL task, concurrent denoising, that co-trains with the MMT task by concurrently denoising monolingual data on both the encoder and decoder. Finally, we apply intra-distillation to this co-training approach. Combining these two approaches significantly improves MMT performance, outperforming three state-of-the-art SSL methods by a large margin, e.g., 11.3\% and 3.7\% improvement on an 8-language and a 15-language benchmark compared with MASS, respectively, Comment: Findings of EACL 2023
Published: 2023

38. Mind the Gap: Offline Policy Optimization for Imperfect Rewards

Author: Li, Jianxiong, Hu, Xiao, Xu, Haoran, Liu, Jingjing, Zhan, Xianyuan, Jia, Qing-Shan, and Zhang, Ya-Qin
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Reward function is essential in reinforcement learning (RL), serving as the guiding signal to incentivize agents to solve given tasks, however, is also notoriously difficult to design. In many cases, only imperfect rewards are available, which inflicts substantial performance loss for RL agents. In this study, we propose a unified offline policy optimization approach, \textit{RGM (Reward Gap Minimization)}, which can smartly handle diverse types of imperfect rewards. RGM is formulated as a bi-level optimization problem: the upper layer optimizes a reward correction term that performs visitation distribution matching w.r.t. some expert data; the lower layer solves a pessimistic RL problem with the corrected rewards. By exploiting the duality of the lower layer, we derive a tractable algorithm that enables sampled-based learning without any online interactions. Comprehensive experiments demonstrate that RGM achieves superior performance to existing methods under diverse settings of imperfect rewards. Further, RGM can effectively correct wrong or inconsistent rewards against expert preference and retrieve useful information from biased rewards., Comment: Accept by ICLR2023. The first two authors contributed equally
Published: 2023

39. SaFormer: A Conditional Sequence Modeling Approach to Offline Safe Reinforcement Learning

Author: Zhang, Qin, Zhang, Linrui, Xu, Haoran, Shen, Li, Wang, Bowen, Chang, Yongzhe, Wang, Xueqian, Yuan, Bo, and Tao, Dacheng
Subjects: Computer Science - Machine Learning
Abstract: Offline safe RL is of great practical relevance for deploying agents in real-world applications. However, acquiring constraint-satisfying policies from the fixed dataset is non-trivial for conventional approaches. Even worse, the learned constraints are stationary and may become invalid when the online safety requirement changes. In this paper, we present a novel offline safe RL approach referred to as SaFormer, which tackles the above issues via conditional sequence modeling. In contrast to existing sequence models, we propose cost-related tokens to restrict the action space and a posterior safety verification to enforce the constraint explicitly. Specifically, SaFormer performs a two-stage auto-regression conditioned by the maximum remaining cost to generate feasible candidates. It then filters out unsafe attempts and executes the optimal action with the highest expected return. Extensive experiments demonstrate the efficacy of SaFormer featuring (1) competitive returns with tightened constraint satisfaction; (2) adaptability to the in-range cost values of the offline data without retraining; (3) generalizability for constraints beyond the current dataset.
Published: 2023

40. Reentrancy Vulnerability Detection Based on Improved Attention Mechanism

Author: Xu, Haoran, Qiu, Meikang, Zhao, Hui, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Cao, Cungeng, editor, Chen, Huajun, editor, Zhao, Liang, editor, Arshad, Junaid, editor, Asyhari, Taufiq, editor, and Wang, Yonghao, editor
Published: 2024
Full Text: View/download PDF

41. Reinforcement Learning-Based Consensus Reaching in Large-Scale Social Networks

Author: Guo, Shijun, Xu, Haoran, Xie, Guangqiang, Wen, Di, Huang, Yangru, Peng, Peixi, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
Published: 2024
Full Text: View/download PDF

42. A Policy-Guided Imitation Approach for Offline Reinforcement Learning

Author: Xu, Haoran, Jiang, Li, Li, Jianxiong, and Zhan, Xianyuan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Offline reinforcement learning (RL) methods can generally be categorized into two types: RL-based and Imitation-based. RL-based methods could in principle enjoy out-of-distribution generalization but suffer from erroneous off-policy evaluation. Imitation-based methods avoid off-policy evaluation but are too conservative to surpass the dataset. In this study, we propose an alternative approach, inheriting the training stability of imitation-style methods while still allowing logical out-of-distribution generalization. We decompose the conventional reward-maximizing policy in offline RL into a guide-policy and an execute-policy. During training, the guide-poicy and execute-policy are learned using only data from the dataset, in a supervised and decoupled manner. During evaluation, the guide-policy guides the execute-policy by telling where it should go so that the reward can be maximized, serving as the \textit{Prophet}. By doing so, our algorithm allows \textit{state-compositionality} from the dataset, rather than \textit{action-compositionality} conducted in prior imitation-style methods. We dumb this new approach Policy-guided Offline RL (\texttt{POR}). \texttt{POR} demonstrates the state-of-the-art performance on D4RL, a standard benchmark for offline RL. We also highlight the benefits of \texttt{POR} in terms of improving with supplementary suboptimal data and easily adapting to new tasks by only changing the guide-poicy., Comment: Oral @ NeurIPS 2022; An extended version with more experiments & correct some experiment details in previous version
Published: 2022

43. Comparison study on first bunch compressor schemes by conventional and double C-chicane for MaRIE XFEL

Author: Xu, Haoran, Duffy, Leanne D., Marksteiner, Quinn R., and Anisimov, Petr M.
Subjects: Physics - Accelerator Physics
Abstract: We report our comparison study on the first stage electron bunch compression schemes at 750 MeV using a conventional and a double C-chicane for the X-ray free electron laser (XFEL) under development for the Matter-Radiation Interactions in Extremes (MaRIE) project at Los Alamos National Laboratory. Compared to the performance of the conventional C-chicane bunch compressor, the double C-chicane scheme exhibits the capability of utilizing the transverse momentum shift induced by the coherent synchrotron radiation (CSR) in the second C-chicane to compensate that generated in the first C-chicane, resulting in a compressed electron bunch with minimized transverse momentum along the beam. It is also found that the double C-chicane scheme can be designed to significantly better preserve the beam emittance in the course of the bunch compression. This is particularly beneficial for the MaRIE XFEL whose lasing performance critically depends on the preservation of the ultralow beam emittance., Comment: Paper TUPA72, North American Particle Accelerator Conference (NAPAC) 2022
Published: 2022

44. Investigation of a Diels-Alder approach to novel narciclasine derivatives

Author: Xu, Haoran, Caggiano, Lorenzo, and Lewis, Simon
Subjects: narciclasine, Diels-Alder
Abstract: Narciclasine and structurally related pancratistatin are natural products isolated from Amaryllidaceae flowering plants, such as the daffodils. They display potent and selective anticancer activities, however, their isolation/extraction yields are low. Thus, efficient synthetic routes to these compounds and biologically active derivatives are of great interest. The Caggiano research team has been developing novel synthetic routes to obtain simplified narciclasine analogues. Attempts to synthesise unsaturated narciclasine analogues, ready for further functionalisation, using the Diels-Alder reaction together with the Caggiano modified-Curtius rearrangement reaction as key steps, are described in this thesis. In addition, we explore the synthesis of novel unnatural narciclasine N-analogues using commercially available D-lyxose/L-lyxose with aniline to generate the highly functionalised poly-hydroxylated C-ring. In Chapter 3, a four-step route to the ABC-ring key intermediates has been developed and is described. Following this synthetic methodology, Benzaldehydes were treated with Meldrum's acid to afford dienophiles via the Knoevenagel condensation reaction. Subsequent Diels-Alder reaction furnished the corresponding unsaturated C-ring cycloadducts. A stepwise decarboxylation of cycloadducts afforded the carboxylic acids required for cyclization. Finally, acids were readily converted to the ABC-ring key intermediates using the Curtius rearrangement and Friedel-Crafts acylation to afford three analogues in overall yields of 27-73%. In Chapter 4, the synthetic route started with trans-3,4,5-trimethoxycinnamic acid, which relied on chiral auxiliaries to perform the Diels-Alder reaction with control over the stereochemistry. Initially, an achiral auxiliary was used to explore this approach and a racemic ABC-ring key intermediate was successfully obtained, demonstrating the feasibility of this method. Subsequently, chiral auxiliaries were employed and following enantioselective Diels-Alder reactions, hydrolysis revealed the corresponding enantiomerically pure carboxylic acid for cyclization. Using this approach, two various enantiomerically pure carboxylic acids have been synthesized and readily to be cyclized. Chapter 5 describes our efforts towards the efficient synthesis of an N-analogue of narciclasine via two complementary approaches: functionalisation of the C-ring after cyclization, or cyclization of a functionalised precursor. The first approach was investigated with the ABC-ring core, obtained from anthranilamide and glutaraldehyde in a single step. Various attempts to further functionalised the enamine present in the C-ring were explored, but unfortunately failed. The alternative approach investigated the use of commercially available sugars D- and L-lyxose to afford a functionalised C-ring analogue. Following the initial synthesis of the AB-system in the first step, various strategies to perform the cyclization to afford the functionalised C-ring were explored. Although cyclization of the unprotected system was achieved in high yield using Mitsunobu conditions, the product obtained was actually a previously unreported fully substituted THF-system with four contiguous stereogenic centres, which was fully characterised, and the stereochemistry confirmed by X-ray crystallography.
Published: 2023

45. Exploring the Action Mechanism of Rosa roxburghii Fruit Flavonoids in the Intervention of Ulcerative Colitis Based on Network Pharmacology, Molecular Docking and Experimental Verification

Author: PU Xian, YUAN Meng, TAN Shuming, XIE Guofang, TAO Yun, LOU Jienan, LU Guanglei, XU Haoran
Subjects: rosa roxburghii flavonoids, ulcerative colitis, network pharmacology, molecular docking, arachidonic acid metabolism, Food processing and manufacture, TP368-456
Abstract: The aim of this study was to investigate the core components and mechanism of action of Rosa roxburghii fruit flavonoids in the intervention of ulcerative colitis (UC) using network pharmacology and molecular docking. We extracted flavonoids from R. roxburghii fruit by an ethanol-cellulase method and used liquid chromatography-mass spectrometry (LC-MS) to identify the major chemical components of the flavonoid extract, and then we selected 34 active flavonoid components and 146 targets in the intervention of UC on an analytical platform and constructed a “R. roxburghii flavonoids-active components-intersecting targets-UC” network. The results of gene ontology pathway analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis showed that the intervention of R. roxburghii flavonoids in UC mainly involved the arachidonic acid signaling pathway, the nuclear factor-kappa B signaling pathway, and cancer pathways. Molecular docking was performed on the major active ingredients and the key targets selected from the network, with docking energies all less than −6.0 kcal/mol, indicating a strong binding affinity between the core active ingredients and the key targets for UC treatment. So, the network analysis results are reliable. The results of animal experiments showed that intervention with R. roxburghii flavonoids could effectively alleviate the body mass loss and colon pathological changes of UC mice, and significantly inhibit the expression of inflammatory cytokines such as tumour necrosis factor-α, interleukin (IL)-1β and IL-6 in the UC model group (P < 0.05). The results of Western blot showed that R. roxburghii flavonoids could effectively inhibit the expression of the key target proteins associated with the metabolic pathway of arachidonic acid, such as cyclooxygenase-2 and 5-lipoxygenase. In conclusion, R. roxburghii flavonoids can intervene in UC through a multi-component, multi-target and multi-pathway mechanism. The results of this research can provide a theoretical basis for the development and utilization of functional foods from R. roxburghii.
Published: 2024
Full Text: View/download PDF

46. Geometrical control of interface patterning underlies active matter invasion

Author: Xu, Haoran, Nejad, Mehrana R., Yeomans, Julia M., and Wu, Yilin
Subjects: Condensed Matter - Soft Condensed Matter
Abstract: Interaction between active materials and the boundaries of geometrical confinement is key to many emergent phenomena in active systems. For living active matter consisting of animal cells or motile bacteria, the confinement boundary is often a deformable interface, and it has been unclear how activity-induced interface dynamics might lead to morphogenesis and pattern formation. Here we studied the evolution of bacterial active matter confined by a deformable boundary. We discovered that an ordered morphological pattern emerged at the interface characterized by periodically-spaced interfacial protrusions; behind the interfacial protrusions, bacterial swimmers self-organized into multicellular clusters displaying +1/2 nematic defects. Subsequently, a hierarchical sequence of transitions from interfacial protrusions to creeping branches allowed the bacterial active drop to rapidly invade surrounding space with a striking self-similar branch pattern. We found that this interface patterning is controlled by the local curvature of the interface, a phenomenon we denote as collective curvature sensing. Using a continuum active model, we revealed that the collective curvature sensing arises from enhanced active stresses near high-curvature regions, with the active length-scale setting the characteristic distance between the interfacial protrusions. Our findings reveal a protrusion-to-branch transition as a novel mode of active matter invasion and suggest a new strategy to engineer pattern formation of active materials.
Published: 2022

47. Autonomous waves and global motion modes in living active solids

Author: Xu, Haoran, Huang, Yulu, Zhang, Rui, and Wu, Yilin
Subjects: Condensed Matter - Soft Condensed Matter, Physics - Biological Physics
Abstract: Elastic active matter or active solid consists of self-propelled units embedded in an elastic matrix. Active solid resists deformation; the shape-preserving property and the intrinsic non-equilibrium nature make active solids a superior component for self-driven devices. Nonetheless, the mechanical properties and emergent behavior of active solids are poorly understood. Using a biofilm-based bacterial active solid, here we discovered self-sustained elastic waves with unique wave properties not seen in passive solids, such as power-law scaling of wave speed with activity. Under isotropic confinement, the active solid develops two topologically distinct global motion modes that can be selectively excited, with a surprising step-like frequency jump at mode transition. Our findings reveal novel spatiotemporal order in elastic active matter and may guide the development of solid-state adaptive or living materials.
Published: 2022
Full Text: View/download PDF

48. Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations

Author: Xu, Haoran, Zhan, Xianyuan, Yin, Honglei, and Qin, Huiling
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: We study the problem of offline Imitation Learning (IL) where an agent aims to learn an optimal expert behavior policy without additional online environment interactions. Instead, the agent is provided with a supplementary offline dataset from suboptimal behaviors. Prior works that address this problem either require that expert data occupies the majority proportion of the offline dataset, or need to learn a reward function and perform offline reinforcement learning (RL) afterwards. In this paper, we aim to address the problem without additional steps of reward learning and offline RL training for the case when demonstrations contain a large proportion of suboptimal data. Built upon behavioral cloning (BC), we introduce an additional discriminator to distinguish expert and non-expert data. We propose a cooperation framework to boost the learning of both tasks, Based on this framework, we design a new IL algorithm, where the outputs of discriminator serve as the weights of the BC loss. Experimental results show that our proposed algorithm achieves higher returns and faster training speed compared to baseline algorithms., Comment: ICML 2022, code at https://github.com/ryanxhr/DWBC
Published: 2022

49. Discriminator-Guided Model-Based Offline Imitation Learning

Author: Zhang, Wenjia, Xu, Haoran, Niu, Haoyi, Cheng, Peng, Li, Ming, Zhang, Heming, Zhou, Guyue, and Zhan, Xianyuan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Offline imitation learning (IL) is a powerful method to solve decision-making problems from expert demonstrations without reward labels. Existing offline IL methods suffer from severe performance degeneration under limited expert data. Including a learned dynamics model can potentially improve the state-action space coverage of expert data, however, it also faces challenging issues like model approximation/generalization errors and suboptimality of rollout data. In this paper, we propose the Discriminator-guided Model-based offline Imitation Learning (DMIL) framework, which introduces a discriminator to simultaneously distinguish the dynamics correctness and suboptimality of model rollout data against real expert demonstrations. DMIL adopts a novel cooperative-yet-adversarial learning strategy, which uses the discriminator to guide and couple the learning process of the policy and dynamics model, resulting in improved model performance and robustness. Our framework can also be extended to the case when demonstrations contain a large proportion of suboptimal data. Experimental results show that DMIL and its extension achieve superior performance and robustness compared to state-of-the-art offline IL methods under small datasets., Comment: This work has been accepted by CoRL 2022
Published: 2022

50. Regional climate model intercomparison over the Tibetan Plateau in the GEWEX/LS4P Phase I

Author: Tang, Jianping, Xue, Yongkang, Long, Mengyuan, Ma, Mengnan, Liang, Xin-Zhong, Sugimoto, Shiori, Yang, Kun, Ji, Zhenming, Hong, Jinkyu, Kim, Jeongwon, Xu, Haoran, Zhou, Xu, Sato, Tomonori, Takahashi, Hiroshi G, Wang, Shuyu, Wang, Guiling, Chou, Sin Chan, Guo, Weidong, Yu, Miao, and Pan, Xiaoduo
Subjects: Earth Sciences, Oceanography, Atmospheric Sciences, Climate Change Science, Climate Action, Tibetan Plateau, Regional climate model, Precipitation, Temperature, Seasonal prediction, LS4P, Physical Geography and Environmental Geoscience, Meteorology & Atmospheric Sciences, Atmospheric sciences, Climate change science
Published: 2023

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

1,740 results on '"Xu, Haoran"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources