Author: "Chen,Jie" / Publication Year Range: Last 50 years - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Chen,Jie"' showing total 56,679 results


1. Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning

Author: Zhou, Shicheng, Liu, Jingju, Lu, Yuliang, Yang, Jiahai, Zhang, Yue, and Chen, Jie
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: With increasing numbers of vulnerabilities exposed on the internet, autonomous penetration testing (pentesting) has become an emerging research area, and reinforcement learning (RL) is a natural fit for studying it. Previous research in RL-based autonomous pentesting mainly focused on enhancing agents' learning efficacy within abstract simulated training environments. It overlooked the applicability and generalization requirements of deploying agents' policies in real-world environments that differ substantially from their training settings. In contrast, for the first time, we shift focus to the pentesting agents' ability to generalize across unseen real environments. For this purpose, we propose a Generalizable Autonomous Pentesting framework (namely GAP) for training agents capable of drawing inferences from one instance to another -- a key requirement for the broad application of autonomous pentesting and a hallmark of human intelligence. GAP introduces a Real-to-Sim-to-Real pipeline with two key methods: domain randomization and meta-RL. Specifically, we are among the first to apply domain randomization in autonomous pentesting and propose a large language model-powered domain randomization method for synthetic environment generation. We further apply meta-RL to improve the agents' generalization ability in unseen environments by leveraging the synthetic environments. The combination of these two methods can effectively bridge the generalization gap and improve policy adaptation performance. Experiments are conducted on various vulnerable virtual machines, with results showing that GAP can (a) enable policy learning in unknown real environments, (b) achieve zero-shot policy transfer in similar environments, and (c) realize rapid policy adaptation in dissimilar environments.
Comment: This work has been submitted to the IEEE for possible publication
Published: 2024

2. RelayGS: Reconstructing Dynamic Scenes with Large-Scale and Complex Motions via Relay Gaussians

Author: Gao, Qiankun, Wu, Yanmin, Wen, Chengxiang, Meng, Jiarui, Tang, Luyang, Chen, Jie, Wang, Ronggang, and Zhang, Jian
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Reconstructing dynamic scenes with large-scale and complex motions remains a significant challenge. Recent techniques like Neural Radiance Fields and 3D Gaussian Splatting (3DGS) have shown promise but still struggle with scenes involving substantial movement. This paper proposes RelayGS, a novel method based on 3DGS, specifically designed to represent and reconstruct highly dynamic scenes. Our RelayGS learns a complete 4D representation with canonical 3D Gaussians and a compact motion field, consisting of three stages. First, we learn a fundamental 3DGS from all frames, ignoring temporal scene variations, and use a learnable mask to separate the highly dynamic foreground from the minimally moving background. Second, we replicate multiple copies of the decoupled foreground Gaussians from the first stage, each corresponding to a temporal segment, and optimize them using pseudo-views constructed from multiple frames within each segment. These Gaussians, termed Relay Gaussians, act as explicit relay nodes, simplifying and breaking down large-scale motion trajectories into smaller, manageable segments. Finally, we jointly learn the scene's temporal motion and refine the canonical Gaussians learned from the first two stages. We conduct thorough experiments on two dynamic scene datasets featuring large and complex motions, where our RelayGS outperforms state-of-the-art methods by more than 1 dB in PSNR, and successfully reconstructs real-world basketball game scenes in a much more complete and coherent manner, whereas previous methods usually struggle to capture the complex motion of players. Code will be publicly available at https://github.com/gqk/RelayGS.
Comment: Technical Report. GitHub: https://github.com/gqk/RelayGS
Published: 2024

3. FreeCodec: A disentangled neural speech codec with fewer tokens

Author: Zheng, Youqiang, Tu, Weiping, Kang, Yueteng, Chen, Jie, Zhang, Yike, Xiao, Li, Yang, Yuhong, and Ma, Long
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Neural speech codecs have gained great attention for their outstanding reconstruction with discrete token representations. They are a crucial component in generative tasks such as speech coding and large language models (LLMs). However, most works based on residual vector quantization perform worse with fewer tokens due to low coding efficiency for modeling complex coupled information. In this paper, we propose a neural speech codec named FreeCodec which employs a more effective encoding framework by decomposing intrinsic properties of speech into different components: 1) a global vector is extracted as the timbre information, 2) a prosody encoder with a long stride level is used to model the prosody information, 3) the content information is from a content encoder. Using different training strategies, FreeCodec achieves state-of-the-art performance in reconstruction and disentanglement scenarios. Results from subjective and objective experiments demonstrate that our framework outperforms existing methods.
Published: 2024

4. Angular dependence of large negative magnetoresistance in a field-induced Weyl semimetal candidate HoAuSn

Author: Lu, Yue, Chen, Jie, Zhou, Feng, Lau, Yong-Chang, Wisniewski, Piotr, Kaczorowski, Dariusz, Xi, Xue-Kui, and Wang, Wen-Hong
Subjects: Condensed Matter - Materials Science
Abstract: The angular dependence of magnetoresistance (MR) in antiferromagnetic half-Heusler HoAuSn single crystals has been systematically studied. Negative MR, as large as 99%, is observed at 9 T, is not restricted to the specific configuration of applied magnetic fields and current, and can persist up to 20 K, much higher than the Néel temperature (TN 1.9 K). Experiments and first-principles calculations suggest that the observed large negative MR is derived from a magnetic field that reconstructs the band structure and induces a Weyl point, which changes the carrier concentration. Taking into consideration that large negative MR has so far been rarely reported, especially in antiferromagnetic materials, it is anticipated that the present work not only offers a guideline for searching materials with large negative MR but also helps to further realize other exotic topological electronic states in a large class of antiferromagnetic half-Heusler compounds.
Published: 2024

5. Self-Supervised Conditional Distribution Learning on Graphs

Author: Chen, Jie, Mao, Hua, Gou, Yuanbiao, Wang, Zhu, and Peng, Xi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Graph contrastive learning (GCL) has shown promising performance in semisupervised graph classification. However, existing studies still encounter significant challenges in GCL. First, successive layers in graph neural network (GNN) tend to produce more similar node embeddings, while GCL aims to increase the dissimilarity between negative pairs of node embeddings. This inevitably results in a conflict between the message-passing mechanism of GNNs and the contrastive learning of negative pairs via intraviews. Second, leveraging the diversity and quantity of data provided by graph-structured data augmentations while preserving intrinsic semantic information is challenging. In this paper, we propose a self-supervised conditional distribution learning (SSCDL) method designed to learn graph representations from graph-structured data for semisupervised graph classification. Specifically, we present an end-to-end graph representation learning model to align the conditional distributions of weakly and strongly augmented features over the original features. This alignment effectively reduces the risk of disrupting intrinsic semantic information through graph-structured data augmentation. To avoid conflict between the message-passing mechanism and contrastive learning of negative pairs, positive pairs of node representations are retained for measuring the similarity between the original features and the corresponding weakly augmented features. Extensive experiments with several benchmark graph datasets demonstrate the effectiveness of the proposed SSCDL method.
Comment: 8 pages
Published: 2024

6. Adversarial Diffusion Compression for Real-World Image Super-Resolution

Author: Chen, Bin, Li, Gehui, Wu, Rongyuan, Zhang, Xindong, Chen, Jie, Zhang, Jian, and Zhang, Lei
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Real-world image super-resolution (Real-ISR) aims to reconstruct high-resolution images from low-resolution inputs degraded by complex, unknown processes. While many Stable Diffusion (SD)-based Real-ISR methods have achieved remarkable success, their slow, multi-step inference hinders practical deployment. Recent SD-based one-step networks like OSEDiff and S3Diff alleviate this issue but still incur high computational costs due to their reliance on large pretrained SD models. This paper proposes a novel Real-ISR method, AdcSR, by distilling the one-step diffusion network OSEDiff into a streamlined diffusion-GAN model under our Adversarial Diffusion Compression (ADC) framework. We meticulously examine the modules of OSEDiff, categorizing them into two types: (1) Removable (VAE encoder, prompt extractor, text encoder, etc.) and (2) Prunable (denoising UNet and VAE decoder). Since direct removal and pruning can degrade the model's generation capability, we pretrain our pruned VAE decoder to restore its ability to decode images and employ adversarial distillation to compensate for performance loss. This ADC-based diffusion-GAN hybrid design effectively reduces complexity by 73% in inference time, 78% in computation, and 74% in parameters, while preserving the model's generation capability. Experiments show that our proposed AdcSR achieves competitive recovery quality on both synthetic and real-world datasets, offering up to 9.3× speedup over previous one-step diffusion-based methods. Code and models will be made available.
Published: 2024

7. Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search

Author: Jiang, Jinhao, Chen, Zhipeng, Min, Yingqian, Chen, Jie, Cheng, Xiaoxue, Wang, Jiapeng, Tang, Yiru, Sun, Haoxiang, Deng, Jia, Zhao, Wayne Xin, Liu, Zheng, Yan, Dong, Xie, Jian, Wang, Zhongyuan, and Wen, Ji-Rong
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Recently, test-time scaling has garnered significant attention from the research community, largely due to the substantial advancements of the o1 model released by OpenAI. By allocating more computational resources during the inference phase, large language models (LLMs) can extensively explore the solution space by generating more thought tokens or diverse solutions, thereby producing more accurate responses. However, developing an o1-like reasoning approach is challenging, and researchers have been making various attempts to advance this open area of research. In this paper, we present a preliminary exploration into enhancing the reasoning abilities of LLMs through reward-guided tree search algorithms. This framework is implemented by integrating the policy model, reward model, and search algorithm. It is primarily constructed around a tree search algorithm, where the policy model navigates a dynamically expanding tree guided by a specially trained reward model. We thoroughly explore various design considerations necessary for implementing this framework and provide a detailed report of the technical aspects. To assess the effectiveness of our approach, we focus on mathematical reasoning tasks and conduct extensive evaluations on four challenging datasets, significantly enhancing the reasoning abilities of LLMs.
Comment: LLM;Complex Reasoning;Math
Published: 2024

8. Local well-posedness for the Schrödinger-KdV system in $H^{s_1}\times H^{s_2}$, II

Author: Ban, Yingzhe, Chen, Jie, and Zhang, Ying
Subjects: Mathematics - Analysis of PDEs
Abstract: In this paper, we continue the study of the local well-posedness theory for the Schrödinger-KdV system in the Sobolev space $H^{s_1}\times H^{s_2}$. We show the local well-posedness in $H^{-3/16}\times H^{-3/4}$ for $\beta = 0$. Combining this with our work \cite{banchenzhang}, we also have the local well-posedness for $\max\{-3/4,s_1-3\}\leq s_2\leq \min\{4s_1,s_1+2\}$. The result is sharp by using the contraction mapping argument.
Published: 2024

9. Local well-posedness for the Schrödinger-KdV system in $H^{s_1}\times H^{s_2}$

Author: Ban, Yingzhe, Chen, Jie, and Zhang, Ying
Subjects: Mathematics - Analysis of PDEs
Abstract: In this paper, we study the local well-posedness theory of the Cauchy problem for the Schrödinger-KdV system in Sobolev spaces $H^{s_1}\times H^{s_2}$. We obtain the local well-posedness when $s_1\geq 0$, $\max\{-3/4,s_1-3\}\leq s_2\leq \min\{4s_1,s_1+2\}$. The result is sharp in some sense and improves on the previous one by Corcho-Linares \cite{corcho2007well}. The endpoint case $(s_1,s_2) = (0,-3/4)$ has been solved in \cite{guo2010well,wang2011cauchy}. We show the necessary and sufficient conditions for related estimates in Bourgain spaces. To solve the borderline cases, we use the $U^p-V^p$ spaces introduced by Koch-Tataru \cite{kochtataru} and function spaces constructed by Guo-Wang \cite{guo2010well}. We also use a normal form argument to control the nonresonant interaction.
Published: 2024

10. HiCoM: Hierarchical Coherent Motion for Streamable Dynamic Scene with 3D Gaussian Splatting

Author: Gao, Qiankun, Meng, Jiarui, Wen, Chengxiang, Chen, Jie, and Zhang, Jian
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The online reconstruction of dynamic scenes from multi-view streaming videos faces significant challenges in training, rendering and storage efficiency. Harnessing superior learning speed and real-time rendering capabilities, 3D Gaussian Splatting (3DGS) has recently demonstrated considerable potential in this field. However, 3DGS can be inefficient in terms of storage and prone to overfitting by excessively growing Gaussians, particularly with limited views. This paper proposes an efficient framework, dubbed HiCoM, with three key components. First, we construct a compact and robust initial 3DGS representation using a perturbation smoothing strategy. Next, we introduce a Hierarchical Coherent Motion mechanism that leverages the inherent non-uniform distribution and local consistency of 3D Gaussians to swiftly and accurately learn motions across frames. Finally, we continually refine the 3DGS with additional Gaussians, which are later merged into the initial 3DGS to maintain consistency with the evolving scene. To preserve a compact representation, an equivalent number of low-opacity Gaussians that minimally impact the representation are removed before processing subsequent frames. Extensive experiments conducted on two widely used datasets show that our framework improves learning efficiency of the state-of-the-art methods by about 20% and reduces the data storage by 85%, achieving competitive free-viewpoint video synthesis quality but with higher robustness and stability. Moreover, by learning multiple frames in parallel, our HiCoM decreases the average training wall time to under 2 seconds per frame with negligible performance degradation, substantially boosting real-world applicability and responsiveness.
Comment: Accepted to NeurIPS 2024; Code is available at https://github.com/gqk/HiCoM
Published: 2024

11. An Efficient Hierarchical Preconditioner-Learner Architecture for Reconstructing Multi-scale Basis Functions of High-dimensional Subsurface Fluid Flow

Author: Li, Peiqi and Chen, Jie
Subjects: Physics - Fluid Dynamics, Computer Science - Machine Learning, 35Q35
Abstract: Modeling subsurface fluid flow in porous media is crucial for applications such as oil and gas exploration. However, the inherent heterogeneity and multi-scale characteristics of these systems pose significant challenges in accurately reconstructing fluid flow behaviors. To address this issue, we proposed Fourier Preconditioner-based Hierarchical Multiscale Net (FP-HMsNet), an efficient hierarchical preconditioner-learner architecture that combines Fourier Neural Operators (FNO) with multi-scale neural networks to reconstruct multi-scale basis functions of high-dimensional subsurface fluid flow. Using a dataset comprising 102,757 training samples, 34,252 validation samples, and 34,254 test samples, we ensured the reliability and generalization capability of the model. Experimental results showed that FP-HMsNet achieved an MSE of 0.0036, an MAE of 0.0375, and an R² of 0.9716 on the testing set, significantly outperforming existing models and demonstrating exceptional accuracy and generalization ability. Additionally, robustness tests revealed that the model maintained stability under various levels of noise interference. Ablation studies confirmed the critical contribution of the preconditioner and multi-scale pathways to the model's performance. Compared to current models, FP-HMsNet not only achieved lower errors and higher accuracy but also demonstrated faster convergence and improved computational efficiency, establishing itself as the state-of-the-art (SOTA) approach. This model offers a novel method for efficient and accurate subsurface fluid flow modeling, with promising potential for more complex real-world applications.
Comment: 20 pages, 9 figures
Published: 2024

12. The D-Subspace Algorithm for Online Learning over Distributed Networks

Author: Chen, Yitong, Jin, Danqi, Chen, Jie, and Richard, Cedric
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: This material introduces the D-Subspace algorithm derived on the basis of the centralized algorithm [1], which originally addresses parameter estimation problems under a subspace constraint.
Published: 2024

13. Standardizing Generative Face Video Compression using Supplemental Enhancement Information

Author: Chen, Bolin, Ye, Yan, Chen, Jie, Liao, Ru-Ling, Yin, Shanzhi, Wang, Shiqi, Yang, Kaifa, Li, Yue, Xu, Yiling, Wang, Ye-Kui, Gehlot, Shiv, Su, Guan-Ming, Yin, Peng, McCarthy, Sean, and Sullivan, Gary J.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper proposes a Generative Face Video Compression (GFVC) approach using Supplemental Enhancement Information (SEI), where a series of compact spatial and temporal representations of a face video signal (i.e., 2D/3D key-points, facial semantics and compact features) can be coded using SEI messages and inserted into the coded video bitstream. At the time of writing, the proposed GFVC approach is an official "technology under consideration" (TuC) for standardization by the Joint Video Experts Team (JVET) of ISO/IEC JTC 1/SC 29 and ITU-T SG16. To the best of the authors' knowledge, the JVET work on the proposed SEI-based GFVC approach is the first standardization activity for generative video compression. The proposed SEI approach has not only advanced the reconstruction quality of early-day Model-Based Coding (MBC) via the state-of-the-art generative technique, but also established a new SEI definition for future GFVC applications and deployment. Experimental results illustrate that the proposed SEI-based GFVC approach can achieve remarkable rate-distortion performance compared with the latest Versatile Video Coding (VVC) standard, whilst also potentially enabling a wide variety of functionalities including user-specified animation/filtering and metaverse-related applications.
Published: 2024

14. Graph Neural Flows for Unveiling Systemic Interactions Among Irregularly Sampled Time Series

Author: Mercatali, Giangiacomo, Freitas, Andre, and Chen, Jie
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Interacting systems are prevalent in nature. It is challenging to accurately predict the dynamics of the system if its constituent components are analyzed independently. We develop a graph-based model that unveils the systemic interactions of time series observed at irregular time points, by using a directed acyclic graph to model the conditional dependencies (a form of causal notation) of the system components and learning this graph in tandem with a continuous-time model that parameterizes the solution curves of ordinary differential equations (ODEs). Our technique, a graph neural flow, leads to substantial enhancements over non-graph-based methods, as well as graph-based methods without the modeling of conditional dependencies. We validate our approach on several tasks, including time series classification and forecasting, to demonstrate its efficacy.
Comment: NeurIPS 2024. Code is available at https://github.com/gmerca/GNeuralFlow
Published: 2024

15. Beyond GFVC: A Progressive Face Video Compression Framework with Adaptive Visual Tokens

Author: Chen, Bolin, Yin, Shanzhi, Zhang, Zihan, Chen, Jie, Liao, Ru-Ling, Zhu, Lingyu, Wang, Shiqi, and Ye, Yan
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recently, deep generative models have greatly advanced the progress of face video coding towards promising rate-distortion performance and diverse application functionalities. Beyond traditional hybrid video coding paradigms, Generative Face Video Compression (GFVC) relying on the strong capabilities of deep generative models and the philosophy of early Model-Based Coding (MBC) can facilitate the compact representation and realistic reconstruction of visual face signal, thus achieving ultra-low bitrate face video communication. However, these GFVC algorithms are sometimes faced with unstable reconstruction quality and limited bitrate ranges. To address these problems, this paper proposes a novel Progressive Face Video Compression framework, namely PFVC, that utilizes adaptive visual tokens to realize exceptional trade-offs between reconstruction robustness and bandwidth intelligence. In particular, the encoder of the proposed PFVC projects the high-dimensional face signal into adaptive visual tokens in a progressive manner, whilst the decoder can further reconstruct these adaptive visual tokens for motion estimation and signal synthesis with different granularity levels. Experimental results demonstrate that the proposed PFVC framework can achieve better coding flexibility and superior rate-distortion performance in comparison with the latest Versatile Video Coding (VVC) codec and the state-of-the-art GFVC algorithms. The project page can be found at https://github.com/Berlin0610/PFVC.
Published: 2024

16. Identifying Money Laundering Subgraphs on the Blockchain

Author: Song, Kiwhan, Dhraief, Mohamed Ali, Xu, Muhua, Cai, Locke, Chen, Xuhao, Arvind, and Chen, Jie
Subjects: Computer Science - Machine Learning, Quantitative Finance - General Finance
Abstract: Anti-Money Laundering (AML) involves the identification of money laundering crimes in financial activities, such as cryptocurrency transactions. Recent studies advanced AML through the lens of graph-based machine learning, modeling the web of financial transactions as a graph and developing graph methods to identify suspicious activities. For instance, a recent effort on opensourcing datasets and benchmarks, Elliptic2, treats a set of Bitcoin addresses, considered to be controlled by the same entity, as a graph node and transactions among entities as graph edges. This modeling reveals the "shape" of a money laundering scheme - a subgraph on the blockchain. Despite the attractive subgraph classification results benchmarked by the paper, competitive methods remain expensive to apply due to the massive size of the graph; moreover, existing methods require candidate subgraphs as inputs which may not be available in practice. In this work, we introduce RevTrack, a graph-based framework that enables large-scale AML analysis with a lower cost and a higher accuracy. The key idea is to track the initial senders and the final receivers of funds; these entities offer a strong indication of the nature (licit vs. suspicious) of their respective subgraph. Based on this framework, we propose RevClassify, which is a neural network model for subgraph classification. Additionally, we address the practical problem where subgraph candidates are not given, by proposing RevFilter. This method identifies new suspicious subgraphs by iteratively filtering licit transactions, using RevClassify. Benchmarking these methods on Elliptic2, a new standard for AML, we show that RevClassify outperforms state-of-the-art subgraph classification techniques in both cost and accuracy. Furthermore, we demonstrate the effectiveness of RevFilter in discovering new suspicious subgraphs, confirming its utility for practical AML.
Comment: ICAIF 2024. Code is available at https://github.com/MITIBMxGraph/RevTrack
Published: 2024

17. Decentralized Clinical Trials in the Era of Real-World Evidence: A Statistical Perspective

Author: Chen, Jie, Di, Junrui, Daizadeh, Nadia, Lu, Ying, Wang, Hongwei, Shen, Yuan-Li, Kirk, Jennifer, Rockhold, Frank W., Pang, Herbert, Zhao, Jing, He, Weili, Potter, Andrew, and Lee, Hana
Subjects: Statistics - Applications
Abstract: There has been a growing trend that activities relating to clinical trials take place at locations other than traditional trial sites (hence decentralized clinical trials or DCTs), some of which are at settings of real-world clinical practice. Although there are numerous benefits of DCTs, this also brings some implications on a number of issues relating to the design, conduct, and analysis of DCTs. The Real-World Evidence Scientific Working Group of the American Statistical Association Biopharmaceutical Section has been reviewing the field of DCTs and provides in this paper considerations for decentralized trials from a statistical perspective. This paper first discusses selected critical decentralized elements that may have statistical implications on the trial and then summarizes regulatory guidance, framework, and initiatives on DCTs. More discussions are presented by focusing on the design (including construction of estimand), implementation, statistical analysis plan (including missing data handling), and reporting of safety events. Some additional considerations (e.g., ethical considerations, technology infrastructure, study oversight, data security and privacy, and regulatory compliance) are also briefly discussed. This paper is intended to provide statistical considerations for decentralized trials of medical products to support regulatory decision-making.
Published: 2024

18. Use of Real-World Data and Real-World Evidence in Rare Disease Drug Development: A Statistical Perspective

Author: Chen, Jie, Gruber, Susan, Lee, Hana, Chu, Haitao, Lee, Shiowjen, Tian, Haijun, Wang, Yan, He, Weili, Jemielita, Thomas, Song, Yang, Tamura, Roy, Tian, Lu, Zhao, Yihua, Chen, Yong, van der Laan, Mark, and Nie, Lei
Subjects: Statistics - Applications
Abstract: Real-world data (RWD) and real-world evidence (RWE) have been increasingly used in medical product development and regulatory decision-making, especially for rare diseases. After outlining the challenges and possible strategies to address the challenges in rare disease drug development (see the accompanying paper), the Real-World Evidence (RWE) Scientific Working Group of the American Statistical Association Biopharmaceutical Section reviews the roles of RWD and RWE in clinical trials for drugs treating rare diseases. This paper summarizes relevant guidance documents and frameworks by selected regulatory agencies and the current practice on the use of RWD and RWE in natural history studies and the design, conduct, and analysis of rare disease clinical trials. A targeted learning roadmap for rare disease trials is described, followed by case studies on the use of RWD and RWE to support a natural history study and marketing applications in various settings.
Published: 2024

19. Challenges and Possible Strategies to Address Them in Rare Disease Drug Development: A Statistical Perspective

Author: Chen, Jie, Nie, Lei, Lee, Shiowjen, Chu, Haitao, Tian, Haijun, Wang, Yan, He, Weili, Jemielita, Thomas, Gruber, Susan, Song, Yang, Tamura, Roy, Tian, Lu, Zhao, Yihua, Chen, Yong, van der Laan, Mark, and Lee, Hana
Subjects: Statistics - Applications
Abstract: Developing drugs for rare diseases presents unique challenges from a statistical perspective. These challenges may include slowly progressive diseases with unmet medical needs, poorly understood natural history, small population size, diversified phenotypes and genotypes within a disorder, and lack of appropriate surrogate endpoints to measure clinical benefits. The Real-World Evidence (RWE) Scientific Working Group of the American Statistical Association Biopharmaceutical Section has assembled a research team to assess the landscape including challenges and possible strategies to address these challenges and the role of real-world data (RWD) and RWE in rare disease drug development. This paper first reviews the current regulations by regulatory agencies worldwide and then discusses in more detail the challenges from a statistical perspective in the design, conduct, and analysis of rare disease clinical trials. After outlining an overall development pathway for rare disease drugs, corresponding strategies to address the aforementioned challenges are presented. Other considerations are also discussed for generating relevant evidence for regulatory decision-making on drugs for rare diseases. The accompanying paper discusses how RWD and RWE can be used to improve the efficiency of rare disease drug development.
Published: 2024

20. Remote Sensing Image Segmentation Using Vision Mamba and Multi-Scale Multi-Frequency Feature Fusion

Author: Cao, Yice, Liu, Chenchen, Wu, Zhenhua, Yao, Wenxin, Xiong, Liu, Chen, Jie, and Huang, Zhixiang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: As remote sensing imaging technology continues to advance and evolve, processing high-resolution and diversified satellite imagery to improve segmentation accuracy and enhance interpretation efficiency has emerged as a pivotal area of investigation within the realm of remote sensing. Although segmentation algorithms based on CNNs and Transformers achieve significant progress in performance, balancing segmentation accuracy and computational complexity remains challenging, limiting their wide application in practical tasks. To address this, this paper introduces a state space model (SSM) and proposes a novel hybrid semantic segmentation network based on vision Mamba (CVMH-UNet). This method designs a cross-scanning visual state space block (CVSSBlock) that uses cross 2D scanning (CS2D) to fully capture global information from multiple directions. By incorporating convolutional neural network branches to overcome the constraints of Vision Mamba (VMamba) in acquiring local information, this approach facilitates a comprehensive analysis of both global and local features. Furthermore, to address the issue of limited discriminative power and the difficulty in achieving detailed fusion with direct skip connections, a multi-frequency multi-scale feature fusion block (MFMSBlock) is designed. This module introduces multi-frequency information through 2D discrete cosine transform (2D DCT) to enhance information utilization and provides additional scale local detail information through point-wise convolution branches. Finally, it aggregates multi-scale information along the channel dimension, achieving refined feature fusion. Findings from experiments conducted on renowned datasets of remote sensing imagery demonstrate that the proposed CVMH-UNet achieves superior segmentation performance while maintaining low computational complexity, outperforming current leading-edge segmentation algorithms.
Published: 2024

21. Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning

Author: Liu, Gang, Sun, Michael, Matusik, Wojciech, Jiang, Meng, and Chen, Jie
Subjects: Computer Science - Machine Learning, Physics - Chemical Physics, Quantitative Biology - Biomolecules
Abstract: While large language models (LLMs) have integrated images, adapting them to graphs remains challenging, limiting their applications in materials and drug design. This difficulty stems from the need for coherent autoregressive generation across texts and graphs. To address this, we introduce Llamole, the first multimodal LLM capable of interleaved text and graph generation, enabling molecular inverse design with retrosynthetic planning. Llamole integrates a base LLM with the Graph Diffusion Transformer and Graph Neural Networks for multi-conditional molecular generation and reaction inference within texts, while the LLM, with enhanced molecular understanding, flexibly controls activation among the different graph modules. Additionally, Llamole integrates A* search with LLM-based cost functions for efficient retrosynthetic planning. We create benchmarking datasets and conduct extensive experiments to evaluate Llamole against in-context learning and supervised fine-tuning. Llamole significantly outperforms 14 adapted LLMs across 12 metrics for controllable molecular design and retrosynthetic planning., Comment: 27 pages, 11 figures, 4 tables
Published: 2024

22. The Relationship Between Travel Distance for Treatment and Outcomes in Patients Undergoing Radiation Therapy: A Systematic Review.

Author: Silverwood, Sierra, Waeldner, Kathleen, Demeulenaere, Sasha, Keren, Shavit, To, Jason, Chen, Jie, Kouzi, Zakaria, Ayoub, Alan, Grover, Surbhi, Lichter, Katie, and Mohamad, Osama
Abstract: PURPOSE: Although recent technological advances in radiation therapy have significantly improved treatment outcomes, the global distribution of radiation therapy is unbalanced, making access especially challenging for patients in rural or low-resource settings because of travel burden. This systematic review aimed to explore the impact of geographic distance to treatment facilities on survival, as well as other treatment outcomes, among patients undergoing radiation therapy. METHODS AND MATERIALS: A search of four databases (PubMed, Embase, CINAHL, and Web of Science) was performed. Studies were included if they were primary literature, published between May 2000 and May 2023, and reported the travel distances for patients undergoing radiation therapy for malignant conditions and their influence on survival outcomes. Studies were excluded if they did not report primary outcomes, were published before 2000, or were non-English. RESULTS: After review, 23 studies were included. Most studies were conducted in the United States, with cervical cancer being the most frequently studied disease site. Data suggested that travel distances vary significantly, with patients often traveling a median distance of 20 miles to radiation therapy. Among the studies, five reported a negative impact on overall survival, often associating greater travel with nonadherence to recommended care. Other survival metrics, including progression-free survival and all-cause mortality, were also assessed, demonstrating similar variability in relation to travel distance. Conversely, seven studies found no significant impact on overall survival, and four suggested a positive impact on overall survival, with improved outcomes at centers with higher case volumes. Some data also revealed an inverse correlation between travel distance and the likelihood of receiving guideline-concordant radiation therapy. CONCLUSIONS: The impact of travel distance on radiation therapy outcomes is varied. Our findings underscore the challenges posed by travel in accessing radiation therapy and the disparities affecting particular patient demographic groups. Additional studies are needed to thoroughly assess the impacts of geographic disparities and to identify effective measures to address these challenges.
Published: 2024

23. FlatKnotInfo: the first 1.24 million flat knots

Author: Chen, Jie
Subjects: Mathematics - Geometric Topology
Abstract: We use matchings on Lyndon words to classify flat knots up to 8 crossings. Using flat knot invariants such as the based matrix, the $\phi$-invariant, the flat arrow polynomial, and the flat Jones-Krushkal polynomial, we distinguish all flat knots up to 7 crossings except for five pairs. Among the many flat knots considered, we find examples that are: (i) algebraically slice but not slice; (ii) almost classical (null-homologous) but not slice; (iii) nontrivial but with trivial (primitive) based matrix. The classification data has been curated and is available on FlatKnotInfo, which is an interactive searchable website listing flat knots up to 8 crossings and their invariants. It also provides access to algebraic and diagrammatic information on these knots and is designed to enable users to discover patterns and formulate conjectures on their own.
Published: 2024

24. Tannenbaum's gain-margin optimization meets Polyak's heavy-ball algorithm

Author: Wu, Wuwei, Chen, Jie, Jovanović, Mihailo R., and Georgiou, Tryphon T.
Subjects: Electrical Engineering and Systems Science - Systems and Control, Mathematics - Numerical Analysis, Mathematics - Optimization and Control, 93B36, 93B52, 65-XX, 49Mxx, 49M15, 30E05
Abstract: The paper highlights a relatively unknown link between algorithm design in optimization and control synthesis in robust control. Specifically, quadratic optimization can be recast as a regulation problem within the framework of $\mathcal{H}_\infty$ control. From this vantage point, the optimality of Polyak's fastest heavy-ball algorithm can be ascertained as a solution to a gain margin optimization problem. The approach is independent of Polyak's original and brilliant argument, yet simpler, and relies on the foundational work by Tannenbaum that introduced and solved the gain margin optimization via Nevanlinna--Pick interpolation theory. The link between first-order optimization methods and robust control theory sheds new light on the limits of algorithmic performance for such methods, and suggests a new framework where similar computational problems can be systematically studied and algorithms optimized. In particular, it raises the question as to whether periodically scheduled algorithms can achieve faster rates for quadratic optimization, in a manner analogous to periodic control that extends gain margin beyond that of time-invariant control. This turns out not to be the case, due to the analytic obstruction of a transmission zero that is inherent in causal optimization algorithms. Interestingly, this obstruction can be removed with implicit algorithms, cast in a similar manner as feedback regulation problems with causal, but not strictly causal dynamics, thereby devoid of the transmission zero at infinity and able to achieve superior convergence rates. The confluence of the fields of optimization algorithms and control provides a frame to tackle questions pertaining to speed, accuracy, distributed computation, and so forth, and to delineate respective limits to performance and tradeoffs in a systematic manner, utilizing the formalism of robust control., Comment: 25 pages, 8 figures
Published: 2024

25. OTFS-MDMA: An Elastic Multi-Domain Resource Utilization Mechanism for High Mobility Scenarios

Author: Chen, Jie, Wang, Xianbin, and Hanzo, Lajos
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: By harnessing the delay-Doppler (DD) resource domain, orthogonal time-frequency space (OTFS) substantially improves the communication performance under high-mobility scenarios by maintaining quasi-time-invariant channel characteristics. However, conventional multiple access (MA) techniques fail to efficiently support OTFS in the face of diverse communication requirements. Recently, multi-dimensional MA (MDMA) has emerged as a flexible channel access technique by elastically exploiting multi-domain resources for tailored service provision. Therefore, we conceive an elastic multi-domain resource utilization mechanism for a novel multi-user OTFS-MDMA system by leveraging user-specific channel characteristics across the DD, power, and spatial resource domains. Specifically, we divide all DD resource bins into separate subregions called DD resource slots (RSs), each of which supports a fraction of users, thus reducing the multi-user interference. Then, the most suitable MA, including orthogonal, non-orthogonal, or spatial division MA (OMA/ NOMA/ SDMA), will be selected with each RS based on the interference levels in the power and spatial domains, thus enhancing the spectrum efficiency. Then, we jointly optimize the user assignment, access scheme selection, and power allocation in all DD RSs to maximize the weighted sum-rate subject to their minimum rate and various practical constraints. Since this results in a non-convex problem, we develop a dynamic programming and monotonic optimization (DPMO) method to find the globally optimal solution in the special case of disregarding rate constraints. Subsequently, we apply a low-complexity algorithm to find sub-optimal solutions in general cases., Comment: This paper has been accepted by IEEE Journal on Selected Areas in Communications
Published: 2024

26. Unrolling Plug-and-Play Network for Hyperspectral Unmixing

Author: Zhao, Min, Tang, Linruize, and Chen, Jie
Subjects: Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Deep learning based unmixing methods have received great attention in recent years and achieve remarkable performance. These methods employ a data-driven approach to extract structural features from hyperspectral images; however, they tend to be less physically interpretable. Conventional unmixing methods offer much more interpretability, but they require manually designing regularization and choosing penalty parameters. To overcome these limitations, we propose a novel unmixing method by unrolling the plug-and-play unmixing algorithm to construct the deep architecture. Our method integrates both inner and outer priors. The carefully designed unfolding deep architecture is used to learn the spectral and spatial information from the hyperspectral image, which we refer to as inner priors. Additionally, our approach incorporates deep denoisers that have been pretrained on a large volume of image data to leverage the outer priors. Furthermore, we design a dynamic convolution to model the multiscale information. Different scales are fused using an attention module. Experimental results on both synthetic and real datasets demonstrate that our method outperforms the compared methods.
Published: 2024

27. Hierarchical Sparse Representation Clustering for High-Dimensional Data Streams

Author: Chen, Jie, Mao, Hua, Gou, Yuanbiao, and Peng, Xi
Subjects: Computer Science - Machine Learning
Abstract: Data stream clustering reveals patterns within continuously arriving, potentially unbounded data sequences. Numerous data stream algorithms have been proposed to cluster data streams. The existing data stream clustering algorithms still face significant challenges when addressing high-dimensional data streams. First, it is intractable to measure the similarities among high-dimensional data objects via Euclidean distances when constructing and merging microclusters. Second, these algorithms are highly sensitive to the noise contained in high-dimensional data streams. In this paper, we propose a hierarchical sparse representation clustering (HSRC) method for clustering high-dimensional data streams. HSRC first employs an $l_1$-minimization technique to learn an affinity matrix for data objects in individual landmark windows with fixed sizes, where the number of neighboring data objects is automatically selected. This approach ensures that highly correlated data samples within clusters are grouped together. Then, HSRC applies a spectral clustering technique to the affinity matrix to generate microclusters. These microclusters are subsequently merged into macroclusters based on their sparse similarity degrees (SSDs). Additionally, HSRC introduces sparsity residual values (SRVs) to adaptively select representative data objects from the current landmark window. These representatives serve as dictionary samples for the next landmark window. Finally, HSRC refines each macrocluster through fine-tuning. In particular, HSRC enables the detection of outliers in high-dimensional data streams via the associated SRVs. The experimental results obtained on several benchmark datasets demonstrate the effectiveness and robustness of HSRC., Comment: 11 pages, 6 figures
Published: 2024

28. Dynamic Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling

Author: Wan, Guangya, Wu, Yuqi, Chen, Jie, and Li, Sheng
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Self-Consistency (SC) is a widely used method to mitigate hallucinations in Large Language Models (LLMs) by sampling the LLM multiple times and outputting the most frequent solution. Despite its benefits, SC results in significant computational costs proportional to the number of samples generated. Previous early-stopping approaches, such as Early Stopping Self Consistency and Adaptive Consistency, have aimed to reduce these costs by considering output consistency, but they do not analyze the quality of the reasoning paths (RPs) themselves. To address this issue, we propose Reasoning-Aware Self-Consistency (RASC), an innovative early-stopping framework that dynamically adjusts the number of sample generations by considering both the output answer and the RPs from Chain of Thought (CoT) prompting. RASC assigns confidence scores sequentially to the generated samples, stops when certain criteria are met, and then employs weighted majority voting to optimize sample usage and enhance answer reliability. We comprehensively test RASC with multiple LLMs across varied QA datasets. RASC outperforms existing methods and significantly reduces sample usage by an average of 80% while maintaining or improving accuracy by up to 5% compared to the original SC.
Published: 2024

29. Do Graph Neural Networks Work for High Entropy Alloys?

Author: Zhang, Hengrui, Huang, Ruishu, Chen, Jie, Rondinelli, James M., and Chen, Wei
Subjects: Computer Science - Machine Learning, Condensed Matter - Materials Science
Abstract: Graph neural networks (GNNs) have excelled in predictive modeling for both crystals and molecules, owing to the expressiveness of graph representations. High-entropy alloys (HEAs), however, lack chemical long-range order, limiting the applicability of current graph representations. To overcome this challenge, we propose a representation of HEAs as a collection of local environment (LE) graphs. Based on this representation, we introduce the LESets machine learning model, an accurate, interpretable GNN for HEA property prediction. We demonstrate the accuracy of LESets in modeling the mechanical properties of quaternary HEAs. Through analyses and interpretation, we further extract insights into the modeling and design of HEAs. In a broader sense, LESets extends the potential applicability of GNNs to disordered materials with combinatorial complexity formed by diverse constituents and their flexible configurations.
Published: 2024

30. PartFormer: Awakening Latent Diverse Representation from Vision Transformer for Object Re-Identification

Author: Tan, Lei, Dai, Pingyang, Chen, Jie, Cao, Liujuan, Wu, Yongjian, and Ji, Rongrong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Extracting robust feature representation is critical for object re-identification to accurately identify objects across non-overlapping cameras. Although having a strong representation ability, the Vision Transformer (ViT) tends to overfit on the most distinct regions of training data, limiting its generalizability and attention to holistic object features. Meanwhile, due to the structural difference between CNN and ViT, fine-grained strategies that effectively address this issue in CNN do not continue to be successful in ViT. To address this issue, by observing the latent diverse representation hidden behind the multi-head attention, we present PartFormer, an innovative adaptation of ViT designed to overcome the granularity limitations in object Re-ID tasks. The PartFormer integrates a Head Disentangling Block (HDB) that awakens the diverse representation of multi-head self-attention without the typical loss of feature richness induced by concatenation and FFN layers post-attention. To avoid the homogenization of attention heads and promote robust part-based feature learning, two head diversity constraints are imposed: attention diversity constraint and correlation diversity constraint. These constraints enable the model to exploit diverse and discriminative feature representations from different attention heads. Comprehensive experiments on various object Re-ID benchmarks demonstrate the superiority of the PartFormer. Specifically, our framework significantly outperforms the state of the art by 2.4% mAP on the most challenging MSMT17 dataset.
Published: 2024

31. CoT Rerailer: Enhancing the Reliability of Large Language Models in Complex Reasoning Tasks through Error Detection and Correction

Author: Wan, Guangya, Wu, Yuqi, Chen, Jie, and Li, Sheng
Subjects: Computer Science - Computation and Language
Abstract: Chain-of-Thought (CoT) prompting enhances the complex reasoning abilities of Large Language Models (LLMs) by generating intermediate steps. However, these steps can introduce hallucinations and accumulate errors. We propose the CoT Rerailer to address these challenges, employing self-consistency and multi-agent debate systems to identify and rectify errors in the reasoning process. The CoT Rerailer first selects the most logically correct Reasoning Path (RP) using consistency checks and critical evaluation by automated agents. It then engages a multi-agent debate system to propose and validate corrections to ensure the generation of an error-free intermediate logical path. The corrected steps are then used to generate a revised reasoning chain to further reduce hallucinations and enhance answer quality. We demonstrate the effectiveness of our approach across diverse question-answering datasets in various knowledge domains. The CoT Rerailer enhances the reliability of LLM-generated reasoning, contributing to more trustworthy AI-driven decision-making processes.
Published: 2024

32. Syntax-Guided Procedural Synthesis of Molecules

Author: Sun, Michael, Lo, Alston, Gao, Wenhao, Guo, Minghao, Thost, Veronika, Chen, Jie, Coley, Connor, and Matusik, Wojciech
Subjects: Quantitative Biology - Biomolecules, Computer Science - Machine Learning, Physics - Chemical Physics
Abstract: Designing synthetically accessible molecules and recommending analogs to unsynthesizable molecules are important problems for accelerating molecular discovery. We reconceptualize both problems using ideas from program synthesis. Drawing inspiration from syntax-guided synthesis approaches, we decouple the syntactic skeleton from the semantics of a synthetic tree to create a bilevel framework for reasoning about the combinatorial space of synthesis pathways. Given a molecule we aim to generate analogs for, we iteratively refine its skeletal characteristics via Markov Chain Monte Carlo simulations over the space of syntactic skeletons. Given a black-box oracle to optimize, we formulate a joint design space over syntactic templates and molecular descriptors and introduce evolutionary algorithms that optimize both syntactic and semantic dimensions synergistically. Our key insight is that once the syntactic skeleton is set, we can amortize over the search complexity of deriving the program's semantics by training policies to fully utilize the fixed horizon Markov Decision Process imposed by the syntactic template. We demonstrate performance advantages of our bilevel framework for synthesizable analog generation and synthesizable molecule design. Notably, our approach offers the user explicit control over the resources required to perform synthesis and biases the design space towards simpler solutions, making it particularly promising for autonomous synthesis platforms.
Published: 2024

33. A Digital Twin Framework Utilizing Machine Learning for Robust Predictive Maintenance: Enhancing Tire Health Monitoring

Author: Karkaria, Vispi, Chen, Jie, Luey, Christopher, Siuta, Chase, Lim, Damien, Radulescu, Robert, and Chen, Wei
Subjects: Computer Science - Machine Learning, Computer Science - Computational Engineering, Finance, and Science
Abstract: We introduce a novel digital twin framework for predictive maintenance of long-term physical systems. Using tire health monitoring as an application, we show how the digital twin framework can be used to enhance automotive safety and efficiency, and how the technical challenges can be overcome using a three-step approach. Firstly, for managing the data complexity over a long operation span, we employ data reduction techniques to concisely represent physical tires using historical performance and usage data. Relying on these data, for fast real-time prediction, we train a transformer-based model offline on our concise dataset to predict future tire health over time, represented as Remaining Casing Potential (RCP). Based on our architecture, our model quantifies both epistemic and aleatoric uncertainty, providing reliable confidence intervals around predicted RCP. Secondly, to incorporate real-time data, we update the predictive model in the digital twin framework, ensuring its accuracy throughout its life span with the aid of hybrid modeling and the use of a discrepancy function. Thirdly, to assist decision making in predictive maintenance, we implement a Tire State Decision Algorithm, which strategically determines the optimal timing for tire replacement based on the RCP forecasted by our transformer model. This approach ensures our digital twin accurately predicts system health, continually refines its digital representation, and supports predictive maintenance decisions. Our framework effectively embodies a physical system, leveraging big data and machine learning for predictive maintenance, model updates, and decision-making., Comment: Paper accepted at ASME IDETC 2024, and fast-tracked for ASME Journal of Computing and Information Science in Engineering
Published: 2024

34. Coexistence of large anomalous Hall effect and topological magnetic skyrmions in a Weyl nodal ring ferromagnet Mn5Ge3

Author: Li, Hang, Zhou, Feng, Ding, Bei, Chen, Jie, Song, Linxuan, Yang, Wenyun, Lau, Yong-Chang, Yang, Jinbo, Li, Yue, Jiang, Yong, and Wang, Wenhong
Subjects: Condensed Matter - Materials Science
Abstract: Topological magnetic materials are expected to show multiple transport responses because of their unusual bulk electronic topology in momentum space and topological spin texture in real space. However, materials hosting such multiple topological properties are rare in nature. In this work, we reveal the coexistence of a large tunable anomalous Hall effect and topological magnetic skyrmions in a Weyl nodal ring ferromagnet Mn5Ge3, by using electrical transport and Lorentz transmission electron microscopy (TEM) measurements. It was found that the intrinsic anomalous Hall conductivity (AHC) can reach up to 979.7 S/cm with current along [120] and magnetic field along [001] of the Mn5Ge3 single crystals. Our theoretical calculations reveal that the large AHC is closely related to two Weyl nodal rings in the band structure near the Fermi level and is strongly modified by the content of Ge. Moreover, our Lorentz-TEM images and micromagnetic simulation results, together with the sizable topological Hall effect, clearly point to the robust formation of magnetic skyrmions over a wide temperature-magnetic field region. These results establish Mn5Ge3 as a rare magnetic topological nodal-line semimetal with great significance for exploring novel multiple topological phenomena, which facilitates the development of spintronics., Comment: 38 pages, 22 figures
Published: 2024

35. Towards Effective and Efficient Continual Pre-training of Large Language Models

Author: Chen, Jie, Chen, Zhipeng, Wang, Jiapeng, Zhou, Kun, Zhu, Yutao, Jiang, Jinhao, Min, Yingqian, Zhao, Wayne Xin, Dou, Zhicheng, Mao, Jiaxin, Lin, Yankai, Song, Ruihua, Xu, Jun, Chen, Xu, Yan, Rui, Wei, Zhewei, Hu, Di, Huang, Wenbing, and Wen, Ji-Rong
Subjects: Computer Science - Computation and Language, 68T50, I.2.7
Abstract: Continual pre-training (CPT) has been an important approach for adapting language models to specific domains or tasks. To make the CPT approach more traceable, this paper presents a technical report for continually pre-training Llama-3 (8B), which significantly enhances the Chinese language ability and scientific reasoning ability of the backbone model. To enhance the new abilities while retaining the original abilities, we design specific data mixture and curriculum strategies by utilizing existing datasets and synthesizing high-quality datasets. Specifically, we synthesize multidisciplinary scientific question and answer (QA) pairs based on related web pages, and subsequently incorporate these synthetic data to improve the scientific reasoning ability of Llama-3. We refer to the model after CPT as Llama-3-SynE (Synthetic data Enhanced Llama-3). We also present the tuning experiments with a relatively small model -- TinyLlama, and employ the derived findings to train the backbone model. Extensive experiments on a number of evaluation benchmarks show that our approach can largely improve the performance of the backbone models, including both the general abilities (+8.81 on C-Eval and +6.31 on CMMLU) and the scientific reasoning abilities (+12.00 on MATH and +4.13 on SciEval), without hurting the original capacities. Our model, data, and codes are available at https://github.com/RUC-GSAI/Llama-3-SynE., Comment: 16 pages, 10 figures, 16 tables
Published: 2024

36. Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation

Author: Qiu, Shoumeng, Chen, Jie, Li, Xinrun, Wan, Ru, Xue, Xiangyang, and Pu, Jian
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we introduce a novel knowledge distillation approach for the semantic segmentation task. Unlike previous methods that rely on power-trained teachers or other modalities to provide additional knowledge, our approach does not require complex teacher models or information from extra sensors. Specifically, for the teacher model training, we propose to noise the label and then incorporate it into input to effectively boost the lightweight teacher performance. To ensure the robustness of the teacher model against the introduced noise, we propose a dual-path consistency training strategy featuring a distance loss between the outputs of two paths. For the student model training, we keep it consistent with the standard distillation for simplicity. Our approach not only boosts the efficacy of knowledge distillation but also increases the flexibility in selecting teacher and student models. To demonstrate the advantages of our Label Assisted Distillation (LAD) method, we conduct extensive experiments on five challenging datasets including Cityscapes, ADE20K, PASCAL-VOC, COCO-Stuff 10K, and COCO-Stuff 164K, five popular models: FCN, PSPNet, DeepLabV3, STDC, and OCRNet, and results show the effectiveness and generalization of our approach. We posit that incorporating labels into the input, as demonstrated in our work, will provide valuable insights into related fields. Code is available at https://github.com/skyshoumeng/Label_Assisted_Distillation.
Published: 2024

37. Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs

Author: Ma, Rong, Chen, Jie, Xue, Xiangyang, and Pu, Jian
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Deep supervised models possess significant capability to assimilate extensive training data, thereby presenting an opportunity to enhance model performance through training on multiple datasets. However, conflicts arising from different label spaces among datasets may adversely affect model performance. In this paper, we propose a novel approach to automatically construct a unified label space across multiple datasets using graph neural networks. This enables semantic segmentation models to be trained simultaneously on multiple datasets, resulting in performance improvements. Unlike existing methods, our approach facilitates seamless training without the need for additional manual reannotation or taxonomy reconciliation. This significantly enhances the efficiency and effectiveness of multi-dataset segmentation model training. The results demonstrate that our method significantly outperforms other multi-dataset training methods when trained on seven datasets simultaneously, and achieves state-of-the-art performance on the WildDash 2 benchmark.
Published: 2024

38. Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

Author: Jin, Peng, Li, Hao, Cheng, Zesen, Li, Kehan, Yu, Runyi, Liu, Chang, Ji, Xiangyang, Yuan, Li, and Chen, Jie
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Text-to-motion generation requires not only grounding local actions in language but also seamlessly blending these individual actions to synthesize diverse and realistic global motions. However, existing motion generation methods primarily focus on the direct synthesis of global motions while neglecting the importance of generating and controlling local actions. In this paper, we propose the local action-guided motion diffusion model, which facilitates global motion generation by utilizing local actions as fine-grained control signals. Specifically, we provide an automated method for reference local action sampling and leverage graph attention networks to assess the guiding weight of each local action in the overall motion synthesis. During the diffusion process for synthesizing global motion, we calculate the local-action gradient to provide conditional guidance. This local-to-global paradigm reduces the complexity associated with direct global motion generation and promotes motion diversity via sampling diverse actions as conditions. Extensive experiments on two human motion datasets, i.e., HumanML3D and KIT, demonstrate the effectiveness of our method. Furthermore, our method provides flexibility in seamlessly combining various local actions and continuous guiding weight adjustment, accommodating diverse user preferences, which may hold potential significance for the community. The project page is available at https://jpthu17.github.io/GuidedMotion-project/., Comment: Accepted by ECCV 2024
Published: 2024

39. LLMBox: A Comprehensive Library for Large Language Models

Author: Tang, Tianyi, Hu, Yiwen, Li, Bingqian, Luo, Wenyang, Qin, Zijing, Sun, Haoxiang, Wang, Jiapeng, Xu, Shiyi, Cheng, Xiaoxue, Guo, Geyang, Peng, Han, Zheng, Bowen, Tang, Yiru, Min, Yingqian, Chen, Yushuo, Chen, Jie, Zhao, Yuanqian, Ding, Luran, Wang, Yuhao, Dong, Zican, Xia, Chunxuan, Li, Junyi, Zhou, Kun, Zhao, Wayne Xin, and Wen, Ji-Rong
Subjects: Computer Science - Computation and Language
Abstract: To facilitate the research on large language models (LLMs), this paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of LLMs. This library features three main merits: (1) a unified data interface that supports the flexible implementation of various training strategies, (2) a comprehensive evaluation that covers extensive tasks, datasets, and models, and (3) more practical considerations, especially on user-friendliness and efficiency. With our library, users can easily reproduce existing methods, train new models, and conduct comprehensive performance comparisons. To rigorously test LLMBox, we conduct extensive experiments across a diverse range of evaluation settings, and experimental results demonstrate the effectiveness and efficiency of our library in supporting various implementations related to LLMs. The detailed introduction and usage guidance can be found at https://github.com/RUCAIBox/LLMBox., Comment: Accepted by ACL 2024 Demo
Published: 2024

40. Kinetics of Rayleigh-Taylor instability in van der Waals fluid: the influence of compressibility

Author: Chen, Jie, Xu, Aiguo, Zhang, Yudong, Chen, Dawei, and Chen, Zhihua
Subjects: Physics - Fluid Dynamics
Abstract: Early studies on Rayleigh-Taylor instability (RTI) primarily relied on the Navier-Stokes (NS) model. As research progresses, it becomes increasingly evident that the kinetic information that the NS model failed to capture is of great value for identifying and even controlling the RTI process; simultaneously, the lack of analysis techniques for complex physical fields results in a significant waste of data information. In addition, early RTI studies mainly focused on the incompressible case and the weakly compressible case. In the case of strong compressibility, the density of the fluid from the upper layer (originally heavy fluid) may become smaller than that of the surrounding (originally light) fluid, thus invalidating the early method of distinguishing light and heavy fluids based on density. In this paper, tracer particles are incorporated into a single-fluid discrete Boltzmann method (DBM) model that considers the van der Waals potential. By using tracer particles to label the matter-particle sources, a careful study of the matter-mixing and energy-mixing processes of the RTI evolution is realized in the single-fluid framework. The effects of compressibility on the evolution of RTI are examined mainly through the analysis of bubble and spike velocities, the ratio of area occupied by heavy fluid, and various entropy generation rates of the system. It is demonstrated that: (1) compressibility has a suppressive effect on the spike velocity, and this suppressive impact diminishes as the Atwood number ($At$) increases. The influence of compressibility on bubble velocity shows a staged behavior with increasing $At$. (2) The impact of compressibility on the entropy production rate associated with the heat flow (${{\dot{S}}_{NOEF}}$) is related to the stages of RTI evolution.
Published: 2024

41. Long-Term Safety of Facilitated Subcutaneous Immunoglobulin 10% Treatment in US Clinical Practice in Patients with Primary Immunodeficiency Diseases: Results from a Post-Authorization Safety Study.

Author: Rubinstein, Arye, Mabudian, Mohsen, McNeil, Donald, Patel, Niraj, Wasserman, Richard, Gupta, Sudhir, Carrasco, Paz, Chen, Jie, Garcia, Enrique, Nagy, Andras, and Yel, Leman
Subjects: Immunogenicity, Immunoglobulin replacement, Inborn errors of immunity, Quality of life, Tolerability, Humans, Male, Female, United States, Adult, Adolescent, Prospective Studies, Hyaluronoglucosaminidase, Primary Immunodeficiency Diseases, Middle Aged, Infusions, Subcutaneous, Child, Young Adult, Immunoglobulins, Injections, Subcutaneous, Treatment Outcome, Aged, Child, Preschool, Immunologic Deficiency Syndromes
Abstract: Facilitated subcutaneous immunoglobulin (fSCIG) 10% is an immunoglobulin replacement therapy that utilizes recombinant human hyaluronidase (rHuPH20) to enhance immunoglobulin dispersion and absorption, allowing for longer treatment intervals similar to intravenous immunoglobulin (up to once monthly). fSCIG 10% is indicated in the USA for treating adults and children aged ≥ 2 years with primary immunodeficiency diseases (PIDs). This prospective, non-interventional, open-label, multicenter, post-authorization safety study (NCT02593188) was conducted in the USA from November 2015 to October 2021 to assess the long-term safety of fSCIG 10% in routine clinical practice. Patients with PIDs aged ≥ 16 years who were prescribed and/or had started fSCIG 10% treatment were enrolled. In total, 253 patients were enrolled and included (full analysis set). Participants received fSCIG 10% treatment for a median (interquartile range) of 10.0 (3.5-11.8) months, with the majority of infusions administered every 4 weeks (54.4% [1197/2201 infusions]) and at home (62.6% [1395/2230 infusions]). Overall, 98.5% of infusions were administered without rate reduction, interruption, or discontinuation due to adverse events (AEs). Treatment-related, non-serious AEs were experienced by 52 patients (20.6%, 284 events). Two patients (0.8%) each experienced one treatment-related serious AE (aseptic meningitis and deep vein thrombosis). Development of antibodies against rHuPH20 was uncommon; 14/196 patients (7.1%) tested positive for binding antibodies (titer ≥ 1:160) with no neutralizing antibodies detected. There was no relationship between anti-rHuPH20 antibody positivity and the occurrence of treatment-related serious or non-serious AEs. Long-term, repeated self-administration of fSCIG 10% was well tolerated in US clinical practice by patients with PIDs.
Published: 2024

42. Time-optimal Flight in Cluttered Environments via Safe Reinforcement Learning

Author: Xiao, Wei, Feng, Zhaohan, Zhou, Ziyu, Sun, Jian, Wang, Gang, and Chen, Jie
Subjects: Computer Science - Robotics
Abstract: This paper addresses the problem of guiding a quadrotor through a predefined sequence of waypoints in cluttered environments, aiming to minimize the flight time while avoiding collisions. Previous approaches either suffer from prolonged computational time caused by solving complex non-convex optimization problems or are limited by the inherent smoothness of polynomial trajectory representations, thereby restricting the flexibility of movement. In this work, we present a safe reinforcement learning approach for autonomous drone racing with time-optimal flight in cluttered environments. The reinforcement learning policy, trained using safety and terminal rewards specifically designed to enforce near time-optimal and collision-free flight, outperforms current state-of-the-art algorithms. Additionally, experimental results demonstrate the efficacy of the proposed approach in achieving both minimum flight time and obstacle avoidance objectives in complex environments, with a commendable $66.7\%$ success rate in unseen, challenging settings., Comment: 7 pages, 3 figures
Published: 2024

43. Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models

Author: Chen, Jie, Zhang, Yupeng, Wang, Bingning, Zhao, Wayne Xin, Wen, Ji-Rong, and Chen, Weipeng
Subjects: Computer Science - Computation and Language
Abstract: Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs). Studies have shown that synthetic data can effectively improve the performance of LLMs on downstream benchmarks. However, despite its potential benefits, our analysis suggests that there may be inherent flaws in synthetic data. The uniform format of synthetic data can lead to pattern overfitting and cause significant shifts in the output distribution, thereby reducing the model's instruction-following capabilities. Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws. The empirical results demonstrate the effectiveness of our approach, which can reverse the instruction-following issues caused by pattern overfitting without compromising performance on benchmarks at relatively low cost. Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training., Comment: 15 pages
Published: 2024

44. Unlock the Correlation between Supervised Fine-Tuning and Reinforcement Learning in Training Code Large Language Models

Author: Chen, Jie, Han, Xintian, Ma, Yu, Zhou, Xun, and Xiang, Liang
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Automatic code generation has been a longstanding research topic. With the advancement of general-purpose large language models (LLMs), the ability to code stands out as one important measure of the model's reasoning performance. Usually, a two-stage training paradigm is implemented to obtain a Code LLM, namely pretraining and fine-tuning. Within fine-tuning, supervised fine-tuning (SFT) and reinforcement learning (RL) are often used to improve the model's zero-shot ability. A large body of work has been conducted to improve the model's performance on code-related benchmarks with either modifications to the algorithm or refinement of the dataset. However, we still lack a deep insight into the correlation between SFT and RL. For instance, what kind of dataset should be used to ensure generalization, and what happens if we abandon the SFT phase in fine-tuning? In this work, we make an attempt to understand the correlation between SFT and RL. To facilitate our research, we manually craft 100 basis Python functions, called atomic functions, and then a synthesizing pipeline is deployed to create a large number of synthetic functions on top of the atomic ones. In this manner, we ensure that the train and test sets remain distinct, preventing data contamination. Through a comprehensive ablation study, we find: (1) Both atomic and synthetic functions are indispensable for SFT's generalization, and only a handful of synthetic functions are adequate; (2) Through RL, the SFT's generalization to the target domain can be greatly enhanced, even with the same training prompts; (3) Training RL from scratch can alleviate the over-fitting issue introduced in the SFT phase.
Published: 2024

45. Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement

Author: Yu, Runyi, He, Tianyu, Zhang, Ailing, Wang, Yuchi, Guo, Junliang, Tan, Xu, Liu, Chang, Chen, Jie, and Bian, Jiang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We aim to edit the lip movements in talking video according to the given speech while preserving the personal identity and visual details. The task can be decomposed into two sub-problems: (1) speech-driven lip motion generation and (2) visual appearance synthesis. Current solutions handle the two sub-problems within a single generative model, resulting in a challenging trade-off between lip-sync quality and visual details preservation. Instead, we propose to disentangle the motion and appearance, and then generate them one by one with a speech-to-motion diffusion model and a motion-conditioned appearance generation model. However, there still remain challenges in each stage, such as motion-aware identity preservation in (1) and visual details preservation in (2). Therefore, to preserve personal identity, we adopt landmarks to represent the motion, and further employ a landmark-based identity loss. To capture motion-agnostic visual details, we use separate encoders to encode the lip, non-lip appearance and motion, and then integrate them with a learned fusion module. We train MyTalk on a large-scale and diverse dataset. Experiments show that our method generalizes well to the unknown, even out-of-domain person, in terms of both lip sync and visual detail preservation. We encourage the readers to watch the videos on our project page (https://Ingrid789.github.io/MyTalk/)., Comment: 14 pages of main text, 23 pages in total, 9 figures
Published: 2024

46. Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data

Author: Li, Haolong, Ma, Yu, Zhang, Yinqi, Ye, Chen, and Chen, Jie
Subjects: Computer Science - Computation and Language
Abstract: Large Language Models (LLMs) have shown excellent performance in language understanding, text generation, code synthesis, and many other tasks, while they still struggle in complex multi-step reasoning problems, such as mathematical reasoning. In this paper, through a newly proposed arithmetical puzzle problem, we show that the model can perform well on multi-step reasoning tasks via fine-tuning on high-quality synthetic data. Experimental results with the open-llama-3B model on three different test datasets show that not only can the model reach a zero-shot pass@1 of 0.44 on the in-domain dataset, but it also demonstrates certain generalization capabilities on the out-of-domain datasets. Specifically, this paper has designed two out-of-domain datasets by extending the numerical range and the composing components of the arithmetical puzzle problem separately. The fine-tuned models have shown encouraging performance on these two far more difficult tasks, with zero-shot pass@1 of 0.33 and 0.35, respectively., Comment: Accepted by Findings of ACL 2024
Published: 2024

47. Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems

Author: Chen, Jie
Subjects: Mathematics - Numerical Analysis, Computer Science - Machine Learning
Abstract: Preconditioning is at the heart of iterative solutions of large, sparse linear systems of equations in scientific disciplines. Several algebraic approaches, which access no information beyond the matrix itself, are widely studied and used, but ill-conditioned matrices remain very challenging. We take a machine learning approach and propose using graph neural networks as a general-purpose preconditioner. They show attractive performance for many problems and can be used when the mainstream preconditioners perform poorly. Empirical evaluation on over 800 matrices suggests that the construction time of these graph neural preconditioners (GNPs) is more predictable and can be much shorter than that of other widely used ones, such as ILU and AMG, while the execution time is faster than using a Krylov method as the preconditioner, such as in inner-outer GMRES. GNPs have a strong potential for solving large-scale, challenging algebraic problems arising from not only partial differential equations, but also economics, statistics, graph, and optimization, to name a few., Comment: From v1: Updated the timing experiments and evaluation metrics for fairer and better results
Published: 2024

48. Correction for the Weakening Magnetic Field within the Sunspot Umbra Observed by ASO-S/FMG

Author: Xu, Haiqing, Su, Jiangtao, Liu, Suo, Deng, Yuanyong, Bai, Xianyong, Chen, Jie, Wang, Xiaofan, Yang, Xiao, and Song, Yongliang
Subjects: Astrophysics - Solar and Stellar Astrophysics
Abstract: The magnetic field inside the sunspot umbra, as observed by the Full-disk MagnetoGraph (FMG) onboard the Advanced Space based Solar Observatory (ASO-S), was found to be experiencing a weakening. To address this issue, we employed a method developed by Xu et al. (2021) to correct the weakening in the data of 20 active regions observed by FMG during the period spanning December 29, 2022, to July 23, 2023. Research has revealed that the onset of magnetic field weakening occurs at a minimum magnetic field strength of 705 G, with the peak strength reaching up to 1931 G. We computed the change ratio (R1) of the unsigned magnetic flux within the sunspot umbra, considering measurements both before and after correction. The change ratio (R1) spans from 26% to 124%, indicating a significant increase in the unsigned magnetic flux within sunspot umbrae observed by FMG after correction. To illustrate this, we selected four active regions for comparison with data from the Helioseismic and Magnetic Imager (HMI). After correction, it is found that the unsigned magnetic flux in sunspot umbrae measured by FMG aligns more closely with that of HMI. This supports the effectiveness of the corrective method for FMG, despite imperfections, particularly at the umbra-penumbra boundary., Comment: 12 pages, 5 figures
Published: 2024

49. Learning-Based Intermittent CSI Estimation with Adaptive Intervals in Integrated Sensing and Communication Systems

Author: Chen, Jie and Wang, Xianbin
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Information Theory
Abstract: Due to the distinct objectives and multipath utilization mechanisms between the communication module and radar module, the system design of integrated sensing and communication (ISAC) necessitates two types of channel state information (CSI), i.e., communication CSI representing the whole channel gain and phase shifts, and radar CSI exclusively focused on target mobility and position information. However, current ISAC systems apply an identical mechanism to estimate both types of CSI at the same predetermined estimation interval, leading to significant overhead and compromised performances. Therefore, this paper proposes an intermittent communication and radar CSI estimation scheme with adaptive intervals for individual users/targets, where both types of CSI can be predicted using channel temporal correlations for cost reduction or re-estimated via training signal transmission for improved estimation accuracy. Specifically, we jointly optimize the binary CSI re-estimation/prediction decisions and transmit beamforming matrices for individual users/targets to maximize communication transmission rates and minimize radar tracking errors and costs in a multiple-input single-output (MISO) ISAC system. Unfortunately, this problem has causality issues because it requires comparing system performances under re-estimated CSI and predicted CSI during the optimization. Additionally, the binary decision makes the joint design a mixed integer nonlinear programming (MINLP) problem, resulting in high complexity when using conventional optimization algorithms. Therefore, we propose a deep reinforcement online learning (DROL) framework that first implements an online deep neural network (DNN) to learn the binary CSI updating decisions from the experiences. Given the learned decisions, we propose an efficient algorithm to solve the remaining beamforming design problem efficiently., Comment: This paper has been accepted by IEEE Journal of Selected Topics in Signal Processing
Published: 2024

50. Observation of a large-scale filament eruption initiated by two small-scale erupting filaments pushing out from below

Author: Song, Yongliang, Su, Jiangtao, Zhang, Qingmin, Zhang, Mei, Deng, Yuanyong, Bai, Xianyong, Liu, Suo, Yang, Xiao, Chen, Jie, Xu, Haiqing, Ji, Kaifan, and Hu, Ziyao
Subjects: Astrophysics - Solar and Stellar Astrophysics
Abstract: Filament eruptions often result in flares and coronal mass ejections (CMEs). Most studies attribute the filament eruptions to their instabilities or magnetic reconnection. In this study, we report a unique observation of a filament eruption whose initiation process has not been reported before. This large-scale filament, with a length of about 360 Mm crossing an active region, is forced to erupt by two small-scale erupting filaments pushing out from below. This process of multi-filament eruption results in an M6.4 flare in the active region NOAA 13229 on 25th February 2023. The whole process can be divided into three stages: the eruptions of two active-region filaments F1 and F2; the interactions between the erupting F1, F2, and the large-scale filament F3; and the eruption of F3. Though this multi-filament eruption occurs near the northwest limb of the solar disk, it produces a strong halo CME that causes a significant geomagnetic disturbance. Our observations present a new filament eruption mechanism, in which the initial kinetic energy of the eruption is obtained from, and transferred by, other erupting structures. This event provides us with a unique insight into the dynamics of multi-filament eruptions and their corresponding effects on interplanetary space., Comment: 16 pages, 10 figures. Accepted for publication in Solar Physics
Published: 2024
