24,934 results on '"CHEN, LING"'
Search Results
2. “Wars without Gun Smoke”: Global Supply Chains, Power Transitions, and Economic Statecraft
- Author
-
Chen, Ling S. and Evers, Miles M.
- Published
- 2023
3. SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training
- Author
-
Zhang, Gengwei, Wang, Liyuan, Kang, Guoliang, Chen, Ling, and Wei, Yunchao
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
In recent years, continual learning with pre-training (CLPT) has received widespread interest, instead of its traditional focus of training from scratch. The use of strong pre-trained models (PTMs) can greatly facilitate knowledge transfer and alleviate catastrophic forgetting, but also suffers from progressive overfitting of pre-trained knowledge into specific downstream tasks. A majority of current efforts often keep the PTMs frozen and incorporate task-specific prompts to instruct representation learning, coupled with a prompt selection process for inference. However, due to the limited capacity of prompt parameters, this strategy demonstrates only sub-optimal performance in continual learning. In comparison, tuning all parameters of PTMs often provides the greatest potential for representation learning, making sequential fine-tuning (Seq FT) a fundamental baseline that has been overlooked in CLPT. To this end, we present an in-depth analysis of the progressive overfitting problem from the lens of Seq FT. Considering that the overly fast representation learning and the biased classification layer constitute this particular problem, we introduce the advanced Slow Learner with Classifier Alignment (SLCA++) framework to unleash the power of Seq FT, serving as a strong baseline approach for CLPT. Our approach involves a Slow Learner to selectively reduce the learning rate of backbone parameters, and a Classifier Alignment to align the disjoint classification layers in a post-hoc fashion. We further enhance the efficacy of SL with a symmetric cross-entropy loss, as well as employ a parameter-efficient strategy to implement Seq FT with SLCA++. Across a variety of continual learning scenarios on image classification benchmarks, our approach provides substantial improvements and outperforms state-of-the-art methods by a large margin. Code: https://github.com/GengDavid/SLCA., Comment: This paper is an extension of our ICCV 23 paper (arXiv:2303.05118)
- Published
- 2024
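The SLCA++ abstract above mentions pairing the Slow Learner with a symmetric cross-entropy loss. As a rough illustration only, the sketch below implements one common formulation of symmetric cross-entropy (standard cross-entropy plus a reverse cross-entropy term) in plain Python. The function names, the weights `alpha` and `beta`, and the `log_zero` constant that stands in for log(0) are assumptions made for this example and are not taken from the paper or its code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def symmetric_cross_entropy(logits, label, alpha=1.0, beta=1.0, log_zero=-4.0):
    """Symmetric cross-entropy for one sample with a one-hot target.

    alpha, beta, and log_zero are illustrative defaults (assumptions),
    not values reported in the abstract.
    """
    probs = softmax(logits)
    # Standard cross-entropy: -log p(label).
    ce = -math.log(max(probs[label], 1e-12))
    # Reverse cross-entropy: -sum_k p_k * log q_k with q one-hot.
    # log(1) = 0 for the true class; log(0) is replaced by log_zero elsewhere.
    rce = -sum(p * log_zero for k, p in enumerate(probs) if k != label)
    return alpha * ce + beta * rce
```

With these defaults, a confident correct prediction yields a loss close to zero, while a confident wrong prediction is penalized by both terms.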
4. AppAgent v2: Advanced Agent for Flexible Mobile Interactions
- Author
-
Li, Yanda, Zhang, Chi, Yang, Wanqi, Fu, Bin, Cheng, Pei, Chen, Xin, Chen, Ling, and Wei, Yunchao
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
- Abstract
With the advancement of Multimodal Large Language Models (MLLM), LLM-driven visual agents are increasingly impacting software interfaces, particularly those with graphical user interfaces. This work introduces a novel LLM-based multimodal agent framework for mobile devices. This framework, capable of navigating mobile devices, emulates human-like interactions. Our agent constructs a flexible action space that enhances adaptability across various applications including parser, text and vision descriptions. The agent operates through two main phases: exploration and deployment. During the exploration phase, functionalities of user interface elements are documented either through agent-driven or manual explorations into a customized structured knowledge base. In the deployment phase, RAG technology enables efficient retrieval and update from this knowledge base, thereby empowering the agent to perform tasks effectively and accurately. This includes performing complex, multi-step operations across various applications, thereby demonstrating the framework's adaptability and precision in handling customized task workflows. Our experimental results across various benchmarks demonstrate the framework's superior performance, confirming its effectiveness in real-world scenarios. Our code will be open source soon., Comment: Pre-print version, some content needs to be supplemented
- Published
- 2024
5. XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training
- Author
-
Wu, Biao, Xie, Yutong, Zhang, Zeyu, Phan, Minh Hieu, Chen, Qi, Chen, Ling, and Wu, Qi
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Vision-and-language pretraining (VLP) in the medical field utilizes contrastive learning on image-text pairs to achieve effective transfer across tasks. Yet, current VLP approaches with the masked modelling strategy face two challenges when applied to the medical domain. First, current models struggle to accurately reconstruct key pathological features due to the scarcity of medical data. Second, most methods only adopt either paired image-text or image-only data, failing to exploit the combination of both paired and unpaired data. To this end, this paper proposes an XLIP (Masked modelling for medical Language-Image Pre-training) framework to enhance pathological learning and feature learning via unpaired data. First, we introduce the attention-masked image modelling (AttMIM) and entity-driven masked language modelling (EntMLM) modules, which learn to reconstruct pathological visual and textual tokens via multi-modal feature interaction, thus improving medical-enhanced features. The AttMIM module masks a portion of the image features that are highly responsive to textual features. This allows XLIP to reconstruct highly similar medical image data more efficiently. Second, XLIP capitalizes on unpaired data to enhance multimodal learning by introducing disease-kind prompts. The experimental results show that XLIP achieves SOTA zero-shot and fine-tuning classification performance on five datasets. Our code will be available at https://github.com/White65534/XLIP
- Published
- 2024
6. Contrastive Learning with Counterfactual Explanations for Radiology Report Generation
- Author
-
Li, Mingjie, Lin, Haokun, Qiu, Liang, Liang, Xiaodan, Chen, Ling, Elsaddik, Abdulmotaleb, and Chang, Xiaojun
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Due to the common content of anatomy, radiology images with their corresponding reports exhibit high similarity. Such inherent data bias can predispose automatic report generation models to learn entangled and spurious representations, resulting in misdiagnostic reports. To tackle this, we propose a novel CounterFactual Explanations-based framework (CoFE) for radiology report generation. Counterfactual explanations serve as a potent tool for understanding how decisions made by algorithms can be changed by asking "what if" scenarios. By leveraging this concept, CoFE can learn non-spurious visual representations by contrasting the representations between factual and counterfactual images. Specifically, we derive counterfactual images by swapping a patch between positive and negative samples until a predicted diagnosis shift occurs. Here, positive and negative samples are the most semantically similar but have different diagnosis labels. Additionally, CoFE employs a learnable prompt to efficiently fine-tune the pre-trained large language model, encapsulating both factual and counterfactual content to provide a more generalizable prompt representation. Extensive experiments on two benchmarks demonstrate that leveraging the counterfactual explanations enables CoFE to generate semantically coherent and factually complete reports and to outperform existing methods in terms of language generation and clinical efficacy metrics., Comment: ECCV 2024
- Published
- 2024
7. Continual Learning for Temporal-Sensitive Question Answering
- Author
-
Yang, Wanqi, Xu, Yunqiu, Li, Yanda, Wang, Kunze, Huang, Binbin, and Chen, Ling
- Subjects
Computer Science - Computation and Language
- Abstract
In this study, we explore an emerging research area of Continual Learning for Temporal Sensitive Question Answering (CLTSQA). Previous research has primarily focused on Temporal Sensitive Question Answering (TSQA), often overlooking the unpredictable nature of future events. In real-world applications, it's crucial for models to continually acquire knowledge over time, rather than relying on a static, complete dataset. Our paper investigates strategies that enable models to adapt to the ever-evolving information landscape, thereby addressing the challenges inherent in CLTSQA. To support our research, we first create a novel dataset, divided into five subsets, designed specifically for various stages of continual learning. We then propose a training framework for CLTSQA that integrates temporal memory replay and temporal contrastive learning. Our experimental results highlight two significant insights: First, the CLTSQA task introduces unique challenges for existing models. Second, our proposed framework effectively navigates these challenges, resulting in improved performance., Comment: Accepted by IJCNN 2024
- Published
- 2024
8. Towards a Holistic Framework for Multimodal Large Language Models in Three-dimensional Brain CT Report Generation
- Author
-
Li, Cheng-Yi, Chang, Kao-Jung, Yang, Cheng-Fu, Wu, Hsin-Yu, Chen, Wenting, Bansal, Hritik, Chen, Ling, Yang, Yi-Ping, Chen, Yu-Chun, Chen, Shih-Pin, Lirng, Jiing-Feng, Chang, Kai-Wei, and Chiou, Shih-Hwa
- Subjects
Computer Science - Computation and Language
- Abstract
Multi-modal large language models (MLLMs) have been given free rein to explore exciting medical applications with a primary focus on radiology report generation. Nevertheless, the preliminary success in 2D radiology captioning is insufficient to reflect the real-world diagnostic challenge in volumetric 3D anatomy. To address three crucial limitations in the existing literature, including (1) data complexity, (2) model capacity, and (3) evaluation metric fidelity, we collected a 3D-BrainCT dataset of 18,885 text-scan pairs and applied clinical visual instruction tuning (CVIT) to train BrainGPT models to generate radiology-adherent 3D brain CT reports. Statistically, our BrainGPT scored BLEU-1 = 44.35, BLEU-4 = 20.38, METEOR = 30.13, ROUGE-L = 47.6, and CIDEr-R = 211.77 during internal testing and demonstrated an accuracy of 0.91 in captioning midline shifts on the external validation CQ500 dataset. By further inspecting the captioned reports, we found that the traditional metrics appeared to measure only the surface text similarity and failed to gauge the information density relevant to the diagnostic purpose. To close this gap, we proposed a novel Feature-Oriented Radiology Task Evaluation (FORTE) to estimate the report's clinical relevance (lesion feature and landmarks). Notably, the BrainGPT model scored an average FORTE F1-score of 0.71 (degree=0.661; landmark=0.706; feature=0.693; impression=0.779). To demonstrate that BrainGPT models possess objective readiness to generate human-like radiology reports, we conducted a Turing test that enrolled 11 physician evaluators, and around 74% of the BrainGPT-generated captions were indistinguishable from those written by humans. Our work embodies a holistic framework that showcases the first-hand experience of curating a 3D brain CT dataset, fine-tuning anatomy-sensible language models, and proposing robust radiology evaluation metrics., Comment: 6 figures, 5 supplementary figures, 8 supplementary tables
- Published
- 2024
9. Curriculum Learning with Quality-Driven Data Selection
- Author
-
Wu, Biao, Meng, Fang, and Chen, Ling
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
The impressive multimodal capabilities demonstrated by OpenAI's GPT-4 have generated significant interest in the development of Multimodal Large Language Models (MLLMs). Visual instruction tuning of MLLMs with machine-generated instruction-following data has been shown to enhance zero-shot capabilities across various tasks. However, there has been limited exploration into controlling the quality of the instruction data. Current methodologies for data selection in MLLMs often rely on single, unreliable scores or use downstream tasks for selection, which is time-consuming and can lead to potential overfitting on the chosen evaluation datasets. To mitigate these limitations, we propose a novel data selection methodology that utilizes image-text correlation and model perplexity to evaluate and select data of varying quality. This approach leverages the distinct distribution of these two attributes, mapping data quality into a two-dimensional space that allows for the selection of data based on their location within this distribution. By utilizing this space, we can analyze the impact of task type settings, used as prompts, on data quality. Additionally, this space can be used to construct multi-stage subsets of varying quality to facilitate curriculum learning. Our research includes comprehensive experiments conducted on various datasets. The results emphasize substantial enhancements in five commonly assessed capabilities compared to using the complete dataset. Our codes, data, and models are publicly available at: https://anonymous.4open.science/r/EHIT-31B4
- Published
- 2024
10. Analogues of Alder-Type Partition Inequalities for Fixed Perimeter Partitions
- Author
-
Chen, Ling, Hernandez, Isabelle, Shields, Zain, and Swisher, Holly
- Subjects
Mathematics - Number Theory, Mathematics - Combinatorics
- Abstract
In a 2016 paper, Straub proved an analogue to Euler's partition identity for partitions with fixed perimeter. Later, Fu and Tang provided a refinement and generalization of Straub's analogue to $d$-distinct partitions as well as a result related to the first Rogers-Ramanujan identity. Motivated by Alder-type partition identities and their generalizations, we build on work of Fu and Tang to establish generalized Alder-type partition inequalities in a fixed perimeter setting, and notably, a reverse Alder-type inequality., Comment: 13 pages
- Published
- 2024
11. Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification
- Author
-
Huang, Gexin, Wu, Chenfei, Li, Mingjie, Chang, Xiaojun, Chen, Ling, Sun, Ying, Zhao, Shen, Liang, Xiaodan, and Lin, Liang
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Predicting genetic mutations from whole slide images is indispensable for cancer diagnosis. However, existing work training multiple binary classification models faces two challenges: (a) Training multiple binary classifiers is inefficient and would inevitably lead to a class imbalance problem. (b) The biological relationships among genes are overlooked, which limits the prediction performance. To tackle these challenges, we innovatively design a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances. BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules: (a) A gene graph whose node features are the genes' linguistic descriptions and the cancer phenotype, with edges modeled by genes' pathway associations and mutation consistencies. (b) A knowledge association module that fuses linguistic and biomedical knowledge into gene priors by transformer-based graph representation learning, capturing the intrinsic relationships between different genes' mutations. BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules: (a) A modality fusion module that firstly fuses the gene priors with critical regions in WSIs and obtains gene-wise mutation logits. (b) A comparative multi-label loss that emphasizes the inherent comparisons among mutation status to enhance the discrimination capabilities. Sufficient experiments on The Cancer Genome Atlas benchmark demonstrate that BPGT outperforms the state-of-the-art., Comment: 16 pages, 8 figures, and 3 tables
- Published
- 2024
12. MotionLLM: Understanding Human Behaviors from Human Motions and Videos
- Author
-
Chen, Ling-Hao, Lu, Shunlin, Zeng, Ailing, Zhang, Hao, Wang, Benyou, Zhang, Ruimao, and Zhang, Lei
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
This study delves into the realm of multi-modality (i.e., video and motion modalities) human behavior understanding by leveraging the powerful capabilities of Large Language Models (LLMs). Diverging from recent LLMs designed for video-only or motion-only understanding, we argue that understanding human behavior necessitates joint modeling from both videos and motion sequences (e.g., SMPL sequences) to capture nuanced body part dynamics and semantics effectively. In light of this, we present MotionLLM, a straightforward yet effective framework for human motion understanding, captioning, and reasoning. Specifically, MotionLLM adopts a unified video-motion training strategy that leverages the complementary advantages of existing coarse video-text data and fine-grained motion-text data to glean rich spatial-temporal insights. Furthermore, we collect a substantial dataset, MoVid, comprising diverse videos, motions, captions, and instructions. Additionally, we propose the MoVid-Bench, with careful manual annotations, for better evaluation of human behavior understanding on video and motion. Extensive experiments show the superiority of MotionLLM in captioning, spatial-temporal comprehension, and reasoning ability., Comment: MotionLLM version 1.0, project page see https://lhchen.top/MotionLLM
- Published
- 2024
13. Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion
- Author
-
Zhang, Zeyu, Wang, Yiran, Wu, Biao, Chen, Shuo, Zhang, Zhiyuan, Huang, Shiya, Zhang, Wenbo, Fang, Meng, Chen, Ling, and Zhao, Yang
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In recent years, there has been significant interest in creating 3D avatars and motions, driven by their diverse applications in areas like film-making, video games, AR/VR, and human-robot interaction. However, current efforts primarily concentrate on either generating the 3D avatar mesh alone or producing motion sequences, with integrating these two aspects proving to be a persistent challenge. Additionally, while avatar and motion generation predominantly target humans, extending these techniques to animals remains a significant challenge due to inadequate training data and methods. To bridge these gaps, our paper presents three key contributions. Firstly, we propose a novel agent-based approach named Motion Avatar, which allows for the automatic generation of high-quality customizable human and animal avatars with motions through text queries. The method significantly advances the progress in dynamic 3D character generation. Secondly, we introduce an LLM planner that coordinates both motion and avatar generation, which transforms discriminative planning into a customizable Q&A format. Lastly, we present an animal motion dataset named Zoo-300K, comprising approximately 300,000 text-motion pairs across 65 animal categories, and its building pipeline ZooGen, which serves as a valuable resource for the community. See project website https://steve-zeyu-zhang.github.io/MotionAvatar/, Comment: Accepted to BMVC 2024
- Published
- 2024
14. Harnessing Collective Structure Knowledge in Data Augmentation for Graph Neural Networks
- Author
-
Ma, Rongrong, Pang, Guansong, and Chen, Ling
- Subjects
Computer Science - Machine Learning
- Abstract
Graph neural networks (GNNs) have achieved state-of-the-art performance in graph representation learning. Message passing neural networks, which learn representations through recursively aggregating information from each node and its neighbors, are among the most commonly-used GNNs. However, a wealth of structural information of individual nodes and full graphs is often ignored in such process, which restricts the expressive power of GNNs. Various graph data augmentation methods that enable the message passing with richer structure knowledge have been introduced as one main way to tackle this issue, but they are often focused on individual structure features and difficult to scale up with more structure features. In this work we propose a novel approach, namely collective structure knowledge-augmented graph neural network (CoS-GNN), in which a new message passing method is introduced to allow GNNs to harness a diverse set of node- and graph-level structure features, together with original node features/attributes, in augmented graphs. In doing so, our approach largely improves the structural knowledge modeling of GNNs in both node and graph levels, resulting in substantially improved graph representations. This is justified by extensive empirical results where CoS-GNN outperforms state-of-the-art models in various graph-level learning tasks, including graph classification, anomaly detection, and out-of-distribution generalization.
- Published
- 2024
15. Martian seismic anisotropy underneath Elysium Planitia revealed by direct S wave splitting
- Author
-
Shi, Jing, Han, Cunrui, Wang, Tao, Qi, Chao, Chen, Han, Yu, Zhihan, Geng, Jiaqi, Yang, Minghan, Wang, Xu, Chen, Ling, and Hui, Hejiu
- Subjects
Physics - Geophysics
- Abstract
Seismic anisotropy, arising from the crystallographic or lattice-preferred orientation of anisotropic minerals or the shape-preferred orientation of melts or cracks, can establish a critical link between Mars's past evolution and its current state. So far, although seismic anisotropy in Mars has been proposed due to different velocities of vertically and horizontally polarized shear waves in the Martian crust, obtained from crustal converted waves, multiples, and surface waves recorded by the InSight seismometer, the evidence is plausible. Notably, shear wave splitting, which stands out as a direct indicator of seismic anisotropy, has not been reported using marsquake records. In this study, we employ low-frequency marsquakes detected by the InSight seismometer to reveal shear wave splitting in Mars. We find that the direct S waves of three marsquake recordings (S0173a, S0235b, and S1133c) with high signal-to-noise ratios exhibit the splitting phenomenon. We rule out the possibility of apparent anisotropy through synthetic tests, affirming the presence of seismic anisotropy in Mars. The delay time (about 1.33 s on average) measured from the direct S wave splitting is too large to be solely attributed to the seismic anisotropy in the upper crust (0 - 10 km) beneath the InSight. Thus, seismic anisotropy in the deeper region of Mars is indispensable. Combined with other geophysical evidence near the InSight landing site, the strong seismic anisotropy observed in this study implies the porous crust with aligned cracks being greater than 10 km beneath the InSight and/or the presence of an active mantle plume underneath the Elysium Planitia of Mars., Comment: Manuscript has been submitted to Earth and Planetary Science Letters; 9 figures; 33 pages
- Published
- 2024
16. Imbalanced Graph Classification with Multi-scale Oversampling Graph Neural Networks
- Author
-
Ma, Rongrong, Pang, Guansong, and Chen, Ling
- Subjects
Computer Science - Machine Learning
- Abstract
One main challenge in imbalanced graph classification is to learn expressive representations of the graphs in under-represented (minority) classes. Existing generic imbalanced learning methods, such as oversampling and imbalanced learning loss functions, can be adopted for enabling graph representation learning models to cope with this challenge. However, these methods often directly operate on the graph representations, ignoring rich discriminative information within the graphs and their interactions. To tackle this issue, we introduce a novel multi-scale oversampling graph neural network (MOSGNN) that learns expressive minority graph representations based on intra- and inter-graph semantics resulting from oversampled graphs at multiple scales - subgraph, graph, and pairwise graphs. It achieves this by jointly optimizing subgraph-level, graph-level, and pairwise-graph learning tasks to learn the discriminative information embedded within and between the minority graphs. Extensive experiments on 16 imbalanced graph datasets show that MOSGNN i) significantly outperforms five state-of-the-art models, and ii) offers a generic framework, in which different advanced imbalanced learning loss functions can be easily plugged in and obtain significantly improved classification performance.
- Published
- 2024
17. MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
- Author
-
Dai, Wenxun, Chen, Ling-Hao, Wang, Jingbo, Liu, Jinpeng, Dai, Bo, and Tang, Yansong
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
This work introduces MotionLCM, extending controllable motion generation to a real-time level. Existing methods for spatial control in text-conditioned motion generation suffer from significant runtime inefficiency. To address this issue, we first propose the motion latent consistency model (MotionLCM) for motion generation, building upon the latent diffusion model (MLD). By employing one-step (or few-step) inference, we further improve the runtime efficiency of the motion latent diffusion model for motion generation. To ensure effective controllability, we incorporate a motion ControlNet within the latent space of MotionLCM and enable explicit control signals (e.g., pelvis trajectory) in the vanilla motion space to control the generation process directly, similar to controlling other latent-free diffusion models for motion generation. By employing these techniques, our approach can generate human motions with text and control signals in real-time. Experimental results demonstrate the remarkable generation and controlling capabilities of MotionLCM while maintaining real-time runtime efficiency., Comment: MotionLCM project version 1.0
- Published
- 2024
18. Interest Clock: Time Perception in Real-Time Streaming Recommendation System
- Author
-
Zhu, Yongchun, Chen, Jingwu, Chen, Ling, Li, Yitan, Zhang, Feng, and Liu, Zuotao
- Subjects
Computer Science - Information Retrieval
- Abstract
User preferences follow a dynamic pattern over a day, e.g., at 8 am, a user might prefer to read news, while at 8 pm, they might prefer to watch movies. Time modeling aims to enable recommendation systems to perceive time changes to capture users' dynamic preferences over time, which is an important and challenging problem in recommendation systems. Especially, streaming recommendation systems in the industry, with only available samples of the current moment, present greater challenges for time modeling. There is still a lack of effective time modeling methods for streaming recommendation systems. In this paper, we propose an effective and universal method Interest Clock to perceive time information in recommendation systems. Interest Clock first encodes users' time-aware preferences into a clock (hour-level personalized features) and then uses Gaussian distribution to smooth and aggregate them into the final interest clock embedding according to the current time for the final prediction. By arming base models with Interest Clock, we conduct online A/B tests, obtaining +0.509% and +0.758% improvements on user active days and app duration respectively. Besides, the extended offline experiments show improvements as well. Interest Clock has been deployed on Douyin Music App., Comment: Accepted by SIGIR 2024
- Published
- 2024
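The Interest Clock abstract above describes encoding hour-level preference features and then smoothing and aggregating them with a Gaussian distribution centred on the current time. A minimal sketch of that aggregation step follows, assuming 24 hourly feature vectors, a circular distance that wraps around midnight, and a hypothetical bandwidth `sigma`. The function names and all parameter values are illustrative assumptions, not the authors' implementation.

```python
import math


def gaussian_clock_weights(current_hour, sigma=1.5, hours=24):
    """Normalized Gaussian weights over hour slots, peaked at current_hour.

    Distance is measured on a circle so that 23:00 and 01:00 are equally
    close to midnight. sigma is an assumed bandwidth in hours.
    """
    weights = []
    for h in range(hours):
        d = abs(h - current_hour)
        d = min(d, hours - d)  # wrap around midnight
        weights.append(math.exp(-(d * d) / (2.0 * sigma * sigma)))
    total = sum(weights)
    return [w / total for w in weights]


def interest_clock_embedding(hour_features, current_hour, sigma=1.5):
    """Aggregate per-hour feature vectors into one vector for the current time."""
    weights = gaussian_clock_weights(current_hour, sigma, len(hour_features))
    dim = len(hour_features[0])
    return [
        sum(w * feats[i] for w, feats in zip(weights, hour_features))
        for i in range(dim)
    ]
```

For example, `interest_clock_embedding(hour_features, 20.5)` blends mostly the 20:00 and 21:00 feature vectors, with smaller contributions from neighbouring hours.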
19. MMAC-Copilot: Multi-modal Agent Collaboration Operating System Copilot
- Author
-
Song, Zirui, Li, Yaohang, Fang, Meng, Chen, Zhenhao, Shi, Zecheng, Huang, Yuan, and Chen, Ling
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
- Abstract
Autonomous virtual agents are often limited by their singular mode of interaction with real-world environments, restricting their versatility. To address this, we propose the Multi-Modal Agent Collaboration framework (MMAC-Copilot), a framework that utilizes the collective expertise of diverse agents to enhance interaction ability with operating systems. The framework introduces a team collaboration chain, enabling each participating agent to contribute insights based on their specific domain knowledge, effectively reducing the hallucination associated with knowledge domain gaps. To evaluate the performance of MMAC-Copilot, we conducted experiments using both the GAIA benchmark and our newly introduced Visual Interaction Benchmark (VIBench). VIBench focuses on non-API-interactable applications across various domains, including 3D gaming, recreation, and office scenarios. MMAC-Copilot achieved exceptional performance on GAIA, with an average improvement of 6.8% over existing leading systems. Furthermore, it demonstrated remarkable capability on VIBench, particularly in managing various methods of interaction within systems and applications. These results underscore MMAC-Copilot's potential in advancing the field of autonomous virtual agents through its innovative approach to agent collaboration., Comment: In processing
- Published
- 2024
20. ICST-DNET: An Interpretable Causal Spatio-Temporal Diffusion Network for Traffic Speed Prediction
- Author
-
Rong, Yi, Mao, Yingchi, Liu, Yinqiu, Chen, Ling, He, Xiaoming, and Niyato, Dusit
- Subjects
Computer Science - Machine Learning, Computer Science - Networking and Internet Architecture
- Abstract
Traffic speed prediction is significant for intelligent navigation and congestion alleviation. However, making accurate predictions is challenging due to three factors: 1) traffic diffusion, i.e., the spatial and temporal causality existing between the traffic conditions of multiple neighboring roads, 2) the poor interpretability of traffic data with complicated spatio-temporal correlations, and 3) the latent pattern of traffic speed fluctuations over time, such as morning and evening rush. Jointly considering these factors, in this paper, we present a novel architecture for traffic speed prediction, called Interpretable Causal Spatio-Temporal Diffusion Network (ICST-DNET). Specifically, ICST-DNET consists of three parts, namely the Spatio-Temporal Causality Learning (STCL), Causal Graph Generation (CGG), and Speed Fluctuation Pattern Recognition (SFPR) modules. First, to model the traffic diffusion within road networks, an STCL module is proposed to capture both the temporal causality on each individual road and the spatial causality in each road pair. The CGG module is then developed based on STCL to enhance the interpretability of the traffic diffusion procedure from the temporal and spatial perspectives. Specifically, a time causality matrix is generated to explain the temporal causality between each road's historical and future traffic conditions. For spatial causality, we utilize causal graphs to visualize the diffusion process in road pairs. Finally, to adapt to traffic speed fluctuations in different scenarios, we design a personalized SFPR module to select the historical timesteps with strong influences for learning the pattern of traffic speed fluctuations. Extensive experimental results prove that ICST-DNET can outperform all existing baselines, as evidenced by the higher prediction accuracy, ability to explain causality, and adaptability to different scenarios.
- Published
- 2024
21. Graph Continual Learning with Debiased Lossless Memory Replay
- Author
-
Niu, Chaoxi, Pang, Guansong, and Chen, Ling
- Subjects
Computer Science - Machine Learning
- Abstract
Real-life graph data often expands continually, rendering the learning of graph neural networks (GNNs) on static graph data impractical. Graph continual learning (GCL) tackles this problem by continually adapting GNNs to the expanded graph of the current task while maintaining the performance over the graph of previous tasks. Memory replay-based methods, which aim to replay data of previous tasks when learning new tasks, have been explored as one principled approach to mitigate the forgetting of the knowledge learned from the previous tasks. In this paper we extend this methodology with a novel framework, called Debiased Lossless Memory replay (DeLoMe). Unlike existing methods that sample nodes/edges of previous graphs to construct the memory, DeLoMe learns small lossless synthetic node representations as the memory. The learned memory can not only preserve the graph data privacy but also capture the holistic graph information, for which the sampling-based methods are not viable. Further, prior methods suffer from bias toward the current task due to the data imbalance between the classes in the memory data and the current data. A debiased GCL loss function is devised in DeLoMe to effectively alleviate this bias. Extensive experiments on four graph datasets show the effectiveness of DeLoMe under both class- and task-incremental learning settings., Comment: 12 pages
- Published
- 2024
22. DSGNN: A Dual-View Supergrid-Aware Graph Neural Network for Regional Air Quality Estimation
- Author
-
Zhang, Xin, Chen, Ling, Tang, Xing, and Shi, Hongyu
- Subjects
Computer Science - Machine Learning - Abstract
Air quality estimation can provide air quality for target regions without air quality stations, which is useful for the public. Existing air quality estimation methods divide the study area into disjointed grid regions, and apply 2D convolution to model the spatial dependencies of adjacent grid regions based on the first law of geography, failing to model the spatial dependencies of distant grid regions. To this end, we propose a Dual-view Supergrid-aware Graph Neural Network (DSGNN) for regional air quality estimation, which can model the spatial dependencies of distant grid regions from dual views (i.e., satellite-derived aerosol optical depth (AOD) and meteorology). Specifically, images are utilized to represent the regional data (i.e., AOD data and meteorology data). The dual-view supergrid learning module is introduced to generate supergrids in a parameterized way. Based on the dual-view supergrids, the dual-view implicit correlation encoding module is introduced to learn the correlations between pairwise supergrids. In addition, the dual-view message passing network is introduced to implement the information interaction on the supergrid graphs and images. Extensive experiments on two real-world datasets demonstrate that DSGNN achieves the state-of-the-art performances on the air quality estimation task, outperforming the best baseline by an average of 19.64% in MAE., Comment: Submitted to TKDE, 12 pages and 8 figures
- Published
- 2024
23. D-PAD: Deep-Shallow Multi-Frequency Patterns Disentangling for Time Series Forecasting
- Author
-
Yuan, Xiaobing and Chen, Ling
- Subjects
Computer Science - Artificial Intelligence - Abstract
In time series forecasting, effectively disentangling intricate temporal patterns is crucial. While recent works endeavor to combine decomposition techniques with deep learning, multiple frequencies may still be mixed in the decomposed components, e.g., trend and seasonal. Furthermore, frequency domain analysis methods, e.g., Fourier and wavelet transforms, have limitations in resolution in the time domain and adaptability. In this paper, we propose D-PAD, a deep-shallow multi-frequency patterns disentangling neural network for time series forecasting. Specifically, a multi-component decomposing (MCD) block is introduced to decompose the series into components with different frequency ranges, corresponding to the "shallow" aspect. A decomposition-reconstruction-decomposition (D-R-D) module is proposed to progressively extract the information of frequencies mixed in the components, corresponding to the "deep" aspect. After that, an interaction and fusion (IF) module is used to further analyze the components. Extensive experiments on seven real-world datasets demonstrate that D-PAD achieves the state-of-the-art performance, outperforming the best baseline by an average of 9.48% and 7.15% in MSE and MAE, respectively.
- Published
- 2024
24. Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments
- Author
-
Cheng, Sitao, Zhuang, Ziyuan, Xu, Yong, Yang, Fangkai, Zhang, Chaoyun, Qin, Xiaoting, Huang, Xiang, Chen, Ling, Lin, Qingwei, Zhang, Dongmei, Rajmohan, Saravan, and Zhang, Qi
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Large Language Models (LLMs) have shown potential in reasoning over structured environments, e.g., knowledge graph and table. Such tasks typically require multi-hop reasoning, i.e., match natural language utterance with instances in the environment. Previous methods leverage LLMs to incrementally build a reasoning path, where the LLMs either invoke tools or pick up schemas by step-by-step interacting with the environment. We propose Reasoning-Path-Editing (Readi), a novel framework where LLMs can efficiently and faithfully reason over structured environments. In Readi, LLMs initially generate a reasoning path given a query, and edit the path only when necessary. We instantiate the path on structured environments and provide feedback to edit the path if anything goes wrong. Experimental results on three KGQA and two TableQA datasets show the effectiveness of Readi, significantly surpassing previous LLM-based methods (by 9.1% Hit@1 on WebQSP, 12.4% on MQA-3H and 9.5% on WTQ), comparable with state-of-the-art fine-tuned methods (67% on CWQ and 74.7% on WebQSP) and substantially boosting the vanilla LLMs (by 14.9% on CWQ). Our code will be available on https://aka.ms/readi., Comment: Accepted by ACL 2024 Findings. 21 pages, 7 figures, 17 tables
- Published
- 2024
25. Type IV-like Solar Radio Burst Consisting of a Series of Spikes Observed by PSP
- Author
-
Ma, Bing, Chen, Ling, Wu, De-Jin, Pulupa, Marc, and Bale, Stuart D.
- Subjects
Astrophysics - Solar and Stellar Astrophysics ,Physics - Space Physics - Abstract
Solar and interplanetary radio bursts can reflect the existence and motion of energetic electrons and are therefore a kind of vital phenomenon in solar activities. The present study reported a solar radio burst (SRB) event observed by Parker Solar Probe (PSP) in its 8th orbital encounter phase, and it lasted about 20 hours in a frequency range of 0.5-15 MHz, called the type IV-like SRB. This type IV-like SRB consists of a series of numerous spikes with the center-frequency drifting slowly from ~5 MHz to ~1 MHz, and each individual spike shows a much faster frequency drift and has a narrow frequency range of a few MHz and short duration of a few minutes. Based on the empirical models of the solar atmosphere adopted commonly, combining the in-situ measurement by PSP, we propose that these small-scale spikes were generated by a group of solitary kinetic Alfvén waves (SKAWs) in a magnetic loop accompanying coronal mass ejection (CME) and moving outwards, in which the frequency drifting of an individual spike is caused by the SKAW's propagation and the center-frequency drifting may be attributed to the motion of the magnetic loop., Comment: There are some questions about models and the emission mechanisms to be discussed more carefully. We need to revise this manuscript
- Published
- 2024
26. Large Language Multimodal Models for 5-Year Chronic Disease Cohort Prediction Using EHR Data
- Author
-
Ding, Jun-En, Thao, Phan Nguyen Minh, Peng, Wen-Chih, Wang, Jian-Zhe, Chug, Chun-Cheng, Hsieh, Min-Chen, Tseng, Yun-Chien, Chen, Ling, Luo, Dongsheng, Wang, Chi-Te, Chen, Pei-fu, Liu, Feng, and Hung, Fang-Ming
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Chronic diseases such as diabetes are the leading causes of morbidity and mortality worldwide. Numerous research studies have been attempted with various deep learning models in diagnosis. However, most previous studies had certain limitations, including using publicly available datasets (e.g. MIMIC), and imbalanced data. In this study, we collected five-year electronic health records (EHRs) from the Taiwan hospital database, including 1,420,596 clinical notes, 387,392 laboratory test results, and more than 1,505 laboratory test items, focusing on research pre-training large language models. We proposed a novel Large Language Multimodal Models (LLMMs) framework incorporating multimodal data from clinical notes and laboratory test results for the prediction of chronic disease risk. Our method combined a text embedding encoder and multi-head attention layer to learn laboratory test values, utilizing a deep neural network (DNN) module to merge blood features with chronic disease semantics into a latent space. In our experiments, we observe that clinicalBERT and PubMed-BERT, when combined with attention fusion, can achieve an accuracy of 73% in multiclass chronic diseases and diabetes prediction. By transforming laboratory test values into textual descriptions and employing the Flan T-5 model, we achieved a 76% Area Under the ROC Curve (AUROC), demonstrating the effectiveness of leveraging numerical text data for training and inference in language models. This approach significantly improves the accuracy of early-stage diabetes prediction.
- Published
- 2024
27. Degree-heterogeneous Latent Class Analysis for High-dimensional Discrete Data
- Author
-
Lyu, Zhongyuan, Chen, Ling, and Gu, Yuqi
- Subjects
Statistics - Methodology ,Mathematics - Statistics Theory - Abstract
The latent class model is a widely used mixture model for multivariate discrete data. Besides the existence of qualitatively heterogeneous latent classes, real data often exhibit additional quantitative heterogeneity nested within each latent class. The modern latent class analysis also faces extra challenges, including the high-dimensionality, sparsity, and heteroskedastic noise inherent in discrete data. Motivated by these phenomena, we introduce the Degree-heterogeneous Latent Class Model and propose a spectral approach to clustering and statistical inference in the challenging high-dimensional sparse data regime. We propose an easy-to-implement HeteroClustering algorithm. It uses heteroskedastic PCA with L2 normalization to remove degree effects and perform clustering in the top singular subspace of the data matrix. We establish an exponential error rate for HeteroClustering, leading to exact clustering under minimal signal-to-noise conditions. We further investigate the estimation and inference of the high-dimensional continuous item parameters in the model, which are crucial to interpreting and finding useful markers for latent classes. We provide comprehensive procedures for global testing and multiple testing of these parameters with valid error controls. The superior performance of our methods is demonstrated through extensive simulations and applications to three diverse real-world datasets from political voting records, genetic variations, and single-cell sequencing.
- Published
- 2024
28. RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering
- Author
-
Zhang, Zihan, Fang, Meng, and Chen, Ling
- Subjects
Computer Science - Computation and Language - Abstract
Adaptive retrieval-augmented generation (ARAG) aims to dynamically determine the necessity of retrieval for queries instead of retrieving indiscriminately to enhance the efficiency and relevance of the sourced information. However, previous works largely overlook the evaluation of ARAG approaches, leading to their effectiveness being understudied. This work presents a benchmark, RetrievalQA, comprising 1,271 short-form questions covering new world and long-tail knowledge. The knowledge necessary to answer the questions is absent from LLMs; therefore, external information must be retrieved to answer correctly. This makes RetrievalQA a suitable testbed to evaluate existing ARAG methods. We observe that calibration-based methods heavily rely on threshold tuning, while vanilla prompting is inadequate for guiding LLMs to make reliable retrieval decisions. Based on our findings, we propose Time-Aware Adaptive Retrieval (TA-ARE), a simple yet effective method that helps LLMs assess the necessity of retrieval without calibration or additional training. The dataset and code will be available at https://github.com/hyintell/RetrievalQA, Comment: Findings of ACL 2024
- Published
- 2024
29. Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion Model with Large Language Models for Machine Translation
- Author
-
Na, Hongbin, Wang, Zimu, Maimaiti, Mieradilijiang, Chen, Tong, Wang, Wei, Shen, Tao, and Chen, Ling
- Subjects
Computer Science - Computation and Language - Abstract
Large language models (LLMs) have demonstrated promising potential in various downstream tasks, including machine translation. However, prior work on LLM-based machine translation has mainly focused on better utilizing training data, demonstrations, or pre-defined and universal knowledge to improve performance, with a lack of consideration of decision-making like human translators. In this paper, we incorporate Thinker with the Drift-Diffusion Model (Thinker-DDM) to address this issue. We then redefine the Drift-Diffusion process to emulate human translators' dynamic decision-making under constrained resources. We conduct extensive experiments under the high-resource, low-resource, and commonsense translation settings using the WMT22 and CommonMT datasets, in which Thinker-DDM outperforms baselines in the first two scenarios. We also perform additional analysis and evaluation on commonsense translation to illustrate the high effectiveness and efficacy of the proposed method., Comment: Under review
- Published
- 2024
30. SA-MDKIF: A Scalable and Adaptable Medical Domain Knowledge Injection Framework for Large Language Models
- Author
-
Xu, Tianhan, Hu, Zhe, Chen, Ling, and Li, Bin
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Recent advances in large language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks. However, their effective application in the medical domain is hampered by a lack of medical domain knowledge. In this study, we present SA-MDKIF, a scalable and adaptable framework that aims to inject medical knowledge into general-purpose LLMs through instruction tuning, thereby enabling adaptability for various downstream tasks. SA-MDKIF consists of two stages: skill training and skill adaptation. In the first stage, we define 12 basic medical skills and use AdaLoRA to train these skills based on uniformly formatted instructional datasets that we have constructed. In the next stage, we train the skill router using task-specific downstream data and use this router to integrate the acquired skills with LLMs during inference. Experimental results on 9 different medical tasks show that SA-MDKIF improves performance by 10-20% compared to the original LLMs. Notably, this improvement is particularly pronounced for unseen medical tasks, showing an improvement of up to 30%.
- Published
- 2024
31. Continuously Evolving Graph Neural Controlled Differential Equations for Traffic Forecasting
- Author
-
Wu, Jiajia and Chen, Ling
- Subjects
Computer Science - Machine Learning - Abstract
As a crucial technique for developing a smart city, traffic forecasting has become a popular research focus in academic and industrial communities for decades. This task is highly challenging due to complex and dynamic spatial-temporal dependencies in traffic networks. Existing works ignore continuous temporal dependencies and spatial dependencies evolving over time. In this paper, we propose Continuously Evolving Graph Neural Controlled Differential Equations (CEGNCDE) to capture continuous temporal dependencies and spatial dependencies over time simultaneously. Specifically, a continuously evolving graph generator (CEGG) based on NCDE is introduced to generate the spatial dependencies graph that continuously evolves over time from discrete historical observations. Then, a graph neural controlled differential equations (GNCDE) framework is introduced to capture continuous temporal dependencies and spatial dependencies over time simultaneously. Extensive experiments demonstrate that CEGNCDE outperforms the SOTA methods by average 2.34% relative MAE reduction, 0.97% relative RMSE reduction, and 3.17% relative MAPE reduction., Comment: 9 pages, 4 figures
- Published
- 2024
32. Mobile App-Based Intervention for Pregnant Women With Stress Urinary Incontinence: Protocol for a Hybrid Effectiveness-Implementation Trial
- Author
-
Li, Tiantian, Chen, Xiaomin, Wang, Jia, Chen, Ling, and Cai, Wenzhi
- Subjects
Medicine ,Computer applications to medicine. Medical informatics ,R858-859.7 - Abstract
Background: Stress urinary incontinence (SUI) is a common source of distress among women during and after pregnancy. It has a negative effect on quality of life but with poor care-seeking. Mobile health (mHealth) may be a promising solution with potential advantages. However, there is uncertainty whether a mobile app is effective for SUI symptom improvement during and after pregnancy. The implementation is also unclear. We developed an app named UIW (Urinary Incontinence for Women) aimed at improving perinatal incontinence. Objective: The objective of this study is to evaluate the effectiveness of the UIW app-based intervention in improving SUI symptoms among pregnant women and explore the facilitators and barriers to using the UIW app to help refine and optimize the intervention. Methods: This study is a hybrid effectiveness-implementation trial with a randomized controlled trial alongside a mixed-methods process evaluation according to the Reach, Effectiveness, Adoption, Implementation, and Maintenance (RE-AIM) framework. Pregnant women with SUI (n=336) will be recruited from a university-affiliated hospital in China. They will be randomly allocated (1:1) to either the intervention group that receive usual care plus UIW app or control group that receive usual care alone. The intervention period will last 2 months. The 5 dimensions of the RE-AIM framework will be evaluated at recruitment (-T1), baseline (T0), immediately after intervention (T1), 42 days after delivery (T2), 3 months after delivery (T3), and 6 months after delivery (T4) through project documents, online questionnaires and a pelvic floor muscle training diary, surface electromyography, log data in the background management system, and qualitative interviews. Data analysis will follow the intention-to-treat principle. Descriptive statistics, t tests, chi-square tests, and a linear mixed model will be used to analyze the quantitative data. Deductive and inductive content analysis will be used to analyze the qualitative data. Results: The effectiveness-implementation trial started in June 2020, trial recruitment was completed in October 2020, and the intervention will last for a 2-month period. Completion of the 6-month follow-up will be in July 2021, and we anticipate that the results of this study will be published in December 2021. Conclusions: This study will evaluate both effectiveness and implementation of the UIW app-based intervention among pregnant women. The hybrid effectiveness-implementation trial design according to the RE-AIM framework with a mixed-methods approach will give valuable insights into the effects as well as facilitators and barriers to the implementation that will influence the effects of the UIW app-based intervention. Trial Registration: Chinese Clinical Trial Registry ChiCTR1800016171; http://www.chictr.org.cn/showproj.aspx?proj=27455 International Registered Report Identifier (IRRID): PRR1-10.2196/22771
- Published
- 2021
33. Epidemiologic trends and changes in humoral immunity and lymphocyte subsets levels among hospitalized children with Mycoplasma pneumoniae infection during 2019–2023
- Author
-
Tang, Linyan, Zheng, Kaiwen, Ma, Lanlan, Chen, Ling, Zhao, Yuling, Li, Li, Wang, Ke, Zhang, Jing, and Chen, Xing
- Published
- 2024
34. Predicting pile-bearing capacity utilizing least square support vector regression coupled with giant trevally optimizer and the flying foxes optimization
- Author
-
Chen, Ling
- Published
- 2024
35. Clinical Significance of MicroRNA-299-3p in Coronary Artery Disease Based on Bioinformatics Analysis
- Author
-
Wu, Jian, Wu, Sha, Liu, Denghai, and Chen, Ling
- Published
- 2024
36. Long Noncoding RNA NKX2-1-AS1 Accelerates Non-Small Cell Lung Cancer Progression through the miR-589-5p/NME1 Axis
- Author
-
Chen, Xiaoying, Jiang, Ruilai, Huang, Xiaocheng, Chen, Ling, Hu, Xiaogang, and Wei, Yanbin
- Published
- 2024
37. A CRISPR/RfxCas13d-mediated strategy for efficient RNA knockdown in mouse embryonic development
- Author
-
Zhang, Lin, Cao, Shi-Meng, Wu, Hao, Yan, Meng, Li, Jinsong, and Chen, Ling-Ling
- Published
- 2024
38. Associations Between CYP3A5 (c.6986A>G) Gene Polymorphism and Kidney Impairment in Hypertensive Adults Without Cystatin C Elevation
- Author
-
Chen, Ling, Jiang, Yufeng, and Cheng, Xingbo
- Published
- 2024
39. Characterization and source apportionment of pharmaceuticals in surface water of the Yangtze Estuary and adjacent sea
- Author
-
Chen, Chunzhao, Tang, Jian, Li, Fei, Xue, Rui, Xiao, Yihua, Chen, Ling, and Yu, Gang
- Published
- 2024
40. Factors Influencing Autonomy in Middle-Aged and Elderly Women with Urinary Incontinence
- Author
-
Zhang, Yingying, Li, Jie, Hu, Yingjie, Chen, Ling, Cai, Wenzhi, and Ren, Wei
- Published
- 2024
41. Inference and prioritization of tissue-specific regulons in Arabidopsis and Oryza
- Author
-
Dai, Honggang, Fan, Yaxin, Mei, Yichao, Chen, Ling-Ling, and Gao, Junxiang
- Published
- 2024
42. Disfunction of dorsal raphe nucleus-hippocampus serotonergic-HTR3 transmission results in anxiety phenotype of Neuroplastin 65-deficient mice
- Author
-
Cheng, Jie, Chen, Ling, Zheng, Ya-ni, Liu, Juan, Zhang, Lei, Zhang, Xiao-ming, Huang, Liang, and Yuan, Qiong-lan
- Published
- 2024
43. How and when college students’ perception of teachers’ entrepreneurial leadership affects entrepreneurial intention: a moderated serial mediation model
- Author
-
Gao, Sun-Yu, Chen, Ling-Ge, Huang, Jian-Hao, and Tsai, Yi-Ying
- Published
- 2024
44. Regression analysis of doubly censored failure time data with ancillary information
- Author
-
Du, Mingyue, Gao, Xiyuan, and Chen, Ling
- Published
- 2024
45. A multi-label transformer-based deep learning approach to predict focal visual field progression
- Author
-
Chen, Ling, Tseng, Vincent S., Tsung, Ta-Hsin, and Lu, Da-Wen
- Published
- 2024
46. Clinical characteristics and application value of risk prediction models of acute appendicitis in rural Tibet: A retroprective study
- Author
-
Liu, Jie, Chen, Ling, and Dai, Zhiqiang
- Published
- 2023
47. Clinical Characteristics and Outcomes of Childbearing-Age Women With COVID-19 in Wuhan: Retrospective, Single-Center Study
- Author
-
Wei, Lijie, Gao, Xuan, Chen, Suhua, Zeng, Wanjiang, Wu, Jianli, Lin, Xingguang, Zhang, Huiting, Mwamaka Sharifu, Lali, Chen, Ling, Feng, Ling, and Wang, Shaoshuai
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Public aspects of medicine ,RA1-1270 - Abstract
Background: Since December 2019, an outbreak of the coronavirus disease (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has spread rapidly in Wuhan and worldwide. However, previous studies on pregnant patients were limited. Objective: The aim of this study is to evaluate the clinical characteristics and outcomes of pregnant and nonpregnant women with COVID-19. Methods: This study retrospectively collected epidemiological, clinical, laboratory, imaging, management, and outcome data of 43 childbearing-age women patients (including 17 pregnant and 26 nonpregnant patients) who presented with laboratory-confirmed COVID-19 in Tongji Hospital, Wuhan, China from January 19 to March 2, 2020. Clinical outcomes were followed up to March 28, 2020. Results: Of the 43 childbearing-age women in this study, none developed a severe adverse illness or died. The median ages of pregnant and nonpregnant women were 33.0 and 33.5 years, respectively. Pregnant women had a markedly higher proportion of history exposure to hospitals within 2 weeks before onset compared to nonpregnant women (9/17, 53% vs 5/26, 19%, P=.02) and a lower proportion of other family members affected (4/17, 24% vs 19/26, 73%, P=.004). Fever (8/17, 47% vs 18/26, 69%) and cough (9/17, 53% vs 12/26, 46%) were common onsets of symptoms for the two groups. Abdominal pain (n=4, 24%), vaginal bleeding (n=1, 6%), reduced fetal movement (n=1, 6%), and increased fetal movement (n=2, 13%) were observed at onset in the 17 pregnant patients. Higher neutrophil and lower lymphocyte percent were observed in the pregnant group compared to the nonpregnant group (79% vs 56%, P
- Published
- 2020
48. Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game
- Author
-
Shi, Zijing, Fang, Meng, Zheng, Shunfeng, Deng, Shilong, Chen, Ling, and Du, Yali
- Subjects
Computer Science - Computation and Language - Abstract
Multi-agent collaboration with Large Language Models (LLMs) demonstrates proficiency in basic tasks, yet its efficiency in more complex scenarios remains unexplored. In gaming environments, these agents often face situations without established coordination protocols, requiring them to make intelligent inferences about teammates from limited data. This problem motivates the area of ad hoc teamwork, in which an agent may potentially cooperate with a variety of teammates to achieve a shared goal. Our study focuses on the ad hoc teamwork problem where the agent operates in an environment driven by natural language. Our findings reveal the potential of LLM agents in team collaboration, highlighting issues related to hallucinations in communication. To address this issue, we develop CodeAct, a general agent that equips LLM with enhanced memory and code-driven reasoning, enabling the repurposing of partial information for rapid adaptation to new teammates., Comment: Code will release soon
- Published
- 2023
49. One-Shot Learning as Instruction Data Prospector for Large Language Models
- Author
-
Li, Yunshui, Hui, Binyuan, Xia, Xiaobo, Yang, Jiaxi, Yang, Min, Zhang, Lei, Si, Shuzheng, Chen, Ling-Hao, Liu, Junhao, Liu, Tongliang, Huang, Fei, and Li, Yongbin
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Contemporary practices in instruction tuning often hinge on enlarging data scaling without a clear strategy for ensuring data quality, inadvertently introducing noise that may compromise model performance. To address this challenge, we introduce Nuggets, a novel and efficient methodology that leverages one-shot learning to discern and select high-quality instruction data from extensive datasets. Nuggets assesses the potential of individual instruction examples to act as effective one-shot learning instances, thereby identifying those that can significantly improve performance across diverse tasks. Nuggets utilizes a scoring system based on the impact of candidate examples on the perplexity of a diverse anchor set, facilitating the selection of the most advantageous data for instruction tuning. Through comprehensive evaluations on two benchmarks, including MT-Bench and Alpaca-Eval, we show that instruction tuning with the top 1% of examples curated by Nuggets substantially outperforms conventional methods employing the entire dataset., Comment: ACL 2024
- Published
- 2023
50. BDHT: Generative AI Enables Causality Analysis for Mild Cognitive Impairment
- Author
-
Zuo, Qiankun, Chen, Ling, Shen, Yanyan, Ng, Michael Kwok-Po, Lei, Baiying, and Wang, Shuqiang
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition ,Quantitative Biology - Neurons and Cognition - Abstract
Effective connectivity estimation plays a crucial role in understanding the interactions and information flow between different brain regions. However, the functional time series used for estimating effective connectivity is derived from certain software, which may lead to large computing errors because of different parameter settings and degrade the ability to model complex causal relationships between brain regions. In this paper, a brain diffuser with hierarchical transformer (BDHT) is proposed to estimate effective connectivity for mild cognitive impairment (MCI) analysis. To our best knowledge, the proposed brain diffuser is the first generative model to apply diffusion models to the application of generating and analyzing multimodal brain networks. Specifically, the BDHT leverages structural connectivity to guide the reverse processes in an efficient way. It makes the denoising process more reliable and guarantees effective connectivity estimation accuracy. To improve denoising quality, the hierarchical denoising transformer is designed to learn multi-scale features in topological space. By stacking the multi-head attention and graph convolutional network, the graph convolutional transformer (GraphConformer) module is devised to enhance structure-function complementarity and improve the ability in noise estimation. Experimental evaluations of the denoising diffusion model demonstrate its effectiveness in estimating effective connectivity. The proposed model achieves superior performance in terms of accuracy and robustness compared to existing approaches. Moreover, the proposed model can identify altered directional connections and provide a comprehensive understanding of pathogenesis for MCI treatment., Comment: 13 pages, 14 figures
- Published
- 2023