Author: "Ding, Yidong" - Searchworks@Jio Institute Digital Library Search Results

Your search for author "Ding, Yidong" returned 9 results

1. MBTSAD: Mitigating Backdoors in Language Models Based on Token Splitting and Attention Distillation

Author: Ding, Yidong, Niu, Jiafei, and Yi, Ping
Subjects: Computer Science - Cryptography and Security, Computer Science - Computation and Language
Abstract: In recent years, attention-based models have excelled across various domains but remain vulnerable to backdoor attacks, often introduced by downloading poisoned models or fine-tuning on poisoned datasets. Many current methods to mitigate backdoors in NLP models rely on the pre-trained (unfine-tuned) weights, but these methods fail in scenarios where the pre-trained weights are not available. In this work, we propose MBTSAD, which can mitigate backdoors in the language model by utilizing only a small subset of clean data and does not require pre-trained weights. Specifically, MBTSAD retrains the backdoored model on a dataset generated by token splitting. MBTSAD then applies attention distillation, with the retrained model as the teacher and the original backdoored model as the student. Experimental results demonstrate that MBTSAD achieves backdoor mitigation performance comparable to that of methods based on pre-trained weights while maintaining performance on clean data. MBTSAD does not rely on pre-trained weights, enhancing its utility in scenarios where pre-trained weights are inaccessible. In addition, we simplify the min-max problem of adversarial training and visualize text representations to discover that the token splitting method in MBTSAD's first step generates Out-of-Distribution (OOD) data, leading the model to learn more generalized features and eliminate backdoor patterns.
Comment: Accepted by ICTAI 2024
Published: 2025

2. TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models

Author: Cheng, Pengzhou, Ding, Yidong, Ju, Tianjie, Wu, Zongru, Du, Wei, Yi, Ping, Zhang, Zhuosheng, and Liu, Gongshen
Subjects: Computer Science - Cryptography and Security, Computer Science - Computation and Language
Abstract: Large language models (LLMs) have raised concerns about potential security threats despite performing significantly in Natural Language Processing (NLP). Backdoor attacks initially verified that LLM is doing substantial harm at all stages, but the cost and robustness have been criticized. Attacking LLMs is inherently risky in security review, while prohibitively expensive. Besides, the continuous iteration of LLMs will degrade the robustness of backdoors. In this paper, we propose TrojanRAG, which employs a joint backdoor attack in the Retrieval-Augmented Generation, thereby manipulating LLMs in universal attack scenarios. Specifically, the adversary constructs elaborate target contexts and trigger sets. Multiple pairs of backdoor shortcuts are orthogonally optimized by contrastive learning, thus constraining the triggering conditions to a parameter subspace to improve the matching. To improve the recall of the RAG for the target contexts, we introduce a knowledge graph to construct structured data to achieve hard matching at a fine-grained level. Moreover, we normalize the backdoor scenarios in LLMs to analyze the real harm caused by backdoors from both attackers' and users' perspectives and further verify whether the context is a favorable tool for jailbreaking models. Extensive experimental results on truthfulness, language understanding, and harmfulness show that TrojanRAG exhibits versatility threats while maintaining retrieval capabilities on normal queries.
Comment: 19 pages, 14 figures, 4 tables
Published: 2024

3. Contrasting responses of soil phosphorus pool and bioavailability to alder expansion in a boreal peatland, Northeast China

Author: Wan, Songze, Lin, Guigang, Liu, Bo, Ding, Yidong, Li, Suli, and Mao, Rong
Published: 2022

4. Genesis and Related Reservoir Development Model of Ordovician Dolomite in Shuntogol Area, Tarim Basin.

Author: Zhong, Liangxuanzi, Cheng, Leli, Fu, Heng, Zhao, Shaoze, Ye, Xiaobin, Ding, Yidong, and Senlin, Yin
Subjects: DOLOMITE, RARE earth metals, CARBONATE rocks, CATHODE rays, OXYGEN isotopes, CORE drilling, RARE earth oxides, TRACE elements
Abstract: The Ordovician thick dolostone in Shuntogol area of the Tarim Basin has the potential to form a large-scale reservoir, but its genesis and reservoir development model are still unclear. Starting from a sedimentary sequence, this study takes a batch of dolostone samples obtained from new drilling cores in recent years as the research object. On the basis of core observation and thin section identification, trace elements, cathodoluminescence, carbon and oxygen isotopes, rare earth elements, and X-ray diffraction order degree tests were carried out to discuss the origin of the dolomite and summarize the development model of the dolostone reservoir. The analysis results show that the Ordovician dolomite in the study area had a good crystalline shape, large thickness, high Fe and Mn values, and mostly showed bright red light or bright orange–red light under cathode rays. The ratio of δ18O values to seawater values at the same time showed a negative bias; the δCe values were negative anomalies, the δEu values were positive anomalies, and the order degree was high. This indicates that the dolomitization process occurred in a relatively closed diagenetic environment. The Ordovician carbonate rocks in the study area were low-lying during the sedimentary period, and with the rise of sea level, the open platform facies continued to develop. When the Middle and Lower Ordovician series entered the burial stage, the main hydrocarbon source rocks of the lower Cambrian Series entered the oil generation peak, and the resulting formation overpressure provided the dynamic source for the upward migration of the lower magnesium-rich fluid, and the dolomitization fluid entered the karst pore system in the target layer to produce all the dolomitization. This set of dolostone reservoirs is large in scale and can be used as a favorable substitute area for deep carbonate exploration for continuous study. [ABSTRACT FROM AUTHOR]
Published: 2024

5. KGETCDA: an efficient representation learning framework based on knowledge graph encoder from transformer for predicting circRNA-disease associations

Author: Wu, Jinyang, Ning, Zhiwei, Ding, Yidong, Wang, Ying, Peng, Qinke, and Fu, Laiyi
Published: 2023

6. BertNDA: a Model Based on Graph-Bert and Multi-scale Information Fusion for ncRNA-disease Association Prediction

Author: Ning, Zhiwei, Wu, Jinyang, Ding, Yidong, Wang, Ying, Peng, Qinke, and Fu, Laiyi
Published: 2023

7. KGETCDA: an efficient representation learning framework based on knowledge graph encoder from transformer for predicting circRNA-disease associations

Author: Wu, Jinyang, Ning, Zhiwei, Ding, Yidong, Wang, Ying, Peng, Qinke, and Fu, Laiyi
Abstract: Recent studies have demonstrated the significant role that circRNA plays in the progression of human diseases. Identifying circRNA-disease associations (CDA) in an efficient manner can offer crucial insights into disease diagnosis. While traditional biological experiments can be time-consuming and labor-intensive, computational methods have emerged as a viable alternative in recent years. However, these methods are often limited by data sparsity and their inability to explore high-order information. In this paper, we introduce a novel method named Knowledge Graph Encoder from Transformer for predicting CDA (KGETCDA). Specifically, KGETCDA first integrates more than 10 databases to construct a large heterogeneous non-coding RNA dataset, which contains multiple relationships between circRNA, miRNA, lncRNA and disease. Then, a biological knowledge graph is created based on this dataset and Transformer-based knowledge representation learning and attentive propagation layers are applied to obtain high-quality embeddings with accurately captured high-order interaction information. Finally, multilayer perceptron is utilized to predict the matching scores of CDA based on their embeddings. Our empirical results demonstrate that KGETCDA significantly outperforms other state-of-the-art models. To enhance user experience, we have developed an interactive web-based platform named HNRBase that allows users to visualize, download data and make predictions using KGETCDA with ease. The code and datasets are publicly available at https://github.com/jinyangwu/KGETCDA.
Published: 2023

8. BertNDA: A Model Based on Graph-Bert and Multi-Scale Information Fusion for ncRNA-Disease Association Prediction

Author: Ning, Zhiwei, Wu, Jinyang, Ding, Yidong, Wang, Ying, Peng, Qinke, and Fu, Laiyi
Abstract: Non-coding RNAs (ncRNAs) are a class of RNA molecules that lack the ability to encode proteins in human cells, but play crucial roles in various biological processes. Understanding the interactions between different ncRNAs and their impact on diseases can significantly contribute to diagnosis, prevention, and treatment of diseases. However, predicting tertiary interactions between ncRNAs and diseases based on structural information in multiple scales remains a challenging task. To address this challenge, we propose a method called BertNDA, aiming to predict potential relationships between miRNAs, lncRNAs, and diseases. The framework identifies local information through a connectionless subgraph, which aggregates the features of neighbor nodes. Global information is extracted by leveraging the Laplace transform of graph structures and WL (Weisfeiler-Lehman) absolute role coding. Additionally, an EMLP (Element-wise MLP) structure is designed to fuse pairwise global information. The transformer-encoder is employed as the backbone of our approach, followed by a prediction-layer to output the final correlation score. Extensive experiments demonstrate that BertNDA outperforms state-of-the-art methods in prediction assignment and exhibits significant potential for various biological applications. Moreover, we develop an online prediction platform that incorporates the prediction model, providing users with an intuitive and interactive experience. Overall, our model offers an efficient, accurate, and comprehensive tool for predicting tertiary associations between ncRNAs and diseases.
Published: 2023

9. The research and improving for multi-pattern string matching algorithm

Author: Fang Xiangyan, Yuan Youguang, Ding Yidong, and Xiong Tinggang
Subjects: Optimal matching, Theoretical computer science, 3-dimensional matching, Commentz-Walter algorithm, String searching algorithm, String metric, Approximate string matching, Boyer–Moore string search algorithm, Rabin–Karp algorithm, Mathematics
Abstract: The paper proposes improvements that raise the matching rate of the Wu-Manber multi-pattern string matching algorithm. First, a string abstract value matching method improves the precision of the first match and reduces the number of string comparisons; second, a heuristic matching method increases the safe shift distance when a string match succeeds; third, a multi-level cache parallel matching method removes the delay of long pattern string matching and hash operations. Lastly, the algorithmic complexity is analyzed. The effectiveness of the improved Wu-Manber algorithm is demonstrated by experiments.
Published: 2010
