12,216 results
Search Results
12. Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI
- Author
-
Papamarkou, Theodore, Skoularidou, Maria, Palla, Konstantina, Aitchison, Laurence, Arbel, Julyan, Dunson, David, Filippone, Maurizio, Fortuin, Vincent, Hennig, Philipp, Lobato, Jose Miguel Hernandez, Hubin, Aliaksandr, Immer, Alexander, Karaletsos, Theofanis, Khan, Mohammad Emtiyaz, Kristiadi, Agustinus, Li, Yingzhen, Mandt, Stephan, Nemeth, Christopher, Osborne, Michael A., Rudner, Tim G. J., Rügamer, David, Teh, Yee Whye, Welling, Max, Wilson, Andrew Gordon, and Zhang, Ruqi
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learning (BDL) constitutes a promising avenue, offering advantages across these diverse settings. This paper posits that BDL can elevate the capabilities of deep learning. It revisits the strengths of BDL, acknowledges existing challenges, and highlights some exciting research avenues aimed at addressing these obstacles. Looking ahead, the discussion focuses on possible ways to combine large-scale foundation models with BDL to unlock their full potential.
- Published
- 2024
13. Position Paper: Toward New Frameworks for Studying Model Representations
- Author
-
Golechha, Satvik and Dao, James
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Mechanistic interpretability (MI) aims to understand AI models by reverse-engineering the exact algorithms neural networks learn. Most works in MI so far have studied behaviors and capabilities that are trivial and token-aligned. However, most capabilities are not that trivial, which advocates for the study of hidden representations inside these networks as the unit of analysis. We do a literature review, formalize representations for features and behaviors, highlight their importance and evaluation, and perform some basic exploration in the mechanistic interpretability of representations. With discussion and exploratory results, we justify our position that studying representations is an important and under-studied field, and that currently established methods in MI are not sufficient to understand representations, thus pushing for the research community to work toward new frameworks for studying representations.
- Published
- 2024
14. Position Paper: Assessing Robustness, Privacy, and Fairness in Federated Learning Integrated with Foundation Models
- Author
-
Li, Xi and Wang, Jiaqi
- Subjects
Computer Science - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Computers and Society - Abstract
Federated Learning (FL), while a breakthrough in decentralized machine learning, contends with significant challenges such as limited data availability and the variability of computational resources, which can stifle the performance and scalability of the models. The integration of Foundation Models (FMs) into FL presents a compelling solution to these issues, with the potential to enhance data richness and reduce computational demands through pre-training and data augmentation. However, this incorporation introduces novel issues in terms of robustness, privacy, and fairness, which have not been sufficiently addressed in the existing research. We make a preliminary investigation into this field by systematically evaluating the implications of FM-FL integration across these dimensions. We analyze the trade-offs involved, uncover the threats and issues introduced by this integration, and propose a set of criteria and strategies for navigating these challenges. Furthermore, we identify potential research directions for advancing this field, laying a foundation for future development in creating reliable, secure, and equitable FL systems., Comment: Under review
- Published
- 2024
15. Position Paper: Generalized grammar rules and structure-based generalization beyond classical equivariance for lexical tasks and transduction
- Author
-
Petrache, Mircea and Trivedi, Shubhendu
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Compositional generalization is one of the main properties which differentiates lexical learning in humans from state-of-art neural networks. We propose a general framework for building models that can generalize compositionally using the concept of Generalized Grammar Rules (GGRs), a class of symmetry-based compositional constraints for transduction tasks, which we view as a transduction analogue of equivariance constraints in physics-inspired tasks. Besides formalizing generalized notions of symmetry for language transduction, our framework is general enough to contain many existing works as special cases. We present ideas on how GGRs might be implemented, and in the process draw connections to reinforcement learning and other areas of research., Comment: 12 pages
- Published
- 2024
16. When SMILES have Language: Drug Classification using Text Classification Methods on Drug SMILES Strings
- Author
-
Wasi, Azmine Toushik, Karlo, Šerbetar, Islam, Raima, Rafi, Taki Hasan, and Chae, Dong-Kyu
- Subjects
Quantitative Biology - Biomolecules ,Computer Science - Computation and Language ,Computer Science - Information Retrieval ,Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Complex chemical structures, like drugs, are usually defined by SMILES strings as a sequence of molecules and bonds. These SMILES strings are used in different complex machine learning-based drug-related research and representation works. Escaping from complex representation, in this work, we pose a single question: What if we treat drug SMILES as conventional sentences and engage in text classification for drug classification? Our experiments affirm the possibility with very competitive scores. The study explores the notion of viewing each atom and bond as sentence components, employing basic NLP methods to categorize drug types, proving that complex problems can also be solved with simpler perspectives. The data and code are available here: https://github.com/azminewasi/Drug-Classification-NLP., Comment: 7 pages, 2 figures, 5 tables, Accepted (invited to present) to the The Second Tiny Papers Track at ICLR 2024 (https://openreview.net/forum?id=VUYCyH8fCw)
- Published
- 2024
17. Cooperation and Federation in Distributed Radar Point Cloud Processing
- Author
-
Savazzi, S., Rampa, V., Kianoush, S., Minora, A., and Costa, L.
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Information Theory - Abstract
The paper considers the problem of human-scale RF sensing utilizing a network of resource-constrained MIMO radars with low range-azimuth resolution. The radars operate in the mmWave band and obtain time-varying 3D point cloud (PC) information that is sensitive to body movements. They also observe the same scene from different views and cooperate while sensing the environment using a sidelink communication channel. Conventional cooperation setups allow the radars to mutually exchange raw PC information to improve ego sensing. The paper proposes a federation mechanism where the radars exchange the parameters of a Bayesian posterior measure of the observed PCs, rather than raw data. The radars act as distributed parameter servers to reconstruct a global posterior (i.e., federated posterior) using Bayesian tools. The paper quantifies and compares the benefits of radar federation with respect to cooperation mechanisms. Both approaches are validated by experiments with a real-time demonstration platform. Federation makes minimal use of the sidelink communication channel (20 {\div} 25 times lower bandwidth use) and is less sensitive to unresolved targets. On the other hand, cooperation reduces the mean absolute target estimation error of about 20%.
- Published
- 2024
- Full Text
- View/download PDF
18. Enhancing Bangla Language Next Word Prediction and Sentence Completion through Extended RNN with Bi-LSTM Model On N-gram Language
- Author
-
Islam, Md Robiul, Amin, Al, and Zereen, Aniqua Nusrat
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Texting stands out as the most prominent form of communication worldwide. Individual spend significant amount of time writing whole texts to send emails or write something on social media, which is time consuming in this modern era. Word prediction and sentence completion will be suitable and appropriate in the Bangla language to make textual information easier and more convenient. This paper expands the scope of Bangla language processing by introducing a Bi-LSTM model that effectively handles Bangla next-word prediction and Bangla sentence generation, demonstrating its versatility and potential impact. We proposed a new Bi-LSTM model to predict a following word and complete a sentence. We constructed a corpus dataset from various news portals, including bdnews24, BBC News Bangla, and Prothom Alo. The proposed approach achieved superior results in word prediction, reaching 99\% accuracy for both 4-gram and 5-gram word predictions. Moreover, it demonstrated significant improvement over existing methods, achieving 35\%, 75\%, and 95\% accuracy for uni-gram, bi-gram, and tri-gram word prediction, respectively, Comment: This paper contains 6 pages, 8 figures
- Published
- 2024
19. The Privacy Power of Correlated Noise in Decentralized Learning
- Author
-
Allouah, Youssef, Koloskova, Anastasia, Firdoussi, Aymane El, Jaggi, Martin, and Guerraoui, Rachid
- Subjects
Computer Science - Machine Learning ,Computer Science - Cryptography and Security ,Computer Science - Distributed, Parallel, and Cluster Computing ,Mathematics - Optimization and Control ,Statistics - Machine Learning - Abstract
Decentralized learning is appealing as it enables the scalable usage of large amounts of distributed data and resources (without resorting to any central entity), while promoting privacy since every user minimizes the direct exposure of their data. Yet, without additional precautions, curious users can still leverage models obtained from their peers to violate privacy. In this paper, we propose Decor, a variant of decentralized SGD with differential privacy (DP) guarantees. Essentially, in Decor, users securely exchange randomness seeds in one communication round to generate pairwise-canceling correlated Gaussian noises, which are injected to protect local models at every communication round. We theoretically and empirically show that, for arbitrary connected graphs, Decor matches the central DP optimal privacy-utility trade-off. We do so under SecLDP, our new relaxation of local DP, which protects all user communications against an external eavesdropper and curious users, assuming that every pair of connected users shares a secret, i.e., an information hidden to all others. The main theoretical challenge is to control the accumulation of non-canceling correlated noise due to network sparsity. We also propose a companion SecLDP privacy accountant for public use., Comment: Accepted as conference paper at ICML 2024
- Published
- 2024
20. PVF (Parameter Vulnerability Factor): A Quantitative Metric Measuring AI Vulnerability and Resilience Against Parameter Corruptions
- Author
-
Jiao, Xun, Lin, Fred, Dixit, Harish D., Coburn, Joel, Pandey, Abhinav, Wang, Han, Huang, Jianyu, Ramesh, Venkat, Xu, Wang, Moore, Daniel, and Sankar, Sriram
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence ,Computer Science - Hardware Architecture ,Computer Science - Machine Learning - Abstract
Reliability of AI systems is a fundamental concern for the successful deployment and widespread adoption of AI technologies. Unfortunately, the escalating complexity and heterogeneity of AI hardware systems make them inevitably and increasingly susceptible to hardware faults (e.g., bit flips) that can potentially corrupt model parameters. Given this challenge, this paper aims to answer a critical question: How likely is a parameter corruption to result in an incorrect model output? To systematically answer this question, we propose a novel quantitative metric, Parameter Vulnerability Factor (PVF), inspired by architectural vulnerability factor (AVF) in computer architecture community, aiming to standardize the quantification of AI model resilience/vulnerability against parameter corruptions. We define a model parameter's PVF as the probability that a corruption in that particular model parameter will result in an incorrect output. Similar to AVF, this statistical concept can be derived from statistically extensive and meaningful fault injection (FI) experiments. In this paper, we present several use cases on applying PVF to three types of tasks/models during inference -- recommendation (DLRM), vision classification (CNN), and text classification (BERT). PVF can provide pivotal insights to AI hardware designers in balancing the tradeoff between fault protection and performance/efficiency such as mapping vulnerable AI parameter components to well-protected hardware modules. PVF metric is applicable to any AI model and has a potential to help unify and standardize AI vulnerability/resilience evaluation practice.
- Published
- 2024
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.