Author: "Swamy OR" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Swamy OR"' showing total 39,588 results

1. Diffusing States and Matching Scores: A New Framework for Imitation Learning

Author: Wu, Runzhe, Chen, Yiding, Swamy, Gokul, Brantley, Kianté, and Sun, Wen
Subjects: Computer Science - Machine Learning
Abstract: Adversarial Imitation Learning is traditionally framed as a two-player zero-sum game between a learner and an adversarially chosen cost function, and can therefore be thought of as the sequential generalization of a Generative Adversarial Network (GAN). A prominent example of this framework is Generative Adversarial Imitation Learning (GAIL). However, in recent years, diffusion models have emerged as a non-adversarial alternative to GANs that merely require training a score function via regression, yet produce generations of a higher quality. In response, we investigate how to lift insights from diffusion modeling to the sequential setting. We propose diffusing states and performing score-matching along diffused states to measure the discrepancy between the expert's and learner's states. Thus, our approach only requires training score functions to predict noises via standard regression, making it significantly easier and more stable to train than adversarial methods. Theoretically, we prove first- and second-order instance-dependent bounds with linear scaling in the horizon, proving that our approach avoids the compounding errors that stymie offline approaches to imitation learning. Empirically, we show our approach outperforms GAN-style imitation learning baselines across various continuous control problems, including complex tasks like controlling humanoids to walk, sit, and crawl.
Published: 2024

2. Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Author: Gao, Zhaolin, Zhan, Wenhao, Chang, Jonathan D., Swamy, Gokul, Brantley, Kianté, Lee, Jason D., and Sun, Wen
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Large Language Models (LLMs) have achieved remarkable success at tasks like summarization that involve a single turn of interaction. However, they can still struggle with multi-turn tasks like dialogue that require long-term planning. Previous works on multi-turn dialogue extend single-turn reinforcement learning from human feedback (RLHF) methods to the multi-turn setting by treating all prior dialogue turns as a long context. Such approaches suffer from covariate shift: the conversations in the training set have previous turns generated by some reference policy, which means that low training error may not necessarily correspond to good performance when the learner is actually in the conversation loop. In response, we introduce REgressing the RELative FUture (REFUEL), an efficient policy optimization approach designed to address multi-turn RLHF in LLMs. REFUEL employs a single model to estimate $Q$-values and trains on self-generated data, addressing the covariate shift issue. REFUEL frames the multi-turn RLHF problem as a sequence of regression tasks on iteratively collected datasets, enabling ease of implementation. Theoretically, we prove that REFUEL can match the performance of any policy covered by the training set. Empirically, we evaluate our algorithm by using Llama-3.1-70B-it to simulate a user in conversation with our model. REFUEL consistently outperforms state-of-the-art methods such as DPO and REBEL across various settings. Furthermore, despite having only 8 billion parameters, Llama-3-8B-it fine-tuned with REFUEL outperforms Llama-3.1-70B-it on long multi-turn dialogues. Implementation of REFUEL can be found at https://github.com/ZhaolinGao/REFUEL/, and models trained by REFUEL can be found at https://huggingface.co/Cornell-AGI.
Published: 2024

3. DiffSpec: Differential Testing with LLMs using Natural Language Specifications and Code Artifacts

Author: Rao, Nikitha, Gilbert, Elizabeth, Ramananandro, Tahina, Swamy, Nikhil, Le Goues, Claire, and Fakhoury, Sarah
Subjects: Computer Science - Software Engineering
Abstract: Differential testing can be an effective way to find bugs in software systems with multiple implementations that conform to the same specification, like compilers, network protocol parsers, and language runtimes. Specifications for such systems are often standardized in natural language documents, like Instruction Set Architecture (ISA) specifications, Wasm specifications or IETF RFCs. Large Language Models (LLMs) have demonstrated potential in both generating tests and handling large volumes of natural language text, making them well-suited for utilizing artifacts like specification documents, bug reports, and code implementations. In this work, we leverage natural language and code artifacts to guide LLMs to generate targeted, meaningful tests that highlight meaningful behavioral differences between implementations, including those corresponding to bugs. We introduce DiffSpec, a framework for generating differential tests with LLMs using prompt chaining. We demonstrate the efficacy of DiffSpec on two different systems, namely, eBPF runtimes and Wasm validators. Using DiffSpec, we generated 359 differentiating tests, uncovering at least four distinct and confirmed bugs in eBPF, including a kernel memory leak, inconsistent behavior in jump instructions, and undefined behavior when using the stack pointer. We also found 279 differentiating tests in Wasm validators that point to at least two confirmed and fixed bugs in Wizard Engine.
Published: 2024

4. From Explanations to Action: A Zero-Shot, Theory-Driven LLM Framework for Student Performance Feedback

Author: Swamy, Vinitra, Romano, Davide, Desikan, Bhargav Srinivasa, Camburu, Oana-Maria, and Käser, Tanja
Subjects: Computer Science - Computers and Society, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
Abstract: Recent advances in eXplainable AI (XAI) for education have highlighted a critical challenge: ensuring that explanations for state-of-the-art AI models are understandable for non-technical users such as educators and students. In response, we introduce iLLuMinaTE, a zero-shot, chain-of-prompts LLM-XAI pipeline inspired by Miller's cognitive model of explanation. iLLuMinaTE is designed to deliver theory-driven, actionable feedback to students in online courses. iLLuMinaTE navigates three main stages - causal connection, explanation selection, and explanation presentation - with variations drawing from eight social science theories (e.g. Abnormal Conditions, Pearl's Model of Explanation, Necessity and Robustness Selection, Contrastive Explanation). We extensively evaluate 21,915 natural language explanations of iLLuMinaTE extracted from three LLMs (GPT-4o, Gemma2-9B, Llama3-70B), with three different underlying XAI methods (LIME, Counterfactuals, MC-LIME), across students from three diverse online courses. Our evaluation involves analyses of explanation alignment to the social science theory, understandability of the explanation, and a real-world user preference study with 114 university students containing a novel actionability simulation. We find that students prefer iLLuMinaTE explanations over traditional explainers 89.52% of the time. Our work provides a robust, ready-to-use framework for effectively communicating hybrid XAI-driven insights in education, with significant generalization potential for other human-centric fields.
Published: 2024

5. myRESEARCHpath: An Interactive Tool for Investigators and Research Administrators

Author: Jamie S. Wylie, Rebecca J. Namenek Brouwer, Derek M. Jones, and Geeta K. Swamy
Abstract: Research-intensive institutions rely on specialized central offices to support research administrators and investigators through various processes and requirements. This helps researchers successfully and compliantly conduct and manage research. However, when these support offices communicate their processes and resources from disparate locations, it can be challenging for research administrators and investigators to locate what they need at the time they need it, and to understand how this information relates to that provided by other research support offices. This can result in research administrators and investigators lacking a clear understanding of critical information and an underutilization of available support. Duke University sought to address this issue by developing a web-based interactive research roadmap to consolidate and organize information from research support offices around the institution. In this roadmap, all support office content is integrated by topic and organized across the research project life cycle. To achieve this, a dedicated project team (1) convened the research support offices to develop integrated content and a process for contributing their resources on the website, (2) solicited researcher feedback to determine the critical features and functionality of the site, (3) engaged a technical development partner to build the site, (4) engaged researchers for beta-testing, and (5) devised a communication strategy to raise awareness and adoption of the site. The interactive research roadmap, "myRESEARCHpath," launched in 2021, and has experienced steady growth in utilization. Initial data shows that users are accessing the site to find relevant research information and guidance, and research support offices are encouraged by the improved discoverability of resources and services. This model of a single location to access research support office information needed to navigate the research project life cycle could be beneficial for other research-intensive institutions.
Published: 2024

6. Approximation Algorithms for Correlated Knapsack Orienteering

Author: Espinosa, David Aleman and Swamy, Chaitanya
Subjects: Computer Science - Data Structures and Algorithms, Computer Science - Discrete Mathematics, F.2.2, G.1.6, G.2
Abstract: We consider the {\em correlated knapsack orienteering} (CSKO) problem: we are given a travel budget $B$, processing-time budget $W$, finite metric space $(V,d)$ with root $\rho\in V$, where each vertex is associated with a job with possibly correlated random size and random reward that become known only when the job completes. Random variables are independent across different vertices. The goal is to compute a $\rho$-rooted path of length at most $B$, in a possibly adaptive fashion, that maximizes the reward collected from jobs that are processed by time $W$. To our knowledge, CSKO has not been considered before, though prior work has considered the uncorrelated problem, {\em stochastic knapsack orienteering}, and {\em correlated orienteering}, which features only one budget constraint on the {\em sum} of travel-time and processing-times. We show that the {\em adaptivity gap of CSKO is not a constant, and is at least $\Omega\bigl(\max\{\sqrt{\log{B}},\sqrt{\log\log{W}}\}\bigr)$}. Complementing this, we devise {\em non-adaptive} algorithms that obtain: (a) $O(\log\log W)$-approximation in quasi-polytime; and (b) $O(\log W)$-approximation in polytime. We obtain similar guarantees for CSKO with cancellations, wherein a job can be cancelled before its completion time, foregoing its reward. We also consider the special case of CSKO, wherein job sizes are weighted Bernoulli distributions, and more generally where the distributions are supported on at most two points (2-CSKO). Although weighted Bernoulli distributions suffice to yield an $\Omega(\sqrt{\log\log B})$ adaptivity-gap lower bound for (uncorrelated) {\em stochastic orienteering}, we show that they are easy instances for CSKO. We develop non-adaptive algorithms that achieve $O(1)$-approximation in polytime for weighted Bernoulli distributions, and in $(n+\log B)^{O(\log W)}$-time for the more general case of 2-CSKO.
Comment: Full version of APPROX 2024 paper
Published: 2024

7. IDNet: A Novel Dataset for Identity Document Analysis and Fraud Detection

Author: Guan, Hong, Wang, Yancheng, Xie, Lulu, Nag, Soham, Goel, Rajeev, Swamy, Niranjan Erappa Narayana, Yang, Yingzhen, Xiao, Chaowei, Prisby, Jonathan, Maciejewski, Ross, and Zou, Jia
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Multimedia
Abstract: Effective fraud detection and analysis of government-issued identity documents, such as passports, driver's licenses, and identity cards, are essential in thwarting identity theft and bolstering security on online platforms. The training of accurate fraud detection and analysis tools depends on the availability of extensive identity document datasets. However, current publicly available benchmark datasets for identity document analysis, including MIDV-500, MIDV-2020, and FMIDV, fall short in several respects: they offer a limited number of samples, cover insufficient varieties of fraud patterns, and seldom include alterations in critical personal identifying fields like portrait images, limiting their utility in training models capable of detecting realistic frauds while preserving privacy. In response to these shortcomings, our research introduces a new benchmark dataset, IDNet, designed to advance privacy-preserving fraud detection efforts. The IDNet dataset comprises 837,060 images of synthetically generated identity documents, totaling approximately 490 gigabytes, categorized into 20 types from 10 U.S. states and 10 European countries. We evaluate the utility and present use cases of the dataset, illustrating how it can aid in training privacy-preserving fraud detection methods, facilitating the generation of camera and video capturing of identity documents, and testing schema unification and other identity document management functionalities.
Comment: 40 pages
Published: 2024

8. PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images

Author: Sharifi, Parastoo Sotoudeh, Ahmad, M. Omair, and Swamy, M. N. S.
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent advancements in deep learning (DL) have significantly advanced medical image analysis. In the field of medical image processing, particularly in histopathology image analysis, the variation in staining protocols and differences in scanners present significant domain shift challenges and undermine the generalization capabilities of models on data from unseen domains, prompting the need for effective domain generalization (DG) strategies to improve the consistency and reliability of automated cancer detection tools in diagnostic decision-making. In this paper, we introduce Pathology Weight Averaging (PathoWAve), a multi-source DG strategy for addressing the domain shift phenomenon of DL models in histopathology image analysis. Integrating a specific weight averaging technique with parallel training trajectories and a strategic combination of regular augmentations with histopathology-specific data augmentation methods, PathoWAve enables a comprehensive exploration and precise convergence within the loss landscape. This method significantly enhances the generalization capabilities of DL models across new, unseen histopathology domains. To the best of our knowledge, PathoWAve is the first proposed weight averaging method for DG in histopathology image analysis. Our quantitative results on the Camelyon17 WILDS dataset demonstrate PathoWAve's superiority over previously proposed methods to tackle the domain shift phenomenon in histopathology image processing. Our code is available at https://github.com/ParastooSotoudeh/PathoWAve.
Published: 2024

9. EvIL: Evolution Strategies for Generalisable Imitation Learning

Author: Sapora, Silvia, Swamy, Gokul, Lu, Chris, Teh, Yee Whye, and Foerster, Jakob Nicolaus
Subjects: Computer Science - Neural and Evolutionary Computing, Computer Science - Machine Learning
Abstract: Oftentimes in imitation learning (IL), the environment we collect expert demonstrations in and the environment we want to deploy our learned policy in aren't exactly the same (e.g. demonstrations collected in simulation but deployment in the real world). Compared to policy-centric approaches to IL like behavioural cloning, reward-centric approaches like inverse reinforcement learning (IRL) often better replicate expert behaviour in new environments. This transfer is usually performed by optimising the recovered reward under the dynamics of the target environment. However, (a) we find that modern deep IL algorithms frequently recover rewards which induce policies far weaker than the expert, even in the same environment the demonstrations were collected in. Furthermore, (b) these rewards are often quite poorly shaped, necessitating extensive environment interaction to optimise effectively. We provide simple and scalable fixes to both of these concerns. For (a), we find that reward model ensembles combined with a slightly different training objective significantly improve re-training and transfer performance. For (b), we propose a novel evolution-strategies based method EvIL to optimise for a reward-shaping term that speeds up re-training in the target environment, closing a gap left open by the classical theory of IRL. On a suite of continuous control tasks, we are able to re-train policies in target (and source) environments more interaction-efficiently than prior work.
Comment: 17 pages, 8 figures, ICML 2024
Published: 2024

10. Highly Connected Graph Partitioning: Exact Formulation and Solution Methods

Author: Swamy, Rahul, King, Douglas M., and Jacobson, Sheldon H.
Subjects: Computer Science - Discrete Mathematics, Mathematics - Optimization and Control
Abstract: Graph partitioning (GP) and vertex connectivity have traditionally been two distinct fields of study. This paper introduces the highly connected graph partitioning (HCGP) problem, which partitions a graph into compact, size-balanced, and $Q$-(vertex) connected parts for any $Q\geq 1$. This problem is valuable in applications that seek cohesion and fault-tolerance within their parts, such as community detection in social networks and resiliency-focused partitioning of power networks. Existing research in this fundamental interconnection primarily focuses on providing theoretical existence guarantees of highly connected partitions for a limited set of dense graphs, and does not include canonical GP considerations such as size balance and compactness. This paper's key contribution is providing a general modeling and algorithmic approach for HCGP, inspired by recent work in the political districting problem, a special case of HCGP with $Q=1$. This approach models $Q$-connectivity constraints as mixed integer programs for any $Q\geq 1$ and provides an efficient branch-and-cut method to solve HCGP. When solution time is a priority over optimality, this paper provides a heuristic method specifically designed for HCGP with $Q=2$. A computational analysis evaluates these methods using a test bed of instances from various real-world graphs. In this analysis, the branch-and-cut method finds an optimal solution within one hour in 82.8% of the instances solved. For $Q=2$, small and sparse instances are challenging for the heuristic, whereas large and sparse instances are challenging for the exact method. Furthermore, this study quantifies the computational cost of ensuring higher connectivity using the branch-and-cut approach, compared to a baseline of ensuring $1$-connectivity. Overall, this work serves as an effective tool to partition a graph into resilient and cohesive parts.
Published: 2024

11. Multi-Agent Imitation Learning: Value is Easy, Regret is Hard

Author: Tang, Jingwu, Swamy, Gokul, Fang, Fei, and Wu, Zhiwei Steven
Subjects: Computer Science - Machine Learning
Abstract: We study a multi-agent imitation learning (MAIL) problem where we take the perspective of a learner attempting to coordinate a group of agents based on demonstrations of an expert doing so. Most prior work in MAIL essentially reduces the problem to matching the behavior of the expert within the support of the demonstrations. While doing so is sufficient to drive the value gap between the learner and the expert to zero under the assumption that agents are non-strategic, it does not guarantee robustness to deviations by strategic agents. Intuitively, this is because strategic deviations can depend on a counterfactual quantity: the coordinator's recommendations outside of the state distribution their recommendations induce. In response, we initiate the study of an alternative objective for MAIL in Markov Games we term the regret gap that explicitly accounts for potential deviations by agents in the group. We first perform an in-depth exploration of the relationship between the value and regret gaps. First, we show that while the value gap can be efficiently minimized via a direct extension of single-agent IL algorithms, even value equivalence can lead to an arbitrarily large regret gap. This implies that achieving regret equivalence is harder than achieving value equivalence in MAIL. We then provide a pair of efficient reductions to no-regret online convex optimization that are capable of minimizing the regret gap (a) under a coverage assumption on the expert (MALICE) or (b) with access to a queryable expert (BLADES).
Published: 2024

12. The Importance of Online Data: Understanding Preference Fine-tuning via Coverage

Author: Song, Yuda, Swamy, Gokul, Singh, Aarti, Bagnell, J. Andrew, and Sun, Wen
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Learning from human preference data has emerged as the dominant paradigm for fine-tuning large language models (LLMs). The two most common families of techniques -- online reinforcement learning (RL) such as Proximal Policy Optimization (PPO) and offline contrastive methods such as Direct Preference Optimization (DPO) -- were positioned as equivalent in prior work due to the fact that both have to start from the same offline preference dataset. To further expand our theoretical understanding of the similarities and differences between online and offline techniques for preference fine-tuning, we conduct a rigorous analysis through the lens of dataset coverage, a concept that captures how the training data covers the test distribution and is widely used in RL. We prove that a global coverage condition is both necessary and sufficient for offline contrastive methods to converge to the optimal policy, but a weaker partial coverage condition suffices for online RL methods. This separation provides one explanation of why online RL methods can perform better than offline methods, especially when the offline preference data is not diverse enough. Finally, motivated by our preceding theoretical observations, we derive a hybrid preference optimization (HyPO) algorithm that uses offline data for contrastive-based preference optimization and online data for KL regularization. Theoretically and empirically, we demonstrate that HyPO is more performant than its pure offline counterpart DPO, while still preserving its computation and memory efficiency.
Published: 2024

13. Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning

Author: Gado, Elena Grazia, Martorella, Tommaso, Zunino, Luca, Mejia-Domenzain, Paola, Swamy, Vinitra, Frej, Jibril, and Käser, Tanja
Subjects: Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Machine Learning
Abstract: Intelligent Tutoring Systems (ITS) enhance personalized learning by predicting student answers to provide immediate and customized instruction. However, recent research has primarily focused on the correctness of the answer rather than the student's performance on specific answer choices, limiting insights into students' thought processes and potential misconceptions. To address this gap, we present MCQStudentBert, an answer forecasting model that leverages the capabilities of Large Language Models (LLMs) to integrate contextual understanding of students' answering history along with the text of the questions and answers. By predicting the specific answer choices students are likely to make, practitioners can easily extend the model to new answer choices or remove answer choices for the same multiple-choice question (MCQ) without retraining the model. In particular, we compare MLP, LSTM, BERT, and Mistral 7B architectures to generate embeddings from students' past interactions, which are then incorporated into a finetuned BERT's answer-forecasting mechanism. We apply our pipeline to a dataset of language learning MCQ, gathered from an ITS with over 10,000 students to explore the predictive accuracy of MCQStudentBert, which incorporates student interaction patterns, in comparison to correct answer prediction and traditional mastery-learning feature-based approaches. This work opens the door to more personalized content, modularization, and granular support.
Comment: Accepted as a poster paper at EDM 2024: 17th International Conference on Educational Data Mining in Atlanta, USA
Published: 2024

14. Interpret3C: Interpretable Student Clustering Through Individualized Feature Selection

Author: Salles, Isadora, Mejia-Domenzain, Paola, Swamy, Vinitra, Blackwell, Julian, and Käser, Tanja
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Computers and Society, Computer Science - Machine Learning
Abstract: Clustering in education, particularly in large-scale online environments like MOOCs, is essential for understanding and adapting to diverse student needs. However, the effectiveness of clustering depends on its interpretability, which becomes challenging with high-dimensional data. Existing clustering approaches often neglect individual differences in feature importance and rely on a homogenized feature set. Addressing this gap, we introduce Interpret3C (Interpretable Conditional Computation Clustering), a novel clustering pipeline that incorporates interpretable neural networks (NNs) in an unsupervised learning context. This method leverages adaptive gating in NNs to select features for each student. Then, clustering is performed using the most relevant features per student, enhancing clusters' relevance and interpretability. We use Interpret3C to analyze the behavioral clusters considering individual feature importances in a MOOC with over 5,000 students. This research contributes to the field by offering a scalable, robust clustering methodology and an educational case study that respects individual student differences and improves interpretability for high-dimensional data.
Comment: Accepted as a LBR paper at AIED 2024: The 25th International Conference on Artificial Intelligence in Education on July 8-12 in Recife, Brazil
Published: 2024

15. Synchronization of E. coli bacteria moving in coupled wells

Author: Japaridze, Aleksandre, Struijk, Victor, Swamy, Kushal, Roslon, Irek, Shoshani, Oriel, Dekker, Cees, and Alijani, Farbod
Subjects: Nonlinear Sciences - Adaptation and Self-Organizing Systems, Condensed Matter - Other Condensed Matter, Physics - Biological Physics
Abstract: Synchronization plays a crucial role in the dynamics of living organisms, from fireflies flashing in unison to pacemaker cells that jointly generate heartbeats. Uncovering the mechanism behind these phenomena requires an understanding of individual biological oscillators and the coupling forces between them. Here, we develop a single-cell assay that studies rhythmic behavior in the motility of individual E. coli cells that can be mutually synchronized. Circular microcavities are used to isolate E. coli cells that swim along the cavity wall, resulting in self-sustained oscillations. Upon connecting these cavities by microchannels, the bacterial motions can be coupled, yielding nonlinear dynamic synchronization patterns with phase slips. We demonstrate that the coordinated movement observed in coupled E. coli oscillators follows mathematical rules of synchronization which we use to quantify the coupling strength. These findings advance our understanding of motility in confinement, and lay the foundation for engineering desired dynamics in microbial active matter.
Published: 2024

16. Towards Neural Synthesis for SMT-Assisted Proof-Oriented Programming

Author: Chakraborty, Saikat, Ebner, Gabriel, Bhat, Siddharth, Fakhoury, Sarah, Fatima, Sakina, Lahiri, Shuvendu, and Swamy, Nikhil
Subjects: Computer Science - Programming Languages, Computer Science - Artificial Intelligence, Computer Science - Software Engineering
Abstract: Proof-oriented programs mix computational content with proofs of program correctness. However, the human effort involved in programming and proving is still substantial, despite the use of Satisfiability Modulo Theories (SMT) solvers to automate proofs in languages such as F*. Seeking to spur research on using AI to automate the construction of proof-oriented programs, we curate a dataset of 600K lines of open-source F* programs and proofs, including software used in production systems ranging from Windows and Linux to Python and Firefox. Our dataset includes around 32K top-level F* definitions, each representing a type-directed program and proof synthesis problem producing a definition given a formal specification expressed as an F* type. We provide a program fragment checker that queries F* to check the correctness of candidate solutions. We also report on an extended version of our dataset containing a total of 940K lines of programs and proofs, with a total of 54K top-level F* definitions. We believe this is the largest corpus of SMT-assisted program proofs coupled with a reproducible program-fragment checker. Grounded in this dataset, we investigate the use of AI to synthesize programs and their proofs in F*, with promising results. Our main finding is that the performance of fine-tuned smaller language models (such as Phi-2 or StarCoder) compares favorably with large language models (such as GPT-4), at a much lower computational cost. We also identify various type-based retrieval augmentation techniques and find that they boost performance significantly. With detailed error analysis and case studies, we identify potential strengths and weaknesses of models and techniques and suggest directions for future improvements.
Comment: 47th International Conference on Software Engineering
Published: 2024

17. REBEL: Reinforcement Learning via Regressing Relative Rewards

Author: Gao, Zhaolin, Chang, Jonathan D., Zhan, Wenhao, Oertell, Owen, Swamy, Gokul, Brantley, Kianté, Joachims, Thorsten, Bagnell, J. Andrew, Lee, Jason D., and Sun, Wen
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: While originally developed for continuous control problems, Proximal Policy Optimization (PPO) has emerged as the work-horse of a variety of reinforcement learning (RL) applications, including the fine-tuning of generative models. Unfortunately, PPO requires multiple heuristics to enable stable convergence (e.g. value networks, clipping), and is notorious for its sensitivity to the precise implementation of these components. In response, we take a step back and ask what a minimalist RL algorithm for the era of generative models would look like. We propose REBEL, an algorithm that cleanly reduces the problem of policy optimization to regressing the relative reward between two completions to a prompt in terms of the policy, enabling strikingly lightweight implementation. In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL, which allows us to match the strongest known theoretical guarantees in terms of convergence and sample complexity in the RL literature. REBEL can also cleanly incorporate offline data and be extended to handle the intransitive preferences we frequently see in practice. Empirically, we find that REBEL provides a unified approach to language modeling and image generation with stronger or similar performance as PPO and DPO, all while being simpler to implement and more computationally efficient than PPO. When fine-tuning Llama-3-8B-Instruct, REBEL achieves strong performance in AlpacaEval 2.0, MT-Bench, and Open LLM Leaderboard.
Comment: New experimental results on general chat
Published: 2024

18. Emerging Advancements in 6G NTN Radio Access Technologies: An Overview

Author: Shahid, Husnain, Amatetti, Carla, Campana, Riccardo, Tong, Sorya, Panaitopol, Dorin, Coralli, Alessandro Vanelli, Mohamed, Abdelhamed, Zhang, Chao, Khalifa, Ebraam, Medeiros, Eduardo, Recayte, Estefania, Ghasemifard, Fatemeh, Lianghai, Ji, Bucheli, Juan, Swamy, Karthik Anantha, Caus, Marius, Gurelli, Mehmet, Vazquez, Miguel A., Shaat, Musbah, Borios, Nathan, Eriksson, Per-Erik, Euler, Sebastian, Li, Zheng, and Fu, Xiaotian
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: The efforts on the development, standardization and improvement of communication systems towards 5G Advanced and 6G are on track to provide benefits such as an unprecedented level of connectivity and performance, enabling a diverse range of vertical services. The full integration of non-terrestrial components into 6G plays a pivotal role in realizing this paradigm shift towards ubiquitous communication and global coverage. However, this integration into 6G brings its own set of challenges, particularly in Radio Access Technologies (RATs). To this end, this paper comprehensively discusses those challenges at different levels of RATs and proposes corresponding potential emerging advancements in the realm of 6G NTN. In particular, the focus is on advancing the prospective aspects of Radio Resource Management (RRM), spectral coexistence in terrestrial and non-terrestrial components, and flexible waveform design solutions to combat the impediments. This discussion, with its specific focus on emerging advancements in 6G NTN RATs, is critical for shaping next-generation networks and potentially relevant to standardization in forthcoming releases. Comment: accepted in 2024 EuCNC and 6G Summit, Antwerp, Belgium, 3-6 June 2024
Published: 2024

19. 3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers

Author: Fakhoury, Sarah, Kuppe, Markus, Lahiri, Shuvendu K., Ramananandro, Tahina, and Swamy, Nikhil
Subjects: Computer Science - Software Engineering
Abstract: Improper parsing of attacker-controlled input is a leading source of software security vulnerabilities, especially when programmers transcribe informal format descriptions in RFCs into efficient parsing logic in low-level, memory unsafe languages. Several researchers have proposed formal specification languages for data formats from which efficient code can be extracted. However, distilling informal requirements into formal specifications is challenging and, despite their benefits, new, formal languages are hard for people to learn and use. In this work, we present 3DGen, a framework that makes use of AI agents to transform mixed informal input, including natural language documents (i.e., RFCs) and example inputs into format specifications in a language called 3D. To support humans in understanding and trusting the generated specifications, 3DGen uses symbolic methods to also synthesize test inputs that can be validated against an external oracle. Symbolic test generation also helps in distinguishing multiple plausible solutions. Through a process of repeated refinement, 3DGen produces a 3D specification that conforms to a test suite, and which yields safe, efficient, provably correct, parsing code in C. We have evaluated 3DGen on 20 Internet standard formats, demonstrating the potential for AI-agents to produce formally verified C code at a non-trivial scale. A key enabler is the use of a domain-specific language to limit AI outputs to a class for which automated, symbolic analysis is tractable.
Published: 2024

20. Polymer-based supporting materials and polymer-encapsulated phase change materials for thermal energy storage: A review on the recent advances of materials, synthesis, and characterization techniques

Author: Nagar, Sumit and Sreenivasa, Swamy
Subjects: Polymer industry -- Product development, Refrigerators -- Usage -- Product development, Polymers -- Product development -- Usage, Heat storage -- Usage -- Analysis, Force and energy -- Usage -- Analysis, Engineering and manufacturing industries, Science and technology
Abstract: Phase change materials (PCMs) can be classified as smart materials with applications in varied fields such as domestic and commercial refrigerators, solar absorption chillers, air conditioning, free and radiative cooling, solar air heaters, solar stills, solar absorption cooling, electric and electronic devices for cooling purposes, and textiles. In this review, the various polymer-based and encapsulated PCMs used for these applications are discussed along with their varied synthesis/fabrication methods. Furthermore, chemical characterization by FTIR is discussed for understanding the chemical structure along with the functional groups present in the materials. Thermogravimetric analysis (TGA) is also critically discussed for understanding the thermal stability of the polymer PCM or the phase change composites, and the latent heat of PCM melting is explored by differential scanning calorimetry (DSC) for various PCMs, which gives insight into thermal energy storage capability and properties. The inbuilt surface structure of the polymer PCM is also investigated by scanning electron microscopy (SEM), which, when understood, gives a clear picture of the structure-property relationship. Highlights * Polymeric supporting materials and polymer encapsulation on PCM were reviewed. * The materials used and the synthesis of polymeric PCM were studied in depth. * Chemical characterization was reviewed for chemical structure. * Thermal stability checks by TGA and latent heat prediction by DSC were done. * Morphology was reviewed using SEM for the structure-property relationship. KEYWORDS characterizations, DSC, morphology, phase change materials, SEM, synthesis, TGA 1 | INTRODUCTION Phase change materials (PCM) are the substances that absorb and release huge amounts of energy during phase transition (fusion/melting) at constant temperatures. PCM have engrossed the attention [...]
Published: 2024
Full Text: View/download PDF

21. ZnO/poly(nigrosine)/modified carbon paste electrode for selective sensing of vanillin in the presence of amaranth: a voltammetric study

Author: Arpitha, S. B. and Swamy, B. E. Kumara
Published: 2024
Full Text: View/download PDF

22. Optimal Design of Low-Power Ultra-Wideband Low-Noise Transconductance Amplifier in 0.18 µm CMOS

Author: Yasmeen, W., Swamy, G. N., and Priya, K. Padma
Published: 2024
Full Text: View/download PDF

23. Plethysmograph Variability Index Values in Healthy Neonates – An Observational Pilot Study

Author: Narayanaswamy, Vindhya, Swamy, Ravi Shankar, Harohalli A, Venkatesh, and Nagesh N, Karthik
Published: 2024
Full Text: View/download PDF

24. Comprehensive Analysis of Antidiabetic Properties in Raphanus sativus Leaves: A Synergistic In-Silico and In-Vitro Approach

Author: Saha, Sakshar, Das, Pronoy Kanti, Dhiwar, Prasad Sanjay, Khanra, Ritu, Paul, Subham, Chatterjee, Atanu, and Matada, Gurubasavaraja Swamy Purawarga
Published: 2024
Full Text: View/download PDF

25. Advancements of anticancer agents by targeting the Hippo signalling pathway: biological activity, selectivity, docking analysis, and structure–activity relationship

Author: Haripriya, E., Hemalatha, K., Matada, Gurubasavaraja Swamy Purawarga, Pal, Rohit, Das, Pronoy Kanti, Ashadul Sk, M. D., Mounika, S., Viji, M. P., Aayishamma, I., and Jayashree, K. R.
Published: 2024
Full Text: View/download PDF

26. Block chain enabled Indian Agricultural supply chain using ISM DEMATEL approach

Author: Beloor, Vanishree, Vijaykumar, M., Swamy, D. R., and Navneeth, S.
Published: 2024
Full Text: View/download PDF

27. Wavelet Analysis and Machine Learning Approach for Improved Protection of PV-Wind-SVC Integrated Smart Power System

Author: Garika, Gantaiah Swamy and Kottala, Padma
Published: 2024
Full Text: View/download PDF

28. Experimental and numerical investigation of nanoparticle assisted PCM-based battery thermal management system

Author: Swamy, Kundrapu Ayyappa, Verma, Saket, and Bhattacharyya, Suvanjan
Published: 2024
Full Text: View/download PDF

29. Simulation of Solidification, Microsegregation, and Heat Treatment of Cr-Based Fe–xMn–7.5Al–1.0C Lightweight Steels

Author: Shetti, Swamy, Gandi, Appala Naidu, and Hasan, Sk Md
Published: 2024
Full Text: View/download PDF

30. Utilization of organic waste from Chinar leaves as sustainable and eco-friendly adsorbent for fluoride removal

Author: Dar, Firdous Ahmad and Kurella, Swamy
Published: 2024
Full Text: View/download PDF

31. Effects of Chemotherapy on Fertility and Fertility Preservation Strategies for the Women of Childbearing Potential Undergoing Chemotherapy: A Comprehensive Review

Author: Kapoor, Mayank, Swamy, Anusha Mruthyunjaya, Sundriyal, Deepak, Khanna, Mridul, Sinha, Nishant, J, Karthik, Rajaram, Shalini, and Sehrawat, Amit
Published: 2024
Full Text: View/download PDF

32. Assessing genetic diversity of indigenous turmeric (Curcuma longa L.) through inter-simple sequence repeat (ISSR) markers

Author: Gowda, M. R. Swamy, Soundarya, D., Hiremath, Channayya, and Shetty, Nandini P.
Published: 2024
Full Text: View/download PDF

33. Insights for Clinical Providers and Community Leaders: Unaccompanied Immigrant Children’s Mental Health Includes Caregiver Support

Author: Báez, Johanna Creswell, Swamy, Padma, Gutierrez, Adriana, Ortiz-Mejias, Ana, Othon, Jacquelyn, Roberts, Nohemi Garcia, and Misra, Sanghamitra
Published: 2024
Full Text: View/download PDF

34. Strength and microstructure characteristics of low-grade (LG) limestone-based cements for a sustainable concrete

Author: Tamma, Venugopal Reddy, Pancharathi, Rathish Kumar, Bibekananda, Mohapatra, and Pallapothu, Swamy Naga Ratna Giri
Published: 2024
Full Text: View/download PDF

35. Classification of diabetic retinopathy severity level using deep learning

Author: Durairaj, Santhi, Subramanian, Parvathi, and Swamy, Carmel Sobia Micheal
Published: 2024
Full Text: View/download PDF

36. Jamun Seed-Derived Nitrogen-Doped Carbon Dots: A Novel Microwave-Assisted Synthesis for Ultra-Bright Fluorescence and Mn⁷⁺ Detection

Author: Swathi, R., Reddy, G. Bhagavanth, Rajkumar, Bandi, Ramakrishna, Dadigala, and Swamy, P. Yadagiri
Published: 2024
Full Text: View/download PDF

37. Hybrid Inverse Reinforcement Learning

Author: Ren, Juntao, Swamy, Gokul, Wu, Zhiwei Steven, Bagnell, J. Andrew, and Choudhury, Sanjiban
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The inverse reinforcement learning approach to imitation learning is a double-edged sword. On the one hand, it can enable learning from a smaller number of expert demonstrations with more robustness to error compounding than behavioral cloning approaches. On the other hand, it requires that the learner repeatedly solve a computationally expensive reinforcement learning (RL) problem. Often, much of this computation is wasted searching over policies very dissimilar to the expert's. In this work, we propose using hybrid RL -- training on a mixture of online and expert data -- to curtail unnecessary exploration. Intuitively, the expert data focuses the learner on good states during training, which reduces the amount of exploration required to compute a strong policy. Notably, such an approach doesn't need the ability to reset the learner to arbitrary states in the environment, a requirement of prior work in efficient inverse RL. More formally, we derive a reduction from inverse RL to expert-competitive RL (rather than globally optimal RL) that allows us to dramatically reduce interaction during the inner policy search loop while maintaining the benefits of the IRL approach. This allows us to derive both model-free and model-based hybrid inverse RL algorithms with strong policy performance guarantees. Empirically, we find that our approaches are significantly more sample efficient than standard inverse RL and several other baselines on a suite of continuous control tasks.
Published: 2024

38. InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts

Author: Swamy, Vinitra, Montariol, Syrielle, Blackwell, Julian, Frej, Jibril, Jaggi, Martin, and Käser, Tanja
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society, Computer Science - Human-Computer Interaction
Abstract: Interpretability for neural networks is a trade-off between three key requirements: 1) faithfulness of the explanation (i.e., how perfectly it explains the prediction), 2) understandability of the explanation by humans, and 3) model performance. Most existing methods compromise one or more of these requirements; e.g., post-hoc approaches provide limited faithfulness, automatically identified feature masks compromise understandability, and intrinsically interpretable methods such as decision trees limit model performance. These shortcomings are unacceptable for sensitive applications such as education and healthcare, which require trustworthy explanations, actionable interpretations, and accurate predictions. In this work, we present InterpretCC (interpretable conditional computation), a family of interpretable-by-design neural networks that guarantee human-centric interpretability, while maintaining comparable performance to state-of-the-art models by adaptively and sparsely activating features before prediction. We extend this idea into an interpretable, global mixture-of-experts (MoE) model that allows humans to specify topics of interest, discretely separates the feature space for each data point into topical subnetworks, and adaptively and sparsely activates these topical subnetworks for prediction. We apply variations of the InterpretCC architecture for text, time series and tabular data across several real-world benchmarks, demonstrating comparable performance with non-interpretable baselines, outperforming interpretable-by-design baselines, and showing higher actionability and usefulness according to a user study.
Published: 2024

39. The Virtues of Pessimism in Inverse Reinforcement Learning

Author: Wu, David, Swamy, Gokul, Bagnell, J. Andrew, Wu, Zhiwei Steven, and Choudhury, Sanjiban
Subjects: Computer Science - Machine Learning
Abstract: Inverse Reinforcement Learning (IRL) is a powerful framework for learning complex behaviors from expert demonstrations. However, it traditionally requires repeatedly solving a computationally expensive reinforcement learning (RL) problem in its inner loop. It is desirable to reduce the exploration burden by leveraging expert demonstrations in the inner-loop RL. As an example, recent work resets the learner to expert states in order to inform the learner of high-reward expert states. However, such an approach is infeasible in the real world. In this work, we consider an alternative approach to speeding up the RL subroutine in IRL: pessimism, i.e., staying close to the expert's data distribution, instantiated via the use of offline RL algorithms. We formalize a connection between offline RL and IRL, enabling us to use an arbitrary offline RL algorithm to improve the sample efficiency of IRL. We validate our theory experimentally by demonstrating a strong correlation between the efficacy of an offline RL algorithm and how well it works as part of an IRL procedure. By using a strong offline RL algorithm as part of an IRL procedure, we are able to find policies that match expert performance significantly more efficiently than the prior art. Comment: This paper has been withdrawn by the authors pending edits from other authors
Published: 2024

40. A Minimaximalist Approach to Reinforcement Learning from Human Feedback

Author: Swamy, Gokul, Dann, Christoph, Kidambi, Rahul, Wu, Zhiwei Steven, and Agarwal, Alekh
Subjects: Computer Science - Machine Learning
Abstract: We present Self-Play Preference Optimization (SPO), an algorithm for reinforcement learning from human feedback. Our approach is minimalist in that it does not require training a reward model nor unstable adversarial training and is therefore rather simple to implement. Our approach is maximalist in that it provably handles non-Markovian, intransitive, and stochastic preferences while being robust to the compounding errors that plague offline approaches to sequential prediction. To achieve the preceding qualities, we build upon the concept of a Minimax Winner (MW), a notion of preference aggregation from the social choice theory literature that frames learning from preferences as a zero-sum game between two policies. By leveraging the symmetry of this game, we prove that rather than using the traditional technique of dueling two policies to compute the MW, we can simply have a single agent play against itself while maintaining strong convergence guarantees. Practically, this corresponds to sampling multiple trajectories from a policy, asking a preference or teacher model to compare them, and then using the proportion of wins as the reward for a particular trajectory. We demonstrate that on a suite of continuous control tasks, we are able to learn significantly more efficiently than reward-model based approaches while maintaining robustness to the intransitive and stochastic preferences that frequently occur in practice when aggregating human judgments.
Published: 2024

41. Plurality under BJP dominance

Author: Swamy, Arun
Published: 2024

42. LRRK2 is not required for lysozyme expression in Paneth cells

Author: Tasegian, Anna, Dikovskaya, Dina, Scott, Molly M., Chawla, Amanpreet Singh, Pemberton, Rebecca, Helps, Thomas, Meus, Tosca, McLean, Mairi H., and Swamy, Mahima
Published: 2024
Full Text: View/download PDF

43. A Clinic-Level Approach to Improve Uptake of First COVID-19 Vaccine Dose in Primary Care

Author: Swamy, Annemarie M., Kaufman, Noah, Lievers, Ernest, Tyler, Carrie, Veira, Olivia, Smith, Sofia Osio, Genies, Marquita C., Turtle, Melina, Matson, Pamela A., Kim, Julia M., and Marcell, Arik V.
Published: 2024
Full Text: View/download PDF

44. Influence of Heat Variation on Thermal and Mechanical Performance of Al-7075-Based Hybrid Composites

Author: Santhosh Kumar, B. M., Swamy, G. M., Aprameya, C. R., Bavan, Saravana, Venkatesh, B. N., Kumar, Prakash, and Nagaraja, T. K.
Published: 2024
Full Text: View/download PDF

45. Enhancing Tribological Characteristics of AA6061-SiC Composites via Response Surface Methodology

Author: Venkatesh, B. N., Hebbal, Umamaheshwar, Yogesha, K. B., Mruthunjaya, M., and Swamy, G. M.
Published: 2024
Full Text: View/download PDF

46. Preference and progeny development of stored product insects in response to grain characteristics of millets

Author: Swamy, S. V. S. Gopala, Raja, D. Sandeep, and Rao, V. Vasudeva
Published: 2024
Full Text: View/download PDF

47. Enhancement of Kevlar fiber-polypropylene composite by the inclusions of cotton stalk and granite particle: characteristics study

Author: Hangargi, Sumeet, Swamy, Amit, Raj, R. Gowtham, Aruna, M., Venkatesh, R., Madhu, S., Al Obaid, Sami, Alharbi, Sulaiman Ali, and Kalam, M. A.
Published: 2024
Full Text: View/download PDF

48. Numerical Evaluation of the Influence of Terrain Properties in Clay-Tire Interactions

Author: Swamy, Varsha S., Mason, Destiny, Yerro, Alba, Sandu, Corina, Sebeck, Katherine, Gorsich, David, Chiru, Anghel, editor, and Covaciu, Dinu, editor
Published: 2025
Full Text: View/download PDF

49. Predicting Rock Properties of Limestone Using Operating Parameters of Ball Mill

Author: Swamy, S. V., Kunar, B. M., Chandar, K. R., Bezaeva, Natalia S., Series Editor, Gomes Coe, Heloisa Helena, Series Editor, Nawaz, Muhammad Farrakh, Series Editor, Gorai, Amit Kumar, editor, Ram, Sahendra, editor, Bishwal, Ram Manohar, editor, and Bhowmik, Santanu, editor
Published: 2025
Full Text: View/download PDF

50. Autochthonous Plasmodium vivax Infections, Florida, USA, 2023

Author: Muneer, Azhar, Adapa, Swamy R., Silbert, Suzane, Scanlan, Kelly, Vore, Harold, Cannons, Andrew, Morrison, Andrea M., Stanek, Danielle, Blackmore, Carina, Adams, John H., Kim, Kami, Jiang, Rays H.Y., and Cui, Liwang
Subjects: Phylogeny -- Identification and classification -- Health aspects, Malaria -- Diagnosis -- Care and treatment -- Genetic aspects, Public health administration, Genome-wide association studies, Health
Abstract: Although commendable progress for combating malaria in endemic areas has been achieved and a dozen countries have been declared malaria-free since 2000 (1), increasing international travel has led to a [...]
Published: 2024
Full Text: View/download PDF
