Author: "Dhingra P." - Searchworks@Jio Institute Digital Library Search Results

Your search for '"Dhingra P."' returned 3,779 results

1. MatViX: Multimodal Information Extraction from Visually Rich Articles

Author: Khalighinejad, Ghazal, Scott, Sharon, Liu, Ollie, Anderson, Kelly L., Stureborg, Rickard, Tyagi, Aman, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language
Abstract: Multimodal information extraction (MIE) is crucial for scientific literature, where valuable data is often spread across text, figures, and tables. In materials science, extracting structured information from research articles can accelerate the discovery of new materials. However, the multimodal nature and complex interconnections of scientific content present challenges for traditional text-based methods. We introduce MatViX, a benchmark consisting of 324 full-length research articles and 1,688 complex structured JSON files, carefully curated by domain experts. These JSON files are extracted from text, tables, and figures in full-length documents, providing a comprehensive challenge for MIE. We introduce an evaluation method to assess the accuracy of curve similarity and the alignment of hierarchical structures. Additionally, we benchmark vision-language models (VLMs) in a zero-shot manner, capable of processing long contexts and multimodal inputs, and show that using a specialized model (DePlot) can improve performance in extracting curves. Our results demonstrate significant room for improvement in current models. Our dataset and evaluation code are available at https://matvix-bench.github.io/.
Published: 2024

2. Gradient Descent Efficiency Index

Author: Dhingra, Aviral
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Optimization and Control
Abstract: Gradient descent is a widely used iterative algorithm for finding local minima in multivariate functions. However, the final iterations often either overshoot the minima or make minimal progress, making it challenging to determine an optimal stopping point. This study introduces a new efficiency metric, Ek, designed to quantify the effectiveness of each iteration. The proposed metric accounts for both the relative change in error and the stability of the loss function across iterations. This measure is particularly valuable in resource-constrained environments, where costs are closely tied to training time. Experimental validation across multiple datasets and models demonstrates that Ek provides valuable insights into the convergence behavior of gradient descent, complementing traditional performance metrics. The index has the potential to guide more informed decisions in the selection and tuning of optimization algorithms in machine learning applications and be used to compare the "effectiveness" of models relative to each other. Comment: 12 pages, 3 figures
Published: 2024

3. Enhancing Large Language Models' Situated Faithfulness to External Contexts

Author: Huang, Yukun, Chen, Sanxing, Cai, Hongyi, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) are often augmented with external information as contexts, but this external information can sometimes be inaccurate or even intentionally misleading. We argue that robust LLMs should demonstrate situated faithfulness, dynamically calibrating their trust in external information based on their confidence in the internal knowledge and the external context. To benchmark this capability, we evaluate LLMs across several QA datasets, including a newly created dataset called RedditQA featuring in-the-wild incorrect contexts sourced from Reddit posts. We show that when provided with both correct and incorrect contexts, both open-source and proprietary models tend to overly rely on external information, regardless of its factual accuracy. To enhance situated faithfulness, we propose two approaches: Self-Guided Confidence Reasoning (SCR) and Rule-Based Confidence Reasoning (RCR). SCR enables models to self-assess the confidence of external information relative to their own internal knowledge to produce the most accurate answer. RCR, in contrast, extracts explicit confidence signals from the LLM and determines the final answer using predefined rules. Our results show that for LLMs with strong reasoning capabilities, such as GPT-4o and GPT-4o mini, SCR outperforms RCR, achieving improvements of up to 24.2% over a direct input augmentation baseline. Conversely, for a smaller model like Llama-3-8B, RCR outperforms SCR. Fine-tuning SCR with our proposed Confidence Reasoning Direct Preference Optimization (CR-DPO) method improves performance on both seen and unseen datasets, yielding an average improvement of 8.9% on Llama-3-8B. In addition to quantitative results, we offer insights into the relative strengths of SCR and RCR. Our findings highlight promising avenues for improving situated faithfulness in LLMs. The data and code are released.
Published: 2024

4. Real-time Fake News from Adversarial Feedback

Author: Chen, Sanxing, Huang, Yukun, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: We show that existing evaluations for fake news detection based on conventional sources, such as claims on fact-checking websites, result in an increasing accuracy over time for LLM-based detectors -- even after their knowledge cutoffs. This suggests that recent popular political claims, which form the majority of fake news on such sources, are easily classified using surface-level shallow patterns. Instead, we argue that a proper fake news detection dataset should test a model's ability to reason factually about the current world by retrieving and reading related evidence. To this end, we develop a novel pipeline that leverages natural language feedback from a RAG-based detector to iteratively modify real-time news into deceptive fake news that challenges LLMs. Our iterative rewrite decreases the binary classification AUC by an absolute 17.5 percent for a strong RAG GPT-4o detector. Our experiments reveal the important role of RAG in both detecting and generating fake news, as retrieval-free LLM detectors are vulnerable to unseen events and adversarial attacks, while feedback from RAG detection helps discover more deceitful patterns in fake news.
Published: 2024

5. GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings

Author: Thirukovalluru, Raghuveer and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Training-free embedding methods directly leverage pretrained large language models (LLMs) to embed text, bypassing the costly and complex procedure of contrastive learning. Previous training-free embedding methods have mainly focused on optimizing embedding prompts and have overlooked the benefits of utilizing the generative abilities of LLMs. We propose a novel method, GenEOL, which uses LLMs to generate diverse transformations of a sentence that preserve its meaning, and aggregates the resulting embeddings of these transformations to enhance the overall sentence embedding. GenEOL significantly outperforms the existing training-free embedding methods by an average of 2.85 points across several LLMs on the sentence semantic text similarity (STS) benchmark. Our analysis shows that GenEOL stabilizes representation quality across LLM layers and is robust to perturbations of embedding prompts. GenEOL also achieves notable gains on multiple clustering, reranking and pair-classification tasks from the MTEB benchmark.
Published: 2024

6. Evaluating Morphological Compositional Generalization in Large Language Models

Author: Ismayilzada, Mete, Circi, Defne, Sälevä, Jonne, Sirin, Hale, Köksal, Abdullatif, Dhingra, Bhuwan, Bosselut, Antoine, van der Plas, Lonneke, and Ataman, Duygu
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large language models (LLMs) have demonstrated significant progress in various natural language generation and understanding tasks. However, their linguistic generalization capabilities remain questionable, raising doubts about whether these models learn language similarly to humans. While humans exhibit compositional generalization and linguistic creativity in language use, the extent to which LLMs replicate these abilities, particularly in morphology, is under-explored. In this work, we systematically investigate the morphological generalization abilities of LLMs through the lens of compositionality. We define morphemes as compositional primitives and design a novel suite of generative and discriminative tasks to assess morphological productivity and systematicity. Focusing on agglutinative languages such as Turkish and Finnish, we evaluate several state-of-the-art instruction-finetuned multilingual models, including GPT-4 and Gemini. Our analysis shows that LLMs struggle with morphological compositional generalization particularly when applied to novel word roots, with performance declining sharply as morphological complexity increases. While models can identify individual morphological combinations better than chance, their performance lacks systematicity, leading to significant accuracy gaps compared to humans. Comment: 33 pages
Published: 2024

7. Low-regularity global solution of the inhomogeneous nonlinear Schrödinger equations in modulation spaces

Author: Bhimani, Divyang G., Dhingra, Diksha, and Sohani, Vijay Kumar
Subjects: Mathematics - Analysis of PDEs, 35Q55
Abstract: The study of low regularity Cauchy data for nonlinear dispersive PDEs has successfully been achieved using modulation spaces $M^{p,q}$ in recent years. In this paper, we study the inhomogeneous nonlinear Schrödinger equation (INLS) $$iu_t + \Delta u\pm |x|^{-b}|u|^{\alpha}u=0,$$ where $\alpha, b>0,$ on the whole space $\mathbb R^n$ in modulation spaces. In the subcritical regime $(0<\alpha< \frac{4-2b}{n}),$ we establish local well-posedness in $L^{2}+M^{\alpha+2,\frac{\alpha+2}{\alpha+1}}( \supset L^2 + H^s \ \text{for} \ s>\frac{n\alpha}{2(\alpha+2)}).$ By adapting Bourgain's high-low decomposition method, we establish global well-posedness in $M^{p,\frac{p}{p-1}}$ with $2
Published: 2024

8. Direct and indirect regulation of β-glucocerebrosidase by the transcription factors USF2 and ONECUT2.

Author: Ging, Kathi, Frick, Lukas, Schlachetzki, Johannes, Armani, Andrea, Zhu, Yanping, Gilormini, Pierre-André, Dhingra, Ashutosh, Böck, Desirée, Marques, Ana, Deen, Matthew, Chen, Xi, Serdiuk, Tetiana, Trevisan, Chiara, Sellitto, Stefano, Pisano, Claudio, Glass, Christopher, Heutink, Peter, Yin, Jiang-An, Vocadlo, David, and Aguzzi, Adriano
Abstract: Mutations in GBA1 encoding the lysosomal enzyme β-glucocerebrosidase (GCase) are among the most prevalent genetic susceptibility factors for Parkinson's disease (PD), with 10-30% of carriers developing the disease. To identify genetic modifiers contributing to the incomplete penetrance, we examined the effect of 1634 human transcription factors (TFs) on GCase activity in lysates of an engineered human glioblastoma line homozygous for the pathogenic GBA1 L444P variant. Using an arrayed CRISPR activation library, we uncovered 11 TFs as regulators of GCase activity. Among these, activation of MITF and TFEC increased lysosomal GCase activity in live cells, while activation of ONECUT2 and USF2 decreased it. While MITF, TFEC, and USF2 affected GBA1 transcription, ONECUT2 might control GCase trafficking. The effects of MITF, TFEC, and USF2 on lysosomal GCase activity were reproducible in iPSC-derived neurons from PD patients. Our study provides a systematic approach to identifying modulators of GCase activity and deepens our understanding of the mechanisms regulating GCase.
Published: 2024

9. HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration

Author: Dhingra, Pratyush, Doppa, Janardhan Rao, and Pande, Partha Pratim
Subjects: Computer Science - Hardware Architecture, Computer Science - Machine Learning, B.0
Abstract: Transformers have revolutionized deep learning and generative modeling to enable unprecedented advancements in natural language processing tasks and beyond. However, designing hardware accelerators for executing transformer models is challenging due to the wide variety of computing kernels involved in the transformer architecture. Existing accelerators are either inadequate to accelerate end-to-end transformer models or suffer notable thermal limitations. In this paper, we propose the design of a three-dimensional heterogeneous architecture referred to as HeTraX specifically optimized to accelerate end-to-end transformer models. HeTraX employs hardware resources aligned with the computational kernels of transformers and optimizes both performance and energy. Experimental results show that HeTraX outperforms existing state-of-the-art by up to 5.6x in speedup and improves EDP by 14.5x while ensuring thermal feasibility. Comment: Presented at ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED-24)
Published: 2024

10. On the origins of charge transport in spin crossover complexes

Author: Dhingra, Archit and Zaz, M. Zaid
Subjects: Condensed Matter - Materials Science, Condensed Matter - Other Condensed Matter
Abstract: Spin crossover (SCO) complexes are highly promising candidates for a myriad of potential applications in room-temperature electronics; however, as it stands, establishing a clear connection between their spin-state switching and transport properties has been far from trivial. In this letter, an effort to unravel the underlying charge transport mechanism in these SCO complexes, via a general theory, is made. The theory presented herein is aimed at providing a unifying picture that explains the widely different trends observed in the spin-crossover-dependent carrier transport properties in the SCO molecular thin film systems.
Published: 2024

11. A flexured-gimbal 3-axis force-torque sensor reveals minimal cross-axis coupling in an insect-sized flapping-wing robot

Author: Weber, Aaron, Dhingra, Daksh, and Fuller, Sawyer B.
Subjects: Computer Science - Robotics, Electrical Engineering and Systems Science - Systems and Control
Abstract: The mechanical complexity of flapping wings, their unsteady aerodynamic flow, and the challenge of making measurements at the scale of a sub-gram flapping-wing flying insect robot (FIR) make its behavior hard to predict. Knowing the precise mapping from voltage input to torque output, however, can be used to improve their mechanical and flight controller design. To address this challenge, we created a sensitive force-torque sensor based on a flexured gimbal that only requires a standard motion capture system or accelerometer for readout. Our device precisely and accurately measures pitch and roll torques simultaneously, as well as thrust, on a tethered flapping-wing FIR in response to changing voltage input signals. With it, we were able to measure cross-axis coupling of both torque and thrust input commands on a 180 mg FIR, the UW Robofly. We validated these measurements using free-flight experiments. Our results showed that roll and pitch have maximum cross-axis coupling errors of 8.58% and 17.24%, respectively, relative to the range of torque that is possible. Similarly, varying the pitch and roll commands resulted in up to a 5.78% deviation from the commanded thrust, across the entire commanded torque range. Our system, the first to measure two torque axes simultaneously, shows that torque commands have a negligible cross-axis coupling on both torque and thrust. Comment: This work has been submitted to the IEEE for possible publication
Published: 2024

12. Modeling and LQR Control of Insect Sized Flapping Wing Robot

Author: Dhingra, Daksh, Kaheman, Kadierdan, and Fuller, Sawyer B.
Subjects: Computer Science - Robotics, Mathematics - Optimization and Control
Abstract: Flying insects can perform rapid, sophisticated maneuvers like backflips, sharp banked turns, and in-flight collision recovery. To emulate these in aerial robots weighing less than a gram, known as flying insect robots (FIRs), a fast and responsive control system is essential. To date, these have largely been, at their core, elaborations of proportional-integral-derivative (PID)-type feedback control. Without exception, their gains have been painstakingly tuned by hand. Aggressive maneuvers have further required task-specific tuning. Optimal control has the potential to mitigate these issues, but has to date only been demonstrated using approximate models and receding horizon controllers (RHC) that are too computationally demanding to be carried out onboard the robot. Here we used a more accurate stroke-averaged model of forces and torques to implement the first demonstration of optimal control on an FIR that is computationally efficient enough to be performed by a microprocessor carried onboard. We took force and torque measurements from a 150 mg FIR, the UW Robofly, using a custom-built sensitive force-torque sensor, and validated them using motion capture data in free flight. We demonstrated stable hovering (RMS error of about 4 cm) and trajectory tracking maneuvers at translational velocities up to 25 cm/s using an optimal linear quadratic regulator (LQR). These results were enabled by a more accurate model and lay the foundation for future work that uses our improved model and optimal controller in conjunction with recent advances in low-power receding horizon control to perform accurate aggressive maneuvers without iterative, task-specific tuning. Comment: The video of the results can be accessed using www.youtube.com/watch?v=0o7j1nS2KHA
Published: 2024

13. ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods

Author: Xie, Roy, Wang, Junlin, Huang, Ruomin, Zhang, Minxing, Ge, Rong, Pei, Jian, Gong, Neil Zhenqiang, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: The rapid scaling of large language models (LLMs) has raised concerns about the transparency and fair use of the pretraining data used for training them. Detecting such content is challenging due to the scale of the data and limited exposure of each instance during training. We propose ReCaLL (Relative Conditional Log-Likelihood), a novel membership inference attack (MIA) to detect LLMs' pretraining data by leveraging their conditional language modeling capabilities. ReCaLL examines the relative change in conditional log-likelihoods when prefixing target data points with non-member context. Our empirical findings show that conditioning member data on non-member prefixes induces a larger decrease in log-likelihood compared to non-member data. We conduct comprehensive experiments and show that ReCaLL achieves state-of-the-art performance on the WikiMIA dataset, even with random and synthetic prefixes, and can be further improved using an ensemble approach. Moreover, we conduct an in-depth analysis of LLMs' behavior with different membership contexts, providing insights into how LLMs leverage membership information for effective inference at both the sequence and token level.
Published: 2024
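
The relative-change idea described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the ratio form used here, the function names, and the use of average token log-probabilities are all assumptions made for illustration, not the authors' implementation.

```python
def avg_log_likelihood(token_logprobs):
    """Average per-token log-likelihood of a sequence (hypothetical helper)."""
    if not token_logprobs:
        raise ValueError("token_logprobs must be non-empty")
    return sum(token_logprobs) / len(token_logprobs)


def relative_conditional_score(target_logprobs, target_logprobs_given_prefix):
    """Ratio of conditional to unconditional average log-likelihood.

    Assumption: "relative change" is measured as a ratio. Log-likelihoods
    are negative, so a larger drop after adding a non-member prefix gives
    a larger ratio.
    """
    unconditional = avg_log_likelihood(target_logprobs)
    conditional = avg_log_likelihood(target_logprobs_given_prefix)
    if unconditional == 0:
        raise ValueError("unconditional log-likelihood must be non-zero")
    return conditional / unconditional


# Toy numbers only: the target becomes less likely once the prefix is added.
score = relative_conditional_score([-1.0, -3.0], [-2.0, -4.0])
```

In this sketch a higher score would be read as weak evidence of membership; any decision threshold would have to be chosen on held-out data.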

14. Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications

Author: Wang, Junlin, Yang, Tianyi, Xie, Roy, and Dhingra, Bhuwan
Subjects: Computer Science - Cryptography and Security, Computer Science - Computation and Language
Abstract: With the proliferation of LLM-integrated applications such as GPT-s, millions are deployed, offering valuable services through proprietary instruction prompts. These systems, however, are prone to prompt extraction attacks through meticulously designed queries. To help mitigate this problem, we introduce the Raccoon benchmark which comprehensively evaluates a model's susceptibility to prompt extraction attacks. Our novel evaluation method assesses models under both defenseless and defended scenarios, employing a dual approach to evaluate the effectiveness of existing defenses and the resilience of the models. The benchmark encompasses 14 categories of prompt extraction attacks, with additional compounded attacks that closely mimic the strategies of potential attackers, alongside a diverse collection of defense templates. This array is, to our knowledge, the most extensive compilation of prompt theft attacks and defense mechanisms to date. Our findings highlight universal susceptibility to prompt theft in the absence of defenses, with OpenAI models demonstrating notable resilience when protected. This paper aims to establish a more systematic benchmark for assessing LLM robustness against prompt extraction attacks, offering insights into their causes and potential countermeasures. Resources of Raccoon are publicly available at https://github.com/M0gician/RaccoonBench. Comment: ACL 2024 Findings
Published: 2024

15. Ultrafast Optical Control of Rashba Interactions in a TMDC Heterostructure

Author: Mittenzwey, Henry, Kumar, Abhijeet, Dhingra, Raghav, Watanabe, Kenji, Taniguchi, Takashi, Gahl, Cornelius, Bolotin, Kirill I., Selig, Malte, and Knorr, Andreas
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science, Physics - Optics, Quantum Physics
Abstract: We investigate spin relaxation dynamics of interlayer excitons in a MoSe2/MoS2 heterostructure induced by the Rashba effect. In such a system, Rashba interactions arise from an out-of-plane electric field due to photo-generated interlayer excitons inducing a phonon-assisted intravalley spin relaxation. We develop a theoretical description based on a microscopic approach to quantify the magnitude of Rashba interactions and test these predictions via time-resolved Kerr rotation measurements. In agreement with the calculations, we find that the Rashba-induced intravalley spin mixing becomes the dominating spin relaxation channel above T = 50 K. Our work identifies a previously unexplored spin-depolarization channel in heterostructures which can be used for ultrafast spin manipulation.
Published: 2024

16. Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

Author: Fisch, Adam, Maynez, Joshua, Hofer, R. Alex, Dhingra, Bhuwan, Globerson, Amir, and Cohen, William W.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. PPI achieves this by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate -- but potentially biased -- automatic system, in a way that results in tighter confidence intervals for certain parameters of interest (e.g., the mean performance of a language model). In this paper, we propose a method called Stratified Prediction-Powered Inference (StratPPI), in which we show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies. Without making any assumptions on the underlying automatic labeling system or data distribution, we derive an algorithm for computing provably valid confidence intervals for population parameters (such as averages) that is based on stratified sampling. In particular, we show both theoretically and empirically that, with appropriate choices of stratification and sample allocation, our approach can provide substantially tighter confidence intervals than unstratified approaches. Specifically, StratPPI is expected to improve in cases where the performance of the autorater varies across different conditional distributions of the target data.
Published: 2024
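
The combination of human and automatic labels described in the abstract above can be sketched as a point estimate. This is a minimal illustration that assumes the standard PPI mean estimator (autorater mean on unlabeled data plus a bias correction from the labeled pairs) and assumes known stratum weights. It does not compute the confidence intervals the paper is about, and the function names are hypothetical.

```python
from statistics import mean


def ppi_mean(labeled_human, labeled_auto, unlabeled_auto):
    """PPI point estimate of a mean.

    labeled_human / labeled_auto: human and autorater scores for the same
    small labeled sample. unlabeled_auto: autorater scores for the larger
    unlabeled sample.
    """
    if len(labeled_human) != len(labeled_auto):
        raise ValueError("labeled samples must be paired")
    # Rectifier: average amount by which the autorater misses the human label.
    rectifier = mean(h - a for h, a in zip(labeled_human, labeled_auto))
    return mean(unlabeled_auto) + rectifier


def stratified_ppi_mean(strata):
    """Weighted sum of per-stratum PPI estimates.

    strata: iterable of (weight, labeled_human, labeled_auto, unlabeled_auto)
    tuples. Assumption: the weights are known stratum proportions summing to 1.
    """
    return sum(w * ppi_mean(h, a, u) for w, h, a, u in strata)


# Toy data: the autorater is too lenient in the first stratum only.
strata = [
    (0.5, [1, 0, 1, 1], [1, 1, 1, 1], [1, 1, 0, 1]),
    (0.5, [1, 1], [1, 1], [1, 1, 1]),
]
estimate = stratified_ppi_mean(strata)
```

Estimating the correction separately per stratum is where stratification would help when autorater quality differs across strata, as the abstract notes.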

17. Atomic Self-Consistency for Better Long Form Generations

Author: Thirukovalluru, Raghuveer, Huang, Yukun, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language
Abstract: Recent work has aimed to improve LLM generations by filtering out hallucinations, thereby improving the precision of the information in responses. Correctness of a long-form response, however, also depends on the recall of multiple pieces of information relevant to the question. In this paper, we introduce Atomic Self-Consistency (ASC), a technique for improving the recall of relevant information in an LLM response. ASC follows recent work, Universal Self-Consistency (USC), in using multiple stochastic samples from an LLM to improve the long-form response. Unlike USC, which only focuses on selecting the best single generation, ASC picks authentic subparts from the samples and merges them into a superior composite answer. Through extensive experiments and ablations, we show that merging relevant subparts of multiple samples performs significantly better than picking a single sample. ASC demonstrates significant gains over USC on multiple factoids and open-ended QA datasets - ASQA, QAMPARI, QUEST, ELI5 with ChatGPT and Llama2. Our analysis also reveals untapped potential for enhancing long-form generations using the approach of merging multiple samples. Comment: 12 pages
Published: 2024

18. Tailoring Vaccine Messaging with Common-Ground Opinions

Author: Stureborg, Rickard, Chen, Sanxing, Xie, Ruoyu, Patel, Aayushi, Li, Christopher, Zhu, Chloe Qinyu, Hu, Tingnan, Yang, Jun, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, 68T50 (Primary) 68T01, 68T37, 91F20 (Secondary), I.2, I.2.7, I.7
Abstract: One way to personalize chatbot interactions is by establishing common ground with the intended reader. A domain where establishing mutual understanding could be particularly impactful is vaccine concerns and misinformation. Vaccine interventions are forms of messaging which aim to answer concerns expressed about vaccination. Tailoring responses in this domain is difficult, since opinions often have seemingly little ideological overlap. We define the task of tailoring vaccine interventions to a Common-Ground Opinion (CGO). Tailoring responses to a CGO involves meaningfully improving the answer by relating it to an opinion or belief the reader holds. In this paper we introduce TAILOR-CGO, a dataset for evaluating how well responses are tailored to provided CGOs. We benchmark several major LLMs on this task, finding that GPT-4-Turbo performs significantly better than others. We also build automatic evaluation metrics, including an efficient and accurate BERT model that outperforms finetuned LLMs, investigate how to successfully tailor vaccine messaging to CGOs, and provide actionable recommendations from this investigation. Code and model weights: https://github.com/rickardstureborg/tailor-cgo. Dataset: https://huggingface.co/datasets/DukeNLP/tailor-cgo. Comment: NAACL Findings 2024
Published: 2024

19. Bayesian Prediction-Powered Inference

Author: Hofer, R. Alex, Maynez, Joshua, Dhingra, Bhuwan, Fisch, Adam, Globerson, Amir, and Cohen, William W.
Subjects: Computer Science - Machine Learning
Abstract: Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. Specifically, PPI methods provide tighter confidence intervals by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate, but potentially biased, automatic system. We propose a framework for PPI based on Bayesian inference that allows researchers to develop new task-appropriate PPI methods easily. Exploiting the ease with which we can design new metrics, we propose improved PPI methods for several important cases, such as autoraters that give discrete responses (e.g., prompted LLM "judges") and autoraters with scores that have a non-linear relationship to human scores.
Published: 2024

20. ChatShop: Interactive Information Seeking with Language Agents

Author: Chen, Sanxing, Wiseman, Sam, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language
Abstract: The desire and ability to seek new information strategically are fundamental to human learning but often overlooked in current language agent evaluation. We analyze a popular web shopping task designed to test language agents' ability to perform strategic exploration and discover that it can be reformulated and solved as a single-turn retrieval task without the need for interactive information seeking. This finding encourages us to rethink realistic constraints on information access that would necessitate strategic information seeking. We then redesign the task to introduce a notion of task ambiguity and the role of a shopper, serving as a dynamic party with whom the agent strategically interacts in an open-ended conversation to make informed decisions. Our experiments demonstrate that the proposed task can effectively evaluate the agent's ability to explore and gradually accumulate information through multi-turn interactions. Additionally, we show that large language model-simulated shoppers serve as a good proxy for real human shoppers, revealing similar error patterns in agents.
Published: 2024

21. IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations

Author: Fu, Deqing, Guo, Ruohao, Khalighinejad, Ghazal, Liu, Ollie, Dhingra, Bhuwan, Yogatama, Dani, Jia, Robin, and Neiswanger, Willie
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Current foundation models exhibit impressive capabilities when prompted either with text only or with both image and text inputs. But do their capabilities change depending on the input modality? In this work, we propose IsoBench, a benchmark dataset containing problems from four major areas: math, science, algorithms, and games. Each example is presented with multiple isomorphic representations of inputs, such as visual, textual, and mathematical presentations. IsoBench provides fine-grained feedback to diagnose performance gaps caused by the form of the representation. Across various foundation models, we observe that on the same problem, models have a consistent preference towards textual representations. Most prominently, when evaluated on all IsoBench problems, Claude-3 Opus performs 28.7 points worse when provided with images instead of text; similarly, GPT-4 Turbo is 18.7 points worse and Gemini Pro is 14.9 points worse. Finally, we present two prompting techniques, IsoCombination and IsoScratchPad, which improve model performance by considering combinations of, and translations between, different input representations. Comment: 1st Conference on Language Modeling (COLM), 2024
Published: 2024

22. Process and techno-economic analyses of ethylene production by electrochemical reduction of aqueous alkaline carbonates

Author: Venkataraman, Anush, Song, Hakhyeon, Brandão, Victor D., Ma, Chen, Casajus, Magdalena Salazar, Fernandez Otero, Carlos A., Sievers, Carsten, Hatzell, Marta C., Bhargava, Saket S., Arora, Sukaran S., Villa, Carlos, Dhingra, Sandeep, and Nair, Sankar
Published: 2024

23. An Empirical Study of Robust Mean-Variance Portfolios with Short Selling

Author: Dhingra, Vrinda and Gupta, S. K.
Published: 2024

24. OSA Prevalence in Children with Sickle Cell Disease: An Indian Experience

Author: Goyal, Abhishek, Dupare, Sankalp, Kar, Avishek, and Dhingra, Bhavna
Published: 2024

25. Enhancement of Mustard Oil Bio-Diesel Yield Using Evolutionary Algorithms

Author: Kumar, Pardeep, Dhingra, Ashwani Kumar, Chhabra, Deepak, and Chhikara, Ashish
Published: 2024
Full Text: View/download PDF

26. Exploring characteristic features for effective HCN1 channel inhibition using integrated analytical approaches: 3D QSAR, molecular docking, homology modelling, ADME and molecular dynamics

Author: Sharma, Shiwani, Rana, Priyanka, Chadha, Vijayta Dani, Dhingra, Neelima, and Kaur, Tanzeer
Published: 2024
Full Text: View/download PDF

27. A comprehensive review on performance-based comparative analysis, categorization, classification and mapping of text extraction system techniques for images

Author: Ghai, Deepika, Saxena, Sobhit, Dhingra, Gittaly, and Tripathi, Suman Lata
Published: 2024
Full Text: View/download PDF

28. Multi-objective Parameter Optimization of Four-Stroke Diesel Engine with Waste Cooking Oil Biodiesel and Diesel Blend using RSM-NSGA-II

Author: Kumar, Pardeep, Dhingra, Ashwani Kumar, Chhabra, Deepak, and Chhikara, Ashish
Published: 2024
Full Text: View/download PDF

29. Risk factors for moderate acute malnutrition among children with acute diarrhoea in India and Tanzania: a secondary analysis of data from a randomized trial

Author: Kisenge, Rodrick, Dhingra, Usha, Rees, Chris A., Liu, Enju, Dutta, Arup, Saikat, Deb, Dhingra, Pratibha, Somji, Sarah, Sudfeld, Chris, Simon, Jon, Ashorn, Per, Sazawal, Sunil, Duggan, Christopher P., and Manji, Karim
Published: 2024
Full Text: View/download PDF

30. Extracting Polymer Nanocomposite Samples from Full-Length Documents

Author: Khalighinejad, Ghazal, Circi, Defne, Brinson, L. C., and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language
Abstract: This paper investigates the use of large language models (LLMs) for extracting sample lists of polymer nanocomposites (PNCs) from full-length materials science research papers. The challenge lies in the complex nature of PNC samples, which have numerous attributes scattered throughout the text. The complexity of annotating detailed information on PNCs limits the availability of data, making conventional document-level relation extraction techniques impractical due to the challenge in creating comprehensive named entity span annotations. To address this, we introduce a new benchmark and an evaluation technique for this task and explore different prompting strategies in a zero-shot manner. We also incorporate self-consistency to improve the performance. Our findings show that even advanced LLMs struggle to extract all of the samples from an article. Finally, we analyze the errors encountered in this process, categorizing them into three main challenges, and discuss potential strategies for future research to overcome them.
Published: 2024

31. Adversarial Math Word Problem Generation

Author: Xie, Roy, Huang, Chengxuan, Wang, Junlin, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large language models (LLMs) have significantly transformed the educational landscape. As current plagiarism detection tools struggle to keep pace with LLMs' rapid advancements, the educational community faces the challenge of assessing students' true problem-solving abilities in the presence of LLMs. In this work, we explore a new paradigm for ensuring fair evaluation -- generating adversarial examples which preserve the structure and difficulty of the original questions aimed for assessment, but are unsolvable by LLMs. Focusing on the domain of math word problems, we leverage abstract syntax trees to structurally generate adversarial examples that cause LLMs to produce incorrect answers by simply editing the numeric values in the problems. We conduct experiments on various open- and closed-source LLMs, quantitatively and qualitatively demonstrating that our method significantly degrades their math problem-solving ability. We identify shared vulnerabilities among LLMs and propose a cost-effective approach to attack high-cost models. Additionally, we conduct automatic analysis to investigate the cause of failure, providing further insights into the limitations of LLMs., Comment: Code/data: https://github.com/ruoyuxie/adversarial_mwps_generation
Published: 2024

32. Calibrating Long-form Generations from Large Language Models

Author: Huang, Yukun, Liu, Yixin, Thirukovalluru, Raghuveer, Cohan, Arman, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: To enhance Large Language Models' (LLMs) reliability, calibration is essential -- the model's assessed confidence scores should align with the actual likelihood of its responses being correct. However, current confidence elicitation methods and calibration metrics typically rely on a binary true/false assessment of response correctness. This approach does not apply to long-form generation, where an answer can be partially correct. Addressing this gap, we introduce a unified calibration framework, in which both the correctness of the LLMs' responses and their associated confidence levels are treated as distributions across a range of scores. Within this framework, we develop three metrics to precisely evaluate LLM calibration and further propose two confidence elicitation methods based on self-consistency and self-evaluation. Our experiments, which include long-form QA and summarization tasks, demonstrate that larger models don't necessarily guarantee better calibration, that calibration performance is found to be metric-dependent, and that self-consistency methods excel in factoid datasets. We also find that calibration can be enhanced through techniques such as fine-tuning, integrating relevant source documents, scaling the temperature, and combining self-consistency with self-evaluation. Lastly, we showcase a practical application of our system: selecting and cascading open-source models and ChatGPT to optimize correctness given a limited API budget. This research not only challenges existing notions of LLM calibration but also offers practical methodologies for improving trustworthiness in long-form generation.
Published: 2024

33. Hierarchical Multi-Label Classification of Online Vaccine Concerns

Author: Zhu, Chloe Qinyu, Stureborg, Rickard, and Dhingra, Bhuwan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Vaccine concerns are an ever-evolving target, and can shift quickly as seen during the COVID-19 pandemic. Identifying longitudinal trends in vaccine concerns and misinformation might inform the healthcare space by helping public health efforts strategically allocate resources or information campaigns. We explore the task of detecting vaccine concerns in online discourse using large language models (LLMs) in a zero-shot setting without the need for expensive training datasets. Since real-time monitoring of online sources requires large-scale inference, we explore cost-accuracy trade-offs of different prompting strategies and offer concrete takeaways that may inform choices in system designs for current applications. An analysis of different prompting strategies reveals that classifying the concerns over multiple passes through the LLM, each consisting of a boolean question on whether the text mentions a vaccine concern, works best. Our results indicate that GPT-4 can strongly outperform crowdworker accuracy when compared to ground truth annotations provided by experts on the recently introduced VaxConcerns dataset, achieving an overall F1 score of 78.7%., Comment: Published in AAAI 2024 Health Intelligence workshop
Published: 2024

34. Revisiting Common Randomness, No-signaling and Information Structure in Decentralized Control

Author: Dhingra, Apurva and Kulkarni, Ankur A.
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Systems and Control, 93C41, 93E20, 81Q93
Abstract: This work revisits the no-signaling condition for decentralized information structures. We produce examples to show that within the no-signaling polytope exist strategies that cannot be achieved by passive common randomness but instead require agents to either share their observations with a mediator or communicate directly with each other. This poses a question mark on whether the no-signaling condition truly captures the decentralized information structure in the strictest sense.
Published: 2024

35. FARe: Fault-Aware GNN Training on ReRAM-based PIM Accelerators

Author: Dhingra, Pratyush, Ogbogu, Chukwufumnanya, Joardar, Biresh Kumar, Doppa, Janardhan Rao, Kalyanaraman, Ananth, and Pande, Partha Pratim
Subjects: Computer Science - Hardware Architecture, Computer Science - Machine Learning, B.8.1
Abstract: Resistive random-access memory (ReRAM)-based processing-in-memory (PIM) architecture is an attractive solution for training Graph Neural Networks (GNNs) on edge platforms. However, the immature fabrication process and limited write endurance of ReRAMs make them prone to hardware faults, thereby limiting their widespread adoption for GNN training. Further, the existing fault-tolerant solutions prove inadequate for effectively training GNNs in the presence of faults. In this paper, we propose a fault-aware framework referred to as FARe that mitigates the effect of faults during GNN training. FARe outperforms existing approaches in terms of both accuracy and timing overhead. Experimental results demonstrate that the FARe framework can restore GNN test accuracy by 47.6% on faulty ReRAM hardware with a ~1% timing overhead compared to the fault-free counterpart., Comment: This paper has been accepted to the conference DATE (Design, Automation and Test in Europe) - 2024
Published: 2024

36. A comprehensive evaluation of constrained mean-expectile portfolios with short selling

Author: Dhingra, Vrinda, Sharma, Amita, and Gupta, Shiv Kumar
Published: 2024
Full Text: View/download PDF

37. Influence of Rare Earth Yb3+ Dopant on the Spectroscopic Properties of Manganese Ferrite Nanoparticles

Author: Gulati, Sudha and Dhingra, Mansi
Published: 2024
Full Text: View/download PDF

38. How Well Do Large Language Models Understand Tables in Materials Science?

Author: Circi, Defne, Khalighinejad, Ghazal, Chen, Anlan, Dhingra, Bhuwan, and Brinson, L. Catherine
Published: 2024
Full Text: View/download PDF

39. Establishing a bone bank within a hospital setting in India: early insights from a tertiary care center in Northern India—a review article

Author: Regmi, Anil, Niraula, Bishwa Bandhu, Maheshwari, Vikas, Nongdamba, Hawaibam, Karn, Rahul, Bondarde, Parshwanath, Anand, Utsav, Dhingra, Mohit, and Kandwal, Pankaj
Published: 2024
Full Text: View/download PDF

40. Fermentation of Rice Straw Hydrolyzates for Bioethanol Production and Increasing its Yield by Applying Random Physical and Chemical Mutagenesis

Author: Ningthoujam, Reema, Jangid, Pankaj, Yadav, Virendra Kumar, Ali, Daoud, Alarifi, Saud, Patel, Ashish, and Dhingra, Harish Kumar
Published: 2024
Full Text: View/download PDF

41. Histological evaluation of decellularization of freeze dried and chemically treated indigenously prepared bovine pericardium membrane

Author: Gupt, Chander, Lamba, Arundeep Kaur, Faraz, Farrukh, Tandon, Shruti, Augustine, Jeyaseelan, Datta, Archita, and Dhingra, Sachin
Published: 2024
Full Text: View/download PDF

42. Correspondence between a new pair of nondifferentiable mixed dual vector programs and higher-order generalized convexity

Author: Kailey, N., Sonali Sethi, and Dhingra, Vivek
Published: 2024
Full Text: View/download PDF

43. Sectoral portfolio optimization by judicious selection of financial ratios via PCA

Author: Dhingra, Vrinda, Sharma, Amita, and Gupta, Shiv K.
Published: 2024
Full Text: View/download PDF

44. Deep Angiomyxoma of the Knee: a Rare Case Report

Author: Pranav, J, Bansal, Shivam, Barman, Saptarshi, Kumawat, Nivesh, Gowda, Rohan, Dhingra, Mohit, and Kumar, Arvind
Published: 2024
Full Text: View/download PDF

45. Pulmonary Tuberculosis in Severely Malnourished Children Admitted to Nutrition Rehabilitation Centers: A Multicenter Study

Author: Singh, Manjula, Dhingra, Bhavna, Bishnu, Bipra, Pandey, Dhruvendra, Anand, Praveen K., Gupta, Sarika, Das, Vidyanand Ravi, Dhochak, Nitin, and Kabra, S. K.
Published: 2024
Full Text: View/download PDF

46. Vascular Tumor with Kasabach Merritt Phenomenon Treated with Steroids and Vincristine: A Retrospective Study

Author: Agarwal, Pulkit, Khera, Sanjeev, Shaw, Subhash Chandra, and Dhingra, Sandeep
Published: 2024
Full Text: View/download PDF

47. Efficient Turn-On Zr Based Metal Organic Framework Fluorescent Sensor for Ultrafast Detection of Danofloxacin in Milk Samples

Author: Verma, Rajpal, Dhingra, Gaurav, Singh, Gurdeep, Singh, Jaswinder, Dureja, Nidhi, and Malik, Ashok Kumar
Published: 2024
Full Text: View/download PDF

48. A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models

Author: Sharma, Harsh, Dhingra, Pratyush, Doppa, Janardhan Rao, Ogras, Umit, and Pande, Partha Pratim
Subjects: Computer Science - Hardware Architecture, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Transformers have revolutionized deep learning and generative modeling, enabling unprecedented advancements in natural language processing tasks. However, the size of transformer models is increasing continuously, driven by enhanced capabilities across various deep-learning tasks. This trend of ever-increasing model size has given rise to new challenges in terms of memory and computing requirements. Conventional computing platforms, including GPUs, suffer from suboptimal performance due to the memory demands imposed by models with millions/billions of parameters. The emerging chiplet-based platforms provide a new avenue for compute- and data-intensive machine learning (ML) applications enabled by a Network-on-Interposer (NoI). However, designing suitable hardware accelerators for executing Transformer inference workloads is challenging due to a wide variety of complex computing kernels in the Transformer architecture. In this paper, we leverage chiplet-based heterogeneous integration (HI) to design a high-performance and energy-efficient multi-chiplet platform to accelerate transformer workloads. We demonstrate that the proposed NoI architecture caters to the data access patterns inherent in a transformer model. The optimized placement of the chiplets and the associated NoI links and routers enable superior performance compared to the state-of-the-art hardware accelerators. The proposed NoI-based architecture demonstrates scalability across varying transformer models and improves latency and energy efficiency by up to 22.8x and 5.36x respectively., Comment: Preprint for a Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models
Published: 2023

49. CRUSH4SQL: Collective Retrieval Using Schema Hallucination For Text2SQL

Author: Kothyari, Mayank, Dhingra, Dhruva, Sarawagi, Sunita, and Chakrabarti, Soumen
Subjects: Computer Science - Computation and Language
Abstract: Existing Text-to-SQL generators require the entire schema to be encoded with the user text. This is expensive or impractical for large databases with tens of thousands of columns. Standard dense retrieval techniques are inadequate for schema subsetting of a large structured database, where the correct semantics of retrieval demands that we rank sets of schema elements rather than individual elements. In response, we propose a two-stage process for effective coverage during retrieval. First, we instruct an LLM to hallucinate a minimal DB schema deemed adequate to answer the query. We use the hallucinated schema to retrieve a subset of the actual schema, by composing the results from multiple dense retrievals. Remarkably, hallucination – generally considered a nuisance – turns out to be actually useful as a bridging mechanism. Since no benchmarks exist for schema subsetting on large databases, we introduce three benchmarks. Two semi-synthetic datasets are derived from the union of schemas in two well-known datasets, SPIDER and BIRD, resulting in 4502 and 798 schema elements respectively. A real-life benchmark called SocialDB is sourced from an actual large data warehouse comprising 17844 schema elements. We show that our method leads to significantly higher recall than SOTA retrieval-based augmentation methods., Comment: To appear at EMNLP 2023 (Main)
Published: 2023

50. ZGUL: Zero-shot Generalization to Unseen Languages using Multi-source Ensembling of Language Adapters

Author: Rathore, Vipul, Dhingra, Rajdeep, Singla, Parag, and Mausam
Subjects: Computer Science - Computation and Language
Abstract: We tackle the problem of zero-shot cross-lingual transfer in NLP tasks via the use of language adapters (LAs). Most of the earlier works have explored training with adapter of a single source (often English), and testing either using the target LA or LA of another related language. Training target LA requires unlabeled data, which may not be readily available for low resource unseen languages: those that are neither seen by the underlying multilingual language model (e.g., mBERT), nor do we have any (labeled or unlabeled) data for them. We posit that for more effective cross-lingual transfer, instead of just one source LA, we need to leverage LAs of multiple (linguistically or geographically related) source languages, both at train and test-time - which we investigate via our novel neural architecture, ZGUL. Extensive experimentation across four language groups, covering 15 unseen target languages, demonstrates improvements of up to 3.2 average F1 points over standard fine-tuning and other strong baselines on POS tagging and NER tasks. We also extend ZGUL to settings where either (1) some unlabeled data or (2) few-shot training examples are available for the target language. We find that ZGUL continues to outperform baselines in these settings too.
Published: 2023
