Search results for author "Baral, Chitta": 769 results in total

Search Results

1. REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models

2. Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter?

3. UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization

4. Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models

5. Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation

6. ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints

7. Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies

8. Investigating the Robustness of LLMs on Math Word Problems

9. Triple Preference Optimization: Achieving Better Alignment with Less Data in a Single Step Optimization

10. Grounding Stylistic Domain Generalization with Quantitative Domain Shift Measures and Synthetic Scene Images

11. LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models

12. Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks

13. On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

14. Getting it Right: Improving Spatial Consistency in Text-to-Image Models

15. Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts

16. Jailbreaking Proprietary Large Language Models using Word Substitution Cipher

17. $\lambda$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

18. The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness

19. ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

20. LongBoX: Evaluating Transformers on Long-Sequence Clinical Tasks

21. Accelerating LLaMA Inference by Enabling Intermediate Layer Decoding via Instruction Tuning with LITE

22. TarGEN: Targeted Data Generation with Large Language Models

23. InstructExcel: A Benchmark for Natural Language Instruction in Excel

24. Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models

25. Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?

26. Language-Conditioned Change-point Detection to Identify Sub-Tasks in Robotics Domains

27. MDDial: A Multi-turn Differential Diagnosis Dialogue Dataset with Reliability Evaluation

28. ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models

29. End-to-end Knowledge Retrieval with Multi-modal Queries

30. EDM3: Event Detection as Multi-task Text Generation

31. Dr.ICL: Demonstration-Retrieved In-context Learning

32. Can NLP Models Correctly Reason Over Contexts that Break the Common Assumptions?

33. Instruction Tuned Models are Quick Learners

34. A Unified Evaluation Framework for Novelty Detection and Accommodation in NLP with an Instantiation in Authorship Attribution

35. Post-Abstention: Towards Reliably Re-Attempting the Abstained Instances in QA

36. Prompt-Based Learning for Thread Structure Prediction in Cybersecurity Forums

37. Prompt-Based Learning for Thread Structure Prediction in Cybersecurity Forums

38. Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments

39. Exploring the Limits of Transfer Learning with Unified Model in the Cybersecurity Domain

40. InstructABSA: Instruction Learning for Aspect Based Sentiment Analysis

41. Real-Time Visual Feedback to Guide Benchmark Creation: A Human-and-Metric-in-the-Loop Workflow

42. Lexi: Self-Supervised Learning of the UI Language

43. Benchmarking Spatial Relationships in Text-to-Image Generation

44. Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task

45. Learning Action-Effect Dynamics from Pairs of Scene-graphs

46. Can Open-Domain QA Reader Utilize External Knowledge Efficiently like Humans?

47. CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering

48. Lila: A Unified Benchmark for Mathematical Reasoning

49. Pretrained Transformers Do not Always Improve Robustness

50. Hardness of Samples Need to be Quantified for a Reliable Evaluation System: Exploring Potential Opportunities with a New Task
