Author: "Mehrotra P" / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

1. On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse

Author: Kalavasis, Alkis, Mehrotra, Anay, and Velegkas, Grigoris
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Data Structures and Algorithms, Statistics - Machine Learning
Abstract: Specifying all desirable properties of a language model is challenging, but certain requirements seem essential. Given samples from an unknown language, the trained model should produce valid strings not seen in training and be expressive enough to capture the language's full richness. Otherwise, outputting invalid strings constitutes "hallucination," and failing to capture the full range leads to "mode collapse." We ask if a language model can meet both requirements. We investigate this within a statistical language generation setting building on Gold and Angluin. Here, the model receives random samples from a distribution over an unknown language K, which belongs to a possibly infinite collection of languages. The goal is to generate unseen strings from K. We say the model generates from K with consistency and breadth if, as training size increases, its output converges to all unseen strings in K. Kleinberg and Mullainathan [KM24] asked if consistency and breadth in language generation are possible. We answer this negatively: for a large class of language models, including next-token prediction models, this is impossible for most collections of candidate languages. This contrasts with [KM24]'s result, showing consistent generation without breadth is possible for any countable collection of languages. Our finding highlights that generation with breadth fundamentally differs from generation without breadth. As a byproduct, we establish near-tight bounds on the number of samples needed for generation with or without breadth. Finally, our results offer hope: consistent generation with breadth is achievable for any countable collection of languages when negative examples (strings outside K) are available alongside positive ones. This suggests that post-training feedback, which encodes negative examples, can be crucial in reducing hallucinations while limiting mode collapse.
Comment: Abstract shortened to fit arXiv limit
Published: 2024

2. Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models

Author: Hsu, Aliyah R., Zhu, James, Wang, Zhichao, Bi, Bin, Mehrotra, Shubham, Pentyala, Shiva K., Tan, Katherine, Mao, Xiang-Bo, Omrani, Roshanak, Chaudhuri, Sougata, Radhakrishnan, Regunathan, Asur, Sitaram, Cheng, Claire Na, and Yu, Bin
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: LLMs have demonstrated impressive proficiency in generating coherent and high-quality text, making them valuable across a range of text-generation tasks. However, rigorous evaluation of this generated content is crucial, as ensuring its quality remains a significant challenge due to persistent issues such as factual inaccuracies and hallucinations. This paper introduces two fine-tuned general-purpose LLM autoevaluators, REC-12B and REC-70B, specifically designed to evaluate generated text across several dimensions: faithfulness, instruction following, coherence, and completeness. These models not only provide ratings for these metrics but also offer detailed explanations and verifiable citations, thereby enhancing trust in the content. Moreover, the models support various citation modes, accommodating different requirements for latency and granularity. Extensive evaluations on diverse benchmarks demonstrate that our general-purpose LLM auto-evaluator, REC-70B, outperforms state-of-the-art LLMs, excelling in content evaluation by delivering better quality explanations and citations with minimal bias. It achieves Rank #1 as a generative model on the RewardBench leaderboard (https://huggingface.co/spaces/allenai/reward-bench) under the model name TextEval-Llama3.1-70B. Our REC dataset and models are released at https://github.com/adelaidehsu/REC.
Published: 2024

3. Crafting Tomorrow: The Influence of Design Choices on Fresh Content in Social Media Recommendation

Author: Saket, Srijan, Agarwal, Mohit, and Mehrotra, Rishabh
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning, Computer Science - Social and Information Networks
Abstract: The rise in popularity of social media platforms has resulted in millions of new content pieces being created every day. This surge in content creation underscores the need to pay attention to our design choices, as they can greatly impact how long content remains relevant. In today's landscape, where regularly recommending new content is crucial, particularly in the absence of detailed information, a variety of factors such as UI features, algorithms and system settings contribute to shaping the journey of content across the platform. While previous research has focused on how new content affects users' experiences, this study takes a different approach by analyzing these decisions from the perspective of the content itself. Through a series of carefully crafted experiments, we explore how seemingly small decisions can influence the longevity of content, measured by metrics like Content Progression (CVP) and Content Survival (CSR). We also emphasize the importance of recognizing the stages that content goes through, underscoring the need to tailor strategies for each stage, as a one-size-fits-all approach may not be effective. Additionally, we argue for a departure from traditional experimental setups in the study of content lifecycles to avoid potential misunderstandings, while proposing advanced techniques to achieve greater precision and accuracy in the evaluation process.
Published: 2024

4. Smaller Confidence Intervals From IPW Estimators via Data-Dependent Coarsening

Author: Kalavasis, Alkis, Mehrotra, Anay, and Zampetakis, Manolis
Subjects: Statistics - Methodology, Computer Science - Machine Learning, Economics - Econometrics, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: Inverse propensity-score weighted (IPW) estimators are prevalent in causal inference for estimating average treatment effects in observational studies. Under unconfoundedness, given accurate propensity scores and $n$ samples, the size of confidence intervals of IPW estimators scales down with $n$, and several of their variants improve the rate of scaling. However, neither IPW estimators nor their variants are robust to inaccuracies: even if a single covariate has an $\varepsilon>0$ additive error in the propensity score, the size of confidence intervals of these estimators can increase arbitrarily. Moreover, even without errors, the rate with which the confidence intervals of these estimators go to zero with $n$ can be arbitrarily slow in the presence of extreme propensity scores (those close to 0 or 1). We introduce a family of Coarse IPW (CIPW) estimators that captures existing IPW estimators and their variants. Each CIPW estimator is an IPW estimator on a coarsened covariate space, where certain covariates are merged. Under mild assumptions, e.g., Lipschitzness in expected outcomes and sparsity of extreme propensity scores, we give an efficient algorithm to find a robust estimator: given $\varepsilon$-inaccurate propensity scores and $n$ samples, its confidence interval size scales with $\varepsilon+1/\sqrt{n}$. In contrast, under the same assumptions, existing estimators' confidence interval sizes are $\Omega(1)$ irrespective of $\varepsilon$ and $n$. Crucially, our estimator is data-dependent and we show that no data-independent CIPW estimator can be robust to inaccuracies.
Comment: Accepted for presentation at the 37th Conference on Learning Theory (COLT) 2024
Published: 2024

5. Efficient Statistics With Unknown Truncation, Polynomial Time Algorithms, Beyond Gaussians

Author: Lee, Jane H., Mehrotra, Anay, and Zampetakis, Manolis
Subjects: Mathematics - Statistics Theory, Computer Science - Data Structures and Algorithms, Computer Science - Machine Learning, Statistics - Computation, Statistics - Machine Learning
Abstract: We study the estimation of distributional parameters when samples are shown only if they fall in some unknown set $S \subseteq \mathbb{R}^d$. Kontonis, Tzamos, and Zampetakis (FOCS'19) gave a $d^{\mathrm{poly}(1/\varepsilon)}$ time algorithm for finding $\varepsilon$-accurate parameters for the special case of Gaussian distributions with diagonal covariance matrix. Recently, Diakonikolas, Kane, Pittas, and Zarifis (COLT'24) showed that this exponential dependence on $1/\varepsilon$ is necessary even when $S$ belongs to some well-behaved classes. These works leave the following open problems which we address in this work: Can we estimate the parameters of any Gaussian or even extend beyond Gaussians? Can we design $\mathrm{poly}(d/\varepsilon)$ time algorithms when $S$ is a simple set such as a halfspace? We make progress on both of these questions by providing the following results: 1. Toward the first question, we give a $d^{\mathrm{poly}(\ell/\varepsilon)}$ time algorithm for any exponential family that satisfies some structural assumptions and any unknown set $S$ that is $\varepsilon$-approximable by degree-$\ell$ polynomials. This result has two important applications: 1a) The first algorithm for estimating arbitrary Gaussian distributions from samples truncated to an unknown $S$; and 1b) The first algorithm for linear regression with unknown truncation and Gaussian features. 2. To address the second question, we provide an algorithm with runtime $\mathrm{poly}(d/\varepsilon)$ that works for a set of exponential families (containing all Gaussians) when $S$ is a halfspace or an axis-aligned rectangle. 
Along the way, we develop tools that may be of independent interest, including a reduction from PAC learning with positive and unlabeled samples to PAC learning with positive and negative samples that is robust to certain covariate shifts.
Comment: Accepted for presentation at the 65th IEEE Symposium on Foundations of Computer Science (FOCS), 2024; abstract shortened for arXiv
Published: 2024

6. The Gravitational Lensing Due to Schwarzchild Black Holes

Author: Mehrotra, Amritansh and Kanishka, R.
Subjects: High Energy Physics - Phenomenology
Abstract: Gravitational lensing is a powerful concept in astrophysics for studying black holes. The gravitational field of a massive object like a galaxy or black hole bends and magnifies the light from a distant object behind it. Schwarzschild black holes, the simplest type of black hole, having no charge or angular momentum, have been useful for observing gravitational lensing. In the presented work, a Schwarzschild black hole has been simulated with spiral, elliptical, lenticular and irregular galaxies in the background to obtain the gravitational lensing.
Comment: 8 pages, 7 figures
Published: 2024

7. Assessing FIFO and Round Robin Scheduling: Effects on Data Pipeline Performance and Energy Usage

Author: Choudhury, Malobika Roy and Mehrotra, Akshat
Subjects: Computer Science - Operating Systems
Abstract: In the case of compute-intensive machine learning, efficient operating system scheduling is crucial for performance and energy efficiency. This paper conducts a comparative study of FIFO (First-In-First-Out) and RR (Round-Robin) scheduling policies applied to real-time machine learning training processes and data pipelines on Ubuntu-based systems. Based on patterns of CPU usage and energy consumption, we identify which policy (the exclusive or the shared) provides higher performance and/or lower energy consumption for typical modern workloads. The results of this study would help in providing better operating system schedulers for modern systems like Ubuntu, improving performance and reducing energy consumption in compute-intensive workloads.
Published: 2024

8. Dark Patterns in the Opt-Out Process and Compliance with the California Consumer Privacy Act (CCPA)

Author: Tran, Van Hong, Mehrotra, Aarushi, Sharma, Ranya, Chetty, Marshini, Feamster, Nick, Frankenreiter, Jens, and Strahilevitz, Lior
Subjects: Computer Science - Human-Computer Interaction
Abstract: To protect consumer privacy, the California Consumer Privacy Act (CCPA) mandates that businesses provide consumers with a straightforward way to opt out of the sale and sharing of their personal information. However, the control that businesses enjoy over the opt-out process allows them to impose hurdles on consumers aiming to opt out, including by employing dark patterns. Motivated by the enactment of the California Privacy Rights Act (CPRA), which strengthens the CCPA and explicitly forbids certain dark patterns in the opt-out process, we investigate how dark patterns are used in opt-out processes and assess their compliance with CCPA regulations. Our research reveals that websites employ a variety of dark patterns. Some of these patterns are explicitly prohibited under the CCPA; others evidently take advantage of legal loopholes. Despite the initial efforts to restrict dark patterns by policymakers, there is more work to be done.
Published: 2024

9. More than just a Tool: People's Perception and Acceptance of Prosocial Delivery Robots as Fellow Road Users

Author: Chi, Vivienne Bihe, Ulwelling, Elise, Salubre, Kevin, Mehrotra, Shashank, Misu, Teruhisa, and Akash, Kumar
Subjects: Computer Science - Human-Computer Interaction
Abstract: Service robots are increasingly deployed in public spaces, performing functional tasks such as making deliveries. To better integrate them into our social environment and enhance their adoption, we consider integrating social identities within delivery robots along with their functional identity. We conducted a virtual reality-based pilot study to explore people's perceptions and acceptance of delivery robots that perform prosocial behavior. Preliminary findings from thematic analysis of semi-structured interviews illustrate people's ambivalence about dual identity. We discussed the emerging themes in light of social identity theory, framing effect, and human-robot intergroup dynamics. Building on these insights, we propose that the next generation of delivery robots should use peer-based framing, an updated value proposition, and an interactive design that places greater emphasis on expressing intentionality and emotional responses.
Published: 2024

10. Can we enhance prosocial behavior? Using post-ride feedback to improve micromobility interactions

Author: Scott-Sharoni, Sidney T., Mehrotra, Shashank, Salubre, Kevin, Song, Miao, Misu, Teruhisa, and Akash, Kumar
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Robotics
Abstract: Micromobility devices, such as e-scooters and delivery robots, hold promise for eco-friendly and cost-effective alternatives for future urban transportation. However, their lack of societal acceptance remains a challenge. Therefore, we must consider ways to promote prosocial behavior in micromobility interactions. We investigate how post-ride feedback can encourage the prosocial behavior of e-scooter riders while interacting with sidewalk users, including pedestrians and delivery robots. Using a web-based platform, we measure the prosocial behavior of e-scooter riders. Results found that post-ride feedback can successfully promote prosocial behavior, and objective measures indicated better gap behavior, lower speeds at interaction, and longer stopping time around other sidewalk actors. The findings of this study demonstrate the efficacy of post-ride feedback and provide a step toward designing methodologies to improve the prosocial behavior of mobility users.
Comment: In 16th International ACM Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI'24), September 22-25, 2024, Stanford, CA, USA. 11 pages
Published: 2024

11. Convergence and Bound Computation for Chance Constrained Distributionally Robust Models using Sample Approximation

Author: Lei, Jiaqi and Mehrotra, Sanjay
Subjects: Mathematics - Optimization and Control
Abstract: This paper considers a distributionally robust chance constraint model with a general ambiguity set. We show that a sample based approximation of this model converges under suitable sufficient conditions. We also show that upper and lower bounds on the optimal value of the model can be estimated statistically. Specific ambiguity sets are discussed as examples.
Published: 2024

12. AI-assisted Coding with Cody: Lessons from Context Retrieval and Evaluation for Code Recommendations

Author: Hartman, Jan, Mehrotra, Rishabh, Sagtani, Hitesh, Cooney, Dominic, Gajdulewicz, Rafal, Liu, Beyang, Tibshirani, Julie, and Slack, Quinn
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning, Computer Science - Software Engineering
Abstract: In this work, we discuss a recently popular type of recommender system: an LLM-based coding assistant. Connecting the task of providing code recommendations in multiple formats to traditional RecSys challenges, we outline several similarities and differences due to domain specifics. We emphasize the importance of providing relevant context to an LLM for this use case and discuss lessons learned from context enhancements & offline and online evaluation of such AI-assisted coding systems.
Published: 2024

13. A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More

Author: Wang, Zhichao, Bi, Bin, Pentyala, Shiva Kumar, Ramnath, Kiran, Chaudhuri, Sougata, Mehrotra, Shubham, Zhu, Zixu, Mao, Xiang-Bo, Asur, Sitaram, and Cheng, Na
Subjects: Computer Science - Computation and Language
Abstract: With advancements in self-supervised learning, the availability of trillions of tokens in a pre-training corpus, instruction fine-tuning, and the development of large Transformers with billions of parameters, large language models (LLMs) are now capable of generating factual and coherent responses to human queries. However, the mixed quality of training data can lead to the generation of undesired responses, presenting a significant challenge. Over the past two years, various methods have been proposed from different perspectives to enhance LLMs, particularly in aligning them with human expectations. Despite these efforts, there has not been a comprehensive survey paper that categorizes and details these approaches. In this work, we aim to address this gap by categorizing these papers into distinct topics and providing detailed explanations of each alignment method, thereby helping readers gain a thorough understanding of the current state of the field.
Published: 2024

14. Graph Structure Prompt Learning: A Novel Methodology to Improve Performance of Graph Neural Networks

Author: Huang, Zhenhua, Li, Kunhao, Wang, Shaojie, Jia, Zhaohong, Zhu, Wentao, and Mehrotra, Sharad
Subjects: Computer Science - Machine Learning, Computer Science - Social and Information Networks
Abstract: Graph neural networks (GNNs) are widely applied in graph data modeling. However, existing GNNs are often trained in a task-driven manner that fails to fully capture the intrinsic nature of the graph structure, resulting in sub-optimal node and graph representations. To address this limitation, we propose a novel Graph structure Prompt Learning method (GPL) to enhance the training of GNNs, which is inspired by prompt mechanisms in natural language processing. GPL employs task-independent graph structure losses to encourage GNNs to learn intrinsic graph characteristics while simultaneously solving downstream tasks, producing higher-quality node and graph representations. In extensive experiments on eleven real-world datasets, after being trained by GPL, GNNs significantly outperform their original performance on node classification, graph classification, and edge prediction tasks (up to 10.28%, 16.5%, and 24.15%, respectively). By allowing GNNs to capture the inherent structural prompts of graphs in GPL, they can alleviate the issue of over-smoothing and achieve new state-of-the-art performance, which introduces a novel and effective direction for GNN research with potential applications in various domains.
Published: 2024

15. SES: Bridging the Gap Between Explainability and Prediction of Graph Neural Networks

Author: Huang, Zhenhua, Li, Kunhao, Wang, Shaojie, Jia, Zhaohong, Zhu, Wentao, and Mehrotra, Sharad
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Despite the proficiency of Graph Neural Networks (GNNs) in analyzing graph data, achieving high-accuracy and interpretable predictions remains challenging. Existing GNN interpreters typically provide post-hoc explanations disjointed from GNNs' predictions, resulting in misrepresentations. Self-explainable GNNs offer built-in explanations during the training process. However, they cannot exploit the explanatory outcomes to augment prediction performance, and they fail to provide high-quality explanations of node features and require additional processes to generate explainable subgraphs, which is costly. To address the aforementioned limitations, we propose a self-explained and self-supervised graph neural network (SES) to bridge the gap between explainability and prediction. SES comprises two processes: explainable training and enhanced predictive learning. During explainable training, SES employs a global mask generator co-trained with a graph encoder and directly produces crucial structure and feature masks, reducing time consumption and providing node feature and subgraph explanations. In the enhanced predictive learning phase, mask-based positive-negative pairs are constructed utilizing the explanations to compute a triplet loss and enhance the node representations by contrastive learning.
Comment: Accepted as a conference paper at ICDE 2024
Published: 2024

16. Indian Stock Market Prediction using Augmented Financial Intelligence ML

Author: Chauhan, Anishka, Mayur, Pratham, Gokarakonda, Yeshwanth Sai, Jamie, Pooriya, and Mehrotra, Naman
Subjects: Quantitative Finance - Trading and Market Microstructure, Computer Science - Artificial Intelligence, Computer Science - Computational Engineering, Finance, and Science, Statistics - Machine Learning
Abstract: This paper presents price prediction models using Machine Learning algorithms augmented with Superforecasters' predictions, aimed at enhancing investment decisions. Five Machine Learning models are built, including Bidirectional LSTM, ARIMA, a combination of CNN and LSTM, GRU, and a model built using LSTM and GRU algorithms. The models are evaluated using the Mean Absolute Error to determine their predictive accuracy. Additionally, the paper suggests incorporating human intelligence by identifying Superforecasters and tracking their predictions to anticipate unpredictable shifts or changes in stock prices. The predictions made by these users can further enhance the accuracy of stock price predictions when combined with Machine Learning and Natural Language Processing techniques. Predicting the price of any commodity can be a significant task, but predicting the price of a stock in the stock market deals with much more uncertainty. Recognising the limited knowledge and exposure to stocks among certain investors, this paper proposes price prediction models using Machine Learning algorithms. In this work, five Machine Learning models are built using Bidirectional LSTM, ARIMA, a combination of CNN and LSTM, GRU, and the last one is built using LSTM and GRU algorithms. Later, these models are assessed using MAE scores to find which model is predicting with the highest accuracy. In addition to this, this paper also suggests the use of human intelligence to closely predict the shift in price patterns in the stock market. The main goal is to identify Superforecasters and track their predictions to anticipate unpredictable shifts or changes in stock prices. By leveraging the combined power of Machine Learning and Human Intelligence, predictive accuracy can be significantly increased.
Comment: Keywords: Machine Learning, Artificial Intelligence, LSTM, GRU, ARMA, CNN, NLP, ANN, SVM, BSE, NIFTY, MAE, MSE, BiLSTM. Published in SSRN Journal
Published: 2024

17. ProBE: Proportioning Privacy Budget for Complex Exploratory Decision Support

Author: Lahjouji, Nada, Ghayyur, Sameera, He, Xi, and Mehrotra, Sharad
Subjects: Computer Science - Databases
Abstract: This paper studies privacy in the context of complex decision support queries composed of multiple conditions on different aggregate statistics combined using disjunction and conjunction operators. Utility requirements for such queries necessitate private mechanisms that guarantee a bound on the false negative and false positive errors. This paper formally defines complex decision support queries and their accuracy requirements, and provides algorithms that proportion the existing budget to optimally minimize privacy loss while supporting a bounded guarantee on the accuracy. Our experimental results on multiple real-life datasets show that our algorithms successfully maintain such utility guarantees, while also minimizing privacy loss.
Published: 2024

18. MASAI: Modular Architecture for Software-engineering AI Agents

Author: Arora, Daman, Sonwane, Atharv, Wadhwa, Nalin, Mehrotra, Abhav, Utpala, Saiteja, Bairi, Ramakrishna, Kanade, Aditya, and Natarajan, Nagarajan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Software Engineering
Abstract: A common method to solve complex problems in software engineering is to divide the problem into multiple sub-problems. Inspired by this, we propose a Modular Architecture for Software-engineering AI (MASAI) agents, where different LLM-powered sub-agents are instantiated with well-defined objectives and strategies tuned to achieve those objectives. Our modular architecture offers several advantages: (1) employing and tuning different problem-solving strategies across sub-agents, (2) enabling sub-agents to gather information from different sources scattered throughout a repository, and (3) avoiding unnecessarily long trajectories which inflate costs and add extraneous context. MASAI enabled us to achieve the highest performance (28.33% resolution rate) on the popular and highly challenging SWE-bench Lite dataset consisting of 300 GitHub issues from 11 Python repositories. We conduct a comprehensive evaluation of MASAI relative to other agentic methods and analyze the effects of our design decisions and their contribution to the success of MASAI.
Published: 2024

19. How is the Pilot Doing: VTOL Pilot Workload Estimation by Multimodal Machine Learning on Psycho-physiological Signals

Author: Park, Jong Hoon, Chen, Lawrence, Higgins, Ian, Zheng, Zhaobo, Mehrotra, Shashank, Salubre, Kevin, Mousaei, Mohammadreza, Willits, Steven, Levedahl, Blain, Buker, Timothy, Xing, Eliot, Misu, Teruhisa, Scherer, Sebastian, and Oh, Jean
Subjects: Computer Science - Human-Computer Interaction
Abstract: Vertical take-off and landing (VTOL) aircraft do not require a prolonged runway, thus allowing them to land almost anywhere. In recent years, their flexibility has made them popular in development, research, and operation. When compared to traditional fixed-wing aircraft and rotorcraft, VTOLs bring unique challenges as they combine many maneuvers from both types of aircraft. Pilot workload is a critical factor for safe and efficient operation of VTOLs. In this work, we conduct a user study to collect multimodal data from 28 pilots while they perform a variety of VTOL flight tasks. We analyze and interpolate behavioral patterns related to their performance and perceived workload. Finally, we build machine learning models to estimate their workload from the collected data. Our results are promising, suggesting that quantitative and accurate VTOL pilot workload monitoring is viable. Such assistive tools would help the research field understand VTOL operations and serve as a stepping stone for the industry to ensure safe VTOL operations and further remote operations.
Comment: 8 pages, 7 figures
Published: 2024

20. On Approximation of Robust Max-Cut and Related Problems using Randomized Rounding Algorithms

Author: Shi, Haoyan and Mehrotra, Sanjay
Subjects: Computer Science - Data Structures and Algorithms, Mathematics - Optimization and Control
Abstract: Goemans and Williamson proposed a randomized rounding algorithm for the MAX-CUT problem with a 0.878 approximation bound in expectation. The 0.878 approximation bound remains the best-known approximation bound for this APX-hard problem. Their approach was subsequently applied to other related problems such as Max-DiCut, MAX-SAT, and Max-2SAT. We show that the randomized rounding algorithm can also be used to achieve a 0.878 approximation bound for the robust and distributionally robust counterparts of the max-cut problem. We also show that the approximation bounds for the other problems are maintained for their robust and distributionally robust counterparts if the randomization projection framework is used.
Published: 2024

21. CASE: Efficient Curricular Data Pre-training for Building Assistive Psychology Expert Models

Author: Harne, Sarthak, Choudhury, Monjoy Narayan, Rao, Madhav, Srikanth, TK, Mehrotra, Seema, Vashisht, Apoorva, Basu, Aarushi, and Sodhi, Manjit
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The limited availability of psychologists necessitates efficient identification of individuals requiring urgent mental healthcare. This study explores the use of Natural Language Processing (NLP) pipelines to analyze text data from online mental health forums used for consultations. By analyzing forum posts, these pipelines can flag users who may require immediate professional attention. A crucial challenge in this domain is data privacy and scarcity. To address this, we propose utilizing readily available curricular texts used in institutes specializing in mental health for pre-training the NLP pipelines. This helps us mimic the training process of a psychologist. Our work presents CASE-BERT, which flags potential mental health disorders based on forum text. CASE-BERT demonstrates superior performance compared to existing methods, achieving an F1 score of 0.91 for Depression and 0.88 for Anxiety, two of the most commonly reported mental health disorders. Our code and data are publicly available.
Published: 2024

22. Enhancing Creativity in Large Language Models through Associative Thinking Strategies

Author: Mehrotra, Pronita, Parab, Aishni, and Gulwani, Sumit
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: This paper explores the enhancement of creativity in Large Language Models (LLMs) like vGPT-4 through associative thinking, a cognitive process where creative ideas emerge from linking seemingly unrelated concepts. Associative thinking strategies have been found to effectively help humans boost creativity. However, whether the same strategies can help LLMs become more creative remains under-explored. In this work, we investigate whether prompting LLMs to connect disparate concepts can augment their creative outputs. Focusing on three domains -- Product Design, Storytelling, and Marketing -- we introduce creativity tasks designed to assess vGPT-4's ability to generate original and useful content. By challenging the models to form novel associations, we evaluate the potential of associative thinking to enhance the creative capabilities of LLMs. Our findings show that leveraging associative thinking techniques can significantly improve the originality of vGPT-4's responses.
Published: 2024

23. S4: Self-Supervised Sensing Across the Spectrum

Author: Shenoy, Jayanth, Zhang, Xingjian Davis, Mehrotra, Shlok, Tao, Bill, Yang, Rem, Zhao, Han, and Vasisht, Deepak
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Satellite image time series (SITS) segmentation is crucial for many applications like environmental monitoring, land cover mapping and agricultural crop type classification. However, training models for SITS segmentation remains a challenging task due to the lack of abundant training data, which requires fine-grained annotation. We propose S4, a new self-supervised pre-training approach that significantly reduces the requirement for labeled training data by utilizing two new insights: (a) Satellites capture images in different parts of the spectrum, such as radio frequencies and visible frequencies. (b) Satellite imagery is geo-registered, allowing for fine-grained spatial alignment. We use these insights to formulate pre-training tasks in S4. We also curate m2s2-SITS, a large-scale dataset of unlabeled, spatially-aligned, multi-modal and geographic specific SITS that serves as representative pre-training data for S4. Finally, we evaluate S4 on multiple SITS segmentation datasets and demonstrate its efficacy against competing baselines while using limited labeled data.
Published: 2024

24. CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery

Author: Rongali, Sai Bhargav, Mehrotra, Sarthak, Jha, Ankit, C, Mohamad Hassan N, Bose, Shirsha, Gupta, Tanisha, Singha, Mainak, and Banerjee, Biplab
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In Generalized Category Discovery (GCD), we cluster unlabeled samples of known and novel classes, leveraging a training dataset of known classes. A salient challenge arises due to domain shifts between these datasets. To address this, we present a novel setting: Across Domain Generalized Category Discovery (AD-GCD) and bring forth CDAD-NET (Class Discoverer Across Domains) as a remedy. CDAD-NET is architected to synchronize potential known class samples across both the labeled (source) and unlabeled (target) datasets, while emphasizing the distinct categorization of the target data. To facilitate this, we propose an entropy-driven adversarial learning strategy that accounts for the distance distributions of target samples relative to source-domain class prototypes. Parallelly, the discriminative nature of the shared space is upheld through a fusion of three metric learning objectives. In the source domain, our focus is on refining the proximity between samples and their affiliated class prototypes, while in the target domain, we integrate a neighborhood-centric contrastive learning mechanism, enriched with an adept neighbors mining approach. To further accentuate the nuanced feature interrelation among semantically aligned images, we champion the concept of conditional image inpainting, underscoring the premise that semantically analogous images prove more efficacious to the task than their disjointed counterparts. Experimentally, CDAD-NET eclipses existing literature with a performance increment of 8-15% on three AD-GCD benchmarks we present., Comment: Accepted in L3D-IVU, CVPR Workshop, 2024
Published: 2024

25. SoK: The Faults in our Graph Benchmarks

Author: Mehrotra, Puneet, Anand, Vaastav, Margo, Daniel, Hajidehi, Milad Rezaei, and Seltzer, Margo
Subjects: Computer Science - Databases
Abstract: Graph-structured data is prevalent in domains such as social networks, financial transactions, brain networks, and protein interactions. As a result, the research community has produced new databases and analytics engines to process such data. Unfortunately, there is not yet widespread benchmark standardization in graph processing, and the heterogeneity of evaluations found in the literature can lead researchers astray. Evaluations frequently ignore datasets' statistical idiosyncrasies, which significantly affect system performance. Scalability studies often use datasets that fit easily in memory on a modest desktop. Some studies rely on synthetic graph generators, but these generators produce graphs with unnatural characteristics that also affect performance, producing misleading results. Currently, the community has no consistent and principled manner with which to compare systems and provide guidance to developers who wish to select the system most suited to their application. We provide three different systematizations of benchmarking practices. First, we present a 12-year literature review of graph processing benchmarking, including a summary of the prevalence of specific datasets and benchmarks used in these papers. Second, we demonstrate the impact of two statistical properties of datasets that drastically affect benchmark performance. We show how different assignments of IDs to vertices, called vertex orderings, dramatically alter benchmark performance due to the caching behavior they induce. We also show the impact of zero-degree vertices on the runtime of benchmarks such as breadth-first search and single-source shortest path. We show that these issues can cause performance to change by as much as 38% on several popular graph processing systems. Finally, we suggest best practices to account for these issues when evaluating graph systems.
Published: 2024

26. Should I Help a Delivery Robot? Cultivating Prosocial Norms through Observations

Author: Chi, Vivienne Bihe, Mehrotra, Shashank, Misu, Teruhisa, and Akash, Kumar
Subjects: Computer Science - Robotics, Computer Science - Human-Computer Interaction
Abstract: We propose leveraging prosocial observations to cultivate new social norms to encourage prosocial behaviors toward delivery robots. With an online experiment, we quantitatively assess updates in norm beliefs regarding human-robot prosocial behaviors through observational learning. Results demonstrate the initially perceived normativity of helping robots is influenced by familiarity with delivery robots and perceptions of robots' social intelligence. Observing human-robot prosocial interactions notably shifts people's normative beliefs about prosocial actions, thereby changing their perceived obligations to offer help to delivery robots. Additionally, we found that observing robots offering help to humans, rather than receiving help, more significantly increased participants' feelings of obligation to help robots. Our findings provide insights into prosocial design for future mobility systems. Improved familiarity with robot capabilities and portraying them as desirable social partners can help foster wider acceptance. Furthermore, robots need to be designed to exhibit higher levels of interactivity and reciprocal capabilities for prosocial behavior., Comment: Accepted as a Late Breaking Work at CHI'24
Published: 2024

27. Measuring Compliance with the California Consumer Privacy Act Over Space and Time

Author: Tran, Van, Mehrotra, Aarushi, Chetty, Marshini, Feamster, Nick, Frankenreiter, Jens, and Strahilevitz, Lior
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Cryptography and Security, Computer Science - Computers and Society
Abstract: The widespread sharing of consumers' personal information with third parties raises significant privacy concerns. The California Consumer Privacy Act (CCPA) mandates that online businesses offer consumers the option to opt out of the sale and sharing of personal information. Our study automatically tracks the presence of the opt-out link longitudinally across multiple states after the California Privacy Rights Act (CPRA) went into effect. We categorize websites based on whether they are subject to CCPA and investigate cases of potential non-compliance. We find a number of websites that implement the opt-out link early and across all examined states but also find a significant number of CCPA-subject websites that fail to offer any opt-out methods even when CCPA is in effect. Our findings can shed light on how websites are reacting to the CCPA and identify potential gaps in compliance and opt-out method designs that hinder consumers from exercising CCPA opt-out rights.
Published: 2024

28. Future of Assessments: Centering Equity and the Lived Experiences of Students, Families, and Educators

Author: Education Trust, Munyan-Penney, Nicholas, and Mehrotra, Sarah
Abstract: Addressing inequities in the educational outcomes--particularly for students of color and students from low-income backgrounds--cannot happen without comparable data from statewide summative assessments. Statewide assessment results help schools and district leaders target state and local resources to the students and schools with the greatest need and track whether these resources are impacting student achievement. Despite this, many educators, students, and families say that federal assessment and accountability policies take away from instructional time without providing actionable data. Meanwhile, pandemic pauses in administering statewide assessments and changes in political dynamics at the state and federal levels have opened a window of opportunity to develop new statewide summative assessments that gauge how students are doing, highlight disparities, and show where interventions aren't measuring up to their promise and might be improved. This paper centers the lived experiences and perspectives of students, families, educators, and district and state leaders, so that they can be used to design assessments that provide data that will enable the Ed Trust to promote equitable learning opportunities and improve outcomes for all students. To better understand how directly impacted communities are experiencing statewide assessments, Ed Trust held focus groups with diverse stakeholders who are on the ground, focusing on students of color, students from low-income backgrounds, English learners, and those who work in a school or district in which the majority of students are members of these student groups. The focus group findings informed the creation of "equity pillars," which highlight key values and identify criteria for improving federal assessment policy, and federal policy recommendations for how this vision could be achieved.
Published: 2023

29. Pivoting Retail Supply Chain with Deep Generative Techniques: Taxonomy, Survey and Insights

Author: Wang, Yuan, Sambasivan, Lokesh Kumar, Fu, Mingang, and Mehrotra, Prakhar
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Generative AI applications, such as ChatGPT or DALL-E, have shown the world their impressive capabilities in generating human-like text or images. Diving deeper, the science behind those AI applications is Deep Generative Models (DGMs), which are designed to learn the underlying distribution of the data and generate new data points that are statistically similar to the original dataset. One critical question is raised: how can we leverage DGMs in the modern retail supply chain realm? To address this question, this paper aims to provide a comprehensive review of DGMs and discuss their existing and potential use cases in retail supply chain, by (1) providing a taxonomy and overview of state-of-the-art DGMs and their variants, (2) reviewing existing DGM applications in retail supply chain from an end-to-end point of view, and (3) discussing insights and potential directions on how DGMs can be further utilized in solving retail supply chain problems.
Published: 2024

30. Fair Classification with Partial Feedback: An Exploration-Based Data Collection Approach

Author: Keswani, Vijay, Mehrotra, Anay, and Celis, L. Elisa
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, Statistics - Machine Learning
Abstract: In many predictive contexts (e.g., credit lending), true outcomes are only observed for samples that were positively classified in the past. These past observations, in turn, form training datasets for classifiers that make future predictions. However, such training datasets lack information about the outcomes of samples that were (incorrectly) negatively classified in the past and can lead to erroneous classifiers. We present an approach that trains a classifier using available data and comes with a family of exploration strategies to collect outcome data about subpopulations that otherwise would have been ignored. For any exploration strategy, the approach comes with guarantees that (1) all sub-populations are explored, (2) the fraction of false positives is bounded, and (3) the trained classifier converges to a "desired" classifier. The right exploration strategy is context-dependent; it can be chosen to improve learning guarantees and encode context-specific group fairness properties. Evaluation on real-world datasets shows that this approach consistently boosts the quality of collected outcome data and improves the fraction of true positives for all groups, with only a small reduction in predictive utility., Comment: Accepted for presentation at ICML 2024
Published: 2024

31. Building Retrieval Systems for the ClueWeb22-B Corpus

Author: Mehrotra, Harshit, Callan, Jamie, and Fan, Zhen
Subjects: Computer Science - Information Retrieval
Abstract: The ClueWeb22 dataset containing nearly 10 billion documents was released in 2022 to support academic and industry research. The goal of this project was to build retrieval baselines for the English section of the "super head" part (category B) of this dataset. These baselines can then be used by the research community to compare their systems and also to generate data to train/evaluate new retrieval and ranking algorithms. The report covers sparse and dense first stage retrievals as well as neural rerankers that were implemented for this dataset. These systems are available as a service on a Carnegie Mellon University cluster.
Published: 2024

32. Elevating Industries with Unmanned Aerial Vehicles: Integrating Sustainability and Operational Innovation

Author: Kurbanzade, Ali Kaan, Baig, Ansaar M., and Mehrotra, Sanjay
Subjects: Mathematics - Optimization and Control
Abstract: Unmanned aerial vehicles, commonly known as drones, have emerged as a disruptive technology with the potential to revolutionize operations across various industries. Drones are a fast-growing internet-of-things technology and are estimated to have a $100 billion market value in the next decade. Exploring drone operations through research has the potential to yield innovative academic insights and create significant practical effects in diverse industries, offering a competitive edge. Drawing insights from both academic and industry literature, this article describes how technological advancements in UAVs may disrupt traditional operational practices in different industries (e.g., commercial last-mile delivery, commercial pickup and delivery, telecommunication, insurance, healthcare, humanitarian, environmental, urban planning, homeland security), identifies the value of this evolving disruptive technology from sustainability and operational innovation perspectives, and argues the significance of this area for operations management by conceptualizing a research agenda. The current state of the art focuses on the computing aspect of analytical models to tackle a variety of synthetic drone-related problems, with mixed integer optimization being the primary tool. There is a very significant research gap that should focus on drone operations management with industry know-how by partnering with actual stakeholders and using a variety of tools (i.e., econometrics, field experiments, game theory, optimal control, utility functions). This article aims to promote research on UAVs from operations management and industry-specific points of view., Comment: 45 pages, 10 figures
Published: 2023

33. Fast Sampling Through The Reuse Of Attention Maps In Diffusion Models

Author: Hunter, Rosco, Dudziak, Łukasz, Abdelfattah, Mohamed S., Mehrotra, Abhinav, Bhattacharya, Sourav, and Wen, Hongkai
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Text-to-image diffusion models have demonstrated unprecedented capabilities for flexible and realistic image synthesis. Nevertheless, these models rely on a time-consuming sampling procedure, which has motivated attempts to reduce their latency. When improving efficiency, researchers often use the original diffusion model to train an additional network designed specifically for fast image generation. In contrast, our approach seeks to reduce latency directly, without any retraining, fine-tuning, or knowledge distillation. In particular, we find the repeated calculation of attention maps to be costly yet redundant, and instead suggest reusing them during sampling. Our specific reuse strategies are based on ODE theory, which implies that the later a map is reused, the smaller the distortion in the final image. We empirically compare these reuse strategies with few-step sampling procedures of comparable latency, finding that reuse generates images that are closer to those produced by the original high-latency diffusion model.
Published: 2023

34. Tree of Attacks: Jailbreaking Black-Box LLMs Automatically

Author: Mehrotra, Anay, Zampetakis, Manolis, Kassianik, Paul, Nelson, Blaine, Anderson, Hyrum, Singer, Yaron, and Karbasi, Amin
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Cryptography and Security, Statistics - Machine Learning
Abstract: While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks. In this work, we present Tree of Attacks with Pruning (TAP), an automated method for generating jailbreaks that only requires black-box access to the target LLM. TAP utilizes an attacker LLM to iteratively refine candidate (attack) prompts until one of the refined prompts jailbreaks the target. In addition, before sending prompts to the target, TAP assesses them and prunes the ones unlikely to result in jailbreaks, reducing the number of queries sent to the target LLM. In empirical evaluations, we observe that TAP generates prompts that jailbreak state-of-the-art LLMs (including GPT4-Turbo and GPT4o) for more than 80% of the prompts. This significantly improves upon the previous state-of-the-art black-box methods for generating jailbreaks while using a smaller number of queries than them. Furthermore, TAP is also capable of jailbreaking LLMs protected by state-of-the-art guardrails, e.g., LlamaGuard., Comment: Accepted for presentation at NeurIPS 2024. Code: https://github.com/RICommunity/TAP
Published: 2023

35. On Gradient Boosted Decision Trees and Neural Rankers: A Case-Study on Short-Video Recommendations at ShareChat

Author: Jeunen, Olivier, Sagtani, Hitesh, Doi, Himanshu, Karimov, Rasul, Pokharna, Neeti, Kalim, Danish, Ustimenko, Aleksei, Green, Christopher, Shi, Wenzhe, and Mehrotra, Rishabh
Subjects: Computer Science - Information Retrieval
Abstract: Practitioners who wish to build real-world applications that rely on ranking models need to decide which modelling paradigm to follow. This is not an easy choice to make, as the research literature on this topic has been shifting in recent years. In particular, whilst Gradient Boosted Decision Trees (GBDTs) have reigned supreme for more than a decade, the flexibility of neural networks has allowed them to catch up, and recent works report accuracy metrics that are on par. Nevertheless, practical systems require considerations beyond mere accuracy metrics to decide on a modelling approach. This work describes our experiences in balancing some of the trade-offs that arise, presenting a case study on a short-video recommendation application. We highlight (1) neural networks' ability to handle large training data size, user- and item-embeddings allows for more accurate models than GBDTs in this setting, and (2) because GBDTs are less reliant on specialised hardware, they can provide an equally accurate model at a lower cost. We believe these findings are of relevance to researchers in both academia and industry, and hope they can inspire practitioners who need to make similar modelling choices in the future., Comment: Appearing in the Industry Track Proceedings of the Forum for Information Retrieval Evaluation (FIRE '23)
Published: 2023

36. A Systematic Review on Fostering Appropriate Trust in Human-AI Interaction

Author: Mehrotra, Siddharth, Degachi, Chadha, Vereschak, Oleksandra, Jonker, Catholijn M., and Tielman, Myrthe L.
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: Appropriate Trust in Artificial Intelligence (AI) systems has rapidly become an important area of focus for both researchers and practitioners. Various approaches have been used to achieve it, such as confidence scores, explanations, trustworthiness cues, or uncertainty communication. However, a comprehensive understanding of the field is lacking due to the diversity of perspectives arising from various backgrounds that influence it and the lack of a single definition for appropriate trust. To investigate this topic, this paper presents a systematic review to identify current practices in building appropriate trust, different ways to measure it, types of tasks used, and potential challenges associated with it. We also propose a Belief, Intentions, and Actions (BIA) mapping to study commonalities and differences in the concepts related to appropriate trust by (a) describing the existing disagreements on defining appropriate trust, and (b) providing an overview of the concepts and definitions related to appropriate trust in AI from the existing literature. Finally, the challenges identified in studying appropriate trust are discussed, and observations are summarized as current trends, potential gaps, and research opportunities for future work. Overall, the paper provides insights into the complex concept of appropriate trust in human-AI interaction and presents research opportunities to advance our understanding of this topic., Comment: 39 Pages
Published: 2023

37. Bias in Evaluation Processes: An Optimization-Based Model

Author: Celis, L. Elisa, Kumar, Amit, Mehrotra, Anay, and Vishnoi, Nisheeth K.
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Biases with respect to socially-salient attributes of individuals have been well documented in evaluation processes used in settings such as admissions and hiring. We view such an evaluation process as a transformation of a distribution of the true utility of an individual for a task to an observed distribution and model it as a solution to a loss minimization problem subject to an information constraint. Our model has two parameters that have been identified as factors leading to biases: the resource-information trade-off parameter in the information constraint and the risk-averseness parameter in the loss function. We characterize the distributions that arise from our model and study the effect of the parameters on the observed distribution. The outputs of our model enrich the class of distributions that can be used to capture variation across groups in the observed evaluations. We empirically validate our model by fitting real-world datasets and use it to study the effect of interventions in a downstream selection task. These results contribute to an understanding of the emergence of bias in evaluation processes and provide tools to guide the deployment of interventions to mitigate biases., Comment: The conference version of this paper appears in NeurIPS 2023
Published: 2023

38. Hiding Access-pattern is Not Enough! Veil: A Storage and Communication Efficient Volume-Hiding Algorithm

Author: Han, Shanshan, Chakraborty, Vishal, Goodrich, Michael, Mehrotra, Sharad, and Sharma, Shantanu
Subjects: Computer Science - Databases, Computer Science - Cryptography and Security
Abstract: This paper addresses volume leakage (i.e., leakage of the number of records in the answer set) when processing keyword queries in encrypted key-value (KV) datasets. Volume leakage, coupled with prior knowledge about data distribution and/or previously executed queries, can reveal both ciphertexts and current user queries. We develop a solution to prevent volume leakage, entitled Veil, that partitions the dataset by randomly mapping keys to a set of equi-sized buckets. Veil provides a tunable mechanism for data owners to explore a trade-off between storage and communication overheads. To make buckets indistinguishable to the adversary, Veil uses a novel padding strategy that allows buckets to overlap, reducing the need to add fake records. Both theoretical and experimental results show Veil to significantly outperform the existing state of the art.
Published: 2023

39. Wellbeing in Future Mobility: Toward AV Policy Design to Increase Wellbeing through Interactions

Author: Mehrotra, Shashank, Zahedi, Zahra, Misu, Teruhisa, and Akash, Kumar
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Robotics
Abstract: Recent advances in Automated vehicle (AV) technology and micromobility devices promise a transformational change in the future of mobility usage. These advances also pose challenges concerning human-AV interactions. To ensure the smooth adoption of these new mobilities, it is essential to assess how past experiences and perceptions of social interactions by people may impact the interactions with AV mobility. This research identifies and estimates an individual's wellbeing based on their actions, prior experiences, social interaction perceptions, and dyadic interactions with other road users. An online video-based user study was designed, and responses from 300 participants were collected and analyzed to investigate the impact on individual wellbeing. A machine learning model was designed to predict the change in wellbeing. An optimal policy based on the model allows informed AV actions toward its yielding behavior with other road users to enhance users' wellbeing. The findings from this study have broader implications for creating human-aware systems by creating policies that align with the individual state and contribute toward designing systems that align with an individual's state of wellbeing., Comment: 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), September 24-28, 2023, Bilbao, Bizkaia, Spain
Published: 2023

40. Ad-load Balancing via Off-policy Learning in a Content Marketplace

Author: Sagtani, Hitesh, Jhawar, Madan, Mehrotra, Rishabh, and Jeunen, Olivier
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: Ad-load balancing is a critical challenge in online advertising systems, particularly in the context of social media platforms, where the goal is to maximize user engagement and revenue while maintaining a satisfactory user experience. This requires the optimization of conflicting objectives, such as user satisfaction and ads revenue. Traditional approaches to ad-load balancing rely on static allocation policies, which fail to adapt to changing user preferences and contextual factors. In this paper, we present an approach that leverages off-policy learning and evaluation from logged bandit feedback. We start by presenting a motivating analysis of the ad-load balancing problem, highlighting the conflicting objectives between user satisfaction and ads revenue. We emphasize the nuances that arise due to user heterogeneity and the dependence on the user's position within a session. Based on this analysis, we define the problem as determining the optimal ad-load for a particular feed fetch. To tackle this problem, we propose an off-policy learning framework that leverages unbiased estimators such as Inverse Propensity Scoring (IPS) and Doubly Robust (DR) to learn and estimate the policy values using offline collected stochastic data. We present insights from online A/B experiments deployed at scale across over 80 million users generating over 200 million sessions, where we find statistically significant improvements in both user satisfaction metrics and ads revenue for the platform., Comment: Early version presented at the CONSEQUENCES '23 workshop at RecSys '23, final version appearing at WSDM '24
Published: 2023

41. Optimizing Equitable Resource Allocation in Parallel Any-Scale Queues with Service Abandonment and its Application to Liver Transplant

Author: Li, Shukai and Mehrotra, Sanjay
Subjects: Mathematics - Optimization and Control
Abstract: We study the problem of equitably and efficiently allocating an arriving resource to multiple queues with customer abandonment. The problem is motivated by the cadaveric liver allocation system of the United States, which includes a large number of small-scale (in terms of yearly arrival intensities) patient waitlists with the possibility of patients abandoning (due to death) until the required service is completed (matched donor liver arrives). We model each waitlist as a GI/MI/1+GI queue, in which a virtual server receives a donor liver for the patient at the top of the waitlist, and patients may abandon while waiting or during service. To evaluate the performance of each queue, we develop a finite approximation technique as an alternative to fluid or diffusion approximations, which are inaccurate unless the queue's arrival intensity is large. This finite approximation for hundreds of queues is used within an optimization model to optimally allocate donor livers to each waitlist. A piecewise linear approximation of the optimization model is shown to provide the desired accuracy. Computational results show that solutions obtained in this way provide greater flexibility, and improve system performance when compared to solutions from the fluid models. Importantly, we find that appropriately increasing the proportion of livers allocated to waitlists with small scales or high mortality risks improves the allocation equity. This suggests a proportionately greater allocation of organs to smaller transplant centers and/or those with more vulnerable populations in an allocation policy. While our motivation is from liver allocation, the solution approach developed in this paper is applicable in other operational contexts with similar modeling frameworks., Comment: 48 Pages
Published: 2023

42. Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding

Author: Zhang, Jun, Wang, Jue, Li, Huan, Shou, Lidan, Chen, Ke, Chen, Gang, and Mehrotra, Sharad
Subjects: Computer Science - Computation and Language
Abstract: We present a novel inference scheme, self-speculative decoding, for accelerating Large Language Models (LLMs) without the need for an auxiliary model. This approach is characterized by a two-stage process: drafting and verification. The drafting stage generates draft tokens at a slightly lower quality but more quickly, which is achieved by selectively skipping certain intermediate layers during drafting. Subsequently, the verification stage employs the original LLM to validate those draft output tokens in one forward pass. This process ensures the final output remains identical to that produced by the unaltered LLM. Moreover, the proposed method requires no additional neural network training and no extra memory footprint, making it a plug-and-play and cost-effective solution for inference acceleration. Benchmarks with LLaMA-2 and its variants demonstrated a speedup of up to 1.99×., Comment: Accepted to ACL 2024
Published: 2023
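
The draft-then-verify loop described in the abstract above can be illustrated with a minimal toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: `full_model` and `draft_model` are hypothetical stand-ins (simple deterministic functions rather than neural networks), the draft length `k` is arbitrary, and verification is written as a plain loop, whereas the abstract describes a single forward pass of the original LLM. The sketch only shows why greedy verification leaves the output identical to ordinary decoding with the full model.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def full_model(prefix: List[Token]) -> Token:
    # Hypothetical stand-in for the unaltered LLM's greedy next token.
    return (sum(prefix) * 31 + len(prefix)) % 50


def draft_model(prefix: List[Token]) -> Token:
    # Hypothetical stand-in for the cheaper layer-skipping draft pass:
    # it agrees with the full model except when len(prefix) is a multiple of 4.
    guess = full_model(prefix)
    return guess if len(prefix) % 4 else (guess + 1) % 50


def greedy_decode(model: NextToken, prompt: List[Token], n_new: int) -> List[Token]:
    # Ordinary token-by-token greedy decoding, used as the reference output.
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def self_speculative_decode(full: NextToken, draft: NextToken,
                            prompt: List[Token], n_new: int, k: int = 4) -> List[Token]:
    out = list(prompt)
    target_len = len(prompt) + n_new
    while len(out) < target_len:
        # Drafting stage: propose up to k tokens with the cheap model.
        steps = min(k, target_len - len(out))
        drafted: List[Token] = []
        for _ in range(steps):
            drafted.append(draft(out + drafted))
        # Verification stage: keep drafted tokens until the first one that
        # differs from the full model, then substitute the full model's token.
        for i, tok in enumerate(drafted):
            expected = full(out + drafted[:i])
            if tok != expected:
                out.extend(drafted[:i])
                out.append(expected)
                break
        else:
            out.extend(drafted)
    return out


reference = greedy_decode(full_model, [1, 2, 3], 20)
speculative = self_speculative_decode(full_model, draft_model, [1, 2, 3], 20)
```

In this toy setting every emitted token is either a drafted token that the full model confirmed or the full model's own replacement, so `speculative` matches `reference`; any speedup in a real system would come from the draft pass being cheaper and the verification being batched.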

43. A Use Case-Engineering Resources Taxonomy for Analytical Spreadsheet Models

Author: Grossman, Thomas A. and Mehrotra, Vijay
Subjects: Computer Science - Software Engineering
Abstract: This paper presents a taxonomy for analytical spreadsheet models. It considers both the use case that a spreadsheet is meant to serve and the engineering resources devoted to its development. We extend a previous three-type taxonomy to identify nine types of spreadsheet models that encompass the many analytical spreadsheet models seen in the literature. We connect disparate research literature to distinguish between an "analytical solution" and an "industrial-quality analytical spreadsheet model". We explore the nature of each of the nine types, propose definitions for some, relate them to the literature, and hypothesize on how they might arise. The taxonomy aids in identifying where various spreadsheet development guidelines are most useful, provides a lens for viewing spreadsheet errors and risk, and offers a structure for understanding how spreadsheets change over time. This taxonomy opens the door to many interesting research questions, including refinements to itself., Comment: 13 Pages, 7 Figures, 2 Tables
Published: 2023

44. Data-CASE: Grounding Data Regulations for Compliant Data Processing Systems

Author: Chakraborty, Vishal, Ann-Elvy, Stacy, Mehrotra, Sharad, Nawab, Faisal, Sadoghi, Mohammad, Sharma, Shantanu, Venkatsubhramanian, Nalini, and Saeed, Farhan
Subjects: Computer Science - Databases
Abstract: Data regulations, such as GDPR, are increasingly being adopted globally to protect against unsafe data management practices. Such regulations are often ambiguous (with multiple valid interpretations) when it comes to defining the expected dynamic behavior of data processing systems. This paper argues that it is possible to represent regulations such as GDPR formally as invariants using a (small set of) data processing concepts that capture system behavior. When such concepts are grounded, i.e., they are provided with a single unambiguous interpretation, systems can achieve compliance by demonstrating that the system-actions they implement maintain the invariants (representing the regulations). To illustrate our vision, we propose Data-CASE, a simple yet powerful model that (a) captures key data processing concepts and (b) provides a set of invariants that describe regulations in terms of these concepts. We further illustrate the concept of grounding using "deletion" as an example and highlight several ways in which end-users, companies, and software designers/engineers can use Data-CASE., Comment: To appear in EDBT '24
Published: 2023

45. Fixing Rust Compilation Errors using LLMs

Author: Deligiannis, Pantazis, Lal, Akash, Mehrotra, Nikita, and Rastogi, Aseem
Subjects: Computer Science - Software Engineering, Computer Science - Programming Languages
Abstract: The Rust programming language, with its safety guarantees, has established itself as a viable choice for low-level systems programming over traditional, unsafe alternatives like C/C++. These guarantees come from a strong ownership-based type system, as well as primitive support for features like closures, pattern matching, etc., that make the code more concise and amenable to reasoning. These unique Rust features also pose a steep learning curve for programmers. This paper presents a tool called RustAssistant that leverages the emergent capabilities of Large Language Models (LLMs) to automatically suggest fixes for Rust compilation errors. RustAssistant uses a careful combination of prompting techniques as well as iteration with an LLM to deliver high accuracy of fixes. RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories. We plan to release our dataset of Rust compilation errors to enable further research.
Published: 2023

46. Sampling Individually-Fair Rankings that are Always Group Fair

Author: Gorantla, Sruthi, Mehrotra, Anay, Deshpande, Amit, and Louis, Anand
Subjects: Computer Science - Computers and Society, Computer Science - Data Structures and Algorithms, Computer Science - Information Retrieval, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Rankings on online platforms help their end-users find the relevant information -- people, news, media, and products -- quickly. Fair ranking tasks, which ask to rank a set of items to maximize utility subject to satisfying group-fairness constraints, have gained significant interest in the Algorithmic Fairness, Information Retrieval, and Machine Learning literature. Recent works, however, identify uncertainty in the utilities of items as a primary cause of unfairness and propose introducing randomness in the output. This randomness is carefully chosen to guarantee an adequate representation of each item (while accounting for the uncertainty). However, due to this randomness, the output rankings may violate group fairness constraints. We give an efficient algorithm that samples rankings from an individually-fair distribution while ensuring that every output ranking is group fair. The expected utility of the output ranking is at least $\alpha$ times the utility of the optimal fair solution. Here, $\alpha$ depends on the utilities, position-discounts, and constraints -- it approaches 1 as the range of utilities or the position-discounts shrinks, or when utilities satisfy distributional assumptions. Empirically, we observe that our algorithm achieves individual and group fairness and that it Pareto dominates the state-of-the-art baselines., Comment: Full version of a paper accepted for presentation in ACM AIES 2023
Published: 2023

47. Subset Selection Based On Multiple Rankings in the Presence of Bias: Effectiveness of Fairness Constraints for Multiwinner Voting Score Functions

Author: Boehmer, Niclas, Celis, L. Elisa, Huang, Lingxiao, Mehrotra, Anay, and Vishnoi, Nisheeth K.
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence, Computer Science - Data Structures and Algorithms, Computer Science - Machine Learning
Abstract: We consider the problem of subset selection where one is given multiple rankings of items and the goal is to select the highest "quality" subset. Score functions from the multiwinner voting literature have been used to aggregate rankings into quality scores for subsets. We study this setting of subset selection problems when, in addition, rankings may contain systemic or unconscious biases toward a group of items. For a general model of input rankings and biases, we show that requiring the selected subset to satisfy group fairness constraints can improve the quality of the selection with respect to unbiased rankings. Importantly, we show that for fairness constraints to be effective, different multiwinner score functions may require a drastically different number of rankings: While for some functions, fairness constraints need an exponential number of rankings to recover a close-to-optimal solution, for others, this dependency is only polynomial. This result relies on a novel notion of "smoothness" of submodular functions in this setting that quantifies how well a function can "correctly" assess the quality of items in the presence of bias. The results in this paper can be used to guide the choice of multiwinner score functions for the subset selection setting considered here; we additionally provide a tool to empirically enable this., Comment: The conference version of this paper appears in ICML 2023
Published: 2023

48. Data-Aided CSI Estimation Using Affine-Precoded Superimposed Pilots in Orthogonal Time Frequency Space Modulated MIMO Systems

Author: Mehrotra, Anand, Srivastava, Suraj, Jagannatham, Aditya K., and Hanzo, Lajos
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: An orthogonal affine-precoded superimposed pilot (AP-SIP)-based architecture is developed for the cyclic prefix (CP)-aided SISO and MIMO orthogonal time frequency space (OTFS) systems relying on arbitrary transmitter-receiver pulse shaping. The data and pilot symbol matrices are affine-precoded and superimposed in the delay-Doppler (DD) domain, followed by the development of an end-to-end DD-domain relationship for the input-output symbols. At the receiver, the decoupled pilot and data symbols are extracted by employing orthogonal precoder matrices, which eliminates the mutual interference. Furthermore, a novel pilot-aided Bayesian learning (PA-BL) technique is conceived for the channel state information (CSI) estimation of SISO OTFS systems based on the expectation-maximization (EM) technique. Subsequently, a data-aided Bayesian learning (DA-BL)-based joint CSI estimation and data detection technique is proposed, which beneficially harnesses the estimated data symbols for improved CSI estimation. In this scenario, our sophisticated data detection rule also integrates the CSI uncertainty of channel estimation into the linear minimum mean square error (LMMSE) detectors. The AP-SIP framework is also extended to MIMO OTFS systems, wherein the DD-domain input matrix is affine-precoded for each transmit antenna (TA). Then an EM algorithm-based PA-BL scheme is derived for simultaneous row-group sparse CSI estimation for this system, followed also by our data-aided DA-BL scheme that performs joint CSI estimation and data detection. Moreover, the Bayesian Cramer-Rao bounds (BCRBs) are also derived for both SISO as well as MIMO OTFS systems. Finally, simulation results are presented for characterizing the performance of the proposed CSI estimation techniques in a range of typical settings along with their bit error rate (BER) performance in comparison to an ideal system having perfect CSI.
Published: 2023

49. Optimization Modeling for Pandemic Vaccine Supply Chain Management: A Review and Future Research Opportunities

Author: Dey, Shibshankar, Kurbanzade, Ali Kaan, Gel, Esma S., Mihaljevic, Joseph, and Mehrotra, Sanjay
Subjects: Mathematics - Optimization and Control
Abstract: During various stages of the COVID-19 pandemic, countries implemented diverse vaccine management approaches, influenced by variations in infrastructure and socio-economic conditions. This article provides a comprehensive overview of optimization models developed by the research community throughout the COVID-19 era, aimed at enhancing vaccine distribution and establishing a standardized framework for future pandemic preparedness. These models address critical issues such as site selection, inventory management, allocation strategies, distribution logistics, and route optimization encountered during the COVID-19 crisis. A unified framework is employed to describe the models, emphasizing their integration with epidemiological models to facilitate a holistic understanding. This article also summarizes the evolving nature of the literature, relevant research gaps, and the authors' perspectives on model selection. Finally, future research directions are detailed in the context of both modeling and solution approaches., Comment: 62 pages, 4 figures
Published: 2023

50. Maximizing Submodular Functions for Recommendation in the Presence of Biases

Author: Mehrotra, Anay and Vishnoi, Nisheeth K.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, Computer Science - Information Retrieval, Statistics - Machine Learning
Abstract: Subset selection tasks arise in recommendation systems and search engines and ask to select a subset of items that maximizes the value for the user. The values of subsets often display diminishing returns, and hence, submodular functions have been used to model them. If the inputs defining the submodular function are known, then existing algorithms can be used. In many applications, however, inputs have been observed to have social biases that reduce the utility of the output subset. Hence, interventions to improve the utility are desired. Prior works focus on maximizing linear functions -- a special case of submodular functions -- and show that fairness constraint-based interventions can not only ensure proportional representation but also achieve near-optimal utility in the presence of biases. We study the maximization of a family of submodular functions that capture functions arising in the aforementioned applications. Our first result is that, unlike linear functions, constraint-based interventions cannot guarantee any constant fraction of the optimal utility for this family of submodular functions. Our second result is an algorithm for submodular maximization. The algorithm provably outputs subsets that have near-optimal utility for this family under mild assumptions and that proportionally represent items from each group. In empirical evaluation, with both synthetic and real-world data, we observe that this algorithm improves the utility of the output subset for this family of submodular functions over baselines., Comment: This is the full version of a paper accepted for presentation at the ACM Web Conference 2023
Published: 2023
