17,502 results on '"Murthy, P"'
Search Results
2. One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity
- Author
-
Murthy, Sonia K., Ullman, Tomer, and Hu, Jennifer
- Subjects
Computer Science - Computation and Language - Abstract
Researchers in social science and psychology have recently proposed using large language models (LLMs) as replacements for humans in behavioral research. In addition to arguments about whether LLMs accurately capture population-level patterns, this has raised questions about whether LLMs capture human-like conceptual diversity. Separately, it is debated whether post-training alignment (RLHF or RLAIF) affects models' internal diversity. Inspired by human studies, we use a new way of measuring the conceptual diversity of synthetically-generated LLM "populations" by relating the internal variability of simulated individuals to the population-level variability. We use this approach to evaluate non-aligned and aligned LLMs on two domains with rich human behavioral data. While no model reaches human-like diversity, aligned models generally display less diversity than their instruction fine-tuned counterparts. Our findings highlight potential trade-offs between increasing models' value alignment and decreasing the diversity of their conceptual representations., Comment: 17 pages, 10 figures
- Published
- 2024
3. Do Mice Grok? Glimpses of Hidden Progress During Overtraining in Sensory Cortex
- Author
-
Kumar, Tanishq, Bordelon, Blake, Pehlevan, Cengiz, Murthy, Venkatesh N., and Gershman, Samuel J.
- Subjects
Computer Science - Machine Learning ,Quantitative Biology - Neurons and Cognition - Abstract
Does learning of task-relevant representations stop when behavior stops changing? Motivated by recent theoretical advances in machine learning and the intuitive observation that human experts continue to learn from practice even after mastery, we hypothesize that task-specific representation learning can continue, even when behavior plateaus. In a novel reanalysis of recently published neural data, we find evidence for such learning in posterior piriform cortex of mice following continued training on a task, long after behavior saturates at near-ceiling performance ("overtraining"). This learning is marked by an increase in decoding accuracy from piriform neural populations and improved performance on held-out generalization tests. We demonstrate that class representations in cortex continue to separate during overtraining, so that examples that were incorrectly classified at the beginning of overtraining can abruptly be correctly classified later on, despite no changes in behavior during that time. We hypothesize this hidden yet rich learning takes the form of approximate margin maximization; we validate this and other predictions in the neural data, as well as build and interpret a simple synthetic model that recapitulates these phenomena. We conclude by showing how this model of late-time feature learning implies an explanation for the empirical puzzle of overtraining reversal in animal learning, where task-specific representations are more robust to particular task changes because the learned features can be reused.
- Published
- 2024
4. MILU: A Multi-task Indic Language Understanding Benchmark
- Author
-
Verma, Sshubam, Khan, Mohammed Safi Ur Rahman, Kumar, Vishwajeet, Murthy, Rudra, and Sen, Jaydeep
- Subjects
Computer Science - Computation and Language - Abstract
Evaluating Large Language Models (LLMs) in low-resource and linguistically diverse languages remains a significant challenge in NLP, particularly for languages using non-Latin scripts like those spoken in India. Existing benchmarks predominantly focus on English, leaving substantial gaps in assessing LLM capabilities in these languages. We introduce MILU, a Multi-task Indic Language Understanding Benchmark, a comprehensive evaluation benchmark designed to address this gap. MILU spans 8 domains and 42 subjects across 11 Indic languages, reflecting both general and culturally specific knowledge. With an India-centric design, MILU incorporates material from regional and state-level examinations, covering topics such as local history, arts, festivals, and laws, alongside standard subjects like science and mathematics. We evaluate over 42 LLMs, and find that current LLMs struggle with MILU, with GPT-4o achieving the highest average accuracy at 72 percent. Open multilingual models outperform language-specific fine-tuned models, which perform only slightly better than random baselines. Models also perform better in high resource languages as compared to low resource ones. Domain-wise analysis indicates that models perform poorly in culturally relevant areas like Arts and Humanities, Law and Governance compared to general fields like STEM. To the best of our knowledge, MILU is the first of its kind benchmark focused on Indic languages, serving as a crucial step towards comprehensive cultural evaluation. All code, benchmarks, and artifacts will be made publicly available to foster open research.
- Published
- 2024
5. A sky survey of ultraviolet sources observed through AstroSat's UVIT: A point source catalog
- Author
-
Bordoloi, Swagat, Shalima, P., Gogoi, Rupjyoti, and Murthy, Jayant
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics - Abstract
The Ultra Violet Imaging Telescope (UVIT) onboard India's first dedicated multiwavelength satellite AstroSat observed a significant fraction of the sky in the ultraviolet with a spatial resolution of 1.4 arcsec. We present a catalog of the point sources observed by UVIT in the far ultraviolet (FUV; 1300-1800 Å) and near ultraviolet (NUV; 2000-3000 Å). We carried out astrometry and photometry of 428 field pointings in the FUV and 54 field pointings in the NUV band, observed in 5 filter bands in each channel respectively, covering an area of about 63 square degrees. The final catalog contains about 102,773 sources. The limiting magnitude (AB) of the F148W band filter, which has the largest number of detections, is $\sim21.3$. For the NUV channel, we find the limiting magnitude at around $\sim23$. We describe the final catalog and present the results of the statistical analysis., Comment: 12 Pages, 17 Figures and 7 Tables. This paper has been accepted for publication in Publications of the Astronomical Society of Australia (PASA)
- Published
- 2024
6. PRACT: Optimizing Principled Reasoning and Acting of LLM Agent
- Author
-
Liu, Zhiwei, Yao, Weiran, Zhang, Jianguo, Murthy, Rithesh, Yang, Liangwei, Liu, Zuxin, Lan, Tian, Zhu, Ming, Tan, Juntao, Kokane, Shirley, Hoang, Thai, Niebles, Juan Carlos, Heinecke, Shelby, Wang, Huan, Savarese, Silvio, and Xiong, Caiming
- Subjects
Computer Science - Artificial Intelligence - Abstract
We introduce the Principled Reasoning and Acting (PRAct) framework, a novel method for learning and enforcing action principles from trajectory data. Central to our approach is the use of text gradients from a reflection and optimization engine to derive these action principles. To adapt action principles to specific task requirements, we propose a new optimization framework, Reflective Principle Optimization (RPO). After execution, RPO employs a reflector to critique current action principles and an optimizer to update them accordingly. We develop the RPO framework under two scenarios: Reward-RPO, which uses environmental rewards for reflection, and Self-RPO, which conducts self-reflection without external rewards. Additionally, two RPO methods, RPO-Traj and RPO-Batch, are introduced to adapt to different settings. Experimental results across four environments demonstrate that the PRAct agent, leveraging the RPO framework, effectively learns and applies action principles to enhance performance., Comment: Accepted to SIG CoNLL 2024
- Published
- 2024
7. A Unified Framework for Collecting Text-to-Speech Synthesis Datasets for 22 Indian Languages
- Author
-
Sathiyamoorthy, Sujitha, Mohana, N, Prakash, Anusha, and Murthy, Hema A
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
The performance of a text-to-speech (TTS) synthesis model depends on various factors, of which the quality of the training data is of utmost importance. Millions of data are collected around the globe for various languages, but resources for Indian languages are few. Although there are many efforts involved in data collection, a common set of protocols for data collection becomes necessary for building TTS systems in Indian languages primarily because of the need for a uniform development of TTS systems across languages. In this paper, we present our learnings on data collection efforts for Indic languages over 15 years. These databases have been used in unit selection synthesis, hidden Markov model based, and end-to-end frameworks, and for generating prosodically rich TTS systems. The most significant feature of the data collected is that data purity enables building high-quality TTS systems with a comparatively small dataset compared to that of European/Chinese languages., Comment: Submitted to ICASSP 2025
- Published
- 2024
8. Evaluating the Instruction-following Abilities of Language Models using Knowledge Tasks
- Author
-
Murthy, Rudra, Kumar, Prince, Venkateswaran, Praveen, and Contractor, Danish
- Subjects
Computer Science - Computation and Language - Abstract
In this work, we focus our attention on developing a benchmark for instruction-following where it is easy to verify both task performance as well as instruction-following capabilities. We adapt existing knowledge benchmarks and augment them with instructions that are a) conditional on correctly answering the knowledge task or b) use the space of candidate options in multiple-choice knowledge-answering tasks. This allows us to study model characteristics, such as their change in performance on the knowledge tasks in the presence of answer-modifying instructions and distractor instructions. In contrast to existing benchmarks for instruction following, we not only measure instruction-following capabilities but also use LLM-free methods to study task performance. We study a series of openly available large language models of varying parameter sizes (1B-405B) and closed source models namely GPT-4o-mini, GPT-4o. We find that even large-scale instruction-tuned LLMs fail to follow simple instructions in zero-shot settings. We release our dataset, the benchmark, code, and results for future work.
- Published
- 2024
9. MoonMetaSync: Lunar Image Registration Analysis
- Author
-
Kumar, Ashutosh, Kaushal, Sarthak, and Murthy, Shiv Vignesh
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Mathematics - Algebraic Geometry - Abstract
This paper compares scale-invariant (SIFT) and scale-variant (ORB) feature detection methods, alongside our novel feature detector, IntFeat, specifically applied to lunar imagery. We evaluate these methods using low (128x128) and high-resolution (1024x1024) lunar image patches, providing insights into their performance across scales in challenging extraterrestrial environments. IntFeat combines high-level features from SIFT and low-level features from ORB into a single vector space for robust lunar image registration. We introduce SyncVision, a Python package that compares lunar images using various registration methods, including SIFT, ORB, and IntFeat. Our analysis includes upscaling low-resolution lunar images using bi-linear and bi-cubic interpolation, offering a unique perspective on registration effectiveness across scales and feature detectors in lunar landscapes. This research contributes to computer vision and planetary science by comparing feature detection methods for lunar imagery and introducing a versatile tool for lunar image registration and evaluation, with implications for multi-resolution image analysis in space exploration applications.
- Published
- 2024
10. Everyday Speech in the Indian Subcontinent
- Author
-
Pathak, Utkarsh, Gunda, Chandra Sai Krishna, Sathiyamoorthy, Sujitha, Agarwal, Keshav, and Murthy, Hema A.
- Subjects
Computer Science - Computation and Language ,Computer Science - Sound ,Electrical Engineering and Systems Science - Audio and Speech Processing ,I.2.7 - Abstract
India has 1369 languages of which 22 are official. About 13 different scripts are used to represent these languages. A Common Label Set (CLS) was developed based on phonetics to address the issue of large vocabulary of units required in the End to End (E2E) framework for multilingual synthesis. This reduced the footprint of the synthesizer and also enabled fast adaptation to new languages which had similar phonotactics, provided language scripts belonged to the same family. In this paper, we provide new insights into speech synthesis, where the script belongs to one family, while the phonotactics comes from another. Indian language text is first converted to CLS, and then a synthesizer that matches the phonotactics of the language is used. Quality akin to that of a native speaker is obtained for Sanskrit and Konkani with zero adaptation data, using Kannada and Marathi synthesizers respectively. Further, this approach also lends itself to seamless code switching across 13 Indian languages and English in a given native speaker's voice., Comment: 5 Pages, 1 Figure, Submitted to ICASSP 2025
- Published
- 2024
11. Gaussian Splatting Visual MPC for Granular Media Manipulation
- Author
-
Tseng, Wei-Cheng, Zhang, Ellina, Jatavallabhula, Krishna Murthy, and Shkurti, Florian
- Subjects
Computer Science - Robotics - Abstract
Recent advancements in learned 3D representations have enabled significant progress in solving complex robotic manipulation tasks, particularly for rigid-body objects. However, manipulating granular materials such as beans, nuts, and rice, remains challenging due to the intricate physics of particle interactions, high-dimensional and partially observable state, inability to visually track individual particles in a pile, and the computational demands of accurate dynamics prediction. Current deep latent dynamics models often struggle to generalize in granular material manipulation due to a lack of inductive biases. In this work, we propose a novel approach that learns a visual dynamics model over Gaussian splatting representations of scenes and leverages this model for manipulating granular media via Model-Predictive Control. Our method enables efficient optimization for complex manipulation tasks on piles of granular media. We evaluate our approach in both simulated and real-world settings, demonstrating its ability to solve unseen planning tasks and generalize to new environments in a zero-shot transfer. We also show significant prediction and manipulation performance improvements compared to existing granular media manipulation methods., Comment: project website https://weichengtseng.github.io/gs-granular-mani/
- Published
- 2024
12. When Does Interference Matter? Decision-Making in Platform Experiments
- Author
-
Johari, Ramesh, Li, Hannah, Murthy, Anushka, and Weintraub, Gabriel Y.
- Subjects
Statistics - Methodology - Abstract
This paper investigates decision-making in A/B experiments for online platforms and marketplaces. In such settings, due to constraints on inventory, A/B experiments typically lead to biased estimators because of interference; this phenomenon has been well studied in recent literature. By contrast, there has been relatively little discussion of the impact of interference on decision-making. In this paper, we analyze a benchmark Markovian model of an inventory-constrained platform, where arriving customers book listings that are limited in supply; our analysis builds on a self-contained analysis of general A/B experiments for Markov chains. We focus on the commonly used frequentist hypothesis testing approach for making launch decisions based on data from customer-randomized experiments, and we study the impact of interference on (1) false positive probability and (2) statistical power. We obtain three main findings. First, we show that for monotone treatments -- i.e., those where the treatment changes booking probabilities in the same direction relative to control in all states -- the false positive probability of the naïve difference-in-means estimator with classical variance estimation is correctly controlled. This result stems from a novel analysis of A/A experiments with arbitrary dependence structures, which may be of independent interest. Second, we demonstrate that for monotone treatments, the statistical power of this naïve approach is higher than that of any similar pipeline using a debiased estimator. Taken together, these two findings suggest that platforms may be better off not debiasing when treatments are monotone. Finally, using simulations, we investigate false positive probability and statistical power when treatments are non-monotone, and we show that the performance of the naïve approach can be arbitrarily worse than a debiased approach in such cases.
- Published
- 2024
13. ConceptAgent: LLM-Driven Precondition Grounding and Tree Search for Robust Task Planning and Execution
- Author
-
Rivera, Corban, Byrd, Grayson, Paul, William, Feldman, Tyler, Booker, Meghan, Holmes, Emma, Handelman, David, Kemp, Bethany, Badger, Andrew, Schmidt, Aurora, Jatavallabhula, Krishna Murthy, de Melo, Celso M, Seenivasan, Lalithkumar, Unberath, Mathias, and Chellappa, Rama
- Subjects
Computer Science - Artificial Intelligence - Abstract
Robotic planning and execution in open-world environments is a complex problem due to the vast state spaces and high variability of task embodiment. Recent advances in perception algorithms, combined with Large Language Models (LLMs) for planning, offer promising solutions to these challenges, as the common sense reasoning capabilities of LLMs provide a strong heuristic for efficiently searching the action space. However, prior work fails to address the possibility of hallucinations from LLMs, which results in failures to execute the planned actions largely due to logical fallacies at high- or low-levels. To contend with automation failure due to such hallucinations, we introduce ConceptAgent, a natural language-driven robotic platform designed for task execution in unstructured environments. With a focus on scalability and reliability of LLM-based planning in complex state and action spaces, we present innovations designed to limit these shortcomings, including 1) Predicate Grounding to prevent and recover from infeasible actions, and 2) an embodied version of LLM-guided Monte Carlo Tree Search with self reflection. In simulation experiments, ConceptAgent achieved a 19% task completion rate across three room layouts and 30 easy level embodied tasks outperforming other state-of-the-art LLM-driven reasoning baselines that scored 10.26% and 8.11% on the same benchmark. Additionally, ablation studies on moderate to hard embodied tasks revealed a 20% increase in task completion from the baseline agent to the fully enhanced ConceptAgent, highlighting the individual and combined contributions of Predicate Grounding and LLM-guided Tree Search to enable more robust automation in complex state and action spaces.
- Published
- 2024
14. A Predictive and Optimization Approach for Enhanced Urban Mobility Using Spatiotemporal Data
- Author
-
Mishra, Shambhavi and Murthy, T. Satyanarayana
- Subjects
Computer Science - Machine Learning - Abstract
In modern urban centers, effective transportation management poses a significant challenge, with traffic jams and inconsistent travel durations greatly affecting commuters and logistics operations. This study introduces a novel method for enhancing urban mobility by combining machine learning algorithms with live traffic information. We developed predictive models for journey time and congestion analysis using data from New York City's yellow taxi trips. The research employed a spatiotemporal analysis framework to identify traffic trends and implemented real-time route optimization using the GraphHopper API. This system determines the most efficient paths based on current conditions, adapting to changes in traffic flow. The methodology utilizes Spark MLlib for predictive modeling and Spark Streaming for processing data in real-time. By integrating historical data analysis with current traffic inputs, our system shows notable enhancements in both travel time forecasts and route optimization, demonstrating its potential for widespread application in major urban areas. This research contributes to ongoing efforts aimed at reducing urban congestion and improving transportation efficiency through advanced data-driven methods.
- Published
- 2024
15. Parallel Corpus Augmentation using Masked Language Models
- Author
-
Kumari, Vibhuti and Kavi, Narayana Murthy
- Subjects
Computer Science - Computation and Language - Abstract
In this paper we propose a novel method of augmenting parallel text corpora which promises good quality and is also capable of producing many fold larger corpora than the seed corpus we start with. We do not need any additional monolingual corpora. We use Multi-Lingual Masked Language Model to mask and predict alternative words in context and we use Sentence Embeddings to check and select sentence pairs which are likely to be translations of each other. We cross check our method using metrics for MT Quality Estimation. We believe this method can greatly alleviate the data scarcity problem for all language pairs for which a reasonable seed corpus is available., Comment: 21 Pages, 3 Figures. arXiv admin note: text overlap with arXiv:2011.01536 by other authors
- Published
- 2024
16. Neural Light Spheres for Implicit Image Stitching and View Synthesis
- Author
-
Chugunov, Ilya, Joshi, Amogh, Murthy, Kiran, Bleibel, Francois, and Heide, Felix
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Challenging to capture, and challenging to display on a cellphone screen, the panorama paradoxically remains both a staple and underused feature of modern mobile camera applications. In this work we address both of these challenges with a spherical neural light field model for implicit panoramic image stitching and re-rendering; able to accommodate for depth parallax, view-dependent lighting, and local scene motion and color changes during capture. Fit during test-time to an arbitrary path panoramic video capture -- vertical, horizontal, random-walk -- these neural light spheres jointly estimate the camera path and a high-resolution scene reconstruction to produce novel wide field-of-view projections of the environment. Our single-layer model avoids expensive volumetric sampling, and decomposes the scene into compact view-dependent ray offset and color components, with a total model size of 80 MB per scene, and real-time (50 FPS) rendering at 1080p resolution. We demonstrate improved reconstruction quality over traditional image stitching and radiance field methods, with significantly higher tolerance to scene motion and non-ideal capture settings., Comment: Project site: https://light.princeton.edu/publication/neuls/
- Published
- 2024
17. Exploring an Inter-Pausal Unit (IPU) based Approach for Indic End-to-End TTS Systems
- Author
-
Prakash, Anusha and Murthy, Hema A
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Sentences in Indian languages are generally longer than those in English. Indian languages are also considered to be phrase-based, wherein semantically complete phrases are concatenated to make up sentences. Long utterances lead to poor training of text-to-speech models and result in poor prosody during synthesis. In this work, we explore an inter-pausal unit (IPU) based approach in the end-to-end (E2E) framework, focusing on synthesising conversational-style text. We consider both autoregressive Tacotron2 and non-autoregressive FastSpeech2 architectures in our study and perform experiments with three Indian languages, namely, Hindi, Tamil and Telugu. With the IPU-based Tacotron2 approach, we see a reduction in insertion and deletion errors in the synthesised audio, providing an alternative approach to the FastSpeech(2) network in terms of error reduction. The IPU-based approach requires less computational resources and produces prosodically richer synthesis compared to conventional sentence-based systems.
- Published
- 2024
18. Exploiting Beam-Split in IRS-aided Systems via OFDMA
- Author
-
Siddhartha, P., Yashvanth, L., and Murthy, Chandra R.
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
In wideband systems operating at mmWave frequencies, intelligent reflecting surfaces (IRSs) equipped with many passive elements can compensate for channel propagation losses. Then, a phenomenon known as the beam-split (B-SP) occurs in which the phase shifters at the IRS elements fail to beamform at a desired user equipment (UE) over the total allotted bandwidth (BW). Although B-SP is usually seen as an impairment, in this paper, we take an optimistic view and exploit the B-SP effect to enhance the system performance via an orthogonal frequency division multiple access (OFDMA). We argue that due to the B-SP, when an IRS is tuned to beamform at a particular angle on one frequency, it also forms beams in different directions on other frequencies. Then, by opportunistically scheduling different UEs on different subcarriers (SCs), we show that, almost surely, the optimal array gain that scales quadratically in the number of IRS elements can be achieved on all SCs in the system. We derive the achievable throughput of the proposed scheme and deduce that the system also enjoys additional multi-user diversity benefits on top of the optimal beamforming gain over the full BW. Finally, we verify our findings via numerical simulations.
- Published
- 2024
19. Disentangling the Impact of Quasiparticles and Two-Level Systems on the Statistics of Superconducting Qubit Lifetime
- Author
-
Zhu, Shaojiang, You, Xinyuan, Alyanak, Ugur, Bal, Mustafa, Crisa, Francesco, Garattoni, Sabrina, Lunin, Andrei, Pilipenko, Roman, Murthy, Akshay, Romanenko, Alexander, and Grassellino, Anna
- Subjects
Quantum Physics - Abstract
Temporal fluctuations in the superconducting qubit lifetime, $T_1$, bring up additional challenges in building a fault-tolerant quantum computer. While the exact mechanisms remain unclear, $T_1$ fluctuations are generally attributed to the strong coupling between the qubit and a few near-resonant two-level systems (TLSs) that can exchange energy with an ensemble of thermally fluctuating two-level fluctuators (TLFs) at low frequencies. Here, we report $T_1$ measurements on the qubits with different geometrical footprints and surface dielectrics as a function of the temperature. By analyzing the noise spectrum of the qubit depolarization rate, $\Gamma_1 = 1/T_1$, we can disentangle the impact of TLSs, non-equilibrium quasiparticles (QPs), and equilibrium (thermally excited) QPs on the variance in $\Gamma_1$. We find that $\Gamma_1$ variances in the qubit with a small footprint are more susceptible to the QP and TLS fluctuations than those in the large-footprint qubits. Furthermore, the QP-induced variances in all qubits are consistent with the theoretical framework of QP diffusion and fluctuation. We suggest these findings can offer valuable insights for future qubit design and engineering optimization., Comment: 6+4 pages, 3+3 figures
- Published
- 2024
20. Benchmarking and Building Zero-Shot Hindi Retrieval Model with Hindi-BEIR and NLLB-E5
- Author
-
Acharya, Arkadeep, Murthy, Rudra, Kumar, Vishwajeet, and Sen, Jaydeep
- Subjects
Computer Science - Information Retrieval ,Computer Science - Computation and Language - Abstract
Given the large number of Hindi speakers worldwide, there is a pressing need for robust and efficient information retrieval systems for Hindi. Despite ongoing research, comprehensive benchmarks for evaluating retrieval models in Hindi are lacking. To address this gap, we introduce the Hindi-BEIR benchmark, comprising 15 datasets across seven distinct tasks. We evaluate state-of-the-art multilingual retrieval models on the Hindi-BEIR benchmark, identifying task and domain-specific challenges that impact Hindi retrieval performance. Building on the insights from these results, we introduce NLLB-E5, a multilingual retrieval model that leverages a zero-shot approach to support Hindi without the need for Hindi training data. We believe our contributions, which include the release of the Hindi-BEIR benchmark and the NLLB-E5 model, will prove to be a valuable resource for researchers and promote advancements in multilingual retrieval models., Comment: arXiv admin note: substantial text overlap with arXiv:2408.09437
- Published
- 2024
21. Training microwave pulses using quantum machine learning
- Author
-
Nola, Jaden, Sanchez, Uriah, Murthy, Anusha Krishna, Behrman, Elizabeth, and Steck, James
- Subjects
Quantum Physics - Abstract
A gate sequence of single-qubit transformations may be condensed into a single microwave pulse that maps a qubit from an initialized state directly into the desired state of the composite transformation. Here, machine learning is used to learn the parameterized values for a single driving pulse associated with a transformation of three sequential gate operations on a qubit. This implies that future quantum circuits may contain roughly a third of the number of single-qubit operations performed, greatly reducing the problems of noise and decoherence. There is a potential for even greater condensation and efficiency using the methods of quantum machine learning.
- Published
- 2024
22. xLAM: A Family of Large Action Models to Empower AI Agent Systems
- Author
-
Zhang, Jianguo, Lan, Tian, Zhu, Ming, Liu, Zuxin, Hoang, Thai, Kokane, Shirley, Yao, Weiran, Tan, Juntao, Prabhakar, Akshara, Chen, Haolin, Liu, Zhiwei, Feng, Yihao, Awalgaonkar, Tulika, Murthy, Rithesh, Hu, Eric, Chen, Zeyuan, Xu, Ran, Niebles, Juan Carlos, Heinecke, Shelby, Wang, Huan, Savarese, Silvio, and Xiong, Caiming
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Autonomous agents powered by large language models (LLMs) have attracted significant research interest. However, the open-source community faces many challenges in developing specialized models for agent tasks, driven by the scarcity of high-quality agent datasets and the absence of standard protocols in this area. We introduce and publicly release xLAM, a series of large action models designed for AI agent tasks. The xLAM series includes five models with both dense and mixture-of-expert architectures, ranging from 1B to 8x22B parameters, trained using a scalable, flexible pipeline that unifies, augments, and synthesizes diverse datasets to enhance AI agents' generalizability and performance across varied environments. Our experimental results demonstrate that xLAM consistently delivers exceptional performance across multiple agent ability benchmarks, notably securing the 1st position on the Berkeley Function-Calling Leaderboard, outperforming GPT-4, Claude-3, and many other models in terms of tool use. By releasing the xLAM series, we aim to advance the performance of open-source LLMs for autonomous AI agents, potentially accelerating progress and democratizing access to high-performance models for agent tasks. Models are available at https://huggingface.co/collections/Salesforce/xlam-models-65f00e2a0a63bbcd1c2dade4, Comment: Technical report for the Salesforce xLAM model series
- Published
- 2024
23. Professional Identity Development in Bioscience Education: A Systematic Review of the Literature
- Author
-
Sunita Ananda Raste and Sahana Murthy
- Abstract
This article addressed the significant issue of identity crisis experienced by students in their choice of profession, as highlighted in various research studies. The importance of examining professional identity development from a disciplinary perspective, particularly in the biosciences and allied fields, was emphasized. To achieve this, we conducted a synthesis of 85 research articles, aiming to comprehend the definitions and measurement approaches employed in understanding professional identity and its development within these disciplines. Our analysis also encompasses a summary of the factors influencing professional identity development, coupled with strategies to support it. The findings suggest that professional identity is linked to an individual's persistence and success in a given profession. This research provides valuable insights for researchers and educators striving to create an optimal learning environment that facilitates students in exploring and shaping their professional identities.
- Published
- 2024
24. 1.7-micron Optical Coherence Tomography Angiography for diagnosis and monitoring of Hereditary Hemorrhagic Telangiectasia - A pilot study
- Author
-
Murthy, Raksha Sreeramachandra, Elsanadi, Rachel, Soliman, John, Li, Yan, Chou, Li-Dek, Sprecher, Dennis, Kelly, Kristen M, and Chen, Zhongping
- Subjects
Engineering ,Biomedical Engineering ,Information and Computing Sciences ,Electronics ,Sensors and Digital Hardware ,Computer Vision and Multimedia Computation ,Rare Diseases ,Biomedical Imaging ,Clinical Research ,Hematology ,Bioengineering ,4.2 Evaluation of markers and technologies ,Artificial Intelligence and Image Processing ,Electrical and Electronic Engineering ,Biomedical engineering ,Electronics ,sensors and digital hardware ,Computer vision and multimedia computation - Abstract
Objective: Develop a multi-functional imaging system that combines 1.7μm optical coherence tomography/angiography (OCT/OCTA) to accurately interrogate Hereditary Hemorrhagic Telangiectasia (HHT) skin lesions. Methods: The study involved imaging HHT skin lesions on five subjects including lips, hands, and chest. We assessed the attributes of both HHT lesions and the healthy vasculature around them in these individuals, employing quantifiable measures such as vascular density and diameter. Additionally, we performed scans on an HHT patient who had undergone anti-angiogenic therapy, allowing us to observe changes in vasculature before and after treatment. Results: The results from this pilot study demonstrate the feasibility of evaluating the HHT lesion using this novel methodology and suggest the potential of OCTA to noninvasively track HHT lesions over time. The average percentage change in density between HHT patients' lesions and control was 37%. The percentage increase in vessel diameter between lesion and control vessels in HHT patients was 23.21%. Conclusion: In this study, we demonstrated that OCTA, as a functional extension of OCT, can non-invasively scan HHT lesions in vivo. We scanned five subjects with HHT lesions in various areas (lip, ear, finger, and palm) and quantified vascular density and diameter in both the lesions and adjacent healthy tissue. This non-invasive method will permit a more comprehensive examination of HHT lesions. Significance: This method of non-invasive imaging could offer new insights into the physiology, management, and therapeutics of HHT-associated lesion development and bleeding.
- Published
- 2024
25. Neuronal parts list and wiring diagram for a visual system.
- Author
-
Matsliah, Arie, Yu, Szi-Chieh, Kruk, Krzysztof, Bland, Doug, Burke, Austin T, Gager, Jay, Hebditch, James, Silverman, Ben, Willie, Kyle Patrick, Willie, Ryan, Sorek, Marissa, Sterling, Amy R, Kind, Emil, Garner, Dustin, Sancer, Gizem, Wernet, Mathias F, Kim, Sung Soo, Murthy, Mala, Seung, H Sebastian, and FlyWire Consortium
- Subjects
Animals ,Female ,Algorithms ,Color Vision ,Connectome ,Drosophila melanogaster ,Interneurons ,Models ,Neurological ,Motion Perception ,Neurons ,Neuropil ,Optic Lobe ,Nonmammalian ,Reproducibility of Results ,Visual Fields ,Visual Pathways ,General Science & Technology - Abstract
A catalogue of neuronal cell types has often been called a 'parts list' of the brain [1], and regarded as a prerequisite for understanding brain function [2,3]. In the optic lobe of Drosophila, rules of connectivity between cell types have already proven to be essential for understanding fly vision [4,5]. Here we analyse the fly connectome to complete the list of cell types intrinsic to the optic lobe, as well as the rules governing their connectivity. Most new cell types contain 10 to 100 cells, and integrate information over medium distances in the visual field. Some existing type families (Tm, Li, and LPi) [6-10] at least double in number of types. A new serpentine medulla (Sm) interneuron family contains more types than any other. Three families of cross-neuropil types are revealed. The consistency of types is demonstrated by analysing the distances in high-dimensional feature space, and is further validated by algorithms that select small subsets of discriminative features. We use connectivity to hypothesize about the functional roles of cell types in motion, object and colour vision. Connectivity with 'boundary types' that straddle the optic lobe and central brain is also quantified. We showcase the advantages of connectomic cell typing: complete and unbiased sampling, a rich array of features based on connectivity and reduction of the connectome to a substantially simpler wiring diagram of cell types, with immediate relevance for brain function and development.
- Published
- 2024
26. A Non-Traditional Approach to Assisting Data Address Translation
- Author
-
Murthy, Shyam and Sohi, Gurindar S
- Subjects
Computer Science - Hardware Architecture - Abstract
This paper proposes a novel way to assist conventional data address translation. The approach, PC-Indexed Data Address Translation (PCAX), uses the PC of a load instruction, and not a data virtual address, to obtain the page table entry (PTE) for the data accessed by a load instruction. PCAX is intended to be used for a small subset of the static loads in a program. We observe that: (i) a small subset of static loads is responsible for most of the misses in a data translation lookaside buffer (DTLB), and (ii) often a dynamic instance of a static load instruction accesses the same PTE as the last dynamic instance, and consider PCAX for this subset. With PCAX the effective miss rate of a conventional DTLB can be cut down by a factor of 2-3X in many cases, and even more in some cases. PCAX is also beneficial in reducing the number of secondary TLB (STLB) misses. Since the tables used for PCAX can be accessed alongside instruction fetch, they can be slow, yet frequently provide a valid PTE even before the data address calculation. This results in a performance improvement, and reduced data address translation energy, in most cases.
- Published
- 2024
27. Continuum Damage Model for Hydrogen Embrittlement in Ferritic Steels
- Author
-
Valiveti, Dakshina Murthy and Neeraj, T.
- Subjects
Computer Science - Computational Engineering, Finance, and Science ,Mathematics - Numerical Analysis - Abstract
Hydrogen embrittlement of metals and alloys, particularly steels, has been an important scientific and engineering challenge in the Oil and Gas industry for many years. It impacts the integrity and performance of a wide range of structures and equipment such as downhole tubulars and pipelines in sour service in the Upstream (U/S) and hydro-processing reactors in the Downstream (D/S). In addition, the rapidly growing interest in hydrogen as an energy carrier for fuel cells and mobility or as a clean fuel/heat source for hard-to-decarbonize industrial processes draws attention to this key challenge of materials integrity in handling hydrogen. The fundamental understanding of failure mechanism(s) and the capability to model material behavior are important for managing the integrity and for repurposing existing infrastructure for transporting hydrogen as well as for extending the life of structures. To that end, the present work develops a robust mathematical model to estimate the strength degradation and embrittlement due to hydrogen in steels. The model incorporates the hydrogen-affected constitutive response of the material within the framework of the finite element method. The modified constitutive response is a Gurson plasticity based continuum damage model and incorporates two vital aspects of NVC failure theory. These key aspects are (i) hydrogen enhanced localized dislocation plasticity, and (ii) hydrogen enhanced vacancy stabilization forming nano-voids. The deformation and damage in the material are coupled with trap mediated hydrogen diffusion. Calibration of damage model parameters is performed for X65 commercial linepipe steel. Finally, the capability of the damage model is demonstrated with numerical simulation of round bar tensile tests on X65 steel under hydrogen exposure. The numerical simulations are shown to be in excellent agreement with experimental results., Comment: 39 pages, 7 figures, 1 table
- Published
- 2024
28. Strings in AdS$_3$: one-loop partition function and near-extremal BTZ thermodynamics
- Author
-
Ferko, Christian, Murthy, Sameer, and Rangamani, Mukund
- Subjects
High Energy Physics - Theory - Abstract
We revisit the computation of the string partition function in AdS$_3$ focussing on the appearance of spacetime (super) symmetries. We show how the asymptotic symmetries of the AdS$_3$ spacetime, which generate the boundary (super) Virasoro currents, are captured by the one-loop partition sum. We use this to argue that the recent understanding of near-extremal black hole thermodynamics based on the gravitational path integral continues to hold for finite string length. Along the way we clarify some aspects of the AdS$_3$/CFT$_2$ duality and, in particular, deduce which bulk gauge fields lead to boundary currents. We also explain how one can interpolate between supersymmetric and thermal (Atick-Witten) fermion boundary conditions in the target space by suitably tuning rotational chemical potentials in the string partition function.
- Published
- 2024
29. CT scans without X-rays: parallel-beam imaging from nonlinear current flows
- Author
-
Alsaker, Melody, Rautio, Siiri, Moura, Fernando, Agnelli, Juan Pablo, Murthy, Rashmi, Lassas, Matti, Mueller, Jennifer L., and Siltanen, Samuli
- Subjects
Mathematics - Analysis of PDEs - Abstract
Parallel-beam X-ray computed tomography (CT) and electrical impedance tomography (EIT) are two imaging modalities which stem from completely different underlying physics, and for decades have been thought to have little in common either practically or mathematically. CT is only mildly ill-posed and uses straight X-rays as measurement energy, which admits simple linear mathematics. However, CT relies on exposing targets to ionizing radiation and requires cumbersome setups with expensive equipment. In contrast, EIT uses harmless electrical currents as measurement energy and can be implemented using simple low-cost portable setups. But EIT is burdened by nonlinearity stemming from the curved paths of electrical currents, as well as extreme ill-posedness which causes characteristic low spatial resolution. In practical EIT reconstruction methods, nonlinearity and ill-posedness have been considered intertwined in a complicated fashion. In this work we demonstrate a surprising connection between CT and EIT which partly unravels the main problems of EIT and leads directly to a proposed imaging modality which we call virtual hybrid parallel-beam tomography (VHPT). We show that hidden deep within EIT data is information which possesses the same linear geometry as parallel-beam CT data. This admits a fundamental restructuring of EIT, separating ill-posedness and nonlinearity into simple modular sub-problems, and yields ''virtual radiographs'' and CT-like images which reveal previously concealed information. Furthermore, as proof of concept we present VHPT images of real-world objects.
- Published
- 2024
30. Sequential Resource Trading Using Comparison-Based Gradient Estimation
- Author
-
Murthy, Surya, Karabag, Mustafa O., and Topcu, Ufuk
- Subjects
Computer Science - Multiagent Systems ,Computer Science - Artificial Intelligence ,Mathematics - Optimization and Control - Abstract
Autonomous agents interact with other agents of unknown preferences to share resources in their environment. We explore sequential trading for resource allocation in a setting where two greedily rational agents sequentially trade resources from a finite set of categories. Each agent has a utility function that depends on the amount of resources it possesses in each category. The offering agent makes trade offers to improve its utility without knowing the responding agent's utility function, and the responding agent only accepts offers that improve its utility. We present an algorithm for the offering agent to estimate the responding agent's gradient (preferences) and make offers based on previous acceptance or rejection responses. The algorithm's goal is to reach a Pareto-optimal resource allocation state while ensuring that the utilities of both agents improve after every accepted trade. We show that, after a finite number of consecutively rejected offers, the responding agent is at a near-optimal state, or the agents' gradients are closely aligned. We compare the proposed algorithm against various baselines in continuous and discrete trading scenarios and show that it improves the societal benefit with fewer offers.
- Published
- 2024
31. Mistral-SPLADE: LLMs for better Learned Sparse Retrieval
- Author
-
Doshi, Meet, Kumar, Vishwajeet, Murthy, Rudra, P, Vignesh, and Sen, Jaydeep
- Subjects
Computer Science - Information Retrieval ,Computer Science - Computation and Language - Abstract
Learned Sparse Retrievers (LSR) have evolved into an effective retrieval strategy that can bridge the gap between traditional keyword-based sparse retrievers and embedding-based dense retrievers. At their core, learned sparse retrievers try to learn the most important semantic keyword expansions from a query and/or document which can facilitate better retrieval with overlapping keyword expansions. LSR models like SPLADE have typically used encoder-only models with an MLM (masked language modeling) style objective in conjunction with known ways of improving retrieval performance such as hard negative mining, distillation, etc. In this work, we propose to use a decoder-only model for learning semantic keyword expansion. We posit that decoder-only models that have seen much higher magnitudes of data are better equipped to learn the keyword expansions needed for improved retrieval. We use Mistral as the backbone to develop our Learned Sparse Retriever similar to SPLADE and train it on a subset of sentence-transformer data which is often used for training text embedding models. Our experiments support the hypothesis that a sparse retrieval model based on a decoder-only large language model (LLM) surpasses the performance of existing LSR systems, including SPLADE and all its variants. The LLM-based model (Echo-Mistral-SPLADE) now stands as a state-of-the-art learned sparse retrieval model on the BEIR text retrieval benchmark.
- Published
- 2024
32. Fractional quantum Hall coexistence phases in higher Landau levels of graphene
- Author
-
An, Jincheng, Balram, Ajit C., Khanna, Udit, and Murthy, Ganpathy
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
Monolayer graphene under a strong magnetic field near charge neutrality manifests the integer and fractional quantum Hall effects. Since only some of the four spin/valley flavors available to the electrons in each Landau level manifold are filled, they also exhibit spontaneous symmetry breaking in the spin/valley sector, a phenomenon known as quantum Hall ferromagnetism. In this work, we study quantum Hall ferromagnets in the higher Landau level manifolds of monolayer graphene and show that there is an even richer set of symmetry-broken phases than in the lowest Landau level manifold. Specifically, both valley polarized and valley equatorial (where the occupied Landau levels are in an equal superposition of both valleys) ferromagnets, antiferromagnets, and canted antiferromagnets are found. Several types of spin valley entangled phases are found, all of which manifest the simultaneous spontaneous symmetry breaking of both magnetic and lattice symmetries., Comment: 27 pages, 18 figures
- Published
- 2024
33. Hindi-BEIR : A Large Scale Retrieval Benchmark in Hindi
- Author
-
Acharya, Arkadeep, Murthy, Rudra, Kumar, Vishwajeet, and Sen, Jaydeep
- Subjects
Computer Science - Information Retrieval ,Computer Science - Computation and Language - Abstract
Given the large number of Hindi speakers worldwide, there is a pressing need for robust and efficient information retrieval systems for Hindi. Despite ongoing research, there is a lack of a comprehensive benchmark for evaluating retrieval models in Hindi. To address this gap, we introduce the Hindi version of the BEIR benchmark, which includes a subset of English BEIR datasets translated to Hindi, existing Hindi retrieval datasets, and synthetically created datasets for retrieval. The benchmark is comprised of $15$ datasets spanning $8$ distinct tasks. We evaluate state-of-the-art multilingual retrieval models on this benchmark to identify task and domain-specific challenges and their impact on retrieval performance. By releasing this benchmark and a set of relevant baselines, we enable researchers to understand the limitations and capabilities of current Hindi retrieval models, promoting advancements in this critical area. The datasets from Hindi-BEIR are publicly available.
- Published
- 2024
34. Entendre, a Social Bot Detection Tool for Niche, Fringe, and Extreme Social Media
- Author
-
Venkatesh, Pranav, Vinton, Kami, Murthy, Dhiraj, Sharp, Kellen, and Kolluri, Akaash
- Subjects
Computer Science - Computers and Society ,Computer Science - Artificial Intelligence ,Computer Science - Human-Computer Interaction ,Computer Science - Social and Information Networks ,J.4 ,I.2 ,I.7 ,K.4 - Abstract
Social bots, automated accounts that generate and spread content on social media, are exploiting vulnerabilities in these platforms to manipulate public perception and disseminate disinformation. This has prompted the development of public bot detection services; however, most of these services focus primarily on Twitter, leaving niche platforms vulnerable. Fringe social media platforms such as Parler, Gab, and Gettr often have minimal moderation, which facilitates the spread of hate speech and misinformation. To address this gap, we introduce Entendre, an open-access, scalable, and platform-agnostic bot detection framework. Entendre can process a labeled dataset from any social platform to produce a tailored bot detection model using a random forest classification approach, ensuring robust social bot detection. We exploit the idea that most social platforms share a generic template, where users can post content, approve content, and provide a bio (common data features). By emphasizing general data features over platform-specific ones, Entendre offers rapid extensibility at the expense of some accuracy. To demonstrate Entendre's effectiveness, we used it to explore the presence of bots among accounts posting racist content on the now-defunct right-wing platform Parler. We examined 233,000 posts from 38,379 unique users and found that 1,916 unique users (4.99%) exhibited bot-like behavior. Visualization techniques further revealed that these bots significantly impacted the network, amplifying influential rhetoric and hashtags (e.g., #qanon, #trump, #antilgbt). These preliminary findings underscore the need for tools like Entendre to monitor and assess bot activity across diverse platforms., Comment: 6 pages
- Published
- 2024
35. Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
- Author
-
Zhang, Kexun, Yao, Weiran, Liu, Zuxin, Feng, Yihao, Liu, Zhiwei, Murthy, Rithesh, Lan, Tian, Li, Lei, Lou, Renze, Xu, Jiacheng, Pang, Bo, Zhou, Yingbo, Heinecke, Shelby, Savarese, Silvio, Wang, Huan, and Xiong, Caiming
- Subjects
Computer Science - Software Engineering ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Large language model (LLM) agents have shown great potential in solving real-world software engineering (SWE) problems. The most advanced open-source SWE agent can resolve over 27% of real GitHub issues in SWE-Bench Lite. However, these sophisticated agent frameworks exhibit varying strengths, excelling in certain tasks while underperforming in others. To fully harness the diversity of these agents, we propose DEI (Diversity Empowered Intelligence), a framework that leverages their unique expertise. DEI functions as a meta-module atop existing SWE agent frameworks, managing agent collectives for enhanced problem-solving. Experimental results show that a DEI-guided committee of agents is able to surpass the best individual agent's performance by a large margin. For instance, a group of open-source SWE agents, with a maximum individual resolve rate of 27.3% on SWE-Bench Lite, can achieve a 34.3% resolve rate with DEI, making a 25% improvement and beating most closed-source solutions. Our best-performing group excels with a 55% resolve rate, securing the highest ranking on SWE-Bench Lite. Our findings contribute to the growing body of research on collaborative AI systems and their potential to solve complex software engineering challenges.
- Published
- 2024
36. Modeling Transit in a Fully Integrated Agent-Based Framework: Methodology and Large-Scale Application
- Author
-
Verbas, Omer, Cokyasar, Taner, de Camargo, Pedro Veiga, Gurumurthy, Krishna Murthy, Zuniga-Garcia, Natalia, and Auld, Joshua
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
This study presents a transit routing, assignment, and simulation framework which is fully embedded in a multimodal, multi-agent transportation demand and supply modeling platform. POLARIS, a high-performance agent-based simulation platform, efficiently integrates advanced travel and freight demand modeling, dynamic traffic and transit assignment, and multimodal transportation simulation within a unified framework. We focus on POLARIS's transit routing, assignment, and simulation components, detailing its structural design and essential terminologies. We demonstrate how the model integrates upstream decision-making processes - activity generation, location and timing choices, and mode selection, particularly for transit-inclusive trips - followed by routing, assignment decisions, and the movement of travelers and vehicles within a multimodal network. This integration enables modeling of interactions among all agents, including travelers, vehicles, and transportation service providers. The study reviews literature on transportation system modeling tools, describes the transit modeling framework within POLARIS, and presents findings from large-scale analyses of various policy interventions. Results from numerical experiments reveal that measures such as congestion pricing, transit service improvements, first-mile-last-mile subsidies, increased e-commerce deliveries, and vehicle electrification significantly impact transit ridership, with some interactions between these levers exhibiting synergistic or canceling effects. The case study underscores the necessity of integrating transit modeling within a broader multimodal network simulation and decision-making context.
- Published
- 2024
37. Enhanced Superconducting Qubit Performance Through Ammonium Fluoride Etch
- Author
-
Kopas, Cameron J., Goronzy, Dominic P., Pham, Thang, Castanedo, Carlos G. Torres, Cheng, Matthew, Cochrane, Rory, Nast, Patrick, Lachman, Ella, Zhelev, Nikolay Z., Vallieres, Andre, Murthy, Akshay A., Oh, Jin-su, Zhou, Lin, Kramer, Matthew J., Cansizoglu, Hilal, Bedzyk, Michael J., Dravid, Vinayak P., Romanenko, Alexander, Grassellino, Anna, Mutus, Josh Y., Hersam, Mark C., and Yadavalli, Kameshwar
- Subjects
Condensed Matter - Materials Science ,Quantum Physics - Abstract
The performance of superconducting qubits is often limited by dissipation and two-level systems (TLS) losses. The dominant sources of these losses are believed to originate from amorphous materials and defects at interfaces and surfaces, likely as a result of fabrication processes or ambient exposure. Here, we explore a novel wet chemical surface treatment at the Josephson junction-substrate and the substrate-air interfaces by replacing a buffered oxide etch (BOE) cleaning process with one that uses hydrofluoric acid followed by aqueous ammonium fluoride. We show that the ammonium fluoride etch process results in a statistically significant improvement in median $\text{T}_1$ by $\sim22\%$ ($p=0.002$), and a reduction in the number of strongly-coupled TLS in the tunable frequency range. Microwave resonator measurements on samples treated with the ammonium fluoride etch prior to niobium deposition also show $\sim33\%$ lower TLS-induced loss tangent compared to the BOE treated samples. As the chemical treatment primarily modifies the Josephson junction-substrate interface and substrate-air interface, we perform targeted chemical and structural characterizations to examine materials' differences at these interfaces and identify multiple microscopic changes that could contribute to decreased TLS.
- Published
- 2024
38. MaterioMiner -- An ontology-based text mining dataset for extraction of process-structure-property entities
- Author
-
Durmaz, Ali Riza, Thomas, Akhil, Mishra, Lokesh, Murthy, Rachana Niranjan, and Straub, Thomas
- Subjects
Computer Science - Computation and Language ,Condensed Matter - Materials Science - Abstract
While large language models learn sound statistical representations of the language and information therein, ontologies are symbolic knowledge representations that can complement the former ideally. Research at this critical intersection relies on datasets that intertwine ontologies and text corpora to enable training and comprehensive benchmarking of neurosymbolic models. We present the MaterioMiner dataset and the linked materials mechanics ontology where ontological concepts from the mechanics of materials domain are associated with textual entities within the literature corpus. Another distinctive feature of the dataset is its eminently fine-granular annotation. Specifically, 179 distinct classes are manually annotated by three raters within four publications, amounting to a total of 2191 entities that were annotated and curated. Conceptual work is presented for the symbolic representation of causal composition-process-microstructure-property relationships. We explore the annotation consistency between the three raters and perform fine-tuning of pre-trained models to showcase the feasibility of named-entity recognition model training. Reusing the dataset can foster training and benchmarking of materials language models, automated ontology construction, and knowledge graph generation from textual data.
- Published
- 2024
39. Personalized Multi-task Training for Recommender System
- Author
-
Yang, Liangwei, Liu, Zhiwei, Zhang, Jianguo, Murthy, Rithesh, Heinecke, Shelby, Wang, Huan, Xiong, Caiming, and Yu, Philip S.
- Subjects
Computer Science - Information Retrieval - Abstract
In the vast landscape of internet information, recommender systems (RecSys) have become essential for guiding users through a sea of choices aligned with their preferences. These systems have applications in diverse domains, such as news feeds, game suggestions, and shopping recommendations. Personalization is a key technique in RecSys, where modern methods leverage representation learning to encode user/item interactions into embeddings, forming the foundation for personalized recommendations. However, integrating information from multiple sources to enhance recommendation performance remains challenging. This paper introduces a novel approach named PMTRec, the first personalized multi-task learning algorithm to obtain comprehensive user/item embeddings from various information sources. Addressing challenges specific to personalized RecSys, we develop modules to handle personalized task weights, diverse task orientations, and variations in gradient magnitudes across tasks. PMTRec dynamically adjusts task weights based on gradient norms for each user/item, employs a Task Focusing module to align gradient combinations with the main recommendation task, and uses a Gradient Magnitude Balancing module to ensure balanced training across tasks. Through extensive experiments on three real-world datasets with different scales, we demonstrate that PMTRec significantly outperforms existing multi-task learning methods, showcasing its effectiveness in achieving enhanced recommendation accuracy by leveraging multiple tasks simultaneously. Our contributions open new avenues for advancing personalized multi-task training in recommender systems., Comment: 11 pages
- Published
- 2024
40. The Llama 3 Herd of Models
- Author
-
Dubey, Abhimanyu, Jauhri, Abhinav, Pandey, Abhinav, Kadian, Abhishek, Al-Dahle, Ahmad, Letman, Aiesha, Mathur, Akhil, Schelten, Alan, Yang, Amy, Fan, Angela, Goyal, Anirudh, Hartshorn, Anthony, Yang, Aobo, Mitra, Archi, Sravankumar, Archie, Korenev, Artem, Hinsvark, Arthur, Rao, Arun, Zhang, Aston, Rodriguez, Aurelien, Gregerson, Austen, Spataru, Ava, Roziere, Baptiste, Biron, Bethany, Tang, Binh, Chern, Bobbie, Caucheteux, Charlotte, Nayak, Chaya, Bi, Chloe, Marra, Chris, McConnell, Chris, Keller, Christian, Touret, Christophe, Wu, Chunyang, Wong, Corinne, Ferrer, Cristian Canton, Nikolaidis, Cyrus, Allonsius, Damien, Song, Daniel, Pintz, Danielle, Livshits, Danny, Esiobu, David, Choudhary, Dhruv, Mahajan, Dhruv, Garcia-Olano, Diego, Perino, Diego, Hupkes, Dieuwke, Lakomkin, Egor, AlBadawy, Ehab, Lobanova, Elina, Dinan, Emily, Smith, Eric Michael, Radenovic, Filip, Zhang, Frank, Synnaeve, Gabriel, Lee, Gabrielle, Anderson, Georgia Lewis, Nail, Graeme, Mialon, Gregoire, Pang, Guan, Cucurell, Guillem, Nguyen, Hailey, Korevaar, Hannah, Xu, Hu, Touvron, Hugo, Zarov, Iliyan, Ibarra, Imanol Arrieta, Kloumann, Isabel, Misra, Ishan, Evtimov, Ivan, Copet, Jade, Lee, Jaewon, Geffert, Jan, Vranes, Jana, Park, Jason, Mahadeokar, Jay, Shah, Jeet, van der Linde, Jelmer, Billock, Jennifer, Hong, Jenny, Lee, Jenya, Fu, Jeremy, Chi, Jianfeng, Huang, Jianyu, Liu, Jiawen, Wang, Jie, Yu, Jiecao, Bitton, Joanna, Spisak, Joe, Park, Jongsoo, Rocca, Joseph, Johnstun, Joshua, Saxe, Joshua, Jia, Junteng, Alwala, Kalyan Vasuden, Upasani, Kartikeya, Plawiak, Kate, Li, Ke, Heafield, Kenneth, Stone, Kevin, El-Arini, Khalid, Iyer, Krithika, Malik, Kshitiz, Chiu, Kuenley, Bhalla, Kunal, Rantala-Yeary, Lauren, van der Maaten, Laurens, Chen, Lawrence, Tan, Liang, Jenkins, Liz, Martin, Louis, Madaan, Lovish, Malo, Lubo, Blecher, Lukas, Landzaat, Lukas, de Oliveira, Luke, Muzzi, Madeline, Pasupuleti, Mahesh, Singh, Mannat, Paluri, Manohar, Kardas, Marcin, Oldham, Mathew, Rita, Mathieu, Pavlova, 
Maya, Kambadur, Melanie, Lewis, Mike, Si, Min, Singh, Mitesh Kumar, Hassan, Mona, Goyal, Naman, Torabi, Narjes, Bashlykov, Nikolay, Bogoychev, Nikolay, Chatterji, Niladri, Duchenne, Olivier, Çelebi, Onur, Alrassy, Patrick, Zhang, Pengchuan, Li, Pengwei, Vasic, Petar, Weng, Peter, Bhargava, Prajjwal, Dubal, Pratik, Krishnan, Praveen, Koura, Punit Singh, Xu, Puxin, He, Qing, Dong, Qingxiao, Srinivasan, Ragavan, Ganapathy, Raj, Calderer, Ramon, Cabral, Ricardo Silveira, Stojnic, Robert, Raileanu, Roberta, Girdhar, Rohit, Patel, Rohit, Sauvestre, Romain, Polidoro, Ronnie, Sumbaly, Roshan, Taylor, Ross, Silva, Ruan, Hou, Rui, Wang, Rui, Hosseini, Saghar, Chennabasappa, Sahana, Singh, Sanjay, Bell, Sean, Kim, Seohyun Sonia, Edunov, Sergey, Nie, Shaoliang, Narang, Sharan, Raparthy, Sharath, Shen, Sheng, Wan, Shengye, Bhosale, Shruti, Zhang, Shun, Vandenhende, Simon, Batra, Soumya, Whitman, Spencer, Sootla, Sten, Collot, Stephane, Gururangan, Suchin, Borodinsky, Sydney, Herman, Tamar, Fowler, Tara, Sheasha, Tarek, Georgiou, Thomas, Scialom, Thomas, Speckbacher, Tobias, Mihaylov, Todor, Xiao, Tong, Karn, Ujjwal, Goswami, Vedanuj, Gupta, Vibhor, Ramanathan, Vignesh, Kerkez, Viktor, Gonguet, Vincent, Do, Virginie, Vogeti, Vish, Petrovic, Vladan, Chu, Weiwei, Xiong, Wenhan, Fu, Wenyin, Meers, Whitney, Martinet, Xavier, Wang, Xiaodong, Tan, Xiaoqing Ellen, Xie, Xinfeng, Jia, Xuchao, Wang, Xuewei, Goldschlag, Yaelle, Gaur, Yashesh, Babaei, Yasmine, Wen, Yi, Song, Yiwen, Zhang, Yuchen, Li, Yue, Mao, Yuning, Coudert, Zacharie Delpierre, Yan, Zheng, Chen, Zhengxing, Papakipos, Zoe, Singh, Aaditya, Grattafiori, Aaron, Jain, Abha, Kelsey, Adam, Shajnfeld, Adam, Gangidi, Adithya, Victoria, Adolfo, Goldstand, Ahuva, Menon, Ajay, Sharma, Ajay, Boesenberg, Alex, Vaughan, Alex, Baevski, Alexei, Feinstein, Allie, Kallet, Amanda, Sangani, Amit, Yunus, Anam, Lupu, Andrei, Alvarado, Andres, Caples, Andrew, Gu, Andrew, Ho, Andrew, Poulton, Andrew, Ryan, Andrew, Ramchandani, Ankit, Franco, 
Annie, Saraf, Aparajita, Chowdhury, Arkabandhu, Gabriel, Ashley, Bharambe, Ashwin, Eisenman, Assaf, Yazdan, Azadeh, James, Beau, Maurer, Ben, Leonhardi, Benjamin, Huang, Bernie, Loyd, Beth, De Paola, Beto, Paranjape, Bhargavi, Liu, Bing, Wu, Bo, Ni, Boyu, Hancock, Braden, Wasti, Bram, Spence, Brandon, Stojkovic, Brani, Gamido, Brian, Montalvo, Britt, Parker, Carl, Burton, Carly, Mejia, Catalina, Wang, Changhan, Kim, Changkyu, Zhou, Chao, Hu, Chester, Chu, Ching-Hsiang, Cai, Chris, Tindal, Chris, Feichtenhofer, Christoph, Civin, Damon, Beaty, Dana, Kreymer, Daniel, Li, Daniel, Wyatt, Danny, Adkins, David, Xu, David, Testuggine, Davide, David, Delia, Parikh, Devi, Liskovich, Diana, Foss, Didem, Wang, Dingkang, Le, Duc, Holland, Dustin, Dowling, Edward, Jamil, Eissa, Montgomery, Elaine, Presani, Eleonora, Hahn, Emily, Wood, Emily, Brinkman, Erik, Arcaute, Esteban, Dunbar, Evan, Smothers, Evan, Sun, Fei, Kreuk, Felix, Tian, Feng, Ozgenel, Firat, Caggioni, Francesco, Guzmán, Francisco, Kanayet, Frank, Seide, Frank, Florez, Gabriela Medina, Schwarz, Gabriella, Badeer, Gada, Swee, Georgia, Halpern, Gil, Thattai, Govind, Herman, Grant, Sizov, Grigory, Guangyi, Zhang, Lakshminarayanan, Guna, Shojanazeri, Hamid, Zou, Han, Wang, Hannah, Zha, Hanwen, Habeeb, Haroun, Rudolph, Harrison, Suk, Helen, Aspegren, Henry, Goldman, Hunter, Damlaj, Ibrahim, Molybog, Igor, Tufanov, Igor, Veliche, Irina-Elena, Gat, Itai, Weissman, Jake, Geboski, James, Kohli, James, Asher, Japhet, Gaya, Jean-Baptiste, Marcus, Jeff, Tang, Jeff, Chan, Jennifer, Zhen, Jenny, Reizenstein, Jeremy, Teboul, Jeremy, Zhong, Jessica, Jin, Jian, Yang, Jingyi, Cummings, Joe, Carvill, Jon, Shepard, Jon, McPhie, Jonathan, Torres, Jonathan, Ginsburg, Josh, Wang, Junjie, Wu, Kai, U, Kam Hou, Saxena, Karan, Prasad, Karthik, Khandelwal, Kartikay, Zand, Katayoun, Matosich, Kathy, Veeraraghavan, Kaushik, Michelena, Kelly, Li, Keqian, Huang, Kun, Chawla, Kunal, Lakhotia, Kushal, Huang, Kyle, Chen, Lailin, Garg, Lakshya, A, 
Lavender, Silva, Leandro, Bell, Lee, Zhang, Lei, Guo, Liangpeng, Yu, Licheng, Moshkovich, Liron, Wehrstedt, Luca, Khabsa, Madian, Avalani, Manav, Bhatt, Manish, Tsimpoukelli, Maria, Mankus, Martynas, Hasson, Matan, Lennie, Matthew, Reso, Matthias, Groshev, Maxim, Naumov, Maxim, Lathi, Maya, Keneally, Meghan, Seltzer, Michael L., Valko, Michal, Restrepo, Michelle, Patel, Mihir, Vyatskov, Mik, Samvelyan, Mikayel, Clark, Mike, Macey, Mike, Wang, Mike, Hermoso, Miquel Jubert, Metanat, Mo, Rastegari, Mohammad, Bansal, Munish, Santhanam, Nandhini, Parks, Natascha, White, Natasha, Bawa, Navyata, Singhal, Nayan, Egebo, Nick, Usunier, Nicolas, Laptev, Nikolay Pavlovich, Dong, Ning, Zhang, Ning, Cheng, Norman, Chernoguz, Oleg, Hart, Olivia, Salpekar, Omkar, Kalinli, Ozlem, Kent, Parkin, Parekh, Parth, Saab, Paul, Balaji, Pavan, Rittner, Pedro, Bontrager, Philip, Roux, Pierre, Dollar, Piotr, Zvyagina, Polina, Ratanchandani, Prashant, Yuvraj, Pritish, Liang, Qian, Alao, Rachad, Rodriguez, Rachel, Ayub, Rafi, Murthy, Raghotham, Nayani, Raghu, Mitra, Rahul, Li, Raymond, Hogan, Rebekkah, Battey, Robin, Wang, Rocky, Maheswari, Rohan, Howes, Russ, Rinott, Ruty, Bondu, Sai Jayesh, Datta, Samyak, Chugh, Sara, Hunt, Sara, Dhillon, Sargun, Sidorov, Sasha, Pan, Satadru, Verma, Saurabh, Yamamoto, Seiji, Ramaswamy, Sharadh, Lindsay, Shaun, Feng, Sheng, Lin, Shenghao, Zha, Shengxin Cindy, Shankar, Shiva, Zhang, Shuqiang, Wang, Sinong, Agarwal, Sneha, Sajuyigbe, Soji, Chintala, Soumith, Max, Stephanie, Chen, Stephen, Kehoe, Steve, Satterfield, Steve, Govindaprasad, Sudarshan, Gupta, Sumit, Cho, Sungmin, Virk, Sunny, Subramanian, Suraj, Choudhury, Sy, Goldman, Sydney, Remez, Tal, Glaser, Tamar, Best, Tamara, Kohler, Thilo, Robinson, Thomas, Li, Tianhe, Zhang, Tianjun, Matthews, Tim, Chou, Timothy, Shaked, Tzook, Vontimitta, Varun, Ajayi, Victoria, Montanez, Victoria, Mohan, Vijai, Kumar, Vinay Satish, Mangla, Vishal, Albiero, Vítor, Ionescu, Vlad, Poenaru, Vlad, Mihailescu, Vlad Tiberiu, 
Ivanov, Vladimir, Li, Wei, Wang, Wenchen, Jiang, Wenwen, Bouaziz, Wes, Constable, Will, Tang, Xiaocheng, Wang, Xiaofang, Wu, Xiaojian, Wang, Xiaolan, Xia, Xide, Wu, Xilun, Gao, Xinbo, Chen, Yanjun, Hu, Ye, Jia, Ye, Qi, Ye, Li, Yenda, Zhang, Yilin, Zhang, Ying, Adi, Yossi, Nam, Youngjin, Yu, Wang, Hao, Yuchen, Qian, Yundi, He, Yuzi, Rait, Zach, DeVito, Zachary, Rosnbrick, Zef, Wen, Zhaoduo, Yang, Zhenyu, and Zhao, Zhiwei
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
- Published
- 2024
41. Evaporation limited spreading of ethanol on rectangular porous strips: an experimental and theoretical investigation
- Author
-
Murthy, Rampally Srirama Chandra and Kumar, Navneet
- Subjects
Physics - Fluid Dynamics ,76S05 - Abstract
Wicking is a widely studied process in both natural and artificial systems. In many industrial applications, such as heat pipes, the wicking liquid evaporates to regulate temperature effectively. This study focuses on a simpler scenario where liquid ethanol climbs a vertically oriented filter paper (FP) under laboratory conditions, facilitating mass loss through evaporation and inducing cooling. Three filter papers with different permeability values were used, and three diagnostic methods (optical imaging, thermal imaging, and precision weighing) were employed to understand the dynamics of the process. The results showed a steady-state height Lc significantly lower than Jurin's limit in all cases, indicating that evaporative mass loss, and not gravity, limits the process. For instance, the filter paper 1005FP, with a capillary radius of 59 µm and an average pore size of 2.50 µm, would reach a Jurin's height of 9.6 cm with ethanol if evaporation were not allowed. However, when evaporation occurred, the height reduced to 1.2 cm, an eightfold decrease; a similar reduction by a factor of 3 was observed for 1004FP. Further, thermal imaging revealed a non-constant temperature distribution along the filter paper, with an unusual temperature inversion near the middle of the wicking liquid. This observation led to an improvement of the Constant Evaporation Model (CEM) by Fries et al. (2008) by accounting for the nonlinear behavior of evaporation rates varying with vertical position. This new model, termed the Non-Constant Evaporation Model (NCEM), tested two power-law relations for evaporation rates, both of which successfully captured the key features of the process., Comment: 43 pages, 19 figures
- Published
- 2024
42. Ultraviolet Extinction Sky Survey (UVESS): A mission concept for probing the interstellar medium in the Milky Way and Local Group galaxies
- Author
-
Mathew, Joice, Battisti, Andrew, Vaughn, Israel, Jain, Shubhangi, Mohan, Rekhesh, and Murthy, Jayant
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,Astrophysics - Astrophysics of Galaxies - Abstract
The 2175 Å bump shows considerable variations in its strength, width, and central wavelength when observed along different sightlines in the Milky Way and other galaxies. These variations offer valuable insights into the composition, size distribution, and processing of interstellar dust grains along different sightlines. This paper introduces a mission concept called UVESS (Ultraviolet Extinction Sky Survey) aimed at exploring the composition of the interstellar medium (ISM) within both the Milky Way and nearby Local Group Galaxies by mapping the variation of UV extinction curve slopes and the 2175 Å feature across a majority of the sky to gain insights into the makeup of the ISM. Recent advancements in UV instrumentation and technologies pave the way for the development of high-throughput instruments in compact form factors. In this paper, we outline mission science goals and instrument concept tailored for a small satellite-based platform dedicated to the study of UV extinction.
- Published
- 2024
43. INDIC QA BENCHMARK: A Multilingual Benchmark to Evaluate Question Answering capability of LLMs for Indic Languages
- Author
-
Singh, Abhishek Kumar, Murthy, Rudra, Kumar, Vishwajeet, Sen, Jaydeep, and Ramakrishnan, Ganesh
- Subjects
Computer Science - Machine Learning - Abstract
Large Language Models (LLMs) have demonstrated remarkable zero-shot and few-shot capabilities in unseen tasks, including context-grounded question answering (QA) in English. However, the evaluation of LLMs' capabilities in non-English languages for context-based QA is limited by the scarcity of benchmarks in non-English languages. To address this gap, we introduce Indic-QA, the largest publicly available context-grounded question-answering dataset for 11 major Indian languages from two language families. The dataset comprises both extractive and abstractive question-answering tasks and includes existing datasets as well as English QA datasets translated into Indian languages. Additionally, we generate a synthetic dataset using the Gemini model to create question-answer pairs given a passage, which is then manually verified for quality assurance. We evaluate various multilingual Large Language Models and their instruction-fine-tuned variants on the benchmark and observe that their performance is subpar, particularly for low-resource languages. We hope that the release of this dataset will stimulate further research on the question-answering abilities of LLMs for low-resource languages.
- Published
- 2024
44. Direct Measurement of Microwave Loss in Nb Films for Superconducting Qubits
- Author
-
Abdisatarov, B., Bafia, D., Murthy, A., Eremeev, G., Elsayed-Ali, H. E., Lee, J., Netepenko, A., Carlos, C. P. A., Leith, S., Rosaz, G. J., Romanenko, A., and Grassellino, A.
- Subjects
Condensed Matter - Superconductivity ,Quantum Physics - Abstract
Niobium films are a key component in modern two-dimensional superconducting qubits, yet their contribution to the total qubit decay rate is not fully understood. The presence of different layers of materials and interfaces makes it difficult to identify the dominant loss channels in present two-dimensional qubit designs. In this paper we present the first study which directly correlates measurements of RF losses in such films to material parameters by investigating a high-power impulse magnetron sputtered (HiPIMS) film atop a three-dimensional niobium superconducting radiofrequency (SRF) resonator. By using a 3D SRF structure, we are able to isolate the niobium film loss from other contributions. Our findings indicate that microwave dissipation in the HiPIMS-prepared niobium films, within the quantum regime, resembles that of bulk niobium SRF cavities with record-high intrinsic quality factors, with lifetimes extending into seconds. Microstructure and impurity level of the niobium film do not significantly affect the losses. These results set the scale of microwave losses in niobium films and show that niobium losses do not dominate the observed coherence times in present two-dimensional superconducting qubit designs, instead highlighting the dominant role of the dielectric oxide in limiting the performance. We can also set a bound for when niobium film losses will become a limitation for qubit lifetimes., Comment: 20 pages, 8 figures
- Published
- 2024
45. Hybrid Machine Learning Approach for Real-Time Malicious URL Detection Using SOM-RMO and RBFN with Tabu Search Optimization
- Author
-
T, Swetha, M, Seshaiah, KL, Hemalatha, BH, ManjunathaKumar, and SVN, Murthy
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
The proliferation of malicious URLs has become a significant threat to internet security, encompassing SPAM, phishing, malware, and defacement attacks. Traditional detection methods struggle to keep pace with the evolving nature of these threats. Detecting malicious URLs in real-time requires advanced techniques capable of handling large datasets and identifying novel attack patterns. The challenge lies in developing a robust model that combines efficient feature extraction with accurate classification. We propose a hybrid machine learning approach combining Self-Organizing Map based Radial Movement Optimization (SOM-RMO) for feature extraction and Radial Basis Function Network (RBFN) based Tabu Search for classification. SOM-RMO effectively reduces dimensionality and highlights significant features, while RBFN, optimized with Tabu Search, classifies URLs with high precision. The proposed model demonstrates superior performance in detecting various malicious URL attacks. On a benchmark dataset, our approach achieved an accuracy of 96.5%, precision of 95.2%, recall of 94.8%, and an F1-score of 95.0%, outperforming traditional methods significantly.
- Published
- 2024
46. Physics-informed Neural Networks for Heterogeneous Poroelastic Media
- Author
-
Roy, Sumanta, Annavarapu, Chandrasekhar, Roy, Pratanu, and Valiveti, Dakshina Murthy
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
This study presents a novel physics-informed neural network (PINN) framework for modeling poroelasticity in heterogeneous media with material interfaces. The approach introduces a composite neural network (CoNN) where separate neural networks predict displacement and pressure variables for each material. While sharing identical activation functions, these networks are independently trained for all other parameters. To address challenges posed by heterogeneous material interfaces, the CoNN is integrated with the Interface-PINNs or I-PINNs framework (Sarma et al. 2024, https://dx.doi.org/10.1016/j.cma.2024.117135), allowing different activation functions across material interfaces. This ensures accurate approximation of discontinuous solution fields and gradients. Performance and accuracy of this combined architecture were evaluated against the conventional PINNs approach, a single neural network (SNN) architecture, and the eXtended PINNs (XPINNs) framework through two one-dimensional benchmark examples with discontinuous material properties. The results show that the proposed CoNN with I-PINNs architecture achieves an RMSE that is two orders of magnitude better than the conventional PINNs approach and is at least 40 times faster than the SNN framework. Compared to XPINNs, the proposed method achieves an RMSE at least one order of magnitude better and is 40% faster., Comment: 34 pages, 12 figures, 3 tables
- Published
- 2024
47. Comet C/2012 S1 (ISON) crossing the Jupiter orbit
- Author
-
Safonova, Margarita, Brosch, Noah, Kaspi, Shai, Polishook, David, Rich, R. Michael, Sutaria, Firoza, and Murthy, Jayant
- Subjects
Astrophysics - Earth and Planetary Astrophysics ,Astrophysics - Solar and Stellar Astrophysics - Abstract
We report results of intensive time-resolved imaging photometry and synoptic deep imaging of the comet C/2012 S1 (ISON) performed in February 2013. The data were obtained at the Wise Observatory in Israel (WO), at the Himalayan Chandra Telescope (HCT) in India, and at the Polaris Observatory Association in California, USA. During this period, the comet's heliocentric distance changed from 4.9 to 4.6 AU, just within the orbit of Jupiter. We analyze these early images in an attempt to determine the nuclear rotation period, assuming that at these relatively large heliocentric distances it would be possible to detect the photometric modulation of a rotating nucleus against an underdeveloped coma. Since this is not evident in our February 2013 data, with more than 400 independent photometric measurements analyzed, we can only set upper limits of 0.05 mag for periodic brightness modulations. We discuss (and discount) a possible brightening event (minor outburst) that occurred on 15-16 February 2013. We also present deep synoptic images of the comet, obtained by combining our exposures for each night, and analyze them. We find that during the period of our observations the comet exhibited a tail of ∼30″ (≃ 60000 km) with no substructures visible and that this appearance did not change throughout our campaign. The comet, as indicated by a single spectroscopic measurement obtained during this observation period, showed a dust coma reflecting the solar light. Our observations indicate that during February 2013, comet ISON was relatively quiet, with the dust coma presumably hiding any light modulation by a spinning nucleus., Comment: 15 pages, 10 figures
- Published
- 2024
48. Sparse Actuator Scheduling for Discrete-Time Linear Dynamical Systems
- Author
-
Kondapi, Krishna Praveen V. S., Sriram, Chandrasekhar, Joseph, Geethu, and Murthy, Chandra R.
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
We consider the control of discrete-time linear dynamical systems using sparse inputs where we limit the number of active actuators at every time step. We develop an algorithm for determining a sparse actuator schedule that ensures the existence of a sparse control input sequence, following the schedule, that takes the system from any given initial state to any desired final state. Since such an actuator schedule is not unique, we look for a schedule that minimizes the energy of sparse inputs. For this, we optimize the trace of the inverse of the resulting controllability Gramian, which is an approximate measure of the average energy of the inputs. We present a greedy algorithm along with its theoretical guarantees. Finally, we empirically show that our greedy algorithm ensures the controllability of the linear system with a small number of active actuators per time step without a significant average energy expenditure compared to the fully actuated system.
- Published
- 2024
49. APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets
- Author
-
Liu, Zuxin, Hoang, Thai, Zhang, Jianguo, Zhu, Ming, Lan, Tian, Kokane, Shirley, Tan, Juntao, Yao, Weiran, Liu, Zhiwei, Feng, Yihao, Murthy, Rithesh, Yang, Liangwei, Savarese, Silvio, Niebles, Juan Carlos, Wang, Huan, Heinecke, Shelby, and Xiong, Caiming
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning ,Computer Science - Software Engineering - Abstract
The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scalable and structured manner. Each data in our dataset is verified through three hierarchical stages: format checking, actual function executions, and semantic verification, ensuring its reliability and correctness. We demonstrate that models trained with our curated datasets, even with only 7B parameters, can achieve state-of-the-art performance on the Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models. Moreover, our 1B model achieves exceptional performance, surpassing GPT-3.5-Turbo and Claude-3 Haiku. We release a dataset containing 60,000 high-quality entries, aiming to advance the field of function-calling agent domains. The dataset is available on Huggingface: https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k and the project homepage: https://apigen-pipeline.github.io/
- Published
- 2024
50. Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter
- Author
-
Aamir, M., Acar, B., Adamov, G., Adams, T., Adloff, C., Afanasiev, S., Agrawal, C., Ahmad, A., Ahmed, H. A., Akbar, S., Akchurin, N., Akgul, B., Akgun, B., Akpinar, R. O., Aktas, E., AlKadhim, A., Alexakhin, V., Alimena, J., Alison, J., Alpana, A., Alshehri, W., Dominguez, P. Alvarez, Alyari, M., Amendola, C., Amir, R. B., Andersen, S. B., Andreev, Y., Antoszczuk, P. D., Aras, U., Ardila, L., Aspell, P., Avila, M., Awad, I., Aydilek, O., Azimi, Z., Pretel, A. Aznar, Bach, O. A., Bainbridge, R., Bakshi, A., Bam, B., Banerjee, S., Barney, D., Bayraktar, O., Beaudette, F., Beaujean, F., Becheva, E., Behera, P. K., Belloni, A., Bergauer, T., Besancon, M., Bylund, O. Bessidskaia, Bhatt, L., Bhowmil, D., Blekman, F., Blinov, P., Bloch, P., Bodek, A., Boger, a., Bonnemaison, A., Bouyjou, F., Brennan, L., Brondolin, E., Brusamolino, A., Bubanja, I., Perraguin, A. Buchot, Bunin, P., Misura, A. Burazin, Butler-nalin, A., Cakir, A., Callier, S., Campbell, S., Canderan, K., Cankocak, K., Cappati, A., Caregari, S., Carron, S., Carty, C., Cauchois, A., Ceard, L., Cerci, S., Chang, P. J., Chatterjee, R. M., Chatterjee, S., Chattopadhyay, P., Chatzistavrou, T., Chaudhary, M. S., Chauhan, A., Chen, J. A., Chen, J., Chen, Y., Cheng, K., Cheung, H., Chhikara, J., Chiron, A., Chiusi, M., Chokheli, D., Chudasama, R., Clement, E., Mendez, S. Coco, Coko, D., Coskun, K., Couderc, F., Crossman, B., Cui, Z., Cuisset, T., Cummings, G., Curtis, E. M., D'Alfonso, M., D-hler-ball, J., Dadazhanova, O., Damgov, J., Das, I., DasGupta, S., Dauncey, P., Mendes, A. David Tinoco, Davies, G., Davignon, O., DeLa, P. deBarbaroC., DeSilva, M., DeWit, A., Debbins, P., Defranchis, M. M., Delagnes, E., Devouge, P., Dewangan, C., DiGuglielmo, G., Diehl, L., Dilsiz, K., Dincer, G. G., Dittmann, J., Dragicevic, M., Du, D., Dubinchik, B., Dugad, S., Dulucq, F., Dumanoglu, I., Duran, B., Dutta, S., Dutta, V., Dychkant, A., Dünser, M., Edberg, T., Ehle, I. T., Berni, A. El, Elias, F., Eno, S. C., Erdogan, E. 
N., Erkmen, B., Ershov, Y., Ertorer, E. Y., Extier, S., Eychenne, L., Fedar, Y. E., Fedi, G., De Almeida, J. P. Figueiredo De De Sá Sousa, Alves, B. A. Fontana Santos Santos, Frahm, E., Francis, K., Freeman, J., French, T., Gaede, F., Gandhi, P. K., Ganjour, S., Garcia-Bellido, A., Gastaldi, F., Gazi, L., Gecse, Z., Gerwig, H., Gevin, O., Ghosh, S., Gill, K., Gleyzer, S., Godinovic, N., Goek, M., Goettlicher, P., Goff, R., Golunov, A., Gonultas, B., Martínez, J. D. González, Gorbounov, N., Gouskos, L., Gray, A., Gray, L., Grieco, C., Groenroos, S., Groner, D., Gruber, A., Grummer, A., Grönroos, S., Guilloux, F., Guler, Y., Gungordu, A. D., Guo, J., Guo, K., Guler, E. Gurpinar, Gutti, H. K., Guvenli, A. A., Gülmez, E., Hacisahinoglu, B., Halkin, Y., Machado, G. Hamilton Ilha, Hare, H. S., Hatakeyama, K., Heering, A. H., Hegde, V., Heintz, U., Hinton, N., Hinzmann, A., Hirschauer, J., Hitlin, D., Hos, İ., Hou, B., Hou, X., Howard, A., Howe, C., Hsieh, H., Hsu, T., Hua, H., Hummer, F., Imran, M., Incandela, J., Iren, E., Isildak, B., Jackson, P. S., Jackson, W. J., Jain, S., Jana, P., Jaroslavceva, J., Jena, S., Jige, A., Jordano, P. P., Joshi, U., Kaadze, K., Kafizov, A., Kalipoliti, L., Tharayil, A. Kallil, Kaluzinska, O., Kamble, S., Kaminskiy, A., Kanemura, M., Kanso, H., Kao, Y., Kapic, A., Kapsiak, C., Karjavine, V., Karmakar, S., Karneyeu, A., Kaya, M., Topaksu, A. Kayis, Kaynak, B., Kazhykarim, Y., Khan, F. A., Khudiakov, A., Kieseler, J., Kim, R. S., Klijnsma, T., Kloiber, E. G., Klute, M., Kocak, Z., Kodali, K. R., Koetz, K., Kolberg, T., Kolcu, O. B., Komaragiri, J. R., Komm, M., Kopsalis, I., Krause, H. A., Krawczyk, M. A., Vinayakam, T. R. Krishnaswamy, Kristiansen, K., Kristic, A., Krohn, M., Kronheim, B., Krüger, K., Kudtarkar, C., Kulis, S., Kumar, M., Kumar, N., Kumar, S., Verma, R. 
Kumar, Kunori, S., Kunts, A., Kuo, C., Kurenkov, A., Kuryatkov, V., Kyre, S., Ladenson, J., Lamichhane, K., Landsberg, G., Langford, J., Laudrain, A., Laughlin, R., Lawhorn, J., Dortz, O. Le, Lee, S. W., Lektauers, A., Lelas, D., Leon, M., Levchuk, L., Li, A. J., Li, J., Li, Y., Liang, Z., Liao, H., Lin, K., Lin, W., Lin, Z., Lincoln, D., Linssen, L., Litomin, A., Liu, G., Liu, Y., Lobanov, A., Lohezic, V., Loiseau, T., Lu, C., Lu, R., Lu, S. Y., Lukens, P., Mackenzie, M., Magnan, A., Magniette, F., Mahjoub, A., Mahon, D., Majumder, G., Makarenko, V., Malakhov, A., Malgeri, L., Mallios, S., Mandloi, C., Mankel, A., Mannelli, M., Mans, J., Mantilla, C., Martinez, G., Massa, C., Masterson, P., Matthewman, M., Matveev, V., Mayekar, S., Mazlov, I., Mehta, A., Mestvirishvili, A., Miao, Y., Milella, G., Mirza, I. R., Mitra, P., Moccia, S., Mohanty, G. B., Monti, F., Moortgat, F., Murthy, S., Music, J., Musienko, Y., Nabili, S., Nayak, S., Nelson, J. W., Nema, A., Neutelings, I., Niedziela, J., Nikitenko, A., Noonan, D., Noy, M., Nurdan, K., Obraztsov, S., Ochando, C., Ogul, H., Olsson, J., Onel, Y., Ozkorucuklu, S., Paganis, E., Palit, P., Pan, R., Pandey, S., Pantaleo, F., Papageorgakis, C., Paramesvaran, S., Paranjpe, M. M., Parolia, S., Parsons, A. G., Parygin, P., Paulini, M., Paus, C., Peñaló, K., Pedro, K., Pekic, V., Peltola, T., Peng, B., Perego, A., Perini, D., Petrilli, A., Pham, H., Pierre-Emile, T., Podem, S. K., Popov, V., Portales, L., Potok, O., Pradeep, P. B., Pramanik, R., Prosper, H., Prvan, M., Qasim, S. R., Qu, H., Quast, T., Trivino, A. Quiroga, Rabour, L., Raicevic, N., Rajpoot, H., Rao, M. A., Rapacz, K., Redjeb, W., Reinecke, M., Revering, M., Roberts, A., Rohlf, J., Rosado, P., Rose, A., Rothman, S., Rout, P. K., Rovere, M., Rumerio, P., Rusack, R., Rygaard, L., Ryjov, V., Sadivnycha, S., Sahin, M. Ö., Sakarya, U., Salerno, R., Saradhy, R., Saraf, M., Sarbandi, K., Sarkisla, M. 
A., Satyshev, I., Saud, N., Sauvan, J., Schindler, G., Schmidt, A., Schmidt, I., Schmitt, M. H., Sculac, A., Sculac, T., Sedelnikov, A., Seez, C., Sefkow, F., Selivanova, D., Selvaggi, M., Sergeychik, V., Sert, H., Shahid, M., Sharma, P., Sharma, R., Sharma, S., Shelake, M., Shenai, A., Shih, C. W., Shinde, R., Shmygol, D., Shukla, R., Sicking, E., Silva, P., Simsek, C., Simsek, E., Sirasva, B. K., Sirois, Y., Song, S., Song, Y., Soudais, G., Sriram, S., StJacques, R. R., StahlLeiton, A. G., Steen, A., Stein, J., Strait, J., Strobbe, N., Su, X., Sukhov, E., Suleiman, A., Cerci, D. Sunar, Suryadevara, P., Swain, K., Syal, C., Tali, B., Tanay, K., Tang, W., Tanvir, A., Tao, J., Tarabini, A., Tatli, T., Taylor, R., Taysi, Z. C., Teafoe, G., Tee, C. Z., Terrill, W., Thienpont, D., Thomas, R., Titov, M., Todd, C., Todd, E., Toms, M., Tosun, A., Troska, J., Tsai, L., Tsamalaidze, Z., Tsionou, D., Tsipolitis, G., Tsirigoti, M., Tu, R., Polat, S. N. Tural, Undleeb, S., Usai, E., Uslan, E., Ustinov, V., Vernazza, E., Viahin, O., Viazlo, O., Vichoudis, P., Vijay, A., Virdee, T., Voirin, E., Vojinovic, M., Voytishin, N., Vámi, T. Á., Wade, A., Walter, D., Wang, C., Wang, F., Wang, J., Wang, K., Wang, X., Wang, Y., Wang, Z., Wanlin, E., Wayne, M., Wetzel, J., Whitbeck, A., Wickwire, R., Wilmot, D., Wilson, J., Wu, H., Xiao, M., Yang, J., Yazici, B., Ye, Y., Yetkin, T., Yi, R., Yohay, R., Yu, T., Yuan, C., Yuan, X., Yuksel, O., YushmanoV, I., Yusuff, I., Zabi, A., Zareckis, D., Zarubin, A., Zehetner, P., Zghiche, A., Zhang, C., Zhang, D., Zhang, H., Zhang, J., Zhang, Z., Zhao, X., Zhong, J., Zhou, Y., and Zorbilmez, Ç.
- Subjects
Physics - Instrumentation and Detectors ,High Energy Physics - Experiment ,Physics - Data Analysis, Statistics and Probability - Abstract
A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles read out by SiPMs, and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadronic section. The shower reconstruction method is based on graph neural networks and makes use of a dynamic reduction network architecture. It is shown that the algorithm is able to capture and mitigate the main effects that normally hinder the reconstruction of hadronic showers using classical reconstruction methods, by compensating for fluctuations in the multiplicity, energy, and spatial distributions of the shower's constituents. The performance of the algorithm is evaluated using test beam data collected in 2018 with a prototype of the CMS HGCAL accompanied by a section of the CALICE AHCAL prototype. The capability of the method to mitigate the impact of energy leakage from the calorimeter is also demonstrated., Comment: Prepared for submission to JINST
- Published
- 2024
Discovery Service for Jio Institute Digital Library