Author: "Raghunathan, A." - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Raghunathan, A."' showing total 21,252 results

1. Scaling Laws for Precision

Author: Kumar, Tanishq, Ankner, Zachary, Spector, Benjamin F., Bordelon, Blake, Muennighoff, Niklas, Paul, Mansheej, Pehlevan, Cengiz, Ré, Christopher, and Raghunathan, Aditi
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Low precision training and inference affect both the quality and cost of language models, but current scaling laws do not account for this. In this work, we devise "precision-aware" scaling laws for both training and inference. We propose that training in lower precision reduces the model's "effective parameter count," allowing us to predict the additional loss incurred from training in low precision and post-train quantization. For inference, we find that the degradation introduced by post-training quantization increases as models are trained on more data, eventually making additional pretraining data actively harmful. For training, our scaling laws allow us to predict the loss of a model with different parts in different precisions, and suggest that training larger models in lower precision may be compute optimal. We unify the scaling laws for post and pretraining quantization to arrive at a single functional form that predicts degradation from training and inference in varied precisions. We fit on over 465 pretraining runs and validate our predictions on model sizes up to 1.7B parameters trained on up to 26B tokens.
Published: 2024

2. Detection of Thermal Emission at Millimeter Wavelengths from Low-Earth Orbit Satellites

Author: Foster, A., Chokshi, A., Anderson, A. J., Ansarinejad, B., Archipley, M., Balkenhol, L., Benabed, K., Bender, A. N., Barron, D. R., Benson, B. A., Bianchini, F., Bleem, L. E., Bouchet, F. R., Bryant, L., Camphuis, E., Carlstrom, J. E., Chang, C. L., Chaubal, P., Chichura, P. M., Chou, T. -L., Coerver, A., Crawford, T. M., Daley, C., de Haan, T., Dibert, K. R., Dobbs, M. A., Doussot, A., Dutcher, D., Everett, W., Feng, C., Ferguson, K. R., Fichman, K., Galli, S., Gambrel, A. E., Gardner, R. W., Ge, F., Goeckner-Wald, N., Gualtieri, R., Guidi, F., Guns, S., Halverson, N. W., Hivon, E., Holder, G. P., Holzapfel, W. L., Hood, J. C., Hryciuk, A., Huang, N., Kéruzoré, F., Khalife, A. R., Knox, L., Korman, M., Kornoelje, K., Kuo, C. -L., Levy, K., Lowitz, A. E., Lu, C., Maniyar, A., Martsen, E. S., Menanteau, F., Millea, M., Montgomery, J., Nakato, Y., Natoli, T., Noble, G. I., Omori, Y., Pan, Z., Paschos, P., Phadke, K. A., Pollak, A. W., Prabhu, K., Quan, W., Raghunathan, S., Rahimi, M., Rahlin, A., Reichardt, C. L., Rouble, M., Ruhl, J. E., Schiappucci, E., Sobrin, J. A., Stark, A. A., Stephen, J., Tandoi, C., Thorne, B., Trendafilova, C., Umilta, C., Vieira, J. D., Vitrier, A., Wan, Y., Whitehorn, N., Wu, W. L. K., Young, M. R., and Zebrowski, J. A.
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - Cosmology and Nongalactic Astrophysics
Abstract: The detection of satellite thermal emission at millimeter wavelengths is presented using data from the 3rd-Generation receiver on the South Pole Telescope (SPT-3G). This represents the first reported detection of thermal emission from artificial satellites at millimeter wavelengths. Satellite thermal emission is shown to be detectable at high signal-to-noise on timescales as short as a few tens of milliseconds. An algorithm for downloading orbital information and tracking known satellites given observer constraints and time-ordered observatory pointing is described. Consequences for cosmological surveys and short-duration transient searches are discussed, revealing that the integrated thermal emission from all large satellites does not contribute significantly to the SPT-3G survey intensity map. Measured satellite positions are found to be discrepant from their two-line element (TLE) derived ephemerides up to several arcminutes which may present a difficulty in cross-checking or masking satellites from short-duration transient searches.
Published: 2024

3. Probabilistic Parallels in the Classical Limit of Quantum Mechanical Models

Author: Ramakrishnan, Raghunathan
Subjects: Quantum Physics
Abstract: At large quantum numbers, the probability densities for particle-in-a-box or simple harmonic oscillator converge to the classical result upon coarse-graining the quantum mechanical probability densities by introducing a finite resolution in the measurement of the particle's position. This resolution in the position can be related to the resolution of the secondary total angular momentum quantum number ($m$) when interpreting the probabilistic outcomes of the Stern--Gerlach-type thought experiments for large values of the angular momentum quantum numbers ($j$)., Comment: first draft
Published: 2024
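
The coarse-graining argument in the abstract above can be checked numerically for the particle-in-a-box case. The following is a minimal sketch, not the author's code: it assumes a box of length 1, uses the textbook density (2/L) sin^2(n pi x / L), and averages it over a finite position window. The function name, the window width, and the sample count are illustrative choices.

```python
import math

def coarse_grained_density(n, x, width, box_length=1.0, samples=2000):
    """Average |psi_n|^2 for a particle in a box over a window centred at x.

    The window plays the role of a finite position resolution (an assumption
    made for illustration); it is truncated at the walls of the box.
    """
    lo = max(0.0, x - width / 2)
    hi = min(box_length, x + width / 2)
    step = (hi - lo) / samples
    total = 0.0
    for i in range(samples):
        xi = lo + (i + 0.5) * step  # midpoint rule
        total += (2.0 / box_length) * math.sin(n * math.pi * xi / box_length) ** 2
    return total * step / (hi - lo)

# The classical density is uniform: 1 / box_length.
for n in (1, 10, 200):
    print(n, coarse_grained_density(n, x=0.3, width=0.1))
```

For small n the averaged value still depends on where the window sits, whereas for large n the rapid oscillations average out and the result approaches the uniform classical value 1/L.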

4. Coach Reservation for Groups Requests

Author: Cardonha, Carlos H. and Raghunathan, Arvind U.
Subjects: Mathematics - Optimization and Control, Computer Science - Data Structures and Algorithms, 90B06
Abstract: Passenger transportation is a core aspect of a railway company's business, with ticket sales playing a central role in generating revenue. Profitable operations in this context rely heavily on the effectiveness of reject-or-assign policies for coach reservations. As in traditional revenue management, uncertainty in demand presents a significant challenge, particularly when seat availability is limited and passengers have varying itineraries. We extend traditional models from the literature by addressing both offline and online versions of the coach reservation problem for group requests, where two or more passengers must be seated in the same coach. For the offline case, in which all requests are known in advance, we propose an exact mathematical programming formulation that incorporates a first-come, first-served fairness condition, ensuring compliance with transportation regulations. We also propose algorithms for online models of the problem, in which requests are only revealed upon arrival, and the reject-or-assign decisions must be made in real-time. Our analysis for one of these models overcomes known barriers in the packing literature, yielding strong competitive ratio guarantees when group sizes are relatively small compared to coach capacity - a common scenario in practice. Using data from the Shinkansen Tokyo-Shin-Osaka line, our numerical experiments demonstrate the practical effectiveness of the proposed policies. Our work provides compelling evidence supporting the adoption of fairness constraints, as revenue losses are minimal, and simple algorithms are sufficient for real-time decision-making. Moreover, our findings provide strong support for the adoption of fairness in the railway industry and highlight the financial viability of a regulatory framework that allows railway companies to delay coach assignments if they adhere to stricter rules regarding request rejections., Comment: 24 pages, 5 figures
Published: 2024
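
To make the online reject-or-assign setting in the abstract above concrete, here is a deliberately simple first-fit sketch. It is not the policy proposed by the authors: it ignores itineraries, revenue, and the fairness condition, and only shows the basic constraint that a group must fit entirely in one coach. The function name and the example capacities are hypothetical.

```python
def assign_groups_first_fit(coach_capacities, group_sizes):
    """Process group requests in arrival order.

    Each group goes to the first coach with enough free seats; if no coach
    can hold the whole group, the request is rejected (recorded as None).
    """
    remaining = list(coach_capacities)
    decisions = []
    for size in group_sizes:
        assigned = None
        for coach, free in enumerate(remaining):
            if free >= size:
                remaining[coach] -= size
                assigned = coach
                break
        decisions.append(assigned)
    return decisions

# Two coaches with 4 seats each; groups arrive one at a time.
print(assign_groups_first_fit([4, 4], [3, 3, 2]))
```

In this example the third group is rejected even though two seats remain in total, because they are split across coaches. That fragmentation is the packing difficulty that group requests introduce.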

5. Power side-channel leakage localization through adversarial training of deep neural networks

Author: Gammell, Jimmy, Raghunathan, Anand, and Roy, Kaushik
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Supervised deep learning has emerged as an effective tool for carrying out power side-channel attacks on cryptographic implementations. While increasingly-powerful deep learning-based attacks are regularly published, comparatively-little work has gone into using deep learning to defend against these attacks. In this work we propose a technique for identifying which timesteps in a power trace are responsible for leaking a cryptographic key, through an adversarial game between a deep learning-based side-channel attacker which seeks to classify a sensitive variable from the power traces recorded during encryption, and a trainable noise generator which seeks to thwart this attack by introducing a minimal amount of noise into the power traces. We demonstrate on synthetic datasets that our method can outperform existing techniques in the presence of common countermeasures such as Boolean masking and trace desynchronization. Results on real datasets are weak because the technique is highly sensitive to hyperparameters and early-stop point, and we lack a holdout dataset with ground truth knowledge of leaking points for model selection. Nonetheless, we believe our work represents an important first step towards deep side-channel leakage localization without relying on strong assumptions about the implementation or the nature of its leakage. An open-source PyTorch implementation of our experiments is provided.
Published: 2024

6. Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

Author: Goyal, Sachin, Baek, Christina, Kolter, J. Zico, and Raghunathan, Aditi
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: A standard practice when using large language models is for users to supplement their instruction with an input context containing new information for the model to process. However, models struggle to reliably follow the input context, especially when it conflicts with their parametric knowledge from pretraining. In principle, one would expect models to adapt to the user context better after instruction finetuning, particularly when handling knowledge conflicts. However, we observe a surprising failure mode: during instruction tuning, the context reliance under knowledge conflicts initially increases as expected, but then gradually decreases as instruction finetuning progresses. This happens while the performance on standard benchmarks keeps on increasing far after this drop. We call this phenomenon context-parametric inversion and observe it across multiple general purpose instruction tuning datasets such as TULU, Alpaca and Ultrachat, across different model families like Llama, Mistral, and Pythia. We perform various controlled studies and theoretical analysis to show that context-parametric inversion occurs due to examples in the instruction finetuning data where the input context provides information that aligns with the model's parametric knowledge. Our analysis suggests some natural mitigation strategies with limited but insightful gains, and serves as a useful starting point in addressing this deficiency in instruction finetuning., Comment: Under Review
Published: 2024

7. KnowGraph: Knowledge-Enabled Anomaly Detection via Logical Reasoning on Graph Data

Author: Zhou, Andy, Xu, Xiaojun, Raghunathan, Ramesh, Lal, Alok, Guan, Xinze, Yu, Bin, and Li, Bo
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Graph-based anomaly detection is pivotal in diverse security applications, such as fraud detection in transaction networks and intrusion detection for network traffic. Standard approaches, including Graph Neural Networks (GNNs), often struggle to generalize across shifting data distributions. Meanwhile, real-world domain knowledge is more stable and a common existing component of real-world detection strategies. To explicitly integrate such knowledge into data-driven models such as GCNs, we propose KnowGraph, which integrates domain knowledge with data-driven learning for enhanced graph-based anomaly detection. KnowGraph comprises two principal components: (1) a statistical learning component that utilizes a main model for the overarching detection task, augmented by multiple specialized knowledge models that predict domain-specific semantic entities; (2) a reasoning component that employs probabilistic graphical models to execute logical inferences based on model outputs, encoding domain knowledge through weighted first-order logic formulas. Extensive experiments on these large-scale real-world datasets show that KnowGraph consistently outperforms state-of-the-art baselines in both transductive and inductive settings, achieving substantial gains in average precision when generalizing to completely unseen test graphs. Further ablation studies demonstrate the effectiveness of the proposed reasoning component in improving detection performance, especially under extreme class imbalance. These results highlight the potential of integrating domain knowledge into data-driven models for high-stakes, graph-based security applications., Comment: Accepted to ACM CCS 2024
Published: 2024

8. Constraining cosmological parameters using the pairwise kinematic Sunyaev-Zel'dovich effect with CMB-S4 and future galaxy cluster surveys

Author: Schiappucci, E., Raghunathan, S., To, C., Bianchini, F., Reichardt, C. L., Battaglia, N., Hadzhiyska, B., Kim, S., Melin, J. B., Sifón, C., and Vavagiakis, E. M.
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics
Abstract: We present a forecast of the pairwise kinematic Sunyaev-Zel'dovich (kSZ) measurement that will be achievable with the future CMB-S4 experiment. CMB-S4 is the next stage for ground-based cosmic microwave background experiments, with a planned wide area survey that will observe approximately $50\%$ of the sky. We construct a simulated sample of galaxy clusters that have been optically selected in an LSST-like survey and have spectroscopic redshifts. For this cluster sample, we predict that CMB-S4 will reject the null hypothesis of zero pairwise kSZ signal at $36 \,\sigma$. We estimate the effects of systematic uncertainties such as scatter in the mass-richness scaling relation and cluster mis-centering. We find that these effects can reduce the signal-to-noise ratio of the CMB-S4 pairwise kSZ measurement by $20\%$. We explore the constraining power of the measured kSZ signal in combination with measurements of the galaxy clusters' thermal SZ emission on two extensions to the standard cosmological model. The first extension allows the dark energy equation of state $w$ to vary. We find the CMB-S4 pairwise kSZ measurement yields a modest reduction in the uncertainty on $w$ by a factor of 1.36 over Planck's 2018 uncertainty. The second extension tests General Relativity by varying the growth index $\gamma$. We find that CMB-S4's pairwise kSZ measurement will yield a $28\sigma$ constraint on $\gamma$, and strongly constrain alternative theories of gravity., Comment: arXiv admin note: text overlap with arXiv:2207.11937
Published: 2024

9. Metamorphosis of transition to periodic oscillations in a turbulent reactive flow system

Author: Thonti, Beeraiah, Sudarsanan, Sivakumar, Bhavi, Ramesh S., Bhaskaran, Anaswara, Raghunathan, Manikandan, and Sujith, R. I.
Subjects: Physics - Fluid Dynamics, Nonlinear Sciences - Adaptation and Self-Organizing Systems
Abstract: The emergence of periodic oscillations is observed in various complex systems in nature and engineering. Thermoacoustic oscillations in systems comprising turbulent reactive flow exemplify such complexity in the engineering context, where the emergence of oscillatory dynamics is often undesirable. In this work, we experimentally study the transition to periodic oscillations within a turbulent reactive flow system, with varying fuel-to-air ratio, represented by equivalence ratio as a bifurcation parameter. Further, we explore the change in the nature of the transition by varying a secondary parameter. In our system, we vary the thermal power input and the position of the flame stabilizer individually as a secondary parameter. Our findings reveal five qualitatively distinct types of transitions to periodic oscillations. Two types of these transitions exhibit a continuous nature. Another two types of transitions involve multiple shifts in the dynamical states consisting of both continuous and discontinuous bifurcations. The last type of transition is characterized by an abrupt bifurcation to high-amplitude periodic oscillations. Understanding this metamorphosis of the transition - from continuous to discontinuous nature - is critical for advancing our comprehension of the dynamic behavior in turbulent reactive flow systems. The insights gained from this study have the potential to inform the design and control of similar engineering systems where managing oscillatory behavior is crucial., Comment: 32 pages, 12 figures
Published: 2024

10. Increased Secret Key Throughput in Twin Field Quantum Key Distribution using 4x4 Beam Splitter Detection Network

Author: Pandey, Ishan and Raghunathan, Varun
Subjects: Physics - Optics, Quantum Physics
Abstract: Twin Field Quantum Key Distribution (TFQKD) has attracted recent interest because its secret key capacity exceeds the fundamental repeaterless limit, extending the achievable distance. The key generation in TFQKD is based on the post selection of randomized phase slices. This paper describes a technique for enhancing the probability of choosing the phase slices by using four detectors at Charlie's end placed after a 4x4 port beam-splitter network. Using theoretical modelling of the secret key rate and simulations using StrawberryFields, we observe an increase in secret key throughput when compared to conventional TFQKD.
Published: 2024

11. MoA is All You Need: Building LLM Research Team using Mixture of Agents

Author: Chen, Sandy, Zeng, Leqi, Raghunathan, Abhinav, Huang, Flora, and Kim, Terrence C.
Subjects: Quantitative Finance - Computational Finance
Abstract: Large Language Models (LLMs) research in the financial domain is particularly complex due to the sheer number of approaches proposed in literature. Retrieval-Augmented Generation (RAG) has emerged as one of the leading methods in the sector due to its inherent groundedness and data source variability. In this work, we introduce a RAG framework called Mixture of Agents (MoA) and demonstrate its viability as a practical, customizable, and highly effective approach for scaling RAG applications. MoA is essentially a layered network of individually customized small language models (Hoffmann et al., 2022) collaborating to answer questions and extract information. While there are many theoretical propositions for such an architecture and even a few libraries for generally applying the structure in practice, there are limited documented studies evaluating the potential of this framework considering real business constraints such as cost and speed. We find that the MoA framework, consisting of small language models (Hoffmann et al., 2022), produces higher quality and more grounded responses across various financial domains that are core to Vanguard's business while simultaneously maintaining low costs.
Published: 2024

12. SiTe CiM: Signed Ternary Computing-in-Memory for Ultra-Low Precision Deep Neural Networks

Author: Thakuria, Niharika, Malhotra, Akul, Thirumala, Sandeep K., Elangovan, Reena, Raghunathan, Anand, and Gupta, Sumeet K.
Subjects: Computer Science - Hardware Architecture
Abstract: Ternary Deep Neural Networks (DNN) have shown a large potential for highly energy-constrained systems by virtue of their low power operation (due to ultra-low precision) with only a mild degradation in accuracy. To enable an energy-efficient hardware substrate for such systems, we propose a compute-enabled memory design, referred to as SiTe-CiM, which features computing-in-memory (CiM) of dot products between signed ternary (SiTe) inputs and weights. SiTe CiM is based on cross-coupling of two bit cells to enable CiM of dot products in the signed ternary regime. We explore SiTe CiM with 8T-SRAM, 3T-embedded DRAM (3T-eDRAM) and 3T-ferroelectric metal FET (FEMFET) memories. We propose two flavors of this technique, namely SiTe CiM I/II. In SiTe CiM I, we employ two additional transistors per cell for cross-coupling, achieving fast CiM operations, albeit incurring an area overhead ranging from 18% to 34% (compared to standard ternary memories). In SiTe CiM II, four extra transistors are utilized for every 16 cells in a column, thereby incurring only 6% area cost (but leading to slower CiM than SiTe CiM I). Based on the array analysis, our designs achieve up to 88% lower CiM latency and 78% CiM energy savings across various technologies considered, as compared to their respective near-memory computing counterparts. Further, we perform system level analysis by incorporating SiTe CiM I/II arrays in a ternary DNN accelerator and show up to 7X throughput boost and up to 2.5X energy reduction compared to the near-memory ternary DNN accelerators.
Published: 2024
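
The operation that the abstract above moves into memory is a dot product between signed ternary vectors. As a software reference only, and not a model of the SiTe CiM circuits, the sketch below computes that dot product by counting matching and opposing nonzero pairs. The function name and error handling are illustrative assumptions.

```python
def ternary_dot(inputs, weights):
    """Dot product of two vectors whose entries are restricted to -1, 0, +1."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    allowed = {-1, 0, 1}
    if not set(inputs) <= allowed or not set(weights) <= allowed:
        raise ValueError("entries must be -1, 0, or +1")
    # Each product is itself ternary, so the sum reduces to two counts.
    positives = sum(1 for x, w in zip(inputs, weights) if x * w == 1)
    negatives = sum(1 for x, w in zip(inputs, weights) if x * w == -1)
    return positives - negatives

print(ternary_dot([1, -1, 0, 1], [1, 1, -1, -1]))
```

Because every product is -1, 0, or +1, no multiplier is needed: the result is a difference of two counts, which is the property that makes ternary networks attractive for low-power hardware.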

13. Constructing Tight Quadratic Relaxations for Global Optimization: II. Underestimating Difference-of-Convex (D.C.) Functions

Author: Strahl, William R., Raghunathan, Arvind U., Sahinidis, Nikolaos V., and Gounaris, Chrysanthos E.
Subjects: Mathematics - Optimization and Control
Abstract: Recent advances in the efficiency and robustness of algorithms solving convex quadratically constrained quadratic programming (QCQP) problems motivate developing techniques for creating convex quadratic relaxations that, although more expensive to compute, provide tighter bounds than their classical linear counterparts. In the first part of this two-paper series [Strahl et al., 2024], we developed a cutting plane algorithm to construct convex quadratic underestimators for twice-differentiable convex functions, which we extend here to address the case of non-convex difference-of-convex (d.c.) functions as well. Furthermore, we generalize our approach to consider a hierarchy of quadratic forms, thereby allowing the construction of even tighter underestimators. On a set of d.c. functions extracted from benchmark libraries, we demonstrate noteworthy reduction in the hypervolume between our quadratic underestimators and linear ones constructed at the same points. Additionally, we construct convex QCQP relaxations at the root node of a spatial branch-and-bound tree for a set of systematically created d.c. optimization problems in up to four dimensions, and we show that our relaxations reduce the gap between the lower bound computed by the state-of-the-art global optimization solver BARON and the optimal solution by an excess of 90%, on average.
Published: 2024

14. Constructing Tight Quadratic Relaxations for Global Optimization: I. Outer-Approximating Twice-Differentiable Convex Functions

Author: Strahl, William R., Raghunathan, Arvind U., Sahinidis, Nikolaos V., and Gounaris, Chrysanthos E.
Subjects: Mathematics - Optimization and Control
Abstract: When computing bounds, spatial branch-and-bound algorithms often linearly outer approximate convex relaxations for non-convex expressions in order to capitalize on the efficiency and robustness of linear programming solvers. Considering that linear outer approximations sacrifice accuracy when approximating highly nonlinear functions and recognizing the recent advancements in the efficiency and robustness of available methods to solve optimization problems with quadratic objectives and constraints, we contemplate here the construction of quadratic outer approximations of twice-differentiable convex functions for use in deterministic global optimization. To this end, we present a novel cutting-plane algorithm that determines the tightest scaling parameter, $\alpha$, in the second-order Taylor series approximation quadratic underestimator proposed by Su et al. We use a representative set of convex functions extracted from optimization benchmark libraries to showcase--qualitatively and quantitatively--the tightness of the constructed quadratic underestimators and to demonstrate the overall computational efficiency of our algorithm. Furthermore, we extend our construction procedure to generate even tighter quadratic underestimators by allowing overestimation in infeasible polyhedral regions of optimization problems, as informed by the latter's linear constraints.
Published: 2024

15. Investigating Mode Effects in Interviewer Variances Using Two Representative Multi-mode Surveys

Author: Yu, Wenshan, Elliott, Michael R., and Raghunathan, Trivellore E.
Subjects: Statistics - Applications
Abstract: This study examines whether interviewer variances remain consistent across different modes in mixed-mode studies, using data from two distinct designs. In the first design, when interviewers are responsible for either face-to-face or telephone mode, we examine whether there are mode differences in interviewer variances for 1) sensitive political questions, 2) international items, and 3) item missing indicators on international items, using the Arab Barometer wave 6 Jordan data. In the second design, we draw on Health and Retirement Study (HRS) 2016 core survey data to examine the question on three topics when interviewers are responsible for both modes. The topics cover 1) the CESD depression scale, 2) interviewer observations, and 3) the physical activity scale. To account for the lack of interpenetrated designs in both data sources, we include respondent-level covariates in our models. We find significant differences in interviewer variances on one of twelve items in the Arab Barometer study, and on three of eighteen items in HRS. Overall, we find the magnitude of the interviewer variances larger in FTF than TEL on sensitive items. We conduct simulations to understand the power to detect mode effects in the typically modest interviewer sample sizes.
Published: 2024

16. Cross-variable amplitude-frequency coupling during intermittency in a turbulent thermoacoustic system

Author: Tandon, Shruti, Balaji, Aswin, Radhakrishnan, Rohit, Raghunathan, Manikandan, Chopra, Gaurav, and Sujith, R. I.
Subjects: Physics - Fluid Dynamics
Abstract: We investigate flame-acoustic interactions in a turbulent combustor during the state of intermittency before the onset of thermoacoustic instability using complex networks. Experiments are performed in a turbulent bluff-body stabilized dump combustor where the inlet airflow rate is varied quasi-statically and continuously. We construct a natural visibility graph from the local heat release rate fluctuations at each location. Comparing the average degree during epochs of high and low amplitude acoustic pressure oscillations during the state of intermittency, we detect frequency modulation in local heat release rate signals. Through this approach, we discover unique spatial patterns of cross-variable coupling between the frequency of heat release rate fluctuations and the amplitude of acoustic pressure fluctuations. The frequency of heat release rate fluctuations increases in regions of flame anchoring owing to high-frequency excitation of the flow and flame during epochs of high-amplitude acoustic pressure dynamics. On the other hand, the frequency of heat release rate fluctuations decreases in regions associated with flame front distortions by large coherent vortices. In experiments with continuously varying airflow rates, the spatial pattern of frequency modulation varies with an increase in the average amplitude of acoustic pressure fluctuations owing to an increase in the epochs of periodic acoustic pressure dynamics and the size of vortices forming in the flow. Dynamic shifts in the location of flame anchoring induce low-frequency fluctuations in heat release rate fluctuations during very high-amplitude intermittent acoustic pressure dynamics. Our approach using conditional natural visibility graphs thus reveals the spatial pattern of amplitude-frequency coupling between the co-evolving flame and the acoustic field dynamics in turbulent reacting flows.
Published: 2024
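
The abstract above builds a natural visibility graph from a time series and compares average degrees. The sketch below implements the standard natural visibility criterion for a uniformly sampled series: two samples are linked when every sample between them lies strictly below the straight line joining them. It is not the authors' conditional variant, and the function names are illustrative.

```python
def natural_visibility_degrees(series):
    """Degree of each node in the natural visibility graph of a series.

    Samples are assumed to be taken at times 0, 1, 2, ... (uniform spacing).
    """
    n = len(series)
    degrees = [0] * n
    for a in range(n):
        for b in range(a + 1, n):
            # a and b are linked if all intermediate samples lie strictly
            # below the line segment from (a, series[a]) to (b, series[b]).
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                degrees[a] += 1
                degrees[b] += 1
    return degrees

def average_degree(series):
    """Mean node degree of the natural visibility graph."""
    degrees = natural_visibility_degrees(series)
    return sum(degrees) / len(degrees)

print(natural_visibility_degrees([1, 3, 2]))
print(average_degree([3, 1, 3]))
```

This brute-force version checks every pair and every intermediate sample, so it is cubic in the series length in the worst case. That is acceptable for short windows but not for long records.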

17. Simultaneous Trajectory Optimization and Contact Selection for Contact-rich Manipulation with High-Fidelity Geometry

Author: Zhang, Mengchao, Jha, Devesh K., Raghunathan, Arvind U., and Hauser, Kris
Subjects: Computer Science - Robotics
Abstract: Contact-implicit trajectory optimization (CITO) is an effective method to plan complex trajectories for various contact-rich systems including manipulation and locomotion. CITO formulates a mathematical program with complementarity constraints (MPCC) that enforces that contact forces must be zero when points are not in contact. However, MPCC solve times increase steeply with the number of allowable points of contact, which limits CITO's applicability to problems in which only a few, simple geometries are allowed to make contact. This paper introduces simultaneous trajectory optimization and contact selection (STOCS), as an extension of CITO that overcomes this limitation. The innovation of STOCS is to identify salient contact points and times inside the iterative trajectory optimization process. This effectively reduces the number of variables and constraints in each MPCC invocation. The STOCS framework, instantiated with key contact identification subroutines, renders the optimization of manipulation trajectories computationally tractable even for high-fidelity geometries consisting of tens of thousands of vertices., Comment: arXiv admin note: text overlap with arXiv:2306.06465
Published: 2024

18. Influence of Pseudo-Jahn-Teller Activity on the Singlet-Triplet Gap of Azaphenalenes

Author: Majumdar, Atreyee, Jindal, Komal, Das, Surajit, and Ramakrishnan, Raghunathan
Subjects: Physics - Chemical Physics
Abstract: We analyze the possibility of symmetry-lowering induced by pseudo-Jahn--Teller interactions in six previously studied azaphenalenes that are known to have their first excited singlet state (S$_1$) lower in energy than the triplet state (T$_1$). The primary aim of this study is to explore whether Hund's rule violation is observed in these molecules when their structures are distorted from $C_{\rm 2v}$ or $D_{\rm 3h}$ point group symmetries by vibronic coupling. Along two interatomic distances connecting these point groups to their subgroups $C_{\rm s}$ or $C_{\rm 3h}$, we relaxed the other internal degrees of freedom and calculated two-dimensional potential energy subsurfaces. The many-body perturbation theory (MP2) suggests that the high-symmetry structures are the energy minima for all six systems. However, single-point energy calculations using the coupled-cluster method (CCSD(T)) indicate symmetry lowering in four cases. The singlet-triplet energy gap plotted on the potential energy surface also shows variations when deviating from high-symmetry structures. A full geometry optimization at the CCSD(T) level with the cc-pVTZ basis set reveals that the $D_{\rm 3h}$ structure of cyclazine (1AP) is a saddle point, connecting two equivalent minima of $C_{\rm 3h}$ symmetry undergoing rapid automerization. The combined effects of symmetry lowering and high-level corrections result in a nearly zero singlet-triplet gap for the $C_{\rm 3h}$ structure of cyclazine. Azaphenalenes containing nitrogen atoms at electron-deficient sites -- 2AP, 3AP, and 4AP -- exhibit more pronounced in-plane structural distortion; the effect is captured by the long-range exchange-interaction corrected DFT method, $\omega$B97XD. Excited state calculations of these systems indicate that in their low-symmetry energy minima, T$_1$ is indeed lower in energy than S$_1$, upholding the validity of Hund's rule., Comment: second version
Published: 2024

19. Strong nebular emissions associated with MgII absorptions detected in the SDSS spectra of background quasars

Author: Guha, Labanya Kumar and Srianand, Raghunathan
Subjects: Astrophysics - Astrophysics of Galaxies
Abstract: We present long-slit spectroscopic observations of 40 Galaxy On Top Of Quasars (GOTOQs) at ${0.37 \leqslant z \leqslant 1.01}$ using the South African Large Telescope. Using this and available photometric data, we measure the impact parameters of the foreground galaxies to be in the range of 3$-$16 kpc with a median value of 8.6 kpc. This is the largest sample of galaxies producing MgII absorption at such low impact parameters. These quasar-galaxy pairs are ideal for probing the disk-halo interface. At such impact parameters, we do not find any significant anti-correlation between rest equivalent width (REW) of CaII, MnII, FeII, MgII, and MgI absorptions and impact parameters. These sight lines are typically redder than those of strong MgII absorbers, with the color excess, E(B$-$V) for our sample ranging from $-$0.191 to 0.422, with a median value of 0.058. In the E(B$-$V) vs. W$_{3935}$ plane, GOTOQs occupy the same region as CaII absorbers. For a given E(B$-$V), we find larger W$_{3935}$ than what has been found in the Milky Way, probably due to a smaller dust-to-gas ratio in GOTOQs. Galaxy parameters could be measured for twelve cases, and their properties seem to follow the trends found for strong MgII absorbers. Measuring the host galaxy properties for the full sample using HST photometry or AO-assisted ground-based imaging is important to gain insights into the relationship between the stellar mass of galaxies and the metal line REW distributions at low impact parameters., Comment: Accepted for publication in MNRAS. 9 figures and 14 pages
Published: 2024

20. Understanding Finetuning for Factual Knowledge Extraction

Author: Ghosal, Gaurav, Hashimoto, Tatsunori, and Raghunathan, Aditi
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: In this work, we study the impact of QA fine-tuning data on downstream factuality. We show that fine-tuning on lesser-known facts that are poorly stored during pretraining yields significantly worse factuality than fine-tuning on well-known facts, even when all facts are seen during pretraining. We prove this phenomenon theoretically, showing that training on lesser-known facts can lead the model to ignore subject entity names and instead output a generic plausible response even when the relevant factual knowledge is encoded in the model. On three question answering benchmarks (PopQA, Entity Questions, and MMLU) and two language models (Llama-2-7B and Mistral-7B), we find that (i) finetuning on a completely factual but lesser-known subset of the data deteriorates downstream factuality (5-10%) and (ii) finetuning on a subset of better-known examples matches or outperforms finetuning on the entire dataset. Ultimately, our results shed light on the interaction between pretrained knowledge and finetuning data and demonstrate the importance of taking into account how facts are stored in the pretrained model when fine-tuning for knowledge-intensive tasks., Comment: To appear in ICML 2024
Published: 2024

21. Adversarial Attacks on Multimodal Agents

Author: Wu, Chen Henry, Koh, Jing Yu, Salakhutdinov, Ruslan, Fried, Daniel, and Raghunathan, Aditi
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language, Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision-enabled language models (VLMs) are now used to build autonomous multimodal agents capable of taking actions in real environments. In this paper, we show that multimodal agents raise new safety risks, even though attacking agents is more challenging than prior attacks due to limited access to and knowledge about the environment. Our attacks use adversarial text strings to guide gradient-based perturbation over one trigger image in the environment: (1) our captioner attack attacks white-box captioners if they are used to process images into captions as additional inputs to the VLM; (2) our CLIP attack attacks a set of CLIP models jointly, which can transfer to proprietary VLMs. To evaluate the attacks, we curated VisualWebArena-Adv, a set of adversarial tasks based on VisualWebArena, an environment for web-based multimodal agent tasks. Within an L-infinity norm of $16/256$ on a single image, the captioner attack can make a captioner-augmented GPT-4V agent execute the adversarial goals with a 75% success rate. When we remove the captioner or use GPT-4V to generate its own captions, the CLIP attack can achieve success rates of 21% and 43%, respectively. Experiments on agents based on other VLMs, such as Gemini-1.5, Claude-3, and GPT-4o, show interesting differences in their robustness. Further analysis reveals several key factors contributing to the attack's success, and we also discuss the implications for defenses as well. Project page: https://chenwu.io/attack-agent Code and data: https://github.com/ChenWu98/agent-attack, Comment: 19 pages
Published: 2024

22. Canard explosions in turbulent thermo-fluid systems

Author: Bhavi, Ramesh S., Sudarsanan, Sivakumar, Raghunathan, Manikandan, Bhaskaran, Anaswara, and Sujith, R. I.
Subjects: Physics - Fluid Dynamics, Nonlinear Sciences - Adaptation and Self-Organizing Systems, Physics - Applied Physics
Abstract: A sudden transition to a state of high amplitude limit cycle oscillations is catastrophic in a thermo-fluid system. Conventionally, upon varying the control parameter, a sudden transition is observed as an abrupt jump in the amplitude of the fluctuations in these systems. In contrast, we present an experimental discovery of a canard explosion in a turbulent reactive flow system where we observe a continuous bifurcation with a rapid rise in the amplitude of the fluctuations within a narrow range of control parameters. The observed transition is facilitated via a state of bursting, consisting of the epochs of large amplitude periodic oscillations amidst the epochs of low amplitude periodic oscillations. The amplitude of the bursts is higher than the amplitude of the bursts of intermittency state in a conventional gradual transition, as reported in turbulent reactive flow systems. During the bursting state, we observe that temperature fluctuations of exhaust gas vary at a slower time scale in correlation with the amplitude envelope of the bursts. We also present a phenomenological model for thermoacoustic systems to describe the observed canard explosion. Using the model, we explain that the large amplitude bursts occur due to the slow-fast dynamics at the bifurcation regime of the canard explosion.
Published: 2024

23. Use of Early Ketamine Sedation and Association With Clinical and Cost Outcomes Among Mechanically Ventilated Patients With COVID-19: A Retrospective Cohort Study.

Author: Royce-Nagel, Galen, Jarzebowski, Mary, Wongsripuemtet, Pattrapun, Krishnamoorthy, Vijay, Fuller, Matthew, Ohnuma, Tetsu, Treggiari, Miriam, Yaport, Miguel, Cobert, Julien, Garrigan, Ethan, Bartz, Raquel, and Raghunathan, Karthik
Subjects: Humans, Ketamine, Respiration, Artificial, Retrospective Studies, Male, Female, COVID-19, Middle Aged, Hospital Mortality, Aged, Length of Stay, Intensive Care Units, Cohort Studies, Hypnotics and Sedatives, SARS-CoV-2, Hospital Costs, Propensity Score
Abstract: OBJECTIVES: To describe the utilization of early ketamine use among patients mechanically ventilated for COVID-19, and examine associations with in-hospital mortality and other clinical outcomes. DESIGN: Retrospective cohort study. SETTING: Six hundred ten hospitals contributing data to the Premier Healthcare Database between April 2020 and June 2021. PATIENTS: Adults with COVID-19 and greater than or equal to 2 consecutive days of mechanical ventilation within 5 days of hospitalization. INTERVENTION: The exposures were early ketamine use initiated within 2 days of intubation and continued for greater than 1 day. MEASUREMENTS: Primary was hospital mortality. Secondary outcomes included length of stay (LOS) in the hospital and ICUs, ventilator days, vasopressor days, renal replacement therapy (RRT), and total hospital cost. The propensity score matching analysis was used to adjust for confounders. MAIN RESULTS: Among 42,954 patients, 1,423 (3.3%) were exposed to early ketamine use. After propensity score matching including 1,390 patients in each group, recipients of ketamine infusions were associated with higher hospital mortality (52.5% vs. 45.9%, risk ratio: 1.14, [1.06-1.23]), longer median ICU stay (13 vs. 12 d, mean ratio [MR]: 1.15 [1.08-1.23]), and longer ventilator days (12 vs. 11 d, MR: 1.19 [1.12-1.27]). There were no associations for hospital LOS (17 [10-27] vs. 17 [9-28], MR: 1.05 [0.99-1.12]), vasopressor days (4 vs. 4, MR: 1.04 [0.95-1.14]), and RRT (22.9% vs. 21.7%, RR: 1.05 [0.92-1.21]). Total hospital cost was higher (median $72,481 vs. $65,584, MR: 1.11 [1.05-1.19]). CONCLUSIONS: In a diverse sample of U.S. hospitals, about one in 30 patients mechanically ventilated with COVID-19 received ketamine infusions. Early ketamine may have an association with higher hospital mortality, increased total cost, ICU stay, and ventilator days, but no associations for hospital LOS, vasopressor days, and RRT. However, confounding by the severity of illness might occur due to higher extracorporeal membrane oxygenation and RRT use in the ketamine group. Further randomized trials are needed to better understand the role of ketamine infusions in the management of critically ill patients.
Published: 2024

24. Measuring Implicit Bias in ICU Notes Using Word-Embedding Neural Network Models.

Author: Cobert, Julien, Mills, Hunter, Lee, Albert, Gologorskaya, Oksana, Espejo, Edie, Jeon, Sun, Boscardin, W, Heintz, Timothy, Kennedy, Christopher, Ashana, Deepshikha, Chapman, Allyson, Raghunathan, Karthik, Smith, Alex, and Lee, Sei
Subjects: critical care, inequity, linguistics, machine learning, natural language processing, Humans, Natural Language Processing, Intensive Care Units, Neural Networks, Computer, Algorithms, Critical Illness, Bias, Electronic Health Records, Male, Female
Abstract: BACKGROUND: Language in nonmedical data sets is known to transmit human-like biases when used in natural language processing (NLP) algorithms that can reinforce disparities. It is unclear if NLP algorithms of medical notes could lead to similar transmissions of biases. RESEARCH QUESTION: Can we identify implicit bias in clinical notes, and are biases stable across time and geography? STUDY DESIGN AND METHODS: To determine whether different racial and ethnic descriptors are similar contextually to stigmatizing language in ICU notes and whether these relationships are stable across time and geography, we identified notes on critically ill adults admitted to the University of California, San Francisco (UCSF), from 2012 through 2022 and to Beth Israel Deaconess Hospital (BIDMC) from 2001 through 2012. Because word meaning is derived largely from context, we trained unsupervised word-embedding algorithms to measure the similarity (cosine similarity) quantitatively of the context between a racial or ethnic descriptor (eg, African-American) and a stigmatizing target word (eg, nonco-operative) or group of words (violence, passivity, noncompliance, nonadherence). RESULTS: In UCSF notes, Black descriptors were less likely to be similar contextually to violent words compared with White descriptors. Contrastingly, in BIDMC notes, Black descriptors were more likely to be similar contextually to violent words compared with White descriptors. The UCSF data set also showed that Black descriptors were more similar contextually to passivity and noncompliance words compared with Latinx descriptors. INTERPRETATION: Implicit bias is identifiable in ICU notes. Racial and ethnic group descriptors carry different contextual relationships to stigmatizing words, depending on when and where notes were written. Because NLP models seem able to transmit implicit bias from training data, use of NLP algorithms in clinical prediction could reinforce disparities. Active debiasing strategies may be necessary to achieve algorithmic fairness when using language models in clinical research.
Published: 2024

25. Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning

Author: Springer, Jacob Mitchell, Nagarajan, Vaishnavh, and Raghunathan, Aditi
Subjects: Computer Science - Machine Learning
Abstract: Sharpness-Aware Minimization (SAM) has emerged as a promising alternative optimizer to stochastic gradient descent (SGD). The originally-proposed motivation behind SAM was to bias neural networks towards flatter minima that are believed to generalize better. However, recent studies have shown conflicting evidence on the relationship between flatness and generalization, suggesting that flatness does not fully explain SAM's success. Sidestepping this debate, we identify an orthogonal effect of SAM that is beneficial out-of-distribution: we argue that SAM implicitly balances the quality of diverse features. SAM achieves this effect by adaptively suppressing well-learned features, which gives the remaining features an opportunity to be learned. We show that this mechanism is beneficial in datasets that contain redundant or spurious features where SGD falls for the simplicity bias and would not otherwise learn all available features. Our insights are supported by experiments on real data: we demonstrate that SAM improves the quality of features in datasets containing redundant or spurious features, including CelebA, Waterbirds, CIFAR-MNIST, and DomainBed., Comment: 25 pages, 10 figures, 2 tables
Published: 2024

26. Chemical Space-Informed Machine Learning Models for Rapid Predictions of X-ray Photoelectron Spectra of Organic Molecules

Author: Tripathy, Susmita, Das, Surajit, Jindal, Shweta, and Ramakrishnan, Raghunathan
Subjects: Physics - Chemical Physics
Abstract: We present machine learning models based on kernel-ridge regression for predicting X-ray photoelectron spectra of organic molecules originating from the $K$-shell ionization energies of carbon (C), nitrogen (N), oxygen (O), and fluorine (F) atoms. We constructed the training dataset through high-throughput calculations of $K$-shell core-electron binding energies (CEBEs) for 12,880 small organic molecules in the bigQM7$\omega$ dataset, employing the $\Delta$-SCF formalism coupled with meta-GGA-DFT and a variationally converged basis set. The models are cost-effective, as they require the atomic coordinates of a molecule generated using universal force fields while estimating the target-level CEBEs corresponding to DFT-level equilibrium geometry. We explore transfer learning by utilizing the atomic environment feature vectors learned using a graph neural network framework in kernel-ridge regression. Additionally, we enhance accuracy within the $\Delta$-machine learning framework by leveraging inexpensive baseline spectra derived from Kohn--Sham eigenvalues. When applied to 208 combinatorially substituted uracil molecules larger than those in the training set, our analyses suggest that the models may not provide quantitatively accurate predictions of CEBEs but offer a strong linear correlation relevant for virtual high-throughput screening. We present the dataset and models as the Python module, ${\tt cebeconf}$, to facilitate further explorations., Comment: Major Revision, New Figures and Tables added in the SI, Figures 1 and 4 revised
Published: 2024

27. Emergent inhomogeneity and non-locality in a graphene field-effect transistor on a near-parallel moire superlattice of transition metal dichalcogenides

Author: Sett, Shaili, Debnath, Rahul, Singha, Arup, Mandal, Shinjan, K, Jyothsna, Bhakar, Monika, Watanabe, Kenji, Taniguchi, Takashi, Raghunathan, Varun, Sheet, Goutam, Jain, Manish, and Ghosh, Arindam
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: At near-parallel orientation, twisted bilayers of transition metal dichalcogenides exhibit inter-layer charge transfer-driven out-of-plane ferroelectricity that may lead to unique electronic device architectures. Here we report detailed electrical transport in a dual-gated graphene field-effect transistor placed on a 3R stacked twisted bilayer of WSe2 at a twist angle of 2.1 degrees. We observe hysteretic transfer characteristics and an emergent charge inhomogeneity with multiple local Dirac points as the electric displacement field (D) is increased. Concomitantly, we also observe a strong non-local voltage signal at D = 0 V/nm that decreases rapidly with increasing D. A linear scaling of the non-local signal with longitudinal resistance suggests edge mode transport, which we attribute to the breaking of valley symmetry of the graphene channel due to the spatially fluctuating electric field from the moire domains of the underlying twisted WSe2. A quantitative analysis connecting the non-locality and channel inhomogeneity suggests the emergence of finite-size domains in the graphene channel that modulate the charge and the valley currents simultaneously. This work underlines efficient control and impact of interfacial ferroelectricity that can trigger a new genre of devices for twistronic applications., Comment: 16 pages, 15 figures
Published: 2024

28. Why is SAM Robust to Label Noise?

Author: Baek, Christina, Kolter, Zico, and Raghunathan, Aditi
Subjects: Computer Science - Machine Learning
Abstract: Sharpness-Aware Minimization (SAM) is best known for achieving state-of-the-art performance on natural image and language tasks. However, its most pronounced improvements (of tens of percent) are instead seen in the presence of label noise. Understanding SAM's label noise robustness requires a departure from characterizing the robustness of minima lying in "flatter" regions of the loss landscape. In particular, the peak performance under label noise occurs with early stopping, far before the loss converges. We decompose SAM's robustness into two effects: one induced by changes to the logit term and the other induced by changes to the network Jacobian. The first can be observed in linear logistic regression where SAM provably up-weights the gradient contribution from clean examples. Although this explicit up-weighting is also observable in neural networks, when we intervene and modify SAM to remove this effect, surprisingly, we see no visible degradation in performance. We infer that SAM's effect in deeper networks is instead explained entirely by the effect SAM has on the network Jacobian. We theoretically derive the implicit regularization induced by this Jacobian effect in two layer linear networks. Motivated by our analysis, we see that cheaper alternatives to SAM that explicitly induce these regularization effects largely recover the benefits in deep networks trained on real-world datasets.
Published: 2024

29. Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

Author: Goyal, Sachin, Maini, Pratyush, Lipton, Zachary C., Raghunathan, Aditi, and Kolter, J. Zico
Subjects: Computer Science - Machine Learning
Abstract: Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets. In recent times, data curation has gained prominence with several works developing strategies to retain 'high-quality' subsets of 'raw' scraped data. For instance, the LAION public dataset retained only 10% of the total crawled data. However, these strategies are typically developed agnostic of the available compute for training. In this paper, we first demonstrate that making filtering decisions independent of training compute is often suboptimal: the limited high-quality data rapidly loses its utility when repeated, eventually requiring the inclusion of 'unseen' but 'lower-quality' data. To address this quality-quantity tradeoff ($\texttt{QQT}$), we introduce neural scaling laws that account for the non-homogeneous nature of web data, an angle ignored in existing literature. Our scaling laws (i) characterize the $\textit{differing}$ 'utility' of various quality subsets of web data; (ii) account for how utility diminishes for a data point at its 'nth' repetition; and (iii) formulate the mutual interaction of various data pools when combined, enabling the estimation of model performance on a combination of multiple data pools without ever jointly training on them. Our key message is that data curation $\textit{cannot}$ be agnostic of the total compute that a model will be trained for. Our scaling laws allow us to curate the best possible pool for achieving top performance on Datacomp at various compute budgets, carving out a pareto-frontier for data curation. Code is available at https://github.com/locuslab/scaling_laws_data_filtering., Comment: Published at CVPR 2024
Published: 2024

30. Mass calibration of DES Year-3 clusters via SPT-3G CMB cluster lensing

Author: Ansarinejad, B., Raghunathan, S., Abbott, T. M. C., Ade, P. A. R., Aguena, M., Alves, O., Anderson, A. J., Andrade-Oliveira, F., Archipley, M., Balkenhol, L., Benabed, K., Bender, A. N., Benson, B. A., Bertin, E., Bianchini, F., Bleem, L. E., Bocquet, S., Bouchet, F. R., Brooks, D., Bryant, L., Burke, D. L., Camphuis, E., Carlstrom, J. E., Rosell, A. Carnero, Carretero, J., Castander, F. J., Cecil, T. W., Chang, C. L., Chaubal, P., Chichura, P. M., Chou, T. -L., Coerver, A., Costanzi, M., Crawford, T. M., Cukierman, A., da Costa, L. N., Daley, C., Davis, T. M., de Haan, T., Desai, S., De Vicente, J., Dibert, K. R., Dobbs, M. A., Doel, P., Doussot, A., Doux, C., Dutcher, D., Everett, W., Feng, C., Ferguson, K. R., Ferrero, I., Fichman, K., Foster, A., Frieman, J., Galli, S., Gambrel, A. E., García-Bellido, J., Gardner, R. W., Gaztanaga, E., Ge, F., Giannini, G., Goeckner-Wald, N., Grandis, S., Gruendl, R. A., Gualtieri, R., Guidi, F., Guns, S., Gutierrez, G., Halverson, N. W., Hinton, S. R., Hivon, E., Holder, G. P., Hollowood, D. L., Holzapfel, W. L., Honscheid, K., Hood, J. C., Huang, N., James, D. J., Kéruzoré, F., Knox, L., Korman, M., Kuo, C. -L., Lee, A. T., Lee, S., Levy, K., Lowitz, A. E., Lu, C., Maniyar, A., Marshall, J. L., Mena-Fernández, J., Menanteau, F., Miquel, R., Millea, M., Mohr, J. J., Montgomery, J., Nakato, Y., Natoli, T., Noble, G. I., Novosad, V., Ogando, R. L. C., Omori, Y., Padin, S., Palmese, A., Pan, Z., Paschos, P., Pereira, M. E. S., Pieres, A., Malagón, A. A. Plazas, Prabhu, K., Quan, W., Rahlin, A., Rahimi, M., Reichardt, C. L., Reil, K., Romer, A. K., Rouble, M., Ruhl, J. E., Sanchez, E., Cid, D. Sanchez, Schiappucci, E., Sevilla-Noarbe, I., Smecher, G., Smith, M., Sobrin, J. A., Stark, A. A., Stephen, J., Suchyta, E., Suzuki, A., Swanson, M. E. C., Tandoi, C., Tarle, G., Thompson, K. L., Thorne, B., Trendafilova, C., Tucker, C., Umilta, C., Vieira, J. D., Wang, G., Weaverdyck, N., Whitehorn, N., Wiseman, P., Wu, W. L. K., Yefremenko, V., Young, M. R., and Zebrowski, J. A.
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics
Abstract: We measure the stacked lensing signal in the direction of galaxy clusters in the Dark Energy Survey Year 3 (DES Y3) redMaPPer sample, using cosmic microwave background (CMB) temperature data from SPT-3G, the third-generation CMB camera on the South Pole Telescope (SPT). We estimate the lensing signal using temperature maps constructed from the initial 2 years of data from the SPT-3G 'Main' survey, covering 1500 deg$^2$ of the Southern sky. We then use this signal as a proxy for the mean cluster mass of the DES sample. In this work, we employ three versions of the redMaPPer catalogue: a Flux-Limited sample containing 8865 clusters, a Volume-Limited sample with 5391 clusters, and a Volume&Redshift-Limited sample with 4450 clusters. For the three samples, we find the mean cluster masses to be ${M}_{200{\rm{m}}}=1.66\pm0.13$ [stat.]$\pm0.03$ [sys.], $1.97\pm0.18$ [stat.]$\pm0.05$ [sys.], and $2.11\pm0.20$ [stat.]$\pm0.05$ [sys.]$\times{10}^{14}\ {\rm{M}}_{\odot }$, respectively. This is a factor of $\sim2$ improvement relative to the precision of measurements with previous generations of SPT surveys and the most constraining cluster mass measurements using CMB cluster lensing to date. Overall, we find no significant tensions between our results and masses given by redMaPPer mass-richness scaling relations of previous works, which were calibrated using CMB cluster lensing, optical weak lensing, and velocity dispersion measurements from various combinations of DES, SDSS and Planck data. We then divide our sample into 3 redshift and 3 richness bins, finding no significant tensions with optical weak-lensing calibrated masses in these bins. We forecast a $5.7\%$ constraint on the mean cluster mass of the DES Y3 sample with the complete SPT-3G surveys when using both temperature and polarization data and including an additional $\sim1400$ deg$^2$ of observations from the 'Extended' SPT-3G survey., Comment: 23 pages, 9 figures, accepted for publication in JCAP. Minor changes and corrections have been made relative to v1
Published: 2024

31. Predicting the Performance of Foundation Models via Agreement-on-the-Line

Author: Saxena, Rahul, Kim, Taeyoun, Mehra, Aman, Baek, Christina, Kolter, Zico, and Raghunathan, Aditi
Subjects: Computer Science - Machine Learning
Abstract: Estimating the out-of-distribution performance in regimes where labels are scarce is critical to safely deploy foundation models. Recently, it was shown that ensembles of neural networks exhibit the phenomenon of "agreement-on-the-line", which can be leveraged to reliably predict OOD performance without labels. However, in contrast to classical neural networks that are trained on in-distribution data from scratch for numerous epochs, foundation models undergo minimal finetuning from heavily pretrained weights, which may reduce the ensemble diversity needed to observe agreement-on-the-line. In our work, we first demonstrate that when lightly finetuning multiple runs from a single foundation model, the choice of randomness during training (linear head initialization, data ordering, and data subsetting) can lead to drastically different levels of agreement-on-the-line in the resulting ensemble. Surprisingly, only random head initialization is able to reliably induce agreement-on-the-line in finetuned foundation models across vision and language benchmarks. Second, we demonstrate that ensembles of multiple foundation models pretrained on different datasets but finetuned on the same task can also show agreement-on-the-line. In total, by careful construction of a diverse ensemble, we can utilize agreement-on-the-line-based methods to predict the OOD performance of foundation models with high precision.
Published: 2024

32. Testing the $\mathbf{\Lambda}$CDM Cosmological Model with Forthcoming Measurements of the Cosmic Microwave Background with SPT-3G

Author: Prabhu, K., Raghunathan, S., Millea, M., Lynch, G., Ade, P. A. R., Anderes, E., Anderson, A. J., Ansarinejad, B., Archipley, M., Balkenhol, L., Benabed, K., Bender, A. N., Benson, B. A., Bianchini, F., Bleem, L. E., Bouchet, F. R., Bryant, L., Camphuis, E., Carlstrom, J. E., Cecil, T. W., Chang, C. L., Chaubal, P., Chichura, P. M., Chou, T. -L., Coerver, A., Crawford, T. M., Cukierman, A., Daley, C., de Haan, T., Dibert, K. R., Dobbs, M. A., Doussot, A., Dutcher, D., Everett, W., Feng, C., Ferguson, K. R., Fichman, K., Foster, A., Galli, S., Gambrel, A. E., Gardner, R. W., Ge, F., Goeckner-Wald, N., Gualtieri, R., Guidi, F., Guns, S., Halverson, N. W., Hivon, E., Holder, G. P., Holzapfel, W. L., Hood, J. C., Hryciuk, A., Huang, N., Kéruzoré, F., Knox, L., Korman, M., Kornoelje, K., Kuo, C. -L., Lee, A. T., Levy, K., Lowitz, A. E., Lu, C., Maniyar, A., Menanteau, F., Montgomery, J., Nakato, Y., Natoli, T., Noble, G. I., Novosad, V., Omori, Y., Padin, S., Pan, Z., Paschos, P., Phadke, K. A., Quan, W., Rahimi, M., Rahlin, A., Reichardt, C. L., Rouble, M., Ruhl, J. E., Schiappucci, E., Smecher, G., Sobrin, J. A., Stark, A. A., Stephen, J., Suzuki, A., Tandoi, C., Thompson, K. L., Thorne, B., Trendafilova, C., Tucker, C., Umilta, C., Vitrier, A., Vieira, J. D., Wan, Y., Wang, G., Whitehorn, N., Wu, W. L. K., Yefremenko, V., Young, M. R., and Zebrowski, J. A.
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics
Abstract: We forecast constraints on cosmological parameters enabled by three surveys conducted with SPT-3G, the third-generation camera on the South Pole Telescope. The surveys cover separate regions of 1500, 2650, and 6000 ${\rm deg}^{2}$ to different depths, in total observing 25% of the sky. These regions will be measured to white noise levels of roughly 2.5, 9, and 12 $\mu{\rm K-arcmin}$, respectively, in CMB temperature units at 150 GHz by the end of 2024. The survey also includes measurements at 95 and 220 GHz, which have noise levels a factor of ~1.2 and 3.5 times higher than 150 GHz, respectively, with each band having a polarization noise level ~$\sqrt{\text{2}}$ times higher than the temperature noise. We use a novel approach to obtain the covariance matrices for jointly and optimally estimated gravitational lensing potential bandpowers and unlensed CMB temperature and polarization bandpowers. We demonstrate the ability to test the $\Lambda{\rm CDM}$ model via the consistency of cosmological parameters constrained independently from SPT-3G and Planck data, and consider the improvement in constraints on $\Lambda{\rm CDM}$ extension parameters from a joint analysis of SPT-3G and Planck data. The $\Lambda{\rm CDM}$ cosmological parameters are typically constrained with uncertainties up to ~2 times smaller with SPT-3G data, compared to Planck, with the two data sets measuring significantly different angular scales and polarization levels, providing additional tests of the standard cosmological model., Comment: 26 pages; 13 figures; Accepted for publication in ApJ; Minor edits have been made
Published: 2024

33. Ev-Edge: Efficient Execution of Event-based Vision Algorithms on Commodity Edge Platforms

Author: Sridharan, Shrihari, Selvam, Surya, Roy, Kaushik, and Raghunathan, Anand
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Robotics
Abstract: Event cameras have emerged as a promising sensing modality for autonomous navigation systems, owing to their high temporal resolution, high dynamic range and negligible motion blur. To process the asynchronous temporal event streams from such sensors, recent research has shown that a mix of Artificial Neural Networks (ANNs), Spiking Neural Networks (SNNs) as well as hybrid SNN-ANN algorithms are necessary to achieve high accuracies across a range of perception tasks. However, we observe that executing such workloads on commodity edge platforms, which feature heterogeneous processing elements such as CPUs, GPUs and neural accelerators, results in inferior performance. This is due to the mismatch between the irregular nature of event streams and diverse characteristics of algorithms on the one hand and the underlying hardware platform on the other. We propose Ev-Edge, a framework that contains three key optimizations to boost the performance of event-based vision systems on edge platforms: (1) an Event2Sparse Frame converter directly transforms raw event streams into sparse frames, enabling the use of sparse libraries with minimal encoding overheads; (2) a Dynamic Sparse Frame Aggregator merges sparse frames at runtime by trading off the temporal granularity of events and computational demand, thereby improving hardware utilization; (3) a Network Mapper maps concurrently executing tasks to different processing elements while also selecting layer precision by considering both compute and communication overheads. On several state-of-the-art networks for a range of autonomous navigation tasks, Ev-Edge achieves 1.28x-2.05x improvements in latency and 1.23x-2.15x in energy over an all-GPU implementation on the NVIDIA Jetson Xavier AGX platform for single-task execution scenarios. Ev-Edge also achieves 1.43x-1.81x latency improvements over round-robin scheduling methods in multi-task execution scenarios.
Published: 2024

34. Testing the Limits of Jailbreaking Defenses with the Purple Problem

Author: Kim, Taeyoun, Kotha, Suhas, and Raghunathan, Aditi
Subjects: Computer Science - Cryptography and Security, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: The rise of "jailbreak" attacks on language models has led to a flurry of defenses aimed at preventing undesirable responses. We critically examine the two stages of the defense pipeline: (i) defining what constitutes unsafe outputs, and (ii) enforcing the definition via methods such as input processing or fine-tuning. To test the efficacy of existing enforcement mechanisms, we consider a simple and well-specified definition of unsafe outputs--outputs that contain the word "purple". Surprisingly, existing fine-tuning and input defenses fail on this simple problem, casting doubt on whether enforcement algorithms can be robust for more complicated definitions. We find that real safety benchmarks similarly test enforcement for a fixed definition. We hope that future research can lead to effective/fast enforcement as well as high quality definitions used for enforcement and evaluation.
Published: 2024

35. Quotients of $L$-functions: degrees $n$ and $n-2$

Author: Raghunathan, Ravi
Subjects: Mathematics - Number Theory, 11F66, 11M41, 11F70
Abstract: If $L(s,\pi)$ and $L(s,\rho)$ are the Dirichlet series attached to cuspidal automorphic representations $\pi$ and $\rho$ of ${\rm GL}_n({\mathbb A}_{\mathbb Q})$ and ${\rm GL}_{n-2}({\mathbb A}_{\mathbb Q})$ respectively, we show that $F_2(s)=L(s,\pi)/L(s,\rho)$ has infinitely many poles. We also establish analogous results for Artin $L$-functions and other $L$-functions not yet proven to be automorphic. Using the classification theorems of \cite{Ragh20} and \cite{BaRa20}, we show that cuspidal $L$-functions of ${\rm GL}_3({\mathbb A}_{\mathbb Q})$ are primitive in ${\mathfrak G}$, a monoid that contains both the Selberg class ${\mathcal{S}}$ and $L(s,\sigma)$ for all unitary cuspidal automorphic representations $\sigma$ of ${\rm GL}_n({\mathbb A}_{\mathbb Q})$., Comment: 38
Published: 2024

36. Non-homogeneous anisotropic bulk viscosity for acoustic wave attenuation in weakly compressible methods

Author: Raghunathan, Dheeraj and Sudhakar, Y.
Subjects: Physics - Fluid Dynamics, Physics - Computational Physics
Abstract: A major limitation of the weakly compressible approaches to simulate incompressible flows is the appearance of artificial acoustic waves that introduce a large mass conservation error and lead to spurious oscillations in the force coefficients. In this work, we propose a non-homogeneous anisotropic bulk viscosity term to effectively damp the acoustic waves. By implementing this term in a computational framework based on the recently proposed general pressure equation, we demonstrate that the non-homogeneous and anisotropic nature of the term makes it significantly more effective than the isotropic homogeneous version widely used in the literature. Moreover, it is computationally more efficient than the pressure (or mass) diffusion term, which is an alternative mechanism used to suppress acoustic waves. We simulate a range of benchmark problems to comprehensively investigate the performance of the bulk viscosity on the effective suppression of acoustic waves, mass conservation error, order of convergence of the solver, and computational efficiency. The proposed form of the bulk viscosity enables fairly accurate modelling of the initial transients of unsteady simulations, which is highly challenging for weakly compressible approaches, and to the best of our knowledge, existing approaches can't provide an accurate prediction of such transients.
Published: 2024

37. Super-resolution on network telemetry time series

Author: Gong, Fengchen, Raghunathan, Divya, Gupta, Aarti, and Apostolaki, Maria
Subjects: Computer Science - Networking and Internet Architecture
Abstract: Fine-grained monitoring is crucial for multiple data-driven tasks such as debugging, provisioning, and securing networks. Yet, practical constraints in collecting, extracting, and storing data often force operators to use coarse-grained sampled monitoring, degrading the performance of the various tasks. In this work, we explore the feasibility of leveraging the correlations among coarse-grained time series to impute their fine-grained counterparts in software. We present Zoom2Net, a transformer-based model for network imputation that incorporates domain knowledge through operational and measurement constraints, ensuring that the imputed network telemetry time series are not only realistic but also align with existing measurements and are plausible. This approach enhances the capabilities of current monitoring infrastructures, allowing operators to gain more insights into system behaviors without the need for hardware upgrades. We evaluate Zoom2Net on four diverse datasets (e.g. cloud telemetry and Internet data transfer) and use cases (such as bursts analysis and traffic classification). We demonstrate that Zoom2Net consistently achieves high imputation accuracy with a zoom-in factor of up to 100 and performs better on downstream tasks compared to baselines by an average of 38%.
Published: 2024

38. First Constraints on the Epoch of Reionization Using the non-Gaussianity of the Kinematic Sunyaev-Zel'dovich Effect from the South Pole Telescope and Herschel-SPIRE Observations

Author: Raghunathan, S., Ade, P. A. R., Anderson, A. J., Ansarinejad, B., Archipley, M., Austermann, J. E., Balkenhol, L., Beall, J. A., Benabed, K., Bender, A. N., Benson, B. A., Bianchini, F., Bleem, L. E., Bock, J., Bouchet, F. R., Bryant, L., Camphuis, E., Carlstrom, J. E., Cecil, T. W., Chang, C. L., Chaubal, P., Chiang, H. C., Chichura, P. M., Chou, T. -L., Citron, R., Coerver, A., Crawford, T. M., Crites, A. T., Cukierman, A., Daley, C., Dibert, K. R., Dobbs, M. A., Doussot, A., Dutcher, D., Everett, W., Feng, C., Ferguson, K. R., Fichman, K., Foster, A., Galli, S., Gallicchio, J., Gambrel, A. E., Gardner, R. W., Ge, F., George, E. M., Goeckner-Wald, N., Gualtieri, R., Guidi, F., Guns, S., Gupta, N., de Haan, T., Halverson, N. W., Hivon, E., Holder, G. P., Holzapfel, W. L., Hood, J. C., Hrubes, J. D., Hryciuk, A., Huang, N., Hubmayr, J., Irwin, K. D., Kéruzoré, F., Khalife, A. R., Knox, L., Korman, M., Kornoelje, K., Kuo, C. -L., Lee, A. T., Levy, K., Li, D., Lowitz, A. E., Lu, C., Maniyar, A., Martsen, E. S., McMahon, J. J., Menanteau, F., Millea, M., Montgomery, J., Moran, C. Corbett, Nakato, Y., Natoli, T., Nibarger, J. P., Noble, G. I., Novosad, V., Omori, Y., Padin, S., Pan, Z., Paschos, P., Patil, S., Phadke, K. A., Prabhu, K., Pryke, C., Quan, W., Rahimi, M., Rahlin, A., Reichardt, C. L., Rouble, M., Ruhl, J. E., Saliwanchik, B. R., Schaffer, K. K., Schiappucci, E., Sievers, C., Smecher, G., Sobrin, J. A., Stark, A. A., Stephen, J., Suzuki, A., Tandoi, C., Thompson, K. L., Thorne, B., Trendafilova, C., Tucker, C., Umilta, C., Veach, T., Vieira, J. D., Viero, M. P., Wan, Y., Wang, G., Whitehorn, N., Wu, W. L. K., Yefremenko, V., Young, M. R., Zebrowski, J. A., and Zemcov, M.
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics
Abstract: We report results from an analysis aimed at detecting the trispectrum of the kinematic Sunyaev-Zel'dovich (kSZ) effect by combining data from the South Pole Telescope (SPT) and Herschel-SPIRE experiments over a 100 ${\rm deg}^{2}$ field. The SPT observations combine data from the previous and current surveys, namely SPTpol and SPT-3G, to achieve depths of 4.5, 3, and 16 $\mu {\rm K-arcmin}$ in bands centered at 95, 150, and 220 GHz. For SPIRE, we include data from the 600 and 857 GHz bands. We reconstruct the velocity-induced large-scale correlation of the small-scale kSZ signal with a quadratic estimator that uses two cosmic microwave background (CMB) temperature maps, constructed by optimally combining data from all the frequency bands. We reject the null hypothesis of a zero trispectrum at the $10.3\sigma$ level. However, the measured trispectrum contains contributions from both the kSZ and other undesired components, such as CMB lensing and astrophysical foregrounds, with kSZ being sub-dominant. We use the Agora simulations to estimate the expected signal from CMB lensing and astrophysical foregrounds. After accounting for the contributions from CMB lensing and foreground signals, we do not detect an excess kSZ-only trispectrum and use this non-detection to set constraints on reionization. By applying a prior based on observations of the Gunn-Peterson trough, we obtain an upper limit on the duration of reionization of $\Delta z_{\rm re, 50} < 4.5$ (95% C.L.). We find these constraints are fairly robust to foreground assumptions. This trispectrum measurement is independent of, but consistent with, Planck's optical depth measurement. This result is the first constraint on the epoch of reionization using the non-Gaussian nature of the kSZ signal. Comment: 15 pages, 5 figures (3 in main text and 2 in Appendix); Accepted for publication in PRL; Some text has been moved to Appendix; Minor change in Fig. 2 to include normalization; Data products and plotting scripts can be downloaded from https://github.com/sriniraghunathan/kSZ_4pt_SPT_SPIRE
Published: 2024

39. Atacama Large Aperture Submillimeter Telescope (AtLAST) Science: Resolving the Hot and Ionized Universe through the Sunyaev-Zeldovich effect

Author: Di Mascolo, Luca, Perrott, Yvette, Mroczkowski, Tony, Andreon, Stefano, Ettori, Stefano, Simionescu, Aurora, Raghunathan, Srinivasan, van Marrewijk, Joshiwa, Cicone, Claudia, Lee, Minju, Nelson, Dylan, Sommovigo, Laura, Booth, Mark, Klaassen, Pamela, Andreani, Paola, Cordiner, Martin A., Johnstone, Doug, van Kampen, Eelco, Liu, Daizhong, Maccarone, Thomas J., Morris, Thomas W., Saintonge, Amélie, Smith, Matthew, Thelen, Alexander E., and Wedemeyer, Sven
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics, Astrophysics - Astrophysics of Galaxies, Astrophysics - Instrumentation and Methods for Astrophysics
Abstract: An omnipresent feature of the multi-phase "cosmic web" is that warm/hot (>$10^5$ K) ionized gas pervades it. This gas constitutes a relevant contribution to the overall universal matter budget across multiple scales, from the several tens of Mpc-scale IGM filaments, to the Mpc ICM, all the way down to the CGM surrounding individual galaxies, on scales from ~1 kpc up to their respective virial radii (~100 kpc). The study of the hot baryonic component of cosmic matter density represents a powerful means for constraining the intertwined evolution of galactic populations and large-scale cosmological structures, for tracing the matter assembly in the Universe and its thermal history. To this end, the SZ effect provides the ideal observational tool for measurements out to the beginnings of structure formation. The SZ effect is caused by the scattering of the photons from the cosmic microwave background off the hot electrons embedded within cosmic structures, and provides a redshift-independent perspective on the thermal and kinematic properties of the warm/hot gas. Still, current and future (sub)mm facilities have been providing only a partial view of the SZ Universe due to any combination of: limited angular resolution, spectral coverage, field of view, spatial dynamic range, sensitivity. In this paper, we motivate the development of a wide-field, broad-band, multi-chroic continuum instrument for the Atacama Large Aperture Submillimeter Telescope (AtLAST) by identifying the scientific drivers that will deepen our understanding of the complex thermal evolution of cosmic structures. On the technical side, this will necessarily require efficient multi-wavelength mapping of the SZ signal with an unprecedented spatial dynamic range (from arcsecond to degree scales) and we employ theoretical forecasts to determine the key instrumental constraints for achieving our goals. [abridged] Comment: 29 pages, 17 figures, 1 table. Submitted to Open Research Europe as part of the AtLAST Design Study collection: https://open-research-europe.ec.europa.eu/collections/atlast/about. Comments are welcome
Published: 2024

40. A systematic review and meta-analysis of efficacy of vasopressin as a vasoconstrictive and uterotonic drug in laparoscopic myomectomy

Author: Balachandran, Amrita, Mishra, R. K., Effie, A. Ouma, Raghunathan, Akshay, Mathew, Anoopa, and Archana, S.
Subjects: Laparoscopy -- Analysis, Hemoglobin -- Analysis, Laparoscopic surgery -- Analysis, Blood transfusion -- Analysis, Pituitary hormones -- Analysis, Misoprostol -- Analysis, Octreotide acetate -- Analysis
Abstract: Introduction: Laparoscopic myomectomy is a commonly performed operation with fast recovery and excellent results. However, the haemorrhagic nature of the operation mandates the use of a variety of vasoconstrictive and uterotonic agents, one of which is vasopressin. It is a synthetic antidiuretic hormone analogue which has been in common use as a vasoconstrictive agent in various surgical procedures including laparoscopic myomectomy. Methods: A meta-analysis of randomised controlled trials published from 2013 to 2023 (10 years) comparing the use of vasopressin against another drug, placebo, or different doses of vasopressin was performed. The outcome measures were intraoperative blood loss, need for blood transfusion, and difference in haemoglobin (Hb) and haematocrit (Hct). Results: We identified 176 articles through the study search, amongst which 12 articles were included for the meta-analysis. There was significant heterogeneity in the studies, with moderate risk of bias in eight studies and low risk of bias in four studies. Compared to placebo, vasopressin showed significantly lower odds of needing blood transfusion (odds ratio [OR] 0.28, 95% confidence interval [CI]: 0.13 to 0.61, P = 0.002) and a significantly lower pre-post fall in Hb (OR -3.12, 95% CI: -4.63 to -1.60, P < 0.0001). However, there was no statistically significant difference in intraoperative blood loss (OR -0.56, 95% CI: -2.04 to 0.92, P = 0.46) or pre-post fall in Hct (OR -0.94, 95% CI: -1.96 to 0.07, P > 0.05). Compared to other drugs (epinephrine, misoprostol and octreotide acetate), vasopressin showed no significant superiority in controlling blood loss (P > 0.05). The two doses of vasopressin (dilute vs. concentrated) likewise showed no statistically significant difference in surgical blood loss or need for blood transfusion (P > 0.05). Conclusion: Vasopressin is an efficacious drug to be used for controlling blood loss, decreasing blood transfusion requirement and maintaining Hb and Hct during laparoscopic myomectomy. Keywords: Haemostasis, laparoscopic myomectomy, myomectomy, vasoconstriction, vasopressin. Author(s): Amrita Balachandran (corresponding author) [1]; R. K. Mishra [1]; A. Ouma Effie [2]; Akshay Raghunathan [3]; Anoopa Mathew [4]; S. Archana [5]
Published: 2024

41. Biodegradable products from renewable sources: impact on replacing single-use plastic for protecting the environment

Author: Raghunathan, R., Nelluri, P., Rajendran, D., Pandiselvam, R., Thulasiraman, V., Sahoo, S. K., Pillai, S., Jerome, R. E., and Kothakota, A.
Published: 2024

42. Spintronic neural systems

Author: Roy, Kaushik, Wang, Cheng, Roy, Sourjya, Raghunathan, Anand, Yang, Kezhou, and Sengupta, Abhronil
Published: 2024

43. Dual blockade of IL-10 and PD-1 leads to control of SIV viral rebound following analytical treatment interruption

Author: Pereira Ribeiro, Susan, Strongin, Zachary, Soudeyns, Hugo, ten-Caten, Felipe, Ghneim, Khader, Pacheco Sanchez, Gabriela, Xavier de Medeiros, Giuliana, Del Rio Estrada, Perla Mariana, Pelletier, Adam-Nicolas, Hoang, Timothy, Nguyen, Kevin, Harper, Justin, Jean, Sherrie, Wallace, Chelsea, Balderas, Robert, Lifson, Jeffrey D., Raghunathan, Gopalan, Rimmer, Eric, Pastuskova, Cinthia, Wu, Guoxin, Micci, Luca, Ribeiro, Ruy M., Chan, Chi Ngai, Estes, Jacob D., Silvestri, Guido, Gorman, Daniel M., Howell, Bonnie J., Hazuda, Daria J., Paiardini, Mirko, and Sekaly, Rafick P.
Published: 2024

44. A New Record of Soft Coral, Lobophytum varium Tixier-Durivault, 1970 (Sarcophytidae: Malacalcyonacea) from the Andaman Islands, India

Author: Rajendra, S. and Raghunathan, C.
Published: 2024

45. Influence of alkali-treated and raw Zanthoxylum acanthopodium fibers on the mechanical, water resistance, and morphological behavior of polymeric composites for lightweight applications

Author: Raghunathan, Vijay, Ayyappan, Vinod, Dhilip, Jafrey Daniel James, Sundarrajan, D., Rangappa, Sanjay Mavinkere, and Siengchin, Suchart
Published: 2024

46. Mechanical and flammability properties of ultrasonically processed silane-treated areca-banana fiber-reinforced epoxy composites for lightweight applications

Author: Dhilip, Jafrey Daniel James, Raghunathan, Vijay, Mohan, Ramesh, Ayyappan, Vinod, Rangappa, Sanjay Mavinkere, and Siengchin, Suchart
Published: 2024

47. Repetition Improves Language Model Embeddings

Author: Springer, Jacob Mitchell, Kotha, Suhas, Fried, Daniel, Neubig, Graham, and Raghunathan, Aditi
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Recent approaches to improving the extraction of text embeddings from autoregressive large language models (LLMs) have largely focused on improvements to data, backbone pretrained language models, or improving task-differentiation via instructions. In this work, we address an architectural limitation of autoregressive models: token embeddings cannot contain information from tokens that appear later in the input. To address this limitation, we propose a simple approach, "echo embeddings," in which we repeat the input twice in context and extract embeddings from the second occurrence. We show that echo embeddings of early tokens can encode information about later tokens, allowing us to maximally leverage high-quality LLMs for embeddings. On the MTEB leaderboard, echo embeddings improve over classical embeddings by over 9% zero-shot and by around 0.7% when fine-tuned. Echo embeddings with a Mistral-7B model achieve state-of-the-art compared to prior open source models that do not leverage synthetic fine-tuning data., Comment: 36 pages, 11 figures, 16 tables
Published: 2024
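
Note: the "echo embeddings" idea in the abstract above (repeat the input, read embeddings from the second occurrence) can be illustrated with a toy sketch. The encoder below is a hypothetical stand-in, not the paper's model: each state is simply the mean of the token vectors seen so far, which mimics the causal constraint that a position cannot see later tokens. The function names and the mean-pooling step are assumptions made for illustration only.

```python
from typing import List, Sequence

Vector = List[float]


def toy_causal_encoder(tokens: Sequence[Vector]) -> List[Vector]:
    """Hypothetical stand-in for an autoregressive LM: state i sees only tokens 0..i."""
    running = [0.0] * len(tokens[0])
    states: List[Vector] = []
    for i, tok in enumerate(tokens):
        running = [r + t for r, t in zip(running, tok)]
        states.append([r / (i + 1) for r in running])  # mean of the prefix
    return states


def classical_states(tokens: Sequence[Vector]) -> List[Vector]:
    """Per-token states from a single pass over the input."""
    return toy_causal_encoder(tokens)


def echo_states(tokens: Sequence[Vector]) -> List[Vector]:
    """Per-token states read from the second copy of a repeated input."""
    doubled = list(tokens) + list(tokens)
    return toy_causal_encoder(doubled)[len(tokens):]


def mean_pool(states: Sequence[Vector]) -> Vector:
    """Average per-token states into one embedding (pooling choice is an assumption)."""
    return [sum(col) / len(states) for col in zip(*states)]
```

In this toy setting, two inputs that share a first token but differ afterwards give identical classical states for that first token and different echo states, which is the property the abstract describes.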

48. Resilience of Hund's rule in the Chemical Space of Small Organic Molecules

Author: Majumdar, Atreyee and Ramakrishnan, Raghunathan
Subjects: Physics - Chemical Physics
Abstract: We embark on a quest to identify small molecules in the chemical space that can potentially violate Hund's rule. Utilizing twelve TDDFT approximations and the ADC(2) many-body method, we report the energies of S$_1$ and T$_1$ excited states of 12,880 closed-shell organic molecules within the bigQM7$\omega$ dataset with up to 7 CONF atoms. In this comprehensive dataset, none of the molecules, in their minimum energy geometry, exhibit a negative S$_1$-T$_1$ energy gap at the ADC($2$) level while several molecules display values $<0.1$ eV. The spin-component-scaled double-hybrid method, SCS-PBE-QIDH, demonstrates the best agreement with ADC(2). Yet, at this level, a few molecules with a strained $sp^3$-N center turn out as false-positives with the S$_1$ state lower in energy than T$_1$. We investigate a prototypical cage molecule with an energy gap $<-0.2$ eV, which a closer examination revealed as another false positive. We conclude that in the chemical space of small closed-shell organic molecules, it is possible to identify geometric and electronic structural features giving rise to S$_1$-T$_1$ degeneracy; still, there is no evidence of a negative gap. We share the dataset generated for this study as a module, to facilitate seamless molecular discovery through data mining., Comment: Minor revision. Fig.5 revised, SI Tables reorganized
Published: 2024

49. Self-consistent context aware conformer transducer for speech recognition

Author: Kolokolov, Konstantin, Pekichev, Pavel, and Raghunathan, Karthik
Subjects: Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: We introduce a novel neural network module that adeptly handles recursive data flow in neural network architectures. At its core, this module employs a self-consistent approach where a set of recursive equations is solved iteratively, halting when the difference between two consecutive iterations falls below a defined threshold. Leveraging this mechanism, we construct a new neural network architecture, an extension of the conformer transducer, which enriches automatic speech recognition systems with a stream of contextual information. Our method notably improves the accuracy of recognizing rare words without adversely affecting the word error rate for common vocabulary. We investigate the improvement in accuracy for these uncommon words using our novel model, both independently and in conjunction with shallow fusion with a context language model. Our findings reveal that the combination of both approaches can improve the accuracy of detecting rare words by as much as 4.5 times. Our proposed self-consistent recursive methodology is versatile and adaptable, compatible with many recently developed encoders, and has the potential to drive model improvements in speech recognition and beyond.
Published: 2024
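
Note: the self-consistent scheme described in the abstract above (iterate a set of recursive equations and halt when two consecutive iterates differ by less than a threshold) is, in its generic form, a fixed-point iteration. The sketch below shows only that generic loop. The function name, the max-norm stopping test, and the iteration cap are assumptions for illustration and are not taken from the paper's module.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def solve_self_consistent(
    update: Callable[[Vector], Vector],
    x0: Vector,
    tol: float = 1e-6,
    max_iters: int = 1000,
) -> Tuple[Vector, int]:
    """Iterate x <- update(x) until consecutive iterates differ by less than tol."""
    x = list(x0)
    for step in range(1, max_iters + 1):
        x_new = update(x)
        # Largest absolute change across components (max-norm, an assumed choice).
        diff = max(abs(a - b) for a, b in zip(x_new, x))
        x = x_new
        if diff < tol:
            return x, step
    raise RuntimeError("iteration did not converge within max_iters")


# Usage: x = 0.5 * x + 1 has the fixed point x = 2.
solution, steps = solve_self_consistent(lambda x: [0.5 * x[0] + 1.0], [0.0])
```

The cap on iterations is a guard for updates that are not contractive, in which case the loop would otherwise never meet the threshold.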

50. What Values Do ImageNet-trained Classifiers Enact?

Author: Penman, Will, Babu, Joshua, and Raghunathan, Abhinaya
Subjects: Computer Science - Computers and Society
Abstract: We identify "values" as actions that classifiers take that speak to open questions of significant social concern. Investigating a classifier's values builds on studies of social bias that uncover how classifiers participate in social processes beyond their creators' forethought. In our case, this participation involves what counts as nutritious, what it means to be modest, and more. Unlike AI social bias, however, a classifier's values are not necessarily morally loathsome. Attending to image classifiers' values can facilitate public debate and introspection about the future of society. To substantiate these claims, we report on an extensive examination of both ImageNet training/validation data and ImageNet-trained classifiers with custom testing data. We identify perceptual decision boundaries in 118 categories that address open questions in society, and through quantitative testing of rival datasets we find that ImageNet-trained classifiers enact at least 7 values through their perceptual decisions. To contextualize these results, we develop a conceptual framework that integrates values, social bias, and accuracy, and we describe a rhetorical method for identifying how context affects the values that a classifier enacts. We also discover that classifier performance does not straightforwardly reflect the proportions of subgroups in a training set. Our findings bring a rich sense of the social world to ML researchers that can be applied to other domains beyond computer vision., Comment: Submitted to FAT [FAccT] 2020, 12 pages, 4 figures, 3 appendices
Published: 2024
