1,257,187 results for "Were, Victor"
Search Results
202. Quark Flavor Balancing in Nuclear Collisions
- Author
-
Patley, Yash, Nandi, Basanta, Dash, Sadhana, Gonzalez, Victor, and Pruneau, Claude
- Subjects
High Energy Physics - Phenomenology - Abstract
The notion of the charge balance function, originally designed to study the evolution of charge production in heavy-ion collisions, is extended to quark flavor balancing. The extension is developed using simulations performed with the PYTHIA 8 event generator in the context of pp collisions at $\sqrt{s} = 13.6$ TeV, but it can be trivially applied to any other scenario and implemented in measurements of correlated particle production in pp, pA, and AA collisions at colliders. Correlations of selected flavor-balancing pairs are examined as a function of the produced charged-particle multiplicity. One finds that the amplitude of the correlations increases monotonically with the number of balanced flavors and the actual flavor content of the correlated particles.
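As a minimal, self-contained illustration of the balance-function idea (toy single-event data and a simplified ordered-pair count, not the paper's PYTHIA-based analysis):

```python
# Toy integrated charge balance function for one event (hypothetical data).

def balance(charges):
    """B = 1/2 * [(N_+- - N_++)/N_+ + (N_-+ - N_--)/N_-]
    from a flat list of particle charges (+1 / -1), using ordered pair counts."""
    n_plus = charges.count(+1)
    n_minus = charges.count(-1)
    if n_plus == 0 or n_minus == 0:
        return 0.0
    n_pm = n_plus * n_minus        # ordered (+, -) pairs
    n_pp = n_plus * (n_plus - 1)   # ordered (+, +) pairs
    n_mp = n_minus * n_plus        # ordered (-, +) pairs
    n_mm = n_minus * (n_minus - 1) # ordered (-, -) pairs
    return 0.5 * ((n_pm - n_pp) / n_plus + (n_mp - n_mm) / n_minus)

# A perfectly balanced event: every + has a - partner.
print(balance([+1, -1, +1, -1]))  # 1.0
```

A larger balance value indicates stronger pairwise charge (or, in the extension above, flavor) compensation within the event.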
- Published
- 2024
203. Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting
- Author
-
Boateng, Emmanuel Aboah, Becker, Cassiano O., Asghar, Nabiha, Walia, Kabir, Srinivasan, Ashwin, Nosakhare, Ehi, Dibia, Victor, and Srinivasan, Soundar
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computation and Language - Abstract
Hand-crafting high-quality prompts to optimize the performance of language models is a complicated and labor-intensive process. Furthermore, when migrating to newer, smaller, or weaker models (possibly due to latency or cost gains), prompts need to be updated to re-optimize the task performance. We propose Concept Distillation (CD), an automatic prompt optimization technique for enhancing weaker models on complex tasks. CD involves: (1) collecting mistakes made by weak models with a base prompt (initialization), (2) using a strong model to generate reasons for these mistakes and create rules/concepts for weak models (induction), and (3) filtering these rules based on validation set performance and integrating them into the base prompt (deduction/verification). We evaluated CD on NL2Code and mathematical reasoning tasks, observing significant performance boosts for small and weaker language models. Notably, Mistral-7B's accuracy on Multi-Arith increased by 20%, and Phi-3-mini-3.8B's accuracy on HumanEval rose by 34%. Compared to other automated methods, CD offers an effective, cost-efficient strategy for improving weak models' performance on complex tasks and enables seamless workload migration across different language models without compromising performance., Comment: 13 pages, 8 figures, conference
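The three CD phases can be sketched with stand-in models; `weak_model` and `strong_model_explain` below are hypothetical toys, not the paper's LLMs or prompts:

```python
# Hypothetical sketch of the Concept Distillation loop (initialization,
# induction, deduction/verification). The "models" are toy functions.

def weak_model(prompt, x):
    # toy weak model: answers correctly only if the rule is in the prompt
    return x * 2 if "double the input" in prompt else x

def strong_model_explain(mistakes):
    # toy strong model: turns observed mistakes into a candidate rule
    return ["double the input"] if mistakes else []

def concept_distillation(base_prompt, train, val):
    # (1) initialization: collect weak-model mistakes on the base prompt
    mistakes = [(x, y) for x, y in train if weak_model(base_prompt, x) != y]
    # (2) induction: ask the strong model for rules explaining the mistakes
    rules = strong_model_explain(mistakes)
    # (3) deduction/verification: keep a rule only if it helps on validation
    prompt = base_prompt
    for rule in rules:
        candidate = prompt + " Rule: " + rule + "."
        score = sum(weak_model(candidate, x) == y for x, y in val)
        base = sum(weak_model(prompt, x) == y for x, y in val)
        if score > base:
            prompt = candidate
    return prompt

train = [(1, 2), (3, 6)]
val = [(2, 4), (5, 10)]
print(concept_distillation("Answer the question.", train, val))
```

The verification step mirrors the paper's filtering: a distilled concept survives only if it improves validation performance of the weak model.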
- Published
- 2024
204. On holomorphic $\mathbb{C}^*$-actions
- Author
-
León, Víctor and Scárdua, Bruno
- Subjects
Mathematics - Complex Variables - Abstract
In this paper we study holomorphic actions of the complex multiplicative group on complex manifolds around a singular (fixed) point. We prove linearization results for the germ of action and also for the whole action under some conditions on the manifold. This can be seen as a follow-up to previous works of M. Suzuki and other authors.
- Published
- 2024
205. ASGM-KG: Unveiling Alluvial Gold Mining Through Knowledge Graphs
- Author
-
Gupta, Debashis, Golder, Aditi, Fernendez, Luis, Silman, Miles, Lersen, Greg, Yang, Fan, Plemmons, Bob, Alqahtani, Sarra, and Pauca, Paul Victor
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Information Retrieval, Computer Science - Machine Learning, Computer Science - Multiagent Systems - Abstract
Artisanal and Small-Scale Gold Mining (ASGM) is a low-cost yet highly destructive mining practice, leading to environmental disasters across the world's tropical watersheds. The topic of ASGM spans multiple domains of research and information, including natural and social systems, and knowledge is often atomized across a diversity of media and documents. We therefore introduce a knowledge graph (ASGM-KG) that consolidates and provides crucial information about ASGM practices and their environmental effects. The current version of ASGM-KG consists of 1,899 triples extracted using a large language model (LLM) from documents and reports published by both non-governmental and governmental organizations. These documents were carefully selected by a group of tropical ecologists with expertise in ASGM. This knowledge graph was validated using two methods. First, a small team of ASGM experts reviewed and labeled triples as factual or non-factual. Second, we devised and applied an automated factual reduction framework that relies on a search engine and an LLM for labeling triples. Our framework performs as well as five baselines on a publicly available knowledge graph and achieves over 90% accuracy on our ASGM-KG, as validated by domain experts. ASGM-KG demonstrates an advancement in knowledge aggregation and representation for complex, interdisciplinary environmental crises such as ASGM.
- Published
- 2024
206. On the functional graph of $f(X)=X(X^{q-1}-c)^{q+1},$ over quadratic extensions of finite fields
- Author
-
Aguirre, Josimar J. R., Lemos, Abílio, and Neumann, Victor G. L.
- Subjects
Mathematics - Number Theory, 12E20, 11T06, 05C20 - Abstract
Let $\mathbb{F}_{q}$ be the finite field with $q$ elements. In this paper we will describe the dynamics of the map $f(X)=X(X^{q-1}-c)^{q+1},$ with $c\in\mathbb{F}_{q}^{\ast},$ over the finite field $\mathbb{F}_{q^2}$.
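For a concrete feel of the object being studied, one can tabulate the functional graph for the smallest case $q=3$ ($\mathbb{F}_9$ realized as $\mathbb{F}_3[i]$ with $i^2=-1$, and $c=1$ chosen arbitrarily in $\mathbb{F}_3^{\ast}$); this is an illustrative sketch, not the paper's general description:

```python
# Functional graph of f(X) = X (X^{q-1} - c)^{q+1} over F_{q^2} for q = 3.
# Elements of F_9 are pairs (a, b) meaning a + b*i with i^2 = -1 mod 3.

P = 3  # q = 3

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, n):
    r = (1, 0)
    for _ in range(n):
        r = mul(r, u)
    return r

def f(x, c):
    # f(X) = X * (X^(q-1) - c)^(q+1) with q = 3
    t = add(power(x, P - 1), ((-c) % P, 0))  # X^2 - c
    return mul(x, power(t, P + 1))

field = [(a, b) for a in range(P) for b in range(P)]
graph = {x: f(x, c=1) for x in field}  # c = 1 in F_3^*
# every element has exactly one successor, so this is a functional graph
print(len(graph), all(v in field for v in graph.values()))
```

Note that any nonzero $x$ with $x^{q-1}=c$ is sent to $0$, which is why the dynamics depend on the choice of $c$.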
- Published
- 2024
207. Gaussian Processes with Noisy Regression Inputs for Dynamical Systems
- Author
-
Wolff, Tobias M., Lopez, Victor G., and Müller, Matthias A.
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
This paper is centered around the approximation of dynamical systems by means of Gaussian processes. To this end, trajectories of such systems must be collected to be used as training data. The measurements of these trajectories are typically noisy, which implies that both the regression inputs and outputs are corrupted by noise. However, most of the literature considers only noise in the regression outputs. In this paper, we show how to account for the noise in the regression inputs in an extended Gaussian process framework to approximate scalar and multidimensional systems. We demonstrate the potential of our framework by comparing it to different state-of-the-art methods in several simulation examples., Comment: 6 pages
- Published
- 2024
208. Fine-tuning LLMs for Autonomous Spacecraft Control: A Case Study Using Kerbal Space Program
- Author
-
Carrasco, Alejandro, Rodriguez-Fernandez, Victor, and Linares, Richard
- Subjects
Computer Science - Artificial Intelligence, Astrophysics - Instrumentation and Methods for Astrophysics - Abstract
Recent trends are emerging in the use of Large Language Models (LLMs) as autonomous agents that take actions based on the content of the user text prompt. This study explores the use of fine-tuned Large Language Models (LLMs) for autonomous spacecraft control, using the Kerbal Space Program Differential Games suite (KSPDG) as a testing environment. Traditional Reinforcement Learning (RL) approaches face limitations in this domain due to insufficient simulation capabilities and data. By leveraging LLMs, specifically fine-tuning models like GPT-3.5 and LLaMA, we demonstrate how these models can effectively control spacecraft using language-based inputs and outputs. Our approach integrates real-time mission telemetry into textual prompts processed by the LLM, which then generates control actions via an agent. The results open a discussion about the potential of LLMs for space operations beyond their nominal use for text-related tasks. Future work aims to expand this methodology to other space control tasks and evaluate the performance of different LLM families. The code is available at this URL: \texttt{https://github.com/ARCLab-MIT/kspdg}., Comment: ESA SPAICE Conference 2024. arXiv admin note: text overlap with arXiv:2404.00413
- Published
- 2024
209. Understanding Enthymemes in Argument Maps: Bridging Argument Mining and Logic-based Argumentation
- Author
-
Ben-Naim, Jonathan, David, Victor, and Hunter, Anthony
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computation and Language - Abstract
Argument mining is a natural language processing technology aimed at identifying arguments in text. Furthermore, the approach is being developed to identify the premises and claims of those arguments, and to identify the relationships between arguments, including support and attack relationships. In this paper, we assume that an argument map contains the premises and claims of arguments, and the support and attack relationships between them, that have been identified by argument mining. So from a piece of text, we assume an argument map is obtained automatically by natural language processing. However, to understand and to automatically analyse that argument map, it would be desirable to instantiate it with logical arguments. Once we have the logical representation of the arguments in an argument map, we can use automated reasoning to analyze the argumentation (e.g., check the consistency of premises, check the validity of claims, and check that the labelling on each arc corresponds with the logical arguments). We address this need by using classical logic to represent the explicit information in the text, and default logic to represent the implicit information in the text. In order to investigate our proposal, we consider some specific options for instantiation., Comment: Research note
- Published
- 2024
210. $\mathcal{H}_2$-optimal Model Reduction of Linear Quadratic Output Systems in Finite Frequency Range
- Author
-
Zulfiqar, Umair, Xiao, Zhi-Hua, Song, Qiu-Yan, Uddin, Mohammad Monir, and Sreeram, Victor
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
Linear quadratic output systems constitute an important class of dynamical systems with numerous practical applications. When the order of these models is exceptionally high, simulating and analyzing these systems becomes computationally prohibitive. In such instances, model order reduction offers an effective solution by approximating the original high-order system with a reduced-order model while preserving the system's essential characteristics. In frequency-limited model order reduction, the objective is to maintain the frequency response of the original system within a specified frequency range in the reduced-order model. In this paper, a mathematical expression for the frequency-limited $\mathcal{H}_2$ norm is derived, which quantifies the error within the desired frequency interval. Subsequently, the necessary conditions for a local optimum of the frequency-limited $\mathcal{H}_2$ norm of the error are derived. The inherent difficulty in satisfying these conditions within a Petrov-Galerkin projection framework is also discussed. Based on the optimality conditions and Petrov-Galerkin projection, a stationary point iteration algorithm is proposed that enforces three of the four optimality conditions upon convergence. A numerical example is provided to illustrate the algorithm's effectiveness in accurately approximating the original high-order model within the specified frequency interval.
- Published
- 2024
211. A New Control Law for TS Fuzzy Models: Less Conservative LMI Conditions by Using Membership Functions Derivative
- Author
-
Mozelli, Leonardo Amaral and Campos, Victor Costa da Silva
- Subjects
Electrical Engineering and Systems Science - Systems and Control, 93C42, 93D15, 37B25 - Abstract
This note proposes a new type of Parallel Distributed Controller (PDC) for Takagi-Sugeno (TS) fuzzy models. Our idea consists of using two control terms based on state feedback: one composed of a convex combination of linear gains weighted by the normalized membership grades, as in the traditional PDC, and the other composed of linear gains weighted by the time-derivatives of the membership functions. We present the design conditions as Linear Matrix Inequalities, solvable through numerical optimization tools. Numerical examples are given to illustrate the advantages of the proposed approach, which contains the traditional PDC as a special case., Comment: 20 pages, 4 figures
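A minimal numeric sketch of the two-term control law (scalar state, two rules; the gains are invented numbers, not values obtained from the paper's LMI design):

```python
# Toy two-rule PDC with an added membership-derivative term:
# u = -(sum_i h_i K_i) x - (sum_i hdot_i L_i) x

def control(x, h, hdot, K, L):
    """Scalar two-rule PDC with a membership-derivative feedback term."""
    u_pdc = -sum(hi * Ki for hi, Ki in zip(h, K)) * x
    u_dot = -sum(hdi * Li for hdi, Li in zip(hdot, L)) * x
    return u_pdc + u_dot

h = [0.7, 0.3]       # normalized membership grades (sum to 1)
hdot = [0.1, -0.1]   # their time-derivatives (sum to 0)
K = [2.0, 4.0]       # per-rule state-feedback gains (made up)
L = [0.5, 0.5]       # gains weighting the membership derivatives (made up)
print(control(1.0, h, hdot, K, L))
```

Because the normalized memberships sum to one, their derivatives sum to zero, so the second term vanishes whenever the derivative gains are equal, recovering the traditional PDC as a special case.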
- Published
- 2024
212. D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning
- Author
-
Rafailov, Rafael, Hatch, Kyle, Singh, Anikait, Smith, Laura, Kumar, Aviral, Kostrikov, Ilya, Hansen-Estruch, Philippe, Kolev, Victor, Ball, Philip, Wu, Jiajun, Finn, Chelsea, and Levine, Sergey
- Subjects
Computer Science - Machine Learning, Computer Science - Robotics - Abstract
Offline reinforcement learning algorithms hold the promise of enabling data-driven RL methods that do not require costly or dangerous real-world exploration and benefit from large pre-collected datasets. This in turn can facilitate real-world applications, as well as a more standardized approach to RL research. Furthermore, offline RL methods can provide effective initializations for online finetuning to overcome challenges with exploration. However, evaluating progress on offline RL algorithms requires effective and challenging benchmarks that capture properties of real-world tasks, provide a range of task difficulties, and cover a range of challenges both in terms of the parameters of the domain (e.g., length of the horizon, sparsity of rewards) and the parameters of the data (e.g., narrow demonstration data or broad exploratory data). While considerable progress in offline RL in recent years has been enabled by simpler benchmark tasks, the most widely used datasets are increasingly saturating in performance and may fail to reflect properties of realistic tasks. We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments, based on models of real-world robotic systems, and comprising a variety of data sources, including scripted data, play-style data collected by human teleoperators, and other data sources. Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation, with some of the tasks specifically designed to require both pre-training and fine-tuning. We hope that our proposed benchmark will facilitate further progress on both offline RL and fine-tuning algorithms. Website with code, examples, tasks, and data is available at \url{https://sites.google.com/view/d5rl/}, Comment: RLC 2024
- Published
- 2024
213. De Sitter Bra-Ket Wormholes
- Author
-
Fumagalli, Alessandro, Gorbenko, Victor, and Kames-King, Joshua
- Subjects
High Energy Physics - Theory, General Relativity and Quantum Cosmology - Abstract
We study a model for the initial state of the universe based on a gravitational path integral that includes connected geometries which simultaneously produce bra and ket of the wave function. We argue that a natural object to describe this state is the Wigner distribution, which is a function on a classical phase space obtained by a certain integral transform of the density matrix. We work with Lorentzian de Sitter Jackiw-Teitelboim gravity in which we find semiclassical saddle-points for pure gravity, as well as when we include matter components such as a CFT and a classical inflaton field. We also discuss different choices of fixing time reparametrizations. In the regime of large universes our connected geometry dominates over the Hartle-Hawking saddle and gives a distribution that has a meaningful probabilistic interpretation for local observables. It does not, however, give a normalizable probability measure on the entire phase space of the theory.
- Published
- 2024
214. EXPLAIN, AGREE, LEARN: Scaling Learning for Neural Probabilistic Logic
- Author
-
Verreet, Victor, De Smet, Lennert, De Raedt, Luc, and Sansone, Emanuele
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence - Abstract
Neural probabilistic logic systems follow the neuro-symbolic (NeSy) paradigm by combining the perceptive and learning capabilities of neural networks with the robustness of probabilistic logic. Learning corresponds to likelihood optimization of the neural networks. However, to obtain the likelihood exactly, expensive probabilistic logic inference is required. To scale learning to more complex systems, we therefore propose to instead optimize a sampling based objective. We prove that the objective has a bounded error with respect to the likelihood, which vanishes when increasing the sample count. Furthermore, the error vanishes faster by exploiting a new concept of sample diversity. We then develop the EXPLAIN, AGREE, LEARN (EXAL) method that uses this objective. EXPLAIN samples explanations for the data. AGREE reweighs each explanation in concordance with the neural component. LEARN uses the reweighed explanations as a signal for learning. In contrast to previous NeSy methods, EXAL can scale to larger problem sizes while retaining theoretical guarantees on the error. Experimentally, our theoretical claims are verified and EXAL outperforms recent NeSy methods when scaling up the MNIST addition and Warcraft pathfinding problems.
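The EXPLAIN and AGREE phases can be sketched on a toy "digit addition" instance; for readability the toy enumerates all consistent explanations instead of sampling, and the "neural component" is a hand-made probability table rather than a trained network:

```python
# Illustrative sketch of the EXPLAIN/AGREE phases on a toy task where
# the label is the sum of two digits.

def explain(label):
    # EXPLAIN: enumerate explanations (digit pairs) consistent with the
    # label; the real method samples them instead of enumerating
    return [(a, b) for a in range(10) for b in range(10) if a + b == label]

def agree(samples, neural_prob):
    # AGREE: reweigh each explanation by the neural component's belief
    weights = [neural_prob[a] * neural_prob[b] for a, b in samples]
    total = sum(weights)
    return [w / total for w in weights]

# hand-made "neural" digit distribution that strongly favors 3
neural_prob = {d: (0.9 if d == 3 else 0.1 / 9) for d in range(10)}
samples = explain(6)
weights = agree(samples, neural_prob)
# LEARN would use the reweighed explanations as a training signal;
# here the (3, 3) explanation dominates because the "network" favors 3
best = samples[max(range(len(samples)), key=lambda i: weights[i])]
print(best)
```

This mirrors the division of labor in the method: logic proposes explanations of the data, and the neural component decides how much each one should count during learning.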
- Published
- 2024
215. Quantifying the informativity of emission lines to infer physical conditions in giant molecular clouds. I. Application to model predictions
- Author
-
Einig, Lucas, Palud, Pierre, Roueff, Antoine, Pety, Jérôme, Bron, Emeric, Petit, Franck Le, Gerin, Maryvonne, Chanussot, Jocelyn, Chainais, Pierre, Thouvenin, Pierre-Antoine, Languignon, David, Bešlić, Ivana, Coudé, Simon, Mazurek, Helena, Orkisz, Jan H., Santa-Maria, Miriam G., Ségal, Léontine, Zakardjian, Antoine, Bardeau, Sébastien, Demyk, Karine, Magalhães, Victor de Souza, Goicoechea, Javier R., Gratier, Pierre, Guzmán, Viviana V., Hughes, Annie, Levrier, François, Bourlot, Jacques Le, Lis, Dariusz C., Liszt, Harvey S., Peretto, Nicolas, Roueff, Evelyne, and Sievers, Albrecht
- Subjects
Astrophysics - Astrophysics of Galaxies, Statistics - Applications - Abstract
Observations of ionic, atomic, or molecular lines are performed to improve our understanding of the interstellar medium (ISM). However, the potential of a line to constrain the physical conditions of the ISM is difficult to assess quantitatively, because of the complexity of the ISM physics. The situation is even more complex when trying to assess which combinations of lines are the most useful. Therefore, observation campaigns usually try to observe as many lines as possible for as much time as possible. We search for a quantitative statistical criterion to evaluate the constraining power of a tracer (or combination of tracers) with respect to physical conditions, in order to improve our understanding of the statistical relationships between ISM tracers and physical conditions and to help observers motivate their observation proposals. The best tracers are obtained by comparing the mutual information between a physical parameter and different sets of lines. We apply this method to simulations of radio molecular lines emitted by a photodissociation region similar to the Horsehead Nebula that would be observed at the IRAM 30m telescope. We search for the best lines to constrain the visual extinction $A_v^{tot}$ or the far UV illumination $G_0$. The most informative lines change with the physical regime (e.g., cloud extinction). Short integration times of the CO isotopologue $J=1-0$ lines already yield much information on the total column density in most regimes. The best set of lines to constrain the visual extinction does not necessarily combine the most informative individual lines. Precise constraints on $G_0$ are more difficult to achieve with molecular lines. They require spectral lines emitted at the cloud surface (e.g., [CII] and [CI] lines). This approach allows one to better explore the knowledge provided by ISM codes, and to guide future observation campaigns.
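A toy version of the mutual-information criterion, with made-up discretized data rather than the paper's model predictions:

```python
# Discrete mutual information (in bits) between a physical parameter
# and a binned line intensity, estimated from paired samples.

from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """MI (bits) between two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# A perfectly informative "line": its intensity bin determines the parameter.
param = [0, 0, 1, 1, 2, 2]
line_good = [0, 0, 1, 1, 2, 2]
line_flat = [0, 0, 0, 0, 0, 0]   # carries no information about the parameter
print(mutual_information(param, line_good))  # log2(3), about 1.585 bits
print(mutual_information(param, line_flat))  # 0.0
```

Ranking candidate lines (or line sets) by such an estimate is the kind of comparison the criterion above formalizes; the paper works with continuous quantities and model-based distributions rather than this naive plug-in estimator.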
- Published
- 2024
216. Stable State Space SubSpace (S$^5$) Identification
- Author
-
Rong, Xinhui and Solo, Victor
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
State space subspace algorithms for input-output systems have been widely applied but also have a reasonably well-developed asymptotic theory dealing with consistency. However, guaranteeing the stability of the estimated system matrix is a major issue. Existing stability-guaranteed algorithms are computationally expensive, require several tuning parameters, and scale badly to high state dimensions. Here, we develop a new algorithm that is closed-form and requires no tuning parameters. It is thus computationally cheap and scales easily to high state dimensions. We also prove its consistency under reasonable conditions.
- Published
- 2024
217. RAVE Checklist: Recommendations for Overcoming Challenges in Retrospective Safety Studies of Automated Driving Systems
- Author
-
Scanlon, John M., Teoh, Eric R., Kidd, David G., Kusano, Kristofer D., Bärgman, Jonas, Chi-Johnston, Geoffrey, Di Lillo, Luigi, Favaro, Francesca, Flannagan, Carol, Liers, Henrik, Lin, Bonnie, Lindman, Magdalena, McLaughlin, Shane, Perez, Miguel, and Victor, Trent
- Subjects
Computer Science - Robotics - Abstract
The public, regulators, and domain experts alike seek to understand the effect of deployed SAE level 4 automated driving system (ADS) technologies on safety. The recent expansion of ADS technology deployments is paving the way for early stage safety impact evaluations, whereby the observational data from both an ADS and a representative benchmark fleet are compared to quantify safety performance. In January 2024, a working group of experts across academia, insurance, and industry came together in Washington, DC to discuss the current and future challenges in performing such evaluations. A subset of this working group then met, virtually, on multiple occasions to produce this paper. This paper presents the RAVE (Retrospective Automated Vehicle Evaluation) checklist, a set of fifteen recommendations for performing and evaluating retrospective ADS performance comparisons. The recommendations are centered around the concepts of (1) quality and validity, (2) transparency, and (3) interpretation. Over time, it is anticipated there will be a large and varied body of work evaluating the observed performance of these ADS fleets. Establishing and promoting good scientific practices benefits the work of stakeholders, many of whom may not be subject matter experts. This working group's intentions are to: i) strengthen individual research studies and ii) make the at-large community more informed on how to evaluate this collective body of work.
- Published
- 2024
218. Non-Gaited Legged Locomotion with Monte-Carlo Tree Search and Supervised Learning
- Author
-
Taouil, Ilyass, Amatucci, Lorenzo, Khadiv, Majid, Dai, Angela, Barasuol, Victor, Turrisi, Giulio, and Semini, Claudio
- Subjects
Computer Science - Robotics - Abstract
Legged robots are able to navigate complex terrains by continuously interacting with the environment through careful selection of contact sequences and timings. However, the combinatorial nature behind contact planning hinders the applicability of such optimization problems on hardware. In this work, we present a novel approach that optimizes gait sequences and respective timings for legged robots in the context of optimization-based controllers through the use of sampling-based methods and supervised learning techniques. We propose to bootstrap the search by learning an optimal value function in order to speed-up the gait planning procedure making it applicable in real-time. To validate our proposed method, we showcase its performance both in simulation and on hardware using a 22 kg electric quadruped robot. The method is assessed on different terrains, under external perturbations, and in comparison to a standard control approach where the gait sequence is fixed a priori.
- Published
- 2024
219. ASPEN: ASP-Based System for Collective Entity Resolution
- Author
-
Xiang, Zhiliang, Bienvenu, Meghyn, Cima, Gianluca, Gutiérrez-Basulto, Víctor, and Ibáñez-García, Yazmín
- Subjects
Computer Science - Databases - Abstract
In this paper, we present ASPEN, an answer set programming (ASP) implementation of a recently proposed declarative framework for collective entity resolution (ER). While an ASP encoding had been previously suggested, several practical issues had been neglected, most notably, the question of how to efficiently compute the (externally defined) similarity facts that are used in rule bodies. This leads us to propose new variants of the encodings (including Datalog approximations) and show how to employ different functionalities of ASP solvers to compute (maximal) solutions, and (approximations of) the sets of possible and certain merges. A comprehensive experimental evaluation of ASPEN on real-world datasets shows that the approach is promising, achieving high accuracy in real-life ER scenarios. Our experiments also yield useful insights into the relative merits of different types of (approximate) ER solutions, the impact of recursion, and factors influencing performance., Comment: Extended version of a paper accepted at KR 2024
- Published
- 2024
220. How Aligned are Human Chart Takeaways and LLM Predictions? A Case Study on Bar Charts with Varying Layouts
- Author
-
Wang, Huichen Will, Hoffswell, Jane, Thane, Sao Myat Thazin, Bursztyn, Victor S., and Bearfield, Cindy Xiong
- Subjects
Computer Science - Human-Computer Interaction - Abstract
Large Language Models (LLMs) have been adopted for a variety of visualization tasks, but how far are we from perceptually aware LLMs that can predict human takeaways? Graphical perception literature has shown that human chart takeaways are sensitive to visualization design choices, such as spatial layouts. In this work, we examine the extent to which LLMs exhibit such sensitivity when generating takeaways, using bar charts with varying spatial layouts as a case study. We conducted three experiments and tested four common bar chart layouts: vertically juxtaposed, horizontally juxtaposed, overlaid, and stacked. In Experiment 1, we identified the optimal configurations to generate meaningful chart takeaways by testing four LLMs, two temperature settings, nine chart specifications, and two prompting strategies. We found that even state-of-the-art LLMs struggled to generate semantically diverse and factually accurate takeaways. In Experiment 2, we used the optimal configurations to generate 30 chart takeaways each for eight visualizations across four layouts and two datasets in both zero-shot and one-shot settings. Compared to human takeaways, we found that the takeaways LLMs generated often did not match the types of comparisons made by humans. In Experiment 3, we examined the effect of chart context and data on LLM takeaways. We found that LLMs, unlike humans, exhibited variation in takeaway comparison types for different bar charts using the same bar layout. Overall, our case study evaluates the ability of LLMs to emulate human interpretations of data and points to challenges and opportunities in using LLMs to predict human chart takeaways., Comment: IEEE Transactions on Visualization and Computer Graphics (Proc. VIS 2024)
- Published
- 2024
221. Spatial Dynamics Behavioral Analysis of Motivational Operations Using Weighted Voronoi Diagrams
- Author
-
Hernández-Linares, Carlos Alberto, Toledo, Porfirio, Medina-Pérez, Brenda Zarahí, Hernández, Varsovia, Garrido, Martha Lorena Avendaño, Quintero, Víctor, and León, Alejandro
- Subjects
Quantitative Biology - Quantitative Methods - Abstract
This paper presents a novel approach to the analysis of spatial behavior distribution, utilizing weighted Voronoi diagrams. The objective is to map and understand how an experimental subject moves and spends time in various areas of a given space, thus identifying the areas of greatest behavioral interest. The technique entails the partitioning of the space into a grid, the designation of generator points, and the assignment of weights based on the time the subject spends in each region. The data analyzed were derived from multiple experimental sessions in which subjects were exposed to various conditions, including food deprivation, water deprivation, and combined deprivation and no deprivation. The aforementioned conditions resulted in the formation of clearly delineated spatial patterns. Weighted Voronoi diagrams provided a comprehensive and precise representation of these areas of interest, facilitating an in-depth examination of the evolution of behavioral patterns in diverse contexts, such as under different Motivational Operations. This tool offers a valuable perspective for the dynamic study of spatial behaviors in variable experimental settings., Comment: 10 pages, 3 figures
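A minimal sketch of an additively weighted Voronoi assignment on a grid, in the spirit of the partitioning described above; the generators and weights are invented for illustration:

```python
# Each grid cell is assigned to the generator with the smallest
# (distance - weight), so heavily weighted generators (e.g. regions where
# the subject spent more time) claim larger regions.

from math import hypot

def weighted_voronoi(grid_w, grid_h, generators):
    """generators: list of (x, y, weight). Returns a dict mapping each
    grid cell to the index of its (additively) weighted generator."""
    cells = {}
    for i in range(grid_w):
        for j in range(grid_h):
            cells[(i, j)] = min(
                range(len(generators)),
                key=lambda k: hypot(i - generators[k][0],
                                    j - generators[k][1]) - generators[k][2])
    return cells

# one heavily weighted generator vs. a neutral one (hypothetical values)
gens = [(1, 1, 2.0), (8, 8, 0.0)]
cells = weighted_voronoi(10, 10, gens)
share = sum(1 for v in cells.values() if v == 0) / len(cells)
print(round(share, 2))  # the weighted generator claims the larger share
```

Comparing the resulting cell shares across experimental conditions is one simple way such diagrams expose shifts in spatial behavior.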
- Published
- 2024
222. Representation-space diffusion models for generating periodic materials
- Author
-
Sinha, Anshuman, Jia, Shuyi, and Fung, Victor
- Subjects
Condensed Matter - Materials Science - Abstract
Generative models hold the promise of significantly expediting the materials design process when compared to traditional human-guided or rule-based methodologies. However, effectively generating high-quality periodic structures of materials on limited but diverse datasets remains an ongoing challenge. Here we propose a novel approach for periodic structure generation which fully respects the intrinsic symmetries, periodicity, and invariances of the structure space. Namely, we utilize differentiable, physics-based, structural descriptors which can describe periodic systems and satisfy the necessary invariances, in conjunction with a denoising diffusion model which generates new materials within this descriptor or representation space. Reconstruction is then performed on these representations using gradient-based optimization to recover the corresponding Cartesian positions of the crystal structure. This approach differs significantly from current methods by generating materials in the representation space, rather than in the Cartesian space, which is made possible using an efficient reconstruction algorithm. Consequently, known issues with respecting periodic boundaries and translational and rotational invariances during generation can be avoided, and the model training process can be greatly simplified. We show this approach is able to provide competitive performance on established benchmarks compared to current state-of-the-art methods.
- Published
- 2024
223. Neural Networks as Spin Models: From Glass to Hidden Order Through Training
- Author
-
Barney, Richard, Winer, Michael, and Galitski, Victor
- Subjects
Condensed Matter - Disordered Systems and Neural Networks, Computer Science - Machine Learning, Nonlinear Sciences - Adaptation and Self-Organizing Systems - Abstract
We explore a one-to-one correspondence between a neural network (NN) and a statistical mechanical spin model where neurons are mapped to Ising spins and weights to spin-spin couplings. The process of training an NN produces a family of spin Hamiltonians parameterized by training time. We study the magnetic phases and the melting transition temperature as training progresses. First, we prove analytically that the common initial state before training--an NN with independent random weights--maps to a layered version of the classical Sherrington-Kirkpatrick spin glass exhibiting a replica symmetry breaking. The spin-glass-to-paramagnet transition temperature is calculated. Further, we use the Thouless-Anderson-Palmer (TAP) equations--a theoretical technique to analyze the landscape of energy minima of random systems--to determine the evolution of the magnetic phases on two types of NNs (one with continuous and one with binarized activations) trained on the MNIST dataset. The two NN types give rise to similar results, showing a quick destruction of the spin glass and the appearance of a phase with a hidden order, whose melting transition temperature $T_c$ grows as a power law in training time. We also discuss the properties of the spectrum of the spin system's bond matrix in the context of rich vs. lazy learning. We suggest that this statistical mechanical view of NNs provides a useful unifying perspective on the training process, which can be viewed as selecting and strengthening a symmetry-broken state associated with the training task., Comment: 18 pages, 9 figures
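The NN-to-spin-model dictionary can be made concrete for a toy three-spin system (the coupling matrix below is invented; the paper studies large trained networks):

```python
# Neurons map to Ising spins s_i = +/-1 and weights to couplings J_ij,
# giving a network state an energy E = -sum_{i<j} J_ij s_i s_j.

import itertools

def energy(spins, J):
    return -sum(J[i][j] * spins[i] * spins[j]
                for i, j in itertools.combinations(range(len(spins)), 2))

# couplings from a toy 3-neuron "network" (symmetric, zero diagonal)
J = [[0.0, 1.0, -0.5],
     [1.0, 0.0, 0.2],
     [-0.5, 0.2, 0.0]]

# ground state by brute force over all 2^3 spin configurations
states = list(itertools.product([-1, 1], repeat=3))
ground = min(states, key=lambda s: energy(s, J))
print(ground, energy(ground, J))
```

For random couplings this energy landscape is glassy; the abstract's point is that training reshapes it toward an ordered, task-related minimum.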
- Published
- 2024
224. Event-triggered moving horizon estimation for nonlinear systems
- Author
-
Krauss, Isabelle, Schiller, Julian D., Lopez, Victor G., and Müller, Matthias A.
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
This work proposes an event-triggered moving horizon estimation (ET-MHE) scheme for general nonlinear systems. The key components of the proposed scheme are a novel event-triggering mechanism (ETM) and the suitable design of the MHE cost function. The main characteristic of our method is that the MHE's nonlinear optimization problem is only solved when the ETM triggers the transmission of measured data to the remote state estimator. If no event occurs, then the current state estimate results from an open-loop prediction using the system dynamics. Furthermore, we show robust global exponential stability of the ET-MHE under a suitable detectability condition. Finally, we illustrate the applicability of the proposed method on a nonlinear benchmark example, where we achieve estimation performance similar to standard MHE while using 86% fewer computational resources.
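The triggering logic can be sketched for a scalar linear system (the paper treats general nonlinear systems with a full MHE cost; the closed-form least-squares estimate below is a simplification for illustration):

```python
def mhe_estimate(window, a):
    """Closed-form MHE for the scalar system x+ = a*x, y = x:
    least-squares fit of the initial state over the horizon, then propagation."""
    num = sum(y * a ** k for k, y in enumerate(window))
    den = sum(a ** (2 * k) for k in range(len(window)))
    x0 = num / den
    return x0 * a ** (len(window) - 1)  # estimate at the current time

def run_et_mhe(ys, a=0.9, horizon=5, delta=0.3):
    """Event-triggered MHE: solve the MHE problem only when the innovation
    |y - open-loop prediction| exceeds delta; otherwise predict open loop."""
    estimates, events = [], 0
    x_hat = ys[0]
    buffer = []
    for y in ys:
        buffer.append(y)
        buffer = buffer[-horizon:]
        pred = a * x_hat if estimates else y
        if abs(y - pred) > delta:    # event: transmit data, solve MHE
            x_hat = mhe_estimate(buffer, a)
            events += 1
        else:                        # no event: open-loop prediction
            x_hat = pred
        estimates.append(x_hat)
    return estimates, events

# Noise-free decaying trajectory: no events should trigger after initialization.
ys = [2.0 * 0.9 ** k for k in range(20)]
est, n_events = run_et_mhe(ys)

# A jump in the trajectory forces at least one transmission and MHE solve.
ys2 = ys[:10] + [5.0 * 0.9 ** k for k in range(10)]
est2, n2 = run_et_mhe(ys2)
```

In the first run the estimator coasts on open-loop predictions the whole time, which is exactly the communication saving the ETM is designed to exploit.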
- Published
- 2024
225. Time-limited H2-optimal Model Order Reduction of Linear Systems with Quadratic Outputs
- Author
-
Zulfiqar, Umair, Xiao, Zhi-Hua, Song, Qiu-Yan, Uddin, Mohammad Monir, and Sreeram, Victor
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
An important class of dynamical systems with several practical applications is linear systems with quadratic outputs. These models have the same state equation as standard linear time-invariant systems but differ in their output equations, which are nonlinear quadratic functions of the system states. When dealing with models of exceptionally high order, the computational demands for simulation and analysis can become overwhelming. In such cases, model order reduction proves to be a useful technique, as it allows for constructing a reduced-order model that accurately represents the essential characteristics of the original high-order system while significantly simplifying its complexity. In time-limited model order reduction, the main goal is to maintain the output response of the original system within a specific time range in the reduced-order model. To assess the error within this time interval, a mathematical expression for the time-limited $\mathcal{H}_2$-norm is derived in this paper. This norm acts as a measure of the accuracy of the reduced-order model within the specified time range. Subsequently, the necessary conditions for achieving a local optimum of the time-limited $\mathcal{H}_2$-norm error are derived. The inherent inability to satisfy these optimality conditions within the Petrov-Galerkin projection framework is also discussed. After that, a stationary point iteration algorithm based on the optimality conditions and Petrov-Galerkin projection is proposed. Upon convergence, this algorithm fulfills three of the four optimality conditions. To demonstrate the effectiveness of the proposed algorithm, a numerical example is provided that showcases its ability to effectively approximate the original high-order model within the desired time interval.
- Published
- 2024
226. Comparative Evaluation of Memory Technologies for Synaptic Crossbar Arrays- Part 2: Design Knobs and DNN Accuracy Trends
- Author
-
Victor, Jeffry, Wang, Chunguang, and Gupta, Sumeet K.
- Subjects
Computer Science - Emerging Technologies ,Computer Science - Machine Learning - Abstract
Crossbar memory arrays have been touted as the workhorse of in-memory computing (IMC)-based acceleration of Deep Neural Networks (DNNs), but the associated hardware non-idealities limit their efficacy. To address this, cross-layer design solutions that reduce the impact of hardware non-idealities on DNN accuracy are needed. In Part 1 of this paper, we established the co-optimization strategies for various memory technologies and their crossbar arrays, and conducted a comparative technology evaluation in the context of IMC robustness. In this part, we analyze various design knobs such as array size and bit-slice (number of bits per device) and their impact on the performance of 8T SRAM, ferroelectric transistor (FeFET), Resistive RAM (ReRAM) and spin-orbit-torque magnetic RAM (SOT-MRAM) in the context of inference accuracy at the 7nm technology node. Further, we study the effect of circuit design solutions such as Partial Wordline Activation (PWA) and custom ADC reference levels that reduce the hardware non-idealities and comparatively analyze the response of each technology to such accuracy-enhancing techniques. Our results on ResNet-20 (with CIFAR-10) show that PWA increases accuracy by up to 32.56% while custom ADC reference levels yield up to 31.62% accuracy enhancement. We observe that compared to the other technologies, FeFET, by virtue of its small layout height and high distinguishability of its memory states, is best suited for large arrays. For higher bit-slices and a more complex dataset (ResNet-50 with CIFAR-100) we found that ReRAM matches the performance of FeFET.
- Published
- 2024
227. CURLing the Dream: Contrastive Representations for World Modeling in Reinforcement Learning
- Author
-
Kich, Victor Augusto, Bottega, Jair Augusto, Steinmetz, Raul, Grando, Ricardo Bedin, Yorozu, Ayano, and Ohya, Akihisa
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computer Vision and Pattern Recognition - Abstract
In this work, we present Curled-Dreamer, a novel reinforcement learning algorithm that integrates contrastive learning into the DreamerV3 framework to enhance performance in visual reinforcement learning tasks. By incorporating the contrastive loss from the CURL algorithm and a reconstruction loss from autoencoder, Curled-Dreamer achieves significant improvements in various DeepMind Control Suite tasks. Our extensive experiments demonstrate that Curled-Dreamer consistently outperforms state-of-the-art algorithms, achieving higher mean and median scores across a diverse set of tasks. The results indicate that the proposed approach not only accelerates learning but also enhances the robustness of the learned policies. This work highlights the potential of combining different learning paradigms to achieve superior performance in reinforcement learning applications., Comment: Paper accepted for 24th International Conference on Control, Automation and Systems (ICCAS)
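A minimal sketch of the combined objective, assuming an InfoNCE-style contrastive term as in CURL and hypothetical weights `lam_curl` and `lam_rec` (the actual DreamerV3 losses are abstracted into a single scalar):

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.5):
    """CURL-style contrastive loss: the anchor's similarity to its positive
    is scored against a set of negatives (dot products, softmax over all)."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    m = max(logits)  # numerically stable log-sum-exp
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))

def reconstruction_loss(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def curled_dreamer_loss(world_model_loss, anchor, positive, negatives,
                        x, x_hat, lam_curl=1.0, lam_rec=1.0):
    """Hypothetical weighted combination; lam_* are assumptions of this
    sketch, not values from the paper."""
    return (world_model_loss
            + lam_curl * info_nce(anchor, positive, negatives)
            + lam_rec * reconstruction_loss(x, x_hat))

anchor = [1.0, 0.0]
loss_good = curled_dreamer_loss(0.0, anchor, [1.0, 0.0], [[0.0, 1.0]], [1, 2], [1, 2])
loss_bad = curled_dreamer_loss(0.0, anchor, [0.0, 1.0], [[1.0, 0.0]], [1, 2], [1, 2])
```

A matching positive yields a lower loss than a mismatched one, which is the gradient signal the contrastive term contributes during world-model training.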
- Published
- 2024
228. Parallel Distributional Deep Reinforcement Learning for Mapless Navigation of Terrestrial Mobile Robots
- Author
-
Kich, Victor Augusto, Kolling, Alisson Henrique, de Jesus, Junior Costa, Heisler, Gabriel V., Jacobs, Hiago, Bottega, Jair Augusto, Kelbouscas, André L. da S., Ohya, Akihisa, Grando, Ricardo Bedin, Drews-Jr, Paulo Lilles Jorge, and Gamarra, Daniel Fernando Tello
- Subjects
Computer Science - Robotics - Abstract
This paper introduces novel deep reinforcement learning (Deep-RL) techniques using parallel distributional actor-critic networks for navigating terrestrial mobile robots. Our approaches use laser range readings, relative distance, and angle to the target to guide the robot. We trained agents in the Gazebo simulator and deployed them in real scenarios. Results show that parallel distributional Deep-RL algorithms enhance decision-making and outperform non-distributional and behavior-based approaches in navigation and spatial generalization., Comment: Paper accepted at the 24th International Conference on Control, Automation and Systems (ICCAS)
- Published
- 2024
229. Canonical analysis of linearized $\lambda R$ gravity plus a Chern-Simons term
- Author
-
Escalante, Alberto, Pantoja-Gonzalez, J. Aldair, and Pérez-Aquino, Victor Julian
- Subjects
General Relativity and Quantum Cosmology ,High Energy Physics - Theory - Abstract
The Hamiltonian analysis for the linearized $\lambda R$ gravity plus a Chern-Simons term is performed. The first-class and second-class constraints for arbitrary values of $\lambda$ are presented, and one physical degree of freedom is reported. The second-class constraints are removed, and the corresponding generalized Dirac brackets are constructed; then, the difference between theories with different values of $\lambda$ is highlighted.
- Published
- 2024
230. Mechanisms of de-icing by surface Rayleigh and plate Lamb acoustic waves
- Author
-
Pandey, Shilpi, del Moral, Jaime, Jacob, Stefan, Montes, Laura, Gil-Rostra, Jorge, Frechilla, Alejandro, Karimzadeh, Atefeh, Rico, Victor J., Kantar, Raul, Kandelin, Niklas, Santos, Carmen Lopez, Koivuluoto, Heli, Angurel, Luis, Winkler, Andreas, Borras, Ana, and Elipe, Agustin R. Gonzalez
- Subjects
Condensed Matter - Materials Science ,Physics - Fluid Dynamics - Abstract
Acoustic waves (AW) have recently emerged as an energy-efficient ice removal procedure compatible with functional and industrial-relevant substrates. However, critical aspects at fundamental and experimental levels have yet to be disclosed to optimize their operational conditions. Identifying the processes and mechanisms by which different types of AWs induce de-icing is one of these issues. Herein, using model LiNbO3 systems and two types of interdigitated transducers, we analyze the de-icing and anti-icing efficiencies and mechanisms driven by Rayleigh surface acoustic waves (R-SAW) and Lamb waves with 120 and 510 µm wavelengths, respectively. Through the experimental analysis of de-icing and active anti-icing processes and the finite element simulation of the AW generation, propagation, and interaction with small ice aggregates, we disclose that Lamb waves are more favorable than R-SAWs to induce de-icing and/or prevent the freezing of droplets. Prospects for applications of this study are supported by proof-of-concept experiments, including de-icing in an ice wind tunnel, demonstrating that Lamb waves can efficiently remove ice layers covering large LiNbO3 substrates. Results indicate that the de-icing mechanism may differ for Lamb waves or R-SAWs and that the wavelength must be considered as an important parameter for controlling the efficiency.
- Published
- 2024
231. Monero Traceability Heuristics: Wallet Application Bugs and the Mordinal-P2Pool Perspective
- Author
-
Hammad, Nada and Victor, Friedhelm
- Subjects
Computer Science - Cryptography and Security - Abstract
Privacy-focused cryptoassets like Monero are intentionally difficult to trace. Over the years, several traceability heuristics have been proposed, most of which have been rendered ineffective with subsequent protocol upgrades. Between 2019 and 2023, the Monero wallet application bugs "Differ By One" and "10 Block Decoy Bug" have been observed, identified, and discussed in the Monero community. In addition, a decentralized mining pool named P2Pool has proliferated, and a controversial UTXO NFT imitation known as Mordinals has been tried for Monero. In this paper, we systematically describe the traceability heuristics that have emerged from these developments and evaluate their quality based on ground truth and through pairwise comparisons. We also explore the temporal perspective and show which of these heuristics have been applicable over the past years, what fraction of decoys could be eliminated, and what the remaining effective ring size is. Our findings illustrate that most of the heuristics have a high precision, that the "10 Block Decoy Bug" and the Coinbase decoy identification heuristics have had the most impact between 2019 and 2023, and that the former could be used to evaluate future heuristics if they are also applicable during that time frame., Comment: 8 pages, 11 figures, author version of IEEE International Conference on Blockchain and Cryptocurrency 2024 paper
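The "remaining effective ring size" bookkeeping can be sketched as follows (the heuristics that actually identify decoys are outside this sketch):

```python
def effective_ring_sizes(rings):
    """Given rings as (ring_size, decoys_eliminated) pairs, return the
    per-ring effective ring size and the overall fraction of decoys removed."""
    eff = [n - k for n, k in rings]
    total_decoys = sum(n - 1 for n, _ in rings)  # one ring member is the real spend
    removed = sum(k for _, k in rings)
    return eff, removed / total_decoys

# Example: rings of size 11 with varying numbers of decoys ruled out by heuristics.
rings = [(11, 0), (11, 4), (11, 10)]
eff, frac = effective_ring_sizes(rings)
```

An effective ring size of 1 means the real spend is fully identified, which is how heuristic impact is ultimately measured.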
- Published
- 2024
232. Moving past point-contacts: Extending the ALIP model to humanoids with non-trivial feet using hierarchical, full-body momentum control
- Author
-
Paredes, Victor C., Hagen, Daniel A., Chesebrough, Samuel W., Swann, Riley, Garagic, Denis, and Hereid, Ayonga
- Subjects
Computer Science - Robotics ,Electrical Engineering and Systems Science - Systems and Control - Abstract
The Angular-Momentum Linear Inverted Pendulum (ALIP) model is a promising motion planner for bipedal robots. However, it relies on two assumptions: (1) the robot has point-contact feet or passive ankles, and (2) the angular momentum around the center of mass, known as centroidal angular momentum, is negligible. This paper addresses the question of whether the ALIP paradigm can be applied to more general bipedal systems with complex foot geometry (e.g., flat feet) and nontrivial torso/limb inertia and mass distribution (e.g., non-centralized arms). In such systems, the dynamics introduce non-negligible centroidal momentum and contact wrenches at the feet, rendering the assumptions of the ALIP model invalid. This paper presents the ALIP planner for general bipedal robots with non-point-contact feet through the use of a task-space whole-body controller that regulates centroidal momentum, thereby ensuring that the robot's behavior aligns with the desired template dynamics. To demonstrate the effectiveness of our proposed approach, we conduct simulations using the Sarcos Guardian XO robot, which is a hybrid humanoid/exoskeleton with large, offset feet. The results demonstrate the practicality and effectiveness of our approach in achieving stable and versatile bipedal locomotion., Comment: 7 pages, 9 figures
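For reference, the point-contact ALIP template that the paper generalizes can be written in its standard form from the ALIP literature (the contact wrenches and centroidal momentum of non-trivial feet are what the proposed whole-body controller regulates away):

```latex
\dot{x}_c = \frac{L_y}{m\, z_H}, \qquad \dot{L}_y = m g\, x_c + \tau_a ,
```

where $x_c$ is the horizontal center-of-mass position relative to the contact point, $L_y$ the angular momentum about the contact point, $z_H$ the (constant) center-of-mass height, and $\tau_a$ the ankle torque, which vanishes for point-contact feet or passive ankles.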
- Published
- 2024
233. Knowledge Base Embeddings: Semantics and Theoretical Properties
- Author
-
Bourgaux, Camille, Guimarães, Ricardo, Koudijs, Raoul, Lacerda, Victor, and Ozaki, Ana
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Logic in Computer Science - Abstract
Research on knowledge graph embeddings has recently evolved into knowledge base embeddings, where the goal is not only to map facts into vector spaces but also constrain the models so that they take into account the relevant conceptual knowledge available. This paper examines recent methods that have been proposed to embed knowledge bases in description logic into vector spaces through the lens of their geometric-based semantics. We identify several relevant theoretical properties, which we draw from the literature and sometimes generalize or unify. We then investigate how concrete embedding methods fit in this theoretical framework., Comment: This is an extended version of a paper appearing at the 21st International Conference on Principles of Knowledge Representation and Reasoning (KR 2024). 17 pages
- Published
- 2024
234. Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil
- Author
-
Locatelli, Marcelo Sartori, Miranda, Matheus Prado, Costa, Igor Joaquim da Silva, Prates, Matheus Torres, Thomé, Victor, Monteiro, Mateus Zaparoli, Lacerda, Tomas, Pagano, Adriana, Neto, Eduardo Rios, Meira Jr., Wagner, and Almeida, Virgilio
- Subjects
Computer Science - Computation and Language ,Computer Science - Computers and Society - Abstract
The Exame Nacional do Ensino M\'edio (ENEM) is a pivotal test for Brazilian students, required for admission to a significant number of universities in Brazil. The test consists of four objective high-school level tests on Math, Humanities, Natural Sciences and Languages, and one writing essay. Students' answers to the test and to the accompanying socioeconomic status questionnaire are made public every year (albeit anonymized) due to transparency policies from the Brazilian Government. In the context of large language models (LLMs), these data lend themselves nicely to comparing different groups of humans with AI, as we can have access to human and machine answer distributions. We leverage these characteristics of the ENEM dataset and compare GPT-3.5 and 4, and MariTalk, a model trained using Portuguese data, to humans, aiming to ascertain how their answers relate to real societal groups and what that may reveal about the model biases. We divide the human groups by socioeconomic status (SES) and compare their answer distribution with LLMs for each question and for the essay. We find no significant biases when comparing LLM performance to humans on the multiple-choice Brazilian Portuguese tests, as the distance between model and human answers is mostly determined by the human accuracy. A similar conclusion is reached by looking at the generated text: when analyzing the essays, we observe that human and LLM essays differ in a few key factors, one being the choice of words, where model essays were easily separable from human ones. The texts also differ syntactically, with LLM-generated essays exhibiting, on average, shorter sentences and fewer thought units, among other differences. These results suggest that, for Brazilian Portuguese in the ENEM context, LLM outputs represent no group of humans, being significantly different from the answers of Brazilian students across all tests., Comment: Accepted at the Seventh AAAI/ACM Conference on AI, Ethics and Society (AIES 2024). 14 pages, 4 figures
- Published
- 2024
235. Towards aerodynamic surrogate modeling based on $\beta$-variational autoencoders
- Author
-
Francés-Belda, Víctor, Solera-Rico, Alberto, Nieto-Centenero, Javier, Andrés, Esther, Vila, Carlos Sanmiguel, and Castellanos, Rodrigo
- Subjects
Computer Science - Machine Learning ,Physics - Fluid Dynamics - Abstract
Surrogate models combining dimensionality reduction and regression techniques are essential to reduce the need for costly high-fidelity CFD data. New approaches using $\beta$-Variational Autoencoder ($\beta$-VAE) architectures have shown promise in obtaining high-quality low-dimensional representations of high-dimensional flow data while enabling physical interpretation of their latent spaces. We propose a surrogate model based on latent space regression to predict pressure distributions on a transonic wing given the flight conditions: Mach number and angle of attack. The $\beta$-VAE model, enhanced with Principal Component Analysis (PCA), maps high-dimensional data to a low-dimensional latent space, showing a direct correlation with flight conditions. Regularization through $\beta$ requires careful tuning to improve the overall performance, while PCA pre-processing aids in constructing an effective latent space, improving autoencoder training and performance. Gaussian Process Regression is used to predict latent space variables from flight conditions, showing robust behavior independent of $\beta$, and the decoder reconstructs the high-dimensional pressure field data. This pipeline provides insight into unexplored flight conditions. Additionally, a fine-tuning process of the decoder further refines the model, reducing dependency on $\beta$ and enhancing accuracy. The structured latent space, robust regression performance, and significant improvements from fine-tuning collectively create a highly accurate and efficient surrogate model. Our methodology demonstrates the effectiveness of $\beta$-VAEs for aerodynamic surrogate modeling, offering a rapid, cost-effective, and reliable alternative for aerodynamic data prediction., Comment: 18 pages, 12 figures
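The regression stage of the pipeline (flight conditions to latent variables) can be sketched with a pure-Python Gaussian Process posterior mean; the β-VAE encoder/decoder and PCA pre-processing are omitted, and the one-dimensional latent space and RBF kernel length scale are assumptions of this sketch:

```python
import math

def rbf(u, v, length=1.0):
    """RBF (squared-exponential) kernel."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2 * length ** 2))

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(X, y, x_star, noise=1e-6):
    """GP posterior mean at x_star: k_*^T (K + noise*I)^{-1} y."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    alpha = solve(K, y)
    return sum(rbf(x_star, xi) * alpha[i] for i, xi in enumerate(X))

# Flight conditions (Mach, angle of attack) -> one latent coordinate.
X = [[0.70, 1.0], [0.75, 2.0], [0.80, 3.0]]
y = [0.1, 0.4, 0.9]
z = gp_predict(X, y, [0.75, 2.0])
```

With near-zero observation noise the GP interpolates the training points, which is the behavior exploited when querying unexplored flight conditions near the sampled ones.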
- Published
- 2024
236. AutoGen Studio: A No-Code Developer Tool for Building and Debugging Multi-Agent Systems
- Author
-
Dibia, Victor, Chen, Jingya, Bansal, Gagan, Syed, Suff, Fourney, Adam, Zhu, Erkang, Wang, Chi, and Amershi, Saleema
- Subjects
Computer Science - Software Engineering ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Human-Computer Interaction ,Computer Science - Machine Learning - Abstract
Multi-agent systems, where multiple agents (generative AI models + tools) collaborate, are emerging as an effective pattern for solving long-running, complex tasks in numerous domains. However, specifying their parameters (such as models, tools, and orchestration mechanisms) and debugging them remains challenging for most developers. To address this challenge, we present AUTOGEN STUDIO, a no-code developer tool for rapidly prototyping, debugging, and evaluating multi-agent workflows built upon the AUTOGEN framework. AUTOGEN STUDIO offers a web interface and a Python API for representing LLM-enabled agents using a declarative (JSON-based) specification. It provides an intuitive drag-and-drop UI for agent workflow specification, interactive evaluation and debugging of workflows, and a gallery of reusable agent components. We highlight four design principles for no-code multi-agent developer tools and contribute an open-source implementation at https://github.com/microsoft/autogen/tree/main/samples/apps/autogen-studio, Comment: 8 pages
- Published
- 2024
237. Kolmogorov-Arnold Network for Online Reinforcement Learning
- Author
-
Kich, Victor Augusto, Bottega, Jair Augusto, Steinmetz, Raul, Grando, Ricardo Bedin, Yorozu, Ayano, and Ohya, Akihisa
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Kolmogorov-Arnold Networks (KANs) have shown potential as an alternative to Multi-Layer Perceptrons (MLPs) in neural networks, providing universal function approximation with fewer parameters and reduced memory usage. In this paper, we explore the use of KANs as function approximators within the Proximal Policy Optimization (PPO) algorithm. We evaluate this approach by comparing its performance to the original MLP-based PPO using the DeepMind Control Proprio Robotics benchmark. Our results indicate that the KAN-based reinforcement learning algorithm can achieve comparable performance to its MLP-based counterpart, often with fewer parameters. These findings suggest that KANs may offer a more efficient option for reinforcement learning models., Comment: Paper accepted at 24th International Conference on Control, Automation and Systems (ICCAS)
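A minimal KAN-style layer can be sketched as below; Gaussian bumps stand in for the B-spline bases used in the KAN literature, and within PPO such a layer would replace the MLP hidden layers (the training loop is omitted):

```python
import math
import random

class KANLayer:
    """Minimal KAN-style layer: each edge applies a learnable univariate
    function, represented here as a linear combination of fixed Gaussian
    bumps with learnable coefficients (a stand-in for B-spline bases)."""
    def __init__(self, n_in, n_out, n_basis=5, seed=0):
        rng = random.Random(seed)
        # Basis centers spread uniformly over the assumed input range [-1, 1].
        self.centers = [-1.0 + 2.0 * k / (n_basis - 1) for k in range(n_basis)]
        # coeffs[j][i][k]: weight of basis k on the edge from input i to output j.
        self.coeffs = [[[rng.gauss(0, 0.3) for _ in range(n_basis)]
                        for _ in range(n_in)] for _ in range(n_out)]

    def phi(self, j, i, x):
        """Learnable univariate function on edge i -> j."""
        return sum(c * math.exp(-4.0 * (x - mu) ** 2)
                   for c, mu in zip(self.coeffs[j][i], self.centers))

    def forward(self, x):
        # Kolmogorov-Arnold style: each output node sums univariate functions
        # of the individual inputs, instead of a weight matrix + activation.
        return [sum(self.phi(j, i, xi) for i, xi in enumerate(x))
                for j in range(len(self.coeffs))]

layer = KANLayer(n_in=3, n_out=2)
out = layer.forward([0.1, -0.5, 0.9])
```

The parameter count is `n_in * n_out * n_basis`, which is the knob behind the "comparable performance with fewer parameters" trade-off studied in the paper.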
- Published
- 2024
238. Exploring Scalability in Large-Scale Time Series in DeepVATS framework
- Author
-
Santamaria-Valenzuela, Inmaculada, Rodriguez-Fernandez, Victor, and Camacho, David
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Visual analytics is essential for studying large time series due to its ability to reveal trends, anomalies, and insights. DeepVATS is a tool that merges Deep Learning (Deep) with Visual Analytics (VA) for the analysis of large time series data (TS). It has three interconnected modules. The Deep Learning module, developed in R, manages the loading of datasets and Deep Learning models from and to the Storage module. This module also supports model training and the acquisition of the embeddings from the latent space of the trained model. The Storage module operates using the Weights and Biases system. Subsequently, these embeddings can be analyzed in the Visual Analytics module. This module, based on an R Shiny application, allows the adjustment of the parameters related to the projection and clustering of the embeddings space. Once these parameters are set, interactive plots representing both the embeddings and the time series are shown. This paper introduces the tool and examines its scalability through log analytics. The evolution of the execution time is examined while the length of the time series is varied. This is achieved by resampling a large data series into smaller subsets and logging the main execution and rendering times for later analysis of scalability., Comment: Admitted pending publication in Lecture Notes in Network and Systems (LNNS) series (Springer). Code available at https://github.com/vrodriguezf/deepvats
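The scalability experiment described above can be sketched as a timing loop over subsets of a long series (truncation and a moving-average workload stand in for the tool's actual resampling and rendering steps):

```python
import math
import time

def moving_average(series, w=32):
    """Simple O(n) workload standing in for an analysis step of the pipeline."""
    acc, out = 0.0, []
    for i, v in enumerate(series):
        acc += v
        if i >= w:
            acc -= series[i - w]
        out.append(acc / min(i + 1, w))
    return out

def scalability_log(full_series, lengths):
    """Truncate the series to several lengths and log wall-clock time for each,
    mirroring the procedure of resampling a large series into smaller subsets."""
    log = []
    for n in lengths:
        subset = full_series[:n]
        t0 = time.perf_counter()
        moving_average(subset)
        log.append((n, time.perf_counter() - t0))
    return log

series = [math.sin(0.01 * k) for k in range(200_000)]
log = scalability_log(series, [10_000, 50_000, 200_000])
```

Plotting the logged `(length, time)` pairs then reveals how execution time scales with series length.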
- Published
- 2024
239. Enhanced Prototypical Part Network (EPPNet) For Explainable Image Classification Via Prototypes
- Author
-
Atote, Bhushan and Sanchez, Victor
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Explainable Artificial Intelligence (xAI) has the potential to enhance the transparency and trust of AI-based systems. Although accurate predictions can be made using Deep Neural Networks (DNNs), the process used to arrive at such predictions is usually hard to explain. In terms of perceptibly human-friendly representations, such as word phrases in text or super-pixels in images, prototype-based explanations can justify a model's decision. In this work, we introduce a DNN architecture for image classification, the Enhanced Prototypical Part Network (EPPNet), which achieves strong performance while discovering relevant prototypes that can be used to explain the classification results. This is achieved by introducing a novel cluster loss that helps to discover more relevant human-understandable prototypes. We also introduce a faithfulness score to evaluate the explainability of the results based on the discovered prototypes. Our score not only accounts for the relevance of the learned prototypes but also for the performance of the model. Our evaluations on the CUB-200-2011 dataset show that the EPPNet outperforms state-of-the-art xAI-based methods in terms of both classification accuracy and explainability., Comment: Accepted at the International Conference on Image Processing (ICIP), IEEE (2024); we will update to the new version after publication through IEEE
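The prototype-based decision that underlies such explanations can be sketched as nearest-prototype classification (the cluster loss and the faithfulness score themselves are not reproduced here; prototypes and labels are illustrative):

```python
def classify_by_prototypes(embedding, prototypes):
    """Nearest-prototype decision: score the embedding against each learned
    prototype; the class of the closest prototype wins. The per-prototype
    distances double as the explanation ('this looks like prototype k')."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    scored = [(dist2(embedding, p), label) for p, label in prototypes]
    _, label = min(scored)
    return label, scored

# Hypothetical 2-D embeddings of two learned prototypical parts.
prototypes = [([1.0, 0.0], "sparrow"), ([0.0, 1.0], "warbler")]
label, evidence = classify_by_prototypes([0.9, 0.2], prototypes)
```

Because the decision is a function of the distances in `evidence`, each prediction comes with a ranked list of the prototypes that justify it.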
- Published
- 2024
240. Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers
- Author
-
Scherer, Moritz, Macan, Luka, Jung, Victor, Wiese, Philip, Bompani, Luca, Burrello, Alessio, Conti, Francesco, and Benini, Luca
- Subjects
Computer Science - Machine Learning ,Computer Science - Hardware Architecture - Abstract
With the rise of Embodied Foundation Models (EFMs), most notably Small Language Models (SLMs), adapting Transformers for edge applications has become a very active field of research. However, achieving end-to-end deployment of SLMs on microcontroller (MCU)-class chips without high-bandwidth off-chip main memory access is still an open challenge. In this paper, we demonstrate high-efficiency end-to-end SLM deployment on a multicore RISC-V (RV32) MCU augmented with ML instruction extensions and a hardware neural processing unit (NPU). To automate the exploration of the constrained, multi-dimensional memory vs. computation tradeoffs involved in aggressive SLM deployment on heterogeneous (multicore+NPU) resources, we introduce Deeploy, a novel Deep Neural Network (DNN) compiler, which generates highly-optimized C code requiring minimal runtime support. We demonstrate that Deeploy generates end-to-end code for executing SLMs, fully exploiting the RV32 cores' instruction extensions and the NPU: We achieve leading-edge energy and throughput of 490 µJ/token, at 340 tokens/s for an SLM trained on the TinyStories dataset, running for the first time on an MCU-class device without external memory., Comment: Accepted for publication at ESWEEK - CASES 2024
- Published
- 2024
241. Enhanced Cooper Pairing via Random Matrix Phonons in Superconducting Grains
- Author
-
Grankin, Andrey, Hafezi, Mohammad, and Galitski, Victor
- Subjects
Condensed Matter - Superconductivity ,Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
There is rich experimental evidence that granular superconductors and superconducting films often exhibit a higher transition temperature, $T_{c}$, than that in bulk samples of the same material. This paper suggests that this enhancement hinges on random matrix phonons mediating Cooper pairing more efficiently than bulk phonons. We develop the Eliashberg theory of superconductivity in chaotic grains, calculate the random phonon spectrum and solve the Eliashberg equations numerically. Self-averaging of the effective electron-phonon coupling constant is noted, which allows us to fit the numerical data with analytical results based on a generalization of the Berry conjecture. The key insight is that the phonon density of states, and hence $T_{c}$, shows an enhancement proportional to the ratio of the perimeter and area of the grain - the Weyl law. We benchmark our results for aluminum films, and find an enhancement of $T_{c}$ of about $10\%$ for a randomly-generated shape. A larger enhancement of $T_{c}$ is readily possible by optimizing grain geometries. We conclude by noticing that mesoscopic shape fluctuations in realistic granular structures should give rise to a further enhancement of global $T_{c}$ due to the formation of a percolating Josephson network.
- Published
- 2024
242. Advancing spectroscopic understanding of HOCS$^+$: Laboratory investigations and astronomical implications
- Author
-
Lattanzi, Valerio, Sanz-Novo, Miguel, Rivilla, Víctor M., Araki, Mitsunori, Bunn, Hayley A., Martín-Pintado, Jesús, Jiménez-Serra, Izaskun, and Caselli, Paola
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
Sulphur-bearing species play crucial roles in interstellar chemistry, yet their precise characterisation remains challenging. Here, we present laboratory experiments aimed at extending the high-resolution spectroscopy of protonated carbonyl sulphide (HOCS$^+$), a recently detected molecular ion in space. Using a frequency-modulated free-space absorption spectrometer, we detected rotational transitions of HOCS$^+$ in an extended negative glow discharge with a mixture of H$_2$ and OCS, extending the high-resolution rotational characterisation of the cation well into the millimetre wave region (200-370 GHz). Comparisons with prior measurements and quantum chemical calculations revealed an overall agreement in the spectroscopic parameters. With the new spectroscopic dataset in hand, we re-investigated the observations of HOCS$^+$ towards G+0.693-0.027, which were initially based solely on K$_a$ = 0 lines contaminated by HNC$^{34}$S. This re-investigation enabled the detection of weak K$_a$ = 0 transitions, free from HNC$^{34}$S contamination. Our high-resolution spectroscopic characterisation also provides valuable insights for future millimetre and submillimetre astronomical observations of these species in different interstellar environments. In particular, the new high-resolution catalogue will facilitate the search for this cation in cold dark clouds, where very narrow line widths are typically observed., Comment: 9 pages, 3 figures, and 3 tables. Accepted for publication in Astronomy & Astrophysics
- Published
- 2024
243. Generative Design of Periodic Orbits in the Restricted Three-Body Problem
- Author
-
Gil, Alvaro Francisco, Litteri, Walther, Rodriguez-Fernandez, Victor, Camacho, David, and Vasile, Massimiliano
- Subjects
Computer Science - Machine Learning ,Astrophysics - Earth and Planetary Astrophysics ,Computer Science - Artificial Intelligence - Abstract
The Three-Body Problem has fascinated scientists for centuries and it has been crucial in the design of modern space missions. Recent developments in Generative Artificial Intelligence hold transformative promise for addressing this longstanding problem. This work investigates the use of Variational Autoencoder (VAE) and its internal representation to generate periodic orbits. We utilize a comprehensive dataset of periodic orbits in the Circular Restricted Three-Body Problem (CR3BP) to train deep-learning architectures that capture key orbital characteristics, and we set up physical evaluation metrics for the generated trajectories. Through this investigation, we seek to enhance the understanding of how Generative AI can improve space mission planning and astrodynamics research, leading to novel, data-driven approaches in the field., Comment: SPAICE Conference 2024 (7 pages)
- Published
- 2024
244. Optimal sums of three cubes in $\mathbb{F}_q[t]$
- Author
-
Browning, Tim, Glas, Jakob, and Wang, Victor Y.
- Subjects
Mathematics - Number Theory ,11D45 (11D25, 11G40, 11M50, 11P55, 11T55) - Abstract
We use the circle method to prove that a density-1 proportion of elements in $\mathbb{F}_q[t]$ are representable as a sum of three cubes of essentially minimal degree from $\mathbb{F}_q[t]$, assuming the Ratios Conjecture and that the characteristic is bigger than 3. Roughly speaking, to do so, we upgrade an order-of-magnitude result to the full asymptotic formula conjectured by Hooley in the number field setting., Comment: 23 pages
- Published
- 2024
245. SLIM-RAFT: A Novel Fine-Tuning Approach to Improve Cross-Linguistic Performance for Mercosur Common Nomenclature
- Author
-
Di Oliveira, Vinícius, Bezerra, Yuri Façanha, Weigang, Li, Brom, Pedro Carvalho, and Celestino, Victor Rafael R.
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Natural language processing (NLP) has seen significant advancements with the advent of large language models (LLMs). However, substantial improvements are still needed for languages other than English, especially for specific domains like the applications of the Mercosur Common Nomenclature (NCM), a Brazilian Harmonized System (HS). To address this gap, this study uses TeenyTineLLaMA, a foundational Portuguese LLM, as the LLM source for NCM application processing. Additionally, a simplified Retrieval-Augmented Fine-Tuning (RAFT) technique, termed SLIM-RAFT, is proposed for task-specific fine-tuning of LLMs. This approach retains the chain-of-thought (CoT) methodology for prompt development in a more concise and streamlined manner, utilizing brief and focused documents for training. The proposed model demonstrates an efficient and cost-effective alternative for fine-tuning smaller LLMs, significantly outperforming TeenyTineLLaMA and ChatGPT-4 in the same task. Although the research focuses on NCM applications, the methodology can be easily adapted for HS applications worldwide., Comment: 13 pages, 1 figure, to be published in International Conference on Web Information Systems and Technologies - WEBIST 2024 proceedings
- Published
- 2024
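The retrieval-augmented fine-tuning idea in the abstract above can be sketched as the assembly of one supervised training record that pairs a brief supporting document with a concise chain-of-thought answer. The field names and the Portuguese NCM example are hypothetical illustrations, not the paper's actual data format.

```python
# Hedged sketch: one RAFT-style fine-tuning record pairing a short document
# excerpt with a concise chain-of-thought. Field names are hypothetical.

def make_raft_record(question, document, reasoning, answer):
    """Pack a brief supporting document plus a concise chain-of-thought
    into a single prompt/completion pair for supervised fine-tuning."""
    prompt = (f"Documento: {document}\n"
              f"Pergunta: {question}\n"
              "Responda passo a passo.")
    completion = f"{reasoning}\nResposta: {answer}"
    return {"prompt": prompt, "completion": completion}

record = make_raft_record(
    question="Qual o capitulo NCM para cafe torrado?",
    document="Capitulo 09: cafe, cha, mate e especiarias.",
    reasoning="Cafe torrado pertence ao capitulo de cafe e especiarias.",
    answer="09",
)
```

Keeping the retrieved document "brief and focused", as the abstract puts it, is the SLIM part: shorter contexts cut fine-tuning cost while preserving the CoT structure.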
246. Detecting Quantum and Classical Phase Transitions via Unsupervised Machine Learning of the Fisher Information Metric
- Author
-
Kasatkin, Victor, Mozgunov, Evgeny, Ezzell, Nicholas, and Lidar, Daniel
- Subjects
Quantum Physics - Abstract
The detection of quantum and classical phase transitions in the absence of an order parameter is possible using the Fisher information metric (FIM), also known as fidelity susceptibility. Here, we propose and investigate an unsupervised machine learning (ML) task: estimating the FIM given limited samples from a multivariate probability distribution of measurements made throughout the phase diagram. We utilize an unsupervised ML method called ClassiFIM (developed in a companion paper) to solve this task and demonstrate its empirical effectiveness in detecting both quantum and classical phase transitions using a variety of spin and fermionic models, for which we generate several publicly available datasets with accompanying ground-truth FIM. We find that ClassiFIM reliably detects both topological (e.g., XXZ chain) and dynamical (e.g., metal-insulator transition in Hubbard model) quantum phase transitions. We perform a detailed quantitative comparison with prior unsupervised ML methods for detecting quantum phase transitions. We demonstrate that ClassiFIM is competitive with these prior methods in terms of appropriate accuracy metrics while requiring significantly less resource-intensive training data compared to the original formulation of the prior methods. In particular, ClassiFIM only requires classical (single-basis) measurements. As part of our methodology development, we prove several theorems connecting the classical and quantum fidelity susceptibilities through equalities or bounds. We also significantly expand the existence conditions of the fidelity susceptibility, e.g., by relaxing standard differentiability conditions. These results may be of independent interest to the mathematical physics community., Comment: 31 pages, 10 figures; acknowledged two papers with three existing methods for FIM-Estimation task
- Published
- 2024
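The quantity being estimated in the abstract above can be illustrated with a generic finite-difference estimator of the classical fidelity susceptibility from measurement samples at two nearby parameter values. This is a standard textbook construction on a toy model, not the ClassiFIM method itself.

```python
import numpy as np

# Hedged sketch: classical fidelity susceptibility (a proxy for the FIM)
# from samples at lambda and lambda + dlam, via
#   F = sum_x sqrt(p(x | l) * p(x | l')),   chi ~ 2 * (1 - F) / dlam**2.

def fidelity_susceptibility(samples_a, samples_b, dlam, n_outcomes):
    pa = np.bincount(samples_a, minlength=n_outcomes) / len(samples_a)
    pb = np.bincount(samples_b, minlength=n_outcomes) / len(samples_b)
    fidelity = np.sum(np.sqrt(pa * pb))   # Bhattacharyya coefficient
    return 2.0 * (1.0 - fidelity) / dlam**2

rng = np.random.default_rng(1)
# Toy model: the outcome distribution drifts with the parameter lambda.
sa = rng.binomial(8, 0.40, size=50_000)   # samples at lambda
sb = rng.binomial(8, 0.45, size=50_000)   # samples at lambda + dlam
chi = fidelity_susceptibility(sa, sb, dlam=0.05, n_outcomes=9)
```

A phase transition would show up as a sharp peak in chi as lambda is swept, which is why estimating this quantity from limited samples suffices to locate transitions without an order parameter.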
247. ClassiFIM: An Unsupervised Method To Detect Phase Transitions
- Author
-
Kasatkin, Victor, Mozgunov, Evgeny, Ezzell, Nicholas, Mishra, Utkarsh, Hen, Itay, and Lidar, Daniel
- Subjects
Computer Science - Machine Learning - Abstract
Estimation of the Fisher Information Metric (FIM-estimation) is an important task that arises in unsupervised learning of phase transitions, a problem proposed by physicists. This work completes the definition of the task by defining rigorous evaluation metrics distMSE, distMSEPS, and distRE and introduces ClassiFIM, a novel machine learning method designed to solve the FIM-estimation task. Unlike existing methods for unsupervised learning of phase transitions, ClassiFIM directly estimates a well-defined quantity (the FIM), allowing it to be rigorously compared to any other present or future method that estimates the same quantity. ClassiFIM transforms a dataset for the FIM-estimation task into a dataset for an auxiliary binary classification task and involves selecting and training a model for the latter. We prove that the output of ClassiFIM approaches the exact FIM in the limit of infinite dataset size and under certain regularity conditions. We implement ClassiFIM on multiple datasets, including datasets describing classical and quantum phase transitions, and find that it achieves a good ground truth approximation with modest computational resources. Furthermore, we independently implement two alternative state-of-the-art methods for unsupervised estimation of phase transition locations on the same datasets and find that ClassiFIM predicts such locations at least as well as these other methods. To emphasize the generality of our method, we also propose and generate the MNIST-CNN dataset, which consists of the output of CNNs trained on MNIST for different hyperparameter choices. Using ClassiFIM on this dataset suggests there is a phase transition in the distribution of image-prediction pairs for CNNs trained on MNIST, demonstrating the broad scope of FIM-estimation beyond physics., Comment: 23 pages, 5 figures
- Published
- 2024
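The core reduction in the abstract above — turning FIM-estimation into binary classification — can be sketched with a toy example: a classifier that distinguishes samples drawn at two nearby parameter values performs above chance exactly when the underlying distribution changes, i.e., when the FIM is nonzero. The Gaussian toy data and training details below are assumptions, not the ClassiFIM pipeline.

```python
import numpy as np

# Hedged sketch of the reduction (not the full ClassiFIM method): classify
# which of two nearby parameter values a sample came from; above-chance
# accuracy signals a nonzero FIM between those parameter values.
rng = np.random.default_rng(2)

def train_logreg(X, y, lr=0.5, steps=300):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)   # gradient step on logistic loss
    return w

# Toy samples at lambda (label 0) and lambda + dlam (label 1).
X0 = rng.normal(0.0, 1.0, size=(2000, 1))
X1 = rng.normal(0.6, 1.0, size=(2000, 1))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(2000), np.ones(2000)])

Xb = np.hstack([X, np.ones((4000, 1))])    # append a bias column
w = train_logreg(Xb, y)
acc = np.mean(((Xb @ w) > 0) == y)         # above 0.5 => distinguishable
```

Only classical (single-basis) measurement samples enter this construction, which matches the abstract's point that ClassiFIM needs no quantum-specific data.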
248. Classification of groups whose common divisor graph on $p$-regular classes has no triangles
- Author
-
Felipe, María José, Jean-Philippe, Marc Kelly, and Sotomayor, Víctor
- Subjects
Mathematics - Group Theory ,20E45, 20D20 - Abstract
Let $p$ be a prime. In this paper we classify the $p$-structure of those finite $p$-separable groups such that, given any three non-central conjugacy classes of $p$-regular elements, two of them necessarily have coprime lengths.
- Published
- 2024
249. Dimensionality Reduction and Nearest Neighbors for Improving Out-of-Distribution Detection in Medical Image Segmentation
- Author
-
Woodland, McKell, Patel, Nihil, Castelo, Austin, Taie, Mais Al, Eltaher, Mohamed, Yung, Joshua P., Netherton, Tucker J., Calderone, Tiffany L., Sanchez, Jessica I., Cleere, Darrel W., Elsaiey, Ahmed, Gupta, Nakul, Victor, David, Beretta, Laura, Patel, Ankit B., and Brock, Kristy K.
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
Clinically deployed deep learning-based segmentation models are known to fail on data outside of their training distributions. While clinicians review the segmentations, these models tend to perform well in most instances, which could exacerbate automation bias. Therefore, detecting out-of-distribution images at inference is critical to warn the clinicians that the model likely failed. This work applied the Mahalanobis distance (MD) post hoc to the bottleneck features of four Swin UNETR and nnU-net models that segmented the liver on T1-weighted magnetic resonance imaging and computed tomography. By reducing the dimensions of the bottleneck features with either principal component analysis or uniform manifold approximation and projection, images the models failed on were detected with high performance and minimal computational load. In addition, this work explored a non-parametric alternative to the MD, a k-th nearest neighbors distance (KNN). KNN drastically improved scalability and performance over MD when both were applied to raw and average-pooled bottleneck features., Comment: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2024:020. Expansion of "Dimensionality Reduction for Improving Out-of-Distribution Detection in Medical Image Segmentation" arXiv:2308.03723. Code available at https://github.com/mckellwoodland/dimen_reduce_mahal (https://zenodo.org/records/13881989)
- Published
- 2024
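The two post-hoc out-of-distribution scores discussed in the abstract above can be sketched on toy "bottleneck features" (as if already PCA-reduced). The dimensions, sample sizes, and Gaussian data are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

# Hedged sketch: Mahalanobis-distance (MD) and k-th nearest neighbor (KNN)
# OOD scores computed post hoc on toy feature vectors.

def mahalanobis_scores(train_feats, test_feats):
    mu = train_feats.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(train_feats, rowvar=False))
    d = test_feats - mu
    # Quadratic form d^T * cov_inv * d for each test feature vector.
    return np.sqrt(np.einsum('ij,jk,ik->i', d, cov_inv, d))

def knn_scores(train_feats, test_feats, k=5):
    # Non-parametric score: distance to the k-th nearest training feature.
    dists = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :],
                           axis=-1)
    return np.sort(dists, axis=1)[:, k - 1]

rng = np.random.default_rng(3)
train = rng.normal(0, 1, size=(500, 4))   # in-distribution training features
ind = rng.normal(0, 1, size=(50, 4))      # in-distribution test features
ood = rng.normal(4, 1, size=(50, 4))      # shifted, out-of-distribution

md_gap = mahalanobis_scores(train, ood).mean() - mahalanobis_scores(train, ind).mean()
knn_gap = knn_scores(train, ood).mean() - knn_scores(train, ind).mean()
```

Both gaps are positive on this toy data: OOD inputs score higher under either metric, and thresholding the score is what would trigger a warning to the clinician at inference time.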
250. On the Infinite-Nudging Limit of the Nudging Filter for Continuous Data Assimilation
- Author
-
Carlson, Elizabeth, Farhat, Aseel, Martinez, Vincent R., and Victor, Collin
- Subjects
Mathematics - Analysis of PDEs ,Mathematics - Optimization and Control ,35Q30, 35B30, 37L15, 76B75, 76D05, 93B52 - Abstract
This article studies the intimate relationship between two filtering algorithms for continuous data assimilation, the synchronization filter and the nudging filter, in the paradigmatic context of the two-dimensional (2D) Navier-Stokes equations (NSE) for incompressible fluids. In this setting, the nudging filter can formally be viewed as an affine perturbation of the 2D NSE. Thus, in the degenerate limit of zero nudging parameter, the nudging filter converges to the solution of the 2D NSE. However, when the nudging parameter of the nudging filter is large, the perturbation becomes singular. It is shown that in the singular limit of infinite nudging parameter, the nudging filter converges to the synchronization filter. In establishing this result, the article fills a notable gap in the literature surrounding these algorithms. Numerical experiments are then presented that confirm the theoretical results and probe the issue of selecting a nudging strategy in the presence of observational noise. In this direction, an adaptive nudging strategy is proposed that leverages the insight gained from the relationship between the synchronization and nudging filters and produces a measurable improvement over the constant nudging strategy., Comment: 23 pages, 8 figures
- Published
- 2024
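The nudging filter discussed in the abstract above can be written schematically as follows. This is the standard form of the nudging equations under common notation, not necessarily the article's exact formulation.

```latex
% Schematic nudging filter for the 2D NSE: u is the filter state, v the
% observed reference flow, I_h an interpolant of the observations at
% resolution h, and mu > 0 the nudging parameter.
\begin{align}
  \partial_t u + (u \cdot \nabla) u
    &= \nu \Delta u - \nabla p - \mu \bigl( I_h(u) - I_h(v) \bigr), \\
  \nabla \cdot u &= 0.
\end{align}
% As mu -> 0 the damping term vanishes and one recovers the 2D NSE; as
% mu -> infinity the term formally enforces I_h(u) = I_h(v), which is the
% constraint defining the synchronization filter.
```

The two limits annotated in the comments are exactly the degenerate and singular limits discussed in the abstract.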