Author: "Borja, A." - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Borja, A."' showing total 77,514 results

Start Over Author "Borja, A."

77,514 results on '"Borja, A."'

1. A Collaborative Content Moderation Framework for Toxicity Detection based on Conformalized Estimates of Annotation Disagreement

Author: Villate-Castillo, Guillermo, Del Ser, Javier, and Sanz, Borja
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, 68T50 (Primary) 68T37 (Secondary), I.2.7, I.2.1
Abstract: Content moderation typically combines the efforts of human moderators and machine learning models. However, these systems often rely on data where significant disagreement occurs during moderation, reflecting the subjective nature of toxicity perception. Rather than dismissing this disagreement as noise, we interpret it as a valuable signal that highlights the inherent ambiguity of the content,an insight missed when only the majority label is considered. In this work, we introduce a novel content moderation framework that emphasizes the importance of capturing annotation disagreement. Our approach uses multitask learning, where toxicity classification serves as the primary task and annotation disagreement is addressed as an auxiliary task. Additionally, we leverage uncertainty estimation techniques, specifically Conformal Prediction, to account for both the ambiguity in comment annotations and the model's inherent uncertainty in predicting toxicity and disagreement.The framework also allows moderators to adjust thresholds for annotation disagreement, offering flexibility in determining when ambiguity should trigger a review. We demonstrate that our joint approach enhances model performance, calibration, and uncertainty estimation, while offering greater parameter efficiency and improving the review process in comparison to single-task methods., Comment: 35 pages, 1 figure
Published: 2024

2. Towards Interoperability Testing of Smart Energy Systems -- An Overview and Discussion of Possibilities

Author: Strasser, Thomas I., Widl, Edmund, Kuchenbuch, René A., Lázaro-Elorriaga, Laura, Laraudogoitia, Borja Tellado, Ginocchi, Mirko, Penthong, Thanakorn, Ponci, Ferdinanda, Gyrard, Amelie, Kung, Antonio, Mac Gregor, Carlos A., Montero, Carmen Garcia, and Algaba, Eduardo Relano
Subjects: Computer Science - Software Engineering
Abstract: Interoperability is the key to implementing a wide range of energy systems applications. It involves the seamless cooperation of different methods and components. With smart energy systems, interoperability faces challenges due to integrating differ-ent approaches and technologies. This includes dealing with heterogeneous approaches with various communication proto-cols and data formats. However, it is essential for smart energy systems to carry out thorough interoperability tests. They are usually diverse, and challenging, thus requiring careful consideration of compatibility issues and complex integration scenari-os. Overcoming these challenges requires a systematic approach that includes thorough test planning, rigorous testing, and continuous test monitoring. Although numerous testing approaches exist, most are more developed at the component/device level than at the system level. Consequently, there are few approaches and related facilities to test the interoperability of smart energy approaches and solutions at the system level. This work analyses existing interoperability test concepts, identi-fies enablers and the potential for harmonisation of procedures, and proposes further developments of these approaches., Comment: 14th Mediterranean Conference on Power Generation Transmission, Distribution and Energy Conversion (MED POWER 2024)
Published: 2024

3. Quantum Large Language Models via Tensor Network Disentanglers

Author: Aizpurua, Borja, Jahromi, Saeed S., Singh, Sukhbinder, and Orus, Roman
Subjects: Quantum Physics, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We propose a method to enhance the performance of Large Language Models (LLMs) by integrating quantum computing and quantum-inspired techniques. Specifically, our approach involves replacing the weight matrices in the Self-Attention and Multi-layer Perceptron layers with a combination of two variational quantum circuits and a quantum-inspired tensor network, such as a Matrix Product Operator (MPO). This substitution enables the reproduction of classical LLM functionality by decomposing weight matrices through the application of tensor network disentanglers and MPOs, leveraging well-established tensor network techniques. By incorporating more complex and deeper quantum circuits, along with increasing the bond dimensions of the MPOs, our method captures additional correlations within the quantum-enhanced LLM, leading to improved accuracy beyond classical models while maintaining low memory overhead., Comment: 4 pages, 2 figures
Published: 2024

4. Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality

Author: Bongole, Raghav, Gouverneur, Amaury, Rodríguez-Gálvez, Borja, Oechtering, Tobias J., and Skoglund, Mikael
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory
Abstract: We study agents acting in an unknown environment where the agent's goal is to find a robust policy. We consider robust policies as policies that achieve high cumulative rewards for all possible environments. To this end, we consider agents minimizing the maximum regret over different environment parameters, leading to the study of minimax regret. This research focuses on deriving information-theoretic bounds for minimax regret in Markov Decision Processes (MDPs) with a finite time horizon. Building on concepts from supervised learning, such as minimum excess risk (MER) and minimax excess risk, we use recent bounds on the Bayesian regret to derive minimax regret bounds. Specifically, we establish minimax theorems and use bounds on the Bayesian regret to perform minimax regret analysis using these minimax theorems. Our contributions include defining a suitable minimax regret in the context of MDPs, finding information-theoretic bounds for it, and applying these bounds in various scenarios.
Published: 2024

5. The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD

Author: Steinke, Thomas, Nasr, Milad, Ganesh, Arun, Balle, Borja, Choquette-Choo, Christopher A., Jagielski, Matthew, Hayes, Jamie, Thakurta, Abhradeep Guha, Smith, Adam, and Terzis, Andreas
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: We propose a simple heuristic privacy analysis of noisy clipped stochastic gradient descent (DP-SGD) in the setting where only the last iterate is released and the intermediate iterates remain hidden. Namely, our heuristic assumes a linear structure for the model. We show experimentally that our heuristic is predictive of the outcome of privacy auditing applied to various training procedures. Thus it can be used prior to training as a rough estimate of the final privacy leakage. We also probe the limitations of our heuristic by providing some artificial counterexamples where it underestimates the privacy leakage. The standard composition-based privacy analysis of DP-SGD effectively assumes that the adversary has access to all intermediate iterates, which is often unrealistic. However, this analysis remains the state of the art in practice. While our heuristic does not replace a rigorous privacy analysis, it illustrates the large gap between the best theoretical upper bounds and the privacy auditing lower bounds and sets a target for further work to improve the theoretical privacy analyses. We also empirically support our heuristic and show existing privacy auditing attacks are bounded by our heuristic analysis in both vision and language tasks.
Published: 2024

6. Real-time Ship Recognition and Georeferencing for the Improvement of Maritime Situational Awareness

Author: Perez, Borja Carrillo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: In an era where maritime infrastructures are crucial, advanced situational awareness solutions are increasingly important. The use of optical camera systems can allow real-time usage of maritime footage. This thesis presents an investigation into leveraging deep learning and computer vision to advance real-time ship recognition and georeferencing for the improvement of maritime situational awareness. A novel dataset, ShipSG, is introduced, containing 3,505 images and 11,625 ship masks with corresponding class and geographic position. After an exploration of state-of-the-art, a custom real-time segmentation architecture, ScatYOLOv8+CBAM, is designed for the NVIDIA Jetson AGX Xavier embedded system. This architecture adds the 2D scattering transform and attention mechanisms to YOLOv8, achieving an mAP of 75.46% and an 25.3 ms per frame, outperforming state-of-the-art methods by over 5%. To improve small and distant ship recognition in high-resolution images on embedded systems, an enhanced slicing mechanism is introduced, improving mAP by 8% to 11%. Additionally, a georeferencing method is proposed, achieving positioning errors of 18 m for ships up to 400 m away and 44 m for ships between 400 m and 1200 m. The findings are also applied in real-world scenarios, such as the detection of abnormal ship behaviour, camera integrity assessment and 3D reconstruction. The approach of this thesis outperforms existing methods and provides a framework for integrating recognized and georeferenced ships into real-time systems, enhancing operational effectiveness and decision-making for maritime stakeholders. This thesis contributes to the maritime computer vision field by establishing a benchmark for ship segmentation and georeferencing research, demonstrating the viability of deep-learning-based recognition and georeferencing methods for real-time maritime monitoring.
Published: 2024
Full Text: View/download PDF

7. DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction

Author: Zhang, Xinwei, Bu, Zhiqi, Balle, Borja, Hong, Mingyi, Razaviyayn, Meisam, and Mirrokni, Vahab
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Statistics - Machine Learning
Abstract: Differential privacy (DP) offers a robust framework for safeguarding individual data privacy. To utilize DP in training modern machine learning models, differentially private optimizers have been widely used in recent years. A popular approach to privatize an optimizer is to clip the individual gradients and add sufficiently large noise to the clipped gradient. This approach led to the development of DP optimizers that have comparable performance with their non-private counterparts in fine-tuning tasks or in tasks with a small number of training parameters. However, a significant performance drop is observed when these optimizers are applied to large-scale training. This degradation stems from the substantial noise injection required to maintain DP, which disrupts the optimizer's dynamics. This paper introduces DiSK, a novel framework designed to significantly enhance the performance of DP optimizers. DiSK employs Kalman filtering, a technique drawn from control and signal processing, to effectively denoise privatized gradients and generate progressively refined gradient estimations. To ensure practicality for large-scale training, we simplify the Kalman filtering process, minimizing its memory and computational demands. We establish theoretical privacy-utility trade-off guarantees for DiSK, and demonstrate provable improvements over standard DP optimizers like DPSGD in terms of iteration complexity upper-bound. Extensive experiments across diverse tasks, including vision tasks such as CIFAR-100 and ImageNet-1k and language fine-tuning tasks such as GLUE, E2E, and DART, validate the effectiveness of DiSK. The results showcase its ability to significantly improve the performance of DP optimizers, surpassing state-of-the-art results under the same privacy constraints on several benchmarks.
Published: 2024

8. Can a Neural Model Guide Fieldwork? A Case Study on Morphological Inflection

Author: Mahmudi, Aso, Herce, Borja, Amestica, Demian Inostroza, Scherbakov, Andreas, Hovy, Eduard, and Vylomova, Ekaterina
Subjects: Computer Science - Computation and Language
Abstract: Linguistic fieldwork is an important component in language documentation and preservation. However, it is a long, exhaustive, and time-consuming process. This paper presents a novel model that guides a linguist during the fieldwork and accounts for the dynamics of linguist-speaker interactions. We introduce a novel framework that evaluates the efficiency of various sampling strategies for obtaining morphological data and assesses the effectiveness of state-of-the-art neural models in generalising morphological structures. Our experiments highlight two key strategies for improving the efficiency: (1) increasing the diversity of annotated data by uniform sampling among the cells of the paradigm tables, and (2) using model confidence as a guide to enhance positive interaction by providing reliable predictions during annotation.
Published: 2024

9. CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

Author: Cheng, Zhao, Wan, Diane, Abueg, Matthew, Ghalebikesabi, Sahra, Yi, Ren, Bagdasarian, Eugene, Balle, Borja, Mellem, Stefan, and O'Banion, Shawn
Subjects: Computer Science - Artificial Intelligence
Abstract: Advances in generative AI point towards a new era of personalized applications that perform diverse tasks on behalf of users. While general AI assistants have yet to fully emerge, their potential to share personal data raises significant privacy challenges. This paper introduces CI-Bench, a comprehensive synthetic benchmark for evaluating the ability of AI assistants to protect personal information during model inference. Leveraging the Contextual Integrity framework, our benchmark enables systematic assessment of information flow across important context dimensions, including roles, information types, and transmission principles. We present a novel, scalable, multi-step synthetic data pipeline for generating natural communications, including dialogues and emails. Unlike previous work with smaller, narrowly focused evaluations, we present a novel, scalable, multi-step data pipeline that synthetically generates natural communications, including dialogues and emails, which we use to generate 44 thousand test samples across eight domains. Additionally, we formulate and evaluate a naive AI assistant to demonstrate the need for further study and careful training towards personal assistant tasks. We envision CI-Bench as a valuable tool for guiding future language model development, deployment, system design, and dataset construction, ultimately contributing to the development of AI assistants that align with users' privacy expectations.
Published: 2024

10. Decolonising Data Systems: Using Jyutping or Pinyin as tonal representations of Chinese names for data linkage

Author: Lam, Joseph, Cortina-Borja, Mario, Aldridge, Robert, Blackburn, Ruth, and Harron, Katie
Subjects: Computer Science - Computation and Language
Abstract: Data linkage is increasingly used in health research and policy making and is relied on for understanding health inequalities. However, linked data is only as useful as the underlying data quality, and differential linkage rates may induce selection bias in the linked data. A mechanism that selectively compromises data quality is name romanisation. Converting text of a different writing system into Latin based writing, or romanisation, has long been the standard process of representing names in character-based writing systems such as Chinese, Vietnamese, and other languages such as Swahili. Unstandardised romanisation of Chinese characters, due in part to problems of preserving the correct name orders the lack of proper phonetic representation of a tonal language, has resulted in poor linkage rates for Chinese immigrants. This opinion piece aims to suggests that the use of standardised romanisation systems for Cantonese (Jyutping) or Mandarin (Pinyin) Chinese, which incorporate tonal information, could improve linkage rates and accuracy for individuals with Chinese names. We used 771 Chinese and English names scraped from openly available sources, and compared the utility of Jyutping, Pinyin and the Hong Kong Government Romanisation system (HKG-romanisation) for representing Chinese names. We demonstrate that both Jyutping and Pinyin result in fewer errors compared with the HKG-romanisation system. We suggest that collecting and preserving people's names in their original writing systems is ethically and socially pertinent. This may inform development of language-specific pre-processing and linkage paradigms that result in more inclusive research data which better represents the targeted populations.
Published: 2024

11. Hacking Cryptographic Protocols with Tensor Network Attacks

Author: Aizpurua, Borja, Patra, Siddhartha, Martinez, Josu Etxezarreta, and Orus, Roman
Subjects: Quantum Physics
Abstract: Here we introduce the application of Tensor Networks (TN) to launch attacks on symmetric-key cryptography. Our approaches make use of Matrix Product States (MPS) as well as our recently-introduced Flexible-PEPS Quantum Circuit Simulator (FQCS). We compare these approaches with traditional brute-force attacks and Variational Quantum Attack Algorithm (VQAA) methods also proposed by us. Our benchmarks include the Simplified Data Encryption Standard (S-DES) with 10-bit keys, Simplified Advanced Encryption Standard (S-AES) with 16-bit keys, and Blowfish with 32-bit keys. We find that for small key size, MPS outperforms VQAA and FQCS in both time and average iterations required to recover the key. As key size increases, FQCS becomes more efficient in terms of average iterations compared to VQAA and MPS, while MPS remains the fastest in terms of time. These results highlight the potential of TN methods in advancing quantum cryptanalysis, particularly in optimizing both speed and efficiency. Our results also show that entanglement becomes crucial as key size increases., Comment: 10 pages, 8 figures
Published: 2024

12. Automatic occlusion removal from 3D maps for maritime situational awareness

Author: Sattler, Felix, Perez, Borja Carrillo, Stephan, Maurice, and Barnes, Sarah
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We introduce a novel method for updating 3D geospatial models, specifically targeting occlusion removal in large-scale maritime environments. Traditional 3D reconstruction techniques often face problems with dynamic objects, like cars or vessels, that obscure the true environment, leading to inaccurate models or requiring extensive manual editing. Our approach leverages deep learning techniques, including instance segmentation and generative inpainting, to directly modify both the texture and geometry of 3D meshes without the need for costly reprocessing. By selectively targeting occluding objects and preserving static elements, the method enhances both geometric and visual accuracy. This approach not only preserves structural and textural details of map data but also maintains compatibility with current geospatial standards, ensuring robust performance across diverse datasets. The results demonstrate significant improvements in 3D model fidelity, making this method highly applicable for maritime situational awareness and the dynamic display of auxiliary information., Comment: Preprint of SPIE Sensor + Imaging 2024 conference paper
Published: 2024

13. On Using Curved Mirrors to Decrease Shadowing in VLC

Author: Guzman, Borja Genoves, Armada, Ana Garcia, and Brandt-Pearce, Maïté
Subjects: Electrical Engineering and Systems Science - Signal Processing, Electrical Engineering and Systems Science - Systems and Control
Abstract: Visible light communication (VLC) complements radio frequency in indoor environments with large wireless data traffic. However, VLC is hindered by dramatic path losses when an opaque object is interposed between the transmitter and the receiver. Prior works propose the use of plane mirrors as optical reconfigurable intelligent surfaces (ORISs) to enhance communications through non-line-of-sight links. Plane mirrors rely on their orientation to forward the light to the target user location, which is challenging to implement in practice. This paper studies the potential of curved mirrors as static reflective surfaces to provide a broadening specular reflection that increases the signal coverage in mirror-assisted VLC scenarios. We study the behavior of paraboloid and semi-spherical mirrors and derive the irradiance equations. We provide extensive numerical and analytical results and show that curved mirrors, when developed with proper dimensions, may reduce the shadowing probability to zero, while static plane mirrors of the same size have shadowing probabilities larger than 65%. Furthermore, the signal-to-noise ratio offered by curved mirrors may suffice to provide connectivity to users deployed in the room even when a line-of-sight link blockage occurs., Comment: Accepted to be published in IEEE Globecom 2024
Published: 2024

14. Physics-informed DeepONet with stiffness-based loss functions for structural response prediction

Author: Ahmed, Bilal, Qiu, Yuqing, Abueidda, Diab W., El-Sekelly, Waleed, de Soto, Borja Garcia, Abdoun, Tarek, and Mobasher, Mostafa E.
Subjects: Computer Science - Machine Learning, Computer Science - Computational Engineering, Finance, and Science
Abstract: Finite element modeling is a well-established tool for structural analysis, yet modeling complex structures often requires extensive pre-processing, significant analysis effort, and considerable time. This study addresses this challenge by introducing an innovative method for real-time prediction of structural static responses using DeepOnet which relies on a novel approach to physics-informed networks driven by structural balance laws. This approach offers the flexibility to accurately predict responses under various load classes and magnitudes. The trained DeepONet can generate solutions for the entire domain, within a fraction of a second. This capability effectively eliminates the need for extensive remodeling and analysis typically required for each new case in FE modeling. We apply the proposed method to two structures: a simple 2D beam structure and a comprehensive 3D model of a real bridge. To predict multiple variables with DeepONet, we utilize two strategies: a split branch/trunk and multiple DeepONets combined into a single DeepONet. In addition to data-driven training, we introduce a novel physics-informed training approaches. This method leverages structural stiffness matrices to enforce fundamental equilibrium and energy conservation principles, resulting in two novel physics-informed loss functions: energy conservation and static equilibrium using the Schur complement. We use various combinations of loss functions to achieve an error rate of less than 5% with significantly reduced training time. This study shows that DeepONet, enhanced with hybrid loss functions, can accurately and efficiently predict displacements and rotations at each mesh point, with reduced training time.
Published: 2024

15. Enhancing Student's Conceptual Understanding on the Patterns of Mendelian Genetics through Task-Based Learning

Author: Emmylou Aspacio Borja and Romel Cayao Mutya
Abstract: Mendelian genetics are essential for students seeking to comprehend the complexities of inheritance; although fundamental, these biology concepts are difficult for students to understand. This study examined the effectiveness of task-based learning (TBL) in enhancing the students' conceptual understanding of the Patterns of Mendelian Genetics. A pretest and posttest quasi-experimental research design involved an experimental and a control group. An Intrinsic Motivation Inventory questionnaire was utilized to assess the level of intrinsic motivation for task evaluation for the experimental group. Paired t-test was used to compare the pretest and post-test results. Before the intervention, both groups had a low conceptual understanding of the topic. At the end of the intervention, both groups had significantly increased their performances from pretest to posttest scores. The study revealed that TBL is more effective than Traditional Lecture-Based Instruction (TLI), as seen in their enhanced student performance, implying the effectiveness of the TBL as an innovative instructional approach. Participants from the experimental group expressed enjoyment, competence, and ownership of their task activities, and they did not feel nervous and anxious about doing the tasks. Pearson r-correlation was used to establish a relationship among the variables. Perceived choice, pressure/tension, and student performance in the experimental group have low positive correlations, and perceived choice, interest/enjoyment, and performance have a negligible correlation. This approach is highly commendable for biology instruction. By illuminating the effectiveness of active learning in improving student's conceptual understanding, this study bridges the gap between theoretical-practical gap in genetics via active learning effectiveness.
Published: 2024

16. Celtibero: Robust Layered Aggregation for Federated Learning

Author: Molina-Coronado, Borja
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Federated Learning (FL) is an innovative approach to distributed machine learning. While FL offers significant privacy advantages, it also faces security challenges, particularly from poisoning attacks where adversaries deliberately manipulate local model updates to degrade model performance or introduce hidden backdoors. Existing defenses against these attacks have been shown to be effective when the data on the nodes is identically and independently distributed (i.i.d.), but they often fail under less restrictive, non-i.i.d data conditions. To overcome these limitations, we introduce Celtibero, a novel defense mechanism that integrates layered aggregation to enhance robustness against adversarial manipulation. Through extensive experiments on the MNIST and IMDB datasets, we demonstrate that Celtibero consistently achieves high main task accuracy (MTA) while maintaining minimal attack success rates (ASR) across a range of untargeted and targeted poisoning attacks. Our results highlight the superiority of Celtibero over existing defenses such as FL-Defender, LFighter, and FLAME, establishing it as a highly effective solution for securing federated learning systems against sophisticated poisoning attacks.
Published: 2024

17. On Information Theoretic Fairness: Compressed Representations With Perfect Demographic Parity

Author: Zamani, Amirreza, Rodríguez-Gálvez, Borja, and Skoglund, Mikael
Subjects: Computer Science - Information Theory
Abstract: In this article, we study the fundamental limits in the design of fair and/or private representations achieving perfect demographic parity and/or perfect privacy through the lens of information theory. More precisely, given some useful data $X$ that we wish to employ to solve a task $T$, we consider the design of a representation $Y$ that has no information of some sensitive attribute or secret $S$, that is, such that $I(Y;S) = 0$. We consider two scenarios. First, we consider a design desiderata where we want to maximize the information $I(Y;T)$ that the representation contains about the task, while constraining the level of compression (or encoding rate), that is, ensuring that $I(Y;X) \leq r$. Second, inspired by the Conditional Fairness Bottleneck problem, we consider a design desiderata where we want to maximize the information $I(Y;T|S)$ that the representation contains about the task which is not shared by the sensitive attribute or secret, while constraining the amount of irrelevant information, that is, ensuring that $I(Y;X|T,S) \leq r$. In both cases, we employ extended versions of the Functional Representation Lemma and the Strong Functional Representation Lemma and study the tightness of the obtained bounds. Every result here can also be interpreted as a coding with perfect privacy problem by considering the sensitive attribute as a secret.
Published: 2024

18. An Information-Theoretic Approach to Generalization Theory

Author: Rodríguez-Gálvez, Borja, Thobaben, Ragnar, and Skoglund, Mikael
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We investigate the in-distribution generalization of machine learning algorithms. We depart from traditional complexity-based approaches by analyzing information-theoretic bounds that quantify the dependence between a learning algorithm and the training data. We consider two categories of generalization guarantees: 1) Guarantees in expectation: These bounds measure performance in the average case. Here, the dependence between the algorithm and the data is often captured by information measures. While these measures offer an intuitive interpretation, they overlook the geometry of the algorithm's hypothesis class. Here, we introduce bounds using the Wasserstein distance to incorporate geometry, and a structured, systematic method to derive bounds capturing the dependence between the algorithm and an individual datum, and between the algorithm and subsets of the training data. 2) PAC-Bayesian guarantees: These bounds measure the performance level with high probability. Here, the dependence between the algorithm and the data is often measured by the relative entropy. We establish connections between the Seeger--Langford and Catoni's bounds, revealing that the former is optimized by the Gibbs posterior. We introduce novel, tighter bounds for various types of loss functions. To achieve this, we introduce a new technique to optimize parameters in probabilistic statements. To study the limitations of these approaches, we present a counter-example where most of the information-theoretic bounds fail while traditional approaches do not. Finally, we explore the relationship between privacy and generalization. We show that algorithms with a bounded maximal leakage generalize. For discrete data, we derive new bounds for differentially private algorithms that guarantee generalization even with a constant privacy parameter, which is in contrast to previous bounds in the literature.
Published: 2024

19. Damage identification for bridges using machine learning: Development and application to KW51 bridge

Author: Qiu, Yuqing, Ahmed, Bilal, Abueidda, Diab W., El-Sekelly, Waleed, de Soto, Borja Garcia, Abdoun, Tarek, Ji, Hongli, Qiu, Jinhao, and Mobasher, Mostafa E.
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: The available tools for damage identification in civil engineering structures are known to be computationally expensive and data-demanding. This paper proposes a comprehensive machine learning based damage identification (CMLDI) method that integrates modal analysis and dynamic analysis strategies. The proposed approach is applied to a real structure - KW51 railway bridge in Leuven. CMLDI diligently combines signal processing, machine learning (ML), and structural analysis techniques to achieve a fast damage identification solver that relies on minimal monitoring data. CMLDI considers modal analysis inputs and extracted features from acceleration responses to inform the damage identification based on the long-term and short-term monitoring data. Results of operational modal analysis, through the analysis of long-term monitoring data, are analyzed using pre-trained k-nearest neighbor (kNN) classifiers to identify damage existence, location, and magnitude. A well-crafted assembly of signal processing and ML methods is used to analyze acceleration time histories. Stacked gated recurrent unit (Stacked GRU) networks are used to identify damage existence, kNN classifiers are used to identify damage magnitude, and convolutions neural networks (CNN) are used to identify damage location. The damage identification results for the KW51 bridge demonstrate this approach's high accuracy, efficiency, and robustness. In this work, the training data is retrieved from the sensor of the KW51 bridge as well as the numerical finite element model (FEM). The proposed approach presents a systematic path to the generation of training data using a validated FEM. The data generation relies on modeling combinations of damage locations and magnitudes along the bridge.
Published: 2024

20. Operationalizing Contextual Integrity in Privacy-Conscious Assistants

Author: Ghalebikesabi, Sahra, Bagdasaryan, Eugene, Yi, Ren, Yona, Itay, Shumailov, Ilia, Pappu, Aneesh, Shi, Chongyang, Weidinger, Laura, Stanforth, Robert, Berrada, Leonard, Kohli, Pushmeet, Huang, Po-Sen, and Balle, Borja
Subjects: Computer Science - Artificial Intelligence
Abstract: Advanced AI assistants combine frontier LLMs and tool access to autonomously perform complex tasks on behalf of users. While the helpfulness of such assistants can increase dramatically with access to user information including emails and documents, this raises privacy concerns about assistants sharing inappropriate information with third parties without user supervision. To steer information-sharing assistants to behave in accordance with privacy expectations, we propose to operationalize contextual integrity (CI), a framework that equates privacy with the appropriate flow of information in a given context. In particular, we design and evaluate a number of strategies to steer assistants' information-sharing actions to be CI compliant. Our evaluation is based on a novel form filling benchmark composed of human annotations of common webform applications, and it reveals that prompting frontier LLMs to perform CI-based reasoning yields strong results.
Published: 2024

21. The Bicameral Cache: a split cache for vector architectures

Author: Rebolledo, Susana, Perez, Borja, Bosque, Jose Luis, and Hsu, Peter
Subjects: Computer Science - Hardware Architecture, Computer Science - Performance
Abstract: The Bicameral Cache is a cache organization proposal for a vector architecture that segregates data according to their access type, distinguishing scalar from vector references. Its aim is to avoid both types of references from interfering in each other's data locality, with a special focus on prioritizing the performance on vector references. The proposed system incorporates an additional, non-polluting prefetching mechanism to help populate the long vector cache lines in advance to increase the hit rate by further exploiting the spatial locality on vector data. Its evaluation was conducted on the Cavatools simulator, comparing the performance to a standard conventional cache, over different typical vector benchmarks for several vector lengths. The results proved the proposed cache speeds up performance on stride-1 vector benchmarks, while hardly impacting non-stride-1's. In addition, the prefetching feature consistently provided an additional value., Comment: 9 pages, 5 figures
Published: 2024

22. Beyond Generative Artificial Intelligence: Roadmap for Natural Language Generation

Author: Maestre, María Miró, Martínez-Murillo, Iván, Martin, Tania J., Navarro-Colorado, Borja, Ferrández, Antonio, Cueto, Armando Suárez, and Lloret, Elena
Subjects: Computer Science - Computation and Language
Abstract: Generative Artificial Intelligence has grown exponentially as a result of Large Language Models (LLMs). This has been possible because of the impressive performance of deep learning methods created within the field of Natural Language Processing (NLP) and its subfield Natural Language Generation (NLG), which is the focus of this paper. Within the growing LLM family are the popular GPT-4, Bard and more specifically, tools such as ChatGPT have become a benchmark for other LLMs when solving most of the tasks involved in NLG research. This scenario poses new questions about the next steps for NLG and how the field can adapt and evolve to deal with new challenges in the era of LLMs. To address this, the present paper conducts a review of a representative sample of surveys recently published in NLG. By doing so, we aim to provide the scientific community with a research roadmap to identify which NLG aspects are still not suitably addressed by LLMs, as well as suggest future lines of research that should be addressed going forward.
Published: 2024

23. BraTS-PEDs: Results of the Multi-Consortium International Pediatric Brain Tumor Segmentation Challenge 2023

Author: Kazerooni, Anahita Fathi, Khalili, Nastaran, Liu, Xinyang, Haldar, Debanjan, Jiang, Zhifan, Zapaishchykova, Anna, Pavaine, Julija, Shah, Lubdha M., Jones, Blaise V., Sheth, Nakul, Prabhu, Sanjay P., McAllister, Aaron S., Tu, Wenxin, Nandolia, Khanak K., Rodriguez, Andres F., Shaikh, Ibraheem Salman, Montano, Mariana Sanchez, Lai, Hollie Anne, Adewole, Maruf, Albrecht, Jake, Anazodo, Udunna, Anderson, Hannah, Anwar, Syed Muhammed, Aristizabal, Alejandro, Bagheri, Sina, Baid, Ujjwal, Bergquist, Timothy, Borja, Austin J., Calabrese, Evan, Chung, Verena, Conte, Gian-Marco, Eddy, James, Ezhov, Ivan, Familiar, Ariana M., Farahani, Keyvan, Gandhi, Deep, Gottipati, Anurag, Haldar, Shuvanjan, Iglesias, Juan Eugenio, Janas, Anastasia, Elaine, Elaine, Karargyris, Alexandros, Kassem, Hasan, Khalili, Neda, Kofler, Florian, LaBella, Dominic, Van Leemput, Koen, Li, Hongwei B., Maleki, Nazanin, Meier, Zeke, Menze, Bjoern, Moawad, Ahmed W., Pati, Sarthak, Piraud, Marie, Poussaint, Tina, Reitman, Zachary J., Rudie, Jeffrey D., Saluja, Rachit, Sheller, MIcah, Shinohara, Russell Takeshi, Viswanathan, Karthik, Wang, Chunhao, Wiestler, Benedikt, Wiggins, Walter F., Davatzikos, Christos, Storm, Phillip B., Bornhorst, Miriam, Packer, Roger, Hummel, Trent, de Blank, Peter, Hoffman, Lindsey, Aboian, Mariam, Nabavizadeh, Ali, Ware, Jeffrey B., Kann, Benjamin H., Rood, Brian, Resnick, Adam, Bakas, Spyridon, Vossough, Arastoo, and Linguraru, Marius George
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Pediatric central nervous system tumors are the leading cause of cancer-related deaths in children. The five-year survival rate for high-grade glioma in children is less than 20%. The development of new treatments is dependent upon multi-institutional collaborative clinical trials requiring reproducible and accurate centralized response assessment. We present the results of the BraTS-PEDs 2023 challenge, the first Brain Tumor Segmentation (BraTS) challenge focused on pediatric brain tumors. This challenge utilized data acquired from multiple international consortia dedicated to pediatric neuro-oncology and clinical trials. BraTS-PEDs 2023 aimed to evaluate volumetric segmentation algorithms for pediatric brain gliomas from magnetic resonance imaging using standardized quantitative performance evaluation metrics employed across the BraTS 2023 challenges. The top-performing AI approaches for pediatric tumor analysis included ensembles of nnU-Net and Swin UNETR, Auto3DSeg, or nnU-Net with a self-supervised framework. The BraTSPEDs 2023 challenge fostered collaboration between clinicians (neuro-oncologists, neuroradiologists) and AI/imaging scientists, promoting faster data sharing and the development of automated volumetric analysis techniques. These advancements could significantly benefit clinical trials and improve the care of children with brain tumors.
Published: 2024

24. Energy-Based Control Approaches for Weakly Coupled Electromechanical Systems

Author: Javanmardi, N., Borja, P., Yazdanpanah, M. J., and Scherpen, J. M. A.
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: This paper addresses the regulation and trajectory-tracking problems for two classes of weakly coupled electromechanical systems. To this end, we formulate an energy-based model for these systems within the port-Hamiltonian framework. Then, we employ Lyapunov theory and the notion of contractive systems to develop control approaches in the port-Hamiltonian framework. Remarkably, these control methods eliminate the need for solving partial differential equations or implementing any change of coordinates and are endowed with a physical interpretation. We also investigate the effect of coupled damping on the transient performance and convergence rate of the closed-loop system. Finally, the applicability of the proposed approaches is illustrated in two applications of electromechanical systems via simulations.
Published: 2024

25. A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry

Author: Lindström, Martin, Rodríguez-Gálvez, Borja, Thobaben, Ragnar, and Skoglund, Mikael
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Signal Processing, Statistics - Machine Learning
Abstract: Hyperspherical Prototypical Learning (HPL) is a supervised approach to representation learning that designs class prototypes on the unit hypersphere. The prototypes bias the representations to class separation in a scale invariant and known geometry. Previous approaches to HPL have either of the following shortcomings: (i) they follow an unprincipled optimisation procedure; or (ii) they are theoretically sound, but are constrained to only one possible latent dimension. In this paper, we address both shortcomings. To address (i), we present a principled optimisation procedure whose solution we show is optimal. To address (ii), we construct well-separated prototypes in a wide range of dimensions using linear block codes. Additionally, we give a full characterisation of the optimal prototype placement in terms of achievable and converse bounds, showing that our proposed methods are near-optimal., Comment: 14 pages: 9 of the main paper, 2 of references, and 3 of appendices. To appear in the Proceedings of the Geometry-grounded Representation Learning and Generative Modeling at the 41st International Conference on Machine Learning, Vienna, Austria. Code is available at: https://github.com/martinlindstrom/coding_theoretic_hpl
Published: 2024

26. Quantum Technology masters: A shortcut to the quantum industry?

Author: Goorney, Simon, Munoz, Borja, and Sherson, Jacob
Subjects: Physics - Physics Education, Quantum Physics
Abstract: In this article, we investigate a growing trend in the worldwide Quantum Technology (QT) education landscape, that of the development of masters programs, intended to provide graduates with the knowledge and skills to take a job in the quantum industry, while serving a much shorter timeline than a doctoral degree. Through a global survey, we identified 86 masters programs, with substantial growth since 2021. Over time masters have become increasingly interdisciplinary, organised by multiple faculties or through joint degree programs, and offer more hands-on experiences such as internships in companies. Information from program organisers and websites suggests that the intended career destinations of their graduates are a diverse range of industries, and therefore masters programs may be a boon to the industrialisation of quantum technologies. Finally, we identify a range of national efforts to grow the quantum workforce of many countries, quantum program enhancements, which augment the content of existing study programs with quantum content. This may further contribute to the growth and viability of masters programs as a route to the quantum industry.
Published: 2024

27. Strong Charge-Photon Coupling in Planar Germanium Enabled by Granular Aluminium Superinductors

Author: Janík, Marián, Roux, Kevin, Espinosa, Carla Borja, Sagi, Oliver, Baghdadi, Abdulhamid, Adletzberger, Thomas, Calcaterra, Stefano, Botifoll, Marc, Manjón, Alba Garzón, Arbiol, Jordi, Chrastina, Daniel, Isella, Giovanni, Pop, Ioan M., and Katsaros, Georgios
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Quantum Physics
Abstract: High kinetic inductance superconductors are gaining increasing interest for the realisation of qubits, amplifiers and detectors. Moreover, thanks to their high impedance, quantum buses made of such materials enable large zero-point fluctuations of the voltage, boosting the coupling rates to spin and charge qubits. However, fully exploiting the potential of disordered or granular superconductors is challenging, as their inductance and, therefore, impedance at high values are difficult to control. Here we have integrated a granular aluminium resonator, having a characteristic impedance exceeding the resistance quantum, with a germanium double quantum dot and demonstrate strong charge-photon coupling with a rate of $g_\text{c}/2\pi= (566 \pm 2)$ MHz. This was achieved due to the realisation of a wireless ohmmeter, which allows \emph{in situ} measurements during film deposition and, therefore, control of the kinetic inductance of granular aluminium films. Reproducible fabrication of circuits with impedances (inductances) exceeding 13 k$\Omega$ (1 nH per square) is now possible. This broadly applicable method opens the path for novel qubits and high-fidelity, long-distance two-qubit gates.
Published: 2024

28. Artificial Inductive Bias for Synthetic Tabular Data Generation in Data-Scarce Scenarios

Author: Apellániz, Patricia A., Jiménez, Ana, Galende, Borja Arroyo, Parras, Juan, and Zazo, Santiago
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, I.2.0
Abstract: While synthetic tabular data generation using Deep Generative Models (DGMs) offers a compelling solution to data scarcity and privacy concerns, their effectiveness relies on substantial training data, often unavailable in real-world applications. This paper addresses this challenge by proposing a novel methodology for generating realistic and reliable synthetic tabular data with DGMs in limited real-data environments. Our approach proposes several ways to generate an artificial inductive bias in a DGM through transfer learning and meta-learning techniques. We explore and compare four different methods within this framework, demonstrating that transfer learning strategies like pre-training and model averaging outperform meta-learning approaches, like Model-Agnostic Meta-Learning, and Domain Randomized Search. We validate our approach using two state-of-the-art DGMs, namely, a Variational Autoencoder and a Generative Adversarial Network, to show that our artificial inductive bias fuels superior synthetic data quality, as measured by Jensen-Shannon divergence, achieving relative gains of up to 50\% when using our proposed approach. This methodology has broad applicability in various DGMs and machine learning tasks, particularly in areas like healthcare and finance, where data scarcity is often a critical issue., Comment: 19 pages, 6 Figures
Published: 2024

29. Transformers meet Neural Algorithmic Reasoners

Author: Bounsi, Wilfried, Ibarz, Borja, Dudzik, Andrew, Hamrick, Jessica B., Markeeva, Larisa, Vitvitskyi, Alex, Pascanu, Razvan, and Veličković, Petar
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Transformers have revolutionized machine learning with their simple yet effective architecture. Pre-training Transformers on massive text datasets from the Internet has led to unmatched generalization for natural language understanding (NLU) tasks. However, such language models remain fragile when tasked with algorithmic forms of reasoning, where computations must be precise and robust. To address this limitation, we propose a novel approach that combines the Transformer's language understanding with the robustness of graph neural network (GNN)-based neural algorithmic reasoners (NARs). Such NARs proved effective as generic solvers for algorithmic tasks, when specified in graph form. To make their embeddings accessible to a Transformer, we propose a hybrid architecture with a two-phase training procedure, allowing the tokens in the language model to cross-attend to the node embeddings from the NAR. We evaluate our resulting TransNAR model on CLRS-Text, the text-based version of the CLRS-30 benchmark, and demonstrate significant gains over Transformer-only models for algorithmic reasoning, both in and out of distribution., Comment: To appear at CVPR 2024 Multimodal Algorithmic Reasoning (MAR) Workshop. 10 pages, 5 figures
Published: 2024

30. Beyond the Calibration Point: Mechanism Comparison in Differential Privacy

Author: Kaissis, Georgios, Kolek, Stefan, Balle, Borja, Hayes, Jamie, and Rueckert, Daniel
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: In differentially private (DP) machine learning, the privacy guarantees of DP mechanisms are often reported and compared on the basis of a single $(\varepsilon, \delta)$-pair. This practice overlooks that DP guarantees can vary substantially even between mechanisms sharing a given $(\varepsilon, \delta)$, and potentially introduces privacy vulnerabilities which can remain undetected. This motivates the need for robust, rigorous methods for comparing DP guarantees in such cases. Here, we introduce the $\Delta$-divergence between mechanisms which quantifies the worst-case excess privacy vulnerability of choosing one mechanism over another in terms of $(\varepsilon, \delta)$, $f$-DP and in terms of a newly presented Bayesian interpretation. Moreover, as a generalisation of the Blackwell theorem, it is endowed with strong decision-theoretic foundations. Through application examples, we show that our techniques can facilitate informed decision-making and reveal gaps in the current understanding of privacy risks, as current practices in DP-SGD often result in choosing mechanisms with high excess privacy vulnerabilities., Comment: ICML 2024
Published: 2024

31. Effects of rotation and anisotropy on the properties of type-II holographic superconductors

Author: Herrera-Mendoza, Jhony A., Herrera-Aguilar, Alfredo, Higuita-Borja, Daniel F., Méndez-Zavaleta, Julio A., Pérez-Rodríguez, Felipe, and Yin, Jia-Xin
Subjects: High Energy Physics - Theory, Condensed Matter - Superconductivity, General Relativity and Quantum Cosmology
Abstract: The present work concerns the detailed construction of a holographic model for a type-II s-wave superconductor defined on a 5-dimensional anisotropic rotating black hole. We examine the role of rotation and anisotropy on the properties of the superconductor model focusing on the condensate and the AC conductivity, for which we obtain closed formulas, using both analytical and numerical methods. The results reveal that the rotation is responsible for the appearance of a peak and for introducing an exponentially vanishing behavior in the high-frequency limit of the real component of the AC conductivity. Such a behavior aligns with that observed in high-temperature superconductor models and experiments, where the peak and vanishing behavior result from quasiparticle damping, suggesting a plausible relation between the rotation of a black hole and quasiparticle damping effects in a superconducting material. In addition, we provide a detailed construction of the vortex lattice presented in arXiv:2208.05988 and study its behavior as a function of an external uniform magnetic field. Once again, it is shown that the vortex lattice can be continuously deformed along with a change in the vortex population by virtue of the magnetic field, providing a promising avenue for holographically modeling the vortex lattice deformations observed in experimental superconducting materials. As an observed experimental effect, we describe both the vortex lattice deformation and the vortex population increment under the action of an external magnetic field in the LiFeAs type-II superconductor. These effects supplement those previously found for the FeSe type-II superconductor studied in arXiv:2208.05988., Comment: 25 pages, 15 figures
Published: 2024

32. The CLRS-Text Algorithmic Reasoning Language Benchmark

Author: Markeeva, Larisa, McLeish, Sean, Ibarz, Borja, Bounsi, Wilfried, Kozlova, Olga, Vitvitskyi, Alex, Blundell, Charles, Goldstein, Tom, Schwarzschild, Avi, and Veličković, Petar
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Data Structures and Algorithms, Statistics - Machine Learning
Abstract: Eliciting reasoning capabilities from language models (LMs) is a critical direction on the path towards building intelligent systems. Most recent studies dedicated to reasoning focus on out-of-distribution performance on procedurally-generated synthetic benchmarks, bespoke-built to evaluate specific skills only. This trend makes results hard to transfer across publications, slowing down progress. Three years ago, a similar issue was identified and rectified in the field of neural algorithmic reasoning, with the advent of the CLRS benchmark. CLRS is a dataset generator comprising graph execution traces of classical algorithms from the Introduction to Algorithms textbook. Inspired by this, we propose CLRS-Text -- a textual version of these algorithmic traces. Out of the box, CLRS-Text is capable of procedurally generating trace data for thirty diverse, challenging algorithmic tasks across any desirable input distribution, while offering a standard pipeline in which any additional algorithmic tasks may be created in the benchmark. We fine-tune and evaluate various LMs as generalist executors on this benchmark, validating prior work and revealing a novel, interesting challenge for the LM reasoning community. Our code is available at https://github.com/google-deepmind/clrs/tree/master/clrs/_src/clrs_text., Comment: Preprint, under review. Comments welcome
Published: 2024

33. 'It Blew My Mind'. Creating Spaces for Integrating Creativity, Electroacoustic Music and Digital Competencies for Student Teachers

Author: Jesús Tejada, Adolf Murillo, and Borja Mateu-Luján
Abstract: This exploratory study describes the design and implementation of a sound-based intervention in the initial training of specialist music teachers at a Spanish university. It aimed to create spaces geared towards more creative and contemporary approaches to musical learning in order to gauge the perceptions of trainee teachers regarding this kind of approach. The intervention (45 h of class time) was based on the creation of electroacoustic compositions following the SBM (Sound Based Music) approach using digital tools (Aglaya Play, AP). Qualitative process data were collected through self-reports, individual memories, and nine focus groups. The results suggest that the implementation of new intervention models that take into account the development of future teachers' creativity with activities focused on exploration, experimentation, and creation with sound can generate new opportunities to enrich their teaching identities.
Published: 2024
Full Text: View/download PDF

34. Engagement and Commitment in Higher Education: Looking at the Role of Identification and Perception of Performance

Author: Javier Borja-Gil, Mario Castellanos Verdugo, and M. Ángeles Oviedo-García
Abstract: Within OECD countries, 20% of university students continue no further than the first year. The objective of this research is to analyse the antecedents of student commitment, so as to design action plans for reducing dropout rates within higher education. Educational engagement, student--university identification and perception of performance were analysed at the campus of Social and Juridical Sciences of Seville University (Spain). A sample of 641 valid questionnaires was used to analyse the constructs and their complex inter-relations using Partial Least Squares-Simultaneous Equation Modelling. The results revealed a strong relation between identification with and commitment to the university, as well as between educational engagement and identification. Universities as institutions are called on to take action to reduce high early dropout rates by encouraging educational engagement and identification with the university. Such actions might include departments for educational orientation and the promotion of activities to foster enduring links between students and universities.
Published: 2024
Full Text: View/download PDF

35. Quantifying the Coastal Hazard Risk Reduction Benefits of Coral Reef Restoration in the U.S. Virgin Islands

Author: Gaido-Lasserre, Camila, Pietsch McNulty, Valerie, Storlazzi, Curt D., Reguero, Borja G., Perez, Denise, Fogg, Sandra, Cummings, Kristen A, Ward, Jessica, Schill, Steve, Jarvis, Celeste, and Beck, Michael W
Subjects: Benefit to Cost Analysis, Flood risk, Coral reef restoration
Abstract: Coastal habitat restoration, especially of coral reef ecosystems, can significantly reduce the exposure of coastal communities to natural hazards and, consequently, the risk of wave-driven flooding. Likewise, reef degradation can increase coastal flood risks to people and property. In this study, the valuation of coral reefs in the United States Virgin Islands (USVI), along the coasts of St. Croix, St. John, and St. Thomas, demonstrated the social and economic benefits provided by these natural defenses. Across the territory, more than 481 people and $31.2 million of infrastructure were estimated to receive protection from coral reefs per year (2010 U.S. dollars). In 2017, Hurricanes Irma and Maria significantly damaged coral reefs throughout the archipelago. By combining engineering, ecological, geospatial, social, and economic data and tools, this study provided a rigorous valuation of where potential coral reef restoration projects could help rebuild these damaged habitats and decrease the risks from coastal hazards faced by USVI’s reef-fronted communities. Multiple restoration scenarios were considered in the analysis, two of which are detailed in this report. These include (1) ‘Ecological’ restoration, where restoration creates a structure that is 0.25 m high and 25-m-wide reef, and (2) ‘Hybrid’ restoration, where restoration creates a structure that is 1.25 m high and 5 m wide. There are many ways that such structures could be developed. In the hydrodynamic analyses, there are no assumptions about how the restoration is developed. Many practitioners of both coral (and oyster reef) restoration consider that a reef height of 0.25 m might be delivered from planting corals alone and that 1.25 m might require a combination of artificial structures and coral planting. In a third scenario, the analysis investigated the reduction of protection benefits that would occur through the reduction of 1 meter of naturally occurring reef height due to reef degradation. The reduction of protection due to the loss of reefs can also be interpreted as the protection value of the existing reefs. In all studied restoration scenarios, it was assumed that the planting of corals would enhance hydrodynamic roughness, effectively dissipating incident wave energy and reducing the potential for coastal flooding. A standardized approach was employed to strategically locate potential restoration projects along the entire linear extent of existing reefs bordering the USVI, and to identify where coral reef restoration could offer valuable benefits in flood reduction. Potential restoration projects were only located within the existing distribution of reefs across the region, even though numerous sites were positioned far offshore (2-3 km), and some were at relatively deep depths (up to 7 m). Risk-based valuation approaches were followed to delineate flood zones at a 10 m2 resolution along the entire region's reef-lined shorelines for all the potential coral reef restoration scenarios. These were subsequently compared to flood zones without coral reef restoration. The potential reduction in coastal flood risk provided by coral reef restoration, and the protection value of existing reefs, were quantified utilizing the latest information available at the time of analysis from the U.S. Census Bureau, Federal Emergency Management Agency (FEMA), and Bureau of Economic Analysis for return-interval storm events. The change in Expected Annual Damages (EAD), a metric indicating the annual protection gained due to coral reef restoration, was calculated based on the damages associated with each storm probability. The findings suggest that the benefits of reef restoration are spatially variable within the USVI. In some areas, the analysis showed limited benefits from reef restoration, which may be attributed to the depth or offshore distances of proposed restoration sites. However, there were a number of key areas where reef restoration could have substantial benefits for flood risk reduction. The annual flood risk reduction attributed to potential ‘ecological’ coral reef restoration in the USVI was 99 people and $6.1 million (2010 U.S. dollars). The Benefit-to-Cost Ratio (BCR) for this restoration approach was found to be larger than 1 (i.e., cost-effective) along 11% of the St. Croix coastline, 4.9% of the St. John coastline, and 8.7% of the St. Thomas coastline. This analysis offers stakeholders and decision-makers a spatially explicit and rigorous evaluation that illustrates how, where, and when potential coral reef restoration efforts in St. Croix, St. John, and St. Thomas could be instrumental to reducing coastal storm-induced flooding. Understanding areas where reef management, recovery, and restoration could effectively reduce climate hazard-related risks is crucial to protect and enhance the resilience of coastal communities in USVI.
Published: 2024

36. Improved efficient physics-based computational modeling of regional wave-driven coastal flooding for reef-lined coastlines

Author: Gaido-Lasserre, Camila, Nederhoff, Kees, Storlazzi, Curt D, Reguero, Borja G, and Beck, Michael W
Subjects: Earth Sciences, Maritime Engineering, Engineering, Climate Action, Regional flood risk, Wave-driven flooding, SFINCS, XBeach, Flood extent, Flood depth, Oceanography, Maritime engineering
Published: 2024

37. Neural Optimization with Adaptive Heuristics for Intelligent Marketing System

Author: Wei, Changshuai, Zelditch, Benjamin, Chen, Joyce, Ribeiro, Andre Assuncao Silva T, Tay, Jingyi Kenneth, Elizondo, Borja Ocejo, Selvaraj, Keerthi, Gupta, Aman, and De Almeida, Licurgo Benemann
Subjects: Statistics - Methodology, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval, Computer Science - Machine Learning, Mathematics - Optimization and Control, G.3, G.1.6, I.2
Abstract: Computational marketing has become increasingly important in today's digital world, facing challenges such as massive heterogeneous data, multi-channel customer journeys, and limited marketing budgets. In this paper, we propose a general framework for marketing AI systems, the Neural Optimization with Adaptive Heuristics (NOAH) framework. NOAH is the first general framework for marketing optimization that considers both to-business (2B) and to-consumer (2C) products, as well as both owned and paid channels. We describe key modules of the NOAH framework, including prediction, optimization, and adaptive heuristics, providing examples for bidding and content optimization. We then detail the successful application of NOAH to LinkedIn's email marketing system, showcasing significant wins over the legacy ranking system. Additionally, we share details and insights that are broadly useful, particularly on: (i) addressing delayed feedback with lifetime value, (ii) performing large-scale linear programming with randomization, (iii) improving retrieval with audience expansion, (iv) reducing signal dilution in targeting tests, and (v) handling zero-inflated heavy-tail metrics in statistical testing., Comment: KDD 2024
Published: 2024
Full Text: View/download PDF

38. Synthetic Tabular Data Validation: A Divergence-Based Approach

Author: Apellániz, Patricia A., Jiménez, Ana, Galende, Borja Arroyo, Parras, Juan, and Zazo, Santiago
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, I.2.0
Abstract: The ever-increasing use of generative models in various fields where tabular data is used highlights the need for robust and standardized validation metrics to assess the similarity between real and synthetic data. Current methods lack a unified framework and rely on diverse and often inconclusive statistical measures. Divergences, which quantify discrepancies between data distributions, offer a promising avenue for validation. However, traditional approaches calculate divergences independently for each feature due to the complexity of joint distribution modeling. This paper addresses this challenge by proposing a novel approach that uses divergence estimation to overcome the limitations of marginal comparisons. Our core contribution lies in applying a divergence estimator to build a validation metric considering the joint distribution of real and synthetic data. We leverage a probabilistic classifier to approximate the density ratio between datasets, allowing the capture of complex relationships. We specifically calculate two divergences: the well-known Kullback-Leibler (KL) divergence and the Jensen-Shannon (JS) divergence. KL divergence offers an established use in the field, while JS divergence is symmetric and bounded, providing a reliable metric. The efficacy of this approach is demonstrated through a series of experiments with varying distribution complexities. The initial phase involves comparing estimated divergences with analytical solutions for simple distributions, setting a benchmark for accuracy. Finally, we validate our method on a real-world dataset and its corresponding synthetic counterpart, showcasing its effectiveness in practical applications. This research offers a significant contribution with applicability beyond tabular data and the potential to improve synthetic data validation in various fields., Comment: 15 pages, 14 figures
Published: 2024
Full Text: View/download PDF

39. Chemo-dynamical Evolution of Simulated Satellites for a Milky Way-like Galaxy

Author: Hirai, Yutaka, Kirby, Evan N., Chiba, Masashi, Hayashi, Kohei, Anguiano, Borja, Saitoh, Takayuki R., Ishigaki, Miho N., and Beers, Timothy C.
Subjects: Astrophysics - Astrophysics of Galaxies, Astrophysics - High Energy Astrophysical Phenomena, Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - Solar and Stellar Astrophysics
Abstract: The chemical abundances of Milky Way's satellites reflect their star formation histories (SFHs), yet, due to the difficulty of determining the ages of old stars, the SFHs of most satellites are poorly measured. Ongoing and upcoming surveys will obtain around ten times more medium-resolution spectra for stars in satellites than are currently available. To correctly extract SFHs from large samples of chemical abundances, the relationship between chemical abundances and SFHs needs to be clarified. Here, we perform a high-resolution cosmological zoom-in simulation of a Milky Way-like galaxy with detailed models of star formation, supernova feedback, and metal diffusion. We quantify SFHs, metallicity distribution functions, and the $\alpha$-element (Mg, Ca, and Si) abundances in satellites of the host galaxy. We find that star formation in most simulated satellites is quenched before infalling to their host. Star formation episodes in simulated satellites are separated by a few hundred Myr owing to supernova feedback; each star formation event produces groups of stars with similar [$\alpha$/Fe] and [Fe/H]. We then perform a mock observation of the upcoming Subaru Prime Focus Spectrograph (PFS) observations. We find that Subaru PFS will be able to detect distinct groups of stars in [$\alpha$/Fe] vs. [Fe/H] space, produced by episodic star formation. This result means that episodic SFHs can be estimated from the chemical abundances of $\gtrsim$ 1,000 stars determined with medium-resolution spectroscopy., Comment: 16 pages, 9 figures, accepted for publication in The Astrophysical Journal
Published: 2024

40. AirGapAgent: Protecting Privacy-Conscious Conversational Agents

Author: Bagdasarian, Eugene, Yi, Ren, Ghalebikesabi, Sahra, Kairouz, Peter, Gruteser, Marco, Oh, Sewoong, Balle, Borja, and Ramage, Daniel
Subjects: Computer Science - Cryptography and Security, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: The growing use of large language model (LLM)-based conversational agents to manage sensitive user data raises significant privacy concerns. While these agents excel at understanding and acting on context, this capability can be exploited by malicious actors. We introduce a novel threat model where adversarial third-party apps manipulate the context of interaction to trick LLM-based agents into revealing private information not relevant to the task at hand. Grounded in the framework of contextual integrity, we introduce AirGapAgent, a privacy-conscious agent designed to prevent unintended data leakage by restricting the agent's access to only the data necessary for a specific task. Extensive experiments using Gemini, GPT, and Mistral models as agents validate our approach's effectiveness in mitigating this form of context hijacking while maintaining core agent functionality. For example, we show that a single-query context hijacking attack on a Gemini Ultra agent reduces its ability to protect user data from 94% to 45%, while an AirGapAgent achieves 97% protection, rendering the same attack ineffective., Comment: at CCS'24
Published: 2024

41. Early Fault-Tolerant Quantum Algorithms in Practice: Application to Ground-State Energy Estimation

Author: Kiss, Oriel, Azad, Utkarsh, Requena, Borja, Roggero, Alessandro, Wakeham, David, and Arrazola, Juan Miguel
Subjects: Quantum Physics
Abstract: We explore the practicality of early fault-tolerant quantum algorithms, focusing on ground-state energy estimation problems. Specifically, we address the computation of the cumulative distribution function (CDF) of the spectral measure of the Hamiltonian and the identification of its discontinuities. Scaling to bigger system sizes unveils three challenges: the smoothness of the CDF for large supports, the absence of tight lower bounds on the overlap with the actual ground state, and the complexity of preparing high-quality initial states. To tackle these challenges, we introduce a signal processing technique for identifying the inflection point of the CDF. We argue that this change of paradigm significantly simplifies the problem, making it more accessible while still being accurate. Hence, instead of trying to find the exact ground-state energy, we advocate improving on the classical estimate by aiming at the low-energy support of the initial state. Furthermore, we offer quantitative resource estimates for the maximum number of samples required to identify an increase in the CDF of a given size. Finally, we conduct numerical experiments on a 26-qubit fully-connected Heisenberg model using a truncated density-matrix renormalization group (DMRG) initial state of low bond dimension. Results show that the prediction obtained with the quantum algorithm aligns well with the DMRG-converged energy at large bond dimensions and requires several orders of magnitude fewer samples than predicted by the theory. Hence, we argue that CDF-based quantum algorithms are a viable, practical alternative to quantum phase estimation in resource-limited scenarios., Comment: 16 pages, 9 figures
Published: 2024

42. The Ethics of Advanced AI Assistants

Author: Gabriel, Iason, Manzini, Arianna, Keeling, Geoff, Hendricks, Lisa Anne, Rieser, Verena, Iqbal, Hasan, Tomašev, Nenad, Ktena, Ira, Kenton, Zachary, Rodriguez, Mikel, El-Sayed, Seliem, Brown, Sasha, Akbulut, Canfer, Trask, Andrew, Hughes, Edward, Bergman, A. Stevie, Shelby, Renee, Marchal, Nahema, Griffin, Conor, Mateos-Garcia, Juan, Weidinger, Laura, Street, Winnie, Lange, Benjamin, Ingerman, Alex, Lentz, Alison, Enger, Reed, Barakat, Andrew, Krakovna, Victoria, Siy, John Oliver, Kurth-Nelson, Zeb, McCroskery, Amanda, Bolina, Vijay, Law, Harry, Shanahan, Murray, Alberts, Lize, Balle, Borja, de Haas, Sarah, Ibitoye, Yetunde, Dafoe, Allan, Goldberg, Beth, Krier, Sébastien, Reese, Alexander, Witherspoon, Sims, Hawkins, Will, Rauh, Maribeth, Wallace, Don, Franklin, Matija, Goldstein, Josh A., Lehman, Joel, Klenk, Michael, Vallor, Shannon, Biles, Courtney, Morris, Meredith Ringel, King, Helen, Arcas, Blaise Agüera y, Isaac, William, and Manyika, James
Subjects: Computer Science - Computers and Society
Abstract: This paper focuses on the opportunities and the ethical and societal risks posed by advanced AI assistants. We define advanced AI assistants as artificial agents with natural language interfaces, whose function is to plan and execute sequences of actions on behalf of a user, across one or more domains, in line with the user's expectations. The paper starts by considering the technology itself, providing an overview of AI assistants, their technical foundations and potential range of applications. It then explores questions around AI value alignment, well-being, safety and malicious uses. Extending the circle of inquiry further, we next consider the relationship between advanced AI assistants and individual users in more detail, exploring topics such as manipulation and persuasion, anthropomorphism, appropriate relationships, trust and privacy. With this analysis in place, we consider the deployment of advanced assistants at a societal scale, focusing on cooperation, equity and access, misinformation, economic impact, the environment and how best to evaluate advanced AI assistants. Finally, we conclude by providing a range of recommendations for researchers, developers, policymakers and public stakeholders.
Published: 2024

43. The Brain Tumor Segmentation in Pediatrics (BraTS-PEDs) Challenge: Focus on Pediatrics (CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs)

Author: Kazerooni, Anahita Fathi, Khalili, Nastaran, Liu, Xinyang, Gandhi, Deep, Jiang, Zhifan, Anwar, Syed Muhammed, Albrecht, Jake, Adewole, Maruf, Anazodo, Udunna, Anderson, Hannah, Baid, Ujjwal, Bergquist, Timothy, Borja, Austin J., Calabrese, Evan, Chung, Verena, Conte, Gian-Marco, Dako, Farouk, Eddy, James, Ezhov, Ivan, Familiar, Ariana, Farahani, Keyvan, Franson, Andrea, Gottipati, Anurag, Haldar, Shuvanjan, Iglesias, Juan Eugenio, Janas, Anastasia, Johansen, Elaine, Jones, Blaise V, Khalili, Neda, Kofler, Florian, LaBella, Dominic, Lai, Hollie Anne, Van Leemput, Koen, Li, Hongwei Bran, Maleki, Nazanin, McAllister, Aaron S, Meier, Zeke, Menze, Bjoern, Moawad, Ahmed W, Nandolia, Khanak K, Pavaine, Julija, Piraud, Marie, Poussaint, Tina, Prabhu, Sanjay P, Reitman, Zachary, Rudie, Jeffrey D, Sanchez-Montano, Mariana, Shaikh, Ibraheem Salman, Sheth, Nakul, Tu, Wenxin, Wang, Chunhao, Ware, Jeffrey B, Wiestler, Benedikt, Zapaishchykova, Anna, Bornhorst, Miriam, Deutsch, Michelle, Fouladi, Maryam, Lazow, Margot, Mikael, Leonie, Hummel, Trent, Kann, Benjamin, de Blank, Peter, Hoffman, Lindsey, Aboian, Mariam, Nabavizadeh, Ali, Packer, Roger, Bakas, Spyridon, Resnick, Adam, Rood, Brian, Vossough, Arastoo, and Linguraru, Marius George
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Pediatric tumors of the central nervous system are the most common cause of cancer-related death in children. The five-year survival rate for high-grade gliomas in children is less than 20%. Due to their rarity, the diagnosis of these entities is often delayed, their treatment is mainly based on historic treatment concepts, and clinical trials require multi-institutional collaborations. Here we present the CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs challenge, focused on pediatric brain tumors with data acquired across multiple international consortia dedicated to pediatric neuro-oncology and clinical trials. The CBTN-CONNECT-DIPGR-ASNR-MICCAI BraTS-PEDs challenge brings together clinicians and AI/imaging scientists to lead to faster development of automated segmentation techniques that could benefit clinical trials, and ultimately the care of children with brain tumors., Comment: arXiv admin note: substantial text overlap with arXiv:2305.17033
Published: 2024

44. A note on generalization bounds for losses with finite moments

Author: Rodríguez-Gálvez, Borja, Rivasplata, Omar, Thobaben, Ragnar, and Skoglund, Mikael
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: This paper studies the truncation method from Alquier [1] to derive high-probability PAC-Bayes bounds for unbounded losses with heavy tails. Assuming that the $p$-th moment is bounded, the resulting bounds interpolate between a slow rate $1 / \sqrt{n}$ when $p=2$, and a fast rate $1 / n$ when $p \to \infty$ and the loss is essentially bounded. Moreover, the paper derives a high-probability PAC-Bayes bound for losses with a bounded variance. This bound has an exponentially better dependence on the confidence parameter and the dependency measure than previous bounds in the literature. Finally, the paper extends all results to guarantees in expectation and single-draw PAC-Bayes. In order to so, it obtains analogues of the PAC-Bayes fast rate bound for bounded losses from [2] in these settings., Comment: 9 pages: 5 of main text, 1 of references, and 3 of appendices
Published: 2024

45. Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics

Author: Grillotti, Luca, Faldor, Maxence, León, Borja G., and Cully, Antoine
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: A key aspect of intelligence is the ability to demonstrate a broad spectrum of behaviors for adapting to unexpected situations. Over the past decade, advancements in deep reinforcement learning have led to groundbreaking achievements to solve complex continuous control tasks. However, most approaches return only one solution specialized for a specific problem. We introduce Quality-Diversity Actor-Critic (QDAC), an off-policy actor-critic deep reinforcement learning algorithm that leverages a value function critic and a successor features critic to learn high-performing and diverse behaviors. In this framework, the actor optimizes an objective that seamlessly unifies both critics using constrained optimization to (1) maximize return, while (2) executing diverse skills. Compared with other Quality-Diversity methods, QDAC achieves significantly higher performance and more diverse behaviors on six challenging continuous control locomotion tasks. We also demonstrate that we can harness the learned skills to adapt better than other baselines to five perturbed environments. Finally, qualitative analyses showcase a range of remarkable behaviors: adaptive-intelligent-robotics.github.io/QDAC., Comment: The first two authors contributed equally to this work. Accepted at ICML 2024
Published: 2024

46. Chained Information-Theoretic bounds and Tight Regret Rate for Linear Bandit Problems

Author: Gouverneur, Amaury, Rodríguez-Gálvez, Borja, Oechtering, Tobias J., and Skoglund, Mikael
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: This paper studies the Bayesian regret of a variant of the Thompson-Sampling algorithm for bandit problems. It builds upon the information-theoretic framework of [Russo and Van Roy, 2015] and, more specifically, on the rate-distortion analysis from [Dong and Van Roy, 2020], where they proved a bound with regret rate of $O(d\sqrt{T \log(T)})$ for the $d$-dimensional linear bandit setting. We focus on bandit problems with a metric action space and, using a chaining argument, we establish new bounds that depend on the metric entropy of the action space for a variant of Thompson-Sampling. Under suitable continuity assumption of the rewards, our bound offers a tight rate of $O(d\sqrt{T})$ for $d$-dimensional linear bandit problems., Comment: 15 pages: 8 of main text and 7 of appendices
Published: 2024

47. Chemical abundances and deviations from the solar S/O ratio in the gas-phase ISM of galaxies based on infrared emission lines

Author: Pérez-Díaz, Borja, Pérez-Montero, Enrique, Fernández-Ontiveros, Juan A., Vílchez, José M., Hernán-Caballero, Antonio, and Amorín, Ricardo
Subjects: Astrophysics - Astrophysics of Galaxies
Abstract: The infrared (IR) range is extremely useful in the context of chemical abundance studies of the gas-phase interstellar medium (ISM) due to the large variety of ionic species traced in this regime, the negligible effects from dust attenuation or temperature stratification, and the amount of data that has been and will be released in the coming years. Taking advantage of available IR emission lines, we analysed the chemical content of the gas-phase ISM in a sample of 131 Star-Forming Galaxies (SFGs) and 73 Active Galactic Nuclei (AGNs). Particularly, we derived the chemical content via their total oxygen abundance in combination with nitrogen and sulfur abundances, and with the ionisation parameter. We used a new version of the code HII-CHI-Mistry-IR v3.1 which allows us to estimate log(N/O), 12+log(O/H), log(U), and, for the first time, 12+log(S/H) from IR emission lines, which can be applied to both SFGs and AGNs. We tested that the estimations from this new version, that only considers sulfur lines for the derivation of sulfur abundances, are compatible with previous studies. While most of the SFGs and AGNs show solar log(N/O) abundances, we found a large spread in the log(S/O) relative abundances. Specifically, we found extremely low log(S/O) values (1/10th solar) in some SFGs and AGNs with solar-like oxygen abundances. This result warns against the use of optical and IR sulfur emission lines to estimate oxygen abundances when no prior estimation of log(S/O) is provided., Comment: Accepted for publication in A&A. 16 pages, 11 figures, 4 electronic tables
Published: 2024
Full Text: View/download PDF

48. Nitrogen fixation in the North Atlantic supported by Gulf Stream eddy-borne diazotrophs

Author: Hoerstmann, Cora, Aguiar-González, Borja, Barrillon, Stéphanie, Carpaneto Bastos, Cécile, Grosso, Olivier, Pérez-Hernández, M. D., Doglioli, Andrea M., Petrenko, Anne A., Carracedo, Lidia I., and Benavides, Mar
Published: 2024
Full Text: View/download PDF

49. Reference values for body composition in healthy urban Mexican children and adolescents

Author: Desiree, Lopez-Gonzalez, C Wells, Jonathan, Armando, Partida-Gaytan, Mario, Cortina-Borja, and Patricia, Clark
Published: 2024
Full Text: View/download PDF

50. APOE genotype, hippocampal volume, and cognitive reserve predict improvement by cognitive training in older adults without dementia: a randomized controlled trial

Author: Montejo Carrasco, Pedro, Montenegro-Peña, Mercedes, Prada Crespo, David, Rodríguez Rojo, Inmaculada, Barabash Bustelo, Ana, Montejo Rubio, Borja, Marcos Dolado, Alberto, Maestú Unturbe, Fernando, and Delgado Losada, María Luisa
Published: 2024
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

77,514 results on '"Borja, A."'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources