Author: "Hammoud, Hasan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Hammoud, Hasan"' showing total 31 results


1. Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation

Author: Malinovsky, Grigory, Michieli, Umberto, Hammoud, Hasan Abed Al Kader, Ceritli, Taha, Elesedy, Hayder, Ozay, Mete, and Richtárik, Peter
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: Fine-tuning has become a popular approach to adapting large foundational models to specific tasks. As the size of models and datasets grows, parameter-efficient fine-tuning techniques are increasingly important. One of the most widely used methods is Low-Rank Adaptation (LoRA), with adaptation update expressed as the product of two low-rank matrices. While LoRA was shown to possess strong performance in fine-tuning, it often under-performs when compared to full-parameter fine-tuning (FPFT). Although many variants of LoRA have been extensively studied empirically, their theoretical optimization analysis is heavily under-explored. The starting point of our work is a demonstration that LoRA and its two extensions, Asymmetric LoRA and Chain of LoRA, indeed encounter convergence issues. To address these issues, we propose Randomized Asymmetric Chain of LoRA (RAC-LoRA) -- a general optimization framework that rigorously analyzes the convergence rates of LoRA-based methods. Our approach inherits the empirical benefits of LoRA-style heuristics, but introduces several small but important algorithmic modifications which turn it into a provably convergent method. Our framework serves as a bridge between FPFT and low-rank adaptation. We provide provable guarantees of convergence to the same solution as FPFT, along with the rate of convergence. Additionally, we present a convergence analysis for smooth, non-convex loss functions, covering gradient descent, stochastic gradient descent, and federated learning settings. Our theoretical findings are supported by experimental results.
Comment: 36 pages, 4 figures, 2 algorithms
Published: 2024

2. Model Merging and Safety Alignment: One Bad Model Spoils the Bunch

Author: Hammoud, Hasan Abed Al Kader, Michieli, Umberto, Pizzati, Fabio, Torr, Philip, Bibi, Adel, Ghanem, Bernard, and Ozay, Mete
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Merging Large Language Models (LLMs) is a cost-effective technique for combining multiple expert LLMs into a single versatile model, retaining the expertise of the original ones. However, current approaches often overlook the importance of safety alignment during merging, leading to highly misaligned models. This work investigates the effects of model merging on alignment. We evaluate several popular model merging techniques, demonstrating that existing methods do not only transfer domain expertise but also propagate misalignment. We propose a simple two-step approach to address this problem: (i) generating synthetic safety and domain-specific data, and (ii) incorporating these generated data into the optimization process of existing data-aware model merging techniques. This allows us to treat alignment as a skill that can be maximized in the resulting merged LLM. Our experiments illustrate the effectiveness of integrating alignment-related data during merging, resulting in models that excel in both domain expertise and alignment.
Comment: Under review
Published: 2024

3. Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation

Author: Yang, Yibo, Li, Xiaojie, Alfarra, Motasem, Hammoud, Hasan, Bibi, Adel, Torr, Philip, and Ghanem, Bernard
Subjects: Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing
Abstract: Relieving the reliance of neural network training on a global back-propagation (BP) has emerged as a notable research topic due to the biological implausibility and huge memory consumption caused by BP. Among the existing solutions, local learning optimizes gradient-isolated modules of a neural network with local errors and has been proved to be effective even on large-scale datasets. However, the reconciliation among local errors has never been investigated. In this paper, we first theoretically study non-greedy layer-wise training and show that the convergence cannot be assured when the local gradient in a module w.r.t. its input is not reconciled with the local gradient in the previous module w.r.t. its output. Inspired by the theoretical result, we further propose a local training strategy that successively regularizes the gradient reconciliation between neighboring modules without breaking gradient isolation or introducing any learnable parameters. Our method can be integrated into both local-BP and BP-free settings. In experiments, we achieve significant performance improvements compared to previous methods. Particularly, our method for CNN and Transformer architectures on ImageNet is able to attain a competitive performance with global BP, saving more than 40% memory consumption.
Comment: ICML 2024
Published: 2024

4. On Pretraining Data Diversity for Self-Supervised Learning

Author: Hammoud, Hasan Abed Al Kader, Das, Tuhin, Pizzati, Fabio, Torr, Philip, Bibi, Adel, and Ghanem, Bernard
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We explore the impact of training with more diverse datasets, characterized by the number of unique samples, on the performance of self-supervised learning (SSL) under a fixed computational budget. Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performance, albeit only when the distribution distance to the downstream data is minimal. Notably, even with an exceptionally large pretraining data diversity achieved through methods like web crawling or diffusion-generated data, among other ways, the distribution shift remains a challenge. Our experiments are comprehensive with seven SSL methods using large-scale datasets such as ImageNet and YFCC100M amounting to over 200 GPU days. Code and trained models are available at https://github.com/hammoudhasan/DiversitySSL
Comment: ECCV 2024
Published: 2024

5. SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training?

Author: Hammoud, Hasan Abed Al Kader, Itani, Hani, Pizzati, Fabio, Torr, Philip, Bibi, Adel, and Ghanem, Bernard
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We present SynthCLIP, a CLIP model trained on entirely synthetic text-image pairs. Leveraging recent text-to-image (TTI) networks and large language models (LLM), we generate synthetic datasets of images and corresponding captions at scale, with no human intervention. In this work, we provide an analysis on CLIP models trained on synthetic data. We provide insights on the data generation strategy, number of samples required, scaling trends, and resulting properties. We also introduce SynthCI-30M, a purely synthetic dataset comprising 30 million captioned images. Our code, trained models, and data, are released as open source at https://github.com/hammoudhasan/SynthCLIP
Comment: Under review
Published: 2024

6. From Categories to Classifiers: Name-Only Continual Learning by Exploring the Web

Author: Prabhu, Ameya, Hammoud, Hasan Abed Al Kader, Lim, Ser-Nam, Ghanem, Bernard, Torr, Philip H. S., and Bibi, Adel
Subjects: Computer Science - Machine Learning
Abstract: Continual Learning (CL) often relies on the availability of extensive annotated datasets, an assumption that is unrealistically time-consuming and costly in practice. We explore a novel paradigm termed name-only continual learning where time and cost constraints prohibit manual annotation. In this scenario, learners adapt to new category shifts using only category names without the luxury of annotated training data. Our proposed solution leverages the expansive and ever-evolving internet to query and download uncurated webly-supervised data for image classification. We investigate the reliability of our web data and find them comparable, and in some cases superior, to manually annotated datasets. Additionally, we show that by harnessing the web, we can create support sets that surpass state-of-the-art name-only classification that create support sets using generative models or image retrieval from LAION-5B, achieving up to 25% boost in accuracy. When applied across varied continual learning contexts, our method consistently exhibits a small performance gap in comparison to models trained on manually annotated datasets. We present EvoTrends, a class-incremental dataset made from the web to capture real-world trends, created in just minutes. Overall, this paper underscores the potential of using uncurated webly-supervised data to mitigate the challenges associated with manual data labeling in continual learning.
Published: 2023

7. Mindstorms in Natural Language-Based Societies of Mind

Author: Zhuge, Mingchen, Liu, Haozhe, Faccio, Francesco, Ashley, Dylan R., Csordás, Róbert, Gopalakrishnan, Anand, Hamdi, Abdullah, Hammoud, Hasan Abed Al Kader, Herrmann, Vincent, Irie, Kazuki, Kirsch, Louis, Li, Bing, Li, Guohao, Liu, Shuming, Mai, Jinjie, Piękos, Piotr, Ramesh, Aditya, Schlag, Imanol, Shi, Weimin, Stanić, Aleksandar, Wang, Wenyi, Wang, Yuhui, Xu, Mengmeng, Fan, Deng-Ping, Ghanem, Bernard, and Schmidhuber, Jürgen
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Multiagent Systems, 68T07, I.2.6, I.2.11
Abstract: Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overcome the limitations of single LLMs, improving multimodal zero-shot reasoning. In these natural language-based societies of mind (NLSOMs), new agents -- all communicating through the same universal symbolic language -- are easily added in a modular fashion. To demonstrate the power of NLSOMs, we assemble and experiment with several of them (having up to 129 members), leveraging mindstorms in them to solve some practical AI tasks: visual question answering, image captioning, text-to-image synthesis, 3D generation, egocentric retrieval, embodied AI, and general language-based task solving. We view this as a starting point towards much larger NLSOMs with billions of agents -- some of which may be humans. And with this emergence of great societies of heterogeneous minds, many new research questions have suddenly become paramount to the future of artificial intelligence. What should be the social structure of an NLSOM? What would be the (dis)advantages of having a monarchical rather than a democratic structure? How can principles of NN economies be used to maximize the total reward of a reinforcement learning NLSOM? In this work, we identify, discuss, and try to answer some of these questions.
Comment: 9 pages in main text + 7 pages of references + 38 pages of appendices, 14 figures in main text + 13 in appendices, 7 tables in appendices
Published: 2023

8. Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?

Author: Hammoud, Hasan Abed Al Kader, Prabhu, Ameya, Lim, Ser-Nam, Torr, Philip H. S., Bibi, Adel, and Ghanem, Bernard
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: We revisit the common practice of evaluating adaptation of Online Continual Learning (OCL) algorithms through the metric of online accuracy, which measures the accuracy of the model on the immediate next few samples. However, we show that this metric is unreliable, as even vacuous blind classifiers, which do not use input images for prediction, can achieve unrealistically high online accuracy by exploiting spurious label correlations in the data stream. Our study reveals that existing OCL algorithms can also achieve high online accuracy, but perform poorly in retaining useful information, suggesting that they unintentionally learn spurious label correlations. To address this issue, we propose a novel metric for measuring adaptation based on the accuracy on the near-future samples, where spurious correlations are removed. We benchmark existing OCL approaches using our proposed metric on large-scale datasets under various computational budgets and find that better generalization can be achieved by retaining and reusing past seen information. We believe that our proposed metric can aid in the development of truly adaptive OCL methods. We provide code to reproduce our results at https://github.com/drimpossible/EvalOCL.
Published: 2023

9. CAMEL: Communicative Agents for 'Mind' Exploration of Large Language Model Society

Author: Li, Guohao, Hammoud, Hasan Abed Al Kader, Itani, Hani, Khizbullin, Dmitrii, and Ghanem, Bernard
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Machine Learning, Computer Science - Multiagent Systems
Abstract: The rapid advancement of chat-based language models has led to remarkable progress in complex task-solving. However, their success heavily relies on human input to guide the conversation, which can be challenging and time-consuming. This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents, and provides insight into their "cognitive" processes. To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named role-playing. Our approach involves using inception prompting to guide chat agents toward task completion while maintaining consistency with human intentions. We showcase how role-playing can be used to generate conversational data for studying the behaviors and capabilities of a society of agents, providing a valuable resource for investigating conversational language models. In particular, we conduct comprehensive studies on instruction-following cooperation in multi-agent settings. Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems, and open-sourcing our library to support research on communicative agents and beyond: https://github.com/camel-ai/camel.
Comment: Accepted at NeurIPS'2023, 77 pages, project website: https://www.camel-ai.org, github repository: https://github.com/camel-ai/camel
Published: 2023

10. Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor Poisoned Samples in DNNs

Author: Hammoud, Hasan Abed Al Kader, Bibi, Adel, Torr, Philip H. S., and Ghanem, Bernard
Subjects: Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: In this paper we investigate the frequency sensitivity of Deep Neural Networks (DNNs) when presented with clean samples versus poisoned samples. Our analysis shows significant disparities in frequency sensitivity between these two types of samples. Building on these findings, we propose FREAK, a frequency-based poisoned sample detection algorithm that is simple yet effective. Our experimental results demonstrate the efficacy of FREAK not only against frequency backdoor attacks but also against some spatial attacks. Our work is just the first step in leveraging these insights. We believe that our analysis and proposed defense mechanism will provide a foundation for future research and development of backdoor defenses.
Comment: Accepted at CVPRW (The Art of Robustness)
Published: 2023

11. Computationally Budgeted Continual Learning: What Does Matter?

Author: Prabhu, Ameya, Hammoud, Hasan Abed Al Kader, Dokania, Puneet, Torr, Philip H. S., Lim, Ser-Nam, Ghanem, Bernard, and Bibi, Adel
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Continual Learning (CL) aims to sequentially train models on streams of incoming data that vary in distribution by preserving previous knowledge while adapting to new data. Current CL literature focuses on restricted access to previously seen data, while imposing no constraints on the computational budget for training. This is unreasonable for applications in-the-wild, where systems are primarily constrained by computational and time budgets, not storage. We revisit this problem with a large-scale benchmark and analyze the performance of traditional CL approaches in a compute-constrained setting, where effective memory samples used in training can be implicitly restricted as a consequence of limited computation. We conduct experiments evaluating various CL sampling strategies, distillation losses, and partial fine-tuning on two large-scale datasets, namely ImageNet2K and Continual Google Landmarks V2 in data incremental, class incremental, and time incremental settings. Through extensive experiments amounting to a total of over 1500 GPU-hours, we find that, under compute-constrained setting, traditional CL approaches, with no exception, fail to outperform a simple minimal baseline that samples uniformly from memory. Our conclusions are consistent in a different number of stream time steps, e.g., 20 to 200, and under several computational budgets. This suggests that most existing CL methods are particularly too computationally expensive for realistic budgeted deployment. Code for this project is available at: https://github.com/drimpossible/BudgetCL.
Comment: CVPR 2023
Published: 2023

12. Real-Time Evaluation in Online Continual Learning: A New Hope

Author: Ghunaim, Yasir, Bibi, Adel, Alhamoud, Kumail, Alfarra, Motasem, Hammoud, Hasan Abed Al Kader, Prabhu, Ameya, Torr, Philip H. S., and Ghanem, Bernard
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Current evaluations of Continual Learning (CL) methods typically assume that there is no constraint on training time and computation. This is an unrealistic assumption for any real-world setting, which motivates us to propose: a practical real-time evaluation of continual learning, in which the stream does not wait for the model to complete training before revealing the next data for predictions. To do this, we evaluate current CL methods with respect to their computational costs. We conduct extensive experiments on CLOC, a large-scale dataset containing 39 million time-stamped images with geolocation labels. We show that a simple baseline outperforms state-of-the-art CL methods under this evaluation, questioning the applicability of existing methods in realistic settings. In addition, we explore various CL components commonly used in the literature, including memory sampling strategies and regularization approaches. We find that all considered methods fail to be competitive against our simple baseline. This surprisingly suggests that the majority of existing CL literature is tailored to a specific class of streams that is not practical. We hope that the evaluation we provide will be the first step towards a paradigm shift to consider the computational cost in the development of online continual learning methods.
Comment: Accepted at CVPR'23 as Highlight (Top 2.5%)
Published: 2023

13. Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition

Author: Hammoud, Hasan Abed Al Kader, Liu, Shuming, Alkhrashi, Mohammed, AlBalawi, Fahad, and Ghanem, Bernard
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Deep neural networks (DNNs) are vulnerable to a class of attacks called "backdoor attacks", which create an association between a backdoor trigger and a target label the attacker is interested in exploiting. A backdoored DNN performs well on clean test images, yet persistently predicts an attacker-defined label for any sample in the presence of the backdoor trigger. Although backdoor attacks have been extensively studied in the image domain, there are very few works that explore such attacks in the video domain, and they tend to conclude that image backdoor attacks are less effective in the video domain. In this work, we revisit the traditional backdoor threat model and incorporate additional video-related aspects to that model. We show that poisoned-label image backdoor attacks could be extended temporally in two ways, statically and dynamically, leading to highly effective attacks in the video domain. In addition, we explore natural video backdoors to highlight the seriousness of this vulnerability in the video domain. And, for the first time, we study multi-modal (audiovisual) backdoor attacks against video action recognition models, where we show that attacking a single modality is enough for achieving a high attack success rate.
Published: 2023

14. Generalizability of Adversarial Robustness Under Distribution Shifts

Author: Alhamoud, Kumail, Hammoud, Hasan Abed Al Kader, Alfarra, Motasem, and Ghanem, Bernard
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent progress in empirical and certified robustness promises to deliver reliable and deployable Deep Neural Networks (DNNs). Despite that success, most existing evaluations of DNN robustness have been done on images sampled from the same distribution on which the model was trained. However, in the real world, DNNs may be deployed in dynamic environments that exhibit significant distribution shifts. In this work, we take a first step towards thoroughly investigating the interplay between empirical and certified adversarial robustness on one hand and domain generalization on another. To do so, we train robust models on multiple domains and evaluate their accuracy and robustness on an unseen domain. We observe that: (1) both empirical and certified robustness generalize to unseen domains, and (2) the level of generalizability does not correlate well with input visual similarity, measured by the FID between source and target domains. We also extend our study to cover a real-world medical application, in which adversarial augmentation significantly boosts the generalization of robustness with minimal effect on clean data accuracy.
Comment: TMLR 2023 (Featured Certification)
Published: 2022

15. PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies

Author: Qian, Guocheng, Li, Yuchen, Peng, Houwen, Mai, Jinjie, Hammoud, Hasan Abed Al Kader, Elhoseiny, Mohamed, and Ghanem, Bernard
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: PointNet++ is one of the most influential neural architectures for point cloud understanding. Although the accuracy of PointNet++ has been largely surpassed by recent networks such as PointMLP and Point Transformer, we find that a large portion of the performance gain is due to improved training strategies, i.e. data augmentation and optimization techniques, and increased model sizes rather than architectural innovations. Thus, the full potential of PointNet++ has yet to be explored. In this work, we revisit the classical PointNet++ through a systematic study of model training and scaling strategies, and offer two major contributions. First, we propose a set of improved training strategies that significantly improve PointNet++ performance. For example, we show that, without any change in architecture, the overall accuracy (OA) of PointNet++ on ScanObjectNN object classification can be raised from 77.9% to 86.1%, even outperforming state-of-the-art PointMLP. Second, we introduce an inverted residual bottleneck design and separable MLPs into PointNet++ to enable efficient and effective model scaling and propose PointNeXt, the next version of PointNets. PointNeXt can be flexibly scaled up and outperforms state-of-the-art methods on both 3D classification and segmentation tasks. For classification, PointNeXt reaches an overall accuracy of 87.7% on ScanObjectNN, surpassing PointMLP by 2.3%, while being 10x faster in inference. For semantic segmentation, PointNeXt establishes a new state-of-the-art performance with 74.9% mean IoU on S3DIS (6-fold cross-validation), being superior to the recent Point Transformer. The code and models are available at https://github.com/guochengqian/pointnext.
Comment: Accepted by NeurIPS'22. Code and models are available at https://github.com/guochengqian/pointnext
Published: 2022

16. ASSANet: An Anisotropic Separable Set Abstraction for Efficient Point Cloud Representation Learning

Author: Qian, Guocheng, Hammoud, Hasan Abed Al Kader, Li, Guohao, Thabet, Ali, and Ghanem, Bernard
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Access to 3D point cloud representations has been widely facilitated by LiDAR sensors embedded in various mobile devices. This has led to an emerging need for fast and accurate point cloud processing techniques. In this paper, we revisit and dive deeper into PointNet++, one of the most influential yet under-explored networks, and develop faster and more accurate variants of the model. We first present a novel Separable Set Abstraction (SA) module that disentangles the vanilla SA module used in PointNet++ into two separate learning stages: (1) learning channel correlation and (2) learning spatial correlation. The Separable SA module is significantly faster than the vanilla version, yet it achieves comparable performance. We then introduce a new Anisotropic Reduction function into our Separable SA module and propose an Anisotropic Separable SA (ASSA) module that substantially increases the network's accuracy. We later replace the vanilla SA modules in PointNet++ with the proposed ASSA module, and denote the modified network as ASSANet. Extensive experiments on point cloud classification, semantic segmentation, and part segmentation show that ASSANet outperforms PointNet++ and other methods, achieving much higher accuracy and faster speeds. In particular, ASSANet outperforms PointNet++ by 7.4 mIoU on S3DIS Area 5, while maintaining 1.6x faster inference speed on a single NVIDIA 2080Ti GPU. Our scaled ASSANet variant achieves 66.8 mIoU and outperforms KPConv, while being more than 54x faster.
Comment: Accepted to NeurIPS'21 as a Spotlight paper. Code available at https://github.com/guochengqian/ASSANet
Published: 2021

17. Check Your Other Door! Creating Backdoor Attacks in the Frequency Domain

Author: Hammoud, Hasan Abed Al Kader and Ghanem, Bernard
Subjects: Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Deep Neural Networks (DNNs) are ubiquitous and span a variety of applications ranging from image classification to real-time object detection. As DNN models become more sophisticated, the computational cost of training these models becomes a burden. For this reason, outsourcing the training process has been the go-to option for many DNN users. Unfortunately, this comes at the cost of vulnerability to backdoor attacks. These attacks aim to establish hidden backdoors in the DNN so that it performs well on clean samples, but outputs a particular target label when a trigger is applied to the input. Existing backdoor attacks either generate triggers in the spatial domain or naively poison frequencies in the Fourier domain. In this work, we propose a pipeline based on Fourier heatmaps to generate a spatially dynamic and invisible backdoor attack in the frequency domain. The proposed attack is extensively evaluated on various datasets and network architectures. Unlike most existing backdoor attacks, the proposed attack can achieve high attack success rates with low poisoning rates and little to no drop in performance while remaining imperceptible to the human eye. Moreover, we show that the models poisoned by our attack are resistant to various state-of-the-art (SOTA) defenses, so we contribute two possible defenses that can evade the attack.
Comment: Accepted to BMVC 2022
Published: 2021

18. On the Decision Boundaries of Neural Networks: A Tropical Geometry Perspective

Author: Alfarra, Motasem, Bibi, Adel, Hammoud, Hasan, Gaafar, Mohamed, and Ghanem, Bernard
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: This work tackles the problem of characterizing and understanding the decision boundaries of neural networks with piecewise linear non-linearity activations. We use tropical geometry, a new development in the area of algebraic geometry, to characterize the decision boundaries of a simple network of the form (Affine, ReLU, Affine). Our main finding is that the decision boundaries are a subset of a tropical hypersurface, which is intimately related to a polytope formed by the convex hull of two zonotopes. The generators of these zonotopes are functions of the network parameters. This geometric characterization provides new perspectives to three tasks. (i) We propose a new tropical perspective to the lottery ticket hypothesis, where we view the effect of different initializations on the tropical geometric representation of a network's decision boundaries. (ii) Moreover, we propose new tropical based optimization reformulations that directly influence the decision boundaries of the network for the task of network pruning. (iii) At last, we discuss the reformulation of the generation of adversarial attacks in a tropical sense. We demonstrate that one can construct adversaries in a new tropical setting by perturbing a specific set of decision boundaries by perturbing a set of parameters in the network.
Comment: First two authors contributed equally to this work
Published: 2020

19. Large eddy simulations of ammonia-hydrogen jet flames at elevated pressure using principal component analysis and deep neural networks

Author: Abdelwahid, Suliman, Malik, Mohammad Rafi, Al Kader Hammoud, Hasan Abed, Hernández-Pérez, Francisco E., Ghanem, Bernard, and Im, Hong G.
Published: 2023

20. Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?

Author: Al Kader Hammoud, Hasan Abed, primary, Prabhu, Ameya, additional, Lim, Ser-Nam, additional, Torr, Philip H.S., additional, Bibi, Adel, additional, and Ghanem, Bernard, additional
Published: 2023

21. Computationally Budgeted Continual Learning: What Does Matter?

Author: Prabhu, Ameya, primary, Al Kader Hammoud, Hasan Abed, additional, Dokania, Puneet, additional, Torr, Philip H.S., additional, Lim, Ser-Nam, additional, Ghanem, Bernard, additional, and Bibi, Adel, additional
Published: 2023

22. Don’t FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor Poisoned Samples in DNNs

Author: Al Kader Hammoud, Hasan Abed, primary, Bibi, Adel, additional, Torr, Philip H.S., additional, and Ghanem, Bernard, additional
Published: 2023
Full Text: View/download PDF

23. Real-Time Evaluation in Online Continual Learning: A New Hope

Author: Ghunaim, Yasir, primary, Bibi, Adel, additional, Alhamoud, Kumail, additional, Alfarra, Motasem, additional, Hammoud, Hasan Abed Al Kader, additional, Prabhu, Ameya, additional, Torr, Philip H.S., additional, and Ghanem, Bernard, additional
Published: 2023
Full Text: View/download PDF

24. CAMEL: Communicative Agents for 'Mind' Exploration of Large Scale Language Model Society

Author: Li, Guohao, Hammoud, Hasan Abed Al Kader, Itani, Hani, Khizbullin, Dmitrii, and Ghanem, Bernard
Subjects: FOS: Computer and information sciences, Computer Science - Computers and Society, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computers and Society (cs.CY), Computer Science - Multiagent Systems, Computation and Language (cs.CL), Machine Learning (cs.LG), Multiagent Systems (cs.MA)
Abstract: The rapid advancement of conversational and chat-based language models has led to remarkable progress in complex task-solving. However, their success heavily relies on human input to guide the conversation, which can be challenging and time-consuming. This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents and provide insight into their "cognitive" processes. To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named role-playing. Our approach involves using inception prompting to guide chat agents toward task completion while maintaining consistency with human intentions. We showcase how role-playing can be used to generate conversational data for studying the behaviors and capabilities of chat agents, providing a valuable resource for investigating conversational language models. Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems, and open-sourcing our library to support research on communicative agents and beyond. The GitHub repository of this project is made publicly available on: https://github.com/lightaime/camel.
Published: 2023

25. Correction: Large eddy simulations of NH3-H2 jet flame at elevated pressure using PCA with inclusion of NH3/H2 ratio variation

Author: Abdelwahid, Suliman, primary, Rafi Malik, Mohammad, additional, Abed Al Kader Hammoud, Hasan, additional, E. Hernández Pérez, Francisco, additional, Ghanem, Bernard, additional, and Im, Hong G., additional
Published: 2023
Full Text: View/download PDF

26. Large eddy simulations of NH3-H2 jet flame at elevated pressure using PCA with inclusion of NH3/H2 ratio variation

Author: Abdelwahid, Suliman, primary, Rafi Malik, Mohammad, additional, Abed Al Kader Hammoud, Hasan, additional, E. Hernández Pérez, Francisco, additional, Ghanem, Bernard, additional, and Im, Hong G., additional
Published: 2023
Full Text: View/download PDF

27. Check Your Other Door: Creating Backdoor Attacks in the Frequency Domain

Author: Hammoud, Hasan Abed Al Kader
Abstract: Deep Neural Networks (DNNs) are ubiquitous and span a variety of applications ranging from image classification and facial recognition to medical image analysis and real-time object detection. As DNN models become more sophisticated and complex, the computational cost of training these models becomes a burden. For this reason, outsourcing the training process has been the go-to option for many DNN users. Unfortunately, this comes at the cost of vulnerability to backdoor attacks. These attacks aim at establishing hidden backdoors in the DNN such that it performs well on clean samples but outputs a particular target label when a trigger is applied to the input. Current backdoor attacks generate triggers in the spatial domain; however, as we show in this work, it is not the only domain to exploit and one should always "check the other doors". To the best of our knowledge, this work is the first to propose a pipeline for generating a spatially dynamic (changing) and invisible (low norm) backdoor attack in the frequency domain. We show the advantages of utilizing the frequency domain for creating undetectable and powerful backdoor attacks through extensive experiments on various datasets and network architectures. Unlike most spatial domain attacks, frequency-based backdoor attacks can achieve high attack success rates with low poisoning rates and little to no drop in performance while remaining imperceptible to the human eye. Moreover, we show that the backdoored models (poisoned by our attacks) are resistant to various state-of-the-art (SOTA) defenses, and so we contribute two possible defenses that can successfully evade the attack. We conclude the work with some remarks regarding a network’s learning capacity and the capability of embedding a backdoor attack in the model.
Published: 2022
Full Text: View/download PDF

28. On the Decision Boundaries of Neural Networks: A Tropical Geometry Perspective

Author: Alfarra, Motasem, primary, Bibi, Adel, additional, Hammoud, Hasan, additional, Gaafar, Mohamed, additional, and Ghanem, Bernard, additional
Published: 2022
Full Text: View/download PDF

29. On the Decision Boundaries of Neural Networks: A Tropical Geometry Perspective

Author: Alfarra, Motasem, Bibi, Adel, Hammoud, Hasan, Gaafar, Mohamed, and Ghanem, Bernard
Abstract: This work tackles the problem of characterizing and understanding the decision boundaries of neural networks with piecewise linear non-linearity activations. We use tropical geometry, a new development in the area of algebraic geometry, to characterize the decision boundaries of a simple network of the form (Affine, ReLU, Affine). Our main finding is that the decision boundaries are a subset of a tropical hypersurface, which is intimately related to a polytope formed by the convex hull of two zonotopes. The generators of these zonotopes are functions of the network parameters. This geometric characterization provides new perspectives to three tasks. (i) We propose a new tropical perspective to the lottery ticket hypothesis, where we view the effect of different initializations on the tropical geometric representation of a network's decision boundaries. (ii) Moreover, we propose new tropical based optimization reformulations that directly influence the decision boundaries of the network for the task of network pruning. (iii) At last, we discuss the reformulation of the generation of adversarial attacks in a tropical sense. We demonstrate that one can construct adversaries in a new tropical setting by perturbing a specific set of decision boundaries by perturbing a set of parameters in the network.
Published: 2023
Full Text: View/download PDF

30. Adaptive Ripple Correlation Control (ARCC) for Solar Maximum Power Point Tracking

Author: Al Kader Hammoud, Hasan Abed, primary and Bazzi, Ali M., additional
Published: 2020
Full Text: View/download PDF

31. Model-based MPPT with Corrective Ripple Correlation Control

Author: Al Kader Hammoud, Hasan Abed, primary and Bazzi, Ali M., additional
Published: 2020
Full Text: View/download PDF
