Author: "Mohamed A." / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

1. Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs

Author: Saeed, Muhammed, Mohamed, Elgizouli, Mohamed, Mukhtar, Raza, Shaina, Shehata, Shady, and Abdul-Mageed, Muhammad
Subjects: Computer Science - Computation and Language
Abstract: Large language models (LLMs) are widely used but raise ethical concerns due to embedded social biases. This study examines LLM biases against Arabs versus Westerners across eight domains, including women's rights, terrorism, and anti-Semitism and assesses model resistance to perpetuating these biases. To this end, we create two datasets: one to evaluate LLM bias toward Arabs versus Westerners and another to test model safety against prompts that exaggerate negative traits ("jailbreaks"). We evaluate six LLMs -- GPT-4, GPT-4o, LlaMA 3.1 (8B & 405B), Mistral 7B, and Claude 3.5 Sonnet. We find 79% of cases displaying negative biases toward Arabs, with LlaMA 3.1-405B being the most biased. Our jailbreak tests reveal GPT-4o as the most vulnerable, despite being an optimized version, followed by LlaMA 3.1-8B and Mistral 7B. All LLMs except Claude exhibit attack success rates above 87% in three categories. We also find Claude 3.5 Sonnet the safest, but it still displays biases in seven of eight categories. Despite being an optimized version of GPT4, We find GPT-4o to be more prone to biases and jailbreaks, suggesting optimization flaws. Our findings underscore the pressing need for more robust bias mitigation strategies and strengthened security measures in LLMs.
Published: 2024

2. P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving

Author: Elshamy, Mohamed R., Emara, Heba M., Shoaib, Mohamed R., and Badawy, Abdel-Hameed A.
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Distracted driving is a critical safety issue that leads to numerous fatalities and injuries worldwide. This study addresses the urgent need for efficient and real-time machine learning models to detect distracted driving behaviors. Leveraging the Pretrained YOLOv8 (P-YOLOv8) model, a real-time object detection system is introduced, optimized for both speed and accuracy. This approach addresses the computational constraints and latency limitations commonly associated with conventional detection models. The study demonstrates P-YOLOv8 versatility in both object detection and image classification tasks using the Distracted Driver Detection dataset from State Farm, which includes 22,424 images across ten behavior categories. Our research explores the application of P-YOLOv8 for image classification, evaluating its performance compared to deep learning models such as VGG16, VGG19, and ResNet. Some traditional models often struggle with low accuracy, while others achieve high accuracy but come with high computational costs and slow detection speeds, making them unsuitable for real-time applications. P-YOLOv8 addresses these issues by achieving competitive accuracy with significant computational cost and efficiency advantages. In particular, P-YOLOv8 generates a lightweight model with a size of only 2.84 MB and a lower number of parameters, totaling 1,451,098, due to its innovative architecture. It achieves a high accuracy of 99.46 percent with this small model size, opening new directions for deployment on inexpensive and small embedded devices using Tiny Machine Learning (TinyML). The experimental results show robust performance, making P-YOLOv8 a cost-effective solution for real-time deployment. This study provides a detailed analysis of P-YOLOv8's architecture, training, and performance benchmarks, highlighting its potential for real-time use in detecting distracted driving.
Published: 2024

3. Predicting Bitcoin Market Trends with Enhanced Technical Indicator Integration and Classification Models

Author: Hafid, Abdelatif, Rahouti, Mohamed, Kong, Linglong, Ebrahim, Maad, and Serhani, Mohamed Adel
Subjects: Computer Science - Machine Learning
Abstract: Thanks to the high potential for profit, trading has become increasingly attractive to investors as the cryptocurrency and stock markets rapidly expand. However, because financial markets are intricate and dynamic, accurately predicting prices remains a significant challenge. The volatile nature of the cryptocurrency market makes it even harder for traders and investors to make decisions. This study presents a machine learning model based on classification to forecast the direction of the cryptocurrency market, i.e., whether prices will increase or decrease. The model is trained using historical data and important technical indicators such as the Moving Average Convergence Divergence, the Relative Strength Index, and Bollinger Bands. We illustrate our approach with an empirical study of the closing price of Bitcoin. Several simulations, including a confusion matrix and Receiver Operating Characteristic curve, are used to assess the model's performance, and the results show a buy/sell signal accuracy of over 92%. These findings demonstrate how machine learning models can assist investors and traders of cryptocurrencies in making wise/informed decisions in a very volatile market., Comment: 12 pages, 8 figures, and 6 tables
Published: 2024

4. Casablanca: Data and Models for Multidialectal Arabic Speech Recognition

Author: Talafha, Bashar, Kadaoui, Karima, Magdy, Samar Mohamed, Habiboullah, Mariem, Chafei, Chafei Mohamed, El-Shangiti, Ahmed Oumar, Zayed, Hiba, tourad, Mohamedou cheikh, Alhamouri, Rahaf, Assi, Rwaa, Alraeesi, Aisha, Mohamed, Hour, Alwajih, Fakhraddin, Mohamed, Abdelrahman, Mekki, Abdellah El, Nagoudi, El Moatez Billah, Saadia, Benelhadj Djelloul Mama, Alsayadi, Hamzah A., Al-Dhabyani, Walid, Shatnawi, Sara, Ech-Chammakhy, Yasir, Makouar, Amal, Berrachedi, Yousra, Jarrar, Mustafa, Shehata, Shady, Berrada, Ismail, and Abdul-Mageed, Muhammad
Subjects: Computer Science - Computation and Language
Abstract: In spite of the recent progress in speech processing, the majority of world languages and dialects remain uncovered. This situation only furthers an already wide technological divide, thereby hindering technological and socioeconomic inclusion. This challenge is largely due to the absence of datasets that can empower diverse speech systems. In this paper, we seek to mitigate this obstacle for a number of Arabic dialects by presenting Casablanca, a large-scale community-driven effort to collect and transcribe a multi-dialectal Arabic dataset. The dataset covers eight dialects: Algerian, Egyptian, Emirati, Jordanian, Mauritanian, Moroccan, Palestinian, and Yemeni, and includes annotations for transcription, gender, dialect, and code-switching. We also develop a number of strong baselines exploiting Casablanca. The project page for Casablanca is accessible at: www.dlnlp.ai/speech/casablanca.
Published: 2024

5. Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments

Author: Elnoor, Mohamed, Weerakoon, Kasun, Seneviratne, Gershom, Xian, Ruiqi, Guan, Tianrui, Jaffar, Mohamed Khalid M, Rajagopal, Vignesh, and Manocha, Dinesh
Subjects: Computer Science - Robotics
Abstract: We present a novel autonomous robot navigation algorithm for outdoor environments that is capable of handling diverse terrain traversability conditions. Our approach, VLM-GroNav, uses vision-language models (VLMs) and integrates them with physical grounding that is used to assess intrinsic terrain properties such as deformability and slipperiness. We use proprioceptive-based sensing, which provides direct measurements of these physical properties, and enhances the overall semantic understanding of the terrains. Our formulation uses in-context learning to ground the VLM's semantic understanding with proprioceptive data to allow dynamic updates of traversability estimates based on the robot's real-time physical interactions with the environment. We use the updated traversability estimations to inform both the local and global planners for real-time trajectory replanning. We validate our method on a legged robot (Ghost Vision 60) and a wheeled robot (Clearpath Husky), in diverse real-world outdoor environments with different deformable and slippery terrains. In practice, we observe significant improvements over state-of-the-art methods by up to 50% increase in navigation success rate.
Published: 2024

6. CROSS-GAiT: Cross-Attention-Based Multimodal Representation Fusion for Parametric Gait Adaptation in Complex Terrains

Author: Seneviratne, Gershom, Weerakoon, Kasun, Elnoor, Mohamed, Rajgopal, Vignesh, Varatharajan, Harshavarthan, Jaffar, Mohamed Khalid M, Pusey, Jason, and Manocha, Dinesh
Subjects: Computer Science - Robotics
Abstract: We present CROSS-GAiT, a novel algorithm for quadruped robots that uses Cross Attention to fuse terrain representations derived from visual and time-series inputs, including linear accelerations, angular velocities, and joint efforts. These fused representations are used to adjust the robot's step height and hip splay, enabling adaptive gaits that respond dynamically to varying terrain conditions. We generate these terrain representations by processing visual inputs through a masked Vision Transformer (ViT) encoder and time-series data through a dilated causal convolutional encoder. The cross-attention mechanism then selects and integrates the most relevant features from each modality, combining terrain characteristics with robot dynamics for better-informed gait adjustments. CROSS-GAiT uses the combined representation to dynamically adjust gait parameters in response to varying and unpredictable terrains. We train CROSS-GAiT on data from diverse terrains, including asphalt, concrete, brick pavements, grass, dense vegetation, pebbles, gravel, and sand. Our algorithm generalizes well and adapts to unseen environmental conditions, enhancing real-time navigation performance. CROSS-GAiT was implemented on a Ghost Robotics Vision 60 robot and extensively tested in complex terrains with high vegetation density, uneven/unstable surfaces, sand banks, deformable substrates, etc. We observe at least a 7.04% reduction in IMU energy density and a 27.3% reduction in total joint effort, which directly correlates with increased stability and reduced energy usage when compared to state-of-the-art methods. Furthermore, CROSS-GAiT demonstrates at least a 64.5% increase in success rate and a 4.91% reduction in time to reach the goal in four complex scenarios. Additionally, the learned representations perform 4.48% better than the state-of-the-art on a terrain classification task.
Published: 2024

7. BehAV: Behavioral Rule Guided Autonomy Using VLMs for Robot Navigation in Outdoor Scenes

Author: Weerakoon, Kasun, Elnoor, Mohamed, Seneviratne, Gershom, Rajagopal, Vignesh, Arul, Senthil Hariharan, Liang, Jing, Jaffar, Mohamed Khalid M, and Manocha, Dinesh
Subjects: Computer Science - Robotics
Abstract: We present BehAV, a novel approach for autonomous robot navigation in outdoor scenes guided by human instructions and leveraging Vision Language Models (VLMs). Our method interprets human commands using a Large Language Model (LLM) and categorizes the instructions into navigation and behavioral guidelines. Navigation guidelines consist of directional commands (e.g., "move forward until") and associated landmarks (e.g., "the building with blue windows"), while behavioral guidelines encompass regulatory actions (e.g., "stay on") and their corresponding objects (e.g., "pavements"). We use VLMs for their zero-shot scene understanding capabilities to estimate landmark locations from RGB images for robot navigation. Further, we introduce a novel scene representation that utilizes VLMs to ground behavioral rules into a behavioral cost map. This cost map encodes the presence of behavioral objects within the scene and assigns costs based on their regulatory actions. The behavioral cost map is integrated with a LiDAR-based occupancy map for navigation. To navigate outdoor scenes while adhering to the instructed behaviors, we present an unconstrained Model Predictive Control (MPC)-based planner that prioritizes both reaching landmarks and following behavioral guidelines. We evaluate the performance of BehAV on a quadruped robot across diverse real-world scenarios, demonstrating a 22.49% improvement in alignment with human-teleoperated actions, as measured by Frechet distance, and achieving a 40% higher navigation success rate compared to state-of-the-art methods.
Published: 2024

8. Outlier-Oriented Poisoning Attack: A Grey-box Approach to Disturb Decision Boundaries by Perturbing Outliers in Multiclass Learning

Author: Paracha, Anum, Arshad, Junaid, Farah, Mohamed Ben, and Ismail, Khalid
Subjects: Computer Science - Machine Learning
Abstract: Poisoning attacks are a primary threat to machine learning models, aiming to compromise their performance and reliability by manipulating training datasets. This paper introduces a novel attack - Outlier-Oriented Poisoning (OOP) attack, which manipulates labels of most distanced samples from the decision boundaries. The paper also investigates the adverse impact of such attacks on different machine learning algorithms within a multiclass classification scenario, analyzing their variance and correlation between different poisoning levels and performance degradation. To ascertain the severity of the OOP attack for different degrees (5% - 25%) of poisoning, we analyzed variance, accuracy, precision, recall, f1-score, and false positive rate for chosen ML models.Benchmarking our OOP attack, we have analyzed key characteristics of multiclass machine learning algorithms and their sensitivity to poisoning attacks. Our experimentation used three publicly available datasets: IRIS, MNIST, and ISIC. Our analysis shows that KNN and GNB are the most affected algorithms with a decrease in accuracy of 22.81% and 56.07% while increasing false positive rate to 17.14% and 40.45% for IRIS dataset with 15% poisoning. Further, Decision Trees and Random Forest are the most resilient algorithms with the least accuracy disruption of 12.28% and 17.52% with 15% poisoning of the IRIS dataset. We have also analyzed the correlation between number of dataset classes and the performance degradation of models. Our analysis highlighted that number of classes are inversely proportional to the performance degradation, specifically the decrease in accuracy of the models, which is normalized with increasing number of classes. Further, our analysis identified that imbalanced dataset distribution can aggravate the impact of poisoning for machine learning models
Published: 2024

9. P-Masking: Power Law Masking Improves Multi-attribute Controlled Generation

Author: Elgaar, Mohamed and Amiri, Hadi
Subjects: Computer Science - Computation and Language
Abstract: We introduce LingGen, a novel approach for controlled text generation that offers precise control over a wide array of linguistic attributes, even as the number of attributes varies. LingGen employs a dynamic P-MASKING strategy, which samples masking rates from a power law distribution during training. This innovative approach enables the model to develop robust representations and adapt its attribute control capabilities across a variable number of attributes, from a single attribute to multiple complex configurations. The P-MASKING technique enhances LingGen's ability to manage different levels of attribute visibility, resulting in superior performance in multi-attribute generation tasks. Our experiments demonstrate that LingGen surpasses current state-of-the-art models in both attribute control accuracy and text fluency, particularly excelling in scenarios with varying attribute demands. Additionally, our ablation studies highlight the effectiveness of P-MASKING and the influence of different base language models on performance. These findings demonstrate LingGen's potential for applications requiring precise and adaptable control over multiple linguistic attributes in text generation.
Published: 2024

10. Multi-Attribute Linguistic Tuning for Controlled Paraphrase Generation

Author: Elgaar, Mohamed and Amiri, Hadi
Subjects: Computer Science - Computation and Language
Abstract: We present a novel approach to paraphrase generation that enables precise control and fine-tuning of 40 linguistic attributes for English. Our model is an encoder-decoder architecture that takes as input a source sentence and desired linguistic attributes, and produces paraphrases of the source that satisfy the desired attributes. To guarantee high-quality outputs at inference time, our method is equipped with a quality control mechanism that gradually adjusts the embedding of linguistic attributes to find the nearest and most attainable configuration of desired attributes for paraphrase generation. We evaluate the effectiveness of our method by comparing it to recent controllable generation models. Experimental results demonstrate that the proposed model outperforms baselines in generating paraphrases that satisfy desired linguistic attributes.
Published: 2024

11. Optical forces on atoms subject to higher-order Poincar{\'e} vortex modes

Author: Bougouffa, Smail and Babiker, Mohamed
Subjects: Quantum Physics
Abstract: The interaction of atoms with higher-order Poincar\'e optical vortex modes of order $m\geq 0$ is explored for light close to resonance with atomic dipole transitions. It is well-known that atoms subject to optical vortex modes experience both translational and rotational forces acting on the atomic centre of mass, leading to atom dynamics and atom trapping. Here we consider the optical forces on atoms immersed in general paraxial higher-order Poincar\'e optical vector modes. The coupling to atoms gives rise to wide-ranging scenarios involving such modes in which any specific polarisation is within a spectrum of wave polarisation and all the interactions are treatable within a single formulation. We show that this gives rise to a variety of physical situations, governed by the mode order $m$, the polarisation represented by the angular coordinates of the mode on the surface of the unit Poincar\'e sphere, the atomic transitions involved, and their selection rules. We present the analytical steps leading to the optical forces on sodium atoms and display their variations in various situations., Comment: 10 pages, 5 figures
Published: 2024

12. Educating for Hardware Specialization in the Chiplet Era: A Path for the HPC Community

Author: Yoshii, Kazutomo and El-Hadedy, Mohamed
Subjects: Computer Science - Hardware Architecture
Abstract: The advent of chiplet technology introduces cutting-edge opportunities for constructing highly heterogeneous platforms with specialized accelerators. However, the HPC community currently lacks expertise in hardware development, a gap that must be bridged to leverage these advancements. Additionally, technologies like chiplet is cutting-edge with limited educational resource available. This paper addresses potential hardware specialization direction in HPC and how to cultivate these skills among students and staff, emphasizing the importance of understanding and developing custom hardware (e.g., rapid prototyping and resource estimation). We have been mentoring graduate-level students and new staff in hardware designs in a hands-on manner, encouraging them to utilize modern open-source hardware tools for their designs, which facilitates the sharing of research ideas. Additionally, we provide a summary of theses tools as part of our approach to prototyping and mentoring., Comment: SC24 eduHPC 2 pages
Published: 2024

13. PDSR: Efficient UAV Deployment for Swift and Accurate Post-Disaster Search and Rescue

Author: Abdellatif, Alaa Awad, Elmancy, Ali, Mohamed, Amr, Massoud, Ahmed, Lebda, Wadha, and Naji, Khalid K.
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Systems and Control
Abstract: This paper introduces a comprehensive framework for Post-Disaster Search and Rescue (PDSR), aiming to optimize search and rescue operations leveraging Unmanned Aerial Vehicles (UAVs). The primary goal is to improve the precision and availability of sensing capabilities, particularly in various catastrophic scenarios. Central to this concept is the rapid deployment of UAV swarms equipped with diverse sensing, communication, and intelligence capabilities, functioning as an integrated system that incorporates multiple technologies and approaches for efficient detection of individuals buried beneath rubble or debris following a disaster. Within this framework, we propose architectural solution and address associated challenges to ensure optimal performance in real-world disaster scenarios. The proposed framework aims to achieve complete coverage of damaged areas significantly faster than traditional methods using a multi-tier swarm architecture. Furthermore, integrating multi-modal sensing data with machine learning for data fusion could enhance detection accuracy, ensuring precise identification of survivors., Comment: This paper is currently under review at IEEE IoT Magazine
Published: 2024

14. Investigating the Electroweak Phase Transition with a Real Scalar Singlet at a Muon Collider

Author: Aboudonia, Mohamed, Balazs, Csaba, Papaefstathiou, Andreas, and White, Graham
Subjects: High Energy Physics - Phenomenology, Astrophysics - Cosmology and Nongalactic Astrophysics
Abstract: A strong first-order electroweak phase transition (SFOEWPT) is essential for explaining baryogenesis and for potentially generating observable gravitational waves. This study investigates the potential of a high-energy muon collider to examine the occurrence of SFOEWPT within the context of a Standard Model extended by a real scalar singlet (xSM). We analyzed all possible decay modes of the singlet to constrain the valid parameter space of SFOEWPT, which was extracted numerically at different renormalization scales to account for theoretical uncertainties, thereby determining the sensitivity of a muon collider to the production and decay channels of novel heavy scalar particles that emerge in the xSM. The findings demonstrate that a 3 TeV muon collider can directly examine the nature of electroweak symmetry breaking by efficiently detecting novel scalar particles associated with a first-order electroweak phase transition through jet-rich final states, thus complementing the indirect constraints from gravitational wave experiments.
Published: 2024

15. SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset

Author: Huynh, Ngoc Dung, Bouadjenek, Mohamed Reda, Aryal, Sunil, Razzak, Imran, and Hacid, Hakim
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Visual Question Answering (VQA) has emerged as a promising area of research to develop AI-based systems for enabling interactive and immersive learning. Numerous VQA datasets have been introduced to facilitate various tasks, such as answering questions or identifying unanswerable ones. However, most of these datasets are constructed using real-world images, leaving the performance of existing models on cartoon images largely unexplored. Hence, in this paper, we present "SimpsonsVQA", a novel dataset for VQA derived from The Simpsons TV show, designed to promote inquiry-based learning. Our dataset is specifically designed to address not only the traditional VQA task but also to identify irrelevant questions related to images, as well as the reverse scenario where a user provides an answer to a question that the system must evaluate (e.g., as correct, incorrect, or ambiguous). It aims to cater to various visual applications, harnessing the visual content of "The Simpsons" to create engaging and informative interactive systems. SimpsonsVQA contains approximately 23K images, 166K QA pairs, and 500K judgments (https://simpsonsvqa.org). Our experiments show that current large vision-language models like ChatGPT4o underperform in zero-shot settings across all three tasks, highlighting the dataset's value for improving model performance on cartoon images. We anticipate that SimpsonsVQA will inspire further research, innovation, and advancements in inquiry-based learning VQA.
Published: 2024

16. Coulomb interactions in systems of generalized semi-Dirac fermions

Author: Elsayed, Mohamed M., Uchoa, Bruno, and Kotov, Valeri N.
Subjects: Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science
Abstract: Interactions have strong effects in systems with flat bands. We examine the role of Coulomb interactions in two dimensional chiral anisotropic quasiparticles that disperse linearly in one direction and have relatively flat bands near the neutrality point in the other direction, dispersing with an arbitrary positive even power law $2n\geq 2$. As in the conventional semi-Dirac case ($n = 1$), we show using renormalization group that strong logarithmic divergences in the self-energy of generalized semi-Dirac fermions resum and lead to a restoration of the linearity of the spectrum for arbitrary n over a sizable energy window in the perturbative regime. We discuss those results in light of previous non-perturbative large $N_f$ results and address the implications for physical observables., Comment: 11 pages, 6 figures
Published: 2024

17. Generalized Method of Moments and Percentile Method: Estimating parameters of the Novel Median Based Unit Weibull Distribution

Author: Attia, Iman Mohamed
Subjects: Statistics - Methodology, Mathematics - Probability
Abstract: The Median Based Unit Weibull is a new 2 parameter unit Weibull distribution defined on the unit interval (0,1). Estimation of the parameters using MLE encountered some problems like large variance. Using generalized method of moments (GMMs) and percentile method may ameliorate this condition. This paper introduces GMMs and the percentile methods for estimating the parameters of the new distribution with illustrative real data analysis., Comment: arXiv admin note: text overlap with arXiv:2410.19019
Published: 2024

18. Fine-Grained Clustering-Based Power Identification for Multicores

Author: Elshamy, Mohamed R., Elahi, Mehdi, Patooghy, Ahmad, and Badawy, Abdel-Hameed A.
Subjects: Computer Science - Performance
Abstract: Fine-grained power estimation in multicore Systems on Chips (SoCs) is crucial for efficient thermal management. BPI (Blind Power Identification) is a recent approach that determines the power consumption of different cores and the thermal model of the chip using only thermal sensor measurements and total power consumption. BPI relies on steady-state thermal data along with a naive initialization in its Non-negative Matrix Factorization (NMF) process, which negatively impacts the power estimation accuracy of BPI. This paper proposes a two-fold approach to reduce these impacts on BPI. First, this paper introduces an innovative approach for NMF initializing, i.e., density-oriented spatial clustering to identify centroid data points of active cores as initial values. This enhances BPI accuracy by focusing on dense regions in the dataset and excluding outlier data points. Second, it proposes the utilization of steady-state temperature data points to enhance the power estimation accuracy by leveraging the physical relationship between temperature and power consumption. Our extensive simulations of real-world cases demonstrate that our approach enhances BPI accuracy in estimating the power per core with no performance cost. For instance, in a four-core processor, the proposed approach reduces the error rate by 76% compared to BPI and by 24% compared to the state of the art in the literature, namely, Blind Power Identification Steady State (BPISS). The results underline the potential of integrating advanced clustering techniques in thermal model identification, paving the way for more accurate and reliable thermal management in multicores and SoCs.
Published: 2024

19. AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?

Author: Bao, Han, Huang, Yue, Wang, Yanbo, Ye, Jiayi, Wang, Xiangqi, Chen, Xiuying, Elhoseiny, Mohamed, and Zhang, Xiangliang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Large Vision-Language Models (LVLMs) have become essential for advancing the integration of visual and linguistic information, facilitating a wide range of complex applications and tasks. However, the evaluation of LVLMs presents significant challenges as the evaluation benchmark always demands lots of human cost for its construction, and remains static, lacking flexibility once constructed. Even though automatic evaluation has been explored in textual modality, the visual modality remains under-explored. As a result, in this work, we address a question: "Can LVLMs serve as a path to automatic benchmarking?". We introduce AutoBench-V, an automated framework for serving evaluation on demand, i.e., benchmarking LVLMs based on specific aspects of model capability. Upon receiving an evaluation capability, AutoBench-V leverages text-to-image models to generate relevant image samples and then utilizes LVLMs to orchestrate visual question-answering (VQA) tasks, completing the evaluation process efficiently and flexibly. Through an extensive evaluation of seven popular LVLMs across five demanded user inputs (i.e., evaluation capabilities), the framework shows effectiveness and reliability. We observe the following: (1) Our constructed benchmark accurately reflects varying task difficulties; (2) As task difficulty rises, the performance gap between models widens; (3) While models exhibit strong performance in abstract level understanding, they underperform in details reasoning tasks; and (4) Constructing a dataset with varying levels of difficulties is critical for a comprehensive and exhaustive evaluation. Overall, AutoBench-V not only successfully utilizes LVLMs for automated benchmarking but also reveals that LVLMs as judges have significant potential in various domains.
Published: 2024

20. Time-delay Induced Stochastic Optimization and Extremum Seeking

Author: Dimitrieski, Naum, Reyer, Michael, Belabbas, Mohamed-Ali, and Ebenbauer, Christian
Subjects: Mathematics - Optimization and Control
Abstract: In this paper a novel stochastic optimization and extremum seeking algorithm is presented, one which is based on time-delayed random perturbations and step size adaptation. For the case of a one-dimensional quadratic unconstrained optimization problem, global exponential convergence in expectation and global exponential practical convergence of the variance of the trajectories are proven. The theoretical results are complemented by numerical simulations for one- and multi-dimensional quadratic and non-quadratic objective functions., Comment: Preprint to be submitted to the 2025 European Control Conference (ECC25)
Published: 2024

21. Learning Approximated Maximal Safe Sets via Hypernetworks for MPC-Based Local Motion Planning

Author: Derajić, Bojan, Bouzidi, Mohamed-Khalil, Bernhard, Sebastian, and Hönig, Wolfgang
Subjects: Computer Science - Robotics, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Systems and Control
Abstract: This paper presents a novel learning-based approach for online estimation of maximal safe sets for local motion planning tasks in mobile robotics. We leverage the idea of hypernetworks to achieve good generalization properties and real-time performance simultaneously. As the source of supervision, we employ the Hamilton-Jacobi (HJ) reachability analysis, allowing us to consider general nonlinear dynamics and arbitrary constraints. We integrate our model into a model predictive control (MPC) local planner as a safety constraint and compare the performance with relevant baselines in realistic 3D simulations for different environments and robot dynamics. The results show the advantages of our approach in terms of a significantly higher success rate: 2 to 18 percent over the best baseline, while achieving real-time performance.
Published: 2024

22. Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting

Author: Aissi, Mohamed Salim, Romac, Clement, Carta, Thomas, Lamprier, Sylvain, Oudeyer, Pierre-Yves, Sigaud, Olivier, Soulier, Laure, and Thome, Nicolas
Subjects: Computer Science - Machine Learning
Abstract: Reinforcement learning (RL) is a promising approach for aligning large language models (LLMs) knowledge with sequential decision-making tasks. However, few studies have thoroughly investigated the impact on LLM agents capabilities of fine-tuning them with RL in a specific environment. In this paper, we propose a novel framework to analyze the sensitivity of LLMs to prompt formulations following RL training in a textual environment. Our findings reveal that the performance of LLMs degrades when faced with prompt formulations different from those used during the RL training phase. Besides, we analyze the source of this sensitivity by examining the model's internal representations and salient tokens. Finally, we propose to use a contrastive loss to mitigate this sensitivity and improve the robustness and generalization capabilities of LLMs.
Published: 2024

23. Collaborative Inference over Wireless Channels with Feature Differential Privacy

Author: Seif, Mohamed, Nie, Yuqi, Goldsmith, Andrea J., and Poor, H. Vincent
Subjects: Computer Science - Cryptography and Security, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: Collaborative inference among multiple wireless edge devices has the potential to significantly enhance Artificial Intelligence (AI) applications, particularly for sensing and computer vision. This approach typically involves a three-stage process: a) data acquisition through sensing, b) feature extraction, and c) feature encoding for transmission. However, transmitting the extracted features poses a significant privacy risk, as sensitive personal data can be exposed during the process. To address this challenge, we propose a novel privacy-preserving collaborative inference mechanism, wherein each edge device in the network secures the privacy of extracted features before transmitting them to a central server for inference. Our approach is designed to achieve two primary objectives: 1) reducing communication overhead and 2) ensuring strict privacy guarantees during feature transmission, while maintaining effective inference performance. Additionally, we introduce an over-the-air pooling scheme specifically designed for classification tasks, which provides formal guarantees on the privacy of transmitted features and establishes a lower bound on classification accuracy., Comment: This work is under review for possible IEEE publication. arXiv admin note: substantial text overlap with arXiv:2406.00256
Published: 2024

24. Some Results on the $1$-Laplacian Elliptic Problems with Singularities and Robin Boundary Conditions

Author: Hichami, Mohamed El and Hadfi, Youssef El
Subjects: Mathematics - Analysis of PDEs
Abstract: In this paper, we investigate the existence and uniqueness of solutions for the following model problem, involving singularities and inhomogeneous Robin boundary conditions \begin{equation*} \left\{ \begin{array}{ll} -\Delta_{p}u_{p}=\frac{f}{u_{p}^{\gamma}}& \hbox{in $\Omega,$} \frac{\partial u_{p}}{\partial \sigma}+\lambda\vert u_{p}\vert^{p-2} u_{p}+\vert u_{p}\vert^{s-1}u_{p}=\frac{g}{u_{p}^{\eta}} & \hbox{on $\partial\Omega,$} \end{array} \right. \end{equation*} where $\Omega \subset \mathbb{R}^{m}$ represents an open bounded domain, with smooth boundary, $m \geq 2$, the symbol $\sigma $ stands for the unit outward normal vector, $ \Delta_{p}u:=\mbox{div}(\vert\nabla u\vert^{p-2}\nabla u) $ is the $p-$Laplacian operator $(1\leq p0$ and $s\geq 1.$ The function $ f\in L^{\frac{m}{p}}(\Omega)$ is a nonnegative additionally $ \lambda$ and $ g$ are nonnegative functions in $L^{\infty}(\partial \Omega).$
Published: 2024

25. Arabic Music Classification and Generation using Deep Learning

Author: Elshaarawy, Mohamed, Saeed, Ashrakat, Sheta, Mariam, Said, Abdelrahman, Bakr, Asem, Bahaa, Omar, and Gomaa, Walid
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This paper proposes a machine learning approach for classifying classical and new Egyptian music by composer and generating new similar music. The proposed system utilizes a convolutional neural network (CNN) for classification and a CNN autoencoder for generation. The dataset used in this project consists of new and classical Egyptian music pieces composed by different composers. To classify the music by composer, each sample is normalized and transformed into a mel spectrogram. The CNN model is trained on the dataset using the mel spectrograms as input features and the composer labels as output classes. The model achieves 81.4\% accuracy in classifying the music by composer, demonstrating the effectiveness of the proposed approach. To generate new music similar to the original pieces, a CNN autoencoder is trained on a similar dataset. The model is trained to encode the mel spectrograms of the original pieces into a lower-dimensional latent space and then decode them back into the original mel spectrogram. The generated music is produced by sampling from the latent space and decoding the samples back into mel spectrograms, which are then transformed into audio. In conclusion, the proposed system provides a promising approach to classifying and generating classical Egyptian music, which can be applied in various musical applications, such as music recommendation systems, music production, and music education.
Published: 2024

26. Age of Coded Updates In Gossip Networks Under Memory and Memoryless Schemes

Author: Bayram, Erkan, Bastopcu, Melih, Belabbas, Mohamed-Ali, and Başar, Tamer
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Systems and Control
Abstract: We consider an information update system on a gossip network, where a source node encodes information into $n$ total keys such that any subset of at least $k+1$ keys can fully reconstruct the original information. This encoding process follows the principles of a $k$-out-of-$n$ threshold system. The encoded updates are then disseminated across the network through peer-to-peer communication. We have two different types of nodes in a network: subscriber nodes, which receive a unique key from the source node for every status update instantaneously, and nonsubscriber nodes, which receive a unique key for an update only if the node is selected by the source, and this selection is renewed for each update. For the message structure between nodes, we consider two different schemes: a memory scheme (in which the nodes keep the source's current and previous encrypted messages) and a memoryless scheme (in which the nodes are allowed to only keep the source's current message). We measure the timeliness of information updates by using a recent performance metric, the version age of information. We present explicit formulas for the time average AoI in a scalable homogeneous network as functions of the network parameters under a memoryless scheme. Additionally, we provide strict lower and upper bounds for the time average AoI under a memory scheme., Comment: A part of this work is presented at the ACSSC24. This work has been submitted to IEEE for possible publication. arXiv admin note: text overlap with arXiv:2402.11462
Published: 2024

27. Quasi-orthogonal extension of skew-symmetric matrices

Author: Boussaïri, Abderrahim, Chergui, Brahim, Sarir, Zaineb, and Zouagui, Mohamed
Subjects: Mathematics - Combinatorics, 15A18, 15B10
Abstract: A real matrix $Q$ is quasi-orthogonal if $Q^{\top}Q=qI$, for some positive real number $q$. We prove that any $n\times n$ skew-symmetric matrix $S$ is a principal sub-matrix of a skew-symmetric quasi-orthogonal matrix $Q$, called a quasi-orthogonal extension of $S$. Moreover, we determine the least integer $d$ such that $S$ has a quasi-orthogonal extension of order $n+d$. This integer is called the quasi-orthogonality index of $S$. Lastly, we give a spectral characterization of skew-adjacency matrices of tournaments with quasi-orthogonality index at most three.
Published: 2024

28. VECTOR: Velocity-Enhanced GRU Neural Network for Real-Time 3D UAV Trajectory Prediction

Author: Nacar, Omer, Abdelkader, Mohamed, Ghouti, Lahouari, Gabr, Kahled, Al-Batati, Abdulrahman S., and Koubaa, Anis
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: This paper tackles the challenge of real-time 3D trajectory prediction for UAVs, which is critical for applications such as aerial surveillance and defense. Existing prediction models that rely primarily on position data struggle with accuracy, especially when UAV movements fall outside the position domain used in training. Our research identifies a gap in utilizing velocity estimates, first-order dynamics, to better capture the dynamics and enhance prediction accuracy and generalizability in any position domain. To bridge this gap, we propose a new trajectory prediction method using Gated Recurrent Units (GRUs) within sequence-based neural networks. Unlike traditional methods that rely on RNNs or transformers, this approach forecasts future velocities and positions based on historical velocity data instead of positions. This is designed to enhance prediction accuracy and scalability, overcoming challenges faced by conventional models in handling complex UAV dynamics. The methodology employs both synthetic and real-world 3D UAV trajectory data, capturing a wide range of flight patterns, speeds, and agility. Synthetic data is generated using the Gazebo simulator and PX4 Autopilot, while real-world data comes from the UZH-FPV and Mid-Air drone racing datasets. The GRU-based models significantly outperform state-of-the-art RNN approaches, with a mean square error (MSE) as low as 2 x 10^-8. Overall, our findings confirm the effectiveness of incorporating velocity data in improving the accuracy of UAV trajectory predictions across both synthetic and real-world scenarios, in and out of position data distributions. Finally, we open-source our 5000 trajectories dataset and a ROS 2 package to facilitate the integration with existing ROS-based UAV systems.
Published: 2024

29. Sensing Accuracy Optimization for Communication-assisted Dual-baseline UAV-InSAR

Author: Lahmeri, Mohamed-Amine, Mustieles-Pérez, Víctor, Vossiek, Martin, Krieger, Gerhard, and Schober, Robert
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: In this paper, we study the optimization of the sensing accuracy of unmanned aerial vehicle (UAV)-based dual-baseline interferometric synthetic aperture radar (InSAR) systems. A swarm of three UAV-synthetic aperture radar (SAR) systems is deployed to image an area of interest from different angles, enabling the creation of two independent digital elevation models (DEMs). To reduce the InSAR sensing error, i.e., the height estimation error, the two DEMs are fused based on weighted average techniques into one final DEM. The heavy computations required for this process are performed on the ground. To this end, the radar data is offloaded in real time via a frequency division multiple access (FDMA) air-to-ground backhaul link. In this work, we focus on improving the sensing accuracy by minimizing the worst-case height estimation error of the final DEM. To this end, the UAV formation and the power allocated for offloading are jointly optimized based on alternating optimization (AO), while meeting practical InSAR sensing and communication constraints. Our simulation results demonstrate that the proposed solution can improve the sensing accuracy by over 39% compared to a classical single-baseline UAV-InSAR system and by more than 12% compared to other benchmark schemes.
Published: 2024

30. Can Self Supervision Rejuvenate Similarity-Based Link Prediction?

Author: Zhang, Chenhan, Wang, Weiqi, Tian, Zhiyi, Yu, James Jianqiao, Kaafar, Mohamed Ali, Liu, An, and Yu, Shui
Subjects: Computer Science - Artificial Intelligence
Abstract: Although recent advancements in end-to-end learning-based link prediction (LP) methods have shown remarkable capabilities, the significance of traditional similarity-based LP methods persists in unsupervised scenarios where there are no known link labels. However, the selection of node features for similarity computation in similarity-based LP can be challenging. Less informative node features can result in suboptimal LP performance. To address these challenges, we integrate self-supervised graph learning techniques into similarity-based LP and propose a novel method: Self-Supervised Similarity-based LP (3SLP). 3SLP is suitable for the unsupervised condition of similarity-based LP without the assistance of known link labels. Specifically, 3SLP introduces a dual-view contrastive node representation learning (DCNRL) with crafted data augmentation and node representation learning. DCNRL is dedicated to developing more informative node representations, replacing the node attributes as inputs in the similarity-based LP backbone. Extensive experiments over benchmark datasets demonstrate the salient improvement of 3SLP, outperforming the baseline of traditional similarity-based LP by up to 21.2% (AUC).
Published: 2024

31. Enhanced Gaussian interferometric power, entanglement and Gaussian quantum steering in magnonics system with squeezed light

Author: Chabar, Noura, Amghar, M bark, and Amazioug, Mohamed
Subjects: Quantum Physics
Abstract: In this study, we propose a scheme to improve quantum correlations (QCs) between two magnons in a tripartite magnonical system. We use Gaussian interferometric power (GIP) to quantify QCs beyond entanglement. Additionally, Gaussian quantum steering is discussed. We investigate the enhancement of QCs via a squeezing parameter and an optical parametric amplifier (OPA). The Mancini criterion is considered to confirm the presence of shared entanglement between the two magnons. We observed a squeezing of about 7dB for the first magnon. Additionally, the squeezed vacuum field and the OPA improve the generation of genuine tripartite entanglement. We hope that current experimental technology will allow the proposed scheme to be implemented., Comment: 10 pages, 7 figures
Published: 2024
Full Text: View/download PDF

32. Large graph limits of local matching algorithms on uniform random graphs

Author: Aoudi, Mohamed Habib Aliou Diallo, Moyal, Pascal, and Robin, Vincent
Subjects: Mathematics - Probability, 05C80, 60J25, 91B68
Abstract: In this work, we propose a large-graph limit estimate of the matching coverage for several matching algorithms, on general graphs generated by the configuration model. For a wide class of {\em local} matching algorithms, namely, algorithms that only use information on the immediate neighborhood of the explored nodes, we propose a joint construction of the graph by the configuration model, and of the resulting matching on the latter graph. This leads to a generalization in infinite dimension of the differential equation method of Wormald: We keep track of the matching algorithm over time by a measure-valued CTMC, for which we prove the convergence, to the large-graph limit, to a deterministic hydrodynamic limit, identified as the unique solution of a system of ODE's in the space of integer measures. Then, the asymptotic proportion of nodes covered by the matching appears as a simple function of that solution. We then make this solution explicit for three particular local algorithms: the classical {\sc greedy} algorithm, and then the {\sc uni-min} and {\sc uni-max} algorithms, two variants of the greedy algorithm that select, as neighbor of any explored node, its neighbor having the least (respectively largest) residual degree.
Published: 2024

33. Gradient-Based Meta Learning for Uplink RSMA with Beyond Diagonal RIS

Author: Khisa, Shreya, Amhaz, Ali, Elhattab, Mohamed, Assi, Chadi, and Sharafeddine, Sanaa
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: Beyond diagonal reconfigurable intelligent surface (BD-RIS) has emerged as an innovative and generalized RIS framework that provides greater flexibility in wave manipulation and enhanced coverage. In comparison to conventional RIS, optimization of BD-RIS is more challenging due to the large number of optimization variables associated with it. Typically, optimization of large-scale optimization problems utilizing traditional optimization methods results in high complexity. To tackle this issue, we propose a gradient-based meta learning algorithm which works without pre-training and is able to solve large-scale optimization problems. With the objective to maximize the sum rate of the system, to the best of our knowledge, this is the first work considering joint optimization of receiving beamforming vectors at the base station (BS), scattering matrix of BD-RIS and transmission power of users equipment (UEs) in uplink rate-splitting multiple access (RSMA) communication. Numerical results demonstrate that our proposed scheme can outperform the conventional RIS RSMA framework by 22.5$\%$.
Published: 2024

34. LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Author: Shen, Xiaoqian, Xiong, Yunyang, Zhao, Changsheng, Wu, Lemeng, Chen, Jun, Zhu, Chenchen, Liu, Zechun, Xiao, Fanyi, Varadarajan, Balakrishnan, Bordes, Florian, Liu, Zhuang, Xu, Hu, Kim, Hyunwoo J., Soran, Bilge, Krishnamoorthi, Raghuraman, Elhoseiny, Mohamed, and Chandra, Vikas
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multimodal Large Language Models (MLLMs) have shown promising progress in understanding and analyzing video content. However, processing long videos remains a significant challenge constrained by LLM's context size. To address this limitation, we propose LongVU, a spatiotemporal adaptive compression mechanism thats reduces the number of video tokens while preserving visual details of long videos. Our idea is based on leveraging cross-modal query and inter-frame dependencies to adaptively reduce temporal and spatial redundancy in videos. Specifically, we leverage DINOv2 features to remove redundant frames that exhibit high similarity. Then we utilize text-guided cross-modal query for selective frame feature reduction. Further, we perform spatial token reduction across frames based on their temporal dependencies. Our adaptive compression strategy effectively processes a large number of frames with little visual information loss within given context length. Our LongVU consistently surpass existing methods across a variety of video understanding benchmarks, especially on hour-long video understanding tasks such as VideoMME and MLVU. Given a light-weight LLM, our LongVU also scales effectively into a smaller size with state-of-the-art video understanding performance., Comment: Project page: https://vision-cair.github.io/LongVU
Published: 2024

35. Small Contributions, Small Networks: Efficient Neural Network Pruning Based on Relative Importance

Author: Hussien, Mostafa, Afifi, Mahmoud, Nguyen, Kim Khoa, and Cheriet, Mohamed
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Neural and Evolutionary Computing
Abstract: Recent advancements have scaled neural networks to unprecedented sizes, achieving remarkable performance across a wide range of tasks. However, deploying these large-scale models on resource-constrained devices poses significant challenges due to substantial storage and computational requirements. Neural network pruning has emerged as an effective technique to mitigate these limitations by reducing model size and complexity. In this paper, we introduce an intuitive and interpretable pruning method based on activation statistics, rooted in information theory and statistical analysis. Our approach leverages the statistical properties of neuron activations to identify and remove weights with minimal contributions to neuron outputs. Specifically, we build a distribution of weight contributions across the dataset and utilize its parameters to guide the pruning process. Furthermore, we propose a Pruning-aware Training strategy that incorporates an additional regularization term to enhance the effectiveness of our pruning method. Extensive experiments on multiple datasets and network architectures demonstrate that our method consistently outperforms several baseline and state-of-the-art pruning techniques.
Published: 2024

36. On rings of integer-valued rational functions

Author: Chems-Eddin, Mohamed Mahmoud, Feryouch, Badr, Mouanis, Hakima, and Tamoussit, Ali
Subjects: Mathematics - Commutative Algebra
Abstract: Let $D\subseteq B$ be an extension of integral domains and $E$ a subset of the quotient field of $D$. We introduce the ring of \textit{$D$-valued $B$-rational functions on $E$}, denoted by $Int^R_B(E,D)$, which naturally extends the concepts of integer-valued polynomials, defined as $ Int^R_B(E,D) \:=\lbrace f \in B(X);\; f(E)\subseteq D\rbrace.$ The notion of $Int^R_B(E,D)$ boils down to the usual notion of integer-valued rational functions when the subset $E$ is infinite. In this paper, we aim to investigate various properties of these rings, such as prime ideals, localization, and the module structure. Furthermore, we study the transfer of some ring-theoretic properties from $Int^R(E,D)$ to $D$., Comment: 21 pages
Published: 2024

37. Increasing Interpretability of Neural Networks By Approximating Human Visual Saliency

Author: Boyd, Aidan, Trabelsi, Mohamed, Uzunalioglu, Huseyin, and Kushnir, Dan
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Understanding specifically where a model focuses on within an image is critical for human interpretability of the decision-making process. Deep learning-based solutions are prone to learning coincidental correlations in training datasets, causing over-fitting and reducing the explainability. Recent advances have shown that guiding models to human-defined regions of saliency within individual images significantly increases performance and interpretability. Human-guided models also exhibit greater generalization capabilities, as coincidental dataset features are avoided. Results show that models trained with saliency incorporation display an increase in interpretability of up to 30% over models trained without saliency information. The collection of this saliency information, however, can be costly, laborious and in some cases infeasible. To address this limitation, we propose a combination strategy of saliency incorporation and active learning to reduce the human annotation data required by 80% while maintaining the interpretability and performance increase from human saliency. Extensive experimentation outlines the effectiveness of the proposed approach across five public datasets and six active learning criteria.
Published: 2024

38. On the boundedness of periodic Fourier integral operators in Lebesgue spaces with variable exponent

Author: Tai, Boukary, Congo, Mohamed, Ouedraogo, Marie Françoise, and Ouedraogo, Arouna
Subjects: Mathematics - Functional Analysis
Abstract: The aim of this paper is to investigate the boundedness of periodic Fourier integral operators in Lebesgue spaces with variable exponent $L^{p(\cdot)}$ on the $n$-dimensional torus. We deal with operators of type $(\rho, \delta)$ which symbols belong to the H\"{o}rmander class $S^{m}_{\rho, \delta}(\mathbb{T}^{n}\times\mathbb{Z}^{n})$ for $0\leq\delta<\rho\leq1.$, Comment: 15 pages
Published: 2024

39. Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence

Author: Tihanyi, Norbert, Bisztray, Tamas, Dubniczky, Richard A., Toth, Rebeka, Borsos, Bertalan, Cherif, Bilel, Ferrag, Mohamed Amine, Muzsai, Lajos, Jain, Ridhi, Marinelli, Ryan, Cordeiro, Lucas C., and Debbah, Merouane
Subjects: Computer Science - Artificial Intelligence, Computer Science - Multiagent Systems
Abstract: As machine intelligence evolves, the need to test and compare the problem-solving abilities of different AI models grows. However, current benchmarks are often overly simplistic, allowing models to perform uniformly well, making it difficult to distinguish their capabilities. Additionally, benchmarks typically rely on static question-answer pairs, which models might memorize or guess. To address these limitations, we introduce the Dynamic Intelligence Assessment (DIA), a novel methodology for testing AI models using dynamic question templates and improved metrics across multiple disciplines such as mathematics, cryptography, cybersecurity, and computer science. The accompanying DIA-Bench dataset, which includes 150 diverse and challenging task templates with mutable parameters, is presented in various formats such as text, PDFs, compiled binaries, and visual puzzles. Our framework introduces four new metrics to assess a model's reliability and confidence across multiple attempts. These metrics revealed that even simple questions are frequently answered incorrectly when posed in varying forms, highlighting significant gaps in models' reliability. Notably, models like GPT-4o tended to overestimate their mathematical abilities, while ChatGPT-4o demonstrated better decision-making and performance through effective tool usage. We evaluated eight state-of-the-art large language models (LLMs) using DIA-Bench, showing that current models struggle with complex tasks and often display unexpectedly low confidence, even with simpler questions. The DIA framework sets a new standard for assessing not only problem-solving but also a model's adaptive intelligence and ability to assess its own limitations. The dataset is publicly available on our project's website.
Published: 2024

40. LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content

Author: Kmainasi, Mohamed Bayan, Shahroor, Ali Ezzat, Hasanain, Maram, Laskar, Sahinur Rahman, Hassan, Naeemul, and Alam, Firoj
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, 68T50, F.2.2, I.2.7
Abstract: Large Language Models (LLMs) have demonstrated remarkable success as general-purpose task solvers across various fields, including NLP, healthcare, finance, and law. However, their capabilities remain limited when addressing domain-specific problems, particularly in downstream NLP tasks. Research has shown that models fine-tuned on instruction-based downstream NLP datasets outperform those that are not fine-tuned. While most efforts in this area have primarily focused on resource-rich languages like English and broad domains, little attention has been given to multilingual settings and specific domains. To address this gap, this study focuses on developing a specialized LLM, LlamaLens, for analyzing news and social media content in a multilingual context. To the best of our knowledge, this is the first attempt to tackle both domain specificity and multilinguality, with a particular focus on news and social media. Our experimental setup includes 19 tasks, represented by 52 datasets covering Arabic, English, and Hindi. We demonstrate that LlamaLens outperforms the current state-of-the-art (SOTA) on 16 testing sets, and achieves comparable performance on 10 sets. We make the models and resources publicly available for the research community.(https://huggingface.co/QCRI), Comment: LLMs, Multilingual, Language Diversity, Large Language Models, Social Media, News Media, Specialized LLMs, Fact-checking, Media Analysis
Published: 2024

41. A Novel Reinforcement Learning Model for Post-Incident Malware Investigations

Author: Dunsin, Dipo, Ghanem, Mohamed Chahine, Ouazzane, Karim, and Vassilev, Vassil
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: This Research proposes a Novel Reinforcement Learning (RL) model to optimise malware forensics investigation during cyber incident response. It aims to improve forensic investigation efficiency by reducing false negatives and adapting current practices to evolving malware signatures. The proposed RL framework leverages techniques such as Q-learning and the Markov Decision Process (MDP) to train the system to identify malware patterns in live memory dumps, thereby automating forensic tasks. The RL model is based on a detailed malware workflow diagram that guides the analysis of malware artefacts using static and behavioural techniques as well as machine learning algorithms. Furthermore, it seeks to address challenges in the UK justice system by ensuring the accuracy of forensic evidence. We conduct testing and evaluation in controlled environments, using datasets created with Windows operating systems to simulate malware infections. The experimental results demonstrate that RL improves malware detection rates compared to conventional methods, with the RL model's performance varying depending on the complexity and learning rate of the environment. The study concludes that while RL offers promising potential for automating malware forensics, its efficacy across diverse malware types requires ongoing refinement of reward systems and feature extraction methods., Comment: 8 pages. arXiv admin note: substantial text overlap with arXiv:2408.01999
Published: 2024

42. Multi-channel, tunable quantum photonic devices on fiber-integrated platforms

Author: Jeon, Woong Bae, Moon, Jong Sung, Kim, Kyu-Young, Benyoucef, Mohamed, and Kim, Je-Hyung
Subjects: Quantum Physics
Abstract: Scalable, reliable quantum light sources are essential for increasing quantum channel capacity and advancing quantum protocols based on photonic qubits. Although recent developments in solid-state quantum emitters have enabled the generation of single photons with high performance, the scalable integration of quantum devices onto practical optical platforms remains a challenging task. Here, we present a breakthrough in achieving a multiple, tunable array of quantum photonic devices. The selective integration of multiple quantum dot devices onto a V-groove fiber platform features scalability, tunability, high yield, and high single-photon coupling efficiency. Therefore, our fiber-integrated quantum platform realizes a scalable and reliable single-photon array within a compact fiber chip at telecom wavelengths., Comment: 19 pages, 5 figures
Published: 2024

43. A New One Parameter Unit Distribution: Median Based Unit Rayleigh (MBUR): Parametric Quantile Regression Model

Author: Attia, Iman Mohamed
Subjects: Statistics - Methodology, Mathematics - Probability
Abstract: Parametric quantile regression is illustrated for the one parameter new unit Rayleigh distribution called Median Based Unit Rayleigh distribution (MBUR) distribution. The estimation process using re-parameterized maximum likelihood function is highlighted with real dataset example. The inference and goodness of fit is also explored.
Published: 2024

44. Streaming Deep Reinforcement Learning Finally Works

Author: Elsayed, Mohamed, Vasan, Gautham, and Mahmood, A. Rupam
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Natural intelligence processes experience as a continuous stream, sensing, acting, and learning moment-by-moment in real time. Streaming learning, the modus operandi of classic reinforcement learning (RL) algorithms like Q-learning and TD, mimics natural learning by using the most recent sample without storing it. This approach is also ideal for resource-constrained, communication-limited, and privacy-sensitive applications. However, in deep RL, learners almost always use batch updates and replay buffers, making them computationally expensive and incompatible with streaming learning. Although the prevalence of batch deep RL is often attributed to its sample efficiency, a more critical reason for the absence of streaming deep RL is its frequent instability and failure to learn, which we refer to as stream barrier. This paper introduces the stream-x algorithms, the first class of deep RL algorithms to overcome stream barrier for both prediction and control and match sample efficiency of batch RL. Through experiments in Mujoco Gym, DM Control Suite, and Atari Games, we demonstrate stream barrier in existing algorithms and successful stable learning with our stream-x algorithms: stream Q, stream AC, and stream TD, achieving the best model-free performance in DM Control Dog environments. A set of common techniques underlies the stream-x algorithms, enabling their success with a single set of hyperparameters and allowing for easy extension to other algorithms, thereby reviving streaming RL.
Published: 2024

45. On two-color partitions with odd smallest part

Author: Andrews, George E. and Bachraoui, Mohamed El
Subjects: Mathematics - Number Theory
Abstract: We will consider two-color integer partitions in which the smallest part is odd and the odd or the even parts may occur only in one color. It turns out that these partitions are generated by the mock theta functions of third order due to Ramanujan and Watson. Motivated by this, we will also consider two-color partitions in which the parts of the same parity may occur in both colors., Comment: 12 pages, submitted
Published: 2024

46. Learning Metadata-Agnostic Representations for Text-to-SQL In-Context Example Selection

Author: Mai, Chuhong, Tal, Ro-ee, and Mohamed, Thahir
Subjects: Computer Science - Computation and Language
Abstract: In-context learning (ICL) is a powerful paradigm where large language models (LLMs) benefit from task demonstrations added to the prompt. Yet, selecting optimal demonstrations is not trivial, especially for complex or multi-modal tasks where input and output distributions differ. We hypothesize that forming task-specific representations of the input is key. In this paper, we propose a method to align representations of natural language questions and those of SQL queries in a shared embedding space. Our technique, dubbed MARLO - Metadata-Agnostic Representation Learning for Text-tO-SQL - uses query structure to model querying intent without over-indexing on underlying database metadata (i.e. tables, columns, or domain-specific entities of a database referenced in the question or query). This allows MARLO to select examples that are structurally and semantically relevant for the task rather than examples that are spuriously related to a certain domain or question phrasing. When used to retrieve examples based on question similarity, MARLO shows superior performance compared to generic embedding models (on average +2.9\%pt. in execution accuracy) on the Spider benchmark. It also outperforms the next best method that masks metadata information by +0.8\%pt. in execution accuracy on average, while imposing a significantly lower inference latency., Comment: Accepted to NeurIPS 2024 Table Representation Learning workshop
Published: 2024

47. Private Counterfactual Retrieval

Author: Nomeir, Mohamed, Dissanayake, Pasan, Meel, Shreya, Dutta, Sanghamitra, and Ulukus, Sennur
Subjects: Computer Science - Information Theory, Computer Science - Cryptography and Security, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing
Abstract: Transparency and explainability are two extremely important aspects to be considered when employing black-box machine learning models in high-stake applications. Providing counterfactual explanations is one way of catering this requirement. However, this also poses a threat to the privacy of both the institution that is providing the explanation as well as the user who is requesting it. In this work, we propose multiple schemes inspired by private information retrieval (PIR) techniques which ensure the \emph{user's privacy} when retrieving counterfactual explanations. We present a scheme which retrieves the \emph{exact} nearest neighbor counterfactual explanation from a database of accepted points while achieving perfect (information-theoretic) privacy for the user. While the scheme achieves perfect privacy for the user, some leakage on the database is inevitable which we quantify using a mutual information based metric. Furthermore, we propose strategies to reduce this leakage to achieve an advanced degree of database privacy. We extend these schemes to incorporate user's preference on transforming their attributes, so that a more actionable explanation can be received. Since our schemes rely on finite field arithmetic, we empirically validate our schemes on real datasets to understand the trade-off between the accuracy and the finite field sizes.
Published: 2024

48. Generative Location Modeling for Spatially Aware Object Insertion

Author: Yun, Jooyeol, Abati, Davide, Omran, Mohamed, Choo, Jaegul, Habibian, Amirhossein, and Wiggers, Auke
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Generative models have become a powerful tool for image editing tasks, including object insertion. However, these methods often lack spatial awareness, generating objects with unrealistic locations and scales, or unintentionally altering the scene background. A key challenge lies in maintaining visual coherence, which requires both a geometrically suitable object location and a high-quality image edit. In this paper, we focus on the former, creating a location model dedicated to identifying realistic object locations. Specifically, we train an autoregressive model that generates bounding box coordinates, conditioned on the background image and the desired object class. This formulation allows to effectively handle sparse placement annotations and to incorporate implausible locations into a preference dataset by performing direct preference optimization. Our extensive experiments demonstrate that our generative location model, when paired with an inpainting method, substantially outperforms state-of-the-art instruction-tuned models and location modeling baselines in object insertion tasks, delivering accurate and visually coherent results.
Published: 2024

49. Using Intermittent Chaotic Clocks to Secure Cryptographic Chips

Author: Darya, Abdollah Masoud, Majzoub, Sohaib, El-Moursy, Ali A., Eladham, Mohamed Wed, Javeed, Khalid, and Elwakil, Ahmed S.
Subjects: Nonlinear Sciences - Chaotic Dynamics, Electrical Engineering and Systems Science - Systems and Control
Abstract: This letter proposes using intermittent chaotic clocks, generated from chaotic maps, to drive cryptographic chips running the Advanced Encryption Standard as a countermeasure against Correlation Power Analysis attacks. Five different chaotic maps -- namely: the Logistic map, the Bernoulli shift map, the Henon map, the Tent map, and the Ikeda map -- are used in this work to generate chaotic clocks. The performance of these chaotic clocks is evaluated in terms of timing overhead and the resilience of the driven chip against Correlation Power Analysis attacks. All proposed chaotic clocking schemes successfully protect the driven chip against attacks, with the clocks produced by the optimized Ikeda, Henon, and Logistic maps achieving the lowest timing overhead. These optimized maps, due to their intermittent chaotic behavior, exhibit lower timing overhead compared to previous work. Notably, the chaotic clock generated by the optimized Ikeda map approaches the theoretical limit of timing overhead, i.e., half the execution time of a reference periodic clock.
Published: 2024
Full Text: View/download PDF

50. Microwaves reveal the nanoscale ion intercalation and edge activity of 2D catalyst

Author: Awadein, Mohamed, Kumar, Abhishek, Wang, Yuqing, Dong, Mingdong, Müllegger, Stefan, and Gramse, Georg
Subjects: Physics - Chemical Physics
Abstract: The accelerated demand for electrochemical energy storage urges the need for new, sustainable, stable and lightweight materials able to store high energy densities rapidly and efficiently. Development of these functional materials requires specialized techniques that can provide a close insight into the electrochemical properties at the nanoscale. For this reason, we have introduced the electrochemical scanning microwave microscopy (EC-SMM) enabling local measurement of electrochemical properties with nanometer spatial resolution and sensitivity down to atto-Ampere electrochemical currents. The exceptional power of EC-SMM operated at radio frequency is exemplified here by the successful detection of the electrochemical activity and dynamics of molecularly thin NiCo(OH)2 flakes with a spatial resolution of 16 +/- 1 nm, uncovering the location of the active sites and providing atomistic details on the catalytic process that controls the electrocatalytic performance. Our results pinpoint the factors required to tune the thermodynamics of ion intercalation and to optimize the surface adsorption., Comment: 21 page
Published: 2024

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

10,792 results on '"Mohamed A."'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources