11 results on '"Layla A. A. El-Sayed"'
Search Results
2. A cloudlet architecture using mobile devices.
- Author
-
Abd El-Hameed G. El-Barbary, Layla A. A. El-Sayed, Hussein H. Aly, and Mohamed Nazih ElDerini
- Published
- 2015
3. Efficient interleaved modular multiplication based on sign detection.
- Author
-
Mohamed A. Nassar and Layla A. A. El-Sayed
- Published
- 2015
4. A lightweight incremental analysis and profiling framework for embedded devices.
- Author
-
Sara Elshobaky, Ahmed El-Mahdy, Erven Rohou, Layla A. A. El-Sayed, and Mohamed Nazih ElDerini
- Published
- 2014
5. Novel GPU-Based Approach for Matrix Factorization using Stochastic Gradient Descent
- Author
-
Mohamed A. Nassar, Layla A. A. El-Sayed, and Yousry Taha
- Subjects
Stochastic gradient descent, Computer science, Applied mathematics, Matrix decomposition
- Published
- 2017
6. GPU_MF_SGD: A Novel GPU-Based Stochastic Gradient Descent Method for Matrix Factorization
- Author
-
Yousry Taha, Layla A. A. El-Sayed, and Mohamed A. Nassar
- Subjects
Speedup, Computer science, Graphics processing unit, Engineering and technology, Parallel computing, Recommender system, Load balancing (computing), Matrix decomposition, Stochastic gradient descent, Information systems, Scalability, Electrical engineering, electronic engineering, information engineering, Artificial intelligence & image processing, Sequential algorithm
- Abstract
Recommender systems are used in most of today's applications. Providing real-time suggestions with high accuracy is one of the most crucial challenges they face. Matrix factorization (MF) is an effective technique for recommender systems because it improves accuracy. Stochastic Gradient Descent (SGD) is the most popular approach used to speed up MF. SGD is a sequential algorithm that is not trivial to parallelize, especially for large-scale problems. Recently, many studies have proposed methods for parallelizing SGD. In this research, we propose GPU_MF_SGD, a novel GPU-based method for large-scale recommender systems. GPU_MF_SGD utilizes Graphics Processing Unit (GPU) resources by ensuring load balancing and linear scalability, and by achieving coalesced access to global memory without a preprocessing phase. Our method demonstrates a 3.1X–5.4X speedup over the state-of-the-art GPU method, CuMF_SGD.
- Published
- 2018
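Note on record 6: the abstract describes SGD for matrix factorization as a sequential algorithm. The sketch below illustrates only that sequential baseline, not GPU_MF_SGD or CuMF_SGD. The function names (`sgd_mf`, `rmse`), the hyperparameters, and the toy ratings are assumptions made for illustration and do not come from the paper.

```python
import random

def sgd_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=200, seed=0):
    """Sequential SGD matrix factorization (illustrative, hypothetical defaults).

    ratings: list of (user, item, rating) triples.
    Returns user factors P (n_users x k) and item factors Q (n_items x k).
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the known ratings in random order
        for u, i, r in samples:
            pred = sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            err = r - pred
            # Each update reads and writes one row of P and one row of Q,
            # so two samples sharing a user or an item conflict with each other.
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root-mean-square error of the factorization on the given ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(pu * qi for pu, qi in zip(P[u], Q[i]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5

# Toy data: 3 users, 3 items, 6 observed ratings (made up for this sketch).
toy = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = sgd_mf(toy, n_users=3, n_items=3)
print(rmse(toy, P, Q))
```

The row-level read/write dependency in the inner loop is what makes naive parallelization non-trivial: updates that touch the same user or item cannot safely run at the same time without some scheduling or conflict handling.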
7. A cloudlet architecture using mobile devices
- Author
-
Layla A. A. El-Sayed, Abd El-Hameed G. El-Barbary, Mohamed N. El-Derini, and Hussien H. Aly
- Subjects
Computer science, Server, Mobile search, Computation offloading, Cloud computing, Mobile telephony, Cloudlet, Mobile device, Computer network, Mobile cloud computing
- Abstract
With the increasing ubiquity of mobile devices whose computational capabilities are comparable to those of many personal computers, they have become the dominant personal computing devices. However, mobile devices still suffer from short battery life, and many of them are not capable of running rich-media, data-intensive, and compute-intensive applications. Offloading to the cloud is a promising solution; however, long WAN latencies and the large amount of energy consumed by cellular data connectivity have led researchers to bring cloud capabilities to LANs to form cloudlets. In addition, in the last few years there has been a new trend of making use of the computing resources of the vast number of mobile devices available everywhere. In this paper, we introduce DroidCloudlet, a cloudlet architecture that is based on mobile devices. DroidCloudlet is powered by available resource-rich mobile devices, to which resource-constrained mobile devices can offload compute-intensive workloads. Offloading is carried out dynamically at runtime according to specific policies that target reducing execution time or saving energy. We also propose a lightweight cost model that is used to make offloading decisions based on both context parameters and historical offloading performance results. To make use of all reachable mobile devices, we introduce a multilevel architecture in which offloading can bubble recursively from local cloudlet servers to other remote servers in any reachable cloudlet. For evaluation, we have built a prototype of the proposed architecture, and performance results show that it saves up to 72% of execution time and 98% of consumed energy.
- Published
- 2015
8. A lightweight incremental analysis and profiling framework for embedded devices
- Author
-
Layla A. A. El-Sayed, Sara El-Shobaky, Erven Rohou, Mohamed Nazih El-Derini, and Ahmed El-Mahdy
- Affiliations
Department of Computer and Systems Engineering [Alexandria], Université d'Alexandrie; Egypt-Japan University of Science and Technology [Alexandrie] (E-JUST); Amdahl's Law is Forever (ALF), Inria Rennes – Bretagne Atlantique, Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Pharos University [Alexandria] (PUA)
- Subjects
Profiling (computer programming), Optimization, [INFO.INFO-PL] Computer Science [cs]/Programming Languages [cs.PL], Exploit, Workstation, Computer science, JIT, Engineering and technology, Business process reengineering, Computer hardware & architecture, Software portability, [INFO.INFO-PF] Computer Science [cs]/Performance [cs.PF], Compilers, Embedded system, Electrical engineering, electronic engineering, information engineering, [INFO] Computer Science [cs], Cache, Compiler, Mobile device
- Abstract
Embedded systems such as mobile devices are currently ubiquitous. The performance potential of these devices is rapidly improving through the incorporation of multi-core and GPU technologies, and is rapidly catching up with workstation platforms. Nevertheless, the heterogeneity of the underlying hardware as well as low-power constraints severely limit performance portability. In this paper, we consider the case of leveraging JIT compilers to provide portable parallelization while hiding the corresponding expensive runtime analysis. We propose a novel lightweight JIT framework that exploits the device idle time and the large storage space generally available on these devices. The framework performs 'incremental' analysis while the processor is idle (such as during charging time), and exploits the storage space to cache intermediate analysis results. Such an approach requires reengineering existing complex optimization analysis methods. For this paper, we focus on traditional loop parallelization analysis and implement a working prototype in the LLVM framework, integrating a lightweight dynamic profiling method to identify hotspots. Initial results demonstrate the low overhead of our method for parallelizing simple loops on an embedded GPU.
- Published
- 2014
9. DroidCloudlet: Towards cloudlet-based computing using mobile devices
- Author
-
Abd El-Hameed G. El-Barbary, Mohamed Nazih El-Derini, Hussien H. Aly, and Layla A. A. El-Sayed
- Subjects
Computer science, Server, Distributed computing, Global Positioning System, Mobile computing, Local area network, Mobile search, Cloud computing, Cloudlet, Mobile device
- Abstract
With the increasing ubiquity of mobile devices whose computational capabilities are comparable to those of many personal computers, and with wireless connectivity available almost everywhere, researchers have started to explore how to make use of that huge amount of potential aggregated computing resources. However, mobile devices still suffer from short battery life, especially with their newly large screens and the widespread use of applications that need GPS or sensor readings. Moreover, not all mobile devices are capable of running rich-media and data-intensive applications. Offloading to the cloud is a promising solution; however, long WAN latencies and the larger amount of energy consumed by cellular data connectivity have led researchers to bring cloud capabilities to local area networks to form cloudlets. In this paper, we propose a cloudlet architecture in which any available mobile device with abundant processing or power resources can participate as a server. Offloading is carried out dynamically at runtime according to specific policies that target reducing execution time and/or saving battery. We also propose a scheme that enables application developers to select which parts of their code should be offloaded and parallelized among the available servers.
- Published
- 2014
10. Faster Interleaved Modular Multiplier Based on Sign Detection
- Author
-
Mohamed A. Nassar and Layla A. A. El-Sayed
- Subjects
Sign detection, Modular arithmetic, efficient architecture, Computation, Modulo, Data security, sign estimation technique, carry-save adder, modular multiplication, RSA, Key (cryptography), Cryptosystem, Arithmetic, Hardware_ARITHMETICANDLOGICSTRUCTURES, ElGamal encryption, FPGA, Mathematics
- Abstract
Data security is the most important issue nowadays. Many cryptosystems have been introduced to provide security. Public key cryptosystems are the most common cryptosystems used for securing data communication. The common drawback of applying such cryptosystems is the heavy computation, which degrades system performance. Modular multiplication is the basic operation of common public key cryptosystems such as RSA, Diffie-Hellman key agreement (DH), ElGamal, and ECC. Much research is now directed at reducing the overall time consumed by the modular multiplication operation. Abd-el-fatah et al. introduced an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given M. In this paper, a modification of that architecture is introduced. The proposed design computes modular multiplication by scanning two bits per iteration instead of one bit. For 1024-bit precision, the proposed design reduced overall time by 38% compared to the design of Abd-el-fatah et al.
- Published
- 2012
11. Radix-4 Modified Interleaved Modular Multiplier Based on Sign Detection
- Author
-
Layla A. A. El-Sayed and Mohamed A. Nassar
- Subjects
Modular arithmetic, Computer science, Modulo, Key (cryptography), Cryptosystem, Data security, Carry-save adder, Hardware_ARITHMETICANDLOGICSTRUCTURES, Arithmetic, Field-programmable gate array, ElGamal encryption
- Abstract
Data security is the most important issue nowadays. Many cryptosystems have been introduced to provide security. Public key cryptosystems are the most common cryptosystems used for securing data communication. Modular multiplication is the basic operation of many public key cryptosystems such as RSA, Diffie-Hellman key agreement (DH), ElGamal, and ECC. Abd-el-fatah et al. introduced an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given M. In this paper, a modification of that architecture is introduced. The proposed design computes modular multiplication by scanning two bits per iteration instead of one bit. For 1024-bit precision, the proposed design reduced overall time by 38% compared to the design of Abd-el-fatah et al.
- Published
- 2012
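Note on records 10 and 11: both abstracts describe interleaved modular multiplication that scans two bits of the multiplier per iteration instead of one. The sketch below shows only the arithmetic of that radix-4 scan in software. It does not model the sign-detection logic, the carry-save adders, or the hardware architecture of the papers. The function name and the repeated-subtraction reduction step are assumptions made for illustration.

```python
def interleaved_modmul_radix4(x, y, m):
    """Compute (x * y) mod m by scanning x two bits at a time, MSB first.

    Illustrative software model only: reduction is done by repeated
    subtraction rather than by any hardware sign-detection technique.
    """
    if m <= 0:
        raise ValueError("modulus must be positive")
    x %= m
    y %= m
    bits = max(x.bit_length(), 1)
    if bits % 2:
        bits += 1  # pad to an even number of bits so digits are 2 bits wide
    p = 0
    for shift in range(bits - 2, -1, -2):
        digit = (x >> shift) & 0b11  # next radix-4 digit of x, in 0..3
        p = 4 * p + digit * y        # shift the partial product and add digit * y
        # Since p < m and y < m held before this step, now p < 7 * m,
        # so at most six subtractions bring p back below m.
        while p >= m:
            p -= m
    return p

print(interleaved_modmul_radix4(7, 5, 11))  # 35 mod 11 -> 2
```

Scanning two bits per iteration halves the number of loop iterations relative to the one-bit version, at the cost of a wider reduction range per step (up to 7M here instead of 3M).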