14 results on '"Sayed Omid Ayat"'
Search Results
2. Accurate and compact convolutional neural network based on stochastic computing.
- Author
Hamdan Abdellatef, Mohamed Khalil Hani, Nasir Shaikh-Husin, and Sayed Omid Ayat
- Published
- 2022
3. Spectral-based convolutional neural network without multiple spatial-frequency domain switchings.
- Author
Sayed Omid Ayat, Mohamed Khalil Hani, Ab Al-Hadi Ab Rahman, and Hamdan Abdellatef
- Published
- 2019
4. Optimizing FPGA-based CNN accelerator for energy efficiency with an extended Roofline model.
- Author
Sayed Omid Ayat, Mohamed Khalil Hani, and Ab Al-Hadi Ab Rahman
- Published
- 2018
5. OpenCL-based hardware-software co-design methodology for image processing implementation on heterogeneous FPGA platform.
- Author
Sayed Omid Ayat, Mohamed Khalil Hani, and Rabia Bakhteri
- Published
- 2015
6. Low-area and accurate inner product and digital filters based on stochastic computing.
- Author
Hamdan Abdellatef, Mohamed Khalil Hani, Nasir Shaikh-Husin, and Sayed Omid Ayat
- Published
- 2021
7. Accurate and compact convolutional neural network based on stochastic computing
- Author
Nasir Shaikh-Husin, Hamdan Abdellatef, Mohamed Khalil-Hani, and Sayed Omid Ayat
- Subjects
Footprint, Stochastic computing, Computer engineering, Artificial Intelligence, Computer science, Cognitive Neuroscience, Unconventional computing, Convolutional neural network, Facial recognition system, Power budget, MNIST database, Computer Science Applications, Power (physics)
- Abstract
Convolutional Neural Networks (CNNs) achieve state-of-the-art performance in many recognition problems. However, CNN models are computation-intensive and require enormous resources and power, limiting their applicability in embedded systems with limited area and power budgets. An alternative computing technique called Stochastic Computing (SC) can implement resource-demanding algorithms in smaller hardware, which in turn reduces power consumption. In this work, we propose SC-based forward functions for CNN layers that obtain significant area savings and high accuracy to replace the conventional binary-encoded (BE) deterministic computing counterparts. Then, we specify some training considerations that enable low error rates for SC-based CNNs. The experimental results show that the SC-based CNN attained 99.19% and 96.25% classification accuracy on the MNIST digit classification and AT&T face recognition datasets, respectively. Moreover, the SC-based CNN of the ResNet-20 model achieved 86.5% classification accuracy on the CIFAR-10 object dataset. The SC-based CNN functions have better classification accuracy than other SC schemes and obtain an ultra-low hardware footprint compared to conventional BE counterparts.
- Published
- 2022
8. Spectral-based convolutional neural network without multiple spatial-frequency domain switchings
- Author
Ab Al-Hadi Ab Rahman, Hamdan Abdellatef, Mohamed Khalil-Hani, and Sayed Omid Ayat
- Subjects
Normalization (statistics), 0209 industrial biotechnology, Computer science, Cognitive Neuroscience, Activation function, 02 engineering and technology, Rectifier (neural networks), Convolutional neural network, Computer Science Applications, Convolution, Domain (software engineering), Nonlinear system, 020901 industrial engineering & automation, Artificial Intelligence, Frequency domain, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Spatial frequency, Algorithm, MNIST database
- Abstract
Recent research has shown that spectral representation provides a significant speed-up in the massive computation workload of convolution operations in the inference (feed-forward) algorithm of Convolutional Neural Networks (CNNs). This approach reduces the computational complexity of the classification task, which makes spectral-based CNNs suitable for implementation on embedded platforms that typically have constrained resources. However, a major challenge in this approach is that the mathematical formulation of a nonlinear activation function in the spectral (frequency) domain is currently not available; hence, computation of the activation functions in each layer has to be performed in the spatial domain. This results in several spatial-frequency domain switchings that are computationally very costly, and as such, it would be advantageous to stay strictly in the frequency domain. Hence, in this work, a novel Spectral Rectified Linear Unit (SReLU) for the activation function is proposed that makes it possible for the computations to remain in the frequency domain, and therefore avoids the multiple compute-intensive domain transformations. To further optimize the classification speed of the network, an efficient spectral-based CNN model is presented that uses only the lower frequency components by fusing the convolutional and sub-sampling layers. Additionally, we provide and utilize a frequency-domain equivalent of the conventional batch normalization layer that improves the accuracy of the network. Experimental results indicate that the proposed spectral-based CNN model achieves up to 17.02× and 3.45× faster classification speed (without considerable accuracy loss) on AT&T face recognition and MNIST digit/fashion classification datasets, respectively, as compared to the equivalent models in the spatial domain, hence outperforming conventional approaches significantly.
- Published
- 2019
9. A Low-complexity Complex-valued Activation Function for Fast and Accurate Spectral Domain Convolutional Neural Network
- Author
Mohamed Khalil-Hani, Shahriyar Masud Rizvi, Ab Al-Hadi Ab Rahman, and Sayed Omid Ayat
- Subjects
Control and Optimization, Computational complexity theory, Computer science, Computer Networks and Communications, Activation function, Inference, Convolutional neural network, Domain (software engineering), Nonlinear system, Artificial Intelligence, Hardware and Architecture, Control and Systems Engineering, Component (UML), Computer Science (miscellaneous), Electrical and Electronic Engineering, Algorithm, MNIST database, Information Systems
- Abstract
Conventional Convolutional Neural Networks (CNNs), which are realized in the spatial domain, exhibit high computational complexity. This results in high resource utilization and memory usage and makes them unsuitable for implementation in resource- and energy-constrained embedded systems. A promising approach for a low-complexity, high-speed solution is to apply a CNN modeled in the spectral domain. One of the main challenges in this approach is the design of activation functions. Some of the proposed solutions perform activation functions in the spatial domain, necessitating multiple and computationally expensive spatial-spectral domain switchings. On the other hand, recent work on spectral activation functions has resulted in very computationally intensive solutions. This paper proposes a complex-valued activation function for spectral domain CNNs that only transmits input values that have a positive-valued real or imaginary component. This activation function is computationally inexpensive in both forward and backward propagation and provides sufficient nonlinearity to ensure high classification accuracy. We apply this complex-valued activation function in a LeNet-5 architecture and achieve an accuracy gain of up to 7% for the MNIST and 6% for the Fashion MNIST dataset, while providing up to 79% and 85% faster inference times, respectively, over state-of-the-art activation functions for the spectral domain.
- Published
- 2021
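The gating rule described in the abstract above can be illustrated with a minimal sketch. This is only one possible reading of "transmits input values that have a positive-valued real or imaginary component": the value passes unchanged if either part is positive and is zeroed otherwise. The function name `spectral_gate` is hypothetical and is not taken from the paper, whose exact formulation may differ.

```python
def spectral_gate(z: complex) -> complex:
    """Pass z through if its real OR imaginary part is positive, else output 0.

    Assumption: this is an illustrative reading of the abstract, not the
    authors' published definition.
    """
    return z if (z.real > 0 or z.imag > 0) else 0j


# One sample value from each quadrant of the complex plane.
inputs = [1 + 2j, -1 + 2j, 1 - 2j, -1 - 2j]
outputs = [spectral_gate(z) for z in inputs]
```

Under this reading, only the value with both parts negative is blocked, so the forward pass needs no more than two sign checks per element.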
10. Characterization of Correlation in Stochastic Computing Functions
- Author
Sayed Omid Ayat, Mohamed Khalil-Hani, Nasir Shaikh-Husin, and Hamdan Abdellatef
- Subjects
Correlation, Signal processing, Stochastic computing, Computer science, Computation, Pattern recognition (psychology), Function (mathematics), Unconventional computing, Algorithm, Measure (mathematics)
- Abstract
Stochastic computing (SC) is an alternative computing paradigm that provides low-cost hardware implementation for complex operations. Many previous works have adopted SC in applications such as signal processing and pattern recognition, which by nature involve successive computations. Successive SC computation affects correlation, which in turn affects accuracy. This relationship has not been well studied before. In this paper, we characterize correlation properties for SC functions to thoroughly understand the computation-correlation-accuracy relationship. We provide a measure ($\delta_{SCC}$) to identify the significance of variation of correlation due to computation. We also offer a general method to determine the correlation characteristics of any designed SC function.
- Published
- 2020
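To make the correlation-accuracy link in the abstract above concrete, here is a minimal sketch of the textbook unipolar SC multiplier, in which a value is the fraction of 1s in a bitstream and an AND gate multiplies two values only when the streams are uncorrelated. The bitstreams and helper names are illustrative assumptions; the sketch does not implement the paper's $\delta_{SCC}$ measure, which the abstract does not define.

```python
def to_value(stream: str) -> float:
    """Unipolar SC decoding: the value is the fraction of 1s in the stream."""
    return stream.count("1") / len(stream)


def sc_and(a: str, b: str) -> str:
    """Bitwise AND of two equal-length bitstreams (the basic SC multiplier)."""
    if len(a) != len(b):
        raise ValueError("bitstreams must have the same length")
    return "".join("1" if p == "1" and q == "1" else "0" for p, q in zip(a, b))


x = "10101010"  # encodes 0.5
y = "11001100"  # encodes 0.5, with no correlation to x over this window

uncorrelated = to_value(sc_and(x, y))  # AND gives 10001000, i.e. 0.5 * 0.5
correlated = to_value(sc_and(x, x))    # identical streams give min(0.5, 0.5)
```

With the uncorrelated pair the AND gate yields the product 0.25, while feeding it two identical (maximally correlated) streams yields 0.5, which is the kind of correlation-induced error the abstract refers to.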
11. Optimizing FPGA-based CNN accelerator for energy efficiency with an extended Roofline model
- Author
Ab Al-Hadi Ab Rahman, Sayed Omid Ayat, and Mohamed Khalil-Hani
- Subjects
General Computer Science, Computer science, Feed forward, Memory bandwidth, 02 engineering and technology, Bottleneck, Convolutional neural network, field-programmable gate array, energy efficiency, Roofline model, race-to-halt strategy, Computer engineering, Gate array, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Electrical and Electronic Engineering, Field-programmable gate array, Throughput (business), Energy (signal processing), Efficient energy use
- Abstract
In recent years, the convolutional neural network (CNN) has found wide acceptance in solving practical computer vision and image recognition problems. Also recently, due to its flexibility, faster development time, and energy efficiency, the field-programmable gate array (FPGA) has become an attractive solution to exploit the inherent parallelism in the feedforward process of the CNN. However, to meet the demands for high accuracy of today's practical recognition applications that typically have massive datasets, the sizes of CNNs have to be larger and deeper. Enlargement of the CNN aggravates the problem of off-chip memory bottleneck in the FPGA platform since there is not enough space to save large datasets on-chip. In this work, we propose a memory system architecture that best matches the off-chip memory traffic with the optimum throughput of the computation engine, while it operates at the maximum allowable frequency. With the help of an extended version of the Roofline model proposed in this work, we can estimate memory bandwidth utilization of the system at different operating frequencies since the proposed model considers operating frequency in addition to bandwidth utilization and throughput. In order to find the optimal solution that has the best energy efficiency, we make a trade-off between energy efficiency and computational throughput. This solution saves 18% of energy utilization with the trade-off having less than 2% reduction in throughput performance. We also propose to use a race-to-halt strategy to further improve the energy efficiency of the designed CNN accelerator. Experimental results show that our CNN accelerator can achieve a peak performance of 52.11 GFLOPS and energy efficiency of 10.02 GFLOPS/W on a ZYNQ ZC706 FPGA board running at 250 MHz, which outperforms most previous approaches.
- Published
- 2018
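As background to the abstract above, the classic Roofline bound can be sketched in a few lines: attainable throughput is the smaller of the compute peak and memory bandwidth times operational intensity. This sketch covers only the standard model, not the paper's frequency-aware extension, and the platform numbers in the example are hypothetical. The last line derives the power draw implied by the two figures the abstract reports (52.11 GFLOPS and 10.02 GFLOPS/W).

```python
def attainable_gflops(peak_gflops: float, bandwidth_gb_s: float,
                      intensity_flops_per_byte: float) -> float:
    """Classic Roofline bound: min(compute roof, bandwidth * intensity)."""
    return min(peak_gflops, bandwidth_gb_s * intensity_flops_per_byte)


def implied_power_watts(gflops: float, gflops_per_watt: float) -> float:
    """Power implied by a throughput and an energy-efficiency figure."""
    return gflops / gflops_per_watt


# Hypothetical platform: 100 GFLOPS peak, 4 GB/s off-chip bandwidth.
memory_bound = attainable_gflops(100.0, 4.0, 10.0)   # limited by bandwidth
compute_bound = attainable_gflops(100.0, 4.0, 50.0)  # limited by compute peak

# Figures reported in the abstract: 52.11 GFLOPS at 10.02 GFLOPS/W.
power = implied_power_watts(52.11, 10.02)  # roughly 5.2 W
```

A design whose operational intensity places it on the bandwidth-limited slope gains nothing from a faster compute engine, which is why the abstract focuses on matching off-chip memory traffic to the engine's throughput.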
12. Low-area and accurate inner product and digital filters based on stochastic computing
- Author
Sayed Omid Ayat, Mohamed Khalil-Hani, Nasir Shaikh-Husin, and Hamdan Abdellatef
- Subjects
Signal processing, Stochastic computing, Finite impulse response, Computer science, Computation, 020206 networking & telecommunications, 02 engineering and technology, Function (mathematics), Reduction (complexity), Control and Systems Engineering, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Algorithm, Digital filter, Software, Electronic circuit
- Abstract
The inner product is a key operation in various applications, such as signal processing and pattern recognition. Research has shown that this function, when implemented in the stochastic computing (SC) domain, can result in a significant reduction in area cost and power consumption compared to its equivalent counterpart in conventional binary-encoded (BE) deterministic computing. However, existing designs of the SC inner product are disadvantaged by costly BE-SC conversion circuits, and hence a high overall area cost. They also suffer from correlation-induced errors that affect their accuracy. In this work, we propose a novel inner product design method for the SC domain that has high accuracy, low area cost, and, most importantly, a correlation-insensitive circuit. Experimental results show that the proposed design reduces hardware footprint by 85.7% on average when compared to its equivalent BE counterpart. We show that it outperforms current state-of-the-art SC designs in terms of area savings, both in computation and conversion costs. Furthermore, it achieves better (or comparable) accuracy compared to existing works, especially in designs having a large number of inputs with low stochastic number lengths. Moreover, an SC FIR filter based on the proposed design method outperforms state-of-the-art SC filters in terms of area and accuracy.
- Published
- 2021
13. Stochastic Computing Correlation Utilization in Convolutional Neural Network Basic Functions
- Author
Sayed Omid Ayat, Mohamed Khalil Hani, Hamdan Abdellatef, and Nasir Shaikh Husin
- Subjects
Stochastic computing, Random number generation, Computer science, convolutional neural network, 020206 networking & telecommunications, 02 engineering and technology, Convolutional neural network, Power budget, Computer engineering, Application-specific integrated circuit, correlation, 0202 electrical engineering, electronic engineering, information engineering, stochastic computing, 020201 artificial intelligence & image processing, Electrical and Electronic Engineering, Round-off error, Field-programmable gate array, Unconventional computing
- Abstract
In recent years, many applications have been implemented in embedded systems and mobile Internet of Things (IoT) devices that typically have constrained resources and smaller power budgets, and that exhibit "smartness" or intelligence. To implement computation-intensive and resource-hungry Convolutional Neural Networks (CNNs) in this class of devices, many research groups have developed specialized parallel accelerators using Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), or Application-Specific Integrated Circuits (ASICs). An alternative computing paradigm called Stochastic Computing (SC) can implement CNNs with low hardware footprint and power consumption. To enable building more efficient SC CNNs, this work implements the CNN basic functions in SC such that they exploit correlation, share Random Number Generators (RNGs), and are more robust to rounding error. Experimental results show that our proposed solution provides significant savings in hardware footprint and increased accuracy for the SC CNN basic function circuits compared to previous work.
- Published
- 2018
14. Stochastic Computing Correlation Utilization in Convolutional Neural Network Basic Functions.
- Author
Hamdan Abdellatef, Mohamed Khalil-Hani, Nasir Shaikh-Husin, and Sayed Omid Ayat
- Subjects
CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, RANDOM number generators, GATE array circuits, WIRELESS Internet, FIELD programmable gate arrays, INTEGRATING circuits
- Abstract
In recent years, many applications have been implemented in embedded systems and mobile Internet of Things (IoT) devices that typically have constrained resources and smaller power budgets, and that exhibit "smartness" or intelligence. To implement computation-intensive and resource-hungry Convolutional Neural Networks (CNNs) in this class of devices, many research groups have developed specialized parallel accelerators using Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), or Application-Specific Integrated Circuits (ASICs). An alternative computing paradigm called Stochastic Computing (SC) can implement CNNs with low hardware footprint and power consumption. To enable building more efficient SC CNNs, this work implements the CNN basic functions in SC such that they exploit correlation, share Random Number Generators (RNGs), and are more robust to rounding error. Experimental results show that our proposed solution provides significant savings in hardware footprint and increased accuracy for the SC CNN basic function circuits compared to previous work.
- Published
- 2018