1,266 results on "cpu"
Search Results
52. Implementation of Morphological Gradient Algorithm for Edge Detection
- Author
-
Vardhan Rao, Mirupala Aarthi, Mukherjee, Debasish, Savitha, S., Xhafa, Fatos, Series Editor, Saraswat, Mukesh, editor, Sharma, Harish, editor, Balachandran, K., editor, Kim, Joong Hoon, editor, and Bansal, Jagdish Chand, editor
- Published
- 2022
- Full Text
- View/download PDF
53. Hybrid (CPU/GPU) Exact Nearest Neighbors Search in High-Dimensional Spaces
- Author
-
Muhr, David, Affenzeller, Michael, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Goedicke, Michael, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Tröltzsch, Fredi, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Reis, Ricardo, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Maglogiannis, Ilias, editor, Iliadis, Lazaros, editor, Macintyre, John, editor, and Cortez, Paulo, editor
- Published
- 2022
- Full Text
- View/download PDF
54. OptCL: A Middleware to Optimise Performance for High Performance Domain-Specific Languages on Heterogeneous Platforms
- Author
-
Xiao, Jiajian, Andelfinger, Philipp, Cai, Wentong, Eckhoff, David, Knoll, Alois, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Lai, Yongxuan, editor, Wang, Tian, editor, Jiang, Min, editor, Xu, Guangquan, editor, Liang, Wei, editor, and Castiglione, Aniello, editor
- Published
- 2022
- Full Text
- View/download PDF
55. Machine Learning Enhanced CPU-GPU Simulation Platform for 5G System
- Author
-
Ouyang, Yuling, Yin, Caiyuan, Zhou, Ting, Jin, Yan, Akan, Ozgur, Editorial Board Member, Bellavista, Paolo, Editorial Board Member, Cao, Jiannong, Editorial Board Member, Coulson, Geoffrey, Editorial Board Member, Dressler, Falko, Editorial Board Member, Ferrari, Domenico, Editorial Board Member, Gerla, Mario, Editorial Board Member, Kobayashi, Hisashi, Editorial Board Member, Palazzo, Sergio, Editorial Board Member, Sahni, Sartaj, Editorial Board Member, Shen, Xuemin (Sherman), Editorial Board Member, Stan, Mircea, Editorial Board Member, Jia, Xiaohua, Editorial Board Member, Zomaya, Albert Y., Editorial Board Member, Calafate, Carlos T., editor, Chen, Xianfu, editor, and Wu, Yuan, editor
- Published
- 2022
- Full Text
- View/download PDF
56. Microcontroller Architecture
- Author
-
Ünsalan, Cem, Gürhan, Hüseyin Deniz, and Yücel, Mehmet Erkin
- Published
- 2022
- Full Text
- View/download PDF
57. Performance Analysis of Software Enabled Accelerator Library for Intel Architecture
- Author
-
Mohindru, Gaurav, Mondal, Koushik, Banka, Haider, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Kumar, Amit, editor, Senatore, Sabrina, editor, and Gunjan, Vinit Kumar, editor
- Published
- 2022
- Full Text
- View/download PDF
58. A Novel Artificial Intelligence Technique for Cloud Computing Using a New Heuristic Initialisation and PSO-Parallel Execution
- Author
-
Chraibi, Amine, Alla, Said Ben, Ezzati, Abdellah, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, and Arai, Kohei, editor
- Published
- 2022
- Full Text
- View/download PDF
59. Performance Evaluation and Comparison of Hypervisors in a Multi-Cloud Environment
- Author
-
Reddy, Nalin, Nadesh, R. K., Srinivasa Perumal, R., Mallela, Nikhil Chakravarthy, Arivuselvan, K., Chlamtac, Imrich, Series Editor, Nagarajan, Rajganesh, editor, Raj, Pethuru, editor, and Thirunavukarasu, Ramkumar, editor
- Published
- 2022
- Full Text
- View/download PDF
60. A New FPGA-Based Task Scheduler for Real-Time Systems.
- Author
-
Kohútka, Lukáš and Mach, Ján
- Subjects
TIME management, NUMBER systems, CLOCKS & watches, DECISION making, CYCLONES
- Abstract
This research demonstrates a novel design of an FPGA-implemented task scheduler for real-time systems that supports both aperiodic and periodic tasks. The periodic tasks are automatically restarted once their period has expired without any need for software intervention. The proposed scheduler utilizes the Earliest-Deadline First (EDF) algorithm and is optimized for multi-core CPUs, capable of executing up to four threads simultaneously. The scheduler also provides support for task suspension, resumption, and enabling inter-task synchronization. The design is based on priority queues, which play a crucial role in decision making and time management. Thanks to the hardware acceleration of the scheduler and the hardware implementation of priority queues, it operates in only two clock cycles, regardless of the number of tasks in the system. The results of the FPGA synthesis, performed on an Intel FPGA device (Cyclone V family), are presented in the paper. The proposed solution was validated through a simplified version of the Universal Verification Methodology (UVM) with millions of test instructions and random deadline and period values. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
61. Electrical-Level Attacks on CPUs, FPGAs, and GPUs: Survey and Implications in the Heterogeneous Era.
- Author
-
MAHMOUD, DINA G., LENDERS, VINCENT, and STOJILOVIĆ, MIRJANA
- Subjects
*HETEROGENEOUS computing, *CENTRAL processing units, *GRAPHICS processing units, *COMPUTER architecture, *COMPUTER systems, *GATE array circuits
- Abstract
Given the need for efficient high-performance computing, computer architectures combining central processing units (CPUs), graphics processing units (GPUs), and field-programmable gate arrays (FPGAs) are currently prevalent. However, each of these components suffers from electrical-level security risks. Moving to heterogeneous systems, with the potential of multitenancy, it is essential to understand and investigate how the security vulnerabilities of individual components may affect the system as a whole. In this work, we provide a survey on the electrical-level attacks on CPUs, FPGAs, and GPUs. Additionally, we discuss whether these attacks can extend to heterogeneous systems and highlight open research directions for ensuring the security of heterogeneous computing systems in the future. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
62. A depth information aided real-time instance segmentation method for space task scenarios under CPU platform.
- Author
-
Li, Qianlong, Zhu, Zhanxia, Liang, Junwu, Zhang, Hongwen, Xu, Yanwen, and Zhang, Zhihao
- Subjects
*OBJECT recognition (Computer vision), *INTELLIGENCE officers, *IMAGE segmentation, *COMPUTER vision
- Abstract
Visual instance segmentation ability is one of the effective means to promote the autonomy and intelligence of space agents. However, due to the limited airborne computing capacity, current methods are difficult to deploy on space agents, because these methods are developed based on GPU. To this end, this paper proposes a novel framework of instance segmentation by introducing depth information and combining traditional computer vision techniques with an object detection method. This framework provides a new idea for the implementation of instance segmentation. The experiment results show that the proposed method achieves a real-time performance under a common laptop CPU platform. In addition, thanks to the introduction of depth information, the proposed method can obtain better segmentation results compared to Mask R–CNN and SOLOv2 in complex scenes (poor illumination and occlusion). Finally, because the semantic information is obtained by the object detection method in this paper, the model training adopts a weakly supervised manner from bounding-box annotations, which can reduce various costs of data labeling to a certain extent.
• A depth-aided instance segmentation fuses mask extraction and object detection.
• Real-time performance on consumer grade laptop CPU; friendly to airborne platforms.
• Better segmentation results in complex scenes compared with Mask-RCNN and SOLOv2.
• Training process adopts a weakly supervised manner from bounding-box annotations.
[ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
63. A Hybrid GPU and CPU Parallel Computing Method to Accelerate Millimeter-Wave Imaging.
- Author
-
Ding, Li, Dong, Zhaomiao, He, Huagang, and Zheng, Qibin
- Subjects
PARALLEL programming, CENTRAL processing units, FOURIER transforms, GRAPHICS processing units, SPEED limits, PARALLEL algorithms
- Abstract
The range migration algorithm (RMA) based on Fourier transformation is widely applied in millimeter-wave (MMW) close-range imaging because of its few operations and small approximation. However, its interpolation stage is not effective due to the involved intensive logic controls, which limits the speed performance in a graphics processing unit (GPU) platform. Therefore, in this paper, we present an acceleration optimization method based on the hybrid GPU and central processing unit (CPU) parallel computation for implementing the RMA. The proposed method exploits the strong logic-control capability of the CPU to assist the GPU in processing the logic controls of the interpolation stage. The common positions of wavenumber-domain components to be interpolated are calculated by the CPU and stored in the constant memory for broadcast at any time. This avoids the repetitive computation consumed in a GPU-only scheme. Then the GPU is responsible for the remaining matrix-related steps and outputs the needed wavenumber-domain values. The imaging experiments verify the acceleration efficiency of the proposed method and demonstrate that the speedup ratio of our proposed method is more than 15 times of that by the CPU-only method, and more than 2 times of that by the GPU-only method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
64. Time series-based workload prediction using the statistical hybrid model for the cloud environment.
- Author
-
Devi, K. Lalitha and Valli, S.
- Subjects
*STATISTICAL models, *CENTRAL processing units, *TIME series analysis, *FORECASTING, *BOX-Jenkins forecasting, *RESOURCE management, *INFRASTRUCTURE (Economics), *DEMAND forecasting
- Abstract
Resource management is addressed using infrastructure as a service. On demand, the resource management module effectively manages available resources. Resource management in cloud resource provisioning is aided by the prediction of central processing unit (CPU) and memory utilization. Using a hybrid ARIMA–ANN model, this study forecasts future CPU and memory utilization. The range of values discovered is utilized to make predictions, which is useful for resource management. In the cloud traces, the ARIMA model detects linear components in the CPU and memory utilization patterns. For recognizing and magnifying nonlinear components in the traces, the artificial neural network (ANN) leverages the residuals derived from the ARIMA model. The resource utilization patterns are predicted using a combination of linear and nonlinear components. From the predicted and previous history values, the Savitzky–Golay filter finds a range of forecast values. Point value forecasting may not be the best method for predicting multi-step resource utilization in a cloud setting. The forecasting error can be decreased by introducing a range of values, and we employ the OER (over estimation rate) and UER (under estimation rate) reported by Engelbrecht HA and van Greunen M (in: Network and Service Management (CNSM), 2015 11th International Conference, 2015) to cope with the error produced by over or under estimation of CPU and memory utilization. The prediction accuracy is tested using statistical-based analysis using Google's 29-day trace and BitBrain (BB). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
65. A Survey on RISC-V-Based Machine Learning Ecosystem.
- Author
-
Kalapothas, Stavros, Galetakis, Manolis, Flamis, Georgios, Plessas, Fotis, and Kitsos, Paris
- Subjects
*MACHINE learning, *SYSTEMS on a chip, *NATURAL language processing, *COMPUTER vision, *SOFTWARE frameworks, *ARTIFICIAL intelligence
- Abstract
In recent years, the advancements in specialized hardware architectures have supported the industry and the research community to address the computation power needed for more enhanced and compute intensive artificial intelligence (AI) algorithms and applications that have already reached a substantial growth, such as in natural language processing (NLP) and computer vision (CV). The developments of open-source hardware (OSH) and the contribution towards the creation of hardware-based accelerators with implication mainly in machine learning (ML), has also been significant. In particular, the reduced instruction-set computer-five (RISC-V) open standard architecture has been widely adopted by a community of researchers and commercial users, worldwide, in numerous openly available implementations. The selection through a plethora of RISC-V processor cores and the mix of architectures and configurations combined with the proliferation of ML software frameworks for ML workloads, is not trivial. In order to facilitate this process, this paper presents a survey focused on the assessment of the ecosystem that entails RISC-V based hardware for creating a classification of system-on-chip (SoC) and CPU cores, along with an inclusive arrangement of the latest released frameworks that have supported open hardware integration for ML applications. Moreover, part of this work is devoted to the challenges that are concerned, such as power efficiency and reliability, when designing and building application with OSH in the AI/ML domain. This study presents a quantitative taxonomy of RISC-V SoC and reveals the opportunities in future research in machine learning with RISC-V open-source hardware architectures. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
66. Advanced Encryption Standard (AES) acceleration and analysis using graphical processing unit (GPU).
- Author
-
Assafli, Hayder T., Hashim, Ivan A., and Naser, Ahmed A.
- Subjects
ADVANCED Encryption Standard, CENTRAL processing units, PARALLEL processing, NUMERICAL calculations, GRAPHICS processing units
- Abstract
Graphics processing units (GPUs) have become the target for high-speed and high-throughput computing in the last decade. The device provides excellent capabilities in speeding general-purpose computing in many applications. The main advantage of GPUs is the ability to process heavy parallel requests depending on thousands of parallel processing cores operating concurrently on solving numerical calculations. In this paper, the role of the GPUs in speeding up the encryption process is studied extensively. Advanced Encryption Standard (AES) algorithm is used to encrypt files in both Central Processing Units (CPUs) and GPUs. The encryption process consists of several sub-processes of which each is evaluated by both types of processors. Two different systems models were used in testing operations. Results showed that low-size encryption processes are not affected by the acceleration process and do not require GPU processing. On the other hand, large file size encryption relies heavily on the acceleration process resulting in less processing time. The analysis also showed that the acceleration speed remains constant after reaching the maximum load value. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
67. Enabling Bitwise Reproducibility for the Unstructured Computational Motif
- Author
-
Bálint Siklósi, Gihan R. Mudalige, and István Z. Reguly
- Subjects
floating-point, bitwise reproducibility, unstructured-mesh computation, DSL, CPU, GPU, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
In this paper we identify the causes of numerical non-reproducibility in the unstructured mesh computational motif, a class of algorithms commonly used for the solution of PDEs. We introduce a number of parallel and distributed algorithms to address nondeterminism in the order of floating-point computations, in particular, a new graph coloring scheme that produces identical coloring results regardless of how many parts the graph is partitioned to. We implement these in the OP2 domain specific language (DSL) and show how it can be automatically deployed to any application that uses OP2 without user intervention. We contrast differences in results without reproducibility and then demonstrate how bitwise reproducibility can be gained using our methods on a variety of applications including a production CFD application used at Rolls-Royce. We evaluate the performance and overheads of enforcing bitwise reproducibility on a cluster of CPUs and GPUs.
- Published
- 2024
- Full Text
- View/download PDF
68. An evaluation of Kd-Trees vs Bounding Volume Hierarchy (BVH) acceleration structures in modern CPU architectures
- Author
-
Ernesto Rivera-Alvarado and Julio Zamora-Madrigal
- Subjects
ray tracing, CPU, BVH, KD-Trees, acceleration structures, modern hardware, Technology
- Abstract
Ray tracing is a rendering technique that is highly praised for its realism and image quality. Nonetheless, this is a computationally intensive task that is slow compared to other rendering techniques like rasterization. Bounding Volume Hierarchy (BVH) is a primitive subdivision acceleration mechanism that is the mainly used method for accelerating ray tracing in modern solutions. It is regarded as having better performance against other acceleration methods. Another well-known technique is Kd-Trees that uses binary space partitioning to adaptively subdivide space with planes. In this research, we made an up-to-date evaluation of both acceleration structures, using state-of-the-art BVH and Kd-Trees algorithms implemented in C, and found out that the Kd-Trees acceleration structure provided better performance in all defined scenarios on a modern x86 CPU architecture.
- Published
- 2023
- Full Text
- View/download PDF
69. A proposed scenario to improve the Ncut algorithm in segmentation
- Author
-
Nhu Y. Tran, Huynh Trung Hieu, and Pham The Bao
- Subjects
GPU, CPU, parallel computing, Ncut, FCM, Information technology, T58.5-58.64
- Abstract
In image segmentation, there are many methods to accomplish the result of segmenting an image into k clusters. However, the number of clusters k is always defined before running the process. It is defined by some observation or knowledge based on the application. In this paper, we propose a new scenario in order to define the value k clusters automatically using histogram information. This scenario is applied to Ncut algorithm and speeds up the running time by using CUDA language to parallel computing in GPU. The Ncut is improved in four steps: determination of number of clusters in segmentation, computing the similarity matrix W, computing the similarity matrix's eigenvalues, and grouping on the Fuzzy C-Means (FCM) clustering algorithm. Some experimental results are shown to prove that our scenario is 20 times faster than the Ncut algorithm while keeping the same accuracy.
- Published
- 2023
- Full Text
- View/download PDF
70. Heat Transfer Enhancement Through Different Heat Sink/Impinging Air Jet Parameters an Experimental Approach.
- Author
-
BERIACHE, M’hamed, CHERKI, Brahim Hicham, SAÏDIA, Leila MOKHTAR, NOR AZWADI, Che Sidik, and RIZALMAN, Mamat
- Subjects
*HEAT sinks, *AIR jets, *HEAT transfer, *HAIR dryers, *CENTRAL processing units, *AIR ducts
- Abstract
In the present work, improving the thermal performance of a central processing unit (CPU) cooling system consisting of a plate fin minichannel heat sink exposed to an impinging air jet is accomplished. The height of the impinging air jet duct, the mode of air flow on the heat sink, air blowing mode as well as the suction blow mode "pull" and the air jet duct geometry are three important influence parameters on the phenomenon. The results obtained show that a jet height H = 20 mm, a convergent duct of H = 40 mm in height and a hair dryer tip provide further improvement of the cooling technique. The two preceding parameters combined with the fan air suction blow mode improve the performance by at least 11.2% compared to the initial configuration (marketed product), having as characteristics (H = 0 mm, fan in the blowing push mode). The use of a converging duct and a hair dryer conduit show that the velocities from the hair dryer tip far exceed those from the best standard conduit height, H = 20 mm. That said, the improvement of the flow in the central zone is substantially achieved, which consequently improves the heat removal in this zone by at least 19%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
71. PAS: A new powerful and simple quantum computing simulator.
- Author
-
Bian, Haodong, Huang, Jianqiang, Tang, Jiahao, Dong, Runting, Wu, Li, and Wang, Xiaoying
- Subjects
QUANTUM computing, FOURIER transforms, QUBITS
- Abstract
In recent years, many researchers have been using CPU for quantum computing simulation. However, in reality, the simulation efficiency of the large‐scale simulator is low on a single node. Therefore, striving to improve the simulator efficiency on a single node has become a serious challenge that many researchers need to solve. After many experiments, we found that much computational redundancy and frequent memory access are important factors that hinder the efficient operation of the CPU. This paper proposes a new powerful and simple quantum computing simulator: PAS (power and simple). Compared with existing simulators, PAS introduces four novel optimization methods: efficient hybrid vectorization, fast bitwise operation, memory access filtering, and quantum tracking. In the experiment, we tested the QFT (quantum Fourier transform) and RQC (random quantum circuits) of 21 to 30 qubits and selected the state‐of‐the‐art simulator QuEST (quantum exact simulation toolkit) as the benchmark. After experiments, we have concluded that PAS compared with QuEST can achieve a mean speedup of 8.69× (QFT), 2.62× (RQC) (up to 10.76×, 4.87×) on the Intel Xeon E5‐2670 v3 CPU. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
72. Environmentally friendly ionic liquid medium based on dicyanamide anion for leaching gold from e‐waste CPUs by iodination.
- Author
-
Liu, Ziyuan, Kou, Jue, Xing, Yi, Sun, Chunbao, and Zhang, Yuxin
- Subjects
ELECTRONIC waste, LEACHING, IONIC liquids, IODINATION, GOLD, CHEMICAL reactions
- Abstract
BACKGROUND: The large and growing amount of electronic waste (e‐waste) caused by the rise of the electronics industry has negatively impacted the environment and secondary resource utilization. Thus, research on the recovery, recycling, and reuse of valuable metals from e‐waste has attracted considerable attention. In this study, a novel leaching system based on dicyandiamide ionic liquids (ILs): 1‐ethyl‐3‐methylimidazolium dicyandiamide, 1‐butyl‐3‐methylimidazolium dicyandiamide, and 1‐hexyl‐3‐methylimidazolium dicyandiamide were used to evaluate the effectiveness of leaching gold from typical e‐waste CPUs by iodination. RESULTS: The response surface experiment and analysis of variance (ANOVA) for the results showed that the optimal gold leaching efficiency reached 93.81% after 170 min under a rotational speed of 300 rpm, temperature of 32 °C, and iodine dosage of 1.15%. The mechanism of reaction was determined and the main reaction products were Au[IN(CN)2]− and Au[N(CN)2]2−. Moreover, the kinetic behavior of gold leaching showed a good correlation with the Avrami model and the activation energy value was 14.01 kJ·mol−1. The comparison of iodine consumption between aqueous medium and ILs showed that the required amount of iodine (calculated by ionic iodine) decreased by a factor of six in ILs. CONCLUSION: The iodine dosage, leaching temperature, and time had significant effects on the gold leaching efficiency in the IL leaching system. The reaction of gold leaching was jointly controlled by diffusion and chemical reactions. The coordination groups [N(CN)2]22− and [IN(CN)2]2− in the pregnant solution formed stable complexes with Au(I). The consumption of iodine in ILs decreased remarkably, showing the significant practical value of ILs. © 2022 Society of Chemical Industry (SCI). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
73. HH-NIDS: Heterogeneous Hardware-Based Network Intrusion Detection Framework for IoT Security.
- Author
-
Ngo, Duc-Minh, Lightbody, Dominic, Temko, Andriy, Pham-Quoc, Cuong, Tran, Ngoc-Thinh, Murphy, Colin C., and Popovici, Emanuel
- Subjects
INTRUSION detection systems (Computer security), ARTIFICIAL intelligence, ARTIFICIAL neural networks, INTERNET of things, COMPUTER network security, GRAPHICS processing units
- Abstract
This study proposes a heterogeneous hardware-based framework for network intrusion detection using lightweight artificial neural network models. With the increase in the volume of exchanged data, IoT networks' security has become a crucial issue. Anomaly-based intrusion detection systems (IDS) using machine learning have recently gained increased popularity due to their generation's ability to detect unseen attacks. However, the deployment of anomaly-based AI-assisted IDS for IoT devices is computationally expensive. A high-performance and ultra-low power consumption anomaly-based IDS framework is proposed and evaluated in this paper. The framework has achieved the highest accuracy of 98.57% and 99.66% on the UNSW-NB15 and IoT-23 datasets, respectively. The inference engine on the MAX78000EVKIT AI-microcontroller is 11.3 times faster than the Intel Core i7-9750H 2.6 GHz and 21.3 times faster than NVIDIA GeForce GTX 1650 graphics cards, when the power drawn was 18mW. In addition, the pipelined design on the PYNQ-Z2 SoC FPGA board with the Xilinx Zynq xc7z020-1clg400c device is optimised to run at the on-chip frequency (100 MHz), which shows a speedup of 53.5 times compared to the MAX78000EVKIT. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
74. Bitcoin Mining and Making Money
- Author
-
Gray, Gerald R.
- Published
- 2021
- Full Text
- View/download PDF
75. Automatic Computing Device Selection Scheme Between CPU and GPU for Enhancing the Computation Efficiency
- Author
-
Kim, Geunmo, Kim, Sungmin, Cho, Jinsung, Kim, Jeong-Dong, Kim, Bongjae, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zhang, Junjie James, Series Editor, Park, James J., editor, Loia, Vincenzo, editor, Pan, Yi, editor, and Sung, Yunsick, editor
- Published
- 2021
- Full Text
- View/download PDF
76. A Survey of the Exemplary Practices in Network Operations and Management
- Author
-
Majidha Fathima, K. M., Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Jeena Jacob, I., editor, Kolandapalayam Shanmugam, Selvanayaki, editor, Piramuthu, Selwyn, editor, and Falkowski-Gilski, Przemyslaw, editor
- Published
- 2021
- Full Text
- View/download PDF
77. Applications of GPUs for Signal Processing Algorithms: A Case Study on Design Choices for Cyber-Physical Systems
- Author
-
Srivastava, Neelesh Ranjan, Mittal, Vikas, Pisello, Anna Laura, Editorial Board Member, Hawkes, Dean, Editorial Board Member, Bougdah, Hocine, Editorial Board Member, Rosso, Federica, Editorial Board Member, Abdalla, Hassan, Editorial Board Member, Boemi, Sofia-Natalia, Editorial Board Member, Mohareb, Nabil, Editorial Board Member, Mesbah Elkaffas, Saleh, Editorial Board Member, Bozonnet, Emmanuel, Editorial Board Member, Pignatta, Gloria, Editorial Board Member, Mahgoub, Yasser, Editorial Board Member, De Bonis, Luciano, Editorial Board Member, Kostopoulou, Stella, Editorial Board Member, Pradhan, Biswajeet, Editorial Board Member, Abdul Mannan, Md., Editorial Board Member, Alalouch, Chaham, Editorial Board Member, O. Gawad, Iman, Editorial Board Member, Nayyar, Anand, Editorial Board Member, Amer, Mourad, Series Editor, Singh, Krishna Kant, editor, Tanwar, Sudeep, editor, and Abouhawwash, Mohamed, editor
- Published
- 2021
- Full Text
- View/download PDF
78. Power Consumption Reduction in IoT Devices Through Field-Programmable Gate Array with Nanobridge Switch
- Author
-
Sharma, Preeti, Nair, Rajit, Dwivedi, Vidya Kant, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Marriwala, Nikhil, editor, Tripathi, C. C., editor, Kumar, Dinesh, editor, and Jain, Shruti, editor
- Published
- 2021
- Full Text
- View/download PDF
79. Introduction to Tensorflow Package
- Author
-
Prakash, Kolla Bhanu, Ruwali, Adarsha, Kanagachidambaresan, G. R., Chlamtac, Imrich, Series Editor, Prakash, Kolla Bhanu, editor, and Kanagachidambaresan, G. R., editor
- Published
- 2021
- Full Text
- View/download PDF
80. Comparative Study of Cooling Solutions of a Drone Based on Raspberry Pi Deducing the Most Efficient Cooling Method
- Author
-
Beniwal, Rohit, Patidar, Sanjay, Tomar, Rohan, Shekhar, Khatta, Rohit, Xhafa, Fatos, Series Editor, Smys, S., editor, Palanisamy, Ram, editor, Rocha, Álvaro, editor, and Beligiannis, Grigorios N., editor
- Published
- 2021
- Full Text
- View/download PDF
81. Improved Post-quantum Merkle Algorithm Based on Threads
- Author
-
Iavich, Maksim, Gnatyuk, Sergiy, Arakelian, Arturo, Iashvili, Giorgi, Polishchuk, Yuliia, Prysiazhnyy, Dmytro, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Hu, Zhengbing, editor, Petoukhov, Sergey, editor, Dychka, Ivan, editor, and He, Matthew, editor
- Published
- 2021
- Full Text
- View/download PDF
82. A Framework for Supporting Repetition and Evaluation in the Process of Cloud-Based DBMS Performance Benchmarking
- Author
-
Erdelt, Patrick K., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Nambiar, Raghunath, editor, and Poess, Meikel, editor
- Published
- 2021
- Full Text
- View/download PDF
83. Interconnect Modeling for Homogeneous and Heterogeneous Multiprocessors
- Author
-
Krishna, Tushar, Bharadwaj, Srikant, Mishra, Prabhat, editor, and Charles, Subodha, editor
- Published
- 2021
- Full Text
- View/download PDF
84. Energy-Efficient Networks-on-Chip Architectures: Design and Run-Time Optimization
- Author
-
Mandal, Sumit K., Krishnakumar, Anish, Ogras, Umit Y., Mishra, Prabhat, editor, and Charles, Subodha, editor
- Published
- 2021
- Full Text
- View/download PDF
85. Introduction
- Author
-
Sahrling, Mikael
- Published
- 2021
- Full Text
- View/download PDF
86. On-the-Fly Lowering Engine: Offloading Data Layout Conversion for Convolutional Neural Networks
- Author
-
Mingu Kang, Sangmin Hyun, Tae Hee Han, Jungrae Kim, and Seokin Hong
- Subjects
Convolutional neural network, GEMM, CPU, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Many deep learning frameworks utilize GEneral Matrix Multiplication (GEMM)-based convolution to accelerate CNN execution. GEMM-based convolution provides faster convolution yet requires a data conversion process called lowering (i.e., im2col), which incurs significant memory overhead and diminishes performance. This paper proposes a novel hardware mechanism, called On-the-fly Lowering Engine (OLE), to eliminate the lowering overheads. Our goal is to offload the lowering overheads from the GEMM-based convolution. With OLE, the lowered matrix is neither pre-calculated nor stored in the main memory. Instead, a hardware engine generates the lowered matrix on-the-fly from the original input matrix to reduce memory footprint and bandwidth requirements. Furthermore, the hardware offloading eliminates CPU cycles for the lowering operation and overlaps computation with lowering to hide the performance overhead. Our evaluation shows that OLE can reduce the memory footprint of convolutional layer inputs down to $\frac{1}{12.5}\times$ and the overall memory footprint by up to 33.5%. Moreover, OLE can reduce the execution time of convolutional layers by 57.7% on average, resulting in an average speedup of $2.3\times$ for representative CNN models.
- Published
- 2022
- Full Text
- View/download PDF
87. Implementation of Beeman's algorithm to calculate execution time on GPU using CUDA.
- Author
-
Rtal, Youness and Hadjoudja, Abdelkader
- Subjects
*GRAPHICS processing units, *MICROPROCESSORS, *PARALLEL computers, *INFORMATION storage & retrieval systems, *DIFFERENTIAL equations
- Abstract
Graphics processing units (GPUs) are microprocessors designed for the display and manipulation of graphics data. Currently, these graphics processors are found on all graphics hardware and have become very important instruments for parallel computing. GPUs are practical tools for the development of several fields, such as decoding and encoding and solving differential equations. Their advantages are increased performance, faster data processing, and reduced power consumption. It is simple to program a GPU with CUDA C to run parallel calculations, but it is necessary to understand the architectural aspects of the GPU and CUDA C. In this paper, we describe and implement Beeman's algorithm on GPU and CPU using CUDA C to solve the differential equation of charged particles in an electromagnetic field. Our goal is to evaluate the performance of the implementation on GPU and CPU processors and to deduce the efficiency of using GPUs. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
88. Hydrothermal Analysis of Archimedean Spiral Channel Heat Sink for CPU Cooling.
- Author
-
Rashad, Hala M., Najim, Younis M., and Ismaeal, Hatem H.
- Subjects
HEAT sinks (Electronics), ENGINEERING models, VELOCITY, TEMPERATURE, KINETIC energy
- Abstract
The rapid improvement of engineering modeling is supported by the improvement of parallel GPU and CPU computational capacities. However, due to space limitations, the improvement of the computational capacities of GPU and CPU imposes challenges in the cooling process. The liquid cooling method has attracted more interest as an effective heat dissipation method. In this work, a new channel configuration is introduced using the Archimedean spiral curve to generate the Archimedean spiral channel configuration. The conjugate heat sink model was created to have four different domains: liquid coolant (water), cold plate (copper), glue layer (ethoxy), and CPU (alumina). The effect of turbulence was incorporated by varying the flow rate at a constant water inlet temperature of 25 °C to cover a range of Reynolds numbers (Re) from 3000 to 15000. The Shear Stress Transport (k-ω SST) turbulence model was used to better capture the viscous, high-frequency flow fluctuations in the near-wall region. An input power of 450 W was applied to the bottom surface of the CPU. The results showed that the Reynolds number has a decisive impact on controlling the CPU temperature: a higher Re decreased the average temperature developed within the CPU and increased the pressure drop at an exponential rate. The Darcy-Weisbach equation confirmed these findings for internal flow, where the pressure drop depends on the squared average velocity. The hydrothermal performance of the Archimedean spiral channel configuration rapidly decreased with Re. Similar to the velocity profile, the turbulent kinetic energy is generated at a higher rate next to the channels’ outer wall compared to the inner wall. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
89. Study of bypassing Microsoft Windows Security using the MITRE CALDERA Framework [version 3; peer review: 2 approved]
- Author
-
Nachaat Mohamed
- Subjects
Research Article, Articles, APT, CPU, Attack, Exploit, Detection, Cyberattack.
- Abstract
Background: Microsoft Windows Security is a recently implemented safeguard for the Windows operating systems, including the latest versions of Windows 10 and 11. However, there is a major shortcoming in this system to stop Advanced Persistent Threat (APT). These are government-financed groups that are funded to attack other government entities. Following the initial security breach, the hacked Windows device is used to access the rest of the network devices in order to transfer data to external storage (Exfiltration). Methods: In this work, we have tested the Microsoft Windows Security system using MITRE CALDERA and ATT&CK frameworks and explain how APT groups are able to bypass Windows Security. Results: In this study we used '54ndc47' agent through GoLang feature in MITRE CALDERA platform to test and bypass Microsoft Windows Security systems (MS Windows 10). Through it, we were able to bypass the Windows Security system and display entire files in the victim's device. Conclusions: In this paper, we have provided recommendations to Microsoft to improve their Windows Security tool through the use of Artificial intelligence (AI).
- Published
- 2022
- Full Text
- View/download PDF
90. Performance Modeling and Optimization for Machine Learning Workloads
- Author
-
Lin, Zhongyi
- Subjects
Computer engineering, Electrical engineering, CPU, GPU, Machine Learning, Parallel Computing, Performance Modeling
- Abstract
Machine learning (ML) workloads have emerged and evolved drastically in many respects in recent years. ML workloads' performance, i.e., training/inference speed on various devices/platforms, stands as one of the top considerations in their development. Performance modeling is a powerful technique that helps ML practitioners understand the performance bottlenecks of ML workloads and optimize them. In this dissertation, we showcase how to use performance models to assist in the optimization of ML performance, and how we design such models that are highly accurate, robust, and versatile with different application configurations, such as training/inference, ML model types, and device types. We first show how to use the roofline model as a simple operator (op) level performance model to identify kernel/layer fusion candidates in convolutional neural networks (CNN). We answer the question of when and why fusing two linearly connected complex ops, i.e., convolution (conv) and depthwise convolution (dw-conv), in an ML model will be beneficial in terms of execution time, and propose a deep learning (DL) compiler friendly solution that enables efficient auto-tuning of the fused kernel schedule of two layers on multicore CPUs and beats the separate-kernel execution performance of TVM (by 1.09x geomean and 1.29x max) and MKLDNN-backed PyTorch (by 2.09x geomean and 3.35x max) as end-to-end (E2E) baselines. Next, we present a more complicated application of performance models in predicting and aiding the optimization of ML training performance on GPU platforms.
Built on top of a series of kernel-level performance models, either ML-based or analytical, for dominating ops/kernels as well as the overhead analysis for all ops in the deep learning recommendation model (DLRM), we devise a critical-path-based performance model that not only predicts the per-batch training time of DLRM on a single GPU with a low error rate (geomean: 4.61% for GPU active time, 7.96% for E2E, and 10.15% for E2E with shared overheads) but can also be generalized to other types of ML models such as computer vision (CV) and natural language processing (NLP). Finally, we further extend this performance model to multi-GPU platforms by adding support for 1) communication collective performance modeling, 2) GPU stream synchronizations on the same device and across devices in the E2E time prediction algorithm, and 3) data-distribution-aware and problem-size-flexible performance modeling of embedding table lookup. On single-node multi-GPU platforms, this enhanced model exhibits robustness on DLRM models with random embedding tables, maintains low training speed prediction error (geomean: 5.21% for E2E with shared overheads on randomly generated DLRMs), and generalizes well to NLP models with 3.00% geomean prediction error. With a use case, we demonstrate its ability to quickly select the embedding table sharding configuration and thus improve the end-to-end training performance of DLRMs.
- Published
- 2023
91. Balancing of Web Applications Workload Using Hybrid Computing (CPU–GPU) Architecture
- Author
-
Chandrashekhar, B. N., Kantharaju, V., Harish Kumar, N., and Kumble, Lithin
- Published
- 2024
- Full Text
- View/download PDF
92. Accelerating Radiowave Propagation Simulations: A GPU-based Approach to Parabolic Equation Modeling
- Author
-
Nilsson, Andreas
- Abstract
This study explores the application of GPU-based algorithms in radiowave propagation modeling, specifically through the scope of solving parabolic wave equations. Radiowave propagation models are crucial in the field of wireless communications, where they help predict how radio waves travel through different environments, which is vital for planning and optimization. The research specifically examines the implementation of two numerical methods: the Split Step Method and the Finite Difference Method. Both methods are adapted to utilize the parallel processing capabilities of modern GPUs, harnessing a parallel computing framework known as CUDA to achieve considerable speed enhancements compared to traditional CPU-based methods. Our findings reveal that the Split Step Method generally achieves higher speedup factors, especially in scenarios involving large system sizes and high-frequency simulations, making it particularly effective for expansive and complex models. In contrast, the Finite Difference Method shows more consistent speedup across various domain sizes and frequencies, suggesting its robustness across a diverse range of simulation conditions. Both methods maintained high accuracy levels, with differences in computed norms remaining low when comparing GPU implementations against their CPU counterparts.
- Published
- 2024
93. Acceleration of 3D feature-enhancing noise filtering in hybrid CPU/GPU systems
- Author
-
Ministerio de Ciencia e Innovación (España), Agencia Estatal de Investigación (España), Ministerio de Ciencia, Innovación y Universidades (España), European Commission, https://ror.org/02gfc7t72, González-Ruiz, Vicente, Moreno, J. J., and Fernández, José Jesús
- Abstract
FlowDenoising is a new approach to noise reduction in biological volumes obtained with three-dimensional electron microscopy (3DEM). Its abilities to enhance the structural features stem from the fact that an anisotropic Gaussian filtering is steered according to the local structures. To this end, the Optical Flow (OF) among consecutive slices is estimated, which is the most computationally expensive step in this approach. In this article, a hybrid CPU/GPU implementation of FlowDenoising is introduced and evaluated. It exploits parallel computing by distributing the workload among multiple cores and takes advantage of the massive processing in GPUs to accelerate the OF estimation. The hybrid implementation provides remarkable speed-up factors and an important reduction of the processing time, which is particularly relevant for the denoising of huge volumes typically found in 3DEM.
- Published
- 2024
94. Compressed SVD-based L + S model to reconstruct undersampled dynamic MRI data using parallel architecture.
- Author
-
Shafique M, Qazi SA, and Omer H
- Subjects
- Humans, Phantoms, Imaging, Artifacts, Image Interpretation, Computer-Assisted methods, Signal-To-Noise Ratio, Reproducibility of Results, Algorithms, Image Processing, Computer-Assisted methods, Magnetic Resonance Imaging methods, Heart diagnostic imaging, Data Compression methods
- Abstract
Background: Magnetic Resonance Imaging (MRI) is a highly demanded medical imaging system due to high resolution, large volumetric coverage, and ability to capture the dynamic and functional information of body organs; e.g., cardiac MRI is employed to assess cardiac structure and evaluate blood flow dynamics through the cardiac valves. Long scan time is the main drawback of MRI, which makes it difficult for the patients to remain still during the scanning process. Objective: By collecting fewer measurements, MRI scan time can be shortened, but this undersampling causes aliasing artifacts in the reconstructed images. Advanced image reconstruction algorithms have been used in the literature to overcome these undersampling artifacts. These algorithms are computationally expensive and require a long time for reconstruction, which makes them infeasible for real-time clinical applications, e.g., cardiac MRI. However, exploiting the inherent parallelism in these algorithms can help to reduce their computation time. Methods: The low-rank plus sparse (L+S) matrix decomposition model is a technique used in the literature to reconstruct highly undersampled dynamic MRI (dMRI) data at the expense of long reconstruction time. In this paper, a Compressed Singular Value Decomposition (cSVD) model is used in the L+S decomposition model (instead of conventional SVD) to reduce the reconstruction time. The results provide improved quality of the reconstructed images. Furthermore, it has been observed that cSVD and other parts of the L+S model possess highly parallel operations; therefore, a customized GPU-based parallel architecture of the modified L+S model has been presented to further reduce the reconstruction time. Results: Four cardiac MRI datasets (three different cardiac perfusion datasets acquired from different patients and one cardiac cine dataset), each with different acceleration factors of 2, 6 and 8, are used for experiments in this paper.
Experimental results demonstrate that using the proposed parallel architecture for the reconstruction of cardiac perfusion data provides a speed-up factor up to 19.15× (with memory latency) and 70.55× (without memory latency) in comparison to the conventional CPU reconstruction with no compromise on image quality. Conclusion: The proposed method is well-suited for real-time clinical applications, offering a substantial reduction in reconstruction time. (© 2023. The Author(s), under exclusive licence to European Society for Magnetic Resonance in Medicine and Biology (ESMRMB).)
- Published
- 2024
- Full Text
- View/download PDF
95. CPU vs GPU Performance of MATLAB Clustering Algorithms
- Author
-
Ivanov, Andrey, Natalia, Ziazina, Veronika, Antonova, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Vishnevskiy, Vladimir M., editor, Samouylov, Konstantin E., editor, and Kozyrev, Dmitry V., editor
- Published
- 2020
- Full Text
- View/download PDF
96. Bounding Volume Hierarchy Acceleration Through Tightly Coupled Heterogeneous Computing
- Author
-
Rivera-Alvarado, Ernesto, Torres-Rojas, Francisco J., Barbosa, Simone Diniz Junqueira, Editorial Board Member, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Kotenko, Igor, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Crespo-Mariño, Juan Luis, editor, and Meneses-Rojas, Esteban, editor
- Published
- 2020
- Full Text
- View/download PDF
97. Development of Game Modules with Support for Synchronous Multiplayer Based on Unreal Engine 4 Using Artificial Intelligence Approach
- Author
-
Levchenko, Bohdan, Chukhray, Andrii, Chumachenko, Dmytro, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Nechyporuk, Mykola, editor, Pavlikov, Vladimir, editor, and Kritskiy, Dmitriy, editor
- Published
- 2020
- Full Text
- View/download PDF
98. A Makespan Lower Bound for the Tiled Cholesky Factorization Based on ALAP Schedule
- Author
-
Beaumont, Olivier, Langou, Julien, Quach, Willy, Shilova, Alena, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Malawski, Maciej, editor, and Rzadca, Krzysztof, editor
- Published
- 2020
- Full Text
- View/download PDF
99. Polynomial Scheduling Algorithm for Parallel Applications on Hybrid Platforms
- Author
-
Ait Aba, Massinissa, Zaourar, Lilia, Munier, Alix, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Baïou, Mourad, editor, Gendron, Bernard, editor, Günlük, Oktay, editor, and Mahjoub, A. Ridha, editor
- Published
- 2020
- Full Text
- View/download PDF
100. From the Office to the Cloud, Why Should You Care? : The Short Answer: Because It Can Benefit You
- Author
-
Stradi-Granados, Benito A.
- Published
- 2020
- Full Text
- View/download PDF