13,556 results
Search Results
2. Rethinking Engineering Education on the Teaching and Research Practice of Computer Architecture
- Author
-
Xu, Qingzhen, Mao, Mingzhi, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Gan, Jianhou, editor, Pan, Yi, editor, Zhou, Juxiang, editor, Liu, Dong, editor, Song, Xianhua, editor, and Lu, Zeguang, editor
- Published
- 2024
3. A Proposal for a Standard Evaluation Method for Assessing Programming Proficiency in Assembly Language
- Author
-
Rivera-Alvarado, Ernesto, Guadamuz, Saúl, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Nagar, Atulya K., editor, Jat, Dharm Singh, editor, Mishra, Durgesh, editor, and Joshi, Amit, editor
- Published
- 2024
4. An Efficient Workload Distribution Mechanism for Tightly Coupled Heterogeneous Hardware
- Author
-
Rivera-Alvarado, Ernesto, Torres-Rojas, Francisco J., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Nagar, Atulya K., editor, Singh Jat, Dharm, editor, Mishra, Durgesh Kumar, editor, and Joshi, Amit, editor
- Published
- 2023
5. Neocortex and Bridges-2: A High Performance AI+HPC Ecosystem for Science, Discovery, and Societal Good
- Author
-
Buitrago, Paola A., Nystrom, Nicholas A., Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Nesmachnow, Sergio, editor, Castro, Harold, editor, and Tchernykh, Andrei, editor
- Published
- 2021
6. Benchmarking Solvers for the One Dimensional Cubic Nonlinear Klein Gordon Equation on a Single Core
- Author
-
Muite, B. K., Aseeri, Samar, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Gao, Wanling, editor, Zhan, Jianfeng, editor, Fox, Geoffrey, editor, Lu, Xiaoyi, editor, and Stanzione, Dan, editor
- Published
- 2020
7. Design Discussion and Performance Research of the Third-Level Cache in a Multi-socket, Multi-core Microchip
- Author
-
Li, Nan, Deng, Rangyu, Zhang, Ying, Zhou, Hongwei, Barbosa, Simone Diniz Junqueira, Editorial Board Member, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Kotenko, Igor, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Xu, Weixia, editor, Xiao, Liquan, editor, Li, Jinwen, editor, and Zhu, Zhenzhen, editor
- Published
- 2019
8. A Nomadic Testbed for Teaching Computer Architecture
- Author
-
Godoy, Pablo D., Garino, Carlos G. García, Cayssials, Ricardo L., Barbosa, Simone Diniz Junqueira, Editorial Board Member, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Kotenko, Igor, Editorial Board Member, Yuan, Junsong, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Pesado, Patricia, editor, and Aciti, Claudio, editor
- Published
- 2019
9. A Study of Intelligent Paper Grouping Model for Adult Higher Education Based on Random Matrix.
- Author
-
Wang, Yan
- Subjects
ADULT education ,HIGHER education ,RANDOM matrices ,DATABASE design ,CHAOS theory ,COMPUTER architecture ,PARTICLE swarm optimization ,COVARIANCE matrices
- Abstract
This paper presents a comprehensive study and analysis of the intelligent grouping of papers in adult higher education using a random matrix approach. Using the results of random matrix theory on the eigenvalues of the sample covariance matrix, the energy of each subspace is estimated, and the estimated energy is then used to construct a subspace weighting matrix. The statistical properties of the sample covariance matrix eigenvectors are analyzed using the first-order perturbation approximation, and then, asymptotic results from random matrix theory on the projection of the sample covariance matrix signal subspace to the real signal parametrization are used to obtain the weighting matrix based on the random matrix eigenvectors. Dynamic adjustment according to the fitness of individuals in the population is performed to ensure population diversity, while the addition of the small habitat technique keeps the algorithm from converging prematurely. The algorithm introduces chaos theory to optimize the population initialization process and uses the dynamic traversal randomness of chaos to select individuals in the population so that the initial population is close to the desired target solution. The design of the fitness function in the genetic algorithm generally maps the objective function of the problem to the fitness function. A good fitness function can directly reflect the quality of the individuals in the group. Based on the in-depth study of the basic attributes of the test questions and the principles of test paper evaluation, the mathematical model and objective function of intelligent paper grouping are determined by the difficulty, knowledge points, and cognitive level of the test questions as the main constraints, and NCAGA is applied to the intelligent paper grouping method, which better completes the intelligent paper grouping session for the computer system architecture course.
In the process of designing the intelligent grouping algorithm, for the situations of premature convergence and convergence to locally optimal solutions that easily occur in the traditional genetic algorithm, this paper adopts the approach of adaptive adjustment of crossover probability and variation probability to improve the algorithm and achieves satisfactory results. Based on extensive business research, this paper completes the requirement analysis of the online practice system based on the intelligent grouping of papers and presents the functional design and database design of the key functional modules in the system in detail. Finally, this paper conducts functional tests on the system and analyses the test results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
10. Energy-Efficient VLSI Architecture & Implementation of Bi-modal Multi-banked Register-File Organization
- Author
-
Gudaparthi, Sumanth, Shrestha, Rahul, Barbosa, Simone Diniz Junqueira, Series editor, Chen, Phoebe, Series editor, Filipe, Joaquim, Series editor, Kotenko, Igor, Series editor, Sivalingam, Krishna M., Series editor, Washio, Takashi, Series editor, Yuan, Junsong, Series editor, Zhou, Lizhu, Series editor, Kaushik, Brajesh Kumar, editor, Dasgupta, Sudeb, editor, and Singh, Virendra, editor
- Published
- 2017
11. Crucial Topics in Computer Architecture Education and a Survey of Textbooks and Papers.
- Author
-
Yildiz, Abdullah, Gören, Sezer, Ugurdag, H. Fatih, Aktemur, Barış, and Akdogan, Taylan
- Subjects
COMPUTERS in education ,COMPUTER architecture ,TEXTBOOKS ,MICROPROCESSORS ,COMPUTER surveys ,MICROCONTROLLERS
- Abstract
We have been teaching undergraduate computer architecture since 2012 in an unconventional way. Most undergraduate computer architecture courses are based on microprocessors, and they quickly move into advanced topics such as instruction pipelining, forwarding, branch prediction, cache, and even memory management unit. We instead spend only the last one-third of our course on these topics. The first two thirds of the course is devoted to microcontrollers, i.e., simple-minded processors with no memory hierarchy, no branch prediction, sometimes even no pipelining. Our claim is that it is very hard to truly grasp the advanced topics without full grasp of the basics. Equipped with the above approach, this article comes up with an all-inclusive list of crucial topics for computer architecture education, and it surveys 25 computer architecture textbooks as well as 38 computer architecture education papers to see how much they cover these topics. In addition to that, the article contains a concise description of the perspective of our course. One of the pillars of our course is a working CPU on FPGA. We have so far had around 600 students design their own unique CPUs using Verilog given a complete instruction set, close to 70% of them with complete success. [ABSTRACT FROM AUTHOR]
- Published
- 2020
12. Teaching of IA-32 Assembly Language Programming Using Intel® Galileo
- Author
-
Phang, Tan Chee, Hashim, Shaiful Jahari b., Latiff, Nurul Adilah bt. Abdul, Rokhani, Fakhrul Zaman, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Huang, Tien-Chi, editor, Lau, Rynson, editor, Huang, Yueh-Min, editor, Spaniol, Marc, editor, and Yuen, Chun-Hung, editor
- Published
- 2017
13. Flipping a Course on Computer Architecture
- Author
-
Suleman, Hussein, Diniz Junqueira Barbosa, Simone, Series editor, Chen, Phoebe, Series editor, Du, Xiaoyong, Series editor, Filipe, Joaquim, Series editor, Kara, Orhun, Series editor, Kotenko, Igor, Series editor, Liu, Ting, Series editor, Sivalingam, Krishna M., Series editor, Washio, Takashi, Series editor, and Gruner, Stefan, editor
- Published
- 2016
14. 67‐1: Distinguished Paper: Efficient Multi‐Quality Super Resolution Using a Deep Convolutional Neural Network for an FPGA Implementation
- Author
-
Sang-Lyn Lee, Min Beom Kim, Ilho Kim, Soo Young Yoon, Hee Jung Hong, and Chang Gone Kim
- Subjects
Computer architecture ,business.industry ,Computer science ,media_common.quotation_subject ,Deep learning ,Quality (business) ,Artificial intelligence ,business ,Field-programmable gate array ,Superresolution ,Convolutional neural network ,media_common
- Published
- 2020
15. Position paper: Data for AI research (DAIR) infrastructure: advancing educational research and practice
- Author
-
Joksimović, Srećko, Siemens, George, Coyle, Damien, Zamecnik, Andrew, De Laat, Maarten, Dawson, Shane, Richey, Michael C, Kovanovic, Vitomir, Pardo, Abelardo, and Fey, Alexei
- Subjects
personalised e-learning ,big data application ,computer architecture ,knowledge management ,learning management systems ,data systems
- Abstract
The project represents a report that introduces a data infrastructure that a) integrates data from multiple sources, b) enables various access permissions to different stakeholders, c) provides model building and algorithm development within the data lake, and d) allows for the implementation of real-time analysis outputs including adaptive feedback and dashboards for both learners and teachers. This technical environment is foundational to the utilisation of artificial intelligence in knowledge processes and to establishing advanced applications such as personal knowledge graphs and contextual learning supports that are indicative of true personalised learning and sensemaking, simultaneously advancing research and practice of teaching and learning.
- Published
- 2022
16. CGRA-ME: An Open-Source Framework for CGRA Architecture and CAD Research : (Invited Paper)
- Author
-
Xinyuan Wang, Xiaoyi Ling, Hsuan Hsiao, Rami Beidas, Omar Ragheb, Tianyi Yu, Vimal Chacko, and Jason H. Anderson
- Subjects
Computer science ,CAD ,Solid modeling ,computer.software_genre ,Software framework ,Application-specific integrated circuit ,Computer architecture ,Systems architecture ,Verilog ,ComputerSystemsOrganization_SPECIAL-PURPOSEANDAPPLICATION-BASEDSYSTEMS ,Field-programmable gate array ,computer ,computer.programming_language ,Abstraction (linguistics)
- Abstract
Coarse-grained reconfigurable arrays (CGRAs) are programmable hardware platforms that can be used to realize application-specific accelerators for higher performance and energy efficiency. A CGRA is a 2D array of configurable logic blocks & interconnect, where the logic blocks are typically large & ALU-like, and the interconnect is word-wide. CGRA-ME is a software framework that enables the modelling and exploration of CGRA architectures, as well as research on CGRA CAD algorithms. With CGRA-ME, an architect can specify a CGRA architecture at a high level of abstraction. A set of applications can be mapped onto the architecture to assess the mappability, power, performance and cost. CGRA-ME also allows one to generate synthesizable Verilog RTL for the modelled CGRA, permitting its implementation as an ASIC or FPGA overlay. In this paper, we describe the CGRA-ME framework [5] and overview its capabilities and current limitations. We discuss ongoing and prior research conducted with the framework, as well as outline future plans. We believe CGRA-ME will be a valuable contribution to the community, enabling new research on CGRA CAD & architectures.
- Published
- 2021
17. Reproducibility Companion Paper: Outfit Compatibility Prediction and Diagnosis with Multi-Layered Comparison Network
- Author
-
Wei Hu, Bo Wu, Yueqi Zhong, Jan Zahálka, and Xin Wang
- Subjects
Reproducibility ,Experimental Replication ,Computer architecture ,Computer science ,business.industry ,Deep learning ,Compatibility (mechanics) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,02 engineering and technology ,Artificial intelligence ,business ,Software package
- Abstract
This companion paper supports the experimental replication of paper "Outfit Compatibility Prediction and Diagnosis with Multi-Layered Comparison Network", which is presented at ACM Multimedia 2019. We provide the software package for replicating the implementation of Multi-Layered Comparison Network (MCN), as well as the Polyvore-T dataset and baseline methods compared in the original paper. This paper contains the guides to reproduce the experiment results including outfit compatibility prediction, outfit diagnosis and automatic outfit revision.
- Published
- 2020
18. Ruche Networks: Wire-Maximal, No-Fuss NoCs : Special Session Paper
- Author
-
Chun Zhao, Dustin Richmond, Scott Davidson, Dai Cheol Jung, and Michael Taylor
- Subjects
Standard cell ,Router ,Very-large-scale integration ,Computer science ,Network packet ,Mesh networking ,02 engineering and technology ,021001 nanoscience & nanotechnology ,Chip ,Column (database) ,020202 computer hardware & architecture ,Computer architecture ,Hardware_INTEGRATEDCIRCUITS ,0202 electrical engineering, electronic engineering, information engineering ,Bandwidth (computing) ,0210 nano-technology
- Abstract
Network-On-Chip design has been an active area of academic research for two decades, but many proposed ideas have not been adopted in real chips because they have complex behavior or create significant risks in chip implementation. For this reason, many existing chips just employ fast, replicated vanilla dimension-ordered mesh NoCs. However, these networks do not come close to utilizing the full available VLSI wiring capabilities, and propagate packets at speeds that are significantly below the raw speed of wires. The ideal network would not require any custom circuits, and would decompose easily into a hierarchical CAD flow consisting of a top-level design instantiating a mesh of identical hardened tiles with short-wire neighbor connections. At the same time, this ideal network would easily scale to efficiently utilize the majority of the available chip wiring resources, and would offer a mechanism for scaling this wire usage up or down based on available bandwidth. Packets would spend a significant fraction of their time in wire delay rather than router delay. Finally, the NoC would be simple to understand. This paper proposes Ruche Networks, which fulfill these requirements. They are based on simple 2-D mesh networks but amplify the NoC bandwidth and reduce NoC diameter of tiled architectures by adding long-range physical channels from each tile to other tiles on the same row or column. The more distant the connections, the greater the bandwidth of the network and the lower the diameter. The distance is typically increased until all of the physical VLSI wiring bandwidth has been absorbed. We explain the rationale for this "ruching" and provide a simple methodology for designing and implementing these networks using a standard cell VLSI CAD flow. In this paper, we show the steps involved in ruching the HammerBlade Manycore’s mesh networks; these steps can easily apply to other designs.
- Published
- 2020
19. Ascend: a Scalable and Unified Architecture for Ubiquitous Deep Neural Network Computing : Industry Track Paper
- Author
-
Yuxing Hu, Jing Xia, Hu Liu, Xiping Zhou, Jiajin Tu, Honghui Yuan, and Heng Liao
- Subjects
Memory hierarchy ,business.industry ,Computer science ,020208 electrical & electronic engineering ,Symmetric multiprocessor system ,02 engineering and technology ,Data access ,Memory management ,Computer architecture ,Datapath ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Data center ,business ,Heterogeneous network
- Abstract
Deep neural networks (DNNs) have been successfully applied to a great variety of applications, ranging from small IoT devices to large scale services in a data center. In order to improve the efficiency of processing these DNN models, dedicated hardware accelerators are required for all these scenarios. Theoretically, there exists an optimized acceleration architecture for each application. However, considering the cost of chip design and corresponding tool-chain development, researchers need to trade off between efficiency and generality. In this work, we demonstrate that it is practical to use a unified architecture, called Ascend, to support those applications, ranging from IoT devices to data-center services. We provide a lot of design details to explain that the success of Ascend relies on contributions from different levels. First, heterogeneous computing units are employed to support various DNN models. And the datapath is adapted according to the requirement of computing and data access. Second, when scaling the Ascend architecture from a single core to a cluster containing thousands of cores, it involves design efforts, such as memory hierarchy and system level integration. Third, a multi-tier compiler, which provides flexible choices for developers, is the last critical piece. Experimental results show that using accelerators based on the Ascend architecture can achieve comparable or even better performance in different applications. In addition, various chips based on the Ascend architecture have been successfully commercialized. More than 100 million chips have been used in real products.
- Published
- 2021
20. AWD: Best Paper Competition (AWD) Enabling Next Generation Video Applications on Consumer Integrated and Discrete Client GPUs
- Author
-
Jill Macdonald Boyce and Basel Salahieh
- Subjects
Competition (economics) ,Computer architecture ,Computer science ,Encoding (memory) ,Codec ,Content adaptive ,Graphics ,Implementation ,Transform coding ,Power optimization
- Abstract
This "success story" panel illustrates how next generation video applications are enabled on PCs today using consumer integrated and discrete GPUs launched in 2020. Intel's XeLP graphics technology with dedicated media hardware powers the integrated graphics in Intel's latest client processor, Tiger Lake, and Intel's first entry-level mainstream discrete graphics card, DG1. XeLP graphics in Tiger Lake and DG1 has democratized access to high performance implementations of the latest emerging video codec standards. Three topics will be covered: (i) 8K HEVC/AV1 Playback with Content Adaptive Power Optimization; (ii) Ludicrous Speed HEVC Encoding with Integrated + Discrete GPU; and (iii) MPEG Immersive Video (MIV) Playback on DG1.
- Published
- 2021
21. Deep neural networks compiler for a trace-based accelerator (short WIP paper)
- Author
-
Aliasger Zaidy, Eugenio Culurciello, Lukasz Burzawa, and Andre Xian Ming Chang
- Subjects
020203 distributed computing ,business.industry ,Computer science ,Dataflow ,Deep learning ,Image processing ,Memory bandwidth ,02 engineering and technology ,010501 environmental sciences ,computer.software_genre ,01 natural sciences ,Computer Graphics and Computer-Aided Design ,Computer architecture ,0202 electrical engineering, electronic engineering, information engineering ,Deep neural networks ,Artificial intelligence ,Compiler ,business ,Field-programmable gate array ,computer ,Software ,0105 earth and related environmental sciences ,TRACE (psycholinguistics)
- Abstract
Deep Neural Networks (DNNs) are the algorithm of choice for image processing applications. DNNs present highly parallel workloads that lead to the emergence of custom hardware accelerators. Deep Learning (DL) models specialized in different tasks require a programmable custom hardware and a compiler/mapper to efficiently translate different DNNs into an efficient dataflow in the accelerator. The goal of this paper is to present a compiler for running DNNs on Snowflake, which is a programmable hardware accelerator that targets DNNs. The compiler correctly generates instructions for various DL models: AlexNet, VGG, ResNet and LightCNN9. Snowflake, with a varying number of processing units, was implemented on FPGA to measure the compiler and Snowflake performance properties upon scaling up. The system achieves 70 frames/s and 4.5 GB/s of off-chip memory bandwidth for AlexNet without linear layers on Xilinx’s Zynq-SoC XC7Z045 FPGA.
- Published
- 2018
22. High performance network components for scalable spaceborne processing needs: Poster, short paper
- Author
-
Richard W. Berger and Joseph R. Marshall
- Subjects
Engineering ,Random access memory ,Computer architecture ,business.industry ,Interface (Java) ,Embedded system ,Emphasis (telecommunications) ,Short paper ,Scalability ,Electromagnetic compatibility ,High performance network ,business ,SpaceWire
- Abstract
This paper will describe high performance interface building blocks, compare their networking features and show how they may be used in small and large systems especially as they apply to SpaceVPX modules. Emphasis will be placed on their SpaceWire and other networking capabilities.
- Published
- 2016
23. Critically appraised paper: Tailored prescription of digitally enabled rehabilitation may improve mobility, but not physical activity, in geriatric and neurological rehabilitation [commentary]
- Author
-
Catherine M Said
- Subjects
Man-Computer Interface ,medicine.medical_treatment ,Psychological intervention ,Consumer Electronics ,Computer Architecture ,law.invention ,Randomized controlled trial ,law ,Medicine and Health Sciences ,Public and Occupational Health ,Range of Motion, Articular ,Rehabilitation ,Neurological Rehabilitation ,Virtual Reality ,Prescriptions ,Neurology ,Engineering and Technology ,medicine.symptom ,Range of motion ,Research Article ,Biotechnology ,Computer and Information Sciences ,medicine.medical_specialty ,Patients ,Visual impairment ,Equipment ,Bioengineering ,Physical Therapy, Sports Therapy and Rehabilitation ,Rehabilitation Medicine ,Intervention (counseling) ,medicine ,Humans ,Medical prescription ,Exercise ,Measurement Equipment ,Aged ,Inpatients ,business.industry ,lcsh:RM1-950 ,Australia ,Biology and Life Sciences ,Physical Activity ,Health Care ,lcsh:Therapeutics. Pharmacology ,Mobility Limitation ,Human Factors Engineering ,Neurorehabilitation ,Physical therapy ,Medical Devices and Equipment ,Electronics ,business ,User Interfaces
- Abstract
Background Digitally enabled rehabilitation may lead to better outcomes but has not been tested in large pragmatic trials. We aimed to evaluate a tailored prescription of affordable digital devices in addition to usual care for people with mobility limitations admitted to aged care and neurological rehabilitation. Methods and findings We conducted a pragmatic, outcome-assessor-blinded, parallel-group randomised trial in 3 Australian hospitals in Sydney and Adelaide recruiting adults 18 to 101 years old with mobility limitations undertaking aged care and neurological inpatient rehabilitation. Both the intervention and control groups received usual multidisciplinary inpatient and post-hospital rehabilitation care as determined by the treating rehabilitation clinicians. In addition to usual care, the intervention group used devices to target mobility and physical activity problems, individually prescribed by a physiotherapist according to an intervention protocol, including virtual reality video games, activity monitors, and handheld computer devices for 6 months in hospital and at home. Co-primary outcomes were mobility (performance-based Short Physical Performance Battery [SPPB]; continuous version; range 0 to 3; higher score indicates better mobility) and upright time as a proxy measure of physical activity (proportion of the day upright measured with activPAL) at 6 months. The dataset was analysed using intention-to-treat principles. The trial was prospectively registered with the Australian New Zealand Clinical Trials Registry (ACTRN12614000936628). Between 22 September 2014 and 10 November 2016, 300 patients (mean age 74 years, SD 14; 50% female; 54% neurological condition causing activity limitation) were randomly assigned to intervention (n = 149) or control (n = 151) using a secure online database (REDCap) to achieve allocation concealment. Six-month assessments were completed by 258 participants (129 intervention, 129 control). 
Intervention participants received on average 12 (SD 11) supervised inpatient sessions using 4 (SD 1) different devices and 15 (SD 5) physiotherapy contacts supporting device use after hospital discharge. Changes in mobility scores were higher in the intervention group compared to the control group from baseline (SPPB [continuous, 0–3] mean [SD]: intervention group, 1.5 [0.7]; control group, 1.5 [0.8]) to 6 months (SPPB [continuous, 0–3] mean [SD]: intervention group, 2.3 [0.6]; control group, 2.1 [0.8]; mean between-group difference 0.2 points, 95% CI 0.1 to 0.3; p = 0.006). However, there was no evidence of a difference between groups for upright time at 6 months (mean [SD] proportion of the day spent upright at 6 months: intervention group, 18.2 [9.8]; control group, 18.4 [10.2]; mean between-group difference −0.2, 95% CI −2.7 to 2.3; p = 0.87). Scores were higher in the intervention group compared to the control group across most secondary mobility outcomes, but there was no evidence of a difference between groups for most other secondary outcomes including self-reported balance confidence and quality of life. No adverse events were reported in the intervention group. Thirteen participants died while in the trial (intervention group: 9; control group: 4) due to unrelated causes, and there was no evidence of a difference between groups in fall rates (unadjusted incidence rate ratio 1.19, 95% CI 0.78 to 1.83; p = 0.43). Study limitations include 15%–19% loss to follow-up at 6 months on the co-primary outcomes, as anticipated; the number of secondary outcome measures in our trial, which may increase the risk of a type I error; and potential low statistical power to demonstrate significant between-group differences on important secondary patient-reported outcomes. Conclusions In this study, we observed improved mobility in people with a wide range of health conditions making use of digitally enabled rehabilitation, whereas time spent upright was not impacted. 
Trial registration The trial was prospectively registered with the Australian New Zealand Clinical Trials Register; ACTRN12614000936628. In a randomised controlled trial, Leanne Hassett and colleagues investigate the impact of digitally-enabled aged care and neurological rehabilitation on activity and mobility outcomes in Australia. Author summary Why was this study done? A higher dose of therapy in physical rehabilitation is associated with better outcomes; however, current rehabilitation models deliver low therapy doses. Use of digital devices such as virtual reality video games, activity monitors, and handheld computer devices can be enjoyable, provide feedback on performance, and may enable a greater dose of task-specific therapy to improve outcomes. Current evidence is yet to confidently confirm the effects of rehabilitation using digital devices in addition to usual rehabilitation care on mobility tasks such as walking and other important outcomes such as quality of life. What did the researchers do and find? In a pragmatic, outcome-assessor-blinded randomised controlled trial, 300 people with walking difficulties (age 72 ± 16 years, 50% female) received usual multidisciplinary inpatient and post-hospital aged care and neurological rehabilitation alone, or in addition used a range of affordable devices such as virtual reality video games, activity monitors, and handheld devices to target mobility and physical activity, as individually prescribed by a physiotherapist for 6 months. On average participants in the intervention group used 4 ± 1 devices in the inpatient setting and 2 ± 1 devices in the post-hospital setting. This approach was feasible and enjoyed, and demonstrated it could be provided across care settings including the post-hospital setting with mostly remote support. Clinically important improvement was seen in mobility at 3 weeks and 6 months after baseline, but this was not accompanied by greater time spent upright.
No adverse events were reported by participants whilst undertaking rehabilitation using digital devices, and there was no difference in the rate of falls between groups. What do these findings mean? Digitally enabled rehabilitation using a range of devices prescribed by a physiotherapist to target a range of mobility limitations across care settings for adults with mixed health conditions can improve mobility but not time spent upright. These results need to be interpreted in light of study limitations including a 15%–19% loss to follow-up at 6 months on the co-primary outcomes. Future models of rehabilitation should investigate incorporating digital devices to enhance inpatient and post-hospital rehabilitation, but prescription should ensure quality and quantity of practice.
- Published
- 2020
24. HPDM: A Survey Paper
- Author
-
Li Wang
- Subjects
MIMD ,Focus (computing) ,Computer architecture ,Shared memory ,Workstation ,law ,Computer science ,Carry (arithmetic) ,Component (UML) ,Parallelism (grammar) ,SIMD ,law.invention
- Abstract
This survey reviews several approaches to HPDM from many research groups worldwide. Modern computer hardware supports the development of high-performance applications for data analysis on many different levels. The focus is on modern multi-core processors built into today's commodity computers, which are typically found at university institutes both as small server and workstation computers. So they are deliberately not high-performance computers. Modern multi-core processors consist of several (2 to over 100) computer cores, which work independently of each other according to the principle of "multiple instruction multiple data" (MIMD). They have a common main memory (shared memory). Each of these computer cores has several (2-16) arithmetic-logic units, which can simultaneously carry out the same arithmetic operation on several data in a vector-like manner (single instruction multiple data, SIMD). HPDM algorithms must use both types of parallelism (SIMD and MIMD), with access to the main memory (centralized component) being the main barrier to increased efficiency.
- Published
- 2020
25. MAGICAL: Toward Fully Automated Analog IC Layout Leveraging Human and Machine Intelligence: Invited Paper
- Author
-
Mingjie Liu, Nan Sun, David Z. Pan, Biying Xu, Keren Zhu, Xiyuan Tang, Shaolan Li, and Yibo Lin
- Subjects
Heuristic (computer science) ,business.industry ,Computer science ,020208 electrical & electronic engineering ,Constraint (computer-aided design) ,02 engineering and technology ,Integrated circuit design ,Integrated circuit layout ,Automation ,020202 computer hardware & architecture ,Computer architecture ,Fully automated ,Hardware_INTEGRATEDCIRCUITS ,0202 electrical engineering, electronic engineering, information engineering ,Netlist ,Routing (electronic design automation) ,business - Abstract
Despite tremendous advancement of digital IC design automation tools over the last few decades, analog IC layout is still heavily manual, which is very tedious and error-prone. This paper will first review the history, challenges, and current status of analog IC layout automation. Then, we will present MAGICAL, a human-intelligence inspired, fully-automated analog IC layout system currently being developed under the DARPA IDEA program. It starts from an unannotated netlist, performs automatic layout constraint extraction and device generation, then performs placement and post-placement optimization, followed by routing to obtain the final GDSII layout. Various analytical, heuristic, and machine learning algorithms will be discussed. MAGICAL has obtained promising preliminary results. We will conclude the paper with further discussions on challenges and future directions for fully-automated analog IC layout.
- Published
- 2019
26. SAICSIT Papers in the ACM-DL.
- Author
-
Gruner, Stefan
- Subjects
COMPUTER architecture ,MANAGEMENT information systems ,INFORMATION resources management ,COMPUTERS ,COMPUTER science ,GRID computing - Published
- 2019
- Full Text
- View/download PDF
27. Wavelength-Routed Optical NoCs: Design and EDA — State of the Art and Future Directions: Invited Paper
- Author
-
Ulf Schlichtmann, Alexandre Truppel, Mengchu Li, Mahdi Nikdast, and Tsun-Ming Tseng
- Subjects
Range (mathematics) ,Computer architecture ,Computer science ,020208 electrical & electronic engineering ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,Electronic design automation ,02 engineering and technology ,State (computer science) ,Routing (electronic design automation) ,Component placement ,Waveguide (optics) ,020202 computer hardware & architecture - Abstract
Wavelength-routed optical network-on-chip (WRONoC) design consists of topological and physical synthesis. It covers many interacting design aspects such as wavelength assignment, message routing, network construction, component placement, and waveguide routing. Due to the high complexity of the design problem, current manual design usually trades optimality for scalability and feasibility, which results in performance degradation and waste of resources. In this paper, we will present an overview of the existing design automation approaches that have demonstrated their effectiveness in customizing and optimizing application-specific WRONoC designs, and of the potential design automation directions to address a wider range of design challenges. We will also discuss the advantages of comprehensive optimization considering multiple design aspects simultaneously, and the possible barriers that need to be removed to achieve this goal.
- Published
- 2019
28. Ultra-Low Power and Minimal Design Effort Interfaces for the Internet of Things: Invited paper
- Author
-
Orazio Aiello, Paolo Stefano Crovetti, and Massimo Alioto
- Subjects
Ultra low power ,Computer science ,business.industry ,020208 electrical & electronic engineering ,Design flow ,Digital-to-analog converter ,Reconfigurability ,020206 networking & telecommunications ,02 engineering and technology ,law.invention ,Software portability ,Computer architecture ,law ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,Internet of Things ,business - Abstract
This paper reviews the results of recent research aimed at extending the standard-cell based digital design flow to analog building blocks, so as to enhance scalability, reconfigurability and portability across technology nodes and to reduce design effort, time-to-market and costs. In this framework, the application of the proposed fully digital design approach to a wake up oscillator and to a Digital-to-Analog Converter, which are two building blocks widely employed in IoT sensor nodes, is illustrated in detail.
- Published
- 2019
29. Short Paper: Neuromorphic Chip Embedded Electronic Systems to Expand Artificial Intelligence
- Author
-
Hamid Abdi and Lahiru L. Abeysekara
- Subjects
medicine.anatomical_structure ,Neuromorphic engineering ,Artificial neural network ,Application-specific integrated circuit ,Computer architecture ,Computer science ,medicine ,Human brain ,Applications of artificial intelligence ,Electronics ,Electronic hardware ,Chip - Abstract
Neuromorphic chips are electronic hardware mimicking neurons in the human brain in an electronic structure. These ASICs (Application Specific Integrated Circuits) provide artificial neural networks with computational power comparatively higher than most neural networks generated by software algorithms. 'CM1K' is an electronic chip in this family of products. It has a parallel neural network of 1024 neurons. These neurons provide K-Nearest Neighbor (KNN) data classification. The chip must be embedded in an electronic system to access all its capabilities. This paper delivers a novel hardware system embedding the CM1K neuromorphic chip. The system was implemented in image and video frame analysis for evaluation. The results prove that the system could benefit various applications including security, asset management, home appliances, mail sorting and manufacturing. Since the embedded system provides an opportunity to integrate AI into simple electronics, it helps extend AI applications.
- Published
- 2019
30. Exploiting reconfigurable computing in 5G: a case study of latency critical function: Invited Paper
- Author
-
Piero Castoldi, F. Civerchia, Maxime Pelcat, Luca Valcarenghi, Scuola Universitaria Superiore Sant'Anna [Pisa] (SSSUP), Institut d'Electronique et de Télécommunications de Rennes (IETR), Université de Nantes (UN)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS), Institut Pascal - Clermont Auvergne (IP), Sigma CLERMONT (Sigma CLERMONT)-Centre National de la Recherche Scientifique (CNRS)-Université Clermont Auvergne (UCA), Institut d'Électronique et des Technologies du numéRique (IETR), Université de Nantes (UN)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS), and Nantes Université (NU)-Université de Rennes 1 (UR1)
- Subjects
OpenCL ,business.industry ,Orthogonal frequency-division multiplexing ,Computer science ,Hardware Acceleration ,030204 cardiovascular system & hematology ,Reconfigurable computing ,[SPI]Engineering Sciences [physics] ,03 medical and health sciences ,0302 clinical medicine ,Software ,Computer architecture ,Reconfigurable Computing ,Hardware acceleration ,030212 general & internal medicine ,Mobile telephony ,Latency (engineering) ,business ,Field-programmable gate array ,[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing ,5G ,ComputingMilieux_MISCELLANEOUS - Abstract
The fifth generation of mobile communications (5G) is expected to dramatically improve performance compared to preceding standards by offering very high bandwidths and low latencies. To provide this performance, heavy processing is required and must meet strong timing constraints. Reconfigurable computing, managing processing in software and exploiting reconfigurable hardware acceleration, is an innovative approach that should be considered for 5G for its capacity to combine high throughput and high flexibility. This paper presents a case study for Orthogonal Frequency Division Multiplexing (OFDM) computation reconfigurable offloading onto a Field Programmable Gate Array (FPGA). The implementation is based on Open Computing Language (OpenCL) that represents a versatile solution, as this language can be compiled for several architectures, provided that a Host+Accelerator structure is used. The objective of our study is to demonstrate that, by means of hardware offloading, the 5G architecture resources can reach high computational load, avoiding processing stalls and latency increase. Results show that around 15% of the software processing can be freed through hardware acceleration and reallocated to support other tasks.
- Published
- 2019
31. LSOracle: a Logic Synthesis Framework Driven by Artificial Intelligence: Invited Paper
- Author
-
Pierre-Emmanuel Gaillardon, Luca Amaru, Max Austin, Scott Temple, Xifan Tang, and Walter Lau Neto
- Subjects
Standard cell ,Computer science ,Context (language use) ,02 engineering and technology ,Integrated circuit ,020202 computer hardware & architecture ,law.invention ,Logic synthesis ,Computer architecture ,Application-specific integrated circuit ,law ,0202 electrical engineering, electronic engineering, information engineering ,Graph (abstract data type) ,020201 artificial intelligence & image processing ,Electronic design automation ,Hardware_LOGICDESIGN ,Electronic circuit - Abstract
The increasing complexity of modern Integrated Circuits (ICs) leads to systems composed of various different Intellectual Property (IP) blocks, known as System-on-Chip (SoC). Such complexity requires strong expertise from engineers, who rely on expensive commercial EDA tools. To overcome such a limitation, an automated open-source logic synthesis flow is required. In this context, this work proposes LSOracle: a novel automated mixed logic synthesis framework. LSOracle is the first to exploit state-of-the-art And-Inverter Graph (AIG) and Majority-Inverter Graph (MIG) logic optimizers and relies on a Deep Neural Network (DNN) to automatically decide which optimizer should handle different portions of the circuit. To do so, LSOracle applies $k$-way partitioning to split a DAG into multiple partitions and uses a DNN to choose the best-fit optimizer. Post-tech mapping ASIC results, targeting the 7nm ASAP standard cell library, for a set of mixed-logic circuits, show an average improvement in area-delay product of 6.87% (up to 10.26%) and 2.70% (up to 6.27%) when compared to AIG and MIG, respectively. In addition, we show that for the considered circuits, LSOracle achieves an area close to AIGs (which delivered smaller circuits) with a similar performance of MIGs, which delivered faster circuits.
- Published
- 2019
32. 2018 International Symposium on Computer Architecture influential paper award
- Author
-
Antonio González, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, and Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
- Subjects
Awards ,Hardware ,Computer architecture ,Hardware and Architecture ,Computer science ,Microprocessadors -- Consum d'energia ,Microprocessors -- Energy consumption ,Electrical and Electronic Engineering ,Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC] ,Software ,GeneralLiterature_MISCELLANEOUS ,Arquitectura d'ordinadors - Abstract
The International Symposium on Computer Architecture (ISCA) recognizes every year the most influential paper published in this conference 15 years earlier, based on its impact on research, development, products or ideas. This award is sponsored by the IEEE Computer Society Technical Committee on Computer Architecture (IEEE-CS TCCA) and the ACM Special Interest Group on Computer Architecture (ACM SIGARCH). In this year’s edition, the candidate papers were those papers published in ISCA 2003 proceedings. The selection process was chaired by Antonio González. Candidate papers for the award were selected by the current year’s ISCA Program Committee. The final award selection was made by the Award Chair (Antonio González), the IEEE-CS TCCA Chair (Lieven Eeckhout) and the ACM SIGARCH Chair (Sarita Adve). The award includes an honorarium for the authors and a certificate. The 2018 award was presented to “Temperature-Aware Microarchitecture” by Kevin Skadron, Mircea R. Stan, Wei Huang, Sivakumar Velusamy, Karthik Sankaranarayanan and David Tarjan.
- Published
- 2018
33. Normative Emotional Agents: A Viewpoint Paper.
- Author
-
Argente, Estefania, Val, E. Del, Perez-Garcia, D., and Botti, V.
- Abstract
Human social relationships imply conforming to the norms, behaviors, and cultural values of the society, but also socialization of emotions, to learn how to interpret and show them. In multiagent systems, much progress has been made in the analysis and interpretation of both emotions and norms. Nonetheless, the relationship between emotions and norms has hardly been considered and most normative agents do not consider emotions, or vice-versa. In this article, we provide an overview of relevant aspects within the area of normative agents and emotional agents. First we focus on the concept of norm, the different types of norms, its life cycle and a review of multiagent normative systems. Second, we present the most relevant theories of emotions, the life cycle of an agent’s emotions, and how emotions have been included through computational models in multiagent systems. Next, we present an analysis of proposals that integrate emotions and norms in multiagent systems. From this analysis, four relationships are detected between norms and emotions, which we analyze in detail and discuss how these relationships have been tackled in the reviewed proposals. Finally, we present a proposal for an abstract architecture of a Normative Emotional Agent that covers these four norm-emotion relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
34. Full-chip monolithic 3D IC design and power performance analysis with ASAP7 library: (Invited Paper)
- Author
-
Bon Woong Ku, Sung Kyu Lim, Kyungwook Chang, and Saurabh Sinha
- Subjects
Computer architecture ,Computer science ,020208 electrical & electronic engineering ,0202 electrical engineering, electronic engineering, information engineering ,Key (cryptography) ,Power performance ,Process design ,Node (circuits) ,02 engineering and technology ,Chip ,3d ic design ,020202 computer hardware & architecture - Abstract
In this paper, we present full-chip designs and their power, performance, and area (PPA) metrics using the ASAP7 process design kit (PDK) and library. A reliable cell library is a key element in evaluating new technological options such as monolithic 3D (M3D) ICs. Given an RTL, we conduct synthesis and place/route to obtain commercial-quality 2D and M3D IC designs and compare PPA. The ASAP7 library is highly useful to build high-quality designs that accurately reflect the 7nm technology node. In addition, the full front-end and back-end access provided in ASAP7 allows us to see the impact of various device and interconnect parameters at the full-chip level for both 2D and monolithic 3D ICs. This work demonstrates the critical role of an academic PDK and library in enabling high-quality research in disruptive technologies such as M3D integration.
- Published
- 2017
35. ASAP7 predictive design kit development and cell design technology co-optimization: Invited paper
- Author
-
Vinay Vashishtha, Manoj Vangala, and Lawrence T. Clark
- Subjects
010302 applied physics ,Standard cell ,Computer science ,Extreme ultraviolet lithography ,Process design ,01 natural sciences ,010309 optics ,Computer architecture ,0103 physical sciences ,Parasitic extraction ,Place and route ,Physical design ,Routing (electronic design automation) ,Lithography - Abstract
This work discusses the ASAP7 predictive process design kit (PDK) and associated standard cell library. The necessity for multi-patterning (MP) techniques at advanced nodes results in the standard cell and SRAM architecture becoming entangled with design rules, mandating design-technology co-optimization (DTCO). This paper discusses the DTCO process involving standard cell physical design. An assumption of extreme ultraviolet (EUV) lithography availability in the PDK allows bi-directional M1 geometries that are difficult with MP. Routing and power distribution schemes for self-aligned quadruple patterning (SAQP) friendly, high density standard cell based blocks are shown. Restrictive design rules are required and supported by the automated place and route (APR) setup. Supporting sub-20 nm dimensions with academic tool licenses is described. The APR (QRC techfile) extraction shows high correlation with the Calibre extraction deck. Finally, use of the PDK for academic coursework and research is discussed.
- Published
- 2017
36. Standard cell library design and optimization methodology for ASAP7 PDK: (Invited paper)
- Author
-
Andrew Evans, Brian Cline, Xiaoqing Xu, Saurabh Sinha, Greg Yeric, and Nishi Shah
- Subjects
Standard cell ,Computer science ,Transistor ,Process design ,02 engineering and technology ,Integrated circuit ,021001 nanoscience & nanotechnology ,020202 computer hardware & architecture ,law.invention ,Computer architecture ,law ,0202 electrical engineering, electronic engineering, information engineering ,Key (cryptography) ,Node (circuits) ,0210 nano-technology ,Design methods - Abstract
Standard cell libraries are the foundation for the entire back-end design and optimization flow in modern application-specific integrated circuit designs. At the 7nm technology node and beyond, standard cell library design and optimization is becoming increasingly difficult due to extremely complex design constraints, as described in the ASAP7 process design kit (PDK). Notable complexities include discrete transistor sizing due to FinFETs, complicated design rules from lithography and restrictive layout space from modern standard cell architectures. The design methodology presented in this paper enables efficient and high-quality standard cell library design and optimization with the ASAP7 PDK. The key techniques include exhaustive transistor sizing for cell timing optimization, transistor placement with generalized Euler paths and back-end design prototyping for library-level explorations.
- Published
- 2017
37. Multi-broker based software-defined optical networks (Invited paper)
- Author
-
Xiaoliang Chen, Andrea Castro, Roberto Proietti, S.J.B. Yoo, and Zuqing Zhu
- Subjects
Network control ,business.industry ,Computer science ,Quality of service ,Topology (electrical circuits) ,02 engineering and technology ,Blocking (statistics) ,Service provisioning ,Reduction (complexity) ,020210 optoelectronics & photonics ,Software ,Computer architecture ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,business - Abstract
This paper investigates the multi-broker based network control and management paradigm for realizing scalable and cost-effective service provisioning in multi-domain software-defined optical networks. Experimental results verify the feasibility of the proposal and demonstrate a ∼7.6× blocking reduction compared with the conventional single-broker based solution.
- Published
- 2017
38. 4.2: Invited Paper: OLCD: a low cost, area‐scalable manufacturing process for flexible displays
- Author
-
Paul Cain
- Subjects
Computer architecture ,Manufacturing process ,Flexible display ,Computer science ,Scalability - Published
- 2019
39. Generating FPGA-based image processing accelerators with Hipacc: (Invited paper)
- Author
-
Richard Membarth, Oliver Reiche, Jürgen Teich, Frank Hannig, and M. Akif Ozkan
- Subjects
020203 distributed computing ,Source code ,Computer science ,media_common.quotation_subject ,Image processing ,02 engineering and technology ,computer.software_genre ,020202 computer hardware & architecture ,Domain (software engineering) ,Digital subscriber line ,Computer architecture ,0202 electrical engineering, electronic engineering, information engineering ,Compiler ,Field-programmable gate array ,computer ,media_common ,Abstraction (linguistics) - Abstract
Domain-Specific Languages (DSLs) provide a high-level and domain-specific abstraction to describe algorithms within a certain domain concisely. Since a DSL separates the algorithm description from the actual target implementation, it offers a high flexibility among heterogeneous hardware targets, such as CPUs and GPUs. With the recent rise of promising High-Level Synthesis (HLS) tools, like Vivado HLS and Altera OpenCL, FPGAs are becoming another attractive target architecture. Particularly in the domain of image processing, applications often come with stringent requirements regarding performance, energy efficiency, and power, for which FPGAs have been proven to be among the most suitable architectures. In this work, we present the Hipacc framework, a DSL and source-to-source compiler for image processing. We show that domain knowledge can be captured to generate tailored implementations for C-based HLS from a common high-level DSL description targeting FPGAs. Our approach includes FPGA-specific memory architectures for handling point and local operators, as well as several high-level transformations. We evaluate our approach by comparing the resulting hardware accelerators to GPU implementations, generated from exactly the same DSL source code.
- Published
- 2017
40. Performance analysis and benchmarking of all-spin spiking neural networks (Special session paper)
- Author
-
Kaushik Roy, Aayush Ankit, and Abhronil Sengupta
- Subjects
010302 applied physics ,Spiking neural network ,Network complexity ,Speedup ,Artificial neural network ,Computer science ,business.industry ,Node (networking) ,02 engineering and technology ,021001 nanoscience & nanotechnology ,01 natural sciences ,Bottleneck ,Synapse ,Computer architecture ,Embedded system ,0103 physical sciences ,Benchmark (computing) ,Crossbar switch ,0210 nano-technology ,business - Abstract
Spiking Neural Network based brain-inspired computing paradigms are becoming increasingly popular tools for various cognitive tasks. The sparse event-driven processing capability enabled by such networks can be potentially appealing for implementation of low-power neural computing platforms. However, the parallel and memory-intensive computations involved in such algorithms are in complete contrast to the sequential fetch, decode, execute cycles of conventional von-Neumann processors. Recent proposals have investigated the design of spintronic “in-memory” crossbar based computing architectures driving “spin neurons” that can potentially alleviate the memory-access bottleneck of CMOS based systems and simultaneously offer the prospect of low-power inner product computations. In this article, we perform a rigorous system-level simulation study of such All-Spin Spiking Neural Networks on a benchmark suite of 6 recognition problems ranging in network complexity from 10k–7.4M synapses and 195–9.2k neurons. System level simulations indicate that the proposed spintronic architecture can potentially achieve ∼1292× energy efficiency and ∼235× speedup on average over the benchmark suite in comparison to an optimized CMOS implementation at the 45nm technology node.
- Published
- 2017
41. A PAPER SURVEY ON THE IMPLEMENTATION OF THE PARALLEL FDTD ON MULTIPROCESSORS USING MPI
- Author
-
Oyku Akaydin, Adamu Abubakar Isah, and Mehmet Kusaf
- Subjects
Computer architecture ,Computer science ,Interface (computing) ,010401 analytical chemistry ,0202 electrical engineering, electronic engineering, information engineering ,Local area network ,Finite-difference time-domain method ,020206 networking & telecommunications ,02 engineering and technology ,General Medicine ,Parallel computing ,01 natural sciences ,0104 chemical sciences - Abstract
This research work describes a cost-effective, high-performance computing platform for the parallel implementation of the FDTD algorithm using the message-passing interface (MPI) library on PC clusters, which are local area network systems consisting of multiple interconnected personal computers (PCs) and are already widely employed for parallel computing.
- Published
- 2017
42. 2017 International Symposium on Computer Architecture Influential Paper Award
- Author
-
David Brooks
- Subjects
Hardware_MEMORYSTRUCTURES ,Computer architecture ,Hardware and Architecture ,Computer science ,Hardware_INTEGRATEDCIRCUITS ,Hardware_PERFORMANCEANDRELIABILITY ,Electrical and Electronic Engineering ,Software ,Hardware_LOGICDESIGN - Abstract
This article discusses the 2017 ACM SIGARCH/IEEE-CS TCCA Influential ISCA Paper Award, which was given to the 2002 ISCA paper, “Drowsy Caches: Simple Techniques for Reducing Leakage Power.”
- Published
- 2017
43. Service and Energy Management in Fog Computing: A Taxonomy Approaches, and Future Directions.
- Author
-
Hashemi, S. M., Sahafi, A., Rahmani, A. M., and Bohlouli, M.
- Subjects
ENERGY management ,INTERNET of things ,ENERGY consumption ,COMPUTING platforms ,COMPUTER architecture ,EDGE computing - Abstract
Background and Objectives: Today, the increased number of Internet-connected smart devices require powerful computer processing servers such as cloud and fog and necessitate fulfilling requests and services more than ever before. The geographical distance of IoT devices to fog and cloud servers has turned issues such as delay and energy consumption into major challenges. However, fog computing technology has emerged as a promising technology in this field. Methods: In this paper, service/energy management approaches are generally surveyed. Then, we explain our motivation for the systematic literature review procedure (SLR) and how to select the related works. Results: This paper introduces four domains of service management and energy management, including Architecture, Resource Management, Scheduling management, and Service Management. Scheduling management has been used in 38% of the papers. Therefore, they have the highest service management and energy management. Also, Resource Management is the second domain that has been able to attract about 26% of the papers in service management and energy management. Conclusion: About 81% of the fog computing papers simulated their approaches, and the others implemented their schemes using a testbed in the real environment. Furthermore, 30% of the papers presented an architecture or framework for their research, along with their research. In this systematic literature review, papers have been extracted from five valid databases, including IEEE Xplore, Wiley, Science Direct (Elsevier), Springer Link, and Taylor & Francis, from 2013 to 2022. We obtained 1596 papers related to the discussed subject. We filtered them and achieved 47 distinct studies. In the following, we analyze and discuss these studies; then we review the parameters of service quality in the papers, and ultimately, we present the benefits, drawbacks, and innovations of each study. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. 2014 International Symposium on Computer Architecture Influential Paper Award; 2014 Maurice Wilkes Award Given to Ravi Rajwar
- Author
-
Dean M. Tullsen and Stephen W. Keckler
- Subjects
Computer architecture ,Hardware and Architecture ,Computer science ,Electrical and Electronic Engineering ,ComputingMilieux_MISCELLANEOUS ,GeneralLiterature_MISCELLANEOUS ,Software - Abstract
This column discusses two awards given in 2014: the International Symposium on Computer Architecture Influential Paper Award, which was given to the authors of the paper "PipeRench: A Coprocessor for Streaming Multimedia Acceleration," and the Maurice Wilkes Award, which was given to Ravi Rajwar.
- Published
- 2014
45. Hybrid large-area systems and their interconnection backbone (invited paper)
- Author
-
Warren Rieutort-Louis, Yu Hen Hu, Josue Sanz-Robinson, Naveen Verma, Tiffany Moy, Liechao Huang, Yasmin Afsar, Sigurd Wagner, Levent E. Aygun, and James C. Sturm
- Subjects
010302 applied physics ,Interconnection ,business.industry ,Computer science ,020208 electrical & electronic engineering ,02 engineering and technology ,Integrated circuit ,Modular design ,01 natural sciences ,law.invention ,CMOS ,Computer architecture ,law ,Hybrid system ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Electronics ,Telecommunications ,business - Abstract
Hybrid systems combine Large-Area Electronics (LAE) with high-performance technologies (e.g., silicon CMOS) [1]. With architectural concepts for hybrid systems broadening to match the range of emerging applications, this paper examines modular approaches for multi-sheet, multi-technology integration. It identifies the interfaces required as a critical backbone. For interfaces associated with various system functionalities (sensing, processing, powering), specific approaches are surveyed and analyzed, drawing on insights derived from several previous experimental demonstrations of complete hybrid systems.
- Published
- 2016
46. Hardware optimizations for crypto implementations (Invited paper)
- Author
-
Sandeep K. Shukla and M. Mohamed Asan Basiri
- Subjects
Very-large-scale integration ,Cryptographic primitive ,Computer science ,business.industry ,020208 electrical & electronic engineering ,02 engineering and technology ,Fault injection ,Encryption ,Multiplexing ,Multiplexer ,020202 computer hardware & architecture ,Computer architecture ,Embedded system ,0202 electrical engineering, electronic engineering, information engineering ,Side channel attack ,Elliptic curve cryptography ,business ,Computer hardware - Abstract
Latency, Area, and Power are three important metrics that a VLSI designer wants to optimize. However, often one of these may have to be optimized at the cost of another or the other two. Depending on the application scenario, the choice of the metric to optimize is made. In this paper, we consider hardware implementations of a number of cryptographic primitives and present a number of optimizations. We consider three areas of cryptoengineering. They are building physical unclonable functions (PUFs), implementing encryption/decryption algorithms, and side channel proof crypto implementations. The techniques we employ range from area optimization through customized multiplexer design, fusing multiple operations into a single hardware element, folding and unrolling of iterative algorithms, creating reconfigurable implementations to achieve multiple operations with the same set of hardware elements, to techniques of obfuscation to defeat fault injection based attacks on the crypto implementation. All the proposed and existing designs are implemented with a 45 nm CMOS library.
- Published
- 2016
47. A Scientometric Mapping of Contributions to Journal of Computer Science and Technology during 2012-2016.
- Author
-
Patel, Vimlesh
- Subjects
SCIENTOMETRICS ,COMPUTER science ,DISTRIBUTED computing ,COMPUTER architecture ,STATISTICAL methods in information science - Abstract
The paper presents a scientometric mapping of papers published in Journal of Computer Science and Technology during 2012 to 2016, as reflected in the Web of Science database. It attempts to analyze the growth and development of the publication output of Journal of Computer Science and Technology. Data for a total of 485 papers have been downloaded and analysed according to the objectives. The year-wise growth rate reveals that the highest number of papers was published in 2015 (106 papers, 21.86%). Authorship pattern data reveal that most authors prefer to publish papers in collaboration, and the most preferred authorship pattern was four authors, with 125 publications (25.77%). The Degree of Collaboration (DC) analysis reveals that the highest DC is 0.95 for co-authored publications. Among the most prolific authors, Zhang L published the highest number of papers (11). The international geographical distribution of contributions reveals that Peoples R China ranks first with 371 publications (76.50%). The institution-wise distribution of papers shows that the Chinese Academy of Sciences contributed the most, ranking 1st with 93 publications (19.18%), and the average number of citations per year (2012-2016) was 205. [ABSTRACT FROM AUTHOR]
- Published
- 2018
48. Data centres for IoT applications: The M2DC approach (Invited paper)
- Author
-
Lennart Tigges, Daniel Schlitt, Wolfgang Christmann, Michal Kierzynka, Ariel Oleksiak, Christian Pieper, Robert Plestenjak, Mario Porrmann, Mariano Cecowski, Loïc Cudennec, Udo Janssen, Thierry Goubier, Micha vor dem Berge, Chris Adeniyi-Jones, Carlo Brandolese, Meysam Peykanu, Giovanni Agosta, Jens Hagemeyer, William Fornaciari, René Griessl, Jean-Marc Philippe, Luca Ceva, Justin Cinkelj, Gerardo Pelosi, Stefan Krupop, Sven Rosinger, Poznan Supercomputing and Networking Center (PSNC), Department of Electronics, Information, and Bioengineering [Milano] (DEIB), Politecnico di Milano [Milan] (POLIMI), Christmann Informationstechnik + Medien (GERMANY), XLAB d.o.o., XLAB, Cognitive Interaction Technology [Bielefeld] (CITEC), Universität Bielefeld = Bielefeld University, Département d'Architectures, Conception et Logiciels Embarqués-LIST (DACLE-LIST), Laboratoire d'Intégration des Systèmes et des Technologies (LIST), Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Institute for Information Technology [Oldenburg] (OFFIS), ARM Ltd [Cambridge] (ARM), Najjar W., Gerstlaur A., European Project: 688201,H2020,H2020-ICT-2015,M2DC(2016), and Laboratoire d'Intégration des Systèmes et des Technologies (LIST (CEA))
- Subjects
Internet of things ,Computer science ,Embedded systems ,heterogeneous microserver computing resources ,computer centres ,Server architecture ,02 engineering and technology ,System efficiency ,cost-optimized server architecture ,World Wide Web ,modular microserver data-centre ,[SPI]Engineering Sciences [physics] ,Software ,Information management ,Cost optimized ,0202 electrical engineering, electronic engineering, information engineering ,[INFO]Computer Science [cs] ,Computer architecture ,Architecture ,Management strategies ,ComputingMilieux_MISCELLANEOUS ,Software data ,software data centre ecosystem ,flexible server architecture ,business.industry ,system efficiency enhancements ,Flexible servers ,advanced management strategies ,020206 networking & telecommunications ,Modular design ,Computing resource ,IoT applications ,M2DC ,020201 artificial intelligence & image processing ,Data center ,network servers ,IOT applications ,business ,Software engineering ,Internet of Things
- Abstract
Conference: 16th International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, SAMOS 2016; Conference Date: 17 July 2016 through 21 July 2016; Conference Code: 126004; International audience. The Modular Microserver DataCentre (M2DC) project investigates, develops and demonstrates a modular, highly efficient, cost-optimized server architecture composed of heterogeneous microserver computing resources that can be tailored to meet requirements from various application domains, including the Internet of Things. M2DC is built on three main pillars: a flexible server architecture that can be easily customised, maintained and updated; advanced management strategies and system efficiency enhancements (SEE); and well-defined interfaces to the surrounding software data centre ecosystem.
- Published
- 2016
49. Neuromorphic hardware acceleration enabled by emerging technologies (Invited paper)
- Author
-
Mengjie Mao, Qing Wu, Yi Chen, Xiaoxiao Liu, Hai Li, and Mark Barnell
- Subjects
Speedup ,Artificial neural network ,business.industry ,Computer science ,Symmetric multiprocessor system ,Memristor ,law.invention ,symbols.namesake ,Neuromorphic engineering ,Computer architecture ,law ,Embedded system ,Scalability ,symbols ,Unconventional computing ,business ,Von Neumann architecture
- Abstract
The explosion of big data applications imposes severe challenges of data processing speed and scalability on traditional computer systems. However, the performance of the von Neumann machine is greatly hindered by the increasing performance gap between CPU and memory, motivating the active research on new or alternative computing architectures. As one important instance, neuromorphic computing systems inspired by the working mechanism of human brains have gained considerable attention. In this work, we propose a heterogeneous computing system with neuromorphic computing accelerators (NCAs) that are built with emerging memristor technology. In the proposed system, NCA is designed to speed up the artificial neural network (ANN) executions in many high-performance applications by leveraging the extremely efficient mixed-signal computation capability of nanoscale memristor-based crossbar (MBC) arrays. The hierarchical MBC arrays of the NCA can be flexibly configured to different ANN topologies through the help of an analog Network-on-Chip (A-NoC). A general approach which translates the target codes within a program to the corresponding NCA instructions is also developed to facilitate the utilization of the NCA. Our simulation results show that compared to the baseline general purpose processor, the proposed heterogeneous system can achieve on average 18.2x performance speedup and 20.1x energy reduction over nine representative applications while constraining the computation accuracy degradation within an acceptable range.
- Published
- 2014
50. Computational modelling methods for pliable structures based on curved-line folding.
- Author
-
Vergauwen, Aline, Laet, Lars De, and Temmerman, Niels De
- Subjects
*COMPUTATIONAL complexity , *PAPER arts , *BENDING (Metalwork) , *FINITE element method software , *COMPUTER architecture
- Abstract
Curved-line folding, the act of folding paper along a pattern of curved lines to obtain a 3D shape, is an interesting starting-point for the design of innovative pliable structures. There exists a kinematic connection between two surfaces linked through a curved crease that can be used to generate a folding motion. However, due to the interdependency of geometry, forces and material properties the design of pliable structures based on curved-line folding is very complex. To facilitate the design process, adequate computational modelling methods are essential. This paper presents two ways of modelling: a geometric modelling method based on discretisation of the crease pattern and a method based on Finite Element Analysis (FEA). The proposed methods are validated by means of a case study in which a physical model is compared to digital ones. It can be concluded that the method based on FEA corresponds very well with the physical model, proving its potential. The accuracy of the geometric modelling is improved by the introduction of a set of guidelines based on the direction of the principal bending moments in the pliable structure. Furthermore, the case study exposes how the material-dependent behaviour of pliable structures increases the complexity of the design and should certainly be part of future research. [ABSTRACT FROM AUTHOR]
- Published
- 2017