287 results for "Poncino, M."
Search Results
252. Analysis of energy dissipation in the memory hierarchy of embedded systems: a case study.
- Author
-
Benini, L., Macii, A., Macii, E., and Poncino, M.
- Published
- 2000
- Full Text
- View/download PDF
253. FPGA synthesis using Look-Up Table and Multiplexor Based architectures.
- Author
-
Macii, E. and Poncino, M.
- Published
- 1994
- Full Text
- View/download PDF
254. Computation of exact random pattern detection probability.
- Author
-
Farhat, H., Lioy, A., and Poncino, M.
- Published
- 1993
- Full Text
- View/download PDF
255. On the resetability of synchronous sequential circuits.
- Author
-
Lioy, A. and Poncino, M.
- Published
- 1993
- Full Text
- View/download PDF
256. Connectivity and spectral analysis of finite state machines.
- Author
-
Macii, E. and Poncino, M.
- Published
- 1994
- Full Text
- View/download PDF
257. Synthesis of fully testable combinational circuits.
- Author
-
Evans, A.H., Macii, E., and Poncino, M.
- Published
- 1994
- Full Text
- View/download PDF
258. Implicit evaluation of encoding rotations for large FSMs.
- Author
-
Poncino, M.
- Published
- 1996
- Full Text
- View/download PDF
259. Property verification of communication protocols based on probabilistic reachability analysis.
- Author
-
Baldi, M., Macii, E., and Poncino, M.
- Published
- 1996
- Full Text
- View/download PDF
260. Comparing different Boolean unification algorithms.
- Author
-
Macii, E., Odasso, G., and Poncino, M.
- Published
- 1998
- Full Text
- View/download PDF
261. F-gate: a device for glitch power minimization.
- Author
-
Benini, L., Macii, A., Macii, E., Poncino, M., and Scarsi, R.
- Published
- 1998
- Full Text
- View/download PDF
262. Reducing peak power consumption of combinational test sets.
- Author
-
Macii, A., Macii, E., and Poncino, M.
- Published
- 1998
- Full Text
- View/download PDF
263. A comparative study of complexity-based capacitance macro-models.
- Author
-
Macii, E., Poncino, M., and Scarsi, R.
- Published
- 1998
- Full Text
- View/download PDF
264. Hardware simulation: a flexible approach to verification and performance evaluation of communication protocols.
- Author
-
Baldi, M., Macii, E., and Poncino, M.
- Published
- 1995
- Full Text
- View/download PDF
265. The design of easily scalable bus arbiters with different dynamic priority assignment schemes.
- Author
-
Macii, E. and Poncino, M.
- Published
- 1995
- Full Text
- View/download PDF
266. Glitch power minimization by gate freezing.
- Author
-
Benini, L., De Micheli, G., Macii, A., Macii, E., Poncino, M., and Scarsi, R.
- Published
- 1999
- Full Text
- View/download PDF
267. A study of the resetability of synchronous sequential circuits
- Author
-
Lioy, A. and Poncino, M.
- Published
- 1993
- Full Text
- View/download PDF
268. RTL power estimation in an HDL-based design flow.
- Author
-
Bruno, M., Macii, A., and Poncino, M.
- Subjects
- ELECTRONIC circuit design, ELECTRONICS, SEMICONDUCTORS, STATISTICS, TECHNOLOGICAL innovations, COMPUTER science
- Abstract
Power estimation at the register-transfer level (RTL) is usually narrowed down to the problem of building accurate power models for the modules corresponding to RTL operators. It is shown that, when RTL power estimation is integrated into a realistic design flow based on an HDL description, other types of primitives need to be accurately modelled. In particular, a significant part of the RTL functionality is realised by sparse logic elements. The proposed estimation strategy replaces the low-effort synthesis that is typically used for this type of fine-grain primitives with an empirical power model based on parameters that can be extracted either from the internal representation of the design or from RTL simulation data. The model can be made scalable with respect to technology, and provides very good accuracy (13% average error, measured on a set of industrial benchmarks). Using a similar statistical paradigm, accurate (about 20% average error) models for the power consumption of internal wires are also presented. (An illustrative sketch follows this record.)
- Published
- 2005
- Full Text
- View/download PDF
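The record above describes fitting an empirical, parameter-based power model for fine-grain RTL primitives. As a rough illustration of that idea (not the authors' actual model), the sketch below fits a least-squares macro-model of block power against a few hypothetical parameters; the feature names and all numbers are invented placeholders.

```python
import numpy as np

# Hypothetical characterization data: one row per fine-grain logic block.
# Columns (invented for illustration): cell count, avg. switching activity, avg. fanout.
X = np.array([
    [120, 0.18, 2.1],
    [340, 0.25, 2.8],
    [ 75, 0.10, 1.9],
    [510, 0.31, 3.2],
    [230, 0.22, 2.5],
], dtype=float)
# Reference power values in microwatts (e.g. from gate-level simulation); also invented.
p_ref = np.array([14.2, 48.5, 6.8, 81.3, 30.1])

# Least-squares fit of  p ~ c0 + c1*cells + c2*activity + c3*fanout.
A = np.hstack([np.ones((X.shape[0], 1)), X])
coeffs, *_ = np.linalg.lstsq(A, p_ref, rcond=None)

def estimate_power_uW(cells: float, activity: float, fanout: float) -> float:
    """Evaluate the fitted macro-model for a new block (microwatts)."""
    return float(coeffs @ np.array([1.0, cells, activity, fanout]))

print("fitted coefficients:", np.round(coeffs, 4))
print("estimate for a 200-cell block:", round(estimate_power_uW(200, 0.20, 2.4), 2), "uW")
```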
269. Predicting Hard Disk Failures in Data Centers Using Temporal Convolutional Neural Networks
- Author
-
Massimo Poncino, Daniele Jahier Pagliari, Andrea Bartolini, Alessio Burrello, Luca Benini, Enrico Macii, Burrello A., Pagliari D.J., Bartolini A., Benini L., Macii E., and Poncino M.
- Subjects
Network architecture ,IoT ,Artificial neural network ,Computer science ,business.industry ,Deep learning ,Predictive maintenance ,Sequence analysis ,Temporal Convolutional Networks ,Machine learning ,computer.software_genre ,Convolutional neural network ,Article ,Constant false alarm rate ,Random forest ,Recurrent neural network ,Artificial intelligence ,business ,computer ,Sequence analysi - Abstract
In modern data centers, storage system failures are major contributors to downtimes and maintenance costs. Predicting these failures by collecting measurements from disks and analyzing them with machine learning techniques can effectively reduce their impact, enabling timely maintenance. While there is a vast literature on this subject, most approaches attempt to predict hard disk failures using either classic machine learning solutions, such as Random Forests (RFs), or deep Recurrent Neural Networks (RNNs). In this work, we address hard disk failure prediction using Temporal Convolutional Networks (TCNs), a novel type of deep neural network for time series analysis. Using a real-world dataset, we show that TCNs outperform both RFs and RNNs. Specifically, we can improve the Fault Detection Rate (FDR) by ≈7.5% (FDR = 89.1%) compared to the state-of-the-art, while simultaneously reducing the False Alarm Rate (FAR = 0.052%). Moreover, we explore the network architecture design space, showing that TCNs are consistently superior to RNNs for a given model size and complexity and that even relatively small TCNs can reach satisfactory performance. All the code to reproduce the results presented in this paper is available at https://github.com/ABurrello/tcn-hard-disk-failure-prediction. (An illustrative sketch follows this record.)
- Published
- 2021
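The abstract above relies on Temporal Convolutional Networks, whose core operation is a causal, dilated 1D convolution. The minimal NumPy sketch below only illustrates what "causal" and "dilated" mean for a single channel; it is unrelated to the code released at the linked repository.

```python
import numpy as np

def causal_dilated_conv1d(x: np.ndarray, w: np.ndarray, dilation: int = 1) -> np.ndarray:
    """Single-channel causal dilated convolution.

    Output sample t depends only on x[t], x[t-d], x[t-2d], ... (never on future
    samples), which is the core operation of a TCN layer.
    """
    k = len(w)
    pad = (k - 1) * dilation                 # left padding keeps the output causal
    xp = np.concatenate([np.zeros(pad), x])  # and the same length as the input
    y = np.zeros(len(x))
    for t in range(len(x)):
        taps = xp[t : t + pad + 1 : dilation]  # samples spaced by the dilation factor
        y[t] = np.dot(w[::-1], taps)
    return y

signal = np.sin(np.linspace(0, 6 * np.pi, 64))           # toy time series
kernel = np.array([0.25, 0.5, 0.25])                     # toy 3-tap filter
out = causal_dilated_conv1d(signal, kernel, dilation=4)  # receptive field = 1 + (3-1)*4 = 9
print(out.shape, np.round(out[:5], 3))
```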
270. Adaptive Random Forests for Energy-Efficient Inference on Microcontrollers
- Author
-
Enrico Macii, Francesco Daghero, Massimo Poncino, Daniele Jahier Pagliari, Alessio Burrello, Andrea Calimera, Luca Benini, Chen Xie, Daghero F., Burrello A., Xie C., Benini L., Calimera A., Macii E., Poncino M., and Pagliari D.J.
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer science ,Decision tree ,Inference ,Energy consumption ,Machine Learning (cs.LG) ,Random forest ,Machine Learning ,Microcontroller ,Computer engineering ,Embedded System ,Embedded Systems ,Latency (engineering) ,Energy (signal processing) ,Efficient energy use - Abstract
Random Forests (RFs) are widely used Machine Learning models in low-power embedded devices, due to their hardware-friendly operation and high accuracy on practically relevant tasks. The accuracy of a RF often increases with the number of internal weak learners (decision trees), but at the cost of a proportional increase in inference latency and energy consumption. Such costs can be mitigated considering that, in most applications, inputs are not all equally difficult to classify. Therefore, a large RF is often necessary only for (few) hard inputs, and wasteful for easier ones. In this work, we propose an early-stopping mechanism for RFs, which terminates the inference as soon as a high-enough classification confidence is reached, reducing the number of weak learners executed for easy inputs. The early-stopping confidence threshold can be controlled at runtime, in order to favor either energy saving or accuracy. We apply our method to three different embedded classification tasks, on a single-core RISC-V microcontroller, achieving an energy reduction from 38% to more than 90% with a drop of less than 0.5% in accuracy. We also show that our approach outperforms previous adaptive ML methods for RFs. (Published in: 2021 IFIP/IEEE 29th International Conference on Very Large Scale Integration (VLSI-SoC).) (An illustrative sketch follows this record.)
- Published
- 2022
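As an illustration of the confidence-based early stopping described above, the sketch below evaluates the trees of a scikit-learn random forest one at a time and stops once the running class-probability estimate exceeds a threshold. The stopping policy and the threshold value are plausible placeholders, not the authors' exact mechanism.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=64, random_state=0).fit(X_tr, y_tr)

def predict_early_stop(rf, x, threshold=0.9):
    """Evaluate trees one at a time; stop as soon as the running class-probability
    estimate is confident enough (illustrative policy, tunable at runtime)."""
    probs = np.zeros(rf.n_classes_)
    n_used = 0
    for tree in rf.estimators_:
        probs += tree.predict_proba(x.reshape(1, -1))[0]
        n_used += 1
        if (probs / n_used).max() >= threshold:   # confident: skip the remaining trees
            break
    return rf.classes_[np.argmax(probs)], n_used

results = [predict_early_stop(rf, x) for x in X_te]
acc = np.mean([pred == true for (pred, _), true in zip(results, y_te)])
avg_trees = np.mean([n for _, n in results])
print(f"accuracy={acc:.3f}, average trees executed={avg_trees:.1f} of {len(rf.estimators_)}")
```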
271. Robust and Energy-efficient PPG-based Heart-Rate Monitoring
- Author
-
Daniele Jahier Pagliari, Matteo Risso, Massimo Poncino, Luca Benini, Enrico Macii, Alessio Burrello, Simone Benatti, Risso M., Burrello A., Pagliari D.J., Benatti S., Macii E., Benini L., and Poncino M.
- Subjects
Signal Processing (eess.SP) ,FOS: Computer and information sciences ,Computer Science - Machine Learning ,Heart rate,Measurement,Medical services,Inference algorithms,Integrated circuit modeling,Monitoring,Motion artifacts ,Computer science ,Real-time computing ,Latency (audio) ,Inference ,Motion artifacts ,Machine Learning (cs.LG) ,Set (abstract data type) ,Microcontroller ,Heart rate Measurement ,Deep Learning ,Medical services ,FOS: Electrical engineering, electronic engineering, information engineering ,Leverage (statistics) ,Inference algorithms ,Enhanced Data Rates for GSM Evolution ,Microcontrollers ,Electrical Engineering and Systems Science - Signal Processing ,Efficient energy use - Abstract
A wrist-worn PPG sensor coupled with a lightweight algorithm can run on an MCU to enable non-invasive and comfortable monitoring, but ensuring robust PPG-based heart-rate monitoring in the presence of motion artifacts is still an open challenge. Recent state-of-the-art algorithms combine PPG and inertial signals to mitigate the effect of motion artifacts. However, these approaches suffer from limited generality. Moreover, their deployment on MCU-based edge nodes has not been investigated. In this work, we tackle both of the aforementioned problems by proposing the use of hardware-friendly Temporal Convolutional Networks (TCNs) for PPG-based heart-rate estimation. Starting from a single "seed" TCN, we leverage an automatic Neural Architecture Search (NAS) approach to derive a rich family of models. Among them, we obtain a TCN that outperforms the previous state-of-the-art on the largest PPG dataset available (PPGDalia), achieving a Mean Absolute Error (MAE) of just 3.84 Beats Per Minute (BPM). Furthermore, we also tested a set of smaller yet still accurate (MAE of 5.64-6.29 BPM) networks that can be deployed on a commercial MCU (STM32L4), requiring as few as 5k parameters and reaching a latency of 17.1 ms while consuming just 0.21 mJ per inference.
- Published
- 2022
- Full Text
- View/download PDF
272. Pruning In Time (PIT): A Lightweight Network Architecture Optimizer for Temporal Convolutional Networks
- Author
-
Luca Benini, Lorenzo Lamberti, Matteo Risso, Daniele Jahier Pagliari, Massimo Poncino, Alessio Burrello, Enrico Macii, Francesco Conti, Risso M., Burrello A., Pagliari D.J., Conti F., Lamberti L., Macii E., Benini L., and Poncino M.
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Network architecture ,Artificial neural network ,Computer science ,business.industry ,Deep learning ,Neural Architecture Search ,Deep Learning ,Edge Computing ,Temporal Convolutional Networks ,Machine Learning (cs.LG) ,Convolution ,Set (abstract data type) ,Dilation (metric space) ,Feature (computer vision) ,Pruning (decision trees) ,Artificial intelligence ,business ,Algorithm - Abstract
Temporal Convolutional Networks (TCNs) are promising Deep Learning models for time-series processing tasks. One key feature of TCNs is time-dilated convolution, whose optimization requires extensive experimentation. We propose an automatic dilation optimizer, which tackles the problem as a weight pruning on the time-axis, and learns dilation factors together with weights, in a single training. Our method reduces the model size and inference latency on a real SoC hardware target by up to 7.4x and 3x, respectively with no accuracy drop compared to a network without dilation. It also yields a rich set of Pareto-optimal TCNs starting from a single model, outperforming hand-designed solutions in both size and accuracy.
- Published
- 2021
- Full Text
- View/download PDF
273. TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference
- Author
-
Marcello Zanghieri, Francesco Conti, Massimo Poncino, Enrico Macii, Alessio Burrello, Alberto Dequino, Daniele Jahier Pagliari, Luca Benini, Burrello A., Dequino A., Pagliari D.J., Conti F., Zanghieri M., Macii E., Benini L., and Poncino M.
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Deep-Learning ,Computer Science - Artificial Intelligence ,business.industry ,Computer science ,Deep learning ,Temporal Convolutional Network ,Edge-Computing ,Internet-of-Things ,Network topology ,Machine Learning (cs.LG) ,Computational science ,Microcontroller ,Internet-of-Thing ,Artificial Intelligence (cs.AI) ,Low-power electronics ,Benchmark (computing) ,Artificial intelligence ,Enhanced Data Rates for GSM Evolution ,Latency (engineering) ,business ,Energy (signal processing) - Abstract
Temporal Convolutional Networks (TCNs) are emerging lightweight Deep Learning models for Time Series analysis. We introduce an automated exploration approach and a library of optimized kernels to map TCNs on Parallel Ultra-Low Power (PULP) microcontrollers. Our approach minimizes latency and energy by exploiting a layer tiling optimizer to jointly find the tiling dimensions and select among alternative implementations of the causal and dilated 1D-convolution operations at the core of TCNs. We benchmark our approach on a commercial PULP device, achieving up to 103× lower latency and 20.3× lower energy than the Cube-AI toolkit executed on the STM32L4, and from 2.9× to 26.6× lower energy compared to commercial closed-source and academic open-source approaches on the same hardware target.
- Published
- 2021
- Full Text
- View/download PDF
274. Manufacturing as a Data-Driven Practice: Methodologies, Technologies, and Tools
- Author
-
Edoardo Patti, Andrea Calimera, Daniele Jahier Pagliari, Andrea Acquaviva, Tania Cerquitelli, Lorenzo Bottaccioli, Massimo Poncino, Cerquitelli T., Pagliari D.J., Calimera A., Bottaccioli L., Patti E., Acquaviva A., and Poncino M.
- Subjects
Process (engineering) ,Computer science ,Internet of Things ,Data modeling ,Tools ,Busine ,Industrie ,Protocol ,Orchestration (computing) ,Electrical and Electronic Engineering ,data analytics ,Data mining ,Data-centric architectures, data management, data analytics, Industry 4.0, Internet of Things, technologies ,technologies ,Data-centric architectures ,business.industry ,Service robot ,Information technology ,Business value ,Industry 4.0 ,Data science ,Software quality ,Internet of Things (IoT) ,Manufacturing ,data-centric architecture ,Software deployment ,Data analytic ,data management ,Software architecture ,business - Abstract
In recent years, the introduction and exploitation of innovative information technologies in industrial contexts have led to the continuous growth of digital shop floor environments. The new Industry 4.0 model allows smart factories to become very advanced IT industries, generating an ever-increasing amount of valuable data. As a consequence, the necessity of powerful and reliable software architectures is becoming prominent along with data-driven methodologies to extract useful and hidden knowledge supporting the decision-making process. This article discusses the latest software technologies needed to collect, manage, and elaborate all data generated through innovative Internet-of-Things (IoT) architectures deployed over the production line, with the aim of extracting useful knowledge for the orchestration of high-level control services that can generate added business value. This survey covers the entire data life cycle in manufacturing environments, discussing key functional and methodological aspects along with a rich and properly classified set of technologies and tools, useful to add intelligence to data-driven services. Therefore, it serves both as a first guided step toward the rich landscape of the literature for readers approaching this field and as a global yet detailed overview of the current state of the art in the Industry 4.0 domain for experts. As a case study, we discuss, in detail, the deployment of the proposed solutions for two research project demonstrators, showing their ability to mitigate manufacturing line interruptions and reduce the corresponding impacts and costs.
- Published
- 2021
275. Dual-Vt assignment policies in ITD-aware synthesis
- Author
-
Calimera, A., Bahar, R.I., Macii, E., and Poncino, M.
- Subjects
- TEMPERATURE effect, COMPLEMENTARY metal oxide semiconductors, ELECTRIC potential, ELECTRONIC circuit design, MICROELECTRONICS, LOGIC circuits
- Abstract
Abstract: Traditionally, the effects of temperature on the delay of CMOS devices have been evaluated using the highest operating temperature as a worst-case corner. This conservative approach was based on the fact that, in older technologies, CMOS devices systematically degraded their performance as temperature increased. With the progressive scaling of technology, however, there has been a continuous reduction of the gap between supply and threshold voltages of devices, mostly due to low-power constraints. The latter have accelerated this trend by using libraries containing multiple instances of a cell with different ranges of threshold voltages; in particular, the use of high-Vt cells to control sub-threshold leakage currents has made this gap smaller and smaller. The consequence of this trend is the occurrence of the so-called inverted temperature dependence (ITD), under which cells get faster as temperature increases. This new thermal dependence has made the old worst-case design approach obsolete, posing new EDA challenges. Besides complicating timing analysis, in particular, ITD has important and unforeseeable consequences for power-aware design, especially in dual-Vt logic synthesis. Due to a contrasting temperature dependence between low-Vt cells (which exhibit the classical, direct temperature dependence) and high-Vt cells (for which an inverted temperature dependence holds), a single-temperature worst-case design approach fails to generate netlists that are compliant with timing constraints for the entire temperature range. In this work, we first validate the relevance of ITD on an industrial 65nm CMOS multi-Vt library. Then, we describe an ITD-aware, dual-Vt assignment algorithm that guarantees temperature-insensitive operation of the circuits, together with a significant reduction of both leakage and total power consumption. The algorithm has been tested over standard benchmarks using three different replacement policies. Experimental results show average leakage power savings of 50% w.r.t. circuits synthesized with a standard, commercial flow that does not take ITD into account and thus, to ensure that no temperature-induced timing faults occur, needs to resort to over-design (i.e., over-constraining the timing bound so as to make sure that temperature fluctuations never make the circuits violate the specified required time for all paths). (An illustrative sketch follows this record.)
- Published
- 2010
- Full Text
- View/download PDF
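To make the ITD argument above concrete, the toy sketch below uses invented linear delay models in which low-Vt gates slow down with temperature while high-Vt gates speed up, and greedily swaps gates on a single path to high-Vt only if timing is met at both temperature corners. It is a schematic of the idea, not the paper's assignment algorithm.

```python
# Toy gate delay models (all numbers invented): low-Vt cells slow down as temperature
# rises (direct dependence), high-Vt cells speed up (inverted temperature dependence).
def gate_delay(vt: str, temp_c: float) -> float:
    if vt == "low":
        return 1.0 * (1 + 0.002 * (temp_c - 25))   # normalized delay, +0.2%/degC
    return 1.6 * (1 - 0.003 * (temp_c - 25))       # slower at 25 degC, faster when hot

CORNERS = (0.0, 125.0)  # under ITD, *both* temperature extremes must be checked

def path_delay(assignment, temp_c):
    return sum(gate_delay(vt, temp_c) for vt in assignment)

def greedy_dual_vt(n_gates: int, timing_bound: float):
    """Greedily move gates to high-Vt (saving leakage) only while the path still
    meets timing at every corner -- a single worst-case corner is not sufficient."""
    assignment = ["low"] * n_gates
    for i in range(n_gates):
        trial = assignment.copy()
        trial[i] = "high"
        if all(path_delay(trial, t) <= timing_bound for t in CORNERS):
            assignment = trial
    return assignment

result = greedy_dual_vt(n_gates=10, timing_bound=13.0)
print(result, {t: round(path_delay(result, t), 2) for t in CORNERS})
```

With these numbers, the hot corner alone would allow a fifth high-Vt swap, but the cold corner rejects it, illustrating why a single worst-case temperature is no longer sufficient.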
276. Implementation of a thermal management unit for canceling temperature-dependent clock skew variations
- Author
-
Chakraborty, A., Duraisami, K., Sathanur, A., Sithambaram, P., Macii, A., Macii, E., and Poncino, M.
- Subjects
- TEMPERATURE control of electronics, INTEGRATED circuit interconnections, LINE drivers (Integrated circuits), NANOSTRUCTURED materials, TEMPERATURE lapse rate
- Abstract
Thermal gradients across the die are becoming increasingly prominent as we scale further down into the sub-nanometer regime. While temperature was never a primary concern, its non-negligible impact on delay and reliability is getting significant attention lately. One of the principal factors affecting designs today is timing criticality, which, in today's technologies, is mostly determined by wire delays. Clocks, which are the backbone of the interconnect network, are extremely prone to temperature-dependent delay variations and need to be designed with extreme care so as to meet accurate timing constraints. Their skew has to be minimized in order to guarantee functionality, even in the presence of these process variations. Temperature, on the other hand, is dynamic in nature, and its effects hence need run-time monitoring and management. One of the most efficient ways to manage temperature-dependent skew is through the use of buffers with dynamically tunable delays. The use of such buffers in the clock distribution network allows modulating the delay on selected branches of the clock network based on a thermal profile, so as to keep the skew within acceptable bounds. A runtime scheme obviously requires an on-line management unit. Our work predominantly focuses on the implementation of one such unit, while studying its impact on design parameters such as area, wire-length, and power. Results show a negligible impact (0.67% in area, 0.62% in wire-length, 0.33% in power, and 0.37% in via count) on the design.
- Published
- 2008
- Full Text
- View/download PDF
277. Energy-efficient adaptive machine learning on IoT end-nodes with class-dependent confidence
- Author
-
Francesco Daghero, Daniele Jahier Pagliari, Enrico Macii, Luca Benini, Alessio Burrello, Massimo Poncino, Daghero F., Burrello A., Pagliari D.J., Benini L., Macii E., and Poncino M.
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Current (mathematics) ,Edge device ,Computer science ,02 engineering and technology ,Machine learning ,computer.software_genre ,01 natural sciences ,Machine Learning (cs.LG) ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Latency (engineering) ,Set (psychology) ,Computer Science::Databases ,010302 applied physics ,Class (computer programming) ,business.industry ,Work (physics) ,Energy consumption ,020202 computer hardware & architecture ,Artificial intelligence ,business ,Energy-efficient machine, IoT applications ,computer ,Efficient energy use - Abstract
Energy-efficient machine learning models that can run directly on edge devices are of great interest in IoT applications, as they can reduce network pressure and response latency, and improve privacy. An effective way to obtain energy efficiency with small accuracy drops is to sequentially execute a set of increasingly complex models, early-stopping the procedure for "easy" inputs that can be confidently classified by the smallest models. As a stopping criterion, current methods employ a single threshold on the output probabilities produced by each model. In this work, we show that such a criterion is sub-optimal for datasets that include classes of different complexity, and we demonstrate a more general approach based on per-class thresholds. With experiments on a low-power end-node, we show that our method can significantly reduce the energy consumption compared to the single-threshold approach. (Published in: 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS).) (An illustrative sketch follows this record.)
- Published
- 2020
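The per-class stopping criterion described above can be pictured as follows: a cheap model answers directly when its confidence exceeds the threshold of its predicted class, otherwise a larger model is invoked. The two scikit-learn models and the threshold values below are arbitrary stand-ins, not the paper's configuration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

small = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)                       # cheap model
big = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)  # costly model

# One confidence threshold per class instead of a single global one.
# Values are placeholders; in practice they would be tuned on held-out data.
per_class_thr = {c: 0.85 for c in small.classes_}
per_class_thr[1] = 0.95   # e.g. a harder class gets a stricter threshold

def cascade_predict(x):
    """Answer with the small model when it is confident for its predicted class,
    otherwise fall back to the big model."""
    p = small.predict_proba(x.reshape(1, -1))[0]
    cls = small.classes_[np.argmax(p)]
    if p.max() >= per_class_thr[cls]:
        return cls, "small"
    return big.predict(x.reshape(1, -1))[0], "big"

preds, stages = zip(*(cascade_predict(x) for x in X_te))
print("accuracy:", round(float(np.mean(np.array(preds) == y_te)), 3),
      "| big-model invocations:", stages.count("big"), "of", len(stages))
```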
278. A Comparison Analysis of BLE-Based Algorithms for Localization in Industrial Environments
- Author
-
Enrico Macii, Massimo Poncino, Marina Zafiri, Davide Cannizzaro, Edoardo Patti, Daniele Jahier Pagliari, Andrea Acquaviva, Cannizzaro D., Zafiri M., Pagliari D.J., Patti E., Macii E., Poncino M., and Acquaviva A.
- Subjects
Computer Networks and Communications ,computer.internet_protocol ,Computer science ,lcsh:TK7800-8360 ,trilateration ,fingerprint ,02 engineering and technology ,smart industry ,industry 4.0 ,bluetooth low energy ,indoor location ,0202 electrical engineering, electronic engineering, information engineering ,Electrical and Electronic Engineering ,Bluetooth Low Energy ,lcsh:Electronics ,020206 networking & telecommunications ,Beacon ,Hardware and Architecture ,Control and Systems Engineering ,Received signal strength indication ,Signal Processing ,020201 artificial intelligence & image processing ,computer ,Algorithm ,Trilateration - Abstract
Proximity beacons are small, low-power devices capable of transmitting information at a limited distance via the Bluetooth Low Energy protocol. These beacons are typically used to broadcast small amounts of location-dependent data (e.g., advertisements) or to detect nearby objects. However, researchers have shown that beacons can also be used for indoor localization by converting the received signal strength indication (RSSI) to distance information. In this work, we study the effectiveness of proximity beacons for accurately locating objects within a manufacturing plant by performing extensive experiments in a real industrial environment. To this purpose, we compare localization algorithms based either on trilateration or on environment fingerprinting combined with a machine-learning-based regressor (k-nearest neighbors, support-vector machines, or multi-layer perceptron). Each algorithm is analyzed in two different types of industrial environments. For each environment, various configurations are explored, where a configuration is characterized by the number of beacons per square meter and the density of fingerprint points. In addition, since the fingerprinting approach is based on a preliminary site characterization, it may lead to location errors in the presence of environment variations (e.g., movements of large objects). For this reason, the robustness of fingerprinting algorithms against such variations is also assessed. Our results show that fingerprint solutions outperform trilateration, while also showing good resilience to environmental variations. Given the similar error obtained by all three fingerprint approaches, we conclude that k-NN is the preferable algorithm due to its simple deployment and low number of hyper-parameters. (An illustrative sketch follows this record.)
- Published
- 2019
- Full Text
- View/download PDF
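As a toy version of the fingerprinting approach that the study above finds preferable, the sketch below builds synthetic RSSI fingerprints with a log-distance path-loss model and maps a new RSSI vector to coordinates with a k-NN regressor. Beacon positions, path-loss parameters, and noise levels are all invented for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Hypothetical setup: 4 BLE beacons at known positions in a 20 m x 10 m hall.
beacons = np.array([[0, 0], [20, 0], [0, 10], [20, 10]], dtype=float)

def rssi(pos, tx_power=-59.0, n=2.0, noise_db=2.0):
    """Synthetic RSSI vector from a log-distance path-loss model
    (tx_power = RSSI at 1 m, n = path-loss exponent); purely illustrative."""
    d = np.maximum(np.linalg.norm(beacons - pos, axis=1), 0.1)
    return tx_power - 10 * n * np.log10(d) + rng.normal(0, noise_db, len(beacons))

# Offline phase: collect fingerprints on a grid of known positions.
grid = np.array([[x, y] for x in range(0, 21, 2) for y in range(0, 11, 2)], dtype=float)
fingerprints = np.array([rssi(p) for p in grid])

# Online phase: a k-NN regressor maps a new RSSI vector to (x, y) coordinates.
knn = KNeighborsRegressor(n_neighbors=3).fit(fingerprints, grid)

true_pos = np.array([7.3, 4.6])
estimate = knn.predict(rssi(true_pos).reshape(1, -1))[0]
print("true:", true_pos, "| estimated:", np.round(estimate, 2),
      "| error [m]:", round(float(np.linalg.norm(estimate - true_pos)), 2))
```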
279. Fast Computation of Discharge Current Upper Bounds for Clustered Power Gating
- Author
-
Massimo Poncino, Ashoka Sathanur, E. Macii, Alberto Macii, Luca Benini, Sathanur A., Benini L., Macii A., Macii E., and Poncino M.
- Subjects
Engineering ,Power gating ,business.industry ,Computation ,Static timing analysis ,Upper and lower bounds ,Leakage power optimization ,Reduction (complexity) ,Hardware and Architecture ,Low-power electronics ,Power electronics ,low power design ,maximum current estimation ,Signal integrity ,Electrical and Electronic Engineering ,business ,Algorithm ,Software - Abstract
The capability of accurately estimating an upper bound of the maximum current drawn by a digital macroblock from the ground or power supply line constitutes a major asset of automatic power-gating flows. In fact, the maximum current information is essential to properly size the sleep transistor in such a way that speed degradation and signal integrity violations are avoided. Loose upper bounds can be determined with a reasonable computational cost, but they lead to oversized sleep transistors. On the other hand, exact computation of the maximum drawn current is an NP-hard problem, even when conservative simplifying assumptions are made on gate-level current profiles. In this paper, we present a scalable algorithm for tightening upper bound computation, with a controlled and tunable computational cost. The algorithm exploits state-of-the-art commercial timing analysis engines, and it is tightly integrated into an industrial power-gating flow for leakage power reduction. The results we have obtained on large circuits demonstrate the scalability and effectiveness of our estimation approach.
- Published
- 2011
280. Row-Based Power-Gating: A Novel Sleep Transistor Insertion Methodology for Leakage Power Optimization in Nanometer CMOS Circuits
- Author
-
Massimo Poncino, E. Macii, Alberto Macii, Ashoka Sathanur, Luca Benini, Sathanur A., Benini L., Macii A., Macii E., and Poncino M.
- Subjects
power optimization ,Engineering ,Power gating ,Hardware_PERFORMANCEANDRELIABILITY ,law.invention ,Hardware_GENERAL ,law ,Low-power electronics ,Hardware_INTEGRATEDCIRCUITS ,Electronic engineering ,Electrical and Electronic Engineering ,Leakage (electronics) ,Electronic circuit ,business.industry ,Transistor ,Electrical engineering ,Power optimization ,Leakage power ,CMOS ,Nanoelectronics ,Hardware and Architecture ,power gating ,logic synthesi ,low-power design ,business ,Software ,Hardware_LOGICDESIGN - Abstract
Leakage power has become a serious concern in nanometer CMOS technologies, and power-gating has been shown to offer a viable solution to the problem with a small penalty in performance. This paper focuses on leakage power reduction through automatic insertion of sleep transistors for power-gating. In particular, we propose a novel, layout-aware methodology that facilitates sleep transistor insertion and virtual-ground routing on row-based layouts. We also introduce a clustering algorithm that is able to handle timing and area constraints simultaneously, and we extend it to the case of multi-Vt sleep transistors to increase leakage savings. The results we have obtained on a set of benchmark circuits show that the leakage savings we can achieve are, by far, superior to those obtained using existing power-gating solutions, and with much tighter timing and area constraints.
- Published
- 2011
281. Design of a Flexible Reactivation Cell for Safe Power-Mode Transition in Power-Gated Circuits
- Author
-
Massimo Poncino, Luca Benini, Alberto Macii, E. Macii, Andrea Calimera, Calimera A., Benini L., Macii A., Macii E., and Poncino M.
- Subjects
Engineering ,business.industry ,Transistor ,Electrical engineering ,Hardware_PERFORMANCEANDRELIABILITY ,Integrated circuit design ,law.invention ,Dynamic voltage scaling ,CMOS ,law ,Hardware_INTEGRATEDCIRCUITS ,Electronic engineering ,Ground bounce ,Electrical and Electronic Engineering ,Power MOSFET ,Power network design ,business ,Electronic circuit - Abstract
Power-gating is one of the most promising and widely adopted solutions for controlling sub-threshold leakage power in nanometer circuits. Although single-cycle power-mode transition reduces wake-up latency, it develops large discharge current spikes, thereby causing IR-drop and inductive ground bounce for the neighboring circuit blocks, which can suffer from power plane integrity degradation. We propose a new reactivation solution that helps in controlling power supply fluctuations and achieving minimum reactivation times. Our structure limits the turn-on current below a given threshold through a sequential activation of the sleep transistors (STs), which are connected in parallel and sized using a novel optimal sizing algorithm. We also introduce a distributed physical implementation, which allows minimum layout disruption after ST insertion and minimizes routing congestion.
- Published
- 2009
282. Exploiting Temporal Discharge Current Information to Improve the Efficiency of Clustered Power-Gating
- Author
-
Luca Benini, Massimo Poncino, A. Sathanur, Enrico Macii, Alberto Macii, Sathanur A., Benini L., Macii A., Macii E., and Poncino M.
- Subjects
Engineering ,Power gating ,business.industry ,Transistor ,Electrical engineering ,Hardware_PERFORMANCEANDRELIABILITY ,law.invention ,CMOS ,law ,Low-power electronics ,Power electronics ,Hardware_INTEGRATEDCIRCUITS ,Electronic engineering ,Electrical and Electronic Engineering ,business ,Dimensioning ,Hardware_LOGICDESIGN ,Electronic circuit ,Leakage (electronics) - Abstract
The use of sleep transistors as power-gating devices to cut off sub-threshold stand-by leakage currents has become a very popular solution to tackle the rise of leakage consumption in nanometer CMOS circuits. Clustered power-gating is now the de-facto standard for application of this leakage-saving technique in industry. Cell clustering, sleep transistor sizing, and peak current estimation are among the key steps of state-of-the-art clustered power-gating methodologies. In this work, we propose to exploit the information on the temporal variations of the discharge currents of the gates in a circuit to improve the quality of the solutions generated by an existing cell clustering algorithm. This translates to power-gated circuits with lower leakage consumption compared to implementations based on clusters formed assuming a time-invariant, worst-case behavior of the currents drawn by the cells. The achieved leakage savings can be as high as 17%.
- Published
- 2009
283. Reducing the Energy Consumption of sEMG-Based Gesture Recognition at the Edge Using Transformers and Dynamic Inference.
- Author
-
Xie C, Burrello A, Daghero F, Benini L, Calimera A, Macii E, Poncino M, and Jahier Pagliari D
- Subjects
- Humans, Physical Phenomena, Databases, Factual, Fatigue, Gestures, Electric Power Supplies
- Abstract
Hand gesture recognition applications based on surface electromyographic (sEMG) signals can benefit from on-device execution to achieve faster and more predictable response times and higher energy efficiency. However, deploying state-of-the-art deep learning (DL) models for this task on memory-constrained and battery-operated edge devices, such as wearables, requires a careful optimization process, both at design time, with an appropriate tuning of the DL models' architectures, and at execution time, where the execution of large and computationally complex models should be avoided unless strictly needed. In this work, we pursue both optimization targets, proposing a novel gesture recognition system that improves upon the state-of-the-art models both in terms of accuracy and efficiency. At the level of DL model architecture, we apply for the first time tiny transformer models (which we call bioformers) to sEMG-based gesture recognition. Through an extensive architecture exploration, we show that our most accurate bioformer achieves a higher classification accuracy on the popular Non-Invasive Adaptive hand Prosthetics Database 6 (Ninapro DB6) dataset compared to the state-of-the-art convolutional neural network (CNN) TEMPONet (+3.1%). When deployed on the RISC-V-based low-power system-on-chip (SoC) GAP8, bioformers that outperform TEMPONet in accuracy consume 7.8×-44.5× less energy per inference. At runtime, we propose a three-level dynamic inference approach that combines a shallow classifier, i.e., a random forest (RF) implementing a simple "rest detector", with two bioformers of different accuracy and complexity, which are sequentially applied to each new input, stopping the classification early for "easy" data. With this mechanism, we obtain a flexible inference system, capable of working at many different operating points in terms of accuracy and average energy consumption. On GAP8, we obtain a further 1.03×-1.35× energy reduction compared to static bioformers at iso-accuracy.
- Published
- 2023
- Full Text
- View/download PDF
284. Q-PPG: Energy-Efficient PPG-Based Heart Rate Monitoring on Wearable Devices.
- Author
-
Burrello A, Pagliari DJ, Risso M, Benatti S, Macii E, Benini L, and Poncino M
- Subjects
- Algorithms, Artifacts, Heart Rate physiology, Signal Processing, Computer-Assisted, Photoplethysmography, Wearable Electronic Devices
- Abstract
Heart Rate (HR) monitoring is increasingly performed in wrist-worn devices using low-cost photoplethysmography (PPG) sensors. However, Motion Artifacts (MAs) caused by movements of the subject's arm affect the performance of PPG-based HR tracking. This is typically addressed by coupling the PPG signal with acceleration measurements from an inertial sensor. Unfortunately, most standard approaches of this kind rely on hand-tuned parameters, which impair their generalization capabilities and their applicability to real data in the field. In contrast, methods based on deep learning, despite their better generalization, are considered to be too complex to deploy on wearable devices. In this work, we tackle these limitations, proposing a design space exploration methodology to automatically generate a rich family of deep Temporal Convolutional Networks (TCNs) for HR monitoring, all derived from a single "seed" model. Our flow involves a cascade of two Neural Architecture Search (NAS) tools and a hardware-friendly quantizer, whose combination yields both highly accurate and extremely lightweight models. When tested on the PPG-Dalia dataset, our most accurate model sets a new state-of-the-art in Mean Absolute Error. Furthermore, we deploy our TCNs on an embedded platform featuring an STM32WB55 microcontroller, demonstrating their suitability for real-time execution. Our most accurate quantized network achieves 4.41 Beats Per Minute (BPM) of Mean Absolute Error (MAE), with an energy consumption of 47.65 mJ and a memory footprint of 412 kB. At the same time, the smallest network that obtains a MAE below 8 BPM, among those generated by our flow, has a memory footprint of 1.9 kB and consumes just 1.79 mJ per inference.
- Published
- 2021
- Full Text
- View/download PDF
285. On the impact of smart sensor approximations on the accuracy of machine learning tasks.
- Author
-
Jahier Pagliari D and Poncino M
- Abstract
Smart sensors present in ubiquitous Internet of Things (IoT) devices often obtain high energy efficiency by carefully tuning how the sensing, the analog-to-digital (A/D) conversion, and the digital serial transmission are implemented. Such tuning involves approximations, i.e., alterations of the sensed signals that can positively affect energy consumption in various ways. However, for many IoT applications, approximations may have an impact on the quality of the produced output, for example on the classification accuracy of a Machine Learning (ML) model. While the impact of approximations on ML algorithms is widely studied, previous works have focused mostly on processing approximations. In this work, in contrast, we analyze how the signal alterations imposed by smart sensors impact the accuracy of ML classifiers. We focus in particular on data alterations introduced in the serial transmission from a smart sensor to a processor, although our considerations can also be extended to other sources of approximation, such as A/D conversion. Results on several types of models and on two different datasets show that ML algorithms are quite resilient to the alterations produced by smart sensors, and that the serial transmission energy can be reduced by up to 70% without a significant impact on classification accuracy. Moreover, we also show that, contrary to expectations, the two generic approximation families identified in our work yield similar accuracy losses. (An illustrative sketch follows this record.)
- Published
- 2020
- Full Text
- View/download PDF
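In the spirit of the record above, the sketch below emulates one kind of smart-sensor approximation, dropping least-significant bits of the transmitted samples, and measures how a classifier's accuracy degrades. The dataset, model, and bit widths are arbitrary choices for illustration, not those used in the paper.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# digits pixels are integers in [0, 16], i.e. roughly 5-bit "sensor" samples.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

def truncate_lsbs(samples: np.ndarray, dropped_bits: int) -> np.ndarray:
    """Zero the lowest bits of each sample, mimicking a shorter serial word."""
    return (samples.astype(int) >> dropped_bits) << dropped_bits

for bits in range(5):
    acc = clf.score(truncate_lsbs(X_te, bits), y_te)
    print(f"dropped LSBs: {bits} -> accuracy: {acc:.3f}")
```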
286. LAPSE: Low-Overhead Adaptive Power Saving and Contrast Enhancement for OLEDs.
- Author
-
Pagliari DJ, Macii E, and Poncino M
- Abstract
Organic Light Emitting Diode (OLED) display panels are becoming increasingly popular, especially in mobile devices; one of the key characteristics of these panels is that their power consumption strongly depends on the displayed image. In this paper we propose LAPSE, a new methodology to concurrently reduce the energy consumed by an OLED display and enhance the contrast of the displayed image, which relies on image-specific pixel-by-pixel transformations. Unlike previous approaches, LAPSE focuses specifically on reducing the overheads required to implement the transformation at runtime. To this end, we propose a transformation that can be executed in real time, either in software, with low time overhead, or in a hardware accelerator with a small area and low energy budget. Despite the significant reduction in complexity, we obtain results comparable to those achieved with more complex approaches in terms of power saving and image quality. Moreover, our method makes it easy to explore the full quality-versus-power tradeoff by acting on a few basic parameters; thus, it enables the runtime selection among multiple display quality settings, according to the status of the system. (An illustrative sketch follows this record.)
- Published
- 2018
- Full Text
- View/download PDF
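OLED panel power is, to first order, image-dependent (roughly a weighted sum of per-channel pixel values). The sketch below illustrates that dependence and the quality-versus-power knob using a trivial global dimming transform; the power coefficients are invented placeholders and this is not LAPSE's actual pixel-by-pixel transformation.

```python
import numpy as np

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(720, 1280, 3), dtype=np.uint8)  # stand-in RGB frame

# First-order, image-dependent OLED power model: panel power grows with the
# per-channel pixel values. The per-channel coefficients are invented placeholders.
COEFF_NW = np.array([1.0, 0.8, 1.4])   # nW per pixel-value unit for R, G, B

def panel_power_w(img: np.ndarray) -> float:
    return float((img.astype(float) * COEFF_NW).sum() * 1e-9)

def dim(img: np.ndarray, factor: float) -> np.ndarray:
    """A trivial quality-vs-power knob: global dimming (not LAPSE's transformation)."""
    return np.clip(img.astype(float) * factor, 0, 255).astype(np.uint8)

base = panel_power_w(frame)
for f in (1.0, 0.9, 0.8):
    p = panel_power_w(dim(frame, f))
    print(f"scale {f:.1f}: {p:.3f} W ({100 * (1 - p / base):.1f}% saved)")
```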
287. The Human Brain Project and neuromorphic computing.
- Author
-
Calimera A, Macii E, and Poncino M
- Subjects
- Humans, Brain physiology, Computer Simulation, Neural Networks, Computer
- Abstract
Understanding how the brain manages billions of processing units connected via kilometers of fibers and trillions of synapses, while consuming a few tens of watts, could provide the key to a completely new category of hardware (neuromorphic computing systems). In order to achieve this, a paradigm shift for computing as a whole is needed, which will see it moving away from current "bit-precise" computing models and towards new techniques that exploit the stochastic behavior of simple, reliable, very fast, low-power computing devices embedded in intensely recursive architectures. In this paper we summarize how these objectives will be pursued in the Human Brain Project.
- Published
- 2013