2,031 results
Search Results
2. Feedforward FFT Hardware Architectures Based on Rotator Allocation.
- Author
- Garrido, Mario, Huang, Shen-Jui, and Chen, Sau-Gee
- Subjects
FAST Fourier transforms, DIGITAL signal processing, ALGORITHMS, DISCRETE Fourier transforms, HARDWARE
- Abstract
In this paper, we present new feedforward FFT hardware architectures based on rotator allocation. The rotator allocation approach consists in distributing the rotations of the FFT in such a way that the number of edges in the FFT that need rotators and the complexity of the rotators are reduced. Radix-2 and radix-$2^k$ feedforward architectures based on rotator allocation are presented in this paper. Experimental results show that the proposed architectures reduce the hardware cost significantly with respect to previous FFT architectures. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
3. CORDIC-Based Architecture for Computing Nth Root and Its Implementation.
- Author
- Luo, Yuanyong, Wang, Yuxuan, Sun, Huaqing, Zha, Yi, Wang, Zhongfeng, and Pan, Hongbing
- Subjects
DIGITAL computer simulation, ALGORITHMS, HARDWARE, COMPUTER simulation, DIGITAL signal processing
- Abstract
This paper presents a COordinate Rotation Digital Computer (CORDIC)-based architecture for the computation of Nth root and proves its feasibility by hardware implementation. The proposed architecture performs the task of Nth root simply by shift-add operations and enables easy tradeoff between the speed (or precision) and the area. Technically, we divide the Nth root computation into three different subtasks, and map them onto three different classes of the CORDIC accordingly. To overcome the drawback of narrow convergence range of the CORDIC algorithm, we adopt several innovative methods to yield a much improved convergence range. Subsequently, in terms of convergence range and precision, a flexible architecture is developed. The architecture is validated using MATLAB with extensive vector matching. Finally, using a pipelined structure with fixed-point input data, we implement the example circuits of the proposed architecture with radicand ranging from zero to one million, and achieve an average mean of approximately $10^{-7}$ for the relative error. The design is modeled using Verilog HDL and synthesized under the TSMC 40-nm CMOS technology. The report shows a maximum frequency of 2.083 GHz with $197421.00~\mu\text{m}^2$ area. The area decreases to $169689.98~\mu\text{m}^2$ when the frequency lowers to 1.00 GHz. [ABSTRACT FROM AUTHOR]
- Published
- 2018
4. Event-Triggered Optimized Control for Nonlinear Delayed Stochastic Systems.
- Author
- Zhang, Guoping and Zhu, Quanxin
- Subjects
STOCHASTIC systems, ADAPTIVE fuzzy control, FUZZY logic, ALGORITHMS, DYNAMIC programming, FUZZY systems
- Abstract
This paper is concerned with the problem of event-triggered optimized control for uncertain nonlinear Itô-type stochastic systems with time-delay and unknown dynamic. By using fuzzy logic systems to approximate two unknown nonlinear functions with the delayed state and current state, respectively. The adaptive identifier is constructed to determine the stochastic system, and the optimized control is designed by using the identifier and adaptive dynamic programming (ADP) of actor-critic architecture. Almost all of the works are concentrated on ADP-based optimal control and it will inevitably cause the complexity of computation and requirements of persistence excitation (PE) assumption. In this paper, the ADP algorithm is obtained based on the negative gradient of a simple positive function (equivalent to the HJB equation), and so the proposed optimal control is simple and can release the PE assumption. Moreover, the event-triggered control approach is proposed to reduce computing burden and communication resources. Furthermore, we prove that the states of system and FLSs parameter errors are semi-globally uniformly ultimately bounded (SGUUB) in mean square via the adaptive identifier and the Lyapunov direct method as well as identifier-actor-critic architecture-based ADP algorithm. Finally, the effectiveness of the proposed method is illustrated through two numerical examples. [ABSTRACT FROM AUTHOR]
- Published
- 2021
5. FPGA Implementation of Reconfigurable CORDIC Algorithm and a Memristive Chaotic System With Transcendental Nonlinearities.
- Author
- Mohamed, Sara M., Sayed, Wafaa S., Radwan, Ahmed G., and Said, Lobna A.
- Subjects
TRANSCENDENTAL functions, MATHEMATICAL functions, FIELD programmable gate arrays, ALGORITHMS
- Abstract
Coordinate Rotation Digital Computer (CORDIC) is a robust iterative algorithm that computes many transcendental mathematical functions. This paper proposes a reconfigurable CORDIC hardware design and FPGA realization that includes all possible configurations of the CORDIC algorithm. The proposed architecture is introduced in two approaches: multiplier-less and single multiplier approaches, each with its advantages. Compared to recent related works, the proposed implementation overpasses them in the included number of configurations. Additionally, it demonstrates efficient hardware utilization and suitability for potential applications. Furthermore, the proposed design is applied to a memristive chaotic system with different transcendental functions computed using the proposed reconfigurable block. The memristive system design is realized on the Artix-7 FPGA board, yielding throughputs of 0.4483 and 0.3972 Gbit/s for the two approaches of reconfigurable CORDIC. [ABSTRACT FROM AUTHOR]
- Published
- 2022
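The CORDIC entries in this list describe the algorithm only as an iterative shift-add scheme. As a rough illustration, the textbook rotation-mode circular CORDIC for sine and cosine can be sketched as follows. This is a floating-point model, not the architecture of any listed paper; the function name `cordic_sin_cos` and its `iterations` parameter are assumptions made for this sketch. In hardware, each multiplication by `2**-i` would be a right shift and the arctangent values would come from a small lookup table.

```python
import math

def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode circular CORDIC (illustrative float model).

    Converges for |theta| up to about 1.74 rad, the sum of the
    elementary angles atan(2**-i).
    """
    # Elementary rotation angles; a hardware design would store these in a table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-divide by the total gain.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # rotate toward zero residual angle
        scale = 2.0 ** -i                # a right shift by i bits in fixed point
        x, y = x - d * y * scale, y + d * x * scale
        z -= d * angles[i]
    return y, x  # (sin(theta), cos(theta))
```

Calling `cordic_sin_cos(0.5)` returns approximations of sin(0.5) and cos(0.5); the accuracy improves by roughly one bit per iteration.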
6. Dualityfree Methods for Stochastic Composition Optimization.
- Author
- Liu, Liu, Liu, Ji, and Tao, Dacheng
- Subjects
REINFORCEMENT learning, STATISTICAL learning, MACHINE learning, CONJUGATE gradient methods, EMBEDDINGS (Mathematics), ARTIFICIAL intelligence, ALGORITHMS
- Abstract
In this paper, we consider the composition optimization with two expected-value functions in the form of $(1/n)\sum_{i=1}^{n} F_{i}\left((1/m)\sum_{j=1}^{m} G_{j}(x)\right) + R(x)$, which formulates many important problems in statistical learning and machine learning such as solving Bellman equations in reinforcement learning and nonlinear embedding. Full gradient- or classical stochastic gradient descent-based optimization algorithms are unsuitable or computationally expensive to solve this problem due to the inner expectation $(1/m)\sum_{j=1}^{m} G_{j}(x)$. We propose a dualityfree-based stochastic composition method that combines the variance reduction methods to address the stochastic composition problem. We apply the stochastic variance reduction gradient- and stochastic average gradient algorithm-based methods to estimate the inner function and the dualityfree method to estimate the outer function. We prove the linear convergence rate not only for the convex composition problem but also for the case that the individual outer functions are nonconvex, while the objective function is strongly convex. We also provide the results of experiments that show the effectiveness of our proposed methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
7. Compensation Network Optimal Design Based on Evolutionary Algorithm for Inductive Power Transfer System.
- Author
- Chen, Weiming, Lu, Weiguo, Iu, Herbert Ho-Ching, and Fernando, Tyrone
- Subjects
EVOLUTIONARY algorithms, CURRENT fluctuations, EVOLUTIONARY computation, ALGORITHMS, MATHEMATICAL models, EXPERIMENTAL design
- Abstract
Conventional design and optimization of passive compensation network (PCN) for inductive power transfer (IPT) system are based on specific topologies. The demerits of this design method are: i) The topology is mostly chosen by experience; ii) The design parameters are not multi-objective optimal. Aiming at these issues, this paper proposes an optimal PCN design scheme based on evolutionary algorithm (EA) to synchronously optimize the topology and parameters of PCN for IPT system. Firstly, a unified mathematical model of the PCN is presented and derived by transmission matrix. Then, according to the mathematical model, the multi-objective functions (such as output fluctuation and efficiency) as well as the constraints (such as load and coupling coefficient) for the optimal PCN design are established. The EA based multi-objective optimal PCN design algorithm is further constructed. Six optimal results are obtained using the algorithm, and one optimized PCN having minimum output current fluctuation and high-efficiency is chosen to validate the effectiveness of the proposed design scheme in experiment. For the given IPT system with the optimized PCN, the maximum fluctuation of output current is no more than 11%, within 200% of load variation and about 77% of coupling variation. [ABSTRACT FROM AUTHOR]
- Published
- 2020
8. Analyzing the Impact of Memristor Variability on Crossbar Implementation of Regression Algorithms With Smart Weight Update Pulsing Techniques.
- Author
- Afshari, Sahra, Musisi-Nkambwe, Mirembe, and Sanchez Esqueda, Ivan Sanchez
- Subjects
ALGORITHMS, MEMRISTORS, COMPUTER architecture, MATHEMATICAL models, INTEGRATING circuits
- Abstract
This paper presents an extensive study of linear and logistic regression algorithms implemented with 1T1R memristor crossbars arrays. Using a sophisticated simulation platform that wraps circuit-level simulations of 1T1R crossbars and physics-based models of RRAM (memristors), we elucidate the impact of device variability on algorithm accuracy, convergence rate and precision. Moreover, a smart pulsing strategy is proposed for practical implementation of synaptic weight updates that can accelerate training in real crossbar architectures. Stochastic multi-variable linear regression shows robustness to memristor variability in terms of prediction accuracy but reveals impact on convergence rate and precision. Similarly, the stochastic logistic regression crossbar implementation reveals immunity to memristor variability as determined by negligible effects on image classification accuracy but indicates an impact on training performance manifested as reduced convergence rate and degraded precision. [ABSTRACT FROM AUTHOR]
- Published
- 2022
9. A New Full Chaos Coupled Mapping Lattice and Its Application in Privacy Image Encryption.
- Author
- Wang, Xingyuan and Liu, Pengbo
- Subjects
IMAGE encryption, PRIVACY, DYNAMICAL systems, HEURISTIC algorithms, CRYPTOGRAPHY, ALGORITHMS
- Abstract
Since chaotic cryptography has a long-term problem of dynamic degradation, this paper presents proof that chaotic systems resist dynamic degradation through theoretical analysis. Based on this proof, a novel one-dimensional two-parameter with a wide-range system mixed coupled map lattice model (TWMCML) is given. The evaluation of TWMCML shows that the system has the characteristics of strong chaos, high sensitivity, broader parameter ranges and wider chaos range, which helps to enhance the security of chaotic sequences. Based on the excellent performance of TWMCML, it is applied to the newly proposed encryption algorithm. The algorithm realizes double protection of private images under the premise of ensuring efficiency and safety. First, the important information of the image is extracted by edge detection technology. Then the important area is scrambled by the three-dimensional bit-level coupled XOR method. Finally, the global image is more fully confused by the dynamic index diffusion formula. The simulation experiment verified the effectiveness of the algorithm for grayscale and color images. Security tests show that the application of TWMCML makes the encryption algorithm have a better ability to overcome conventional attacks. [ABSTRACT FROM AUTHOR]
- Published
- 2022
10. Toward Practical Code-Based Signature: Implementing Fast and Compact QC-LDGM Signature Scheme on Embedded Hardware.
- Author
- Hu, Jingwei and Cheung, Ray C. C.
- Subjects
PUBLIC key cryptography, CODING theory, ALGORITHMS, FIELD programmable gate arrays, DATA encryption
- Abstract
In this paper, fast and compact implementations for code-based signature are presented. Existing designs are either using enormous memory storage or suffering from slow issuing speed of signatures. A vastly optimized new design solving these problems is proposed by exploiting quasi-cyclic low-density generator matrix codes at different levels. In particular, this paper provides a new algorithmic enhancement of signature generation and gives detailed and optimized solutions for critical steps of this algorithm. The design presented in this paper is the fastest implementation of code-based signatures in open literature. It is shown, for instance, that our implementation of signature generation engine can generate approximately 60 000 signatures per second on a Xilinx Virtex-6 FPGA, requiring only 5992 slices and 60 memory blocks. In addition, a very compact implementation is also provided, producing 5438 signatures per second with only 18 memory blocks. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
11. Finite-/Fixed-Time Synchronization of Memristor Chaotic Systems and Image Encryption Application.
- Author
- Wang, Leimin, Jiang, Shan, Ge, Ming-Feng, Hu, Cheng, and Hu, Junhao
- Subjects
SLIDING mode control, CHAOS synchronization, IMAGE encryption, IMAGING systems, LYAPUNOV stability, ALGORITHMS
- Abstract
In this paper, a unified framework is proposed to address the synchronization problem of memristor chaotic systems (MCSs) via the sliding-mode control method. By employing the presented unified framework, the finite-time and fixed-time synchronization of MCSs can be realized simultaneously. On the one hand, based on the Lyapunov stability and sliding-mode control theories, the finite-/fixed-time synchronization results are obtained. It is proved that the trajectories of error states come near and get to the designed sliding-mode surface, stay on it accordingly and approach the origin in a finite/fixed time. On the other hand, we develop an image encryption algorithm as well as its implementation process to show the application of the synchronization. Finally, the theoretical results and the corresponding image encryption application are carried out by numerical simulations and statistical performances. [ABSTRACT FROM AUTHOR]
- Published
- 2021
12. Online Learning Algorithm Based on Adaptive Control Theory.
- Author
- Liu, Jian-Wei, Zhou, Jia-Jia, Kamel, Mohamed S., and Luo, Xiong-Lin
- Subjects
DISTANCE education, ALGORITHMS, MACHINE learning
- Abstract
This paper proposes a new online learning algorithm which is based on adaptive control (AC) theory, thus, we call this proposed algorithm as AC algorithm. Comparing to the gradient descent (GD) and exponential gradient (EG) algorithm which have been applied to online prediction problems, we find a new form of AC theory for online prediction problems and investigate two key questions: how to get a new update law which has a tighter upper bound on the error than the square loss? How to compare the upper bound for accumulated losses for the three algorithms? We obtain a new update law which fully utilizes model reference AC theory. Moreover, we present upper bound on the worst-case expected loss for AC algorithm and compare it with previously known bounds for the GD and EG algorithm. The loss bound we get in this paper is a time-varying function, which provides increasingly accurate estimates for upper bound. The AC algorithm has a much smaller loss only if the number of the samples meets certain conditions which can be seen in this paper. We also performed experiments which show that our update law is reasonably feasible and our upper bound is quite tight on both simple artificial and real data sets. The main contributions of this paper are twofold. First of all, we develop a new online algorithm called AC algorithm, and second, we obtain improved bounds, see Theorems 2–4 in this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2018
13. Extended Polynomial Growth Transforms for Design and Training of Generalized Support Vector Machines.
- Author
- Gangopadhyay, Ahana, Chatterjee, Oindrila, and Chakrabartty, Shantanu
- Subjects
SUPPORT vector machines, MACHINE learning, POLYNOMIALS, NONLINEAR programming, ALGORITHMS
- Abstract
Growth transformations constitute a class of fixed-point multiplicative update algorithms that were originally proposed for optimizing polynomial and rational functions over a domain of probability measures. In this paper, we extend this framework to the domain of bounded real variables which can be applied towards optimizing the dual cost function of a generic support vector machine (SVM). The approach can, therefore, not only be used to train traditional soft-margin binary SVMs, one-class SVMs, and probabilistic SVMs but can also be used to design novel variants of SVMs with different types of convex and quasi-convex loss functions. In this paper, we propose an efficient training algorithm based on polynomial growth transforms, and compare and contrast the properties of different SVM variants using several synthetic and benchmark data sets. The preliminary experiments show that the proposed multiplicative update algorithm is more scalable and yields better convergence compared to standard quadratic and nonlinear programming solvers. While the formulation and the underlying algorithms have been validated in this paper only for SVM-based learning, the proposed approach is general and can be applied to a wide variety of optimization problems and statistical learning models. [ABSTRACT FROM AUTHOR]
- Published
- 2018
14. Joint Sparsity and Order Optimization Based on ADMM With Non-Uniform Group Hard Thresholding.
- Author
- Matsuoka, Ryo, Kyochi, Seisuke, Ono, Shunsuke, and Okuda, Masahiro
- Subjects
FINITE impulse response filters, DIGITAL signal processing, LEAST squares, PROGRAM transformation, MULTIPLIERS (Mathematical analysis), ALGORITHMS
- Abstract
This paper proposes a new optimization framework for the joint optimization of sparsity and filter order (JOSFO) for FIR filter design. Since the cost function for JOSFO involves $\ell_0$ and non-uniform overlapped group $\ell_0$ norms, which are not convex, a global optimal solution is difficult to obtain. To find an approximate solution of the non-convex problem, existing approaches repeat the following steps: 1) approximate the cost function; 2) find candidates of zero coefficients by minimizing the cost function; and 3) set them to zero. On the other hand, this paper directly solves the optimization problem, without any approximation to the cost function, by using the alternating direction method of multipliers with the pseudo-proximity operators of $\ell_0$ and non-uniform non-overlapped group $\ell_0$ norms. Experimental results show that resulting filters designed by the proposed method have sparser coefficients and lower orders, while satisfying filter specifications, such as an error from a desired frequency response. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
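The abstract above names hard thresholding and proximity operators for the $\ell_0$ norm without giving their form. For orientation, the standard hard-thresholding operator, which minimizes $\lambda\|x\|_0 + \tfrac{1}{2}\|x - v\|^2$, and a simple variant for non-overlapping groups can be sketched as below. This is a generic illustration, not the paper's pseudo-proximity operators; the names `hard_threshold` and `group_hard_threshold`, and the use of a single weight `lam` for every group, are assumptions of this sketch.

```python
import math

def hard_threshold(v, lam):
    """Minimize lam*||x||_0 + 0.5*||x - v||^2 elementwise.

    Keeping v[i] costs lam; zeroing it costs 0.5*v[i]**2, so an entry
    survives only if |v[i]| > sqrt(2*lam).
    """
    t = math.sqrt(2.0 * lam)
    return [x if abs(x) > t else 0.0 for x in v]

def group_hard_threshold(v, groups, lam):
    """Same rule applied to whole non-overlapping index groups.

    A group is kept intact when half its squared norm exceeds lam,
    and set to zero otherwise.
    """
    out = [0.0] * len(v)
    for idx in groups:
        energy = sum(v[i] ** 2 for i in idx)
        if 0.5 * energy > lam:
            for i in idx:
                out[i] = v[i]
    return out
```

With `lam = 1.0`, elementwise thresholding zeroes every entry of magnitude at most about 1.414, while the group version can retain a small entry if it shares a group with a large one.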
15. Fault Modeling and Efficient Testing of Memristor-Based Memory.
- Author
- Liu, Peng, You, Zhiqiang, Wu, Jigang, Liu, Bosheng, Han, Yinhe, and Chakrabarty, Krishnendu
- Subjects
BRIDGE defects, MEMORY testing, ALGORITHMS, DISCRETE Fourier transforms, OPTICAL disks, MEMRISTORS
- Abstract
Memristor-based memory technology is one of the emerging memory technologies, which is a potential candidate to replace traditional memories. Efficient test solutions are required to enable the quality and reliability of such products. In previous works, fault models are caused by open, short and bridge defects and parametric variations during the fabrication. However, these fault models cannot describe the bridge defects that cause the state of the faulty cell to an undefined state. In this paper, we analyze the different effects of bridge defects and aggregate their faulty behavior into new fault models, undefined coupling fault and dynamic undefined coupling fault. In addition, an enhanced March algorithm is designed to detect all the modeled faults. In one resistor crossbar with $N$ memristors, the enhanced March algorithm requires $8N$ write and $7N$ read operations with negligible hardware overhead. To reduce the test time, a March RC algorithm is proposed based on read operations with new reference currents, which requires $4N+2$ write and $6N$ read operations. Analytical results show that the proposed test algorithms can detect all the modeled faults outperforming all the previous methods. Subsequently, a Design-for-Testability scheme is proposed to implement March RC algorithm with a little area overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2021
16. The Impact of Device Uniformity on Functionality of Analog Passively-Integrated Memristive Circuits.
- Author
- Fahimi, Z., Mahmoodi, M. R., Klachko, M., Nili, H., and Strukov, D. B.
- Subjects
UNIFORMITY, MEMRISTORS, COMPUTER systems, ANALOG circuits, ALGORITHMS, NEUROMORPHICS
- Abstract
Passively-integrated memristors are the most prospective candidates for designing high-speed, energy-efficient, and compact neuromorphic circuits. Despite all the promising properties, experimental demonstrations of passive memristive crossbars have been limited to circuits with few thousands of devices until now, which stems from the strict uniformity requirements on the IV characteristics of memristors. This paper expands upon this vital challenge and investigates how uniformity impacts the computing accuracy of analog memristive circuits, focusing on neuromorphic applications. Specifically, the paper explores the tradeoffs between computing accuracy, crossbar size, switching threshold variations, and target precision. All-embracing simulations of matrix multipliers and deep neural networks on CIFAR-10 and ImageNet datasets have been carried out to evaluate the role of uniformity on the accuracy of computing systems. Further, we study three post-fabrication methods that increase the accuracy of nonuniform 0T1R neuromorphic circuits: hardware-aware training, improved tuning algorithm, and switching threshold modification. The application of these techniques allows us to implement advanced deep neural networks with almost no accuracy drop, using current state-of-the-art analog 0T1R technology. [ABSTRACT FROM AUTHOR]
- Published
- 2021
17. A Smoothed LASSO-Based DNN Sparsification Technique.
- Author
- Koneru, Basava Naga Girish, Chandrachoodan, Nitin, and Vasudevan, Vinita
- Subjects
ERROR functions, ALGORITHMS, APPROXIMATION algorithms, SMOOTHNESS of functions, COST functions
- Abstract
Deep Neural Networks (DNNs) are increasingly being used in a variety of applications. However, DNNs have huge computational and memory requirements. One way to reduce these requirements is to sparsify DNNs by using smoothed LASSO (Least Absolute Shrinkage and Selection Operator) functions. In this paper, we show that irrespective of error profile, the sparsity values obtained using various smoothed LASSO functions are similar, provided the maximum error of these functions with respect to the LASSO function is the same. We also propose a layer-wise DNN pruning algorithm, where the layers are pruned based on their individual allocated accuracy loss budget, determined by estimates of the reduction in number of multiply-accumulate operations (in convolutional layers) and weights (in fully connected layers). Further, the structured LASSO variants in both convolutional and fully connected layers are explored within the smoothed LASSO framework and the tradeoffs involved are discussed. The efficacy of proposed algorithm in enhancing the sparsity within the allowed degradation in DNN accuracy and results obtained on structured LASSO variants are shown on MNIST, SVHN, CIFAR-10, and Imagenette datasets and on larger networks such as ResNet-50 and Mobilenet. [ABSTRACT FROM AUTHOR]
- Published
- 2021
18. Constructing Higher-Dimensional Digital Chaotic Systems via Loop-State Contraction Algorithm.
- Author
- Wang, Qianxue, Yu, Simin, Guyeux, Christophe, and Wang, Wei
- Subjects
PROBLEM solving, ALGORITHMS, TIME series analysis, COMPACT spaces (Topology)
- Abstract
This paper aims to refine and expand the theoretical and application framework of higher-dimensional digital chaotic system (HDDCS). Topological mixing for HDDCS is strictly proved theoretically at first. Topological mixing implies Devaney’s definition of chaos in a compact space, but not vice versa. Therefore, the proof of topological mixing promotes the theoretical research of HDDCS. Then, a general design method for constructing HDDCS via loop-state contraction algorithm is given. The construction of the iterative function uncontrolled by random sequences (hereafter called iterative function) is the starting point of this research. On this basis, this paper put forward a general design method to solve the construction problem of HDDCS, and several examples illustrate the effectiveness and feasibility of this method. The adjacency matrix corresponding to the designed HDDCS is used to construct the chaotic Echo State Network (ESN) for predicting Mackey-Glass time series. Compared with other ESNs, the chaotic ESN has better prediction performance and is able to accurately predict a much longer period of time. [ABSTRACT FROM AUTHOR]
- Published
- 2021
19. IEEE Transactions on Circuits and Systems—I: Regular Papers information for authors.
- Subjects
PUBLISHING, MANUSCRIPT preparation (Authorship), COMPUTER-aided design, SIGNAL processing, ELECTRIC circuits, ALGORITHMS
- Abstract
Provides instructions and guidelines to prospective authors who wish to submit manuscripts. [ABSTRACT FROM PUBLISHER]
- Published
- 2013
20. Efficient Row-Layered Decoder for Sparse Code Multiple Access.
- Author
- Pang, Xu, Song, Wenqing, Shen, Yifei, You, Xiaohu, and Zhang, Chuan
- Subjects
BIT error rate, MESSAGE passing (Computer science), ALGORITHMS, WIRELESS communications, TECHNOLOGY convergence, BARBELLS, VERY large scale circuit integration, JACOBIAN matrices
- Abstract
Sparse code multiple access (SCMA) is a promising technology for the development of wireless communication, which supports a large number of overloading users and enjoys high spectral efficiency. However, conventional SCMA decoders suffer very high complexity in implementations. Changing the updating scheme is a superior approach to reduce complexity, which guarantees the updated information immediately join in the following message propagating of the current iteration and accelerates the decoding convergence. In this paper, a row-layered message passing algorithm (MPA) is proposed, which offers a good trade-off between the hardware complexity and the bit error rate (BER) performance. Simulation results show that the proposed decoder saves 66.7% computation complexity compared with the original MPA with the similar BER performance. Pipelining and folding technology are adopted in VLSI implementations. The synthesis results with 45-nm CMOS technology show that the proposed decoder can achieve higher hardware efficiency and throughput under a high frequency than the existing decoders, achieving 1777.78 Mb/s throughput with 1.112 mm$^2$ area consumption. [ABSTRACT FROM AUTHOR]
- Published
- 2021
21. Privacy-Preserving Consensus for Multi-Agent Systems via Node Decomposition Strategy.
- Author
- Wang, Yaqi, Lu, Jianquan, Zheng, Wei Xing, and Shi, Kaibo
- Subjects
MULTIAGENT systems, DISTRIBUTED algorithms, ALGORITHMS, UNDIRECTED graphs, COMPUTATIONAL complexity, INFORMATION sharing
- Abstract
This paper proposes two kinds of algorithms to achieve privacy-preserving consensus of multi-agent systems over undirected graphs via node decomposition mechanism and homomorphic cryptography technique. Based on the number of neighboring nodes ($|N_i|$), every agent is decomposed into $|N_i|$ subagents, which are connected as a chain graph. Note that every subagent connects one and only one non-homologous subagent (generated by different agents). Information interaction between non-homologous subagents is encrypted by a homomorphic cryptography algorithm, and homologous subagents exchange information directly. In this regard, the proposed node decomposition mechanism enhances the privacy of the initial values without increasing the computational complexity of encryption. The first privacy-preserving algorithm can achieve the accurate average consensus, which means that the agreement value of every subagent is consistent with the original average consensus value. The second algorithm studies the privacy-preserving scaled consensus problem without a priori knowledge about the underlying graph. Although the final convergence values of subagents do not keep exactly the same, homologous subagents can compute the original group decision value by resorting to the product of the limit value and agent’s degree. Importantly, this algorithm also guarantees the privacy of group decision value of the whole system. Besides, it is proved that the privacy of the initial value can be preserved if the agent has at least one neutral neighbor. [ABSTRACT FROM AUTHOR]
- Published
- 2021
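The abstract above builds on average consensus over an undirected graph. As background only, the plain (non-private) discrete-time consensus iteration, in which each agent moves toward its neighbors' values, can be sketched as follows. It contains no node decomposition and no encryption, so it is not the paper's algorithm; the function name `average_consensus` and its parameters are assumptions of this sketch.

```python
def average_consensus(values, edges, step, iterations):
    """Plain average consensus: x_i <- x_i + step * sum_j (x_j - x_i).

    `edges` lists undirected links (i, j). On a connected graph the states
    converge to the mean of the initial values when `step` is below
    1 / (maximum node degree).
    """
    x = list(values)
    neighbors = {i: [] for i in range(len(x))}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(iterations):
        # Synchronous update: every agent uses the previous round's states.
        x = [xi + step * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x
```

For example, four agents on a chain with initial values 1, 5, 3 and 7 all approach the average 4 under `step = 0.3`, and the sum of the states is preserved at every iteration because the update is symmetric.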
22. Dynamic Deadband Event-Triggered Strategy for Distributed Adaptive Consensus Control With Applications to Circuit Systems.
- Author
- Xu, Yong, Sun, Jian, Pan, Ya-Jun, and Wu, Zheng-Guang
- Subjects
MULTIAGENT systems, SELF-tuning controllers, DATA transmission systems, DATA reduction, ADAPTIVE control systems, ALGORITHMS
- Abstract
This paper focuses on the distributed consensus seeking of multi-agent systems (MASs) with discrete-time control updating and intermittent communications among agents. Compared with existing linearly coupled protocols, a nonlinear coupled Zeno-free event-triggered controller is first proposed, which is further to project the static and dynamic triggering mechanisms exploited by using the deadband control method. Then, the node-based nonlinear coupled adaptive event-triggered controller with online self-tuning of time-varying coupling weight and its corresponding to static and dynamic deadband-based event-triggered mechanisms are designed, respectively. The exploited adaptive event-triggered controller does not rely on any global information of interaction structure and is implemented in a fully distributed fashion. In addition, two dynamic proposals not only cover existing static strategies as special cases, but also show that the minimal inter-execution time of dynamic one is not smaller than that of static one. Theoretical analysis shows that the proposed static and dynamic deadband-based event-triggered mechanisms can not only ensure the average consensus with Zeno-freeness, but also achieve the data reduction of communication and control. Finally, the proposed algorithms applied to circuit implementation are corroborated to prove its practical merits and validity. [ABSTRACT FROM AUTHOR]
- Published
- 2022
23. A New Fast Algorithm for Discrete Fractional Hadamard Transform.
- Author
- Cariow, Aleksandr, Majorkowska-Mech, Dorota, Paplinski, Janusz P., and Cariowa, Galina
- Subjects
HADAMARD matrices, DISCRETE Fourier transforms, ALGORITHMS, SYMMETRIC matrices, MATRIX decomposition, VECTOR data
- Abstract
This paper proposes a new fast algorithm for calculating the discrete fractional Hadamard transform for data vectors whose size $N$ is a power of two. A direct method for the calculation of the discrete fractional Hadamard transform requires $O(N^{2})$ multiplications, the last fast algorithm requires $O(N \log_2 N)$, while in the proposed algorithm the number of multiplications is reduced to $O(N)$. [ABSTRACT FROM AUTHOR]
- Published
- 2019
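As a point of reference for the operation counts quoted in the abstract above, the ordinary (non-fractional) Walsh-Hadamard transform can be computed with a butterfly structure in $N \log_{2} N$ additions and subtractions instead of the $N^{2}$ operations of a direct matrix product. The sketch below shows only that textbook butterfly; it is not the fractional algorithm proposed in the paper, and the function names are illustrative assumptions.

```python
def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform (Sylvester/natural order).

    Textbook butterfly, shown for orientation only; it is NOT the
    fractional Hadamard algorithm described in the abstract.
    """
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Combine pairs of blocks of size h with add/subtract butterflies.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                u, v = a[i], a[i + h]
                a[i], a[i + h] = u + v, u - v
        h *= 2
    return a


def hadamard_direct(x):
    """Direct O(N^2) product with the Sylvester Hadamard matrix, for checking."""
    n = len(x)
    return [
        sum(x[j] * (-1) ** bin(i & j).count("1") for j in range(n))
        for i in range(n)
    ]
```

Applying `fwht` twice returns the input scaled by $N$, which gives a quick sanity check of the butterfly.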
24. Efficient Shift-Add Implementation of FIR Filters Using Variable Partition Hybrid Form Structures.
- Author
-
Ray, Dwaipayan, George, Nithin V., and Meher, Pramod Kumar
- Subjects
FINITE impulse response filters ,ENERGY consumption ,HARDWARE ,DIGITAL signal processing ,ALGORITHMS - Abstract
Single constant multiplication (SCM) and multiple constant multiplications (MCM) are among the most popular schemes used for low-complexity shift-add implementation of finite impulse response (FIR) filters. While SCM is used in the direct form realization of FIR filters, MCM is used in the transposed direct form structures. Very often, the hybrid form FIR filters where the sub-sections are implemented by fixed-size MCM blocks provide better area, time, and power efficiency than those of traditional MCM and SCM based implementations. To have an efficient hybrid form filter, in this paper, we have performed a detailed complexity analysis in terms of the hardware and time consumed by the hybrid form structures. We find that the existing hybrid form structures lead to an undesirable increase of complexity in the structural-adder block. Therefore, to have a more efficient implementation, a variable size partitioning approach is proposed in this paper. It is shown that the proposed approach consumes less area and provides nearly 11% reduction of critical path delay, 40% reduction of power consumption, 15% reduction of area-delay product, 52% reduction of energy-delay product, and 42% reduction of power-area product, on an average, over the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2018
25. Cooperative Stabilization of a Class of LTI Plants With Distributed Observers.
- Author
-
Liu, Kexin, Zhu, Henghui, and Lu, Jinhu
- Subjects
LINEAR systems ,ALGORITHMS ,COMPUTER simulation - Abstract
Over the last decades, the cooperative design of complex networked systems has received increasing attention in real-world engineering practice. Traditionally, each node in the network is assumed to obtain the same signal. However, each agent often possesses different measurements due to the observability or configuration of the systems. To solve the stabilization problem in this case, we aim to establish a unified framework for the cooperative control of complex networks with distributed observers. In detail, different from the traditional centralized design, this paper initiates a cooperative approach that uses only the local information of the networked systems. For three representative kinds of networks, this paper establishes some sufficient or necessary conditions for the existence of network parameters that guarantee the stabilization of the LTI plants. Moreover, corresponding algorithms are developed to find suitable parameters of the networked observers. Last but not least, numerical simulations are provided to verify the above theoretical results. [ABSTRACT FROM AUTHOR]
- Published
- 2017
26. Distributed Model Predictive Consensus of Heterogeneous Time-Varying Multi-Agent Systems: With and Without Self-Triggered Mechanism.
- Author
-
Li, Huiyan and Li, Xiang
- Subjects
MULTIAGENT systems ,TIME-varying systems ,PREDICTION models ,CONSTRAINT algorithms ,INTEGRATORS ,ALGORITHMS - Abstract
This paper provides a framework for designing a distributed model predictive controller to reach consensus of a heterogeneous time-varying multi-agent system, in which the dynamics of the heterogeneous agents are modeled by double integrators and Euler-Lagrange (EL) equations. First, a consensus algorithm based on distributed model predictive control (DMPC) is proposed, where the constraints in the algorithm depend on the heterogeneous dynamics. We prove that the resultant DMPC optimization problem is feasible with the designed controllers, which is stable when the system reaches consensus. To further reduce communication cost and solve the problem with asynchronous discrete-time information exchange, a self-triggered mechanism is introduced into the framework. Trigger intervals are optimized alternately with the control inputs, and the influence on the system performance is analyzed. Numerical examples are provided to verify the effectiveness and advantages of the proposed algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2020
27. Leakage-Aware Battery Lifetime Analysis Using the Calculus of Variations.
- Author
-
Jafari-Nodoushan, Mostafa, Safaei, Bardia, Ejlali, Alireza, and Chen, Jian-Jia
- Subjects
ELECTRIC batteries ,LEAKAGE ,CURVE fitting ,ALGORITHMS ,ONLINE algorithms ,CALCULUS of variations - Abstract
Due to non-linear factors such as the rate capacity and the recovery effect, the shape of the battery discharge curve plays a significant role in the overall lifetime of the batteries. Accordingly, this paper proposes a simple heuristic battery-aware speed scheduling policy for periodic and non-periodic real-time tasks in Dynamic Voltage Scaling (DVS) systems with non-negligible leakage/static power. A set of comprehensive analysis has been conducted to compare the battery efficiency of the proposed policies with an optimal solution, which could be derived via the Calculus of Variations (CoV). These evaluations have taken into account both periodic and non-periodic tasks in DVS-based systems. Our experiments have shown a maximum of 7% difference between the optimal solution and the simple heuristic speed scheduling for realistic settings of the battery model. By considering the calculated optimal speed scheduling for different tasks (with different utilizations), a two-phase algorithm has been proposed, in which a speed approximation function is being calculated offline based on curve fitting, while the best execution speed is applied online. The results show a maximum of 17.7% and 11.3% battery charge saving for non-periodic and periodic tasks in comparison to the baseline critical frequency method, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2020
28. Implementation of Supersingular Isogeny-Based Diffie-Hellman and Key Encapsulation Using an Efficient Scheduling.
- Author
-
Farzam, Mohammad-Hossein, Bayat-Sarmadi, Siavash, and Mosanaei-Boorani, Hatameh
- Subjects
ALGORITHMS ,CONSTRAINT programming ,SCHEDULING ,QUANTUM cryptography ,CRYPTOGRAPHY - Abstract
Isogeny-based cryptography is one of the promising post-quantum candidates mainly because of its smaller public key length. Due to its high computational cost, efficient implementations are significantly important. In this paper, we have proposed a high-speed FPGA implementation of the supersingular isogeny Diffie-Hellman (SIDH) and key encapsulation (SIKE). To this end, we have adapted the algorithm of finding optimal large-degree isogeny computation strategy for hardware implementations. Using this algorithm, hardware-suited strategies (HSSs) can be devised. We have also developed a tool to schedule field arithmetic operations efficiently using constraint programming. This tool enables reducing the latency of SIDH and SIKE subroutines by up to 14% at NIST’s highest security level, i.e., using the SIKEp751 parameter set. We have also improved the latency of field inversion, the most costly field operation in SIDH, by 23% using the Montgomery ladder technique. We have provided constant-time implementations of SIDH and SIKE on Virtex-7 using SIKEp751 utilizing 6 and 8 prime field multipliers to resemble the previous work. Experimental results show that using 8 multipliers SIDH and SIKE encapsulation and decapsulation can be performed in 24.66 ms and 24.10 ms, which is 1.37 and 1.12 times faster than the latest corresponding FPGA implementations, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2020
29. Scanning the Issue.
- Subjects
WAVE equation ,ALGORITHMS ,STATE-space methods ,LYAPUNOV functions ,HARMONIC oscillators - Abstract
Provides an overview of the technical articles and features presented in this issue. [ABSTRACT FROM AUTHOR]
- Published
- 2017
30. Real-Time Distance Evaluation System for Wireless Localization.
- Author
-
Piccinni, Giovanni, Avitabile, Gianfranco, Coviello, Giuseppe, and Talarico, Claudio
- Subjects
MATHEMATICAL sequences ,FIELD programmable gate arrays ,ALGORITHMS ,WIRELESS localization ,MULTIPATH channels - Abstract
The paper describes the FPGA implementation of a novel position evaluation algorithm based on the time difference of arrival (TDOA) principle that combines the characteristics of an Orthogonal Frequency Division Modulation (OFDM) symbol with the properties of the Zadoff-Chu mathematical sequences. The resulting system is highly scalable and its characteristics are easily adaptable to different operating scenarios. The algorithm has been implemented using the Stratix IV-E EP4SGX70HF35C3 FPGA, requiring about 112k bit of memory and less than 44k logic elements of which about 16k are registers. The paper describes the algorithm and its FPGA implementation along with experimental results validating the system performance even in the presence of multipath interferences and showing precision in target position estimation that is better than 2 cm. [ABSTRACT FROM AUTHOR]
- Published
- 2020
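The abstract above relies on the correlation properties of Zadoff-Chu sequences. A minimal sketch of those properties, assuming an odd sequence length and a root index coprime to it, is given below: the sequence has constant amplitude, its cyclic autocorrelation vanishes at every nonzero lag, and a cyclic delay can therefore be read off from the correlation peak. The function names and the cyclic-shift delay model are illustrative assumptions and do not reproduce the paper's OFDM-based FPGA design.

```python
import cmath
import math


def zadoff_chu(root, length):
    """Zadoff-Chu sequence of odd length with root coprime to length."""
    if length % 2 == 0 or math.gcd(root, length) != 1:
        raise ValueError("length must be odd and coprime to root")
    seq = []
    for n in range(length):
        # Reduce the phase index modulo 2*length to keep the angle small.
        k = (root * n * (n + 1)) % (2 * length)
        seq.append(cmath.exp(-1j * math.pi * k / length))
    return seq


def cyclic_xcorr(received, reference, lag):
    """Cyclic cross-correlation of received with reference delayed by lag."""
    n = len(reference)
    return sum(
        received[k] * reference[(k - lag) % n].conjugate() for k in range(n)
    )


def estimate_delay(received, reference):
    """Return the cyclic lag (in samples) with the largest correlation magnitude."""
    n = len(reference)
    return max(
        range(n), key=lambda lag: abs(cyclic_xcorr(received, reference, lag))
    )
```

In a TDOA setting, the difference between the delays estimated at two receivers, multiplied by the sample period and the propagation speed, gives the range difference used for positioning.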
31. Label-less Learning for Emotion Cognition.
- Author
-
Chen, Min and Hao, Yixue
- Subjects
BLENDED learning ,COGNITION ,EMOTIONS ,LABELS ,TAGS (Metadata) ,EMOTION recognition ,ALGORITHMS - Abstract
In this paper, we propose a label-less learning for emotion cognition (LLEC) to achieve the utilization of a large amount of unlabeled data. We first inspect the unlabeled data from two perspectives, i.e., the feature layer and the decision layer. By utilizing the similarity model and the entropy model, this paper presents a hybrid label-less learning that can automatically label data without human intervention. Then, we design an enhanced hybrid label-less learning to purify the automatic labeled data. To further improve the accuracy of emotion detection model and increase the utilization of unlabeled data, we apply enhanced hybrid label-less learning for multimodal unlabeled emotion data. Finally, we build a real-world test bed to evaluate the LLEC algorithm. The experimental results show that the LLEC algorithm can improve the accuracy of emotion detection significantly. [ABSTRACT FROM AUTHOR]
- Published
- 2020
32. Consensus of Multi-Agent Systems Under Binary-Valued Measurements and Recursive Projection Algorithm.
- Author
-
Wang, Ting, Zhang, Hang, and Zhao, Yanlong
- Subjects
MULTIAGENT systems ,PARAMETER estimation ,RANDOM noise theory ,ALGORITHMS - Abstract
This paper studies consensus problems of multi-agent systems with binary-valued communications. Different from most existing works, the agents considered in this paper can only get binary-valued observations of their neighbors’ states with random noises. A consensus algorithm is proposed: first, each agent estimates its neighbors’ states by the recursive projection algorithm; then, each agent designs its control in a timely manner based on the estimates. It is proved that the estimates of the states can converge to the true states with a faster convergence rate than that in the parameter estimation. Moreover, the states of the agents can achieve mean-square consensus, and the corresponding consensus speed can achieve $O(1/t)$ under certain conditions. Finally, simulations are given to demonstrate the theoretical results. [ABSTRACT FROM AUTHOR]
- Published
- 2020
33. Fast 2D Convolution Algorithms for Convolutional Neural Networks.
- Author
-
Cheng, Chao and Parhi, Keshab K.
- Subjects
DECONVOLUTION (Mathematics) ,MATHEMATICAL convolutions ,ALGORITHMS ,ADDITION (Mathematics) ,KRONECKER products - Abstract
Convolutional Neural Networks (CNN) are widely used in different artificial intelligence (AI) applications. A major part of the computation of a CNN involves 2D convolution. In this paper, we propose novel fast convolution algorithms for both 1D and 2D to remove the redundant multiplication operations in convolution computations at the cost of a controlled increase in addition operations. For example, when the 2D processing block size is $3\times 3$, our algorithm has a multiplication saving factor as high as 3.24, compared to the direct 2D convolution computation scheme. The proposed algorithm can also process input feature maps and generate output feature maps with the same flexible block sizes that are independent of the convolution weight kernel size. The memory access efficiency is also largely improved by the proposed method. These structures can be applied to different CNN layers, such as convolution with stride > 1, pooling and deconvolution, by exploring flexible feature map processing tile sizes. The proposed algorithm is suitable for both software and hardware implementation. [ABSTRACT FROM AUTHOR]
- Published
- 2020
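The trade described in the abstract above, fewer multiplications in exchange for more additions, can be seen in the classic Winograd minimal filtering scheme F(2,3), which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. The sketch below shows that well-known 1D case only; it is offered as an illustration of the general idea and is not the algorithm proposed in the paper.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap sliding dot product: 6 multiplications."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return y0, y1


def winograd_f23(d, g):
    """Same two outputs using 4 data-by-filter multiplications.

    The filter-side terms (g0 + g1 + g2) / 2 and (g0 - g1 + g2) / 2 depend
    only on the weights, so they can be precomputed once per filter.
    """
    g_sum = (g[0] + g[1] + g[2]) / 2
    g_alt = (g[0] - g[1] + g[2]) / 2
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_alt
    m4 = (d[1] - d[3]) * g[2]
    return m1 + m2 + m3, m2 - m3 - m4
```

Nesting the same transform along rows and columns yields 2D variants, which is where the savings for CNN layers come from.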
34. Proper Orthogonal Decomposition Method to Nonlinear Filtering Problems in Medium-High Dimension.
- Author
-
Wang, Zhongjian, Luo, Xue, Yau, Stephen S.-T., and Zhang, Zhiwen
- Subjects
PROPER orthogonal decomposition ,DECOMPOSITION method ,NONLINEAR equations ,ORTHOGONAL decompositions ,ALGORITHMS - Abstract
In this paper, we investigate the proper orthogonal decomposition (POD) method to numerically solve the forward Kolmogorov equation (FKE). Our method aims to explore the low-dimensional structures in the solution space of the FKE and to develop efficient numerical methods. As an important application and our primary motivation to study the POD method for the FKE, we solve the nonlinear filtering (NLF) problems with a real-time algorithm proposed by Yau and Yau combined with the POD method. This algorithm is referred to as the POD algorithm in this paper. Our POD algorithm consists of offline and online stages. In the offline stage, we construct a small number of POD basis functions that capture the dynamics of the system and compute the propagation of the POD basis functions under the FKE operator. In the online stage, we synchronize the incoming observations in a real-time manner. Its convergence analysis is also discussed. Some numerical experiments on the NLF problems are performed to illustrate the feasibility of our algorithm and to verify the convergence rate. Our numerical results show that the POD algorithm provides considerable computational savings over existing numerical methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
35. Robust Stability for Multiple Model Adaptive Control: Part II—Gain Bounds.
- Author
-
Buchstaller, Dominic and French, Mark
- Subjects
ROBUST control ,ADAPTIVE control systems ,ESTIMATION theory ,ALGORITHMS ,UNCERTAINTY (Information theory) ,PERFORMANCE evaluation - Abstract
The axiomatic development of a wide class of Estimation based Multiple Model Switched Adaptive Control (EMMSAC) algorithms considered in the first part of this two part contribution forms the basis for the proof of the gain bounds given in this paper. The bounds are determined in terms of a cover of the uncertainty set, and in particular, in many instances, are independent of the number of candidate plant models under consideration. The full interpretation, implications and usage of these bounds within design synthesis are discussed in part I. Here in part II, key features of the bounds are also discussed and a simulation example is considered. It is shown that a dynamic EMMSAC design can be universal and hence non-conservative and hence outperforms static EMMSAC and other conservative designs. A wide range of possible dynamic algorithms are outlined, motivated by both performance and implementation considerations. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
36. Matrix-Based Algorithms for the Optimal Design of Variable Fractional Delay FIR Filters.
- Author
-
Zhao, Ruijie and Hong, Xiaoying
- Subjects
FINITE impulse response filters ,ALGORITHMS ,KERNEL functions ,CONJUGATE gradient methods ,MATRIX inequalities - Abstract
This paper investigates the weighted least squares (WLS) and minimax design problems for variable fractional delay (VFD) FIR filters. For the WLS design, the general case of incorporating arbitrary nonnegative weighting functions is considered. The optimal solution is characterized by two matrix equations. An efficient algorithm using conjugate gradient (CG) techniques is proposed to obtain the WLS solution. The proposed algorithm is guaranteed to converge to the optimal solution in a finite number of iterations. Moreover, an iterative reweighted least squares (IRLS) algorithm that uses the proposed CG algorithm as its iteration core is developed for the minimax design problem. In both algorithms, the filter coefficients are arranged as matrices, achieving a great saving in computations and memory space. The associated computational complexity is analyzed. Some design examples are provided, and comparisons with existing methods show that the proposed ones are either computationally more efficient or can obtain better filter performance, or both. [ABSTRACT FROM AUTHOR]
- Published
- 2019
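The finite-termination property mentioned in the abstract above is a general feature of the conjugate gradient method: for a symmetric positive definite system of size $n$, it reaches the exact solution in at most $n$ iterations in exact arithmetic. The sketch below is the textbook vector form of CG, written with plain Python lists; the matrix-arranged variant and the weighting functions of the paper are not reproduced, and the names used are illustrative assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-12, max_iter=None):
    """Solve A x = b for symmetric positive definite A.

    matvec(p) must return the product A p as a list. This is the textbook
    method, not the matrix-form algorithm of the abstract.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x for the zero initial guess
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter if max_iter is not None else 2 * n):
        if rs_old ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction is conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

Passing the operator as `matvec` rather than as an explicit matrix mirrors the reason CG suits structured problems: only products with the system operator are needed.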
37. A Fast and Power-Efficient Hardware Architecture for Visual Feature Detection in Affine-SIFT.
- Author
-
Ouyang, Peng, Yin, Shouyi, Liu, Leibo, Zhang, Youguang, Zhao, Weisheng, and Wei, Shaojun
- Subjects
COMPUTER vision ,GAUSSIAN processes ,ALGORITHMS - Abstract
Visual feature detection has been widely used in many computer vision applications, with increasing concern for feature robustness, processing speed, and power efficiency. In comparison with popular feature detection algorithms, affine-SIFT achieves the strongest robustness to image illumination, image rotation, and image scale transformation, but exhibits extremely high computational complexity. To improve its computing efficiency, this work first proposes three hardware optimization methods to address three main performance bottlenecks. The first method is reverse affine-based pipelined computing with optimized memory accessing. The second method is stream processing with a fully parallel Gaussian pyramid. The third method is rotation-invariant binary pattern based feature vector generation. Then, by incorporating these three optimization methods, this paper designs a highly efficient pipelined and parallel hardware architecture with optimized parallel memory accessing. Postlayout simulations using the TSMC 65-nm 1P9M low power process show that this work achieves a processing speed of 97 fps at 1080p (1000 feature points per frame on average) under 200 MHz, with power consumption of 300 mW. In comparison, its computing efficiency (1005.6K pixels/s at 1 MHz) and power efficiency (670.5K pixels/s at 1 mW) are higher than those of state-of-the-art works, and it is more promising for broad vision applications, especially embedded vision and mobile vision applications. [ABSTRACT FROM AUTHOR]
- Published
- 2018
38. $k$-Times Markov Sampling for SVMC.
- Author
-
Zou, Bin, Xu, Chen, Lu, Yang, Tang, Yuan Yan, Xu, Jie, and You, Xinge
- Subjects
SUPPORT vector machines ,MARKOV processes ,ALGORITHMS ,TRAINING ,SAMPLES (Commerce) - Abstract
Support vector machine (SVM) is one of the most widely used learning algorithms for classification problems. Although SVM has good performance in practical applications, it has high algorithmic complexity when the training sample size is large. In this paper, we introduce an SVM classification (SVMC) algorithm based on $k$-times Markov sampling and present numerical studies on the learning performance of SVMC with $k$-times Markov sampling for benchmark data sets. The experimental results show that the SVMC algorithm with $k$-times Markov sampling not only has smaller misclassification rates and requires less sampling and training time, but also yields a sparser classifier than the classical SVMC and the previously known SVMC algorithm based on Markov sampling. We also discuss the performance of SVMC with $k$-times Markov sampling for the cases of unbalanced training samples and large-scale training samples. [ABSTRACT FROM AUTHOR]
- Published
- 2018
39. A Computationally Efficient Reconfigurable Constant Multiplication Architecture Based on CSD Decoded Vertical–Horizontal Common Sub-Expression Elimination Algorithm.
- Author
-
Hatai, Indranil, Chakrabarti, Indrajit, and Banerjee, Swapna
- Subjects
FINITE impulse response filters ,ALGORITHMS ,ENCODING - Abstract
This paper introduces a computationally efficient hardware architecture for reconfigurable multiple constant multiplication block, which functions according to the canonical signed digit (CSD)-based vertical and horizontal common sub-expression elimination (VHCSE) algorithm. In the proposed architecture, the CSD decoded coefficient along with 4-b common sub-expressions (CSs) in the vertical direction and 4- and 8-b CSs in the horizontal direction reduces the required number of full adder cells and the adder depths. This technique helps in reducing area consumption by decreasing the number of coefficient multiplier adders by 59% than that of the binary VHCSE (VHBCSE) algorithm. This technique helps in reducing the average switching activity of the adder blocks used in each coefficient multiplier block by 26.1%, 25.6%, and 21.3%, while compared with those of the 2- and 3-b binary CS elimination (BCSE) and VHBCSE algorithms, respectively. For different orders of filter, the proposed one delivers 57.5% and 61.9% improvement in area-power product (APP) on an average compared with the VHBCSE and mixed integer programming algorithms, respectively. Experimental results of differently specified finite impulse response (FIR) filters ranging from 10 to 100 taps and the coefficients of 8, 12, and 16 b show the improvements of 42.8%, 53.6%, and 37%, respectively, in the average gate count and 51.8%, 43.5%, and 36.7% less propagation delay than those of earlier canonical double-based number representation method. Moreover, in the metric made of APP divided by throughput, the proposed technique experiences 63.7% improvement on an average over that of faithfully rounded truncated multiple constant multiplication/accumulation technique of designing constant multiplier and demonstrates its suitability for implementing efficient reconfigurable FIR filter. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
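The canonical signed digit (CSD) representation referred to in the abstract above writes a constant with digits from {-1, 0, +1} such that no two adjacent digits are nonzero, which tends to lower the number of adders in a shift-add multiplier. The sketch below, with illustrative function names, converts a non-negative integer to CSD and multiplies by it using only shifts, additions, and subtractions. It does not implement the common sub-expression elimination (VHCSE) step that is the paper's contribution.

```python
def to_csd(value):
    """CSD digits (-1, 0, +1) of a non-negative integer, least significant first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = []
    while value != 0:
        if value & 1:
            # +1 when value % 4 == 1, -1 when value % 4 == 3; the latter
            # turns a run of ones into a single carry and one subtraction.
            d = 2 - (value & 3)
            value -= d
        else:
            d = 0
        digits.append(d)
        value >>= 1
    return digits


def shift_add_multiply(x, digits):
    """Multiply integer x by the constant encoded in CSD digits."""
    result = 0
    for shift, d in enumerate(digits):
        if d == 1:
            result += x << shift
        elif d == -1:
            result -= x << shift
    return result
```

For example, 231 has six ones in binary but only four nonzero CSD digits (256 - 32 + 8 - 1), so the shift-add multiplier needs three adders or subtractors instead of five.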
40. Low-Complexity and Low-Latency SVC Decoding Architecture Using Modified MAP-SP Algorithm.
- Author
-
Hong, Seungwoo, Kam, Dongyun, Yun, Sangbu, Choe, Jeongwon, Lee, Namyoon, and Lee, Youngjoo
- Subjects
DECODING algorithms ,ALGORITHMS ,STATIC VAR compensators ,COMPUTER architecture ,PARALLEL processing - Abstract
The compressive sensing (CS) based sparse vector coding (SVC) method is one of the promising ways for the next-generation ultra-reliable and low-latency communications. In this paper, we present advanced algorithm-hardware co-optimization schemes for realizing a cost-effective SVC decoding architecture. The previous maximum a posteriori subspace pursuit (MAP-SP) algorithm is newly modified to relax the computational overheads by applying novel residual forwarding and LLR approximation schemes. A fully-pipelined parallel hardware is also developed to support the modified decoding algorithm, reducing the overall processing latency, especially at the support identification step. In addition, an advanced least-square-problem solver is presented by utilizing the parallel Cholesky decomposer design, further reducing the decoding latency with parallel updates of support values. The implementation results from a 22nm FinFET technology showed that the fully-optimized design is 9.6 times faster while improving the area efficiency by 12 times compared to the baseline realization. [ABSTRACT FROM AUTHOR]
- Published
- 2022
41. Algorithms of Finding the First Two Minimum Values and Their Hardware Implementation.
- Author
-
Chin-Long Wey, Ming-Der Shieh, and Shin-Yo Lin
- Subjects
ALGORITHMS ,VALUE engineering ,DECODERS (Electronics) ,PURCHASING power parity ,COST effectiveness ,CODING theory ,COMPUTER programming ,VALUE distribution theory ,CODE generators - Abstract
Given a set of numbers X, finding the minimum value of X, min1st, is a very easy task. However, efficiently finding its second minimum value, min2nd, requires deriving min1st and then finding the minimum value of the set of remaining numbers. Efficient algorithms and cost-effective hardware for finding the two smallest values of X are greatly needed for low-density parity-check (LDPC) decoder design. The following two architectures are developed in this paper: 1) a sorting-based (XS) approach and 2) a tree structure (TS) approach. Experimental results show that the XS approach requires fewer comparisons, while the TS approach achieves higher speed at lower hardware cost. Since the hardware unit is repeatedly used in the LDPC decoder design, the proposed high-speed, low-cost TS approach is strongly recommended. [ABSTRACT FROM AUTHOR]
- Published
- 2008
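The problem stated in the abstract above, finding min1st and min2nd of a set, can be sketched in software in two ways: a single sequential pass, and a pairwise tree reduction in which each node carries its two smallest values. Both are generic illustrations written under the assumption that duplicates count separately; they are not the paper's XS or TS hardware architectures, and the function names are hypothetical.

```python
INF = float("inf")


def two_smallest(values):
    """Sequential pass that tracks the smallest and second smallest values."""
    min1 = min2 = INF
    count = 0
    for v in values:
        count += 1
        if v < min1:
            min1, min2 = v, min1
        elif v < min2:
            min2 = v
    if count < 2:
        raise ValueError("need at least two values")
    return min1, min2


def merge_pairs(a, b):
    """Combine two (min1, min2) pairs into the pair for their union."""
    if a[0] <= b[0]:
        return a[0], min(a[1], b[0])
    return b[0], min(b[1], a[0])


def two_smallest_tree(values):
    """Pairwise tree reduction; the levels could run in parallel in hardware."""
    nodes = [(v, INF) for v in values]
    if len(nodes) < 2:
        raise ValueError("need at least two values")
    while len(nodes) > 1:
        merged = [
            merge_pairs(nodes[i], nodes[i + 1])
            for i in range(0, len(nodes) - 1, 2)
        ]
        if len(nodes) % 2:
            merged.append(nodes[-1])  # odd node passes through unchanged
        nodes = merged
    return nodes[0]
```

The tree form has a depth of about $\log_{2}$ of the input count, which is the property that makes tree structures attractive when latency matters.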
42. On Factor Prime Factorizations for n-D Polynomial Matrices.
- Author
-
Mingsheng Wang
- Subjects
MATRICES (Mathematics) ,POLYNOMIAL rings ,COMMUTATIVE rings ,ALGORITHMS ,RING theory ,ALGEBRA ,FACTORIZATION ,MATHEMATICS ,MATHEMATICAL analysis - Abstract
This paper investigates the problem of factor prime factorizations for n-D polynomial matrices and presents a criterion for the existence of factor prime factorizations for an important class of n-D polynomial matrices. As a by-product, we also obtain an algebraic algorithm to check n-D factor primeness in some important cases, which partially solves the long-standing open problem of recognizing n-D factor prime matrices. Some problems related to the factorization methods are also studied. Several examples are given to illustrate the results. The results presented in this paper are true over any coefficient field. [ABSTRACT FROM AUTHOR]
- Published
- 2007
43. FREL: A Stable Feature Selection Algorithm.
- Author
-
Li, Yun, Si, Jennie, Zhou, Guojing, Huang, Shasha, and Chen, Songcan
- Subjects
ALGORITHMS ,TECHNOLOGICAL innovations ,MATHEMATICAL regularization ,MICROARRAY technology ,SAMPLE size (Statistics) - Abstract
Two factors characterize a good feature selection algorithm: its accuracy and stability. This paper aims at introducing a new approach to stable feature selection algorithms. The innovation of this paper centers on a class of stable feature selection algorithms called feature weighting as regularized energy-based learning (FREL). Stability properties of FREL using L1 or L2 regularization are investigated. In addition, as a commonly adopted implementation strategy for enhanced stability, an ensemble FREL is proposed. A stability bound for the ensemble FREL is also presented. Our experiments using open source real microarray data, which pose challenging high-dimensionality, small-sample-size problems, demonstrate that our proposed ensemble FREL is not only stable but also achieves better or comparable accuracy relative to some other popular stable feature weighting methods. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
44. Convolutive Bounded Component Analysis Algorithms for Independent and Dependent Source Separation.
- Author
-
Inan, Huseyin A. and Erdogan, Alper T.
- Subjects
MATHEMATICAL bounds ,ALGORITHMS ,SIGNAL separation ,MATHEMATICAL proofs ,PERFORMANCE evaluation ,DIGITAL communications - Abstract
Bounded component analysis (BCA) is a framework that can be considered as a more general framework than independent component analysis (ICA) under the boundedness constraint on sources. Using this framework, it is possible to separate dependent as well as independent components from their mixtures. In this paper, as an extension of a recently introduced instantaneous BCA approach, we introduce a family of convolutive BCA criteria and corresponding algorithms. We prove that the global optima of the proposed criteria, under generic BCA assumptions, are equivalent to a set of perfect separators. The algorithms introduced in this paper are capable of separating not only the independent sources but also the sources that are dependent/correlated in both component (space) and sample (time) dimensions. Therefore, under the condition that the sources are bounded, they can be considered as extended convolutive ICA algorithms with additional dependent/correlated source separation capability. Furthermore, they have potential to provide improvement in separation performance, especially for short data records. This paper offers examples to illustrate the space-time correlated source separation capability through a copula distribution-based example. In addition, a frequency-selective Multiple Input Multiple Output equalization example demonstrates the clear performance advantage of the proposed BCA approach over the state-of-the-art ICA-based approaches in setups involving convolutive mixtures of digital communication sources. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
45. Synchronization of Networks of Heterogeneous Agents With Common Nominal Behavior.
- Author
-
Lovisari, Enrico and Kao, Chung-Yao
- Subjects
HETEROGENEOUS computing ,MEMORYLESS systems ,SCALABILITY ,DYNAMICAL systems ,ALGORITHMS - Abstract
This paper deals with the problem of synchronization in networks of heterogeneous agents with common nominal behavior. The agents are modeled as (possibly nonlinear) perturbed versions of a common SISO nominal LTI operator, and they are interconnected via a sparse memoryless interconnection operator, coherent with the communication graph underlying the network. The network is said to synchronize if the outputs of the agents tend to align along given directions, the most important case being consensus, or agreement. The paper provides a general result, based on IQCs, that ensures synchronization of the network with robustness w.r.t. uncertainties in the interconnection and in each agent's dynamics. Scalability issues are discussed in the scenario where the interconnection operator is a constant normal matrix, yielding the generalization of the very popular linear consensus algorithm. The wide range of applicability of the proposed criterion is shown by providing synchronization conditions in some important examples. Whenever possible, graphical criteria are proposed for checking the required conditions. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
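The linear consensus algorithm that the abstract above generalizes can be sketched as the discrete-time iteration $x(k+1) = x(k) - \epsilon L x(k)$, where $L$ is the graph Laplacian of an undirected, connected communication graph and the step size $\epsilon$ is assumed to be smaller than the reciprocal of the maximum node degree. Under those assumptions every state converges to the average of the initial values. The code below is a minimal sketch of this standard special case only; it does not model the heterogeneous, perturbed agents or the IQC-based conditions of the paper, and all names are illustrative.

```python
def laplacian(num_nodes, edges):
    """Graph Laplacian (degree minus adjacency) of an undirected graph."""
    lap = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    return lap


def consensus_step(x, lap, step):
    """One update x <- x - step * L x; each node uses only neighbor states."""
    n = len(x)
    return [
        x[i] - step * sum(lap[i][j] * x[j] for j in range(n))
        for i in range(n)
    ]


def run_consensus(x0, edges, step, iterations):
    """Iterate the linear consensus update from the initial states x0."""
    lap = laplacian(len(x0), edges)
    x = list(x0)
    for _ in range(iterations):
        x = consensus_step(x, lap, step)
    return x
```

Because the Laplacian of an undirected graph has zero column sums, the sum of the states is preserved at every step, which is why the agreement value is the initial average.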
46. A Synthesis Methodology for Ternary Logic Circuits in Emerging Device Technologies.
- Author
-
Srinivasu, B. and Sridharan, K.
- Subjects
LOGIC circuit design ,DIGITAL electronics -- Design & construction ,ALGORITHMS ,TRANSISTORS ,ELECTRONIC circuit design - Abstract
Automatic synthesis of digital circuits has played a key role in obtaining high-performance designs. While considerable work has been done in the past, emerging device technologies call for a need to re-examine the synthesis approaches, so that better circuits that harness the true power of these technologies can be developed. This paper presents a methodology for synthesis applicable to devices that support ternary logic. We present an algorithm for synthesis that combines a geometrical representation with unary operators of multivalued logic. The geometric representation facilitates scanning appropriately to obtain simple sum-of-products expressions in terms of unary operators. An implementation based on Python is described. The power of the approach lies in its applicability to a wide variety of circuits. The proposed approach leads to the savings of 26% and 22% in transistor-count, respectively, for a ternary full-adder and a ternary content-addressable memory (TCAM) over the best existing designs. Furthermore, the proposed approach requires, on an average, less than 10% of the number of the transistors in comparison with a recent decoder-based design for various ternary benchmark circuits. Extensive HSPICE simulation results show roughly 92% reduction in power-delay product (PDP) for a $12\times 12$ TCAM and 60% reduction in PDP for a 24-ternary digit barrel shifter over recent designs. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
47. Efficient Verification Against Undesired Operating Points for MOS Analog Circuits.
- Author
-
Li, You, Liu, Zhiqiang, and Chen, Degang
- Subjects
ANALOG integrated circuits, INTEGRATED circuit verification, METAL oxide semiconductors, ALGORITHMS, ANALOG circuits -- Design & construction
- Abstract
Identifying and removing undesired operating points is one of the most important problems in analog circuit design. In this paper, a divide and contraction verification method against undesired operating points in analog circuits is proposed. Unlike traditional methods to find all operating points, this method only targets searching voltage intervals containing undesired operating points. To achieve this, a systematic approach for automatically identifying all positive and negative feedback loops in circuits is introduced. A positive feedback loop breaking method and selection of breaking nodes are discussed to determine whether a monotonic return function can be obtained. Depending on the monotonicity of the return function, two types of divide and contraction algorithms are proposed to efficiently search voltage intervals containing operating points. Simulation results show that this method is effective and efficient in identifying the presence/absence of undesired operating points in a set of commonly used benchmark circuits. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
48. Affinity and Penalty Jointly Constrained Spectral Clustering With All-Compatibility, Flexibility, and Robustness.
- Author
-
Qian, Pengjiang, Jiang, Yizhang, Wang, Shitong, Su, Kuan-Hao, Wang, Jun, Hu, Lingzhi, and Muzic, Raymond F.
- Subjects
LAPLACE transformation, ALGORITHMS, ROBUST control
- Abstract
The existing, semisupervised, spectral clustering approaches have two major drawbacks, i.e., either they cannot cope with multiple categories of supervision or they sometimes exhibit unstable effectiveness. To address these issues, two normalized affinity and penalty jointly constrained spectral clustering frameworks as well as their corresponding algorithms, referred to as type-I affinity and penalty jointly constrained spectral clustering (TI-APJCSC) and type-II affinity and penalty jointly constrained spectral clustering (TII-APJCSC), respectively, are proposed in this paper. TI refers to type-I and TII to type-II. The significance of this paper is fourfold. First, benefiting from the distinctive affinity and penalty jointly constrained strategies, both TI-APJCSC and TII-APJCSC are substantially more effective than the existing methods. Second, both TI-APJCSC and TII-APJCSC are fully compatible with the three well-known categories of supervision, i.e., class labels, pairwise constraints, and grouping information. Third, owing to the delicate framework normalization, both TI-APJCSC and TII-APJCSC are quite flexible. With a simple tradeoff factor varying in the small fixed interval (0, 1], they can self-adapt to any semisupervised scenario. Finally, both TI-APJCSC and TII-APJCSC demonstrate strong robustness, not only to the number of pairwise constraints but also to the parameter for affinity measurement. As such, the novel TI-APJCSC and TII-APJCSC algorithms are very practical for medium- and small-scale semisupervised data sets. The experimental studies thoroughly evaluated and demonstrated these advantages on both synthetic and real-life semisupervised data sets. [ABSTRACT FROM AUTHOR]
- Published
- 2017
49. Linear Convergence and Metric Selection for Douglas-Rachford Splitting and ADMM.
- Author
-
Giselsson, Pontus and Boyd, Stephen
- Subjects
LINEAR systems, STOCHASTIC convergence, MULTIPLIERS (Mathematical analysis), MATHEMATICAL optimization, ALGORITHMS
- Abstract
Recently, several convergence rate results for Douglas-Rachford splitting and the alternating direction method of multipliers (ADMM) have been presented in the literature. In this paper, we show global linear convergence rate bounds for Douglas-Rachford splitting and ADMM under strong convexity and smoothness assumptions. We further show that the rate bounds are tight for the class of problems under consideration for all feasible algorithm parameters. For problems that satisfy the assumptions, we show how to select step-size and metric for the algorithm that optimize the derived convergence rate bounds. For problems with a similar structure that do not satisfy the assumptions, we present heuristic step-size and metric selection methods. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
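To make the setting of the abstract above concrete, the following is a minimal sketch of scaled-form ADMM on a toy problem that is both strongly convex and smooth: minimize 0.5*a*(x - b)^2 + 0.5*c*(z - d)^2 subject to x = z. The problem data `a`, `b`, `c`, `d` and the penalty parameter `rho` are illustrative assumptions. The sketch does not implement the paper's step-size or metric selection rules; it only shows the iteration whose convergence rate those rules target.

```python
def admm_quadratic(a, b, c, d, rho, iters):
    """Scaled-form ADMM for min 0.5*a*(x-b)^2 + 0.5*c*(z-d)^2 s.t. x = z.

    Returns the final (x, z) and the history of |z - x_star| per iteration.
    """
    x_star = (a * b + c * d) / (a + c)  # closed-form minimizer, for reference
    z, u = 0.0, 0.0                     # u is the scaled dual variable
    errors = []
    x = 0.0
    for _ in range(iters):
        # x-update: argmin_x 0.5*a*(x-b)^2 + (rho/2)*(x - z + u)^2
        x = (a * b + rho * (z - u)) / (a + rho)
        # z-update: argmin_z 0.5*c*(z-d)^2 + (rho/2)*(x - z + u)^2
        z = (c * d + rho * (x + u)) / (c + rho)
        # dual update on the residual of the constraint x = z
        u = u + x - z
        errors.append(abs(z - x_star))
    return x, z, errors


# Hypothetical problem data (assumption): the minimizer is (2*1 + 1*4) / 3 = 2.
x, z, errors = admm_quadratic(a=2.0, b=1.0, c=1.0, d=4.0, rho=0.5, iters=200)
print(x, z, errors[:5])
```

Printing the first few entries of `errors` for different values of `rho` gives a feel for how the penalty parameter changes the speed of convergence, which is the quantity the abstract proposes to optimize.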
50. Adaptive Scaling of Cluster Boundaries for Large-Scale Social Media Data Clustering.
- Author
-
Meng, Lei, Tan, Ah-Hwee, and Wunsch, Donald C.
- Subjects
BIG data, SOCIAL media, ALGORITHMS
- Abstract
The large scale and complex nature of social media data raises the need to scale clustering techniques to big data and make them capable of automatically identifying data clusters with few empirical settings. In this paper, we present our investigation and three algorithms based on the fuzzy adaptive resonance theory (Fuzzy ART) that have linear computational complexity, use a single parameter, i.e., the vigilance parameter to identify data clusters, and are robust to modest parameter settings. The contribution of this paper lies in two aspects. First, we theoretically demonstrate how complement coding, commonly known as a normalization method, changes the clustering mechanism of Fuzzy ART, and discover the vigilance region (VR) that essentially determines how a cluster in the Fuzzy ART system recognizes similar patterns in the feature space. The VR gives an intrinsic interpretation of the clustering mechanism and limitations of Fuzzy ART. Second, we introduce the idea of allowing different clusters in the Fuzzy ART system to have different vigilance levels in order to meet the diverse nature of the pattern distribution of social media data. To this end, we propose three vigilance adaptation methods, namely, the activation maximization (AM) rule, the confliction minimization (CM) rule, and the hybrid integration (HI) rule. With an initial vigilance value, the resulting clustering algorithms, namely, the AM-ART, CM-ART, and HI-ART, can automatically adapt the vigilance values of all clusters during the learning epochs in order to produce better cluster boundaries. Experiments on four social media data sets show that AM-ART, CM-ART, and HI-ART are more robust than Fuzzy ART to the initial vigilance value, and they usually achieve better or comparable performance and much faster speed than the state-of-the-art clustering algorithms that also do not require a predefined number of clusters. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
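For readers unfamiliar with the baseline that the abstract above builds on, here is a minimal sketch of standard Fuzzy ART with complement coding and a single, fixed vigilance parameter. It does not implement the AM, CM, or HI vigilance adaptation rules proposed in the paper. The function names, the sample points, and the parameter values (`rho` for vigilance, `alpha` for the choice parameter, `beta` for the learning rate) are illustrative assumptions.

```python
def complement_code(a):
    """Map a feature vector a in [0, 1]^d to [a, 1 - a]."""
    return list(a) + [1.0 - v for v in a]


def fuzzy_and(u, v):
    """Element-wise minimum (the fuzzy AND operator)."""
    return [min(p, q) for p, q in zip(u, v)]


def fuzzy_art(samples, rho=0.75, alpha=0.01, beta=1.0):
    """Single-pass Fuzzy ART; returns cluster labels and weight vectors."""
    weights, labels = [], []
    for a in samples:
        inp = complement_code(a)
        norm_inp = sum(inp)

        def choice(j):
            # Choice function T_j = |I ^ w_j| / (alpha + |w_j|)
            return sum(fuzzy_and(inp, weights[j])) / (alpha + sum(weights[j]))

        # Try existing clusters in order of decreasing choice value.
        for j in sorted(range(len(weights)), key=choice, reverse=True):
            overlap = fuzzy_and(inp, weights[j])
            # Vigilance test: match |I ^ w_j| / |I| must reach rho.
            if sum(overlap) / norm_inp >= rho:
                weights[j] = [
                    beta * m + (1.0 - beta) * w
                    for m, w in zip(overlap, weights[j])
                ]
                labels.append(j)
                break
        else:
            # No cluster passed the vigilance test: create a new one.
            weights.append(inp)
            labels.append(len(weights) - 1)
    return labels, weights


# Hypothetical 2-D points in [0, 1]^2 forming two well-separated groups.
points = [(0.1, 0.1), (0.12, 0.08), (0.9, 0.9), (0.88, 0.92)]
labels, weights = fuzzy_art(points, rho=0.75)
print(labels)
```

Raising `rho` makes the vigilance test stricter and tends to produce more, smaller clusters; this sensitivity to a single global vigilance value is what the per-cluster adaptation described in the abstract aims to reduce.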