1,336 results
Search Results
2. Rethinking Engineering Education on the Teaching and Research Practice of Computer Architecture
- Author
-
Xu, Qingzhen, Mao, Mingzhi, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Gan, Jianhou, editor, Pan, Yi, editor, Zhou, Juxiang, editor, Liu, Dong, editor, Song, Xianhua, editor, and Lu, Zeguang, editor
- Published
- 2024
3. A Proposal for a Standard Evaluation Method for Assessing Programming Proficiency in Assembly Language
- Author
-
Rivera-Alvarado, Ernesto, Guadamuz, Saúl, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Nagar, Atulya K., editor, Jat, Dharm Singh, editor, Mishra, Durgesh, editor, and Joshi, Amit, editor
- Published
- 2024
4. An Efficient Workload Distribution Mechanism for Tightly Coupled Heterogeneous Hardware
- Author
-
Rivera-Alvarado, Ernesto, Torres-Rojas, Francisco J., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Nagar, Atulya K., editor, Singh Jat, Dharm, editor, Mishra, Durgesh Kumar, editor, and Joshi, Amit, editor
- Published
- 2023
5. A Study of Intelligent Paper Grouping Model for Adult Higher Education Based on Random Matrix.
- Author
-
Wang, Yan
- Subjects
ADULT education ,HIGHER education ,RANDOM matrices ,DATABASE design ,CHAOS theory ,COMPUTER architecture ,PARTICLE swarm optimization ,COVARIANCE matrices - Abstract
This paper presents a comprehensive study and analysis of the intelligent grouping of test papers in adult higher education using a random matrix approach. Using results from random matrix theory on the eigenvalues of the sample covariance matrix, the energy of each subspace is estimated, and the estimated energy is then used to construct a subspace weighting matrix. The statistical properties of the sample covariance matrix eigenvectors are analyzed using a first-order perturbation approximation, and asymptotic results from random matrix theory on the projection of the sample covariance matrix signal subspace onto the real signal parametrization are then used to obtain the weighting matrix based on the random matrix eigenvectors. Dynamic adjustment according to the fitness of individuals in the population ensures population diversity, while a niching (small-habitat) technique keeps the algorithm from falling into premature convergence. The algorithm introduces chaos theory to optimize population initialization and uses the dynamic traversal randomness of chaos to select individuals so that the initial population is close to the desired target solution. The fitness function in a genetic algorithm generally maps the objective function of the problem to a fitness value, and a good fitness function directly reflects the quality of the individuals in the population. Based on an in-depth study of the basic attributes of test questions and the principles of test paper evaluation, the mathematical model and objective function for intelligent paper grouping are determined with question difficulty, knowledge points, and cognitive level as the main constraints, and NCAGA is applied to intelligent paper grouping, completing test paper assembly for the computer system architecture course. To counter the premature convergence and entrapment in locally optimal solutions that readily occur in the traditional genetic algorithm, this paper adaptively adjusts the crossover and mutation probabilities and achieves satisfactory results. Based on extensive business research, this paper completes the requirements analysis of an online practice system based on intelligent paper grouping and presents the functional design and database design of the key functional modules in detail. Finally, this paper conducts functional tests on the system and analyzes the test results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
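Entry 5 describes adaptively adjusting crossover and mutation probabilities according to individual fitness to preserve diversity and avoid premature convergence. The paper's NCAGA is not reproduced in the abstract, so the sketch below is only a minimal Python illustration of that general idea; the rate bounds, the toy difficulty-based fitness, and all names are invented for illustration.

```python
import random

def adaptive_rates(fitness, f_avg, f_max, pc_max=0.9, pc_min=0.5,
                   pm_max=0.10, pm_min=0.01):
    """Illustrative adaptive crossover/mutation probabilities:
    above-average individuals get lower rates (preserved),
    below-average individuals get higher rates (perturbed more)."""
    if f_max == f_avg:                 # degenerate population, use max rates
        return pc_max, pm_max
    if fitness >= f_avg:
        scale = (f_max - fitness) / (f_max - f_avg)
        return (pc_min + (pc_max - pc_min) * scale,
                pm_min + (pm_max - pm_min) * scale)
    return pc_max, pm_max

# Toy usage: pick questions whose total difficulty approaches a target.
difficulties = [0.2, 0.5, 0.8, 0.4, 0.9, 0.3, 0.6]
target = 2.0

def fitness(individual):               # individual: 0/1 selection mask
    total = sum(d for d, bit in zip(difficulties, individual) if bit)
    return 1.0 / (1.0 + abs(total - target))

population = [[random.randint(0, 1) for _ in difficulties] for _ in range(20)]
scores = [fitness(ind) for ind in population]
pc, pm = adaptive_rates(scores[0], sum(scores) / len(scores), max(scores))
print(f"crossover p={pc:.2f}, mutation p={pm:.2f}")
```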
6. Position paper: Data for AI research (DAIR) infrastructure: advancing educational research and practice
- Author
-
Joksimović, Srećko, Siemens, George, Coyle, Damien, Zamecnik, Andrew, De Laat, Maarten, Dawson, Shane, Richey, Michael C, Kovanovic, Vitomir, Pardo, Abelardo, and Fey, Alexei
- Subjects
personalised e-learning ,big data application ,computer architecture ,knowledge management ,learning management systems ,data systems - Abstract
This position paper introduces a data infrastructure that a) integrates data from multiple sources, b) enables various access permissions for different stakeholders, c) provides model building and algorithm development within the data lake, and d) allows for the implementation of real-time analysis outputs, including adaptive feedback and dashboards for both learners and teachers. This technical environment is foundational to the utilisation of artificial intelligence in knowledge processes and to establishing advanced applications such as personal knowledge graphs and contextual learning supports that are indicative of true personalised learning and sensemaking, simultaneously advancing the research and practice of teaching and learning.
- Published
- 2022
7. Normative Emotional Agents: A Viewpoint Paper.
- Author
-
Argente, Estefania, Val, E. Del, Perez-Garcia, D., and Botti, V.
- Abstract
Human social relationships imply conforming to the norms, behaviors, and cultural values of the society, but also socialization of emotions, to learn how to interpret and show them. In multiagent systems, much progress has been made in the analysis and interpretation of both emotions and norms. Nonetheless, the relationship between emotions and norms has hardly been considered and most normative agents do not consider emotions, or vice-versa. In this article, we provide an overview of relevant aspects within the area of normative agents and emotional agents. First we focus on the concept of norm, the different types of norms, its life cycle and a review of multiagent normative systems. Second, we present the most relevant theories of emotions, the life cycle of an agent’s emotions, and how emotions have been included through computational models in multiagent systems. Next, we present an analysis of proposals that integrate emotions and norms in multiagent systems. From this analysis, four relationships are detected between norms and emotions, which we analyze in detail and discuss how these relationships have been tackled in the reviewed proposals. Finally, we present a proposal for an abstract architecture of a Normative Emotional Agent that covers these four norm-emotion relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2022
8. Service and Energy Management in Fog Computing: A Taxonomy, Approaches, and Future Directions.
- Author
-
Hashemi, S. M., Sahafi, A., Rahmani, A. M., and Bohlouli, M.
- Subjects
ENERGY management ,INTERNET of things ,ENERGY consumption ,COMPUTING platforms ,COMPUTER architecture ,EDGE computing - Abstract
Background and Objectives: Today, the increasing number of Internet-connected smart devices requires powerful computer processing servers, such as cloud and fog, and necessitates fulfilling requests and services more than ever before. The geographical distance of IoT devices to fog and cloud servers has turned issues such as delay and energy consumption into major challenges. Fog computing has emerged as a promising technology in this field. Methods: In this paper, service/energy management approaches are surveyed. Then, we explain our motivation for the systematic literature review (SLR) procedure and how the related works were selected. Results: This paper introduces four domains of service and energy management: Architecture, Resource Management, Scheduling Management, and Service Management. Scheduling management is used in 38% of the papers, making it the most studied domain of service and energy management. Resource management is the second most common domain, attracting about 26% of the papers. Conclusion: About 81% of the fog computing papers simulated their approaches, and the others implemented their schemes using a testbed in a real environment. Furthermore, 30% of the papers presented an architecture or framework along with their research. In this systematic literature review, papers have been extracted from five databases, namely IEEE Xplore, Wiley, Science Direct (Elsevier), Springer Link, and Taylor & Francis, from 2013 to 2022. We obtained 1596 papers related to the subject, filtered them, and arrived at 47 distinct studies. We analyze and discuss these studies, review the quality-of-service parameters reported in the papers, and present the benefits, drawbacks, and innovations of each study. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. Critique of “MemXCT: Memory-Centric X-Ray CT Reconstruction With Massive Parallelization” by SCC Team From the University of Texas at Austin.
- Author
-
Davis, Brock, Paez, Juan, Gaither, Jack, and Garcia, Joe A.
- Subjects
COMPUTED tomography ,VIRTUAL machine systems ,X-rays ,GRAPHICS processing units ,MICROSOFT Azure (Computing platform) ,COMPUTER workstation clusters - Abstract
This report describes The University of Texas Student Cluster Competition team’s effort to reproduce the results of “MemXCT: memory-centric X-ray CT reconstruction with massive parallelization” (Hidayetoğlu et al., 2019). The article details a new memory-centric approach that reconstructs X-ray computed tomography (XCT) from noisy raw data. In our reproduction experiments, we utilized Microsoft Azure’s CycleCloud tool to provision, orchestrate, and manage our computing cluster in the cloud. In particular, we scheduled and benchmarked reconstruction workloads using Azure’s CPU-based HC44rs and GPU-based NC12s v2 virtual machine (VM) types to evaluate the scalability properties of the reconstruction approach and the performance differences between architectures. The HC44rs VMs contained 44 Intel Xeon Platinum cores, while the NC12s v2 VM was equipped with two NVIDIA P100 GPUs. We used a recent version of Intel’s compiler stack with the MKL library for our CPU code along with CUDA 11.1 on GPUs. Overall, our results confirm the findings of the original article, demonstrating similar acceleration on GPUs and scalability properties on CPUs. Digital artifacts from these experiments are available at: 10.5281/zenodo.5598108 [ABSTRACT FROM AUTHOR]
- Published
- 2022
10. Optimizing Convolution Neural Nets with a Unified Transformation Approach.
- Author
-
Ceze, Luis
- Subjects
DEEP learning ,MACHINE learning ,COMPUTER architecture ,MATHEMATICAL optimization ,COMPUTER input-output equipment ,COMPILERS (Computer programs) - Abstract
The article explores how deep learning models have evolved from relying on hand-crafted operator libraries to utilizing compiler-based approaches for optimization, especially with the growing diversity of hardware platforms. It highlights the Apache TVM project, which allows machine learning engineers to compile models for specific hardware targets, optimizing performance without altering model accuracy. The article points to an accompanying paper that proposes a unified transformation approach that optimizes model architectures through program transformations, avoiding expensive retraining and achieving significant performance gains without compromising accuracy.
- Published
- 2024
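Entry 10 refers to Apache TVM compiling models for specific hardware targets without changing their accuracy. The unified transformation approach of the accompanying paper is not shown here; the sketch below is only a rough illustration of a conventional Relay-based TVM compilation flow, using the classic API from earlier TVM releases, with the model path, input name, and target string as placeholders.

```python
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Load a trained model (placeholder path) and convert it to TVM's Relay IR.
onnx_model = onnx.load("model.onnx")
shape_dict = {"input": (1, 3, 224, 224)}          # assumed input name/shape
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# Compile for a specific hardware target; accuracy is unchanged because
# only the program, not the weights, is transformed.
target = "llvm -mcpu=skylake-avx512"              # example CPU target string
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

# Run the compiled module on the local CPU.
dev = tvm.cpu(0)
module = graph_executor.GraphModule(lib["default"](dev))
```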
11. Computation Foretold.
- Author
-
MURTAGH, JACK
- Subjects
SCIENCE museums ,COMPUTER programmers ,CENTRAL processing units ,BERNOULLI numbers ,COMPUTER architecture - Abstract
Ada Lovelace, known as the world's first computer programmer, played a crucial role in the development of the first general-purpose computer. Lovelace's annotations on a paper about the analytical engine, designed by Charles Babbage, showcased her technical expertise and philosophical insights. She recognized the potential of computers to go beyond mathematical calculations and discussed concepts such as programmability and artificial intelligence. Despite never seeing a computer in action, Lovelace's visionary ideas laid the foundation for modern computing. Unfortunately, Babbage's analytical engine was never built due to lack of funding. [Extracted from the article]
- Published
- 2024
12. The HSF Conditions Database Reference Implementation.
- Author
-
Mashinistov, Ruslan, Gerlach, Lino, Laycock, Paul, Formica, Andrea, Govi, Giacomo, and Pinkenburg, Chris
- Subjects
DATABASES ,COMPUTING platforms ,COMPUTER architecture ,METADATA ,REDUNDANCY in engineering - Abstract
Conditions data is the subset of non-event data that is necessary to process event data. It poses a unique set of challenges, namely a heterogeneous structure and high access rates by distributed computing. The HSF Conditions Databases activity is a forum for cross-experiment discussions inviting as broad a participation as possible. It grew out of the HSF Community White Paper work to study conditions data access, where experts from ATLAS, Belle II, and CMS converged on a common language and proposed a schema that represents best practice. Following discussions with a broader community, including NP as well as HEP experiments, a core set of use cases, functionality and behaviour was defined with the aim to describe a core conditions database API. This paper will describe the reference implementation of both the conditions database service and the client which together encapsulate HSF best practice conditions data handling. Django was chosen for the service implementation, which uses an ORM instead of the direct use of SQL for all but one method. The simple relational database schema to organise conditions data is implemented in PostgreSQL. The task of storing conditions data payloads themselves is outsourced to any POSIX-compliant filesystem, allowing for transparent relocation and redundancy. Crucially this design provides a clear separation between retrieving the metadata describing which conditions data are needed for a data processing job, and retrieving the actual payloads from storage. The service deployment using Helm on OKD will be described together with scaling tests and operations experience from the sPHENIX experiment running more than 25k cores at BNL. [ABSTRACT FROM AUTHOR]
- Published
- 2024
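Entry 12 describes a Django service whose PostgreSQL schema holds only the metadata for conditions data, while payloads are stored on a POSIX-compliant filesystem. The actual HSF reference schema is not given in the abstract; the Django sketch below is a hypothetical, simplified illustration of that metadata/payload separation, and every class and field name is invented.

```python
from django.db import models

class GlobalTag(models.Model):
    """A named, versioned set of conditions (metadata only)."""
    name = models.CharField(max_length=255, unique=True)
    created = models.DateTimeField(auto_now_add=True)

class PayloadType(models.Model):
    """Kind of conditions data, e.g. calibration or alignment."""
    name = models.CharField(max_length=255, unique=True)

class PayloadIOV(models.Model):
    """Interval of validity pointing at a payload file kept on a
    POSIX-compliant filesystem; only its location is stored here."""
    global_tag = models.ForeignKey(GlobalTag, on_delete=models.CASCADE)
    payload_type = models.ForeignKey(PayloadType, on_delete=models.CASCADE)
    iov_start = models.BigIntegerField()            # e.g. run number or timestamp
    iov_end = models.BigIntegerField(null=True)
    payload_url = models.CharField(max_length=1024) # path/URL, not the payload itself
```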
13. Parallel computing technologies 2020.
- Author
-
Malyshkin, Victor
- Subjects
PARALLEL programming ,FEATURE extraction ,COMPUTER architecture ,ALGEBRAIC multigrid methods ,SCIENTIFIC computing ,LATTICE Boltzmann methods - Abstract
A. Kireeva et al. contributed another paper in the field of cellular automata: "Parallel simulation of drift-diffusion-recombination by cellular automata and global random walk algorithm". Parallel computing technologies enable the solution of large-scale numerical simulation and data processing problems in science and industry. The previously proposed contention-aware scheduling algorithm is formally described with this model, and modifications of the algorithm are proposed: a look-ahead technique, chunked data transfers, data caching, and peer-to-peer data transfers. [Extracted from the article]
- Published
- 2022
14. STUDY OF PARALLEL COMPUTING STRUCTURE AND CIRCUIT ANALYSIS FOR INTERCONNECTION NETWORK.
- Author
-
Katare, Rakesh Kumar, Tripathi, Shravan, and Tiwari, Sunil
- Subjects
PARALLEL programming ,COMPUTER architecture ,SWITCHING circuits ,COMPUTER science ,COMPUTATIONAL physics ,HIGH performance computing - Abstract
Parallel computing has become a crucial topic in computer science and has proven critical for research in high-performance computing. Over the last few decades, computer architectures have evolved towards an increasing number of nodes, where parallelism has become the approach of choice for speeding up algorithms. The combination of processing units into a model of computation (circuits) has gained an essential place in the area of high performance computing (HPC) due to its configuration and considerable processing power, whether parallel, serial, or otherwise. In this paper, we study the idea of parallel computing and its programming models, and also explore some theoretical and technical concepts that are often needed to understand interconnection networks. In particular, we show how this technology assists the field of computational physics, especially when the problem is data parallel. We first convert the graphical model of the perfect difference network into a circuit diagram, then convert the circuit diagram into a switching function, simplify it, and redraw the equivalent network. [ABSTRACT FROM AUTHOR]
- Published
- 2023
15. DPoS: Decentralized, Privacy-Preserving, and Low-Complexity Online Slicing for Multi-Tenant Networks.
- Author
-
Zhao, Hailiang, Deng, Shuiguang, Liu, Zijie, Xiang, Zhengzhe, Yin, Jianwei, Dustdar, Schahram, and Zomaya, Albert Y.
- Subjects
ONLINE algorithms ,MOBILE virtual network operators ,5G networks ,VIRTUAL networks ,COST functions - Abstract
Network slicing is the key to enable virtualized resource sharing among vertical industries in the era of 5G communication. Efficient resource allocation is of vital importance to realize network slicing in real-world business scenarios. To deal with the high algorithm complexity, privacy leakage, and unrealistic offline setting of current network slicing algorithms, in this paper we propose a fully decentralized and low-complexity online algorithm, DPoS, for multi-resource slicing. We first formulate the problem as a global social welfare maximization problem. Next, we design the online algorithm DPoS based on the primal-dual approach and posted price mechanism. In DPoS, each tenant is incentivized to make its own decision based on its true preferences without disclosing any private information to the mobile virtual network operator and other tenants. We provide a rigorous theoretical analysis to show that DPoS has the optimal competitive ratio when the cost function of each resource is linear. Extensive simulation experiments are conducted to evaluate the performance of DPoS. The results show that DPoS can not only achieve close-to-offline-optimal performance, but also have low algorithmic overheads. [ABSTRACT FROM AUTHOR]
- Published
- 2022
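Entry 15 builds on a primal-dual, posted-price mechanism: the operator posts per-resource prices that grow with utilization, and each tenant accepts a slice only if its private utility exceeds the posted cost, revealing nothing beyond the accept/reject decision. The sketch below is a generic toy version of that pattern, not the DPoS algorithm itself; the price function and all numbers are assumptions for illustration.

```python
CAPACITY = {"cpu": 100.0, "bandwidth": 100.0}    # assumed resource pool
used = {r: 0.0 for r in CAPACITY}
P_MAX = 10.0                                     # assumed price ceiling

def posted_price(resource):
    """Price grows with utilization (a common primal-dual choice)."""
    u = used[resource] / CAPACITY[resource]
    return P_MAX ** u - 1.0                      # 0 when idle, P_MAX - 1 when full

def tenant_decision(demand, utility):
    """Tenant accepts only if utility exceeds cost at current posted prices;
    no private information is revealed beyond the accept/reject decision."""
    cost = sum(posted_price(r) * q for r, q in demand.items())
    if utility > cost and all(used[r] + q <= CAPACITY[r] for r, q in demand.items()):
        for r, q in demand.items():
            used[r] += q
        return True
    return False

# Online arrival of slice requests (demands and utilities are made up).
requests = [({"cpu": 10, "bandwidth": 5}, 8.0),
            ({"cpu": 40, "bandwidth": 30}, 20.0),
            ({"cpu": 60, "bandwidth": 70}, 15.0)]
for demand, utility in requests:
    print(tenant_decision(demand, utility))
```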
16. NEW SISC SECTION ON SCIENTIFIC MACHINE LEARNING.
- Author
-
De Sterck, Hans
- Subjects
SCIENCE education ,MACHINE learning ,DEEP learning ,NUMERICAL solutions for linear algebra ,SCIENTIFIC computing ,COMPUTER architecture - Abstract
The article introduces a new section structure for the SIAM Journal on Scientific Computing (SISC), including a focus on Scientific Machine Learning as a major development in the field of scientific computing. Topics include the development of accurate, efficient, and scalable machine learning algorithms for solving mathematical modeling problems, emphasizing the importance of reproducibility in the review process.
- Published
- 2024
17. The System Architecture and Methods for Efficient Resource-Saving Scheduling in the Fog †.
- Author
-
Klimenko, Anna
- Subjects
COMPUTER architecture ,QUALITY of service ,CLOUD computing ,COMPUTER scheduling ,PROBLEM solving - Abstract
The problem of resource-saving scheduling in a fog environment is considered in this paper. The objective function of the problem presupposes maximizing the fog nodes' reliability function. Therefore, to create a schedule, the following is required: the history of the fog devices' state changes and the search space, which consists of preselected nodes of the cloud-fog broker neighbourhood. The obvious approach to providing the scheduler with this information is to poll the fog nodes, yet this can consume an unacceptable amount of time because of the QoS requirements. In this paper, the system architecture and general methods for efficient resource-saving scheduling are presented. The system is based on the use of distributed ledger elements, which provides the nodes with proper awareness of their surroundings. The use of the distributed ledger allows not only the creation of a resource-saving schedule but also a reduction in the scheduling problem-solving time, which frees additional time that can be used for solving user tasks. The latter also affects the overall resource saving via reliability. The novelty of this paper consists in the development of a hybrid ledger-based system, which integrates and arranges the elements of various ledger types to solve the newly formulated problem. [ABSTRACT FROM AUTHOR]
- Published
- 2023
18. Conference Calendar.
- Author
-
Craeynest, Dirk
- Subjects
SOFTWARE verification ,SOFTWARE architecture ,HIGH performance computing ,COMPUTER science conferences ,REAL-time computing ,COMPUTER architecture ,DISTRIBUTED computing
- Published
- 2022
19. Computer Architectures Empowered by Sierpinski Interconnection Networks utilizing an Optimization Assistant.
- Author
-
Iqbal, Muhammad Waseem and Alshammry, Nizal
- Subjects
COMPUTER architecture ,COMPUTER science ,VERY large scale circuit integration ,COMPUTER engineering ,COMPUTER engineers - Abstract
The current article discusses Sierpinski networks, which are fractal networks with applications in computer science, physics, and chemistry. These networks are typically used in complicated frameworks, fractals, and recursive assemblages. The results derived in this study are given in mathematical and graphical form for particular classes of these networks of two distinct sorts, using two invariants, K-Banhatti Sombor (KBSO) and Dharwad, along with their reduced forms. These results can facilitate the formation, scalability, and introduction of novel interconnection network topologies, chemical compounds, and VLSI processor circuits. The mathematical expressions employed in this research offer modeling insights and design guidelines to computer engineers. The derived simulation results demonstrate the optimal ranges for a given network. The optimization assistant tool deployed in this work provides a single maximized value representing the maximally optimized network. These ranges can be put into service to dynamically establish a network according to given requirements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. Technologies of Data Protection and Institutional Decisions for Data Sovereignty.
- Author
-
Del Re, Enrico
- Subjects
DATA privacy ,DATA protection ,COMPUTER architecture ,SCIENTIFIC literature ,BIOMETRIC identification - Abstract
This paper proposes advanced technological solutions and the consequent institutional decisions necessary to achieve, within a reasonable time, definitive confidential data protection and data sovereignty, based on available scientific results. Confidential data protection is a fundamental and strategic issue in next-generation Internet systems to guarantee data sovereignty and respect for human rights as stated in the foundation of the United Nations. Even though many current international regulations are decisive steps toward guaranteeing data protection within normative contexts, they are not adequate to face new technologies, such as facial recognition, automatic profiling, position tracking, biometric data, AI applications, and many others in the future, as these are implemented without any awareness on the part of the subjects concerned. Therefore, a new approach to data protection is mandatory, based on innovative and disruptive technological solutions. A recent OECD report highlighted the need for so-called Privacy-Enhancing Technologies (PETs) for the effective protection of confidential data, which is even more urgent for the coexistence of privacy and data sharing in international contexts. A common feature of these technologies is the use of software methodologies that can run on currently available microprocessors, together with their present immaturity. More effective and definitive protection can be achieved with another methodological approach based on the paradigm of 'Data Usage Control'. This new concept guarantees data protection policy by design and by default, and it requires a new architecture for the data and a new hardware and software architecture for computers. This contribution has a two-fold objective: first, to clarify why regulations alone and present technological proposals are not adequate for the effective and definitive protection of data and, second, to indicate the new technological approach and the simultaneous institutional actions required to achieve the definitive protection and sovereignty of data in reasonable times, based on results already available in the scientific literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Empirical Architectural Analysis on Performance Scalability of Petascale All-Flash Storage Systems.
- Author
-
Ajdari, Mohammadamin, Montazerzohour, Behrang, Abdi, Kimia, and Asadi, Hossein
- Abstract
In this paper, we first analyze a real storage system consisting of 72 SSDs utilizing either Hardware RAID (HW-RAID) or Software RAID (SW-RAID), and show that SW-RAID is up to 7× faster. We then reveal that with an increasing number of SSDs, the limited I/O parallelism in SAS controllers and multi-enclosure handshaking overheads cause a significant performance drop, reducing the total I/O Operations Per Second (IOPS) of a 144-SSD system to less than that of a single SSD. Second, we disclose the most important architectural parameters that affect a large-scale storage system. Third, we propose a framework that models a large-scale storage system and estimates the system IOPS and system resource usage for various architectures. We verify our framework against a real system and show its high accuracy. Lastly, we analyze a use case of a 240-SSD system and reveal how our framework guides architects in storage system scaling. [ABSTRACT FROM AUTHOR]
- Published
- 2024
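Entry 21 proposes a framework that estimates system IOPS from architectural parameters such as per-SSD throughput, SAS controller parallelism, and multi-enclosure overheads. The abstract does not give the model, so the following is only a back-of-the-envelope, bottleneck-style Python sketch of how such an estimate might be structured; every constant is a made-up placeholder.

```python
def estimate_iops(n_ssd, iops_per_ssd, controller_iops_limit,
                  n_enclosures, enclosure_overhead=0.05):
    """Toy bottleneck model: aggregate SSD throughput is capped by the
    SAS controller, then degraded by multi-enclosure handshaking."""
    raw = n_ssd * iops_per_ssd
    capped = min(raw, controller_iops_limit)
    # Each extra enclosure costs a fraction of throughput (assumption).
    penalty = max(0.0, 1.0 - enclosure_overhead * (n_enclosures - 1))
    return capped * penalty

# Hypothetical numbers only: adding SSDs stops scaling IOPS once the
# controller limit and enclosure overheads dominate.
for n in (24, 72, 144):
    print(n, estimate_iops(n, iops_per_ssd=400_000,
                           controller_iops_limit=1_500_000,
                           n_enclosures=n // 24))
```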
22. Exploiting Direct Memory Operands in GPU Instructions.
- Author
-
Mohammadpur-Fard, Ali, Darabi, Sina, Falahati, Hajar, Mahani, Negin, and Sarbazi-Azad, Hamid
- Abstract
GPUs are widely used for diverse applications, particularly data-parallel tasks like machine learning and scientific computing. However, their efficiency is hindered by architectural limitations, inherited from historical RISC processors, in handling memory loads, which causes high register file contention. We observe that a significant number (around 26%) of the values present in the register file are typically used only once, contributing to more than 25% of the total register file bank conflicts, on average. This paper addresses the challenge of single-use memory values in the GPU register file (i.e., data values used only once), which waste space and increase latency. To this end, we introduce a novel mechanism inspired by CISC architectures: it replaces single-use loads with direct memory operands in arithmetic operations. Our approach improves performance by 20% and reduces energy consumption by 18%, on average, with negligible (<1%) hardware overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. Movement Representation Learning for Pain Level Classification.
- Author
-
Olugbade, Temitayo, Williams, Amanda C de C, Gold, Nicolas, and Bianchi-Berthouze, Nadia
- Abstract
Self-supervised learning has shown value for uncovering informative movement features for human activity recognition. However, there has been minimal exploration of this approach for affect recognition, where the availability of large labelled datasets is particularly limited. In this paper, we propose a P-STEMR (Parallel Space-Time Encoding Movement Representation) architecture with the aim of addressing this gap and specifically leveraging the higher availability of human activity recognition datasets for pain-level classification. We evaluated and analyzed the architecture using three different datasets across four sets of experiments. We found a statistically significant increase in average F1 score to 0.84 for two-class pain level classification based on the architecture compared with the use of hand-crafted features. This suggests that it is capable of learning movement representations and transferring these from activity recognition based on data captured in lab settings to classification of pain levels with messier real-world data. We further found that the efficacy of transfer between datasets can be undermined by dissimilarities in population groups due to impairments that affect movement behaviour and in motion primitives (e.g. rotation versus flexion). Future work should investigate how the effect of these differences could be minimized so that data from healthy people can be more valuable for transfer learning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
24. Enhancing Monitoring Performance: A Microservices Approach to Monitoring with Spyware Techniques and Prediction Models.
- Author
-
Rossetto, Anubis Graciela de Moraes, Noetzold, Darlan, Silva, Luis Augusto, and Leithardt, Valderi Reis Quietinho
- Subjects
COMPUTER architecture ,SPYWARE (Computer software) ,PREDICTION models ,DATA security failures ,COMPUTER monitors ,AUTOMATIC speech recognition - Abstract
In today's digital landscape, organizations face significant challenges, including sensitive data leaks and the proliferation of hate speech, both of which can lead to severe consequences such as financial losses, reputational damage, and psychological impacts on employees. This work considers a comprehensive solution using a microservices architecture to monitor computer usage within organizations effectively. The approach incorporates spyware techniques to capture data from employee computers and a web application for alert management. The system detects data leaks, suspicious behaviors, and hate speech through efficient data capture and predictive modeling. Therefore, this paper presents a comparative performance analysis between Spring Boot and Quarkus, focusing on objective metrics and quantitative statistics. By utilizing recognized tools and benchmarks in the computer science community, the study provides an in-depth understanding of the performance differences between these two platforms. The implementation of Quarkus over Spring Boot demonstrated substantial improvements: memory usage was reduced by up to 80% and CPU usage by 95%, and system uptime decreased by 119%. This solution offers a robust framework for enhancing organizational security and mitigating potential threats through proactive monitoring and predictive analysis while also guiding developers and software architects in making informed technological choices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
25. BlockGraph: a scalable secure distributed ledger that exploits locality.
- Author
-
Goldstein, Seth Copen, Gao, Sixiang, and Sun, Zhenbo
- Subjects
BLOCKCHAINS ,COMPUTER architecture ,CRYPTOCURRENCIES ,BITCOIN ,TRANSACTION costs ,SCALABILITY - Abstract
Distributed public ledgers, the key to modern cryptocurrencies and the heart of many novel applications, have scalability problems. Ledgers such as the blockchain underlying Bitcoin can process fewer than 10 transactions per second (TPS). The cost of transactions is high, and the time to confirm a transaction is measured in minutes. We present the BlockGraph, a scalable distributed public ledger inspired by principles of computer architecture. The BlockGraph exploits the natural locality of transactions to allow publishing independent transactions in parallel. It extends the blockchain with three new transactions to create a unified consistent ledger out of essentially independent blockchains. The most important change is the introduction of the blockstamp transaction, which essentially checkpoints a local blockchain and secures it against attack. The result is a locality-based, simple, secure sharding protocol which keeps all transactions readable. This paper introduces the BlockGraph protocol and proves that it is consistent and can achieve many thousands of TPS. Using our implementation (a small extension to Bitcoin core), we demonstrate that, in practice, it can significantly improve throughput. [ABSTRACT FROM AUTHOR]
- Published
- 2024
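Entry 25 introduces a blockstamp transaction that checkpoints a local blockchain into the wider BlockGraph. The protocol's actual transaction format is not given in the abstract; the dataclass below is purely a hypothetical illustration of the kind of information such a checkpoint record could carry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Blockstamp:
    """Hypothetical checkpoint of a local chain published to another chain."""
    local_chain_id: str     # which shard/local blockchain is checkpointed
    local_block_hash: str   # tip hash of the local chain at checkpoint time
    local_height: int       # height of that tip
    anchor_txid: str        # transaction id where the stamp is recorded

stamp = Blockstamp("shard-eu-1", "9f2c1ab0", 128_431, "a77b3c04")
print(stamp)
```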
26. Gender determination from periocular images using deep learning based EfficientNet architecture.
- Author
-
Nambiar, Viji B, Ramamurthy, Bojan, and Veeresha, Pundikala
- Subjects
CONVOLUTIONAL neural networks ,DEEP learning ,COMPUTER architecture ,LANGUAGE models ,MATHEMATICAL functions - Abstract
In this study, we obtain a sex prediction algorithm based on a CNN in two ways: building a Convolutional Neural Network (CNN) model from scratch, and via transfer learning. We built a model from scratch and compared it with a fine-tuned EfficientNetB1. We use these models for gender determination from periocular images and compare them on accuracy. The CNN model built from scratch yields an accuracy of 94.46%, while the fine-tuned EfficientNetB1 yields an accuracy of 97.94%. This paper is one of the first works to determine gender from periocular images in the visible spectrum using a CNN model built from scratch. [ABSTRACT FROM AUTHOR]
- Published
- 2024
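Entry 26 fine-tunes EfficientNetB1 for gender classification from periocular images. The authors' exact training setup is not in the abstract; the Keras sketch below shows a generic transfer-learning configuration of that kind, with the input size, head layers, and hyperparameters chosen only for illustration.

```python
import tensorflow as tf

# ImageNet-pretrained EfficientNetB1 backbone without its classifier head.
base = tf.keras.applications.EfficientNetB1(
    include_top=False, weights="imagenet", input_shape=(240, 240, 3))
base.trainable = False                      # freeze for the first training stage

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # binary gender output
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets not shown
```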
27. Event‐based high throughput computing: A series of case studies on a massively parallel softcore machine.
- Author
-
Vousden, Mark, Morris, Jordan, McLachlan Bragg, Graeme, Beaumont, Jonathan, Rafiev, Ashur, Luk, Wayne, Thomas, David, and Brown, Andrew
- Subjects
CONDENSED matter physics ,ELECTRICITY pricing ,COMPUTATIONAL chemistry ,COMPUTER architecture ,MESSAGE passing (Computer science) - Abstract
This paper introduces an event‐based computing paradigm, where workers only perform computation in response to external stimuli (events). This approach is best employed on hardware with many thousands of smaller compute cores with a fast, low‐latency interconnect, as opposed to traditional computers with fewer and faster cores. Event‐based computing is timely because it provides an alternative to traditional big computing, which suffers from immense infrastructural and power costs. This paper presents four case study applications, where an event‐based computing approach finds solutions to orders of magnitude more quickly than the equivalent traditional big compute approach, including problems in computational chemistry and condensed matter physics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
28. A countless variant simulation-based toolkit for remote learning and evaluation.
- Author
-
Romero, Felipe, Bandera, Gerardo, Romero, Javier, and Romero, Luis F.
- Subjects
DISTANCE education ,COMPUTER architecture ,COVID-19 pandemic ,EDUCATORS ,ACADEMIC motivation - Abstract
The COVID-19 pandemic has brought about a profound transformation in the educational landscape in recent months. Educators worldwide have been challenged to tackle academic issues they could never have imagined. Among the most stressful situations faced by students and teachers is implementing online assessments. This paper proposes a system that includes exam prototypes for computer architecture modules at the higher education level. This system generates a wide range of questions and variations on the server side, supported by a set of simulators, resulting in many unique examination proposals. This system streamlines the monitoring process for the teacher, as it eliminates the possibility of two students receiving similar exams and reduces student stress by allowing them to practice with a limitless number of exam samples. This paper also highlights several indicators that demonstrate the advantages of this framework. [ABSTRACT FROM AUTHOR]
- Published
- 2023
29. A Novel Architecture Based on Business Intelligence Approach to Exploit Big Data.
- Author
-
Nejad, M. R. Behbahani and Rashid, H.
- Subjects
BIG data ,BUSINESS intelligence ,DECISION making ,UNIFIED modeling language ,COMPUTER architecture - Abstract
Background and Objectives: Big data is a combination of structured, semi-structured, and unstructured data collected by organizations that must be stored and used for decision-making. Businesses that deal with business intelligence systems, as well as their data sources, face a major challenge in exploiting Big Data. The current architecture of business intelligence systems is not capable of incorporating and exploiting Big Data. In this paper, an architecture is developed to respond to this challenge. Methods: This paper focuses on extending business intelligence to create the ability to exploit Big Data within business intelligence. In this regard, a new architecture is proposed that integrates Business Intelligence and Big Data architectures. To evaluate the proposed architecture, we investigated business intelligence architecture and Big Data architecture. Then, we developed a Unified Modeling Language diagram for the proposed architecture. In addition, using a Colored Petri Net, the proposed architecture is evaluated in a case study. Results: The results show that our architecture achieves higher efficiency in executing all steps, in average time, and in maximum time compared to the existing business intelligence architecture. Conclusion: The proposed architecture can help companies and organizations gain more value from their data sources and better support managers and organizations in their decision-making. [ABSTRACT FROM AUTHOR]
- Published
- 2023
30. Speculative Taint Tracking (STT): A Comprehensive Protection for Speculatively Accessed Data.
- Author
-
Jiyong Yu, Mengjia Yan, Khyzha, Artem, Morrison, Adam, Torrellas, Josep, and Fletcher, Christopher W.
- Subjects
COMPUTER security ,DATA protection ,MALWARE prevention ,COMPUTER architecture ,COMPUTER performance - Abstract
Speculative execution attacks present an enormous security threat, capable of reading arbitrary program data under malicious speculation, and later exfiltrating that data over microarchitectural covert channels. This paper proposes speculative taint tracking (STT), a high security and high performance hardware mechanism to block these attacks. The main idea is that it is safe to execute and selectively forward the results of speculative instructions that read secrets, as long as we can prove that the forwarded results do not reach potential covert channels. The technical core of the paper is a new abstraction to help identify all microarchitectural covert channels, and an architecture to quickly identify when a covert channel is no longer a threat. We further conduct a detailed formal analysis on the scheme in a companion document. When evaluated on SPEC06 workloads, STT incurs 8.5% or 14.5% performance overhead relative to an insecure machine. [ABSTRACT FROM AUTHOR]
- Published
- 2021
31. One-Step Calculation Circuit of FFT and Its Application.
- Author
-
Liu, Yiyang, Wang, Chunhua, Sun, Jingru, Du, Sichun, and Hong, Qinghui
- Subjects
DISCRETE Fourier transforms ,FAST Fourier transforms ,ANALOG circuits ,SIGNAL processing - Abstract
Discrete Fourier Transform (DFT) and Fast Fourier Transform (FFT) are core components in the field of signal processing. However, in the existing research, there is no fully analog circuit that can realize the one-step calculation of the FFT. Therefore, in this paper, an analog circuit that can calculate the FFT and its inverse transform, the IFFT, in one step is proposed. First, a circuit that can realize complex number operations is designed. On the basis of this structure, a fully analog circuit that can realize fast and efficient computation of the FFT and IFFT in one step is proposed. In addition, different coefficient matchings can be obtained to achieve FFTs and IFFTs with an arbitrary number of points by adjusting the resistance value of the memristor, which provides good programmability. Specific examples are given in the paper to evaluate the proposed method. The PSPICE simulation results show that the average accuracy is above 99.98%. More importantly, the calculation speed has been greatly improved compared with MATLAB simulation. Finally, the proposed circuit can be used to quickly solve convolution operations, and the average accuracy can reach 99.95%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
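For reference, the transform pair that entry 31's one-step analog circuit evaluates is the standard DFT and its inverse (the FFT is a fast algorithm for computing them):

```latex
X_k = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i k n / N},
\qquad
x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k \, e^{2\pi i k n / N},
\qquad k, n = 0, \dots, N-1
```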
32. Implementation of Efficient Vedic Multiplier and Its Performance Evaluation.
- Author
-
Mugatkar, Ashutosh and Gajre, Suhas S.
- Subjects
INTEGRATED circuits ,COMPUTER architecture ,MULTIPLIERS (Mathematical analysis) ,MULTIPLICATION ,MATHEMATICS - Abstract
Ancient Vedic mathematics is well known for quick by-hand multiplication, but its merit as an integrated-circuit core against existing hardware multipliers is not established. As the optimized hardware implementation of a binary multiplier is one of the prominent unsolved problems in computer architecture, this paper proposes an efficient Urdhava Tiryakbhyam Vedic multiplier architecture and compares it with a set of hierarchical multiplication algorithms that generate the multiplication result in a single clock cycle. Two innovative algorithms are proposed here, one with a compact structure and another for faster execution. Its optimized transistor-level layout is also designed and implemented. To maintain homogeneity for comparison, all the algorithms are programmed on a common HDL platform and analyzed with the same tool and technology. The final results indicate that the proposed architecture delivers a 15.5% lower power-delay product (PDP) compared to the closest competing algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
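Entry 32 builds on the Urdhva Tiryakbhyam ("vertically and crosswise") sutra, in which each output column is formed from the crosswise products of the input digits and carries are then propagated. The hardware architectures proposed in the paper are not described in the abstract; the Python sketch below only illustrates the digit-level crosswise pattern that the multiplier cells implement.

```python
def urdhva_multiply(a_digits, b_digits, base=2):
    """Urdhva Tiryakbhyam: column sums of crosswise digit products,
    least-significant digit first, with carry propagation.
    For base=2 this mirrors the binary multiplier cells in hardware."""
    n = len(a_digits)
    assert len(b_digits) == n
    cols = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            cols[i + j] += a_digits[i] * b_digits[j]   # crosswise products
    result, carry = [], 0
    for s in cols:
        s += carry
        result.append(s % base)
        carry = s // base
    while carry:
        result.append(carry % base)
        carry //= base
    return result                                       # LSB first

# 13 x 11 = 143 in binary (LSB-first digit lists).
a = [1, 0, 1, 1]          # 13
b = [1, 1, 0, 1]          # 11
digits = urdhva_multiply(a, b)
print(sum(d << k for k, d in enumerate(digits)))        # -> 143
```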
33. Applications, Deployments, and Integration of Internet of Drones (IoD): A Review.
- Author
-
Abualigah, Laith, Diabat, Ali, Sumari, Putra, and Gandomi, Amir H.
- Abstract
The Internet of Drones (IoD) has become a hot research topic in academia, industry, and management in recent years due to its wide range of potential applications, such as aerial photography and civilian and military uses. This paper presents a comprehensive survey of IoD and its applications, deployments, and integration. We focus in this review on two main sides: IoD applications, which include smart city surveillance, cloud and fog frameworks, unmanned aerial vehicles, wireless sensor networks, networks, mobile computing, and business paradigms; and integration of IoD, which includes privacy protection, security authentication, neural networks, blockchain, and optimization-based methods. A discussion highlights the hot research topics and open problems to help researchers interested in this area with their future work. The keywords that have been used in this paper are Internet of Drones. [ABSTRACT FROM AUTHOR]
- Published
- 2021
34. A novel edge computing approach to astronomical image data processing based on sCMOS camera using SoC.
- Author
-
Zienkiewicz, Paweł, Karpińska, Katarzyna, Jamroży, Mikołaj, Juszczyk, Bartłomiej, Pochapskyi, Dmytro, Przedpełski, Tomasz, Łukasiewicz, Jerzy, Czortek, Natalia, and Brona, Grzegorz
- Subjects
CLIENT/SERVER computing ,ASTRONOMY ,FIELD programmable gate arrays ,COMPUTER architecture ,DATA transmission systems - Abstract
The ever-growing deluge of astronomical data challenges traditional server-based processing, hindering real-time analysis and scientific discovery. This paper proposes a novel approach: edge computing directly on an sCMOS camera using a System-on-Chip (SoC) architecture currently developed at Creotech Instruments. We present a custom-designed camera equipped with an FPGA-based SoC, enabling on-board preprocessing and feature extraction of astronomical images. This significantly reduces data transmission, minimizes latency, and empowers real-time decision-making for critical observations. We showcase the camera's capabilities through real-world scenarios, demonstrating its usability in astronomy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
35. PCI-Express: Evolution of a Ubiquitous Load-Store Interconnect Over Two Decades and the Path Forward for the Next Two Decades.
- Author
-
Sharma, Debendra Das
- Abstract
The Peripheral Component Interconnect Express (PCI-Express or PCIe) architecture has been the ubiquitous backbone interconnect in the evolving computing landscape for more than two decades. This paper delves into the multiple innovations driving the backward-compatible evolution of PCIe over seven generations, doubling bandwidth every generation while delivering power-efficient and cost-effective performance. Compute Express Link (CXL) overlays coherency and memory protocols on top of PCIe for heterogeneous computing, addressing the memory wall, resource pooling and sharing across servers, and distributed computing using load-store based messaging. The die-to-die industry standard, Universal Chiplet Interconnect Express (UCIe), offers orders of magnitude improvement in bandwidth density, power efficiency, and latency for PCIe and CXL protocols for on-package and pod-level connectivity with co-packaged optics. We foresee that PCIe will continue to evolve over the next few decades to serve future computing needs. It will do so by embracing alternate media, while backward-compatible frequency scaling with copper continues and protocol enhancements enable non-tree fabric topologies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
36. A Taxonomy and Survey of Edge Cloud Computing for Intelligent Transportation Systems and Connected Vehicles.
- Author
-
Arthurs, Peter, Gillam, Lee, Krause, Paul, Wang, Ning, Halder, Kaushik, and Mouzakitis, Alexandros
- Abstract
Recent advances in smart connected vehicles and Intelligent Transportation Systems (ITS) are based upon the capture and processing of large amounts of sensor data. Modern vehicles contain many internal sensors to monitor a wide range of mechanical and electrical systems and the move to semi-autonomous vehicles adds outward looking sensors such as cameras, lidar, and radar. ITS is starting to connect existing sensors such as road cameras, traffic density sensors, traffic speed sensors, emergency vehicle, and public transport transponders. This disparate range of data is then processed to produce a fused situation awareness of the road network and used to provide real-time management, with much of the decision making automated. Road networks have quiet periods followed by peak traffic periods and cloud computing can provide a good solution for dealing with peaks by providing offloading of processing and scaling-up as required, but in some situations latency to traditional cloud data centres is too high or bandwidth is too constrained. Cloud computing at the edge of the network, close to the vehicle and ITS sensor, can provide a solution for latency and bandwidth constraints but the high mobility of vehicles and heterogeneity of infrastructure still needs to be addressed. This paper surveys the literature for cloud computing use with ITS and connected vehicles and provides taxonomies for that plus their use cases. We finish by identifying where further research is needed in order to enable vehicles and ITS to use edge cloud computing in a fully managed and automated way. We surveyed 496 papers covering a seven-year timespan with the first paper appearing in 2013 and ending at the conclusion of 2019. [ABSTRACT FROM AUTHOR]
- Published
- 2022
37. Complicated Dynamics of a Delayed Photonic Reservoir Computing System.
- Author
-
Pei, Lijun and Zhang, Mengyu
- Subjects
COMPUTER systems ,MULTIPLE scale method ,HOPF bifurcations ,RECURRENT neural networks ,COMPUTER architecture ,BIFURCATION diagrams - Abstract
In this paper, we consider the complicated dynamics of a delay-based photonic reservoir computing system. Since conventional computer architectures are approaching their limit, it is imperative to find new, efficient and fast ways of data processing. Photonic reservoir computing (RC) is a promising way which combines the computational capabilities of recurrent neural networks with high processing speed and energy efficiency of photonics. As the RC system is very promising, we analyze its dynamics so that we can make better use of it. In this paper, we mainly focus on its double Hopf bifurcation. We first analyze the existence of double Hopf bifurcation points. Then we use DDE-BIFTOOL to draw the bifurcation diagrams with respect to two bifurcation parameters, i.e. feedback strength η and delay τ , and give a clear picture of the double Hopf bifurcation points of the system. These figures show stability switches and the existence of double Hopf bifurcation points. Finally, we employ the method of multiple scales to obtain their normal forms, use the method of normal form to unfold and classify their local dynamics. The classification and unfolding of these double Hopf bifurcation points are obtained. Three types of double Hopf bifurcations are found. We verify the results by numerical simulations and find its complicated behavioral dynamics. For example, there exist stable equilibrium, stable periodic and quasi-periodic solutions in distinct regions. The discovered rich dynamical phenomena can help us to choose suitable values of parameters to achieve excellent performance of RC. [ABSTRACT FROM AUTHOR]
- Published
- 2022
38. Evaluating RISC-V Vector Instruction Set Architecture Extension with Computer Vision Workloads.
- Author
-
Li, Ruo-Shi, Peng, Ping, Shao, Zhi-Yuan, Jin, Hai, and Zheng, Ran
- Subjects
COMPUTER architecture ,GRAYSCALE model ,COMPUTER vision ,COMPUTER performance ,PARALLEL processing ,ALGORITHMS - Abstract
Computer vision (CV) algorithms have been extensively used for a myriad of applications nowadays. As the multimedia data are generally well-formatted and regular, it is beneficial to leverage the massive parallel processing power of the underlying platform to improve the performances of CV algorithms. Single Instruction Multiple Data (SIMD) instructions, capable of conducting the same operation on multiple data items in a single instruction, are extensively employed to improve the efficiency of CV algorithms. In this paper, we evaluate the power and effectiveness of RISC-V vector extension (RV-V) on typical CV algorithms, such as Gray Scale, Mean Filter, and Edge Detection. By our examinations, we show that compared with the baseline OpenCV implementation using scalar instructions, the equivalent implementations using the RV-V (version 0.8) can reduce the instruction count of the same CV algorithm up to 24x, when processing the same input images. Whereas, the actual performances improvement measured by the cycle counts is highly related with the specific implementation of the underlying RV-V co-processor. In our evaluation, by using the vector co-processor (with eight execution lanes) of Xuantie C906, vector-version CV algorithms averagely exhibit up to 2.98x performances speedups compared with their scalar counterparts. [ABSTRACT FROM AUTHOR]
- Published
- 2023
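Entry 38 uses grayscale conversion as one of its SIMD-friendly kernels. For context, the standard ITU-R BT.601 luma weighting, applied independently to every pixel and therefore trivially vectorizable with RV-V, is:

```latex
Y = 0.299\,R + 0.587\,G + 0.114\,B
```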
39. An Integrated, Scalable, Electronic Video Consent Process to Power Precision Health Research: Large, Population-Based, Cohort Implementation and Scalability Study
- Author
-
Clara Lajonchere, Arash Naeim, Sarah Dry, Neil Wenger, David Elashoff, Sitaram Vangala, Antonia Petruse, Maryam Ariannejad, Clara Magyar, Liliana Johansen, Gabriela Werre, Maxwell Kroloff, and Daniel Geschwind
- Subjects
Adult ,data collection ,Adolescent ,Computer science ,precision medicine ,Large population ,Health Informatics ,privacy ,video ,electronic consent ,Cohort Studies ,biobanking ,research methods ,Humans ,scalability ,validation ,Original Paper ,patient privacy ,research ,Informed Consent ,Power (physics) ,recruitment ,Computer architecture ,Scalability ,Cohort ,consent ,clinical data ,eHealth ,Preprint ,Electronics ,population health ,Laboratories, Clinical - Abstract
Background: Obtaining explicit consent from patients to use their remnant biological samples and deidentified clinical data for research is essential for advancing precision medicine. Objective: We aimed to describe the operational implementation and scalability of an electronic universal consent process that was used to power an institutional precision health biobank across a large academic health system. Methods: The University of California, Los Angeles, implemented the use of innovative electronic consent videos as the primary recruitment tool for precision health research. The consent videos targeted patients aged ≥18 years across ambulatory clinical laboratories, perioperative settings, and hospital settings. Each of these major areas had slightly different workflows and patient populations. Sociodemographic information, comorbidity data, health utilization data (ambulatory visits, emergency room visits, and hospital admissions), and consent decision data were collected. Results: The consenting approach proved scalable across 22 clinical sites (hospital and ambulatory settings). Over 40,000 participants completed the consent process at a rate of 800 to 1000 patients per week over a 2-year time period. Participants were representative of the adult University of California, Los Angeles, Health population. The opt-in rates in the perioperative (16,500/22,519, 73.3%) and ambulatory clinics (2308/3390, 68.1%) were higher than those in clinical laboratories (7506/14,235, 52.7%; P Conclusions: This is one of the few large-scale, electronic video–based consent implementation programs that reports a 65.5% (26,314/40,144) average overall opt-in rate across a large academic health system. This rate is higher than those previously reported for email (3.6%) and electronic biobank (50%) informed consent rates. This study demonstrates a scalable recruitment approach for population health research.
- Published
- 2021
40. Utilizing DMAIC Process to Identify Successful Completion of SRAD Phases of Waterfall Development.
- Author
-
Hossain, Niamat Ullah Ibne, Sokolov, Alexandr M., Petersen, Tim, and Merrill, Brian
- Subjects
REQUIREMENTS engineering ,COMPUTER architecture ,PROJECT management ,WATERFALLS ,MANUFACTURING processes - Abstract
The Systems Requirements Analysis (SRA) and System Architecture Design (SAD) phases (often combined into one acronym, SRAD) of projects in the waterfall development cycle often pass through design gates without proper pass/fail criteria. In addition, completion of project designs is often deferred to later design phases (Preliminary Design and Critical Design) in favor of meeting schedule and budget early in the project lifecycle. Currently in industry, schedule and budget dictate project phase completion over proper metric tracking and utilization. This is normally due to the fluidity of the design in the early phases of project development. This thinking can be dangerous for organizations and industries, as it consistently leads to defects late in the development cycle, where fixes are costly; it is cheaper to change designs and correct defects as early as possible. This paper outlines the DMAIC (Define, Measure, Analyze, Improve, Control) process to help track completion of the SRAD phases for proper completion of design review phase gates. By using DMAIC, projects will also be able to reduce latent defects in designs that can become costly in later phases such as testing and production. These phase gate completion metrics can be implemented and refined, as the DMAIC process is an ongoing methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2023
41. IS ARCHITECTURE COMPLEXITY DYNAMICS IN M&A: DOES CONSOLIDATION REDUCE COMPLEXITY?
- Author
-
Onderdelinden, Eric, van den Hooff, Bart, and van Vliet, Mario
- Subjects
INFORMATION storage & retrieval systems ,DATA integration ,COMPUTER architecture ,MERGERS & acquisitions ,TECHNOLOGICAL complexity - Abstract
In this paper, we aim to improve our understanding of the dynamics of IS architecture complexity (i.e., the change in this complexity over time) during the execution of a consolidation IS integration strategy (IIS). Based on two case studies, we find that unexpected levels of complexity emerge during IIS execution because of an underestimation of requisite complexity and an overestimation of the potential to reduce complexity. Our analysis shows that the increased complexity arises because the intended consolidation IIS is only partially executed and because IIS execution becomes increasingly emergent. Additionally, we find that while complexity was reduced at the portfolio level, at more detailed levels of observation complexity actually increased. Our paper contributes to knowledge in the field by providing deeper insight into IS architecture complexity dynamics during the execution of a consolidation IIS, and into the concept of IS architecture complexity in general. [ABSTRACT FROM AUTHOR]
- Published
- 2023
42. Relation Between INL and ACPR of RF DACs.
- Author
-
Babamir, Seyed-Mehrdad and Razavi, Behzad
- Subjects
COMPUTER architecture ,DIGITAL-to-analog converters ,TRANSMITTERS (Communication) ,INTEGRATED circuits ,MATHEMATICAL models ,RADIO frequency - Abstract
The integral nonlinearity of digital-to-analog converters manifests itself as adjacent-channel power in RF transmitters. This paper derives compact equations relating these two quantities and verifies the results by simulations. Both current-steering and switched-mode architectures are analyzed. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
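For readers outside the data-converter field, the following toy simulation (not taken from the paper, which derives closed-form equations) illustrates the mechanism the preceding abstract describes: a static INL profile applied to a band-limited signal causes spectral regrowth, which shows up as adjacent-channel power. The sample rate, channel plan, bit width, and cubic INL shape are all assumptions.
```python
# Toy illustration: static DAC INL producing adjacent-channel power (ACPR).
import numpy as np

fs = 100e6          # sample rate (assumed)
n = 2**16
f_ch = 10e6         # channel center frequency (assumed)
bw = 2e6            # channel bandwidth (assumed)

# Band-limited "modulated" signal: brick-wall-filtered Gaussian noise, upconverted.
rng = np.random.default_rng(0)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
X = np.fft.fft(x)
f = np.fft.fftfreq(n, 1 / fs)
X[np.abs(f) > bw / 2] = 0.0
x = np.fft.ifft(X).real
x *= np.cos(2 * np.pi * f_ch * np.arange(n) / fs)

# Ideal codes plus a smooth (cubic) INL error, an assumed static nonlinearity.
bits = 10
codes = np.round((x / np.max(np.abs(x))) * (2**(bits - 1) - 1))
inl_lsb = 2.0                                   # peak INL in LSBs (assumed)
y = codes + inl_lsb * (codes / 2**(bits - 1))**3

def band_power(sig, f_lo, f_hi):
    """Integrate windowed FFT power between f_lo and f_hi."""
    S = np.abs(np.fft.rfft(sig * np.hanning(len(sig))))**2
    fr = np.fft.rfftfreq(len(sig), 1 / fs)
    return S[(fr >= f_lo) & (fr < f_hi)].sum()

p_main = band_power(y, f_ch - bw / 2, f_ch + bw / 2)
p_adj = band_power(y, f_ch + bw / 2, f_ch + 3 * bw / 2)
print("ACPR ~ %.1f dB" % (10 * np.log10(p_main / p_adj)))
```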
43. DPCrypto: Acceleration of Post-Quantum Cryptography Using Dot-Product Instructions on GPUs.
- Author
-
Lee, Wai-Kong, Seo, Hwajeong, Hwang, Seong Oun, Achar, Ramachandra, Karmakar, Angshuman, and Mera, Jose Maria Bermudo
- Subjects
SCIENCE education ,GRAPHICS processing units ,QUANTUM cryptography ,DATA structures ,SCIENTIFIC computing ,MACHINE learning - Abstract
Modern NVIDIA GPU architectures offer dot-product instructions (DP2A and DP4A), with the aim of accelerating machine learning and scientific computing applications. These dot-product instructions compute several multiply-and-add operations in a single clock cycle, effectively achieving higher throughput than conventional 32-bit integer units. In this paper, we show that the dot-product instruction can also be used to accelerate matrix multiplication and polynomial convolution operations, which are widely used in post-quantum lattice-based cryptographic schemes. In particular, we propose a highly optimized implementation of FrodoKEM wherein the matrix multiplication is accelerated by the dot-product instruction. We also present specially designed data structures that allow an efficient implementation of the Saber key-encapsulation mechanism, utilizing the dot-product instruction to speed up the polynomial convolution. The proposed FrodoKEM implementation achieves 4.37× higher throughput than the state-of-the-art implementation on a V100 GPU. This paper also presents the first implementation of Saber on GPU platforms, achieving 124,418, 120,463, and 31,658 key exchanges per second on RTX3080, V100, and T4 GPUs, respectively. Since matrix multiplication and polynomial convolution are the most time-consuming operations in lattice-based cryptographic schemes, we strongly believe that the proposed methods can benefit other lattice-based KEM and signature schemes. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
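As a rough illustration of the instruction the preceding abstract relies on, the sketch below emulates DP4A semantics (four packed int8 products accumulated into a 32-bit integer) and uses it in a small integer matrix multiplication of the kind that appears in lattice-based schemes. This is not the DPCrypto code; the function names, matrix sizes, and data types are assumptions.
```python
# Emulation of DP4A semantics and its use in int8 matrix multiplication.
import numpy as np

def dp4a(a4, b4, c):
    """a4, b4: length-4 int8 vectors; c: 32-bit accumulator (one DP4A step)."""
    return int(c + np.dot(a4.astype(np.int32), b4.astype(np.int32)))

def matmul_dp4a(A, B):
    """C = A @ B with int8 inputs, accumulating four elements per 'instruction'."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2 and k % 4 == 0
    C = np.zeros((m, n), dtype=np.int64)
    for i in range(m):
        for j in range(n):
            acc = 0
            for p in range(0, k, 4):            # one emulated DP4A per group of four
                acc = dp4a(A[i, p:p+4], B[p:p+4, j], acc)
            C[i, j] = acc
    return C

rng = np.random.default_rng(1)
A = rng.integers(-128, 128, size=(8, 16), dtype=np.int8)
B = rng.integers(-128, 128, size=(16, 8), dtype=np.int8)
# Sanity check against a plain wide-integer matrix product.
assert np.array_equal(matmul_dp4a(A, B), A.astype(np.int64) @ B.astype(np.int64))
```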
44. Classification of mastoid air cells by CT scan images using deep learning method.
- Author
-
Khosravi, Mohammad, Jabbari Moghaddam, Yalda, Esmaeili, Mahdad, Keshtkar, Ahmad, Jalili, Javad, and Tayefi Nasrabadi, Hamid
- Subjects
COMPUTED tomography ,DEEP learning ,ARTIFICIAL neural networks ,CONVOLUTIONAL neural networks ,COMPUTER architecture ,MIDDLE ear - Abstract
Purpose: Mastoid abnormalities indicate different types of ear illness; however, the shortage of experts and the limited accuracy of manual diagnosis demand a new approach to detect these abnormalities and reduce human error. The manual analysis of mastoid CT scans is time-consuming and labor-intensive. In this paper, a first, robust deep learning-based approach is introduced to diagnose mastoid abnormalities with remarkable accuracy, using a large database of CT images obtained in a clinical center. Methods: In this paper, mastoid abnormalities are classified into five categories (Complete pneumatized, Opacification in pneumatization, Partial pneumatization, Opacification in partial pneumatization, None pneumatized) using an Xception-based Convolutional Neural Network (CNN) model with the Adamax optimizer. For this purpose, a total of 24,800 slices from 152 patients, covering the mastoid from the uppermost to the lowest part of the middle ear cavity, were selected to construct the proposed deep neural network model. Results: The proposed model had the best accuracy of 87.80% (based on grader 1) and 88.44% (based on grader 2) at the 20th epoch and 87.70% (based on grader 1) and 87.56% (based on grader 2) on average, and it was also significantly faster than the other implemented architectures in terms of computer running time (in seconds). The 99% confidence interval of the average accuracy was 0.012, which means that the true accuracy lies within 87.80% ± 1.2% and 87.56% ± 1.2%, indicating the strength of the model. Conclusions: The manual analysis of ear cavity CT scans is often time-consuming and prone to errors due to inter- or intra-operator variability. The proposed method can be used to automatically analyze the middle ear cavity to classify mastoid abnormalities, markedly faster than most models and with the highest accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
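A minimal Keras sketch of the kind of model the preceding abstract describes, an Xception backbone with a five-class softmax head trained with the Adamax optimizer, is shown below; the input resolution, classification head, and training settings are assumptions rather than the authors' exact configuration.
```python
# Minimal Xception-based five-class classifier with the Adamax optimizer (assumed setup).
import tensorflow as tf

NUM_CLASSES = 5   # Complete pneumatized ... None pneumatized

base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False                     # assumed: freeze backbone, fine-tune later

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),          # assumed regularization
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adamax(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_ds, validation_data=val_ds, epochs=20)  # 20 epochs per the abstract
```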
45. Energy and Delay Guaranteed Joint Beam and User Scheduling Policy in 5G CoMP Networks.
- Author
-
Kim, Yeongjin, Jeong, Jaehwan, Ahn, Suyoung, Kwak, Jeongho, and Chong, Song
- Abstract
Massive Multi-Input Multi-Output (MIMO) and Coordinated MultiPoint (CoMP) technologies in the Cloud-RAN (C-RAN) architecture have become an inevitable trend due to the advent of traffic-intensive next-generation mobile applications such as ultra-high-definition (UHD) video. In this paper, we study a joint beam activation and user scheduling problem in a 5G cellular network with massive MIMO and CoMP utilizing an orthogonal random beamforming technique. This paper aims to minimize the total energy expenditure of the Remote Radio Heads (RRHs) in a dynamic C-RAN architecture while ensuring finite service time for all user traffic arrivals in the communication coverage. We leverage the Lyapunov drift-plus-penalty framework to transform an original long-term average problem into a series of per-slot modified problems. Since the resulting per-slot problem is a combinatorial and nonlinear optimization problem, we draw on a greedy algorithm to design an energy- and delay-guaranteed joint beam activation and user scheduling policy, namely BEANS. We prove that the proposed BEANS ensures finite upper bounds on average RRH energy consumption and average queue backlogs for all traffic arrival rates within a constant ratio of the capacity region and for all energy-delay tradeoff parameters. These proofs are the first attempt to theoretically demonstrate guarantees on energy and queue bounds in a framework consisting of a possibly negative submodular objective function and non-matroid constraints. Finally, via extensive simulations, we compare the capacity region and energy-queue backlog tradeoff of BEANS with optimal and existing algorithms, and show that BEANS attains up to 65% energy saving for the same average queue backlog compared with algorithms that do not take traffic dynamics and energy consumption into consideration. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
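The sketch below is a generic drift-plus-penalty scheduler, not BEANS itself: each slot it picks the beam activation that minimizes V * energy - sum_i Q_i * service_i and then updates the traffic queues. The traffic model, rate model, and exhaustive search over a tiny beam set (where the paper uses a greedy rule) are all toy assumptions.
```python
# Generic Lyapunov drift-plus-penalty scheduling loop (toy model, not BEANS).
import itertools
import numpy as np

V = 10.0                        # energy-delay tradeoff parameter (assumed)
num_users, num_beams = 4, 3
Q = np.zeros(num_users)         # per-user queue backlogs
rng = np.random.default_rng(0)

def service_and_energy(active_beams, rates):
    """Toy model: each active beam serves the most backlogged user; energy ~ #beams."""
    service = np.zeros(num_users)
    for b in active_beams:
        u = np.argmax(Q * rates[b])
        service[u] += rates[b, u]
    return service, 1.0 * len(active_beams)

for t in range(100):
    arrivals = rng.poisson(0.5, num_users)
    rates = rng.uniform(0.5, 2.0, (num_beams, num_users))   # per-slot channel rates
    best, best_obj = (), np.inf
    for k in range(num_beams + 1):          # small enough to enumerate all beam subsets
        for combo in itertools.combinations(range(num_beams), k):
            s, e = service_and_energy(combo, rates)
            obj = V * e - np.dot(Q, s)      # drift-plus-penalty objective
            if obj < best_obj:
                best, best_obj = combo, obj
    s, _ = service_and_energy(best, rates)
    Q = np.maximum(Q - s, 0) + arrivals     # queue update
print("final backlogs:", Q)
```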
46. Distributed MPC for Large Freeway Networks Using Alternating Optimization.
- Author
-
Todorovic, Ugljesa, Frejo, Jose Ramon D., and De Schutter, Bart
- Abstract
The Model Predictive Control (MPC) framework has shown great potential for the control of Variable Speed Limits (VSLs) and Ramp Metering (RM) installations. However, implementation in large freeway networks remains challenging. One major reason is that, when the VSLs are treated as discrete decision variables, an extremely difficult Mixed Integer Nonlinear Programming (MINLP) optimization problem has to be solved within every controller sampling interval. Consequently, many related papers relax the MINLP problems by considering the VSLs to be continuous variables. This paper proposes two novel MPC algorithms for coordinated control of discrete VSLs and continuous RM rates that do not make this relaxation. The proposed algorithms use a distributed control architecture and an alternating optimization scheme to relax the MINLP optimization problems while still treating the VSLs as discrete variables and, hence, offer a trade-off between computational complexity and system performance. The performance of the proposed algorithms is evaluated in a case study. The case study shows that relaxing the VSLs to be continuous variables with a distributed architecture results in a significant performance loss. Furthermore, both proposed algorithms have a lower computational complexity than the more conventional centralized approach and, as a result, they manage to solve all optimization problems within the sampling intervals. Moreover, one of the proposed algorithms achieves a system performance remarkably close to the optimal performance of the centralized approach. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
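The following skeleton illustrates the alternating-optimization idea on a toy objective: with the discrete speed limits fixed, the continuous ramp-metering rates are optimized; with the rates fixed, each speed limit is improved by enumerating its admissible values. The cost function, dimensions, and bounds are placeholders, not the paper's traffic model.
```python
# Alternating optimization over discrete VSLs and continuous RM rates (toy objective).
import numpy as np
from scipy.optimize import minimize

VSL_OPTIONS = np.array([60, 80, 100, 120])   # admissible discrete speed limits (assumed)
num_segments, num_ramps = 3, 2

def cost(vsl, rm):
    """Stand-in for the MPC objective (e.g., total time spent); purely illustrative."""
    return np.sum((vsl - 90) ** 2) / 100 + np.sum((rm - 0.6) ** 2) + 0.05 * vsl[0] * rm[0]

vsl = np.full(num_segments, 100.0)           # initial discrete decisions
rm = np.full(num_ramps, 0.5)                 # initial continuous decisions

for it in range(10):
    # Step 1: continuous subproblem with the discrete VSLs fixed.
    res = minimize(lambda r: cost(vsl, r), rm, bounds=[(0.0, 1.0)] * num_ramps)
    rm = res.x
    # Step 2: discrete subproblem with the RM rates fixed (coordinate-wise enumeration).
    for i in range(num_segments):
        vsl[i] = min(VSL_OPTIONS, key=lambda v: cost(np.r_[vsl[:i], v, vsl[i+1:]], rm))

print("VSL:", vsl, "RM:", rm, "cost:", cost(vsl, rm))
```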
47. EEG-Based Emotion Recognition via Neural Architecture Search.
- Author
-
Li, Chang, Zhang, Zhongzhen, Song, Rencheng, Cheng, Juan, Liu, Yu, and Chen, Xun
- Abstract
With the flourishing development of deep learning (DL) and the convolutional neural network (CNN), electroencephalogram-based (EEG) emotion recognition plays an increasingly crucial role in the field of the brain-computer interface (BCI). However, currently employed architectures have mostly been designed manually by human experts, which is a time-consuming and labor-intensive process. In this paper, we propose a novel neural architecture search (NAS) framework based on reinforcement learning (RL) for EEG-based emotion recognition, which can automatically design network architectures. The proposed NAS mainly contains three parts: search strategy, search space, and evaluation strategy. During the search process, a recurrent neural network (RNN) controller is used to select the optimal network structure in the search space. We trained the controller with RL to maximize the expected reward of the generated models on a validation set and enforce parameter sharing among the models. We evaluated the performance of NAS on the DEAP and DREAMER datasets. On the DEAP dataset, the average accuracies reached 97.94%, 97.74%, and 97.82% on arousal, valence, and dominance, respectively. On the DREAMER dataset, the average accuracies reached 96.62%, 96.29%, and 96.61% on arousal, valence, and dominance, respectively. The experimental results demonstrate that the proposed NAS outperforms the state-of-the-art CNN-based methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
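A compact sketch of reinforcement-learning-based architecture search in the spirit of the preceding abstract is given below. For brevity the RNN controller is replaced by independent per-decision softmax policies updated with REINFORCE and a moving-average baseline, and the reward (validation accuracy of the sampled EEG model) is stubbed out; the search space and all hyperparameters are assumptions.
```python
# REINFORCE-style neural architecture search loop (simplified, no RNN controller).
import numpy as np

rng = np.random.default_rng(0)
SEARCH_SPACE = {                      # illustrative decisions, not the paper's space
    "kernel":  [3, 5, 7],
    "filters": [16, 32, 64],
    "pool":    ["max", "avg"],
}
logits = {k: np.zeros(len(v)) for k, v in SEARCH_SPACE.items()}

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def reward(arch):
    """Stub for 'train the sampled model and return validation accuracy'."""
    return (0.8 + 0.05 * (arch["kernel"] == 5) + 0.05 * (arch["filters"] == 64)
            + 0.01 * rng.standard_normal())

lr, baseline = 0.1, 0.0
for step in range(200):
    probs = {k: softmax(z) for k, z in logits.items()}
    idx = {k: rng.choice(len(p), p=p) for k, p in probs.items()}   # sample architecture
    arch = {k: SEARCH_SPACE[k][i] for k, i in idx.items()}
    r = reward(arch)
    baseline = 0.9 * baseline + 0.1 * r            # moving-average baseline
    for k in logits:                               # REINFORCE policy-gradient update
        grad = -probs[k]
        grad[idx[k]] += 1.0
        logits[k] += lr * (r - baseline) * grad

print({k: SEARCH_SPACE[k][int(np.argmax(z))] for k, z in logits.items()})
```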
48. Towards Software-Defined Delay Tolerant Networks.
- Author
-
Ta, Dominick, Booth, Stephanie, and Dudukovich, Rachel
- Subjects
DELAY-tolerant networks ,SOFTWARE-defined networking ,SCHEDULING ,SCALABILITY ,COMPUTER architecture - Abstract
This paper proposes a Software-Defined Delay Tolerant Networking (SDDTN) architecture as a solution to managing large Delay Tolerant Networking (DTN) networks in a scalable manner. This work is motivated by the planned deployments of large DTN networks on the Moon and beyond in deep space. Current space communication involves relatively few nodes and is heavily deterministic and scheduled, which will not be true in the future. It is unclear how these large space DTN networks, consisting of inherently intermittent links, will be able to adapt to dynamically changing network conditions. In addition to the proposed SDDTN architecture, this paper explores data plane programming and the Programming Protocol-Independent Packet Processors (P4) language as a possible method of implementing this SDDTN architecture, enumerates the challenges of this approach, and presents intermediate results. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Probabilistic Classification Method of Spiking Neural Network Based on Multi-Labeling of Neurons.
- Author
-
Sung, Mingyu, Kim, Jaesoo, and Kang, Jae-Mo
- Subjects
DEEP learning ,COMPUTER architecture ,ARTIFICIAL intelligence ,NEURONS ,ENERGY consumption - Abstract
Recently, deep learning has exhibited outstanding performance in various fields. Even though artificial intelligence achieves excellent performance, the amount of energy required for its computations has increased with its development. Hence, the need for a new energy-efficient computer architecture has emerged, which leads us to the neuromorphic computer. Although neuromorphic computing exhibits several advantages, such as low-power parallelism, it achieves lower accuracy than deep learning. Therefore, the major challenge is to improve the accuracy while maintaining the energy efficiency specific to neuromorphic computing. In this paper, we propose a novel inference method that considers the probability, after the learning process is complete, that a neuron reacts to multiple target labels. Our proposed method achieves improved accuracy while maintaining the hardware-friendly, low-power parallel processing characteristics of a neuromorphic processor. The method converts the spike counts occurring in the learning process into probabilities, and the inference process accounts for the interaction between neurons by considering all the spikes that occur. The inference circuit is expected to show a significant reduction in hardware cost while affording a competitive computing performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
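The multi-labeling idea in the preceding abstract can be sketched as follows: training-time spike counts of each output neuron are normalized into a probability distribution over class labels, and at inference a test sample's spike counts weight those distributions to score the classes. The data, sizes, and scoring rule here are assumptions chosen only to illustrate the principle.
```python
# Spike counts per class -> per-neuron label probabilities -> probabilistic inference.
import numpy as np

num_neurons, num_classes = 6, 3
rng = np.random.default_rng(0)

# Assumed training statistic: spike counts of each neuron accumulated per class label.
train_counts = rng.integers(1, 50, size=(num_neurons, num_classes)).astype(float)

# Multi-labeling: each neuron gets a probability distribution over the classes it
# responded to, rather than a single hard label.
neuron_probs = train_counts / train_counts.sum(axis=1, keepdims=True)

def classify(test_spikes):
    """test_spikes: spike count of each output neuron for one test sample."""
    scores = test_spikes @ neuron_probs        # weight class probabilities by activity
    return int(np.argmax(scores)), scores / scores.sum()

spikes = np.array([12, 0, 3, 7, 1, 9], dtype=float)
label, posterior = classify(spikes)
print("predicted class:", label, "posterior:", np.round(posterior, 3))
```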
50. Sharing non‐cache‐coherent memory with bounded incoherence.
- Author
-
Ren, Yuxin, Parmer, Gabriel, and Milojicic, Dejan
- Subjects
CACHE memory ,MEMORY ,MODERN architecture ,COMPUTER architecture ,INFORMATION sharing ,MANAGEMENT controls - Abstract
Summary: Cache coherence in modern computer architectures enables easier programming by sharing data across multiple processors. Unfortunately, it can also limit scalability due to cache coherency traffic triggered by competing memory accesses. Rack-scale systems introduce shared memory across a whole rack, but without inter-node cache coherence. This poses memory management and concurrency control challenges for applications that must explicitly manage cache lines. To fully utilize rack-scale systems for low-latency and scalable computation, applications need to maintain cached memory accesses in spite of non-coherence. This paper introduces Bounded Incoherence, a memory consistency model that enables cached access to shared data structures in non-cache-coherent memory. It ensures that updates to memory on one node are visible on all other nodes within at most a bounded amount of time. We evaluate this memory model on a modified PowerGraph graph processing framework and boost its performance by 30% on eight sockets by enabling cached access to data structures. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
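A toy sketch of a bounded-staleness read path in the spirit of Bounded Incoherence: a reader may serve data from its local cache, but once the cached copy is older than the staleness bound it must re-read the authoritative copy from shared memory. The class, the bound, and the dictionary standing in for rack-scale shared memory are illustrative assumptions, not the paper's implementation.
```python
# Bounded-staleness cached reads over non-coherent shared memory (toy model).
import time

STALENESS_BOUND = 0.010        # seconds: updates become visible within this bound

class BoundedIncoherentCache:
    def __init__(self, shared_memory):
        self.shared = shared_memory          # stands in for rack-scale shared memory
        self.cache = {}                      # key -> (value, time_cached)

    def read(self, key):
        entry = self.cache.get(key)
        now = time.monotonic()
        if entry is not None and now - entry[1] <= STALENESS_BOUND:
            return entry[0]                  # cached access: possibly stale, but bounded
        value = self.shared[key]             # refresh: re-read the authoritative copy
        self.cache[key] = (value, now)
        return value

shared = {"counter": 0}
reader = BoundedIncoherentCache(shared)
print(reader.read("counter"))    # 0, fetched from shared memory
shared["counter"] = 1            # a remote node updates the shared value
print(reader.read("counter"))    # may still return 0 within the bound
time.sleep(STALENESS_BOUND)
print(reader.read("counter"))    # 1, guaranteed visible after the bound
```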