12 results on '"Eduard Ayguadé"'
Search Results
2. On the maturity of parallel applications for asymmetric multi-core processors
- Author
Rosa M. Badia, Miquel Moreto, Marc Casas, Kallia Chronaki, Mateo Valero, Alejandro Rico, Eduard Ayguadé, Ministerio de Economía y Competitividad (España), Generalitat de Catalunya, European Commission, Chronaki, Kallia, Badia, Rosa M., Ayguade, Eduard, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions, Chronaki, Kallia [0000-0003-4579-8151], Badia, Rosa M. [0000-0003-2941-5499], and Ayguade, Eduard [0000-0002-5146-103X]
- Subjects
Computer Networks and Communications ,Computer science ,Parallel programming ,Parallel programming (Computer science) ,02 engineering and technology ,Programació en paral·lel (Informàtica) ,Power budget ,Theoretical Computer Science ,Scheduling (computing) ,Runtime system ,Artificial Intelligence ,Asymmetric multi-cores ,0202 electrical engineering, electronic engineering, information engineering ,Runtime systems ,Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC] ,Multi-core processor ,Scheduling ,business.industry ,020206 networking & telecommunications ,Supercomputers ,Supercomputer ,Hardware and Architecture ,Embedded system ,HPC ,Superordinadors ,020201 artificial intelligence & image processing ,High performance computing ,business ,Càlcul intensiu (Informàtica) ,Software
- Abstract
Asymmetric multi-cores (AMCs) are a successful architectural solution for both mobile devices and supercomputers. By maintaining two types of cores (fast and slow), AMCs are able to provide high performance under the facility power budget. This paper performs the first extensive evaluation of how portable current HPC applications are to such supercomputing systems. Specifically, we evaluate several execution models on an ARM big.LITTLE AMC using the PARSEC benchmark suite, which includes representative highly parallel applications. We compare schedulers at the user, OS and runtime levels, using both static and dynamic options and multiple configurations, and assess the impact of these options on the well-known problem of balancing the load across AMCs. Our results demonstrate that scheduling is more effective when it takes place at the runtime system level, as it improves the baseline by 23%, while the heterogeneous-aware OS scheduling solution improves the baseline by 10%. This work has been supported by the RoMoL ERC Advanced Grant (GA 321253), by the European HiPEAC Network of Excellence, by the Spanish Ministry of Science and Innovation (contracts TIN2015-65316-P), by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), and by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 671697 and No. 779877. Kallia Chronaki has been partially supported by the Ministry of Economy and Competitiveness under Ramon y Cajal fellowship number RYC-2016-21104.
- Published
- 2019
3. EMVS: Embedded Multi Vector-core System
- Author
Adrian Cristal, Eduard Ayguadé, Amna Haider, Tassadaq Hussain, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, and Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
- Subjects
Digital electronics ,Parallel processing (Electronic computers) ,business.industry ,Computer science ,Processament en paral·lel (Ordinadors) ,Multiprocessadors ,02 engineering and technology ,Ordinadors immersos, Sistemes d' ,Embedded computer systems ,Porting ,020202 computer hardware & architecture ,Software portability ,Hardware and Architecture ,Embedded system ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,Systems architecture ,Multiprocessors ,Core system ,020201 artificial intelligence & image processing ,business ,Field-programmable gate array ,Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC] ,Software ,Energy (signal processing)
- Abstract
With the increase in the density and performance of digital electronics, the demand for power-efficient high-performance computing (HPC) systems for embedded applications has increased. The existing embedded HPC systems suffer from issues like programmability, scalability, and portability. Therefore, a parameterizable and programmable high-performance processor system architecture is required to execute the embedded HPC applications. In this work, we propose an Embedded Multi Vector-core System (EMVS) which executes the embedded application by managing the multiple vectorized tasks and their memory operations. The system is designed and ported on an Altera DE4 FPGA development board. The performance of EMVS is compared with the Heterogeneous Multi-Processing Odroid XU3, Parallela and GPU Jetson TK1 embedded systems. Compared with these embedded systems, the results show that EMVS improves application and system performance by 19.28 and 10.22 times, respectively, and consumes 10.6 times less energy.
- Published
- 2018
4. A visual embedding for the unsupervised extraction of abstract semantics
- Author
Javier Béjar, Richard Chen, Dario Garcia-Gasulla, Jesús Labarta, Ulises Cortés, Eduard Ayguadé, Toyotaro Suzumura, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions, Universitat Politècnica de Catalunya. KEMLG - Grup d'Enginyeria del Coneixement i Aprenentatge Automàtic, and Barcelona Supercomputing Center
- Subjects
FOS: Computer and information sciences ,Artificial image cognition ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Cognitive Neuroscience ,Computer Science - Computer Vision and Pattern Recognition ,WordNet ,Experimental and Cognitive Psychology ,02 engineering and technology ,010501 environmental sciences ,Visual reasoning ,Semantics ,01 natural sciences ,Cognitive learning ,Ensenyament i aprenentatge::Metodologies docents [Àrees temàtiques de la UPC] ,Machine Learning (cs.LG) ,Artificial Intelligence ,0202 electrical engineering, electronic engineering, information engineering ,Neural and Evolutionary Computing (cs.NE) ,0105 earth and related environmental sciences ,Artificial neural network ,business.industry ,Deep learning ,Computer Science - Neural and Evolutionary Computing ,Pattern recognition ,Computer Science - Learning ,Deep learning embeddings ,Aprenentatge cognitiu ,Embedding ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Software ,Word (computer architecture) ,Vector space
- Abstract
Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art deep learning architecture GoogLeNet for 20K images obtained from ImageNet. We first evaluate the resultant vector-space semantics through its correlation with WordNet distances, and find vector distances to be strongly correlated with linguistic semantics. We then explore the location of images within the vector space, finding elements close in WordNet to be clustered together, regardless of significant visual variances (e.g. 118 dog types). More surprisingly, we find that the space separates complex classes in an unsupervised manner, without prior knowledge (e.g. living things). Afterwards, we consider vector arithmetics. Although we are unable to obtain meaningful results in this regard, we discuss the various problems we encountered and how we intend to solve them. Finally, we discuss the impact of our research for cognitive systems, focusing on the role of the architecture being used. Comment: 14 pages, 5 figures, accepted at Cognitive Systems Research
- Published
- 2017
5. Automatic Query Driven Data Modelling in Cassandra
- Author
Eduard Ayguadé, Roger Hernandez, Yolanda Becerra, Jordi Torres, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, and Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
- Subjects
Computer science ,Reliability (computer networking) ,Distributed computing ,Informàtica::Sistemes d'informació [Àrees temàtiques de la UPC] ,Big data ,NoSQL ,computer.software_genre ,Data modeling ,models ,Consistency (database systems) ,Database management ,Supercomputadors ,big data ,Models ,Informàtica::Sistemes d'informació::Emmagatzematge i recuperació de la informació [Àrees temàtiques de la UPC] ,Code (cryptography) ,Information retrieval ,dynamic model selection ,Bases de dades -- Gestió ,Cassandra ,Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC] ,General Environmental Science ,Recuperació de la informació ,Database ,business.industry ,Macrodades ,Heterogeneous replication ,Supercomputers ,Replication (computing) ,cassandra ,Data model ,nosql ,Dynamic model selection ,heterogeneous replication ,General Earth and Planetary Sciences ,High performance computing ,business ,computer ,Càlcul intensiu (Informàtica)
- Abstract
Non-relational databases have recently been the preferred choice when it comes to dealing with Big Data challenges, but their performance is very sensitive to the chosen data organisation. We have seen differences of over 70 times in response time for the same query on different models. This requires users to be fully aware of the queries they intend to serve in order to design their data model. The common practice, then, is to replicate data into different models designed to fit different query requirements. In this scenario, the user is in charge of the code implementation required to keep consistency between the different data replicas. Manually replicating data in such high layers of the database results in a lot of squandered storage, because the underlying system replication mechanisms were originally designed for availability and reliability purposes. We propose and design a mechanism and a prototype to provide users with transparent management, where queries are matched with a well-performing model option. Additionally, we propose to do so by transforming the replication mechanism into a heterogeneous replication one, in order to avoid squandering disk space while keeping the availability and reliability features. The result is a system where, regardless of the query or model the user specifies, response time will always be that of an affine query.
- Published
- 2015
6. PMSS: A programmable memory system and scheduler for complex memory patterns
- Author
Amna Haider, Tassadaq Hussain, Eduard Ayguadé, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, and Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
- Subjects
MicroBlaze ,Speedup ,Computer Networks and Communications ,Data parallelism ,Computer science ,Parallel programming (Computer science) ,Programació en paral·lel (Informàtica) ,Theoretical Computer Science ,Scheduling (computing) ,law.invention ,Artificial Intelligence ,law ,Informàtica::Sistemes d'informació::Emmagatzematge i recuperació de la informació [Àrees temàtiques de la UPC] ,Ordinadors -- Dispositius de memòria ,Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC] ,Field-programmable gate array ,Xilkernel ,FPGA ,Supercomputer ,Computer storage devices ,DRAM ,Microprocessor ,Memory management ,Computer architecture ,Hardware and Architecture ,HPC ,Software ,Dram
- Abstract
The HPC industry demands more computing units on FPGAs to enhance performance through task/data parallelism. FPGAs can provide their ultimate performance on certain kernels by customizing the hardware for the applications. However, applications are getting more complex, with multiple kernels and complex data arrangements, generating overhead while scheduling/managing system resources. For this reason, all classes of multi-threaded machines, from minicomputers to supercomputers, require an efficient hardware scheduler and memory manager that improve the effective bandwidth and latency of the DRAM main memory. This architecture could be a very competitive choice for supercomputing systems that meets the demand of parallelism for HPC benchmarks. In this article, we propose a Programmable Memory System and Scheduler (PMSS), which provides high-speed complex data access patterns to the multi-threaded architecture. The proposed PMSS system is implemented and tested on a Xilinx ML505 evaluation FPGA board. The performance of the system is compared with a microprocessor-based system that has been integrated with the Xilkernel operating system. Results show that the modified PMSS-based multi-accelerator system consumes 50% less hardware resources and 32% less on-chip power, and achieves approximately a 19x speedup compared to the MicroBlaze-based system.
- Published
- 2014
7. Programmability and portability for exascale: Top down programming methodology and tools with StarSs
- Author
Jesús Labarta, José Gracia, Vladimir Marjanovic, Rosa M. Badia, Eduard Ayguadé, Mateo Valero, Christoph Niethammer, Vladimir Subotic, Steffen Brinkmann, Ministerio de Economía y Competitividad (España), Generalitat de Catalunya, and European Commission
- Subjects
General Computer Science ,Computer science ,Programming language ,4. Education ,Concurrency ,Parallel programming models ,Development tools ,020207 software engineering ,02 engineering and technology ,Top-down and bottom-up design ,computer.software_genre ,Theoretical Computer Science ,Scheduling (computing) ,Software portability ,Modeling and Simulation ,0202 electrical engineering, electronic engineering, information engineering ,Programming paradigm ,020201 artificial intelligence & image processing ,Performance analysis tools ,Compiler ,Programmer ,computer ,Debugger
- Abstract
StarSs is a task-based programming model that allows sequential applications to be parallelized by annotating the code with compiler directives. The model further supports transparent execution of designated tasks on heterogeneous platforms, including clusters of GPUs. This paper focuses on the methodology and tools that complement the programming model, forming a consistent development environment with the objective of simplifying the life of application developers. The programming environment includes the tools TAREADOR and TEMANEJO, which have been designed specifically for StarSs. TAREADOR, a Valgrind-based tool, allows a top-down development approach by assisting the programmer in identifying tasks and their data dependencies across all concurrency levels of an application. TEMANEJO is a graphical debugger that supports the programmer by visualizing the task dependency tree and also allows task scheduling or dependencies to be manipulated. These tools are complemented with a set of performance analysis tools (Scalasca, Cube and Paraver) that enable fine-tuning of StarSs applications. © 2013 Elsevier B.V. We thankfully acknowledge the support of the European Commission through the TEXT project (FP7-261580) and the HiPEAC-3 Network of Excellence (FP7-ICT 287759), and the support of the Spanish Ministry of Education (TIN2007-60625, TIN2012-34557 and CSD2007-00050), the Generalitat de Catalunya (2009-SGR-980)
- Published
- 2013
8. Energy accounting for shared virtualized environments under DVFS using PMC-based power models
- Author
Jordi Torres, Marc González, David Carrera, Nacho Navarro, Vicenç Beltran, Yolanda Becerra, Ramon Bertran, Xavier Martorell, and Eduard Ayguadé
- Subjects
Flexibility (engineering) ,Computer Networks and Communications ,Computer science ,business.industry ,Distributed computing ,Energy consumption ,Virtualization ,computer.software_genre ,Energy accounting ,Hardware and Architecture ,Virtual machine ,Embedded system ,Central processing unit ,Frequency scaling ,business ,computer ,Software
- Abstract
Virtualized infrastructure providers demand new methods to increase the accuracy of the accounting models used to charge their customers. Future data centers will be composed of many-core systems that will host a large number of virtual machines (VMs) each. While resource utilization accounting can be achieved with existing system tools, energy accounting is a complex task when per-VM granularity is the goal. In this paper, we propose a methodology that brings new opportunities to energy accounting by adding an unprecedented degree of accuracy on the per-VM measurements. We present a system, which leverages CPU and memory power models based on performance monitoring counters (PMCs), to perform energy accounting in virtualized systems. The contribution of this paper is threefold. First, we show that PMC-based power modeling methods are still valid on virtualized environments. Second, we show that the Dynamic Voltage and Frequency Scaling (DVFS) mechanism, which is commonly used by infrastructure providers to avoid power and thermal emergencies, does not affect the accuracy of the models. And third, we introduce a novel methodology for accounting of energy consumption in virtualized systems. Accounting is done on a per-VM basis, even in the case where multiple VMs are deployed on top of the same physical hardware, bypassing the limitations of per-server aggregated power metering. Overall, the results for an Intel® Core™ 2 Duo show errors in energy estimations
- Published
- 2012
9. Designing an overload control strategy for secure e-commerce applications
- Author
Jordi Guitart, David Carrera, Jordi Torres, Eduard Ayguadé, and Vicenç Beltran
- Subjects
Transport Layer Security ,Computer Networks and Communications ,business.industry ,Application server ,Computer science ,Access control ,E-commerce ,Admission control ,computer.software_genre ,Computer security ,Key (cryptography) ,Session (computer science) ,business ,computer ,Computer network
- Abstract
Uncontrolled overload can lead e-commerce applications to considerable revenue losses. For this reason, overload prevention in these applications is a critical issue. In this paper we present a complete characterization of the scalability of secure e-commerce applications to determine which performance bottlenecks must be considered for an overload control strategy. With this information, we design an adaptive session-based overload control strategy based on SSL (Secure Sockets Layer) connection differentiation and admission control. The SSL connection differentiation is a key factor because the cost of establishing a new SSL connection is much greater than that of establishing a resumed SSL connection (which reuses an existing SSL session on the server). Considering this big difference, we have implemented an admission control algorithm that prioritizes resumed SSL connections to maximize the performance in session-based environments and dynamically limits the number of new SSL connections accepted, according to the available resources and the current number of connections in the system, in order to avoid server overload. Our evaluation on a Tomcat server demonstrates the benefit of our proposal for preventing server overload.
- Published
- 2007
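Note on result 9: the admission control idea summarized in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' algorithm: the class name, the abstract "capacity units", and the 5:1 cost ratio between new and resumed connections are all assumptions made for the example.

```python
class SessionAdmissionControl:
    """Toy admission controller that favors resumed SSL connections.

    All numbers are hypothetical: the abstract only states that a new SSL
    connection costs much more than a resumed one, not by how much.
    """

    def __init__(self, capacity_units, new_cost=5, resumed_cost=1):
        self.capacity_units = capacity_units  # assumed server budget
        self.new_cost = new_cost              # assumed cost of a full handshake
        self.resumed_cost = resumed_cost      # assumed cost of a resumed handshake
        self.used = 0                         # budget consumed by active connections

    def admit(self, resumed):
        """Return True if the connection is accepted."""
        if resumed:
            # Resumed connections are prioritized so existing sessions can finish.
            self.used += self.resumed_cost
            return True
        # New connections are limited by the remaining budget.
        if self.used + self.new_cost <= self.capacity_units:
            self.used += self.new_cost
            return True
        return False

    def finish(self, resumed):
        """Release the budget held by a completed connection."""
        cost = self.resumed_cost if resumed else self.new_cost
        self.used = max(0, self.used - cost)


ctrl = SessionAdmissionControl(capacity_units=10)
decisions = [ctrl.admit(resumed=False) for _ in range(3)]  # third new one is refused
resumed_ok = ctrl.admit(resumed=True)                      # still accepted
```

In this sketch the limit on new connections follows the current load, which mirrors the "dynamically limits" wording of the abstract only loosely.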
10. Tools and techniques for automatic data layout: A case study
- Author
Eduard Ayguadé, Jordi Garcia, Ulrich Kremer, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, and Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
- Subjects
Linear 0–1 integer ,Computer Networks and Communications ,Computer science ,Computation ,Distribution ,Theoretical Computer Science ,Distributed memory multiprocessor ,Automatic data and computation partitioning ,Programming technology ,Remapping ,Artificial Intelligence ,Informàtica::Arquitectura de computadors::Arquitectures paral·leles [Àrees temàtiques de la UPC] ,Alignment ,Parallel processing (Electronic computers) ,Data layout ,business.industry ,Processament en paral·lel (Ordinadors) ,Computer Graphics and Computer-Aided Design ,Computer architecture ,Kernel (image processing) ,Hardware and Architecture ,Programming paradigm ,Distributed memory ,Data locality ,Software engineering ,business ,Software
- Abstract
Parallel architectures with physically distributed memory providing computing cycles and large amounts of memory are becoming more and more common. To make such architectures truly usable, programming models and support tools are needed to ease the programming effort for these parallel systems. Automatic data distribution tools and techniques play an important role in achieving that goal. This paper discusses state-of-the-art approaches to fully automatic data and computation partitioning. A kernel application is used as a case study to illustrate the main differences of four representative approaches. The paper concludes with a discussion of promising future research directions for automatic data layout.
- Published
- 1998
11. Conflict-free access to streams in multiprocessor systems
- Author
Mateo Valero, Eduard Ayguadé, Montse Peiron, and Tomás Lang
- Subjects
Scheme (programming language) ,Hardware_MEMORYSTRUCTURES ,Asynchronous communication ,Computer science ,Distributed computing ,General Engineering ,Multiprocessing ,STREAMS ,Conflict free ,computer ,computer.programming_language
- Abstract
The simultaneous access to several vectors is typical in vector multiprocessors. When these accesses are performed in an asynchronous manner, collisions in the network and the conflicts in the memory modules produce high latencies that reduce the efficiency of the system. In this paper we propose a block-interleaved storage scheme to store streams as well as a synchronized out-of-order access mechanism to the vectors that compose the stream so no access conflicts occur for several families of strides.
- Published
- 1993
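Note on result 11: the abstract does not describe the block-interleaved scheme itself, so the sketch below only illustrates the underlying problem it addresses. It assumes plain low-order interleaving (word address modulo the number of memory modules), which is a textbook baseline and not the scheme proposed in the paper; the function names are hypothetical.

```python
from math import gcd


def modules_touched(stride, num_modules, length, base=0):
    """Memory modules hit by a strided vector access under low-order interleaving."""
    return {(base + i * stride) % num_modules for i in range(length)}


def distinct_modules(stride, num_modules):
    """Closed form: a long enough strided access reaches M / gcd(M, stride) modules."""
    return num_modules // gcd(stride, num_modules)


# With 8 modules, an odd stride spreads over all of them, while a stride
# of 8 sends every element to the same module, so accesses serialize.
spread = {s: len(modules_touched(s, num_modules=8, length=8)) for s in (1, 2, 3, 8)}
```

Strides that share a large common factor with the module count are the ones that concentrate accesses on few modules, which is why storage schemes are usually judged by the families of strides they serve without conflicts.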
12. Scheduling in a continuous area-time design space
- Author
Jordi Cortadella, Rosa M. Badia, and Eduard Ayguadé
- Subjects
Rate-monotonic scheduling ,Mathematical optimization ,Fixed-priority pre-emptive scheduling ,Spacetime ,Computer science ,Simulated annealing ,General Engineering ,Dynamic priority scheduling ,Fair-share scheduling ,Electronic circuit ,Scheduling (computing)
- Abstract
Operation scheduling and hardware allocation are the two most important phases in the synthesis of circuits from behavioral descriptions. This paper presents CYOS (CYcle time Optimizer and Scheduler), a new approach to the scheduling of operations in data-path synthesis. The main contribution consists in confronting the problem in its broadest sense, exploring both time and space as continuous variables of the design space. Cycle time is also one of the variables explored and optimized by CYOS. A Simulated Annealing based algorithm has been chosen to search through the area-time design space.
- Published
- 1991
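Note on result 12: the abstract names simulated annealing as the search method over a continuous area-time space. The sketch below shows generic simulated annealing over two continuous variables. It is not CYOS: the cost function, the parameter values, and the function names are hypothetical stand-ins chosen only to make the example runnable.

```python
import math
import random


def toy_cost(point):
    """Hypothetical area-time trade-off, NOT the CYOS objective."""
    area, cycle_time = point
    return area * cycle_time + 1.0 / area + 1.0 / cycle_time


def anneal(cost, start, step=0.2, t0=1.0, cooling=0.995, iters=3000, seed=0):
    """Generic simulated annealing; returns the best point seen and its cost."""
    rng = random.Random(seed)
    current, current_cost = list(start), cost(start)
    best, best_cost = list(current), current_cost
    temperature = t0
    for _ in range(iters):
        # Perturb every variable, keeping it strictly positive.
        candidate = [max(1e-3, x + rng.uniform(-step, step)) for x in current]
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature = max(temperature * cooling, 1e-9)  # geometric cooling
    return best, best_cost


start_point = [4.0, 4.0]
best_point, best_value = anneal(toy_cost, start_point)
```

Tracking the best point separately from the current one means the returned cost can never be worse than the starting cost, whatever the random walk does later.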