131 results on '"Marazakis, Manolis"'
Search Results
2. The ExaNeSt Prototype: Evaluation of Efficient HPC Communication Hardware in an ARM-based Multi-FPGA Rack
- Author
-
Ploumidis, Manolis, Chaix, Fabien, Chrysos, Nikolaos, Assiminakis, Marios, Flouris, Vassilis, Kallimanis, Nikolaos, Kossifidis, Nikolaos, Nikoloudakis, Michael, Petrakis, Polydoros, Dimou, Nikolaos, Gianioudis, Michael, Ieronymakis, George, Ioannou, Aggelos, Kalokerinos, George, Xirouchakis, Pantelis, Ailamakis, George, Damianakis, Astrinos, Ligerakis, Michael, Makris, Ioannis, Vavouris, Theocharis, Katevenis, Manolis, Papaefstathiou, Vassilis, Marazakis, Manolis, and Mavroidis, Iakovos
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
We present and evaluate the ExaNeSt Prototype, a liquid-cooled rack prototype consisting of 256 Xilinx ZU9EG MPSoCs, 4 TBytes of DRAM, 16 TBytes of SSD, and configurable 10-Gbps interconnection hardware. We developed this testbed in 2016-2019 to validate the flexibility of FPGAs for experimenting with efficient hardware support for HPC communication among tens of thousands of processors and accelerators in the quest towards Exascale systems and beyond. We present our key design choices regarding overall system architecture, PCBs and runtime software, and summarize insights resulting from measurement and analysis. Of particular note, our custom interconnect includes a low-cost low-latency network interface, offering user-level zero-copy RDMA, which we have tightly coupled with the ARMv8 processors in the MPSoCs. We have developed a system software runtime on top of these features, and have been able to run MPI. We have evaluated our testbed through MPI microbenchmarks, mini-applications, and full MPI applications. Single-hop, one-way latency is 1.3 μs; approximately 0.47 μs of this is attributed to the network interface and the user-space library that exposes its functionality to the runtime. Latency over longer paths increases as expected, reaching 2.55 μs for a five-hop path. Bandwidth tests show that, for a single hop, link utilization reaches 82% of the theoretical capacity. Microbenchmarks based on MPI collectives reveal that broadcast latency scales as expected when the number of participating ranks increases. We also implemented a custom Allreduce accelerator in the network interface, which reduces the latency of such collectives by up to 88%. We assess performance scaling through weak and strong scaling tests for HPCG, LAMMPS, and the miniFE mini-application; for all these tests, parallelization efficiency is at least 69%.
Comment: 45 pages, 23 figures
- Published
- 2023
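A rough reading of the latency figures quoted in the ExaNeSt abstract above can be worked out with a few lines of arithmetic. The sketch below assumes that latency grows linearly with hop count; the abstract only says latency "increases as expected", so the linear model and the helper names are our own assumptions, not the authors' analysis.

```python
# Figures quoted in the abstract (microseconds).
ONE_HOP_US = 1.3       # single-hop, one-way latency
FIVE_HOP_US = 2.55     # latency over a five-hop path
NI_AND_LIB_US = 0.47   # part attributed to network interface + user-space library


def extra_latency_per_hop(one_hop_us, n_hop_us, n_hops):
    """Average latency added by each hop beyond the first.

    Assumes linear growth with hop count (an assumption of this sketch).
    """
    if n_hops < 2:
        raise ValueError("need a path of at least two hops")
    return (n_hop_us - one_hop_us) / (n_hops - 1)


def share(part_us, total_us):
    """Fraction of the total latency taken by one component."""
    return part_us / total_us


per_hop = extra_latency_per_hop(ONE_HOP_US, FIVE_HOP_US, 5)
ni_share = share(NI_AND_LIB_US, ONE_HOP_US)

print(f"extra latency per additional hop: about {per_hop:.2f} us")
print(f"network interface + library share of one-hop latency: about {ni_share:.0%}")
```

Under that linear assumption, each hop beyond the first adds a little over 0.3 μs, and the network interface plus user-space library account for roughly a third of the single-hop latency.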
3. Co-design and Software Architecture
- Author
-
Gheller, Claudio, Marazakis, Manolis, Suarez, Estela, Taffoni, Giuliano, Burton, W.B., Series Editor, Shore, Steven N., Series Editor, Vardoulaki, Eleni, editor, Dembska, Marta, editor, Drabent, Alexander, editor, and Hoeft, Matthias, editor
- Published
- 2024
4. Computing Infrastructure
- Author
-
Russo, Stefano Alberto, Suarez, Estela, Chazapis, Antony, Marazakis, Manolis, Taffoni, Giuliano, Burton, W.B., Series Editor, Shore, Steven N., Series Editor, Vardoulaki, Eleni, editor, Dembska, Marta, editor, Drabent, Alexander, editor, and Hoeft, Matthias, editor
- Published
- 2024
5. Case Studies on the Impact and Challenges of Heterogeneous NUMA Architectures for HPC
- Author
-
Zaourar, Lilia, Benazouz, Mohamed, Mouhagir, Ayoub, Falquez, Carlos, Portero, Antoni, Ho, Nam, Suarez, Estela, Petrakis, Polydoros, Marazakis, Manolis, Sgherzi, Francesco, Fernandez, Ivan, Dolbeau, Romain, Pleiter, Dirk, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Fey, Dietmar, editor, Stabernack, Benno, editor, Lankes, Stefan, editor, Pacher, Mathias, editor, and Pionteck, Thilo, editor
- Published
- 2024
6. Frisbee: automated testing of Cloud-native applications in Kubernetes
- Author
-
Nikolaidis, Fotis, Chazapis, Antony, Marazakis, Manolis, and Bilas, Angelos
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
As more and more companies migrate (or plan to migrate) from on-premise to the Cloud, their focus is on finding anomalies and deficits as early as possible in the development life cycle. We propose Frisbee, a declarative language and associated runtime components for testing cloud-native applications on top of Kubernetes. Given a template describing the system under test and a workflow describing the experiment, Frisbee automatically interfaces with Kubernetes to deploy the necessary software in containers, launch needed sidecars, execute the workflow steps, and perform automated checks for deviation from expected behavior. We evaluate Frisbee through a series of tests to demonstrate its role in designing and evaluating cloud-native applications; Frisbee helps in testing uncertainties at the level of application (e.g., dynamically changing request patterns), infrastructure (e.g., crashes, network partitions), and deployment (e.g., saturation points). Our findings have strong implications for the design, deployment, and evaluation of cloud applications. The most prominent are that erroneous benchmark outputs can cause an apparent performance improvement, that automated failover mechanisms may require interoperability with clients, and that a proper placement policy should also account for the clock frequency, not only the number of cores.
- Published
- 2021
7. Improving the Performance and Resilience of MPI Parallel Jobs with Topology and Fault-Aware Process Placement
- Author
-
Vardas, Ioannis, Ploumidis, Manolis, and Marazakis, Manolis
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing, C.4
- Abstract
HPC systems keep growing in size to meet the ever-increasing demand for performance and computational resources. Apart from increased performance, large-scale systems face two challenges that hinder further growth: energy efficiency and resiliency. At the same time, applications seeking increased performance rely on advanced parallelism for exploiting system resources, which leads to increased pressure on system interconnects. At large system scales, increased communication locality can be beneficial both in terms of application performance and energy consumption. To this end, several studies focus on deriving a mapping of an application's processes to system nodes in a way that reduces communication cost. A common approach is to express both the application's communication patterns and the system architecture as graphs and then solve the corresponding mapping problem. Apart from communication cost, the completion time of a job can also be affected by node failures. Node failures may result in job abortions, requiring job restarts. In this paper, we address the problem of assigning processes to system resources with the goal of reducing communication cost while also taking into account node failures. The proposed approach is integrated into the Slurm resource manager. Evaluation results show that, in scenarios where few nodes have a low outage probability, the proposed process placement approach achieves a notable decrease in the completion time of batches of MPI jobs. Compared to the default process placement approach in Slurm, the reduction is 18.9% and 31%, respectively, for two different MPI applications.
Comment: 21 pages, 8 figures, added Acknowledgements section
- Published
- 2020
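The mapping problem described in the abstract above (placing processes on nodes to reduce communication cost while accounting for node failures) can be illustrated with a toy cost model. The sketch below is not the paper's algorithm and is not tied to Slurm: the cost function, the independent-failure assumption, the `restart_penalty` parameter, and the exhaustive search are all hypothetical choices made only to show the shape of the trade-off.

```python
from itertools import permutations


def placement_cost(mapping, traffic, distance, outage_prob, restart_penalty):
    """Communication cost plus an expected restart penalty (toy model).

    mapping[p] is the node hosting process p.
    traffic maps a process pair (a, b) to a communication volume.
    distance[i][j] is the hop distance between nodes i and j.
    outage_prob[i] is the probability that node i fails (assumed independent).
    """
    comm = sum(vol * distance[mapping[a]][mapping[b]]
               for (a, b), vol in traffic.items())
    # Probability that every node used by the job stays up.
    p_all_up = 1.0
    for node in set(mapping):
        p_all_up *= 1.0 - outage_prob[node]
    return comm + restart_penalty * (1.0 - p_all_up)


def best_placement(n_procs, nodes, traffic, distance, outage_prob, restart_penalty):
    """Exhaustive search over one-process-per-node placements (tiny inputs only)."""
    best = None
    for candidate in permutations(nodes, n_procs):
        cost = placement_cost(candidate, traffic, distance,
                              outage_prob, restart_penalty)
        if best is None or cost < best[0]:
            best = (cost, candidate)
    return best


# Three nodes in a line (0 - 1 - 2); node 1 is unreliable.
distance = [[0, 1, 2],
            [1, 0, 1],
            [2, 1, 0]]
outage_prob = [0.0, 0.5, 0.0]
traffic = {(0, 1): 10}  # processes 0 and 1 exchange 10 units

fault_aware = best_placement(2, [0, 1, 2], traffic, distance, outage_prob, 100)
topology_only = best_placement(2, [0, 1, 2], traffic, distance, outage_prob, 0)
print("fault-aware placement:", fault_aware)
print("topology-only placement:", topology_only)
```

In this toy setup the topology-only search picks adjacent nodes, including the unreliable one, while the fault-aware search accepts a longer path to avoid it. Real schedulers would need a heuristic rather than exhaustive search, since the number of candidate placements grows factorially.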
8. COMPESCE: A Co-design Approach for Memory Subsystem Performance Analysis in HPC Many-Cores
- Author
-
Portero, Antoni, Falquez, Carlos, Ho, Nam, Petrakis, Polydoros, Nassyr, Stepan, Marazakis, Manolis, Dolbeau, Romain, Cifuentes, Jorge Alejandro Nocua, Alvarez, Luis Bertran, Pleiter, Dirk, Suarez, Estela, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Goumas, Georgios, editor, Tomforde, Sven, editor, Brehm, Jürgen, editor, Wildermann, Stefan, editor, and Pionteck, Thilo, editor
- Published
- 2023
9. Running Kubernetes Workloads on HPC
- Author
-
Chazapis, Antony, Nikolaidis, Fotis, Marazakis, Manolis, Bilas, Angelos, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bienz, Amanda, editor, Weiland, Michèle, editor, Baboulin, Marc, editor, and Kruse, Carola, editor
- Published
- 2023
10. Event-Driven Chaos Testing for Containerized Applications
- Author
-
Nikolaidis, Fotis, Chazapis, Antony, Marazakis, Manolis, Bilas, Angelos, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bienz, Amanda, editor, Weiland, Michèle, editor, Baboulin, Marc, editor, and Kruse, Carola, editor
- Published
- 2023
11. Power and Performance Analysis of Persistent Key-Value Stores
- Author
-
Mikrou, Stella, Papagiannis, Anastasios, Saloustros, Giorgos, Marazakis, Manolis, and Bilas, Angelos
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Performance
- Abstract
With the current rate of data growth, processing needs are becoming difficult to fulfill due to CPU power and energy limitations. Data serving systems, and especially persistent key-value stores, have become a substantial part of data processing stacks in the data center, providing access to massive amounts of data for applications and services. Key-value stores exhibit high CPU and I/O overheads because of their constant need to reorganize data on the devices. In this paper, we examine the efficiency of two key-value stores on four servers of different generations and with different CPU architectures. We use RocksDB, a key-value store that is widely deployed, e.g. at Facebook, and Kreon, a research key-value store that has been designed to reduce CPU overhead. We evaluate their behavior and overheads on an ARM-based microserver and three different generations of x86 servers. Our findings show that microservers have better power efficiency in the range of 0.68-3.6x with a comparable tail latency.
- Published
- 2020
12. Shall numerical astrophysics step into the era of Exascale computing?
- Author
-
Taffoni, Giuliano, Murante, Giuseppe, Tornatore, Luca, Goz, David, Borgani, Stefano, Katevenis, Manolis, Chrysos, Nikolaos, and Marazakis, Manolis
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics, Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
High performance computing numerical simulations are today one of the more effective instruments to implement and study new theoretical models, and they are mandatory during the preparatory phase and operational phase of any scientific experiment. New challenges in Cosmology and Astrophysics will require a large number of new, extremely computationally intensive simulations to investigate physical processes at different scales. Moreover, the size and complexity of the new generation of observational facilities also imply a new generation of high performance data reduction and analysis tools, pushing toward the use of Exascale computing capabilities. Exascale supercomputers cannot be produced today. We discuss the major technological challenges in the design, development and use of such computing capabilities, and we report on the progress that has been made in recent years in Europe, in particular in the framework of the ExaNeSt European funded project. We also discuss the impact of these new computing resources on the numerical codes in Astronomy and Astrophysics.
Comment: 3 figures, invited talk for proceedings of ADASS XXVI, accepted by ASP Conference Series
- Published
- 2019
13. Interactive, Cloud-Native Workflows on HPC Using KNoC
- Author
-
Maliaroudakis, Evangelos, Chazapis, Antony, Kanterakis, Alexandros, Marazakis, Manolis, Bilas, Angelos, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Anzt, Hartwig, editor, Bienz, Amanda, editor, Luszczek, Piotr, editor, and Baboulin, Marc, editor
- Published
- 2022
14. Exploring the Impact of Node Failures on the Resource Allocation for Parallel Jobs
- Author
-
Vardas, Ioannis, Ploumidis, Manolis, Marazakis, Manolis, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Chaves, Ricardo, editor, B. Heras, Dora, editor, Ilic, Aleksandar, editor, Unat, Didem, editor, Badia, Rosa M., editor, Bracciali, Andrea, editor, Diehl, Patrick, editor, Dubey, Anshu, editor, Sangyoon, Oh, editor, L. Scott, Stephen, editor, and Ricci, Laura, editor
- Published
- 2022
15. Trace-Based Workload Generation and Execution
- Author
-
Sfakianakis, Yannis, Kanellou, Eleni, Marazakis, Manolis, Bilas, Angelos, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Sousa, Leonel, editor, Roma, Nuno, editor, and Tomás, Pedro, editor
- Published
- 2021
16. HugeMap: Optimizing Memory-Mapped I/O with Huge Pages for Fast Storage
- Author
-
Malliotakis, Ioannis, Papagiannis, Anastasios, Marazakis, Manolis, Bilas, Angelos, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Balis, Bartosz, editor, B. Heras, Dora, editor, Antonelli, Laura, editor, Bracciali, Andrea, editor, Gruber, Thomas, editor, Hyun-Wook, Jin, editor, Kuhn, Michael, editor, Scott, Stephen L., editor, Unat, Didem, editor, and Wyrzykowski, Roman, editor
- Published
- 2021
17. HugeMap: Optimizing Memory-Mapped I/O with Huge Pages for Fast Storage
- Author
-
Malliotakis, Ioannis, primary, Papagiannis, Anastasios, additional, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2021
18. Trace-Based Workload Generation and Execution
- Author
-
Sfakianakis, Yannis, primary, Kanellou, Eleni, additional, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2021
19. Redesign of astrophysical codes for exascale computing: the SPACE experience
- Author
-
Ibsen, Jorge, Chiozzi, Gianluca, Taffoni, Giuliano, Mignone, Andrea, Tornatore, Luca, Sciacca, Eva, Guarrasi, Massimiliano, Lapenta, Giovanni, Riha, Lubomir, Vavrik, Radim, Vysocky, Ondrej, Kadlubiak, Kristian, Strakos, Petr, Jaros, Milan, Dolag, Klaus, Commercon, Benoit, Rezzolla, Luciano, Pierre, Khalil, Doulis, Georgios, Shen, Sijing, Marazakis, Manolis, Gregori, Daniele, Boella, Elisabetta, Perna, Gino, Zanotti, Marisa, Raffin, Erwan, Polsterer, Kai, Trujillo Gomez, Sebastian, and Marin, Guillermo
- Published
- 2024
20. Next generation of Exascale-class systems: ExaNeSt project and the status of its interconnect and storage development
- Author
-
Katevenis, Manolis, Ammendola, Roberto, Biagioni, Andrea, Cretaro, Paolo, Frezza, Ottorino, Lo Cicero, Francesca, Lonardo, Alessandro, Martinelli, Michele, Paolucci, Pier Stanislao, Pastorelli, Elena, Simula, Francesco, Vicini, Piero, Taffoni, Giuliano, Pascual, Jose A., Navaridas, Javier, Luján, Mikel, Goodacre, John, Lietzow, Bernd, Mouzakitis, Angelos, Chrysos, Nikolaos, Marazakis, Manolis, Gorlani, Paolo, Cozzini, Stefano, Brandino, Giuseppe Piero, Koutsourakis, Panagiotis, Ruth, Joeri van, Zhang, Ying, and Kersten, Martin
- Published
- 2018
21. Impact of Cache Coherence on the Performance of Shared-Memory based MPI Primitives: A Case Study for Broadcast on Intel Xeon Scalable Processors
- Author
-
Katevenis, George, primary, Ploumidis, Manolis, additional, and Marazakis, Manolis, additional
- Published
- 2023
22. User-Space I/O for μs-level Storage Devices
- Author
-
Papagiannis, Anastasios, Saloustros, Giorgos, Marazakis, Manolis, Bilas, Angelos, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Taufer, Michela, editor, Mohr, Bernd, editor, and Kunkel, Julian M., editor
- Published
- 2016
23. Impact of Cache Coherence on the Performance of Shared-Memory based MPI Primitives: A Case Study for Broadcast on Intel Xeon Scalable Processors - Computational Artifacts
- Author
-
Katevenis, George, Ploumidis, Manolis, and Marazakis, Manolis
- Subjects
shared-memory, multi-core, cache coherency, HPC, MPI, collectives, Intel Xeon Scalable, broadcast, intra-node
- Abstract
Collection of computational artifacts (source code, scripts, datasets, instructions) for reproducibility of the experiments featured in the associated paper: "Impact of Cache Coherence on the Performance of Shared-Memory based MPI Primitives: A Case Study for Broadcast on Intel Xeon Scalable Processors", by George Katevenis, Manolis Ploumidis, and Manolis Marazakis, ICPP 2023, Salt Lake City, Utah, USA.
- Published
- 2023
24. eProcessor
- Author
-
Alvarez, Lluc, primary, Ruiz, Abraham, additional, Bigas-Soldevilla, Arnau, additional, Kuroedov, Pavel, additional, Gonzalez, Alberto, additional, Mahale, Hamsika, additional, Bustamante, Noe, additional, Aguilera, Albert, additional, Minervini, Francesco, additional, Salamero, Javier, additional, Palomar, Oscar, additional, Papaefstathiou, Vassilis, additional, Psathakis, Antonis, additional, Dimou, Nikolaos, additional, Giaourtas, Michalis, additional, Mastorakis, Iasonas, additional, Ieronymakis, Georgios, additional, Matzouranis, Georgios-Michail, additional, Flouris, Vasilis, additional, Kossifidis, Nick, additional, Marazakis, Manolis, additional, Goel, Bhavishya, additional, Manivannan, Madhavan, additional, Ejaz, Ahsen, additional, Strikos, Panagiotis, additional, Vázquez, Mateo, additional, Sourdis, Ioannis, additional, Trancoso, Pedro, additional, Stenström, Per, additional, Hagemeyer, Jens, additional, Tigges, Lennart, additional, Kucza, Nils, additional, Philippe, Jean-Marc, additional, and Papaefstathiou, Ioannis, additional
- Published
- 2023
25. RISER: The first All-European RISC-V Cloud Server Infrastructure
- Author
-
Marazakis, Manolis and Louloudakis, Stelios
- Abstract
Public announcement of the RISER ('RISC-V for Cloud Services') project, in ERCIM News, issue Nr. 133 (April 2023).  
- Published
- 2023
26. RISER: Raising RISC-V to the cloud
- Author
-
Marazakis, Manolis and Louloudakis, Stelios
- Abstract
First public announcement of the RISER ('RISC-V for Cloud Services') project, in the HiPEACInfo magazine (issue Nr. 68, January 2023).
- Published
- 2023
27. ETP4HPC's SRA 5 - Strategic Research Agenda for High-Performance Computing in Europe - 2022
- Author
-
Malms, Michael, Cargemel, Laurent, Suarez, Estela, Mittenzwey, Nico, Duranton, Marc, Sezer, Sakir, Prunty, Craig, Rossé-Laurent, Pascale, Pérez-Harnandez, Maria, Marazakis, Manolis, Lonsdale, Guy, Carpenter, Paul, Antoniu, Gabriel, Narasimharmurthy, Sai, Brinkman, André, Pleiter, Dirk, Haus, Utz-Uwe, Krueger, Jens, Hoppe, Hans-Christian, Laure, Erwin, Wierse, Andreas, Bartsch, Valeria, Michielsen, Kristel, Allouche, Cyril, Becker, Tobias, and Haas, Robert
- Abstract
This document feeds research and development priorities developed by the European HPC ecosystem into EuroHPC’s Research and Innovation Advisory Group with an aim to define the HPC Technology research Work Programme and the calls for proposals included in it and to be launched from 2023 to 2026. This SRA also describes the major trends in the deployment of HPC and HPDA methods and systems, driven by economic and societal needs in Europe, taking into account the changes expected in the technologies and architectures of the expanding underlying IT infrastructure. The goal is to draw a complete picture of the state of the art and the challenges for the next three to four years rather than to focus on specific technologies, implementations or solutions.
- Published
- 2022
28. A framework for hierarchical single-copy MPI collectives on multicore nodes
- Author
-
Katevenis, George, primary, Ploumidis, Manolis, additional, and Marazakis, Manolis, additional
- Published
- 2022
29. Using gLite to Implement a Secure ICGrid
- Author
-
Luna, Jesus, Dikaiakos, Marios D, Gjermundrod, Harald, Flouris, Michail, Marazakis, Manolis, and Bilas, Angelos
- Published
- 2009
30. A Data-Centric Security Analysis Of ICGrid
- Author
-
Luna, Jesus, Flouris, Michail, Marazakis, Manolis, Bilas, Angelos, Dikaiakos, Marios D., Gjermundrod, Harald, Kyprianou, Theodoros, Gorlatch, Sergei, editor, Fragopoulou, Paraskevi, editor, and Priol, Thierry, editor
- Published
- 2008
31. An Analysis of Security Services in Grid Storage Systems
- Author
-
Luna, Jesus, Flouris, Michail D., Marazakis, Manolis, Bilas, Angelos, Stagni, Federico, Forti, Alberto, Ghiselli, Antonia, Magnoni, Luca, Zappi, Riccardo, Talia, Domenico, Yahyapour, Ramin, and Ziegler, Wolfgang
- Published
- 2008
32. LatEst: Vertical elasticity for millisecond serverless execution
- Author
-
Sfakianakis, Yannis, primary, Marazakis, Manolis, additional, Kozanitis, Christos, additional, and Bilas, Angelos, additional
- Published
- 2022
33. User-Space I/O for μs-level Storage Devices
- Author
-
Papagiannis, Anastasios, primary, Saloustros, Giorgos, additional, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2016
34. HPC for Urgent Decision-Making
- Author
-
Marazakis, Manolis, Duranton, Marc, Pleiter, Dirk, Taffoni, Giuliano, and Hoppe, Hans-Christian
- Abstract
Emerging use cases from incident response planning and broad-scope European initiatives (e.g. Destination Earth [1,2], European Green Deal and Digital Package [21]) are expected to require federated, distributed infrastructures combining computing and data platforms. These will provide elasticity, enabling users to build applications and integrate data for thematic specialisation and decision support, within ever-shortening response time windows. For prompt and, in particular, for urgent decision support, the conventional usage modes of HPC centres are not adequate: these rely on relatively long-term arrangements for time-scheduled exclusive use of HPC resources, and enforce well-established yet time-consuming policies for granting access. In urgent decision support scenarios, managers or members of incident response teams must initiate processing and control the resources required based on their real-time judgement on how a complex situation evolves over time. This circle of clients is distinct from the regular users of HPC centres, and they must interact with HPC workflows on-demand and in real-time, while engaging significant HPC and data processing resources in or across HPC centres. This white paper considers the technical implications of supporting urgent decisions through establishing flexible usage modes for computing, analytics and AI/ML-based applications using HPC and large, dynamic assets. The target decision support use cases will involve ensembles of jobs, data-staging to support workflows, and interactions with services/facilities external to HPC systems/centres. Our analysis identifies the need for flexible and interactive access to HPC resources, particularly in the context of dynamic workflows processing large datasets.
This poses several technical and organisational challenges: short-notice secure access to HPC and data resources, dynamic resource allocation and scheduling, coordination of resource managers, support for data-intensive workflow (including data staging on node-local storage), preemption of already running workloads and interactive steering of simulations. Federation of services and resources across multiple sites will help to increase availability, provide elasticity for time-varying resource needs and enable leverage of data locality.
The authors would like to thank Maria S. Perez (Professor at Universidad Politécnica de Madrid, Spain) for her insightful critique on earlier drafts of this whitepaper.
References:
[1] Destination Earth (DestinE) initiative. https://ec.europa.eu/digital-single-market/en/destination-earth-destine
[2] Destination Earth: Use Cases Analysis, JRC Technical Report JRC122456, 2020. https://publications.jrc.ec.europa.eu/repository/handle/JRC122456
[3] Wilkinson MD, Dumontier M, Aalbersberg IJ, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016 Mar 15;3:160018. doi: 10.1038/sdata.2016.18. Erratum in: Sci Data. 2019 Mar 19;6(1):6. PMID: 26978244; PMCID: PMC4792175.
[4] N. Brown, R. Nash, G. Gibb, B. Prodan, M. Kontak, V. Olshevsky, and W. Der Chien, "The role of interactive supercomputing in using HPC for urgent decision making", in Proceedings of the International Conference on High Performance Computing. Springer, 2019, pp. 528–540.
[5] G. Gibb, R. Nash, N. Brown and B. Prodan, "The Technologies Required for Fusing HPC and Real-Time Data to Support Urgent Computing", in Proceedings of the 2019 IEEE/ACM Workshop on HPC for Urgent Decision Making (UrgentHPC), 2019, pp. 24-34.
[6] Earth System Modeling Framework: https://earthsystemmodeling.org/
[7] T. C. Schulthess, P. Bauer, N. Wedi, O. Fuhrer, T. Hoefler and C. Schär, "Reflecting on the Goal and Baseline for Exascale Computing: A Roadmap Based on Weather and Climate Simulations," in Computing in Science & Engineering, vol. 21, no. 1, pp. 30-41, 1 Jan.-Feb. 2019, doi: 10.1109/MCSE.2018.2888788.
[8] Baker, D.N., Erickson, P.J., Fennell, J.F. et al. Space Weather Effects in the Earth's Radiation Belts. Space Sci Rev 214, 17 (2018). https://doi.org/10.1007/s11214-017-0452-10
[9] R. Kube et al., "Near real-time analysis of big fusion data on HPC systems," 2020 IEEE/ACM HPC for Urgent Decision Making (UrgentHPC), 2020, pp. 55-63, doi: 10.1109/UrgentHPC51945.2020.00012.
[10] A. Kremin, S. Bailey, J. Guy, T. Kisner and K. Zhang, "Rapid Processing of Astronomical Data for the Dark Energy Spectroscopic Instrument," 2020 IEEE/ACM HPC for Urgent Decision Making (UrgentHPC), 2020, pp. 1-9, doi: 10.1109/UrgentHPC51945.2020.00006.
[11] Jiang, M., Bu, C., Zeng, J. et al. Applications and challenges of high performance computing in genomics. CCF Trans. HPC (2021). https://doi.org/10.1007/s42514-021-00081-w
[12] CISCO 2020, Global Network Trends Report, Tech. rep., CISCO. URL https://www.cisco.com/c/dam/m/en_us/solutions/enterprise-networks/ networking-report/files/GLBL-ENG_NB-06_0_NA_RPT_PDF_ MOFU-no-NetworkingTrendsReport-NB_rpten018612_5.pdf
[13] Asch M, Moore T, Badia R, et al. Big data and extreme-scale computing: Pathways to Convergence-Toward a shaping strategy for a future software and data ecosystem for scientific inquiry. The International Journal of High Performance Computing Applications. 2018;32(4):435-479. doi:10.1177/1094342018778123
[14] E. Yamasaki, 2012, What We Can Learn From Japan's Early Earthquake Warning System, Momentum: Volume 1: Issue 1, Article 2.
[15] F. Løvholt, S. Lorito, J. Macias, M. Volpe, J. Selva and S. Gibbons, "Urgent Tsunami Computing," 2019 IEEE/ACM HPC for Urgent Decision Making (UrgentHPC), 2019, pp. 45-50, doi: 10.1109/UrgentHPC49580.2019.00011.
[16] Siew Hoon Leong, Dieter Kranzlmüller, "Towards a General Definition of Urgent Computing," Procedia Computer Science, Volume 51, 2015, https://doi.org/10.1016/j.procs.2015.05.402.
[17] Tzachor, A., Whittlestone, J., Sundaram, L. et al. Artificial intelligence in a crisis needs ethics with urgency. Nat Mach Intell 2, 365–366 (2020). https://doi.org/10.1038/s42256-020-0195-0
[18] Chen, N., Liu, W., Bai, R. et al. Application of computational intelligence technologies in emergency management: a literature review. Artif Intell Rev 52, 2131–2168 (2019). https://doi.org/10.1007/s10462-017-9589-8
[19] D. Elia, S. Fiore and G. Aloisio, "Towards HPC and Big Data Analytics Convergence: Design and Experimental Evaluation of a HPDA Framework for eScience at Scale," in IEEE Access, vol. 9, pp. 73307-73326, 2021. https://doi.org/10.1109/ACCESS.2021.3079139
[20] European High Performance Computing Joint Undertaking (EuroHPC JU). https://eurohpc-ju.europa.eu
[21] A European Green Deal. https://ec.europa.eu/info/strategy/priorities-2019-2024/european-green-deal_en
[22] R. Roscher, B. Bohn, M. F. Duarte and J. Garcke, "Explainable Machine Learning for Scientific Insights and Discoveries," in IEEE Access, vol. 8, pp. 42200-42216, 2020, doi: 10.1109/ACCESS.2020.2976199.
[23] Strategic Research and Innovation Agenda of the European Open Science Cloud (EOSC), Feb. 2021. https://www.eosc.eu/sites/default/files/EOSC-SRIA-V1.0_15Feb2021.pdf
- Published
- 2022
35. Aurora: An architecture for dynamic and adaptive work sessions in open environments
- Author
-
Marazakis, Manolis, Papadakis, Dimitris, Nikolaou, Christos, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, Quirchmayr, Gerald, editor, Schweighofer, Erich, editor, and Bench-Capon, Trevor J.M., editor
- Published
- 1998
36. A Framework for the Encapsulation of Value-Added Services in Digital Objects
- Author
-
Marazakis, Manolis, Papadakis, Dimitris, Papadakis, Stavros A., Goos, Gerhard, Series editor, Hartmanis, Juris, Series editor, van Leeuwen, Jan, Series editor, Nikolaou, Christos, editor, and Stephanidis, Constantine, editor
- Published
- 1998
37. System Infrastructure for Digital Libraries: A Survey and Outlook
- Author
-
Nikolaou, Christos, Marazakis, Manolis, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, and Rovan, Branislav, editor
- Published
- 1998
38. Towards a common infrastructure for large-scale distributed applications
- Author
-
Nikolaou, Christos, Marazakis, Manolis, Papadakis, Dimitris, Yeorgiannakis, Yiorgos, Sairamesh, Jakka, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, Peters, Carol, editor, and Thanos, Costantino, editor
- Published
- 1997
39. Multilevel simulation-based co-design of next generation HPC microprocessors
- Author
-
Zaourar, Lilia, primary, Benazouz, Mohamed, additional, Mouhagir, Ayoub, additional, Jebali, Fatma, additional, Sassolas, Tanguy, additional, Weill, Jean-Christophe, additional, Falquez, Carlos, additional, Ho, Nam, additional, Pleiter, Dirk, additional, Portero, Antoni, additional, Suarez, Estela, additional, Petrakis, Polydoros, additional, Papaefstathiou, Vassilis, additional, Marazakis, Manolis, additional, Radulovic, Milan, additional, Martinez, Francesc, additional, Armejach, Adria, additional, Casas, Marc, additional, Nocua, Alejandro, additional, and Dolbeau, Romain, additional
- Published
- 2021
- Full Text
- View/download PDF
40. MARVEL: Multimodal Extreme Scale Data Analytics for Smart Cities Environments
- Author
-
Bajovic, Dragana, primary, Bakhtiarnia, Arian, additional, Bravos, George, additional, Brutti, Alessio, additional, Burkhardt, Felix, additional, Cauchi, Daniel, additional, Chazapis, Antony, additional, Cianco, Claire, additional, Dall'Asen, Nicola, additional, Delic, Vlado, additional, Dimou, Christos, additional, Djokic, Djordje, additional, Escobar-Molero, Antonio, additional, Esterle, Lukas, additional, Eyben, Florian, additional, Farella, Elisabetta, additional, Festi, Thomas, additional, Geromitsos, Artemios, additional, Giakoumakis, Giannis, additional, Hatzivasilis, George, additional, Ioannidis, Sotiris, additional, Iosifidis, Alexandros, additional, Kallipolitou, Theodora, additional, Kalogiannis, Grigorios, additional, Kiousi, Akrivi, additional, Kopanaki, Despina, additional, Marazakis, Manolis, additional, Markopoulou, Stella, additional, Muscat, Adrian, additional, Paissan, Francesco, additional, Lobo, Tomas Pariente, additional, Pavlovic, Dusan, additional, Raptis, Theofanis P., additional, Ricci, Elisa, additional, Saez, Borja, additional, Sahito, Farhan, additional, Scerri, Kenneth, additional, Schuller, Bjorn, additional, Simic, Nikola, additional, Spanoudakis, George, additional, Tomasi, Alex, additional, Triantafyllopoulos, Andreas, additional, Valerio, Lorenzo, additional, Villazan, Javier, additional, Wang, Yiming, additional, Xuereb, Andre, additional, and Zammit, Johan, additional
- Published
- 2021
- Full Text
- View/download PDF
41. Skynet: Performance-driven Resource Management for Dynamic Workloads
- Author
-
Sfakianakis, Yannis, primary, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2021
- Full Text
- View/download PDF
42. IOTier: A Virtual Testbed to evaluate systems for IoT environments
- Author
-
Nikolaidis, Fotis, primary, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2021
- Full Text
- View/download PDF
43. Frisbee
- Author
-
Nikolaidis, Fotis, primary, Chazapis, Antony, additional, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2021
- Full Text
- View/download PDF
44. Memory-mapped I/O on steroids
- Author
-
Papagiannis, Anastasios, primary, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2021
- Full Text
- View/download PDF
45. Towards Resilient EU HPC Systems: A Blueprint
- Author
-
Radojkovic, Petar, Marazakis, Manolis, Carpenter, Paul, Jeyapaul, Reiley, Gizopoulos, Dimitris, Schulz, Martin, Armejach, Adria, Ayguade, Eduard, Bodin, François, Canal, Ramon, Cappello, Franck, Chaix, Fabien, Colin de Verdiere, Guillaume, Derradji, Said, Di Carlo, Stefano, Engelmann, Christian, Laguna, Ignacio, Moreto, Miquel, Mutlu, Onur, Papadopoulos, Lazaros, Perks, Olly, Ploumidis, Manolis, Salami, Behzad, Sazeides, Yanos, Soudris, Dimitrios, Sourdis, Yiannis, Stenstrom, Per, Thibault, Samuel, Toms, Will, Unsal, Osman, Barcelona Supercomputing Center - Centro Nacional de Supercomputacion (BSC - CNS), Foundation for Research and Technology - Hellas (FORTH), ARM Ltd [Cambridge] (ARM), National and Kapodistrian University of Athens (NKUA), Technische Universität Munchen - Université Technique de Munich [Munich, Allemagne] (TUM), Leibniz Supercomputing Centre (LRZ), Logic and applications (LOGICA), LANGAGE ET GÉNIE LOGICIEL (IRISA-D4), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-CentraleSupélec-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), 
Universitat Politècnica de Catalunya [Barcelona] (UPC), Argonne National Laboratory [Lemont] (ANL), Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Bull atos technologies, Politecnico di Torino = Polytechnic of Turin (Polito), Oak Ridge National Laboratory [Oak Ridge] (ORNL), UT-Battelle, LLC, Lawrence Livermore National Laboratory (LLNL), Eidgenössische Technische Hochschule - Swiss Federal Institute of Technology [Zürich] (ETH Zürich), National Technical University of Athens [Athens] (NTUA), University of Cyprus (UCY), Chalmers University of Technology [Göteborg], Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB), STatic Optimizations, Runtime Methods (STORM), Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Université de Bordeaux (UB)-Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), University of Manchester [Manchester], European HPC resilience initiative, European Project: 801015,H2020,EXA2PRO(2018), European Project: 611404,EC:FP7:ICT,FP7-ICT-2013-10,CLERECO(2013), European Project, European Project: 671553,H2020,H2020-FETHPC-2014,ExaNeSt(2015), European Project: 671578,H2020,H2020-FETHPC-2014,ExaNoDe(2015), European Project: 671558,H2020,H2020-FETHPC-2014,EXDCI(2015), European Project: 780681,LEGaTO, European Project: 671632,H2020,H2020-FETHPC-2014,ECOSCALE(2015), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA 
Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), University of Cyprus [Nicosia] (UCY), Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS), and Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)-Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)-Inria Bordeaux - Sud-Ouest
- Subjects
[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] - Published
- 2020
46. ETP4HPC's Strategic Research Agenda for High-Performance Computing in Europe 4
- Author
-
Malms, Michael, Ostasz, Marcin, Gilliot, Maike, Bernier-Bruna, Pascale, Cargemel, Laurent, Suarez, Estela, Cornelius, Herbert, Duranton, Marc, Koren, Benny, Rosse-Laurent, Pascale, Pérez-Hernández, María S., Marazakis, Manolis, Lonsdale, Guy, Carpenter, Paul, Antoniu, Gabriel, Narasimhamurthy, Sai, Brinkman, André, Pleiter, Dirk, Tate, Adrian, Krueger, Jens, Hoppe, Hans-Christian, Laure, Erwin, Wierse, Andreas, European Technology Platform (ETP) for High-Performance Computing (HPC) (ETP4HPC), IBM Research Lab. - Zürich, TERATEC [Bruyères-le-Chatel], Atos, Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich GmbH | Centre de recherche de Juliers, Helmholtz-Gemeinschaft = Helmholtz Association-Helmholtz-Gemeinschaft = Helmholtz Association, Megware Computer Vertrieb und Service GmbH (Megware), Laboratoire d'Intégration des Systèmes et des Technologies (LIST (CEA)), Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Mellanox Technologies [Sunnyvale], Universidad Politécnica de Madrid (UPM), Foundation for Research and Technology - Hellas (FORTH), Scapos AG (SCAPOS), Departament d'Arquitectura de Computadors - Universitat Politècnica de Catalunya (DAC), Universitat Politècnica de Catalunya [Barcelona] (UPC), Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut 
National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Seagate Technology, Johannes Gutenberg - Universität Mainz = Johannes Gutenberg University (JGU), Helmholtz-Gemeinschaft = Helmholtz Association, Numerical Algorithms Group [New Mexico] (NAG), Fraunhofer (Fraunhofer-Gesellschaft), Intel Corporation [Santa Clara], Intel Corporation [USA], Royal Institute of Technology [Stockholm] (KTH ), SICOS BW [Stuttgart] (SICOS), ETP4HPC: European Technology Platform for High Performance Computing, with the support of the EXDCI-2 project, IBM Systems Development, TERATEC, Philips France Semiconducteurs, C and C Research Laboratories NEC Eur. Ltd., NEC Europe Ltd. 
[Middlesex], Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-CentraleSupélec-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Johannes Gutenberg - Universität Mainz (JGU), Intel Corporation, Santa Clara CA, SICOS BW (SICOS), and European Technology Platform for High-Performance Computing (ETP4HPC)
- Subjects
[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] - Abstract
This Strategic Research Agenda is the fourth High Performance Computing (HPC) technology roadmap developed and maintained by ETP4HPC, the European High-Performance Computing Platform, with the support of the EXDCI-2 project. It continues the tradition of a structured approach to the identification of key research objectives. The main objective of this SRA is to identify the European technology research priorities in the area of High-Performance Computing (HPC) and High-Performance Data Analytics (HPDA), which should be used by EuroHPC to build its 2021–2024 Work Programme. Over eighty HPC experts associated with member organisations of ETP4HPC created this document in collaboration with external technical leaders representing those areas of technology that, together with HPC, form what we have come to call “The Digital Continuum”. This new concept reflects the main trend of this SRA: it is not only about developing HPC technology in order to build competitive European HPC systems but also about making our HPC solutions work together with other related technologies. The material included in this SRA is also a result of our interactions with Big Data, Internet of Things (IoT), Artificial Intelligence (AI), and Cyber Physical Systems (CPS).
- Published
- 2020
47. Towards resilient EU HPC systems: A blueprint
- Author
-
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions, Universitat Politècnica de Catalunya. VIRTUOS - Virtualisation and Operating Systems, Radojković, Petar, Marazakis, Manolis, Carpenter, Paul Matthew, Jeyapaul, Reiley, Gizopoulos, Dimitris, Schulz, Martin, Armejach Sanosa, Adrià, Ayguadé Parra, Eduard, Canal Corretger, Ramon, Moretó Planas, Miquel, Salami, Behzad, and Unsal, Osman Sabri
- Abstract
This document aims to spearhead a Europe-wide discussion on HPC system resilience and to help the European HPC community define best practices for resilience. We analyse a wide range of state-of-the-art resilience mechanisms and recommend the most effective approaches to employ in large-scale HPC systems. Our guidelines will be useful in the allocation of available resources, as well as guiding researchers and research funding towards the enhancement of resilience approaches with the highest priority and utility. Although our work is focused on the needs of next generation HPC systems in Europe, the principles and evaluations are applicable globally., This work has received funding from the European Union’s Horizon 2020 research and innovation programme under the projects ECOSCALE (grant agreement No 671632), EPI (grant agreement No 826647), EuroEXA (grant agreement No 754337), Eurolab4HPC (grant agreement No 800962), EVOLVE (grant agreement No 825061), EXA2PRO (grant agreement No 801015), ExaNeSt (grant agreement No 671553), ExaNoDe (grant agreement No 671578), EXDCI-2 (grant agreement No 800957), LEGaTO (grant agreement No 780681), MB2020 (grant agreement No 779877), RECIPE (grant agreement No 801137) and SDK4ED (grant agreement No 780572). The work was also supported by the European Commission’s Seventh Framework Programme under the project CLERECO (grant agreement No 611404), the NCSA-Inria-ANL-BSC-JSC-Riken-UTK Joint-Laboratory for Extreme Scale Computing – JLESC (https://jlesc.github.io/), the OMPI-X project (No ECP-2.3.1.17) and the Spanish Government through the Severo Ochoa programme (SEV-2015-0493). This work was sponsored in part by the U.S. Department of Energy's Office of Advanced Scientific Computing Research, program managers Robinson Pino and Lucy Nowell. This manuscript has been authored by UT-Battelle, LLC under Contract No DE-AC05-00OR22725 with the U.S. Department of Energy., Preprint
- Published
- 2020
48. DyRAC: Cost-aware Resource Assignment and Provider Selection for Dynamic Cloud Workloads
- Author
-
Sfakianakis, Yannis, primary, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2020
- Full Text
- View/download PDF
49. Using gLite to Implement a Secure ICGrid
- Author
-
Luna, Jesus, primary, Dikaiakos, Marios D, additional, Gjermundrod, Harald, additional, Flouris, Michail, additional, Marazakis, Manolis, additional, and Bilas, Angelos, additional
- Published
- 2008
- Full Text
- View/download PDF
50. Towards Communication Profile, Topology and Node Failure Aware Process Placement
- Author
-
Vardas, Ioannis, primary, Ploumidis, Manolis, additional, and Marazakis, Manolis, additional
- Published
- 2020
- Full Text
- View/download PDF