59 results for "Costan, Alexandru"
Search Results
2. Enabling federated learning across the computing continuum: Systems, challenges and future directions
- Author
-
Prigent, Cédric, Costan, Alexandru, Antoniu, Gabriel, and Cudennec, Loïc
- Published
- 2024
3. Mission possible: Unify HPC and Big Data stacks towards application-defined blobs at the storage layer
- Author
-
Matri, Pierre, Alforov, Yevhen, Brandon, Álvaro, Pérez, María S., Costan, Alexandru, Antoniu, Gabriel, Kuhn, Michael, Carns, Philip, and Ludwig, Thomas
- Published
- 2020
4. Keeping up with storage: Decentralized, write-enabled dynamic geo-replication
- Author
-
Matri, Pierre, Pérez, María S., Costan, Alexandru, Bougé, Luc, and Antoniu, Gabriel
- Published
- 2018
5. KheOps: Cost-effective Repeatability, Reproducibility, and Replicability of Edge-to-Cloud Experiments
- Author
-
Rosendo, Daniel, Keahey, Kate, Costan, Alexandru, Simonin, Matthieu, Valduriez, Patrick, and Antoniu, Gabriel
- Subjects
Computer Science - Networking and Internet Architecture; Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Distributed infrastructures for computation and analytics are now evolving towards an interconnected ecosystem allowing complex scientific workflows to be executed across hybrid systems spanning from IoT Edge devices to Clouds, and sometimes to supercomputers (the Computing Continuum). Understanding the performance trade-offs of large-scale workflows deployed on such a complex Edge-to-Cloud Continuum is challenging. To achieve this, one needs to systematically perform experiments, to enable their reproducibility and allow other researchers to replicate the study and the obtained conclusions on different infrastructures. This breaks down to the tedious process of reconciling the numerous experimental requirements and constraints with low-level infrastructure design choices. To address the limitations of the main state-of-the-art approaches for distributed, collaborative experimentation, such as Google Colab, Kaggle, and Code Ocean, we propose KheOps, a collaborative environment specifically designed to enable cost-effective reproducibility and replicability of Edge-to-Cloud experiments. KheOps is composed of three core elements: (1) an experiment repository; (2) a notebook environment; and (3) a multi-platform experiment methodology. We illustrate KheOps with a real-life Edge-to-Cloud application. The evaluations explore the point of view of the authors of an experiment described in an article (who aim to make their experiments reproducible) and the perspective of their readers (who aim to replicate the experiment). The results show how KheOps helps authors to systematically perform repeatable and reproducible experiments on the Grid5000 + FIT IoT LAB testbeds. Furthermore, KheOps helps readers to cost-effectively replicate authors' experiments in different infrastructures such as Chameleon Cloud + CHI@Edge testbeds, and obtain the same conclusions with high accuracies (> 88% for all performance metrics).
- Published
- 2023
6. JetStream: Enabling high throughput live event streaming on multi-site clouds
- Author
-
Tudoran, Radu, Costan, Alexandru, Nano, Olivier, Santos, Ivo, Soncu, Hakan, and Antoniu, Gabriel
- Published
- 2016
7. AutoCompBD: Autonomic Computing and Big Data platforms
- Author
-
Pop, Florin, Dobre, Ciprian, and Costan, Alexandru
- Published
- 2017
8. Supporting Efficient Workflow Deployment of Federated Learning Systems across the Computing Continuum
- Author
-
Prigent, Cédric, Antoniu, Gabriel, Costan, Alexandru, and Cudennec, Loïc
- Affiliations
Scalable Storage for Clouds and Beyond (KerData); Inria Rennes – Bretagne Atlantique; Institut National de Recherche en Informatique et en Automatique (Inria); SYSTÈMES LARGE ÉCHELLE (IRISA-D1); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes (UR); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Université de Bretagne Sud (UBS); École normale supérieure - Rennes (ENS Rennes); CentraleSupélec; Centre National de la Recherche Scientifique (CNRS); IMT Atlantique (IMT Atlantique); Institut Mines-Télécom [Paris] (IMT); DGA Maîtrise de l'information (DGA.MI); Direction générale de l'Armement (DGA)
- Subjects
[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]; Federated Learning; Hyperparameter optimization; Computing Continuum; Workflow
- Abstract
IoT devices produce ever growing amounts of data. Traditional cloud-based approaches for processing data are facing some limitations: bandwidth might become a bottleneck and sensitive data should not leave user devices as stated by data protection regulators such as GDPR. Federated Learning (FL) is a distributed Machine Learning paradigm aiming to collaboratively learn a shared model while considering privacy preservation. Clients do the training process locally with their private data while a central server updates the global model by aggregating local models. In the Computing Continuum context (edge-fog-cloud ecosystem), FL raises several challenges such as supporting very heterogeneous devices and optimizing massively distributed applications. We propose a workflow to better support and optimize FL systems across the Computing Continuum by relying on formal descriptions of the underlying infrastructure, hyperparameter optimization and model retraining in case of performance degradation. We motivate our approach by providing preliminary results using a human activity recognition dataset showing the importance of hyperparameter optimization and model retraining in the FL scenario.
- Published
- 2022
9. Deploying Heterogeneity-aware Deep Learning Workloads on the Computing Continuum
- Author
-
Bouvier, Thomas, Costan, Alexandru, and Antoniu, Gabriel
- Affiliations
Scalable Storage for Clouds and Beyond (KerData); Inria Rennes – Bretagne Atlantique; Institut National de Recherche en Informatique et en Automatique (Inria); SYSTÈMES LARGE ÉCHELLE (IRISA-D1); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Bretagne Sud (UBS); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Université de Rennes (UNIV-RENNES); École normale supérieure - Rennes (ENS Rennes); Centre National de la Recherche Scientifique (CNRS); Université de Rennes 1 (UR1); CentraleSupélec; IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique); Institut Mines-Télécom [Paris] (IMT); Université de Rennes (UR); IMT Atlantique (IMT Atlantique)
- Subjects
incremental learning; Deep Learning; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO] Computer Science [cs]; [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]; heterogeneous systems; continual learning; Computing Continuum
- Abstract
The increasing need for real-time analytics motivated the emergence of new incremental methods to learn representations from continuous flows of data, especially in the context of the Internet of Things. This trend led to the evolution of centralized computing infrastructures towards interconnected processing units spanning from edge devices to cloud data centers. This new paradigm is referred to as the Computing or Edge-to-Cloud Continuum. However, the network and compute heterogeneity across and within clusters may negatively impact Deep Learning (DL) training. We introduce a roadmap for understanding the end-to-end performance of DL workloads in such heterogeneous settings. The goal is to identify key parameters leading to stragglers and devise novel intra- and inter-cluster strategies to address them. We will explore various policies aiming to improve makespan, cost and fairness objectives while ensuring system scalability.
- Published
- 2021
10. MonGNN: A neuroevolutionary-based solution for 5G network slices monitoring
- Author
-
Balouek-Thomert, Daniel, Silva, Pedro, Fauvel, Kevin, Costan, Alexandru, Antoniu, Gabriel, and Parashar, Manish
- Affiliations
Dependability Interoperability and perfOrmance aNalYsiS Of networkS (DIONYSOS); Inria Rennes – Bretagne Atlantique; Institut National de Recherche en Informatique et en Automatique (Inria); RÉSEAUX, TÉLÉCOMMUNICATION ET SERVICES (IRISA-D2); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes (UR); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Université de Bretagne Sud (UBS); École normale supérieure - Rennes (ENS Rennes); CentraleSupélec; Centre National de la Recherche Scientifique (CNRS); IMT Atlantique (IMT Atlantique); Institut Mines-Télécom [Paris] (IMT); Nokia Bell Labs; Scientific Computing and Imaging Institute (SCI Institute); University of Utah; Hasso Plattner Institute [Potsdam, Germany]; Large Scale Collaborative Data Mining (LACODAM); GESTION DES DONNÉES ET DE LA CONNAISSANCE (IRISA-D7); Université de Rennes (UNIV-RENNES); Université de Rennes 1 (UR1); IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique); Scalable Storage for Clouds and Beyond (KerData); SYSTÈMES LARGE ÉCHELLE (IRISA-D1)
- Subjects
Scheme (programming language); Computer science; Real-time computing; Network tomography; Task (project management); [INFO.INFO-NI] Computer Science [cs]/Networking and Internet Architecture [cs.NI]; Path (graph theory); Task analysis; Routing (electronic design automation); Focus (optics); 5G; Computer network
- Abstract
Monitoring the status of network slices is a priority for network operators to ensure that SLAs are not violated. To overcome the limitations of direct slices' monitoring, network tomography (NT) is seen as a promising solution. NT-based solutions require constraining monitoring traffic to follow specific paths, which we can achieve by using segment-based routing (SR). This allows deploying customized probing schemes, such as cycles' probing. A major challenge with SR is, however, the limited length of the monitoring path. In this paper, we focus on the complexity of that task and propose MonGNN, a standalone solution based on Graph Neural Networks (GNNs) and genetic algorithms to find a trade-off between the quality of monitors' placement and the cost to achieve it. Simulation results show the efficiency of our approach compared to existing methods.
- Published
- 2021
11. Enabling Reproducible Analysis of Complex Workflows on the Edge-to-Cloud Continuum
- Author
-
Rosendo, Daniel, Costan, Alexandru, Antoniu, Gabriel, and Valduriez, Patrick
- Affiliations and other contributors
Scientific Data Management (ZENITH); Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM); Centre National de la Recherche Scientifique (CNRS); Université de Montpellier (UM); Inria Sophia Antipolis - Méditerranée (CRISAM); Institut National de Recherche en Informatique et en Automatique (Inria); Scalable Storage for Clouds and Beyond (KerData); Inria Rennes – Bretagne Atlantique; SYSTÈMES LARGE ÉCHELLE (IRISA-D1); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes 1 (UR1); Université de Rennes (UNIV-RENNES); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Université de Bretagne Sud (UBS); École normale supérieure - Rennes (ENS Rennes); CentraleSupélec; IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique); Institut Mines-Télécom [Paris] (IMT); Philippe Rigaux; Université de Rennes (UR); IMT Atlantique (IMT Atlantique)
- Subjects
Networking and Internet Architecture (cs.NI); Performance (cs.PF); FOS: Computer and information sciences; Computer Science - Networking and Internet Architecture; [INFO.INFO-PF] Computer Science [cs]/Performance [cs.PF]; [INFO.INFO-NI] Computer Science [cs]/Networking and Internet Architecture [cs.NI]; Computer Science - Performance; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; Computer Science - Distributed, Parallel, and Cluster Computing; Distributed, Parallel, and Cluster Computing (cs.DC); [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
- Abstract
Distributed digital infrastructures for computation and analytics are now evolving towards an interconnected ecosystem allowing complex applications to be executed from IoT Edge devices to the HPC Cloud (aka the Computing Continuum, the Digital Continuum, or the Transcontinuum). Understanding end-to-end performance in such a complex continuum is challenging. This breaks down to reconciling many, typically contradictory, application requirements and constraints with low-level infrastructure design choices. One important challenge is to accurately reproduce relevant behaviors of a given application workflow and representative settings of the physical infrastructure underlying this complex continuum. We introduce a rigorous methodology for such a process and validate it through E2Clab. It is the first platform to support the complete experimental cycle across the Computing Continuum: deployment, analysis, optimization. Preliminary results with real-life use cases show that E2Clab allows one to understand and improve performance, by correlating it to the parameter settings, the resource usage and the specifics of the underlying infrastructure.
- Published
- 2021
12. Heterogeneity-aware Deep Learning Workload Deployments on the Computing Continuum
- Author
-
Bouvier, Thomas, Costan, Alexandru, and Antoniu, Gabriel
- Affiliations
Scalable Storage for Clouds and Beyond (KerData); Inria Rennes – Bretagne Atlantique; Institut National de Recherche en Informatique et en Automatique (Inria); SYSTÈMES LARGE ÉCHELLE (IRISA-D1); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes (UR); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Université de Bretagne Sud (UBS); École normale supérieure - Rennes (ENS Rennes); CentraleSupélec; Centre National de la Recherche Scientifique (CNRS); IMT Atlantique (IMT Atlantique); Institut Mines-Télécom [Paris] (IMT); Université de Rennes (UNIV-RENNES); Université de Rennes 1 (UR1); IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique)
- Subjects
incremental learning; Deep Learning; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO] Computer Science [cs]; [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]; heterogeneous systems; Computing Continuum; continual learning
- Abstract
The increasing need for real-time analytics motivated the emergence of new incremental methods to learn representations from continuous flows of data, especially in the context of the Internet of Things. This trend led to the evolution of centralized computing infrastructures towards interconnected processing units spanning from edge devices to cloud data centers. This new paradigm is referred to as the Computing or Edge-to-Cloud Continuum. However, the network and compute heterogeneity across and within clusters may negatively impact Deep Learning (DL) training. We introduce a roadmap for understanding the end-to-end performance of DL workloads in such heterogeneous settings. The goal is to identify key parameters leading to stragglers and devise novel intra- and inter-cluster strategies to address them. We will explore various policies aiming to improve makespan, cost and fairness objectives while ensuring system scalability.
- Published
- 2021
13. Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review.
- Author
-
Rosendo, Daniel, Costan, Alexandru, Valduriez, Patrick, and Antoniu, Gabriel
- Subjects
Artificial intelligence; Machine learning; High performance computing; Communication infrastructure; Reproducible research
- Abstract
The explosion of data volumes generated by an increasing number of applications is strongly impacting the evolution of distributed digital infrastructures for data analytics and machine learning (ML). While data analytics used to be mainly performed on cloud infrastructures, the rapid development of IoT infrastructures and the requirements for low-latency, secure processing have motivated the development of edge analytics. Today, to balance various trade-offs, ML-based analytics tends to increasingly leverage an interconnected ecosystem that allows complex applications to be executed on hybrid infrastructures where IoT Edge devices are interconnected to Cloud/HPC systems in what is called the Computing Continuum, the Digital Continuum, or the Transcontinuum. Enabling learning-based analytics on such complex infrastructures is challenging. The large-scale, optimized deployment of learning-based workflows across the Edge-to-Cloud Continuum requires extensive and reproducible experimental analysis of the application execution on representative testbeds. This is necessary to help understand the performance trade-offs that result from combining a variety of learning paradigms and supportive frameworks. A thorough experimental analysis requires the assessment of the impact of multiple factors, such as: model accuracy, training time, network overhead, energy consumption, processing latency, among others. This review aims at providing a comprehensive vision of the main state-of-the-art libraries and frameworks for machine learning and data analytics available today. It describes the main learning paradigms enabling learning-based analytics on the Edge-to-Cloud Continuum. The main simulation, emulation, deployment systems, and testbeds for experimental research on the Edge-to-Cloud Continuum available today are also surveyed. Furthermore, we analyze how the selected systems provide support for experiment reproducibility. We conclude our review with a detailed discussion of relevant open research challenges and of future directions in this domain such as: holistic understanding of performance; performance optimization of applications; efficient deployment of Artificial Intelligence (AI) workflows on highly heterogeneous infrastructures; and reproducible analysis of experiments on the Computing Continuum.
• After screening 1159 articles from 5 databases (2016-2021), we summarized 69 studies.
• A taxonomy of Data Analytics and AI on the Edge-to-Cloud Computing Continuum.
• Most exploited: frameworks/libs; hardware; metrics; models; dataset; AI paradigms.
• Analysis of articles in terms of experimental research and reproducibility support.
• Discussion of the relevant open challenges and future directions in 4 domains.
- Published
- 2022
14. Monitoring and control of large systems with MonALISA
- Author
-
Legrand, Iosif, Voicu, Ramiro, Cirstoiu, Catalin, Grigoras, Costin, Betev, Latchezar, and Costan, Alexandru
- Subjects
Distributed database; Technology application; California Institute of Technology -- Technology application; Distributed databases -- Design and construction
- Published
- 2009
15. Earthquake Early Warning Dataset
- Author
-
Fauvel, Kevin, Balouek-Thomert, Daniel, Melgar, Diego, Silva, Pedro, Simonet, Anthony, Antoniu, Gabriel, Costan, Alexandru, Masson, Véronique, Parashar, Manish, Rodero, Ivan, and Termier, Alexandre
- Affiliations
Large Scale Collaborative Data Mining (LACODAM); Inria Rennes – Bretagne Atlantique; Institut National de Recherche en Informatique et en Automatique (Inria); GESTION DES DONNÉES ET DE LA CONNAISSANCE (IRISA-D7); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes (UR); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Université de Bretagne Sud (UBS); École normale supérieure - Rennes (ENS Rennes); CentraleSupélec; Centre National de la Recherche Scientifique (CNRS); IMT Atlantique (IMT Atlantique); Institut Mines-Télécom [Paris] (IMT); Rutgers, The State University of New Jersey [New Brunswick] (RU); Rutgers University System (Rutgers); University of Oregon [Eugene]; Scalable Storage for Clouds and Beyond (KerData); SYSTÈMES LARGE ÉCHELLE (IRISA-D1); Université de Rennes 1 (UR1); Université de Rennes (UNIV-RENNES); IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique)
- Funding
NSF OAC 1640834, OAC 1835661, OAC 1835692, OCE 1745246; ANR-15-CE25-0003, OverFlow, Workflow Data Management as a Service pour des Applications Multi-Site (2015)
- Subjects
[SDE]Environmental Sciences ,[INFO]Computer Science [cs] - Abstract
figshare; figshare
- Published
- 2019
16. Towards a demonstrator of the Sigma Data Processing Architecture for BDEC 2
- Author
-
Antoniu, Gabriel, Costan, Alexandru, Marcu, Ovidiu-Cristian, Hernández-Pérez, María, Stojanovic, Nenad, Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes 1 (UR1), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Universidad Politécnica de Madrid (UPM), Poznan Supercomputing and Networking Center, Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École 
normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), and Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique)
- Subjects
Data processing ,Big data ,[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB] ,High performance computing ,Edge computing ,[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] ,ComputingMilieux_MISCELLANEOUS - Abstract
International audience
- Published
- 2019
17. From Big Data to Fast Data: Efficient Stream Data Management
- Author
-
Costan, Alexandru, Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES), Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-CentraleSupélec-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), ENS Rennes, Christian Pérez, Institut National des Sciences Appliquées (INSA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées 
(INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), and Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique)
- Subjects
Big Data ,analyse de données ,[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB] ,gestion de données ,workflow management ,transfert de données ,[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS] ,gestion de métadonnées ,données massives ,storage ,transactions ,in-transit processing ,[INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR] ,HPC ,data transfers ,traitement de flux ,data management ,metadata management ,[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] ,data analytics ,stream processing - Published
- 2019
18. The Sigma Data Processing Architecture: Leveraging Future Data for Extreme-Scale Data Analytics to Enable High-Precision Decisions
- Author
-
Antoniu, Gabriel, Costan, Alexandru, Pérez, Maria, Stojanovic, Nenad, Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-CentraleSupélec-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Universidad Politécnica de Madrid (UPM), Nissatech, Indiana University Bloomington, Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale 
supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), and Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique)
- Subjects
[INFO.INFO-WB]Computer Science [cs]/Web ,[INFO.INFO-OS]Computer Science [cs]/Operating Systems [cs.OS] ,[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] - Abstract
International audience; This white paper introduces several key principles based on which HPC-Big Data convergence can be achieved: 1) use future (simulated) data to substantially enrich knowledge obtained based on historical (past) data; 2) enable high-precision analytics thanks to hybrid modeling combining simulation and data-driven models; 3) enable unified data processing thanks to a data processing framework able to relevantly leverage and combine stream processing and batch processing in situ and in transit.
- Published
- 2018
19. Exploring shared state in key-value store for window-based multi-pattern streaming
- Author
-
Marcu, Ovidiu-Cristian, Tudoran, Radu, Nicolae, Bogdan, Costan, Alexandru, Antoniu, Gabriel, and Pérez Hernández, María de los Santos
- Subjects
Computer science - Abstract
We are now witnessing an unprecedented growth of data that needs to be processed at ever-increasing rates in order to extract valuable insights. Big Data streaming analytics tools have been developed to cope with the online dimension of data processing: they enable real-time handling of live data sources by means of stateful aggregations (operators). Current state-of-the-art frameworks (e.g. Apache Flink [1]) enable each operator to work in isolation by creating data copies, at the expense of increased memory utilization. In this paper, we explore the feasibility of deduplication techniques to address the challenge of reducing the memory footprint of window-based stream processing without significant impact on performance. We design a deduplication method specifically for window-based operators that rely on key-value stores to hold a shared state. We experiment with a synthetically generated workload while considering several deduplication scenarios and, based on the results, we identify several potential areas of improvement. Our key finding is that more fine-grained interactions between streaming engines and (key-value) stores need to be designed in order to better respond to scenarios that have to overcome memory scarcity.
- Published
- 2017
20. Dynamic meta-scheduling architecture based on monitoring in distributed systems
- Author
-
Pop, Florin, Dobre, Ciprian, Stratan, Corina, Costan, Alexandru, and Cristea, Valentin
- Subjects
Scheduling (Management) -- Methods ,Science and technology - Abstract
The scheduling process in large-scale distributed systems (LSDS) has become more important due to increases in the number of users and applications. This paper presents a dynamic meta-scheduling architecture model for LSDS based on monitoring. The dynamic scheduling process tries to perform task allocation on the fly as the application executes. Monitoring plays an important role in this process because it can offer a full view of the nodes in a distributed system. The proposed architecture is an agent framework and contains a grid monitoring service, an execution service and a discovery service. The performance of the monitoring system used, MonALISA, is very important for dynamic scheduling because it enables real-time operation. The experimental results validate our architecture and scheduling model.
- Published
- 2010
21. Tyr: Blob storage systems meet Built-In transactions
- Author
-
Matri, Pierre, Costan, Alexandru, Antoniu, Gabriel, Montes Sánchez, Jesús, and Pérez Hernández, María de los Santos
- Subjects
Computer science - Abstract
Concurrent Big Data applications often require high-performance storage, as well as ACID (Atomicity, Consistency, Isolation, Durability) transaction support. Although blobs (binary large objects) are an increasingly popular storage model for such applications, state-of-the-art blob storage systems offer no transaction semantics. This requires users to coordinate data access carefully in order to avoid race conditions, inconsistent writes, overwrites and other problems that cause erratic behavior. We argue there is a gap between existing storage solutions and application requirements, which limits the design of transaction-oriented applications. We introduce Týr, the first blob storage system to provide built-in, multi-blob transactions, while retaining sequential consistency and high throughput under heavy access concurrency. Týr offers fine-grained random write access to data and in-place atomic operations. Large-scale experiments with a production application from CERN LHC show Týr throughput outperforming state-of-the-art solutions by more than 75%.
- Published
- 2016
22. Týr: Efficient Transactional Storage for Data-Intensive Applications
- Author
-
Matri, Pierre, Costan, Alexandru, Antoniu, Gabriel, Montes, Jesús, Pérez, María, Universidad Politécnica de Madrid (UPM), Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA), Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS), Marie Skłodowska-Curie Actions, Inria Rennes Bretagne Atlantique, Universidad Politécnica de Madrid, European Project: 642963,H2020 Pilier Excellent Science,H2020-MSCA-ITN-2014,BigStorage(2015), Institut National des Sciences 
Appliquées (INSA)-Université de Rennes (UNIV-RENNES), CentraleSupélec-Télécom Bretagne-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Université de Bretagne Sud (UBS)-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-CentraleSupélec-Télécom Bretagne-Université de Rennes 1 (UR1), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Université de Bretagne Sud (UBS)-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), and Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)
- Subjects
grid'5000 ,distributed systems ,[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB] ,consistency ,metadata ,données massives ,[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS] ,blobs ,intergiciel ,stockage ,storage ,supervision ,monitoring ,middleware ,transactions ,systèmes distribués ,big data ,métadonnées ,consistence ,cloud ,[INFO]Computer Science [cs] ,[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] - Abstract
As the computational power used by large-scale applications increases, the amount of data they need to manipulate tends to increase as well. A wide range of such applications requires robust and flexible storage support for atomic, durable and concurrent transactions. Historically, databases have provided the de facto solution to transactional data management, but they have forced applications to drop control over data layout and access mechanisms, while remaining unable to meet the scale requirements of Big Data. More recently, key-value stores have been introduced to address these issues. However, this solution does not provide transactions, or only restricted transaction support, compelling users to carefully coordinate access to data in order to avoid race conditions, partial writes, overwrites, and other hard problems that cause erratic behaviour. We argue there is a gap between existing storage solutions and application requirements that limits the design of transaction-oriented data-intensive applications. In this paper we introduce Týr, a massively parallel distributed transactional blob storage system. A key feature behind Týr is its novel multi-versioning management designed to keep the metadata overhead as low as possible while still allowing fast queries or updates and preserving transaction semantics. Its shared-nothing architecture ensures minimal contention and provides low latency for large numbers of concurrent requests. Týr is the first blob storage system to provide sequential consistency and high throughput, while enabling unforeseen transaction support. Experiments with a real-life application from the CERN LHC show Týr throughput outperforming state-of-the-art solutions by more than 100%.; As the computing power used by large-scale applications increases, the volume of data they manipulate tends to increase as well. Many of these applications require a robust and flexible storage system that allows transactions to execute concurrently. Previously, databases were the de facto solution for transactional data management, but they prevent applications from controlling how data is laid out in storage and how it is accessed, while remaining unable to meet the constraints posed by massive data. More recently, key-value storage systems were created to address this problem. However, these solutions provide no transaction support, or only partial support, forcing users to carefully coordinate data access in order to avoid race conditions, partial writes, overwrites, and other problems that cause erratic application behaviour. We argue that there is a gap between current storage solutions and user needs, which limits the design of transactional applications that manage massive volumes of data. In this document, we present Týr, a distributed, transactional blob storage system. One of Týr's main features is its novel version management, designed to allow fast read and write access to data while preserving transactional semantics and requiring low metadata overhead. Its decentralized architecture guarantees minimal contention and enables low latency under a large number of concurrent requests. Týr is the first blob storage system to provide both sequential consistency and high throughput while also supporting transactions. Experiments with a real application from the CERN LHC show that Týr's throughput exceeds that of current solutions by more than 100%.
- Published
- 2016
23. Scaling Smart Appliances for Spatial Data Synthesis
- Author
-
Pineda-Morales, Luis, Subramaniam, Balaji, Keahey, Kate, Antoniu, Gabriel, Costan, Alexandru, Wang, Shaowen, Padmanabhan, Anand, Soliman, Aiman, Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), CentraleSupélec-Télécom Bretagne-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Université de Bretagne Sud (UBS)-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-CentraleSupélec-Télécom Bretagne-Université de Rennes 1 (UR1), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Université de Bretagne Sud (UBS)-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA), Microsoft Research - Inria Joint Centre (MSR - INRIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Microsoft Research Laboratory Cambridge-Microsoft Corporation [Redmond, Wash.], Argonne National Laboratory [Lemont] (ANL), Institut National des Sciences Appliquées - Rennes (INSA 
Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES), Department of Geography and Geographic Information Science and Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign [Urbana], University of Illinois System-University of Illinois System, CyberGIS Center for Advanced Digital and Spatial Studies (CyberGIS), National Center for Supercomputing Applications (NCSA), Joint Laboratory in Extreme Scale Computing (JLESC), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Télécom Bretagne-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS), and Institut National des Sciences Appliquées (INSA)
- Subjects
elastic provisioning ,spatial data ,cloud computing ,[INFO]Computer Science [cs] ,[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] - Abstract
International audience; With the rapidly growing number of dynamic data streams produced by sensing and experimental devices as well as social networks, scientists are given an unprecedented opportunity to explore a variety of environmental and social phenomena, ranging from weather and climate to population dynamics. One of the main challenges is that dynamic data streams and their computation requirements are volatile: sensors or social networks may generate data at highly variable rates, processing time in an application may change significantly from one stage to the next, or different phenomena may simply generate different levels of interest. Cloud computing is a promising platform for coping with such volatility because it enables us to allocate computational resources on demand, for short periods of time, and at an acceptable cost. At the same time, using clouds for this purpose is challenging because an application may yield very different performance depending on the hosting infrastructure, requiring us to pay special attention to how and where we schedule resources. In this poster, we describe our experiences using an application relying on input from social networks, notably geo-located tweets, to discover correlations between users’ work and home locations, with a focus on the Illinois area. Our overall intent is to assess the impact of running the same application on offerings from different providers; to this end, we execute data filtering and per-user classification applications on two flavors of Chameleon cloud instances, namely bare-metal and KVM. We also analyze specific configuration parameters, such as data block size, replication factor and parallel processing, towards statistically modeling the application performance on a given infrastructure. We then identify and discuss the key parameters that influence the execution time.
Finally, we look into the gains brought by accounting for data proximity when scheduling a resource in a multi-site environment.
- Published
- 2015
24. Efficient Scheduling of Scientific Workflows Using Hot Metadata in a Multisite Cloud.
- Author
-
Liu, Ji, Pineda, Luis, Pacitti, Esther, Costan, Alexandru, Valduriez, Patrick, Antoniu, Gabriel, and Mattoso, Marta
- Subjects
METADATA ,WORKFLOW management ,WORKFLOW management systems ,SCHEDULING ,SERVER farms (Computer network management) ,SOVEREIGN wealth funds - Abstract
Large-scale, data-intensive scientific applications are often expressed as scientific workflows (SWfs). In this paper, we consider the problem of efficient scheduling of a large SWf in a multisite cloud, i.e., a cloud with geo-distributed cloud data centers (sites). The reasons for using multiple cloud sites to run a SWf are that data is already distributed, the necessary resources exceed the limits at a single site, or the monetary cost is lower. In a multisite cloud, metadata management has a critical impact on the efficiency of SWf scheduling as it provides a global view of data location and enables task tracking during execution. Thus, it should be readily available to the system at any given time. While it has been shown that efficient metadata handling plays a key role in performance, little research has targeted this issue in multisite clouds. In this paper, we propose to identify and exploit hot metadata (frequently accessed metadata) for efficient SWf scheduling in a multisite cloud, using a distributed approach. We implemented our approach within a scientific workflow management system and show that it reduces the execution time of highly parallel jobs by up to 64 percent and that of whole SWfs by up to 55 percent. [ABSTRACT FROM AUTHOR]
- Published
- 2019
25. Euro-Par 2014: Parallel Processing Workshops, Part II
- Author
-
Lopez, Luis, Zilinskas, Julius, Costan, Alexandru, Cascella, Roberto Gioacchino, Kecskemeti, Gabor, Jeannot, Emmanuel, Cannataro, Mario, Ricci, Laura, Benkner, Siegfried, Petit, Salvador, Scarano, Vittorio, Gracia, José, Hunold, Sascha, Scott, Stephen L., Lankes, Stefan, Lengauer, Christian, Carretero, Jesus, Breitbart, Jens, and Alexander, Michael
CRACS & Inesc TEC [Porto], Universidade do Porto = University of Porto, Vilnius University [Vilnius], Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria), SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR), Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA), Université de Bretagne Sud (UBS), École normale supérieure - Rennes (ENS Rennes), Télécom Bretagne, CentraleSupélec, Centre National de la Recherche Scientifique (CNRS), Design and Implementation of Autonomous Distributed Systems (MYRIADS), Computer and Automation Research Institute [Budapest] (MTA SZTAKI), Efficient runtime systems for parallel architectures (RUNTIME), Inria Bordeaux - Sud-Ouest, Université de Bordeaux (UB), Laboratoire Bordelais de Recherche en Informatique (LaBRI), École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB), Università degli Studi 'Magna Graecia' di Catanzaro = University of Catanzaro (UMG), University of Pisa - Università di Pisa, Faculty of Computer Science [Vienna], Universität Wien, Universitat Politècnica de València (UPV), Dipartimento di Informatica [Fisciano], Università degli Studi di Salerno = University of Salerno (UNISA), High Performance Computing Center Stuttgart [Stuttgart] (HLRS), University of Stuttgart, Vienna University of Technology (TU Wien), Oak Ridge National Laboratory [Oak Ridge] (ORNL), UT-Battelle, LLC, Tennessee Tech University [Cookeville] (TTU), Rheinisch-Westfälische Technische Hochschule Aachen University (RWTH), University of Passau, Universidad Carlos III de Madrid [Madrid] (UC3M), Technische Universität Munchen - Université Technique de Munich [Munich, Allemagne] (TUM), Université de Rennes 1 (UR1)
- Subjects
[INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] ,ComputingMilieux_MISCELLANEOUS
- Abstract
International audience
- Published
- 2014
26. TomusBlobs: Scalable Data-intensive Processing on Azure Clouds
- Author
-
Costan, Alexandru, Tudoran, Radu, Antoniu, Gabriel, and Brasche, Goetz
Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria), SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR), Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA), Université de Bretagne Sud (UBS), École normale supérieure - Rennes (ENS Rennes), Télécom Bretagne, CentraleSupélec, Centre National de la Recherche Scientifique (CNRS), European Microsoft Innovation Center (EMIC), Microsoft Corporation [Redmond, Wash.], Université de Rennes 1 (UR1)
- Subjects
[INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
- Abstract
The emergence of cloud computing has brought the opportunity to use large-scale compute infrastructures for a broader and broader spectrum of applications and users. As the cloud paradigm gets attractive for the "elasticity" in resource usage and associated costs (the users only pay for resources actually used), cloud applications still suffer from the high latencies and low performance of cloud storage services. As Big Data analysis on clouds becomes more and more relevant in many application areas, enabling high-throughput massive data processing on cloud data becomes a critical issue, as it impacts the overall application performance. In this paper we address this challenge at the level of cloud storage. We introduce a concurrency-optimized data storage system (called TomusBlobs) which federates the virtual disks associated to the Virtual Machines running the application code on the cloud. We demonstrate the performance benefits of our solution for efficient data-intensive processing by building an optimized prototype MapReduce framework for Microsoft's Azure cloud platform based on TomusBlobs. Finally, we specifically address the limitations of state-of-the-art MapReduce frameworks for reduce-intensive workloads, by proposing MapIterativeReduce as an extension of the MapReduce model. We validate the above contributions through large-scale experiments with synthetic benchmarks and with real-world applications on the Azure commercial cloud, using resources distributed across multiple data centers: they demonstrate that our solutions bring substantial benefits to data intensive applications compared to approaches relying on state-of-the-art cloud object storage.
- Published
- 2013
- Full Text
- View/download PDF
27. A-Brain: Using the Cloud to Understand the Impact of Genetic Variability on the Brain
- Author
-
Antoniu, Gabriel, Costan, Alexandru, da Mota, Benoit, Thirion, Bertrand, and Tudoran, Radu
Scalable Storage for Clouds and Beyond (KerData), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria), SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR), Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA), Université de Bretagne Sud (UBS), École normale supérieure - Rennes (ENS Rennes), Télécom Bretagne, CentraleSupélec, Centre National de la Recherche Scientifique (CNRS), Algorithms and Models for Integrative Biology (AMIB), Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX), École polytechnique (X), Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud - Paris 11 (UP11), Inria Saclay - Ile de France, Modelling brain structure, function and variability based on high-field MRI data (PARIETAL), Service NEUROSPIN (NEUROSPIN), Université Paris-Saclay, Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA), A-Brain project, INRIA - Microsoft Research, Université de Rennes 1 (UR1)
- Subjects
[SDV.BIBS] Life Sciences [q-bio]/Quantitative Methods [q-bio.QM] ,[SCCO.COMP] Cognitive science/Computer science ,education ,[INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] ,[INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]
- Abstract
Joint genetic and neuroimaging data analysis on large cohorts of subjects is a new approach used to assess and understand the variability that exists between individuals. This approach has remained poorly understood so far and brings forward very significant challenges, as progress in this field can open pioneering directions in biology and medicine. As both neuroimaging- and genetic-domain observations represent a huge amount of variables (of the order of 10^6), performing statistically rigorous analyses on such Big Data represents a computational challenge that cannot be addressed with conventional computational techniques. In the A-Brain project, we address this computational problem using cloud computing techniques on Microsoft Azure, relying on our complementary expertise in the area of scalable cloud data management and in the field of neuroimaging and genetics data analysis.
- Published
- 2012
28. An Architectural Model for a Grid based Workflow Management Platform in Scientific Applications
- Author
-
Costan, Alexandru, Pop, Florin, Stratan, Corina, Dobre, Ciprian, Leordeanu, Catalin, and Cristea, Valentin
- Subjects
FOS: Computer and information sciences ,Computer Science - Distributed, Parallel, and Cluster Computing ,Distributed, Parallel, and Cluster Computing (cs.DC)
- Abstract
With recent increasing computational and data requirements of scientific applications, the use of large clustered systems as well as distributed resources is inevitable. Although executing large applications in these environments brings increased performance, the automation of the process becomes more and more challenging. While the use of complex workflow management systems has been a viable solution for this automation process in business oriented environments, the open source engines available for scientific applications lack some functionalities or are too difficult to use for non-specialists. In this work we propose an architectural model for a grid based workflow management platform providing features like an intuitive way to describe workflows, efficient data handling mechanisms and flexible fault tolerance support. Our integrated solution introduces a workflow engine component based on ActiveBPEL extended with additional functionalities and a scheduling component providing efficient mapping between tasks and available resources. 17th International Conference on Control Systems and Computer Science (CSCS 17), Bucharest, Romania, May 26-29, 2009. Vol. 1, pp. 407-414, ISSN: 2066-4451
- Published
- 2011
29. Critical Analysis of Middleware Architectures for Large Scale Distributed Systems
- Author
-
Pop, Florin, Dobre, Ciprian Mihai, Costan, Alexandru, Andreica, Mugurel Ionut, Tirsa, Eliana-Dina, Stratan, Corina, and Cristea, Valentin
Computer Science Department [Bucarest], University Politehnica of Bucharest [Romania] (UPB), Parallel and Distributed Systems Laboratory [Bucarest], University Politehnica of Bucarest, Department of Computer Science [Bucharest]
- Subjects
Networking and Internet Architecture (cs.NI) ,FOS: Computer and information sciences ,Computer Science - Networking and Internet Architecture ,Computer Science - Distributed, Parallel, and Cluster Computing ,C.2.4 ,D.4 ,[INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] ,Distributed, Parallel, and Cluster Computing (cs.DC)
- Abstract
Some of the contents of this paper have been used as teaching materials in several university classes. Distributed computing is increasingly being viewed as the next phase of Large Scale Distributed Systems (LSDSs). However, the vision of large scale resource sharing is not yet a reality in many areas - Grid computing is an evolving area of computing, where standards and technology are still being developed to enable this new paradigm. Hence, in this paper we analyze the current development of middleware tools for LSDS, from multiple perspectives: architecture, applications and market research. For each perspective we are interested in relevant technologies used in undergoing projects, existing products or services and useful design issues. In the end, based on this approach, we draw some conclusions regarding the future research directions in this area.
- Published
- 2009
30. Spark Versus Flink: Understanding Performance in Big Data Analytics Frameworks.
- Author
-
Marcu, Ovidiu-Cristian, Costan, Alexandru, Antoniu, Gabriel, and Perez-Hernandez, Maria S.
- Published
- 2016
- Full Text
- View/download PDF
31. TomusBlobs: scalable data-intensive processing on Azure clouds.
- Author
-
Costan, Alexandru, Tudoran, Radu, Antoniu, Gabriel, and Brasche, Goetz
- Subjects
ELECTRONIC data processing ,CLOUD computing ,INTERNET ,INFORMATION storage & retrieval systems ,VIRTUAL machine systems
- Abstract
The emergence of cloud computing has brought the opportunity to use large-scale compute infrastructures for a broader and broader spectrum of applications and users. As the cloud paradigm gets attractive for the 'elasticity' in resource usage and associated costs (the users only pay for resources actually used), cloud applications still suffer from the high latencies and low performance of cloud storage services. As Big Data analysis on clouds becomes more and more relevant in many application areas, enabling high-throughput massive data processing on cloud data becomes a critical issue, as it impacts the overall application performance. In this paper, we address this challenge at the level of cloud storage. We introduce a concurrency-optimized data storage system (called TomusBlobs), which federates the virtual disks associated to the Virtual Machines running the application code on the cloud. We demonstrate the performance benefits of our solution for efficient data-intensive processing by building an optimized prototype MapReduce framework for Microsoft's Azure cloud platform on the basis of TomusBlobs. Finally, we specifically address the limitations of state-of-the-art MapReduce frameworks for reduce-intensive workloads, by proposing MapIterativeReduce as an extension of the MapReduce model. We validate the aforementioned contributions through large-scale experiments with synthetic benchmarks and with real-world applications on the Azure commercial cloud by using resources distributed across multiple data centers; they demonstrate that our solutions bring substantial benefits to data-intensive applications compared with approaches relying on state-of-the-art cloud object storage. Copyright © 2013 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
32. Bridging Data in the Clouds: An Environment-Aware System for Geographically Distributed Data Transfers.
- Author
-
Tudoran, Radu, Costan, Alexandru, Wang, Rui, Bouge, Luc, and Antoniu, Gabriel
- Published
- 2014
- Full Text
- View/download PDF
33. Big Data Storage and Processing on Azure Clouds: Experiments at Scale and Lessons Learned.
- Author
-
Tudoran, Radu, Costan, Alexandru, Antoniu, Gabriel, and Brasche, Goetz
- Published
- 2014
- Full Text
- View/download PDF
34. Transfer as a Service: Towards a Cost-Effective Model for Multi-site Cloud Data Management.
- Author
-
Tudoran, Radu, Costan, Alexandru, and Antoniu, Gabriel
- Published
- 2014
- Full Text
- View/download PDF
35. To Overlap or Not to Overlap: Optimizing Incremental MapReduce Computations for On-Demand Data Upload.
- Author
-
Ene, Stefan, Nicolae, Bogdan, Costan, Alexandru, and Antoniu, Gabriel
- Published
- 2014
- Full Text
- View/download PDF
36. DataSteward: Using Dedicated Compute Nodes for Scalable Data Management on Public Clouds.
- Author
-
Tudoran, Radu, Costan, Alexandru, and Antoniu, Gabriel
- Published
- 2013
- Full Text
- View/download PDF
37. Distributed Data Storage in Support for Context-Aware Applications.
- Author
-
Burceanu, Elena, Dobre, Ciprian, Cristea, Valentin, Costan, Alexandru, and Antoniu, Gabriel
- Published
- 2013
- Full Text
- View/download PDF
38. MapIterativeReduce.
- Author
-
Tudoran, Radu, Costan, Alexandru, and Antoniu, Gabriel
- Published
- 2012
- Full Text
- View/download PDF
39. TomusBlobs: Towards Communication-Efficient Storage for MapReduce Applications in Azure.
- Author
-
Tudoran, Radu, Costan, Alexandru, Antoniu, Gabriel, and Soncu, Hakan
- Abstract
The emergence of cloud computing brought the opportunity to use large-scale compute infrastructures for a broad spectrum of applications and users. As the cloud paradigm gets attractive for the "elasticity" in resource usage and associated costs (the users only pay for resources actually used), cloud applications still suffer from the high latencies and low performance of cloud storage services. Enabling high-throughput massive data processing on cloud data becomes a critical issue, as it impacts the overall application performance. In this paper we address the above challenge at the level of the cloud storage. We introduce a concurrency-optimized data storage system which federates the virtual disks associated to VMs. We demonstrate the performance of our solution for efficient data-intensive processing on commercial clouds by building an optimized prototype MapReduce framework for Azure that leverages the benefits of our storage solution. We perform extensive synthetic benchmarks as well as experiments with real-world applications: they demonstrate that our solution brings substantial benefits to data intensive applications compared to approaches relying on state-of-the-art cloud object storage. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
40. Prediction of Distributed Systems State Based on Monitoring Data.
- Author
-
Draghici, Adriana, Costan, Alexandru, and Cristea, Valentin
- Published
- 2010
- Full Text
- View/download PDF
41. Monitoring, accounting and automated decision support for the ALICE experiment based on the MonALISA framework.
- Author
-
Cirstoiu, Catalin C., Grigoras, Costin C., Betev, Latchezar L., Costan, Alexandru A., and Legrand, Iosif Charles
- Published
- 2007
- Full Text
- View/download PDF
42. Machine learning patterns for neuroimaging-genetic studies in the cloud.
- Author
-
Da Mota, Benoit, Tudoran, Radu, Costan, Alexandru, Varoquaux, Gaël, Brasche, Goetz, Conrod, Patricia, Lemaitre, Herve, Paus, Tomas, Rietschel, Marcella, Frouin, Vincent, Poline, Jean-Baptiste, Antoniu, Gabriel, and Thirion, Bertrand
- Subjects
MACHINE learning ,BRAIN imaging ,FUNCTIONAL magnetic resonance imaging ,NEUROINFORMATICS ,DATA analysis ,CLOUD computing
- Abstract
Brain imaging is a natural intermediate phenotype to understand the link between genetic information and behavior or brain pathologies risk factors. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, increasing computational power in distributed architectures can be harnessed, if new neuroinformatics infrastructures are designed and training to use these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a 2 weeks deployment on hundreds of virtual machines. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
43. Towards a Generic Security Framework for Cloud Data Management Environments.
- Author
-
Carpen-Amarie, Alexandra, Costan, Alexandru, Leordeanu, Catalin, Basescu, Cristina, and Antoniu, Gabriel
- Published
- 2012
- Full Text
- View/download PDF
44. BRINGING INTROSPECTION INTO BLOBSEER: TOWARDS A SELF-ADAPTIVE DISTRIBUTED DATA MANAGEMENT SYSTEM.
- Author
-
CARPEN-AMARIE, ALEXANDRA, COSTAN, ALEXANDRU, CAI, JING, ANTONIU, GABRIEL, and BOUGÉ, LUC
- Subjects
DATABASE management ,DISTRIBUTED databases ,ADAPTIVE control systems ,DISTRIBUTED computing ,INFORMATION storage & retrieval systems ,PERFORMANCE evaluation ,EXPERIMENTS
- Published
- 2011
- Full Text
- View/download PDF
45. Towards Multi-site Metadata Management for Geographically Distributed Cloud Workflows.
- Author
-
Pineda-Morales, Luis, Costan, Alexandru, and Antoniu, Gabriel
- Published
- 2015
- Full Text
- View/download PDF
46. MonALISA developers describe how it works, the key design principles behind it, and the biggest technical challenges in building it.
- Author
-
Legrand, Iosif, Voicu, Ramiro, Cirstoiu, Catalin, Grigoras, Costin, Betev, Latchezar, and Costan, Alexandru
- Subjects
COMPUTER architecture ,SYSTEMS design ,APPLICATION software ,DISTRIBUTED computing
- Abstract
The article presents the principles behind and the problems encountered by the high energy physics (HEP) group at the California Institute of Technology in 1992 during the conceptualization and design of Monitoring Agents using a Large Integrated Services Architecture (MonALISA). The system aims to provide a distribution system adept in managing big size, data-intensive applications. It describes in detail the layers of the MonALISA architecture namely, the lookup services, MonALISA services and proxy services.
- Published
- 2009
47. 1st Workshop on Big Data Management in Clouds – BDMC2012.
- Author
-
Costan, Alexandru and Dobre, Ciprian
- Published
- 2013
- Full Text
- View/download PDF
48. Is Today’s Public Cloud Suited to Deploy Hardcore Realtime Services? : A CPU Perspective
- Author
-
Raaen, Kjetil, Petlund, Andreas, Halvorsen, Pål, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Kobsa, Alfred, editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Weikum, Gerhard, editor, an Mey, Dieter, editor, Alexander, Michael, editor, Bientinesi, Paolo, editor, Cannataro, Mario, editor, Clauss, Carsten, editor, Costan, Alexandru, editor, Kecskemeti, Gabor, editor, Morin, Christine, editor, Ricci, Laura, editor, Sahuquillo, Julio, editor, Schulz, Martin, editor, Scarano, Vittorio, editor, Scott, Stephen L., editor, and Weidendorfer, Josef, editor
- Published
- 2014
- Full Text
- View/download PDF
49. Elastic Manycores : How to Bring the OS Back into the Scheduling Game?
- Author
-
Völp, Marcus, Roitzsch, Michael, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Kobsa, Alfred, editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Weikum, Gerhard, editor, an Mey, Dieter, editor, Alexander, Michael, editor, Bientinesi, Paolo, editor, Cannataro, Mario, editor, Clauss, Carsten, editor, Costan, Alexandru, editor, Kecskemeti, Gabor, editor, Morin, Christine, editor, Ricci, Laura, editor, Sahuquillo, Julio, editor, Schulz, Martin, editor, Scarano, Vittorio, editor, Scott, Stephen L., editor, and Weidendorfer, Josef, editor
- Published
- 2014
- Full Text
- View/download PDF
50. The PerSyst Monitoring Tool : A Transport System for Performance Data Using Quantiles
- Author
-
Guillen, Carla, Hesse, Wolfram, Brehm, Matthias, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Lopes, Luís, editor, Žilinskas, Julius, editor, Costan, Alexandru, editor, Cascella, Roberto G., editor, Kecskemeti, Gabor, editor, Jeannot, Emmanuel, editor, Cannataro, Mario, editor, Ricci, Laura, editor, Benkner, Siegfried, editor, Petit, Salvador, editor, Scarano, Vittorio, editor, Gracia, José, editor, Hunold, Sascha, editor, Scott, Stephen L., editor, Lankes, Stefan, editor, Lengauer, Christian, editor, Carretero, Jesús, editor, Breitbart, Jens, editor, and Alexander, Michael, editor
- Published
- 2014
- Full Text
- View/download PDF