56 results on "Sandro, Fiore"
Search Results
2. An EOSC-enabled Data Space environment for the climate community
- Author
-
Fabrizio Antonio, Donatello Elia, Guillaume Levavasseur, Atef Ben Nasser, Paola Nassisi, Alessandro D'Anca, Alessandra Nuzzo, Sandro Fiore, Sylvie Joussaume, and Giovanni Aloisio
- Abstract
The exponential increase in data volumes and complexities is causing a radical change in the scientific discovery process in several domains, including climate science. This affects the different stages of the data lifecycle, thus posing significant data management challenges in terms of data archiving, access, analysis, visualization, and sharing. The data space concept can support scientists' workflows and simplify the process towards a more FAIR use of data. In the context of the European Open Science Cloud (EOSC) initiative launched by the European Commission, the ENES Data Space (EDS) represents a domain-specific implementation of the data space concept. The service, developed in the frame of the EGI-ACE project, aims to provide an open, scalable, cloud-enabled data science environment for climate data analysis on top of the EOSC Compute Platform. It is accessible in EOSC through the EOSC Catalogue and Marketplace (https://marketplace.eosc-portal.eu/services/enes-data-space), and it also provides a web portal (https://enesdataspace.vm.fedcloud.eu) including information, tutorials, and training materials on how to get started with its main features. The EDS integrates into a single environment ready-to-use climate datasets, compute resources, and tools, all made available through the Jupyter interface, with the aim of supporting the overall scientific data processing workflow. Specifically, the data store linked to the ENES Data Space provides access to a multi-terabyte set of variable-centric collections from large-scale global climate experiments. The data pool consists of a mirrored subset of CMIP (Coupled Model Intercomparison Project) datasets from the ESGF (Earth System Grid Federation) federated data archive, collected and kept synchronized with the remote copies by using the Synda tool developed within the scope of the IS-ENES3 H2020 project.
Community-based, open source frameworks (e.g., Ophidia) and libraries from the Python ecosystem provide the capabilities for data access, analysis, and visualization. Results and experiment definitions (i.e., Jupyter Notebooks) can be easily shared among users, promoting data sharing and application re-use towards a more Open Science approach. An overview of the data space capabilities, along with the key aspects in terms of data management, will be presented in this work.
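A minimal sketch of the kind of notebook analysis the EDS environment supports, using a small synthetic stand-in for a CMIP variable-centric collection (the variable name `tas`, grid, and values are illustrative assumptions, not EDS data):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for a CMIP "tas" (near-surface air temperature) collection:
# 24 monthly steps on a coarse 10x20 lat-lon grid.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
lat = np.linspace(-85, 85, 10)
lon = np.linspace(0, 350, 20)
rng = np.random.default_rng(0)
tas = xr.DataArray(
    280 + 10 * rng.standard_normal((24, 10, 20)),
    coords={"time": time, "lat": lat, "lon": lon},
    name="tas",
    attrs={"units": "K"},
)

# Typical first steps in a data-space notebook: a monthly climatology
# and an (unweighted) global-mean time series.
climatology = tas.groupby("time.month").mean("time")
global_mean = tas.mean(["lat", "lon"])
print(climatology.sizes)
print(global_mean.sizes)
```

On the actual service, `tas` would instead come from `xr.open_dataset(...)` on a file in the mirrored CMIP data pool; the analysis lines are unchanged.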
- Published
- 2023
3. Tracking and reporting peta-scale data exploitation within the Earth System Grid Federation through the ESGF Data Statistics service
- Author
-
Alessandra Nuzzo, Fabrizio Antonio, Maria Mirto, Paola Nassisi, Sandro Fiore, and Giovanni Aloisio
- Abstract
The Earth System Grid Federation (ESGF) is an international collaboration powering most global climate change research and managing the first-ever decentralized repository for handling climate science data, with multiple petabytes of data at dozens of federated sites worldwide. It is recognized as the leading infrastructure for the management of and access to large distributed data volumes for climate change research, and it supports the Coupled Model Intercomparison Project (CMIP) and the Coordinated Regional Climate Downscaling Experiment (CORDEX), whose protocols enable the periodic assessments carried out by the Intergovernmental Panel on Climate Change (IPCC). As a trusted international repository, ESGF hosts and replicates data from a broad range of domains and communities in the Earth sciences, thus strongly supporting standards for connecting data and the application of the FAIR data principles to ensure free and open access and interoperability with other similar systems in the Earth sciences. ESGF includes a specific software component, funded by the H2020 projects IS-ENES2 and IS-ENES3, named ESGF Data Statistics, which collects, analyzes, and visualizes data usage metrics and data archive information across the federation. It provides a distributed and scalable software infrastructure responsible for capturing a set of metrics at both the single-site and federation levels. It collects and stores a high volume of heterogeneous metrics, covering coarse- and fine-grained measures such as download and client statistics, as well as aggregated cross-project and project-specific download statistics, thus offering a more user-oriented perspective on the scientific experiments. This provides strong feedback on how much, how frequently, and how intensively the whole federation is exploited by end-users, as well as on the most downloaded data, which captures the level of community interest in specific data.
It also gives feedback on the less accessed data, which on the one hand can help in designing larger-scale experiments in the future and on the other can offer insights into the long tail of research. On top of this, a view of the total amount of data published and available through ESGF offers users the possibility to monitor the status of the data archive of the entire federation. This contribution presents an overview of the Data Statistics capabilities as well as the main results in terms of data analysis and visualization.
- Published
- 2023
4. An EOSC-enabled Data Space Environment for Climate Science
- Author
-
Donatello Elia, Sandro Fiore, Fabrizio Antonio, Guillaume Levavasseur, Paola Nassisi, Alessandro D'Anca, Sylvie Joussaume, and Giovanni Aloisio
- Subjects
Open Science, Data Space, Climate Science - Abstract
In the context of the European Open Science Cloud, the ENES Data Space represents a domain-specific implementation of the data space concept: a digital ecosystem supporting scientific communities towards a more sustainable, effective, and FAIR use of data. This ecosystem has recently been opened to climate users, bringing datasets, tools, and services together into a single environment with ready-to-use data and programmatic capabilities for the development of data science applications. Presently, the data store of the ENES Data Space provides access to model output from large-scale global experiments for climate model intercomparison. The storage and computational resources for the execution of the data space are provided by the EGI Federated Cloud e-Infrastructure. From a science gateway perspective, the ENES Data Space provides an interactive web-based environment based on the Jupyter project and available through the EOSC Marketplace.
- Published
- 2023
- Full Text
- View/download PDF
5. A multi-model architecture based on Long Short-Term Memory neural networks for multi-step sea level forecasting
- Author
-
Giovanni Aloisio, Gabriele Accarino, Ivan Federico, Sandro Fiore, Giovanni Coppini, Marco Chiarelli, and Salvatore Causio
- Subjects
Meteorology, Artificial neural network, Computer Networks and Communications, Computer science, Climate change, Storm surge, Mediterranean Sea, Hardware and Architecture, Climate change scenario, Architecture, Coastal flood, Software, Sea level - Abstract
The intensification of extreme events, storm surges, and coastal flooding in a climate change scenario increasingly influences human processes, especially in coastal areas where sea-based activities are concentrated. Predicting sea level near the coasts, with high accuracy and in a reasonable amount of time, becomes a strategic task. Despite the development of complex numerical codes for high-resolution ocean modeling, making forecasts in areas at the intersection between land and sea remains challenging. In this respect, the use of machine learning techniques can represent an interesting alternative to be investigated and evaluated by numerical modelers. This article presents the application of Long Short-Term Memory (LSTM) neural networks to the problem of short-term sea level forecasting in the Southern Adriatic Northern Ionian (SANI) domain in the Mediterranean Sea. The proposed multi-model architecture based on LSTM networks has been trained to predict mean sea levels three days ahead for different coastal locations. Predictions were compared with the observation data collected through tide-gauge devices as well as with the forecasts produced by the Southern Adriatic Northern Ionian Forecasting System (SANIFS) developed at the Euro-Mediterranean Center on Climate Change (CMCC), which provides short-term, daily updated forecasts in the Mediterranean basin. Experimental results demonstrate that the multi-model architecture is able to capture long-range temporal dependencies and to produce predictions with a much higher accuracy than the SANIFS forecasts.
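The multi-step setup described above can be sketched in plain NumPy: a (synthetic) tide-gauge series is sliced into input windows and multi-step-ahead targets, which is the shape an LSTM like the one in the paper would be trained on. The window lengths and the synthetic series are assumptions for illustration, not the paper's configuration:

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Slice a 1-D series into (inputs, targets) pairs:
    each sample uses n_in past values to predict the next n_out values."""
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i : i + n_in])
        y.append(series[i + n_in : i + n_in + n_out])
    return np.array(X), np.array(y)

# Synthetic sea-level anomaly series: a tide-like oscillation plus noise.
t = np.arange(500)
sea_level = (0.3 * np.sin(2 * np.pi * t / 12.42)
             + 0.05 * np.random.default_rng(1).standard_normal(500))

# 72 past steps in, 3 steps ahead out (the multi-step forecasting shape).
X, y = make_windows(sea_level, n_in=72, n_out=3)
print(X.shape, y.shape)  # (426, 72) (426, 3)
```

Feeding `X` (reshaped to `(samples, timesteps, features)`) and `y` to any recurrent model reproduces the supervised framing; the paper's contribution is the multi-model LSTM architecture trained per coastal location on top of this framing.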
- Published
- 2021
6. ENES Data Space: an open, cloud-enabled data science environment for climate analysis
- Author
-
Fabrizio Antonio, Donatello Elia, Andrea Giannotta, Alessandra Nuzzo, Guillaume Levavasseur, Atef Ben Nasser, Paola Nassisi, Alessandro D'Anca, Sandro Fiore, Sylvie Joussaume, and Giovanni Aloisio
- Abstract
The scientific discovery process has been deeply influenced by the data deluge that started at the beginning of this century. This has caused a profound transformation in several scientific domains, which are now moving towards much more collaborative processes. In the climate sciences domain, the ENES Data Space aims to provide an open, scalable, cloud-enabled data science environment for climate data analysis. It represents a collaborative research environment, deployed on top of the EGI federated cloud infrastructure, specifically designed to address the needs of the ENES community. The service, developed in the context of the EGI-ACE project, provides ready-to-use compute resources and datasets, as well as a rich ecosystem of open source Python modules and community-based tools (e.g., CDO, Ophidia, Xarray, Cartopy), all made available through the user-friendly Jupyter interface. In particular, the ENES Data Space provides access to a multi-terabyte set of specific variable-centric collections from large community experiments to support researchers in climate model data analysis experiments. The data pool of the ENES Data Space consists of a mirrored subset of CMIP datasets from the ESGF federated data archive, collected by using the Synda community tool in order to provide the most up-to-date datasets in a single location. Results and output products, as well as experiment definitions (in the form of Jupyter Notebooks), can be easily shared among users through data sharing services, such as EGI DataHub, which are also being integrated in the infrastructure. The service was opened in the second half of 2021 and is now accessible in the European Open Science Cloud (EOSC) through the EOSC Portal Marketplace (https://marketplace.eosc-portal.eu/services/enes-data-space). This contribution will present an overview of the ENES Data Space service and its main features.
- Published
- 2022
- Full Text
- View/download PDF
7. Skip high-volume data transfer and access free computing resources for your CMIP6 multi-model analyses
- Author
-
Sophie Morellon, Marco Kulüke, Charlotte L Pascoe, Stephan Kindermann, Guillaume Levavasseur, Fabian Wachsmann, Regina Kwee-Hinzmann, Maria Moreno de Castro, Sandro Fiore, Paola Nassisi, Sylvie Joussaume, and Martin Juckes
- Subjects
Computer science, Volume (compression), Data transmission, Computational science - Abstract
Tired of downloading tons of model results? Is your internet connection flaky? Are you about to overload your computer's memory with the constant increase of data volume, and do you need more computing resources? You can request computing time free of charge at one of the supercomputers of the InfraStructure for the European Network for Earth System modelling (IS-ENES) [1], the European part of the Earth System Grid Federation (ESGF) [2], which also hosts and maintains more than 6 Petabytes of CMIP6 and CORDEX data. Thanks to this new EU Commission funded service, you can run your own scripts in your favorite programming language and pre- and post-process model data straightforwardly. There is no need for heavy data transfer: just load, with one line of code, the data slice you need, because your script will directly access the data pool. Calculations that used to last days can therefore be done in seconds. You can test the service; we provide pre-access activities very easily. In this session we will run Jupyter notebooks directly on the German Climate Computing Center (DKRZ) [3], one of the ENES high performance computers and an ESGF data center, showing how to load, filter, concatenate, take means, and plot several CMIP6 models to compare their results, use some CMIP6 models to calculate climate indexes for any location and period, and evaluate model skill against observational data. We will use Climate Data Operators (cdo) [4] and Python packages for Big Data manipulation, such as Intake [5], to easily extract the data from the huge catalog, and Xarray [6], to easily read NetCDF files and scale to parallel computing.
We are continuously creating more use cases for multi-model evaluation, mechanisms of variability, and impact analysis. Visit the demos, find more information, and apply here: https://portal.enes.org/data/data-metadata-service/analysis-platforms.
[1] https://is.enes.org/
[2] https://esgf.llnl.gov/
[3] https://www.dkrz.de/
[4] https://code.mpimet.mpg.de/projects/cdo/
[5] https://intake.readthedocs.io/en/latest/
[6] http://xarray.pydata.org/en/stable/
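The "load only the slice you need" pattern the abstract describes boils down to label-based selection in Xarray. On the analysis platforms the same `.sel` line would run against a dataset opened from the hosted CMIP6 pool; here a synthetic dataset stands in, and the variable name `tas`, grid, and region are illustrative assumptions:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic multi-year field standing in for a CMIP6 variable on the data pool:
# 10 years of monthly data on a 10-degree global grid.
ds = xr.Dataset(
    {"tas": (("time", "lat", "lon"),
             288 + np.random.default_rng(2).standard_normal((120, 18, 36)))},
    coords={
        "time": pd.date_range("2010-01-01", periods=120, freq="MS"),
        "lat": np.linspace(-85, 85, 18),
        "lon": np.linspace(0, 350, 36),
    },
)

# One line selects just the slice of interest: one year over a European box.
europe_2015 = ds["tas"].sel(
    time=slice("2015-01-01", "2015-12-31"),
    lat=slice(35, 70),
    lon=slice(0, 40),
)
print(europe_2015.sizes)
```

Because selection is lazy when the dataset is opened from disk, only the requested slice is actually read, which is what makes server-side access to a multi-petabyte pool practical.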
- Published
- 2021
8. Meridional distribution of moisture transport associated to Tropical Cyclones
- Author
-
Sandro Fiore, Enrico Scoccimarro, Malcolm J. Roberts, Daniele Peano, Alessandro D'Anca, Fabrizio Antonio, Annalisa Cherchi, Silvio Gualdi, and Alessio Bellucci
- Subjects
Moisture, Distribution (number theory), Environmental science, Zonal and meridional, Tropical cyclone, Atmospheric sciences - Abstract
Tropical cyclones (TCs) transport energy and moisture along their pathways, interacting with the climate system, and TC activity is expected to extend further poleward during the 21st century. For this reason, it is important to assess the ability of state-of-the-art climate models to reproduce an accurate meridional distribution of TCs as well as a reasonable meridional portrait of the moisture transport associated with TCs. Since high resolutions are required to reconstruct observed TC activity, the present work is based on the simulations performed as part of HighResMIP in the framework of the community CMIP6 effort. To inspect this feature, two horizontal resolutions for each climate model are considered. Besides, the impact of boundary conditions, i.e. the observed ocean surface state, is examined by considering both coupled and atmosphere-only configurations. In the present work, the North Atlantic region is analyzed as a sample region, while the same approach is applied on a multi-basin basis. In the sample area, climate models show a good ability to reproduce the TC distribution, with a general underestimation at lower latitudes and a slight overestimation at high latitudes compared to observed TC tracks (e.g. IBTrACS). The meridional distribution of moisture transport associated with TCs is evaluated by considering the radial average of the integrated water vapor transport along the TC tracks. When compared to observations (IBTrACS and the JRA-55 reanalysis), the simulated moisture transport associated with TCs displays reasonably good performance in the atmosphere-only, high-resolution model configurations.
The interannual variability of water vapor associated with TCs, instead, is poorly represented in climate models. Climate models in high-resolution configuration can then be used to estimate the future TC meridional distribution and changes in the meridional moisture transport associated with TCs. This effort is part of HighResMIP and is developed in the framework of the EU-funded PRIMAVERA project.
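The quantity averaged radially along the TC tracks, the vertically integrated water vapor transport (IVT), is a standard diagnostic: IVT = (1/g) ∫ q·V dp over the atmospheric column. A minimal single-column sketch with made-up profile values (the profiles are illustrative numbers, not data from the study):

```python
import numpy as np

g = 9.81  # gravitational acceleration, m s^-2

def column_integral(f, p):
    """Trapezoidal integration of f over pressure p (p given top-down
    as decreasing-upward levels, in Pa)."""
    return np.sum(0.5 * (f[:-1] + f[1:]) * (p[:-1] - p[1:]))

# Pressure levels from 1000 hPa up to 300 hPa, converted to Pa.
p = np.array([1000, 925, 850, 700, 500, 300], dtype=float) * 100.0

# Illustrative specific humidity (kg/kg) and wind components (m/s) per level.
q = np.array([0.016, 0.013, 0.010, 0.006, 0.002, 0.0005])
u = np.array([5.0, 8.0, 10.0, 12.0, 15.0, 20.0])
v = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 8.0])

# IVT components and magnitude, in kg m^-1 s^-1.
ivt_u = column_integral(q * u, p) / g
ivt_v = column_integral(q * v, p) / g
ivt = np.hypot(ivt_u, ivt_v)
print(round(float(ivt), 1))
```

In the study this scalar would be computed on every grid column, then radially averaged around each TC center along the track; the column integral itself is the building block shown here.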
- Published
- 2020
9. Boosting climate change research with direct access to high performance computers
- Author
-
Ag Stephens, Martin Juckes, Maria Moreno de Castro, Sophie Morellon, Sandro Fiore, Sylvie Joussaume, Guillaume Levavasseur, Karsten Peters, Stephan Kindermann, and Paola Nassisi
- Subjects
Boosting (machine learning), Computer science, Climate change, Environmental economics - Abstract
Earth System observational and model data volumes are constantly increasing, and it can be challenging to discover, download, and analyze data if scientists do not have the required computing and storage resources at hand. This is especially the case for detection and attribution studies in the field of climate change research, since we need to perform multi-source and cross-disciplinary comparisons for datasets of high spatial and large temporal coverage. Researchers and end-users are therefore looking for access to cloud solutions and high performance computing facilities. The Earth System Grid Federation (ESGF, https://esgf.llnl.gov/) maintains a global system of federated data centers that allow access to the largest archive of model climate data worldwide. ESGF portals provide free access to the output of the data contributing to the next assessment report of the Intergovernmental Panel on Climate Change through the Coupled Model Intercomparison Project. In order to support users in directly accessing high performance computing facilities to perform analyses such as the detection and attribution of climate change and its impacts, the EU Commission funded a new service within the infrastructure of the European Network for Earth System Modelling (ENES, https://portal.enes.org/data/data-metadata-service/analysis-platforms). This new service is designed to reduce data transfer issues, speed up computational analysis, provide storage, and ensure access to and maintenance of the resources. Furthermore, the service is free of charge and requires only a lightweight application. We will present a demo of how flexibly climate indices can be calculated from different ESGF datasets covering a wide range of temporal and spatial scales, using cdo (Climate Data Operators, https://code.mpimet.mpg.de/projects/cdo/) and Jupyter notebooks running directly on the ENES partners' high performance computing centers: DKRZ (Germany), JASMIN (UK), CMCC (Italy), and IPSL (France).
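One of the simplest climate indices of the kind demonstrated in such sessions, the ETCCDI "summer days" count (days with daily maximum temperature above 25 °C), takes a few lines once the data slice is in memory. The daily series below is synthetic; on the service the same computation would run against ESGF data via cdo or Xarray:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic daily maximum temperature (degC) for one grid point, one year:
# a seasonal cycle around 12 degC plus day-to-day noise.
days = np.arange(365)
tmax = (12 + 14 * np.sin(2 * np.pi * (days - 80) / 365)
        + 3 * rng.standard_normal(365))

# ETCCDI-style "summer days" index: count of days with tmax > 25 degC.
summer_days = int(np.sum(tmax > 25.0))
print(summer_days)
```

Applied per year and per grid cell over an ESGF dataset, this one-line count becomes a map of the index for any location and period, which is exactly the workflow the demo walks through.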
- Published
- 2020
10. Python-based Multidimensional and Parallel Climate Model Data Analysis in ECAS
- Author
-
Regina Kwee, Tobias Weigel, Hannes Thiemann, Karsten Peters, Sandro Fiore, and Donatello Elia
- Abstract
This contribution highlights the Python xarray technique in the context of a climate-specific application (typical formats are NetCDF, GRIB, and HDF). We will see how to use in-file metadata and why they are so powerful for data analysis, in particular by looking at community-specific problems; e.g., one can select purely on coordinate variable names. ECAS, the ENES Climate Analytics Service available at the Deutsches Klimarechenzentrum (DKRZ), will help by enabling faster access to the high-volume simulation data output from climate modeling experiments. In this respect, we can also make use of dask, which was developed for parallel computing and works smoothly with xarray. This is extremely useful when we want to fully exploit the advantages of our supercomputer. Our fully integrated service offers an interface via Jupyter notebooks (ecaslab.dkrz.de). We provide an analysis environment without the need for costly transfers, accessing CF-standardized data files, all accessible via the ESGF portal on our nodes (esgf-data.dkrz.de). We can analyse the data of, e.g., CMIP5, CMIP6, the Grand Ensemble, and observation data. ECAS was developed in the frame of the European Open Science Cloud (EOSC) hub project.
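The point about in-file metadata can be shown in a few lines: because coordinates and variables are named in the file, selections are written against names and labels rather than positional indices, and the CF attributes travel with the result. The dataset below is a synthetic CF-style stand-in; the variable name and attribute values are illustrative:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic CF-style dataset: monthly precipitation flux on a few latitudes.
ds = xr.Dataset(
    {"pr": (("time", "lat"),
            np.random.default_rng(4).random((36, 5)),
            {"standard_name": "precipitation_flux", "units": "kg m-2 s-1"})},
    coords={"time": pd.date_range("2000-01-01", periods=36, freq="MS"),
            "lat": np.array([-60.0, -30.0, 0.0, 30.0, 60.0])},
)

# Selection purely by coordinate name and label -- no index arithmetic needed.
equator_2001 = ds["pr"].sel(time="2001", lat=0.0)

# The CF metadata documents what the numbers are.
print(equator_2001.attrs["standard_name"], equator_2001.sizes)
```

The same `.sel` call works unchanged on a dask-chunked dataset, which is how the xarray/dask combination described above scales the pattern to supercomputer-sized archives.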
- Published
- 2020
11. A Python-oriented environment for climate experiments at scale in the frame of the European Open Science Cloud
- Author
-
Donatello Elia, Fabrizio Antonio, Cosimo Palazzo, Paola Nassisi, Sofiane Bendoukha, Regina Kwee-Hinzmann, Sandro Fiore, Tobias Weigel, Hannes Thiemann, and Giovanni Aloisio
- Subjects
Climate action - Abstract
Scientific data analysis experiments and applications require software capable of handling domain-specific and data-intensive workflows. The increasing volume of scientific data is further exacerbating these data management and analytics challenges, pushing the community towards the definition of novel programming environments for dealing efficiently with complex experiments, while abstracting from the underlying computing infrastructure. ECASLab provides a user-friendly data analytics environment to support scientists in their daily research activities, in particular in the climate change domain, by integrating analysis tools with scientific datasets (e.g., from the ESGF data archive) and computing resources (i.e., Cloud and HPC-based). It combines the features of the ENES Climate Analytics Service (ECAS) and the JupyterHub service, with a wide set of scientific libraries from the Python landscape for data manipulation, analysis and visualization. ECASLab is being set up in the frame of the European Open Science Cloud (EOSC) platform - in the EU H2020 EOSC-Hub project - by CMCC (https://ecaslab.cmcc.it/) and DKRZ (https://ecaslab.dkrz.de/), which host two major instances of the environment. ECAS, which lies at the heart of ECASLab, enables scientists to perform data analysis experiments on large volumes of multi-dimensional data by providing a workflow-oriented, PID-supported, server-side and distributed computing approach. ECAS consists of multiple components, centered around the Ophidia High Performance Data Analytics framework, which has been integrated with data access and sharing services (e.g., EUDAT B2DROP/B2SHARE, Onedata), along with the EGI federated cloud infrastructure. The integration with JupyterHub provides a convenient interface for scientists to access the ECAS features for the development and execution of experiments, as well as for sharing results (and the experiment/workflow definition itself). 
ECAS parallel data analytics capabilities can be easily exploited in Jupyter Notebooks (by means of PyOphidia, the Ophidia Python bindings) together with well-known Python modules for processing and for plotting the results on charts and maps (e.g., Dask, Xarray, NumPy, Matplotlib, etc.). ECAS is also one of the compute services made available to climate scientists by the EU H2020 IS-ENES3 project. Hence, this integrated environment represents a complete software stack for the design and run of interactive experiments as well as complex and data-intensive workflows. One class of such large-scale workflows, efficiently implemented through the environment resources, refers to multi-model data analysis in the context of both CMIP5 and CMIP6 (i.e., precipitation trend analysis orchestrated in parallel over multiple CMIP-based datasets).
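The multi-model precipitation trend analysis mentioned above reduces, per model, to a linear fit over an annual-mean series; a NumPy sketch over a few synthetic "models" shows the embarrassingly parallel structure that ECAS/Ophidia workflows fan out server-side (the model names, values, and imposed trends are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
years = np.arange(1980, 2015)

# Synthetic annual-mean precipitation (mm/day) for three mock CMIP models,
# each with a small imposed trend plus interannual noise.
models = {
    "MODEL-A": 3.0 + 0.004 * (years - 1980) + 0.05 * rng.standard_normal(years.size),
    "MODEL-B": 2.8 - 0.002 * (years - 1980) + 0.05 * rng.standard_normal(years.size),
    "MODEL-C": 3.1 + 0.001 * (years - 1980) + 0.05 * rng.standard_normal(years.size),
}

# Independent per-model linear fits: the part a workflow engine runs in parallel.
trends = {name: np.polyfit(years, series, 1)[0] for name, series in models.items()}
for name, slope in trends.items():
    print(f"{name}: {slope * 10:+.3f} mm/day per decade")
```

In the real workflow each "series" is itself the result of a server-side spatial aggregation over a CMIP dataset; the per-model independence is what makes the orchestration across many datasets straightforward.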
- Published
- 2020
12. On the road to exascale: Advances in High Performance Computing and Simulations—An overview and editorial
- Author
-
Waleed W. Smari, Sandro Fiore, and Mohamed Bakhouya
- Subjects
Computer Networks and Communications, Computer science, Distributed computing, Systems modeling, Supercomputer, Exascale computing, Software, Hardware and Architecture, Scalability, Path (graph theory) - Abstract
In recent decades, the complexity of scientific and engineering problems has increased considerably. New applications and domains that use high performance computing systems have been introduced. These trends are projected to continue for the foreseeable future (Reed and Dongarra, 2015) [1]. In many areas of engineering and science, High-Performance Computing (HPC) and Simulations have become determinants of industrial competitiveness and advanced research. In fact, advances in HPC architectures, storage, networking, and software capabilities are leading to a new era in HPC and simulations, along with new challenges both in computing and in systems modeling (Geist and Lucas, 2009) [2]. These developments are especially critical considering that HPC systems continue to scale up in terms of nodes, cores, and accelerators, as well as software, infrastructure, and tools, which in turn are expediting the move along the path toward Exascale (Reed and Dongarra, 2015; Geist and Lucas, 2009; Dongarra and Beckman, 2011; Dosanjh et al., 2014; Engelmann, 2014) [1-5]. Scalability and availability represent two of the main requirements that need to be considered before conceiving of these large-scale systems (ASCAC Subcommittee on Exascale Computing, 2010). Scalability allows the system to grow proportionately when service demand increases, whereas availability means the system continues to provide its services despite hardware and software failures (Theodoropoulos et al., 2014; Tang et al., 2014) [7,8]. The goal in large-scale HPC is to accommodate both availability and scalability while staying under strict constraints on performance (e.g., processing time) and cost metrics (e.g., power consumption). This special issue is envisioned to provide examples of research work on topics related to recent advances in High Performance Computing and Simulations.
It briefly addresses and explores challenges toward Exascale computing, current state-of-the-art in HPC and simulation, and the path forward in the domains of large-scale HPC systems.
- Published
- 2018
13. Enabling Server-Based Computing and FAIR Data Sharing with the ENES Climate Analytics Service
- Author
-
Sandro Fiore, D. Elia, Sofiane Bendoukha, and Tobias Weigel
- Subjects
Data sharing, Workflow, Data access, Computer science, Analytics, Data management, e-Science, Cloud computing, Data science, Virtual research environment - Abstract
The European Network for Earth System Modelling (ENES) Climate Analytics Service (ECAS) is a new service from the EOSC-hub project. It offers a Virtual Research Environment (VRE) to scientific users, combining a Python (Jupyter) work environment with support services for data access, computing, and data sharing. ECAS is motivated by the goals of providing users with remote access to computing and storage resources beyond what they may have access to locally, reducing the need for costly data transfers, and helping to realize the vision of FAIR data management. ECAS aims at providing a paradigm shift for the ENES community and beyond, with a strong focus on data-intensive analysis, provenance management, and server-side approaches, as opposed to current approaches that are mostly client-based, sequential, and with limited or missing end-to-end analytics workflow and provenance capabilities. Furthermore, the integrated data analytics service enables basic data provenance tracking by establishing a graph of persistent identifiers (PIDs) through the whole chain, thereby improving reusability, traceability, and reproducibility. ECAS targets multiple user groups, including researchers lacking local computing and storage resources, researchers interested in the high-volume climate data pools, and use within education and training scenarios.
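The PID-graph idea can be illustrated with a toy structure: each derived product records the identifiers of its inputs, so any result can be traced back through the whole chain. The identifier strings and graph layout here are invented for illustration, not the ECAS/Handle scheme:

```python
# Toy provenance graph: each node maps a PID to the PIDs it was derived from.
provenance = {
    "hdl:21.T0000/raw-cmip-tas": [],
    "hdl:21.T0000/regridded-tas": ["hdl:21.T0000/raw-cmip-tas"],
    "hdl:21.T0000/anomaly-map": ["hdl:21.T0000/regridded-tas"],
}

def lineage(pid, graph):
    """Walk the graph backwards and return every PID the result depends on."""
    seen, stack = [], [pid]
    while stack:
        current = stack.pop()
        for parent in graph.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen

print(lineage("hdl:21.T0000/anomaly-map", provenance))
# ['hdl:21.T0000/regridded-tas', 'hdl:21.T0000/raw-cmip-tas']
```

Because every intermediate product carries a persistent identifier, reproducing a result is a matter of re-running the chain returned by such a lineage query, which is the reusability and traceability benefit the abstract describes.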
- Published
- 2019
14. BIGSEA: A Big Data analytics platform for public transportation information
- Author
-
Dorgival Guedes, Sandro Fiore, Rosa M. Badia, Nazareno Andrade, Nádia P. Kozievitch, Walter Abrahão dos Santos, Tarciso Braz, Giovanni Aloisio, Paulo Silva, Marco Vieira, Danilo Ardagna, Fábio Morais, Nuno Antunes, Jussara M. Almeida, Daniele Lezzi, Demetrio Gomes Mestre, Andy S. Alic, Wagner Meira, Tânia Basso, Carlos Eduardo Santos Pires, Ignacio Blanquer, Matheus Maciel, Regina Moraes, Donatello Elia, Andrey Brito, and Marco Lattuada
- Subjects
Cloud computing, Computer Networks and Communications, Computer science, Performance, Deployment, Big data, Transport, Transportation, Workflows, Sustainability, Hardware and Architecture, Software deployment, Public transport, Software - Abstract
Analysis of public transportation data in large cities is a challenging problem. Managing data ingestion, data storage, data quality enhancement, modelling and analysis requires intensive computing and a non-trivial amount of resources. In EUBra-BIGSEA (Europe–Brazil Collaboration of Big Data Scientific Research Through Cloud-Centric Applications) we address such problems in a comprehensive and integrated way. EUBra-BIGSEA provides a platform for building up data analytics workflows on top of elastic cloud services without requiring skills related to either programming or cloud services. The approach combines cloud orchestration, Quality of Service and automatic parallelisation on a platform that includes a toolbox for implementing privacy guarantees and data quality enhancement as well as advanced services for sentiment analysis, traffic jam estimation and trip recommendation based on estimated crowdedness. The work shown in this article has been funded jointly by the European Commission under the Cooperation Programme, Horizon 2020 grant agreement No 690116 (EUBra-BIGSEA), and the Ministério da Ciência, Tecnologia e Inovação (MCTI) from Brazil.
- Published
- 2019
15. AMGCC 2018 Foreword
- Author
-
Hyeonsang Eom, Myungho Lee, Kento Aida, Taiga Nakamura, Yoonhee Kim, Ananta Tiwari, Ilkyeun Ra, Young Choon Lee, Kyungyong Lee, Robert Quick, Jose Luis Vazquez-Poletti, Steven Timm, Ewa Deelman, E. M. Heien, Beomseok Nam, Sangmi Lee Pallickara, Jaehwan Lee, Raffaele Montella, Sungyong Park, Youngjae Kim, Taro Tezuka, David Sarramia, Seung-Jong Park, Young-ri Choi, Jens Jensen, Justin M. Wozniak, Heon-Young Yeom, Ricardo Graciani Diaz, Sandro Fiore, Yoshio Tanaka, Jaewook Lee, Jik-Soo Kim, and Jae-Young Choi
- Subjects
Computer science, Distributed computing, Cloud computing, Grid - Published
- 2018
16. Towards an Open (Data) Science Analytics-Hub for Reproducible Multi-Model Climate Analysis at Scale
- Author
-
Dean N. Williams, Giovanni Aloisio, Donatello Elia, Sandro Fiore, Alessandro D'Anca, Ian Foster, Fabrizio Antonio, Cosimo Palazzo, Fiore, S., Elia, D., Palazzo, C., D'Anca, A., Antonio, F., Williams, D. N., Foster, I., and Aloisio, G.
- Subjects
Analytics-hub ,analytics-hub ,Open science ,010504 meteorology & atmospheric sciences ,Computer science ,Big data ,provenance ,Climate change ,02 engineering and technology ,01 natural sciences ,Open Science ,11. Sustainability ,0202 electrical engineering, electronic engineering, information engineering ,reproducibility ,0105 earth and related environmental sciences ,020203 distributed computing ,Coupled model intercomparison project ,business.industry ,Data science ,Reproducibility ,Knowledge sharing ,Open data ,13. Climate action ,Analytics ,Provenance ,Data analytics ,Scientific method ,data analytic ,Data analysis ,Earth System Grid ,business - Abstract
Open Science is key to future scientific research and promotes a deep transformation in the whole scientific research process, encouraging the adoption of transparent and collaborative scientific approaches aimed at knowledge sharing. Open Science is increasingly gaining attention in the current and future research agenda worldwide. To effectively address Open Science goals, besides Open Access to results and data, it is also paramount to provide tools or environments to support the whole research process, in particular the design, execution and sharing of transparent and reproducible experiments, including data provenance (or lineage) tracking. This work introduces the Climate Analytics-Hub, a new component on top of the Earth System Grid Federation (ESGF), which combines big data approaches and parallel computing paradigms to provide an Open Science environment for reproducible multi-model climate change data analytics experiments at scale. An operational implementation has been set up at the SuperComputing Centre of the Euro-Mediterranean Center on Climate Change, with the main goal of becoming a reference Open Science hub in the climate community regarding the multi-model analysis based on the Coupled Model Intercomparison Project (CMIP). This paper reports on ESiWACE WP3 activities described in deliverable D3.10, "ESiWACE Scheduler development and support activities".
- Published
- 2018
- Full Text
- View/download PDF
17. Recent developments in high-performance computing and simulation: distributed systems, architectures, algorithms, and applications
- Author
-
Waleed W. Smari, Sandro Fiore, and Carsten Trinitis
- Subjects
Computational Theory and Mathematics ,Computer Networks and Communications ,Computer science ,Distributed computing ,Supercomputer ,Software ,Computer Science Applications ,Theoretical Computer Science - Published
- 2015
18. High performance computing and simulation: architectures, systems, algorithms, technologies, services, and applications
- Author
-
David R.C. Hill, Sandro Fiore, and Waleed W. Smari
- Subjects
Computational Theory and Mathematics ,Computer architecture ,Computer Networks and Communications ,Computer science ,0202 electrical engineering, electronic engineering, information engineering ,020206 networking & telecommunications ,020201 artificial intelligence & image processing ,02 engineering and technology ,Supercomputer ,Software ,Computer Science Applications ,Theoretical Computer Science - Published
- 2013
19. EUBrazilCC Federated Cloud
- Author
-
Jose Luis Vivas, Abmar Barros, Francisco Brasileiro, Giovanni Farias da Silva, Daniele Lezzi, Jacek Cala, Cristina D. Ururahy, Erik Torres, Ignacio Blanquer, Maria Julia de Lima, Rosa M. Badia, Sandro Fiore, Marcos Nobrega, Antônio Tadeu A. Gomes, Francisco Germano de Araújo Neto, and Giovanni Aloisio
- Subjects
Computer science ,business.industry ,Cloud computing ,Computer security ,computer.software_genre ,business ,computer - Abstract
Many e-science initiatives are currently investigating the use of cloud computing to support all kinds of scientific activities. The objective of this chapter is to describe the architecture and the deployment of the EUBrazilCC federated e-infrastructure, developed within a Research & Development project that aims at providing a user-centric test bench enabling European and Brazilian research communities to test the deployment and execution of scientific applications on a federated intercontinental e-infrastructure. This e-infrastructure exploits existing resources that consist of virtualized data centers, supercomputers, and even opportunistically exploited desktops spread over a transatlantic geographic area. These heterogeneous resources are federated with the aid of appropriate middleware that provides the necessary features to achieve the established challenging goals. In order to elicit the requirements and validate the resulting infrastructure, three complex scientific applications have been implemented, which are also presented here.
- Published
- 2016
20. The OFIDIA Fire Danger Rating System
- Author
-
A. Raolil, Giovanni Aloisio, Marco Mancini, Michele Salis, Valentina Bacciu, Sandro Fiore, Costantino Sirca, Andrea Mariello, Alessandra Nuzzo, O. Marra, Maria Mirto, Donatella Spano, Mirto, Maria, Mariello, Andrea, Nuzzo, Alessandra, Mancini, Marco, Raolil, Alessandro, Marra, Osvaldo, Fiore, Sandro, Sirca, Costantino, Salis, Michele, Bacciu, Valentina, Spano, Donatella, and Aloisio, Giovanni
- Subjects
Meteorology ,business.industry ,Wireless sensors network ,Weather forecasting ,computer.software_genre ,Wind speed ,Primary station ,Data visualization ,Geography ,Data acquisition ,Data analytic ,Fire danger index ,Natural hazard ,Data analysis ,Fire behaviour ,business ,computer ,Wireless sensor network - Abstract
Prevention is one of the most important stages in wildfire and other natural hazard management. Fire Danger Rating Systems (FDRSs) have been adopted by many countries to enhance wildfire prevention and suppression planning. With the aim of providing real-time fire danger forecasts and finer-scale fire behaviour analysis, an operational fire danger prevention platform has been developed within the OFIDIA project (Operational FIre Danger preventIon plAtform). The OFIDIA Fire Danger Rating System platform consists of (1) a data archive for managing weather forecasting and wireless sensor data, (2) a data analytics platform for post-processing weather data and for computing fire danger indices, and (3) a web application system for the visualization of weather and fire index maps and related time series. The OFIDIA platform is also connected to a Wireless Sensor Network (WSN) that gathers data from several sites in the Apulia (Italy) and Epirus (Greece) regions. The WSN consists of a primary station and several wireless sensors deployed in wooded areas; the data acquisition process covers variables such as air temperature, relative humidity, wind speed and direction, precipitation, solar radiation, and fuel moisture.
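The fire danger indices mentioned above can be illustrated with one of the simplest published formulas, the Angström index, computed from air temperature and relative humidity. This is a generic illustration only; the abstract does not specify which indices OFIDIA actually computes, and the station readings below are invented.

```python
def angstrom_index(temp_c: float, rel_humidity_pct: float) -> float:
    """Angstrom fire-danger index: lower values mean higher danger
    (values below roughly 2.5 indicate conditions favourable to ignition)."""
    return rel_humidity_pct / 20.0 + (27.0 - temp_c) / 10.0

# Two hypothetical station readings: (label, air temperature degC, relative humidity %)
readings = [("humid morning", 18.0, 80.0), ("hot dry afternoon", 35.0, 20.0)]
for name, t, rh in readings:
    i = angstrom_index(t, rh)
    print(f"{name}: index = {i:.2f} -> {'DANGER' if i < 2.5 else 'low risk'}")
```

In an operational platform such an index would be evaluated per grid point of the weather forecast and per WSN station, then rendered as the fire index maps the abstract describes.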
- Published
- 2015
21. Data issues at the Euro-Mediterranean Centre for Climate Change
- Author
-
Alessandro Negro, Sandro Fiore, Salvatore Vadacca, and Giovanni Aloisio
- Subjects
Metadata ,World Wide Web ,Data collection ,Data grid ,Computer science ,Metadata management ,Dashboard (business) ,Earth and Planetary Sciences(all) ,General Earth and Planetary Sciences ,Petabyte ,Climate change ,Client-side ,Data science - Abstract
Climate change research is increasingly becoming a data-intensive and data-oriented scientific activity. Petabytes of climate data, organized in large collections of datasets, are continuously produced, delivered, accessed and processed by scientists and researchers at multiple sites at an international level. This work presents the Euro-Mediterranean Centre for Climate Change (CMCC) initiative, discussing data and metadata issues and dealing with both architectural and infrastructural aspects of the adopted grid-enabled solution. A complete overview of the grid services deployed at the Centre is presented, as well as the client-side support (the CMCC data portal and monitoring dashboard).
- Published
- 2009
22. Near real-time parallel processing and advanced data management of SAR images in grid environments
- Author
-
Massimo Cafaro, Italo Epicoco, Daniele Lezzi, Silvia Mocavero, Sandro Fiore, Giovanni Aloisio, Cafaro, Massimo, Epicoco, Italo, S., Fiore, D., Lezzi, S., Mocavero, and Aloisio, Giovanni
- Subjects
Synthetic aperture radar ,Speedup ,Data grid ,business.industry ,Computer science ,Data management ,Real-time computing ,Grid ,SAR processing, Parallel computing, Data grids ,Software ,Parallel processing (DSP implementation) ,business ,Software architecture ,Information Systems - Abstract
In this paper, we describe the process of parallelizing an existing, production-level, sequential Synthetic Aperture Radar (SAR) processor based on the Range-Doppler algorithmic approach. We show how, taking into account the constraints imposed by the software architecture and related software engineering costs, it is still possible with a moderate programming effort to parallelize the software, and present a message-passing interface (MPI) implementation whose speedup is about 8 on 9 processors, achieving near real-time processing of raw SAR data even on a moderately aged parallel platform. Moreover, we discuss a hybrid two-level parallelization approach that involves the use of both MPI and OpenMP. We also present GridStore, a novel data grid service to manage raw, focused and post-processed SAR data in a grid environment. Indeed, another aim of this work is to show how the processed data can be made available in a grid environment to a wide scientific community, through the adoption of a data grid service providing both metadata and data management functionalities. In this way, along with near real-time processing of SAR images, we provide a data grid-oriented system for data storage, publishing and management.
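The reported speedup of about 8 on 9 processors can be put in perspective with Amdahl's law. The sketch below (an illustration, not part of the paper's code) estimates the serial fraction of the processor that is consistent with that measurement:

```python
def amdahl_speedup(serial_fraction: float, p: int) -> float:
    """Speedup predicted by Amdahl's law on p processors."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

def serial_fraction_from_speedup(speedup: float, p: int) -> float:
    """Invert Amdahl's law: the serial fraction implied by an
    observed speedup on p processors."""
    return (1.0 / speedup - 1.0 / p) / (1.0 - 1.0 / p)

# Figures reported in the abstract: speedup ~8 on 9 processors
f = serial_fraction_from_speedup(8.0, 9)
print(f"estimated serial fraction: {f:.4f}")   # about 1.6% of the run stays sequential
print(f"parallel efficiency: {8.0 / 9:.2f}")
```

An implied serial fraction of under 2% is consistent with the paper's claim that a moderate programming effort sufficed: only a small sequential portion of the Range-Doppler pipeline was left unparallelized.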
- Published
- 2009
23. A Grid-Enabled Protein Secondary Structure Predictor
- Author
-
Sandro Fiore, Maria Mirto, Daniele Tartarini, Giovanni Aloisio, Massimo Cafaro, M., Mirto, Cafaro, Massimo, S., Fiore, Daniele, Tartarini, and Aloisio, Giovanni
- Subjects
Models, Molecular ,neural network ,Computer science ,Biomedical Engineering ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Machine learning ,computer.software_genre ,Protein Structure, Secondary ,Set (abstract data type) ,User-Computer Interface ,Artificial Intelligence ,Sequence Analysis, Protein ,Computer Simulation ,Electrical and Electronic Engineering ,Web services ,Internet ,Multiple sequence alignment ,Artificial neural network ,business.industry ,Proteins ,Protein structure prediction ,Grid ,Backpropagation ,Computer Science Applications ,protein structure prediction ,Models, Chemical ,Grid computing ,Multilayer perceptron ,Artificial intelligence ,business ,computer ,Algorithms ,Software ,Biotechnology - Abstract
We present an integrated Grid system for the prediction of protein secondary structures, based on the frequent automatic update of proteins in the training set. The predictor model is based on a feed-forward multilayer perceptron (MLP) neural network which is trained with the back-propagation algorithm; the design reuses existing legacy software and exploits novel grid components. The predictor takes into account the evolutionary information found in multiple sequence alignment (MSA); the information is obtained running an optimized parallel version of the PSI-BLAST tool, based on the MPI Master–Worker paradigm. The training set contains proteins of known structure. Using Grid technologies and efficient mechanisms for running the tools and extracting the data, the time needed to train the neural network is dramatically reduced, whereas the results are comparable to a set of well-known predictor tools.
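The training procedure described above, a feed-forward multilayer perceptron trained with back-propagation, can be sketched in a few lines. This toy stands in for the real predictor (which trains on PSI-BLAST profiles of proteins of known structure); here the network just fits XOR to show the forward/backward loop:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset standing in for encoded sequence profiles
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 units, as a minimal MLP
W1 = rng.normal(0.0, 1.0, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 1.0, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses, lr = [], 0.5
for _ in range(5000):
    hidden = sigmoid(X @ W1 + b1)            # forward pass
    out = sigmoid(hidden @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))
    d_out = (out - y) * out * (1.0 - out)    # backpropagated MSE gradient
    d_hid = (d_out @ W2.T) * hidden * (1.0 - hidden)
    W2 -= lr * hidden.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid;      b1 -= lr * d_hid.sum(axis=0)

print(f"MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The grid contribution of the paper is orthogonal to this loop: it parallelizes the PSI-BLAST profile generation and automates training-set refresh, which shrinks wall-clock training time without changing the learning algorithm.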
- Published
- 2007
24. The Grid Resource Broker portal
- Author
-
Italo Epicoco, Sandro Fiore, Giovanni Aloisio, Daniele Lezzi, Maria Mirto, Massimo Cafaro, Gabriele Carteni, Silvia Mocavero, Aloisio, Giovanni, Cafaro, Massimo, G., Carteni, Epicoco, Italo, S., Fiore, D., Lezzi, M., Mirto, and S., Mocavero
- Subjects
Grid Portal ,Data grid ,Database ,Grid Computing ,Computer Networks and Communications ,Computer science ,Storage Resource Broker ,PROCESSORS ,INDEPENDENT TASKS ,computer.software_genre ,Computer Science Applications ,Theoretical Computer Science ,World Wide Web ,Semantic grid ,Computational Theory and Mathematics ,Grid computing ,Grid resources ,computer ,Software - Abstract
This paper describes the Grid Resource Broker (GRB), a Grid portal built leveraging a set of high-level, Globus-Toolkit-based Grid libraries called GRB libraries. The portal leverages the Liferay framework to provide users with an intuitive, highly customizable Web GUI. The underlying GRB middleware allows trusted users seamless access to their computational Grid environments.
- Published
- 2007
25. SeaConditions: Present and future sea conditions for safer navigation (www.sea-conditions.com)
- Author
-
Giuseppe Turrisi, Davide Rollo, Alessandro D'Anca, Sergio Creti, Gianandrea Mannarini, Paola Agostini, Tony Monacizzo, Sandro Fiore, Leopoldo Fazioli, Antonio Bonaduce, Giovanni Aloisio, Stefania Angela Ciliberti, Luca Tedesco, Andrea Cucco, Ivan Federico, Marina Tonani, Yogesh Kumkar, Cosimo Palazzo, Rita Lecci, Sara Martinelli, Roberto Sorgente, Marco Spagnulo, Mario Scalas, Massimiliano Drudi, Arturo Cavallo, Antonio Olita, Giovanni Coppini, Roberto Bonarelli, Nadia Pinardi, Palmalisa Marra, and Antonio Tumolo
- Subjects
User Friendly ,Service (systems architecture) ,Meteorology ,Situation awareness ,Computer science ,business.industry ,Weather forecasting ,computer.software_genre ,Data science ,Environmental data ,Bathymetry ,Mobile telephony ,business ,computer ,Dissemination - Abstract
Sea Situational Awareness (SSA) is strategically important for management purposes of Italian Seas and coastal areas. The lack of adequate dissemination of marine environmental data, and the consequent poor knowledge available for operations at sea, reduce the response capacity, leading to loss of lives and potential socio-economic damages. The SSA topic is being addressed by "TESSA", an industrial research project funded under the PON "Ricerca & Competitività 2007–2013" program of the Ministero Italiano dell'Istruzione, dell'Università e della Ricerca. TESSA is a joint effort of research groups in operational oceanography and scientific computing, and aims to strengthen and consolidate the operational oceanography service and to integrate it with advanced technological platforms in order to disseminate information for the SSA. The first product of TESSA is "SeaConditions", a public service providing ocean and weather forecasts for the Mediterranean Sea on the web and through mobile applications. Every day, forecasts are produced by operational services, such as the Mediterranean Monitoring and Forecasting Center (www.myocean.eu) for the ocean variables and ECMWF for the atmospheric variables. The service delivers detailed information with high spatial and temporal resolution. The main variables displayed on Google Maps are: bathymetry, weather and oceanographic forecasts, and satellite ocean colour data. Ocean forecasts are given at different resolutions, since nested limited-area models for Mediterranean sub-regions are also displayed. SeaConditions provides a user-friendly interface with the zoom and drag features of Google Maps, allowing users to display data at different levels of detail. SeaConditions' main strength is to provide a single point of access to meteo-marine forecasts, which are based on advanced oceanographic models, remote sensing products and bathymetry, and to deliver high-quality information. The SeaConditions products are available through web and mobile channels.
The web portal www.sea-conditions.com is compatible with all modern web-browsers on all operating systems. For the mobile users, APPs were also developed to consider the different kind of screens and gesture/interactions. The APPs are available on AppleStore and Google Play.
- Published
- 2015
26. Ophidia: A full software stack for scientific data analytics
- Author
-
Ian Foster, Cosimo Palazzo, Alessandro D'Anca, Sandro Fiore, Dean N. Williams, Giovanni Aloisio, and Donatello Elia
- Subjects
Database ,business.industry ,Computer science ,Big data ,computer.software_genre ,Data cube ,Software analytics ,Workflow ,Software ,Analytics ,Data analysis ,Web service ,business ,computer - Abstract
The Ophidia project aims to provide a big data analytics platform solution that addresses scientific use cases related to large volumes of multidimensional data. In this work, the Ophidia software infrastructure is discussed in detail, presenting the entire software stack from level-0 (the Ophidia data store) to level-3 (the Ophidia web service front end). In particular, this paper presents the big data cube primitives provided by the Ophidia framework, discussing in detail the most relevant and available data cube manipulation operators. These primitives represent the proper foundations to build more complex data cube operators, such as the apex operator presented in this paper. A massive data reduction experiment on a 1 TB climate dataset is also presented to demonstrate the apex workflow in the context of the proposed framework.
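The data cube primitives discussed above can be pictured as reductions over a multidimensional array. The miniature below is a hypothetical illustration only: the real Ophidia framework exposes its operators over its own array-based storage model, while here a plain NumPy array stands in for a (time, lat, lon) climate cube.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical monthly near-surface temperature cube, degC: (time, lat, lon)
cube = 15.0 + 10.0 * rng.random((12, 180, 360))

def reduce_cube(cube: np.ndarray, dim: int, op: str) -> np.ndarray:
    """Collapse one dimension of the cube with a reduction operator,
    mimicking a data-cube 'reduce' primitive."""
    ops = {"avg": np.mean, "max": np.max, "min": np.min, "sum": np.sum}
    return ops[op](cube, axis=dim)

annual_mean = reduce_cube(cube, dim=0, op="avg")   # time collapsed -> (lat, lon) map
print(annual_mean.shape)
```

Composing such primitives (reduce, subset, roll-up, and so on) is what allows higher-level operators like the paper's apex workflow to be expressed on top of them.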
- Published
- 2014
27. Topic 5: Parallel and Distributed Data Management
- Author
-
Sandro Fiore, Stergios V. Anastasiadis, André Brinkmann, Kostas Magoutis, María S. Pérez-Hernández, and Adrien Lebre
- Subjects
Distributed design patterns ,business.industry ,Distributed algorithm ,Computer science ,Scale (chemistry) ,Data management ,Big data ,Enhanced Data Rates for GSM Evolution ,business ,Data science - Abstract
Nowadays we are facing an exponential growth of new data that is overwhelming the capability of companies, institutions and society in general to manage and use it properly. Ever-increasing investments in Big Data, cutting-edge technologies and the latest advances in both application development and underlying storage systems can help deal with data of such magnitude. Parallel and distributed approaches in particular will enable new data management solutions that operate effectively at large scale.
- Published
- 2013
28. The Earth System Grid Federation: An open infrastructure for access to distributed geospatial data
- Author
-
Estanislao Gonzalez, Sebastian Denvil, Mark Morgan, Dean N. Williams, Chris A. Mattmann, Luca Cinquini, Zed Pobre, Neill Miller, Daniel J. Crichton, Sandro Fiore, Stephen Pascoe, Rachana Ananthakrishnan, Philip Kershaw, Gavin M. Bell, Bob Drach, Feiyi Wang, Galen M. Shipman, John Harney, and Roland Schweitzer
- Subjects
World Wide Web ,Geospatial analysis ,Grid computing ,Application programming interface ,Computer science ,Node (computer science) ,Interoperability ,Data system ,Earth System Grid ,OpenID ,computer.software_genre ,computer - Abstract
The Earth System Grid Federation (ESGF) is a multi-agency, international collaboration that aims at developing the software infrastructure needed to facilitate and empower the study of climate change on a global scale. The ESGF's architecture employs a system of geographically distributed peer nodes, which are independently administered yet united by the adoption of common federation protocols and application programming interfaces (APIs). The cornerstones of its interoperability are the peer-to-peer messaging that is continuously exchanged among all nodes in the federation; a shared architecture and API for search and discovery; and a security infrastructure based on industry standards (OpenID, SSL, GSI and SAML). The ESGF software is developed collaboratively across institutional boundaries and made available to the community as open source. It has now been adopted by multiple Earth science projects and allows access to petabytes of geophysical data, including the entire model output used for the next international assessment report on climate change (IPCC-AR5) and a suite of satellite observations (obs4MIPs) and reanalysis data sets (ANA4MIPs).
- Published
- 2012
29. The GRelC Project: From 2001 to 2011, 10 Years Working on Grid-DBMSs
- Author
-
Sandro Fiore, Alessandro Negro, and Giovanni Aloisio
- Subjects
Security framework ,Computer science ,Command-line interface ,business.industry ,Interoperability ,Grid ,Software engineering ,business ,Database research ,Domain (software engineering) - Abstract
This chapter provides a complete overview of the Grid Relational Catalog (GRelC) Project, a grid database research effort started in 2001 at the University of Salento. The project’s main features, its interoperability with gLite-based production grids, and a relevant showcase in the environmental domain are presented.
- Published
- 2011
30. Grid and Cloud Database Management
- Author
-
Sandro Fiore and Giovanni Aloisio
- Subjects
Cloud computing security ,business.industry ,Computer science ,Cloud computing ,Provisioning ,Virtualization ,computer.software_genre ,World Wide Web ,Utility computing ,Grid computing ,Scalability ,Cloud database ,business ,computer - Abstract
Since the 1990s Grid Computing has emerged as a paradigm for accessing and managing distributed, heterogeneous and geographically spread resources, promising that we will be able to access computer power as easily as we can access the electric power grid. Later on, Cloud Computing brought the promise of providing easy and inexpensive access to remote hardware and storage resources. Exploiting pay-per-use models and virtualization for resource provisioning, cloud computing has been rapidly accepted and used by researchers, scientists and industries. In this volume, contributions from internationally recognized experts describe the latest findings on challenging topics related to grid and cloud database management. By exploring current and future developments, they provide a thorough understanding of the principles and techniques involved in these fields. The presented topics are well balanced and complementary, and they range from well-known research projects and real case studies to standards and specifications, and non-functional aspects such as security, performance and scalability. Following an initial introduction by the editors, the contributions are organized into four sections: Open Standards and Specifications, Research Efforts in Grid Database Management, Cloud Data Management, and Scientific Case Studies. With this presentation, the book serves mostly researchers and graduate students, both as an introduction to and as a technical reference for grid and cloud database management. The detailed descriptions of research prototypes dealing with spatiotemporal or genomic data will also be useful for application engineers in these fields.
- Published
- 2011
31. Data virtualization in grid environments through the GRelC Data Access and Integration Service
- Author
-
Sandro Fiore, Alessandro Negro, and Giovanni Aloisio
- Subjects
Service (systems architecture) ,Database ,Data grid ,Computer science ,business.industry ,Data management ,Enterprise information integration ,computer.software_genre ,Grid ,Data science ,Data access ,Grid computing ,business ,computer ,Data virtualization - Abstract
Grids promote the publication, sharing and integration of scientific data, distributed across Virtual Organizations. The complexity of data management in a grid environment comes from the distribution, heterogeneity, dynamicity and number of data sources. Data virtualization is a fundamental issue: managing, in a unified and virtualized manner (from the structure, location, access service and performance points of view), data stored in multiple, geographically spread data sources. It represents a key point for distributed data management services and must be addressed to build high-quality production/enterprise-oriented services. In this work we describe the convergence process (among three main data grid services developed in the context of the GRelC Project) that has led to the unified GRelC Data Access and Integration Service.
- Published
- 2009
32. Advances in the GRelC Data Access Service
- Author
-
Sandro Fiore, Giovanni Aloisio, Salvatore Vadacca, R. Barbera, Emidio Giorgio, Massimo Cafaro, Alessandro Negro, S., Fiore, A., Negro, S., Vadacca, Cafaro, Massimo, Aloisio, Giovanni, R., Barbera, and E., Giorgio
- Subjects
Service (systems architecture) ,Data grid ,business.industry ,Computer science ,Data management ,Interoperability ,computer.software_genre ,Metadata ,World Wide Web ,Data access ,Grid computing ,Metadata management ,business ,computer - Abstract
In a growing number of scientific disciplines, large data collections are emerging as important community resources. Data and metadata management exploiting the data grid paradigm is becoming more and more important as the number of involved data sources is continuously increasing and decentralizing. Efficient grid data access services are perceived as mandatory components for data management. In the grid data management area the GRelC Project has been addressing efficiency, transparency, interoperability and security issues, providing grid-enabled solutions and proposing a set of data access and integration/federation services. In this paper we present the advances related to the GRelC Data Access Service, highlighting differences and innovations with respect to previous work. Basic foundations of the grid-enabled queries provided by the GRelC DAS, and experimental results related to an international bioinformatics testbed on the GILDA t-Infrastructure, are also reported and discussed.
- Published
- 2008
33. A Grid-Based Bioinformatics Wrapper for Biological Databases
- Author
-
Sandro Fiore, Marco Passante, Maria Mirto, Massimo Cafaro, Giovanni Aloisio, M., Mirto, S., Fiore, Cafaro, Massimo, M., Passante, and Aloisio, Giovanni
- Subjects
Biological data ,Information retrieval ,Database ,Data grid ,Computer science ,Flat file database ,Relational database ,computer.software_genre ,Bioinformatics ,Data warehouse ,Data independence ,Data redundancy ,computer ,Data integration - Abstract
With a growing trend towards grid-based data repositories and data analysis services, scientific data analysis often involves accessing multiple data sources and analyzing the data using a variety of analysis programs. A strictly related critical challenge is the fact that data sources often hold the same type of data in a number of different formats; moreover, the formats expected and generated by various data analysis services are often distinct. In bioinformatics the data are often stored in flat files, so accessing them to retrieve a subset of records determined by constraints is slower than with other approaches such as relational DBMSs. We have developed a data grid system, built on top of specific biological data sources in flat-file format, which carries out their ingestion into a relational DBMS for data integration, reducing the data redundancy present in the biological flat files. In this work, we describe the prototype for the ingestion of the SWISS-2DPAGE flat file into a relational DBMS.
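The flat-file-to-relational ingestion described above can be sketched with an in-memory SQLite database. The record layout below is invented for illustration and is not the actual SWISS-2DPAGE format handled by the system; it only shows why constraint-based retrieval becomes a plain SQL query instead of a file scan.

```python
import sqlite3

# Hypothetical two-letter-tag flat file, loosely inspired by common
# bioinformatics formats; "//" terminates a record.
FLAT_FILE = """\
ID   P12345
DE   Serum albumin
OS   Homo sapiens
//
ID   Q67890
DE   Hemoglobin subunit beta
OS   Homo sapiens
//
"""

def parse_records(text):
    """Yield one dict per flat-file record."""
    rec = {}
    for line in text.splitlines():
        if line.startswith("//"):          # record terminator
            yield rec
            rec = {}
        elif line[:2] in ("ID", "DE", "OS"):
            rec[line[:2]] = line[5:].strip()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE protein (ac TEXT PRIMARY KEY, descr TEXT, organism TEXT)")
db.executemany("INSERT INTO protein VALUES (?, ?, ?)",
               [(r["ID"], r["DE"], r["OS"]) for r in parse_records(FLAT_FILE)])

rows = db.execute(
    "SELECT ac FROM protein WHERE organism = 'Homo sapiens' ORDER BY ac").fetchall()
print(rows)
```

Once the records sit in relational tables, deduplication and cross-source joins (the data integration step the abstract mentions) come essentially for free from the DBMS.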
- Published
- 2008
34. The GRelC Portal: A Ubiquitous and Seamless Way to Manage Grid Databases
- Author
-
E. Verdesca, Sandro Fiore, A. Leone, Salvatore Vadacca, Alessandro Negro, and Giovanni Aloisio
- Subjects
Ubiquitous computing ,Database ,Data grid ,business.industry ,Computer science ,computer.software_genre ,Grid ,World Wide Web ,Metadata ,Data access ,Grid computing ,Web page ,Web application ,business ,computer - Abstract
Grid portals are web gateways aiming at providing a pervasive and ubiquitous access in grid to computational resources, tools, instruments, datasets and metadata via standard Web protocols. Moreover, they provide enhanced problem solving capabilities to deal with modern, large scale scientific and engineering problems. Data grid management systems are becoming increasingly important in the context of the recently adopted service oriented paradigm. The grid relational catalog (GRelC) project is working towards ubiquitous, integrated, seamless and comprehensive grid database management solutions. This paper describes the GRelC Portal, a web based grid-enabled solution for grid-database access, management and integration built on top of the GRelC Data Access Service.
- Published
- 2008
35. iGRelC: A Dashboard Implementation for Grid Environments
- Author
-
Alessandro Negro, Giovanni Aloisio, Sandro Fiore, Salvatore Vadacca, Fiore, Sandro Luigi, Negro, Alessandro, S., Vadacca, and Aloisio, Giovanni
- Subjects
Database ,Distributed database ,Computer science ,business.industry ,Process (engineering) ,Data management ,Dashboard (business) ,computer.software_genre ,Grid ,Grid computing ,Accounting information system ,TeraGrid ,business ,computer - Abstract
Nowadays production grids such as EGEE, TeraGrid and DEISA adopt several tools in order to monitor jobs, check the status of the grid, manage accounting information, etc. However, from the end-user perspective, monitoring the global status of the grid, taking into account machines, networks, services, databases, jobs, etc., is neither straightforward nor uniform. In this paper we present the iGRelC dashboard, an integrated approach able to retrieve, process and display information coming from different data sources (both relational and non-relational) published in the grid by heterogeneous systems and services.
- Published
- 2008
36. Design and Implementation of a Grid Computing Environment for Remote Sensing
- Author
-
Giovanni Aloisio, Italo Epicoco, Massimo Cafaro, Sandro Fiore, Gianvito Quarta, A. PLAZA AND C. CHANG EDS, Cafaro, Massimo, Epicoco, Italo, Quarta, G., Fiore, Sandro Luigi, and Aloisio, Giovanni
- Subjects
Remote Sensing ,Grid computing ,Grid Computing ,Computer science ,Remote sensing (archaeology) ,Real-time computing ,computer.software_genre ,computer - Abstract
This chapter presents an overview of a Grid Computing Environment designed for remote sensing. Combining recent grid computing technologies, concepts related to problem solving environments, and high performance computing, we show how a dynamic Earth Observation system can be designed and implemented, with the goal of managing huge quantities of data coming from space missions and providing their on-demand processing and delivery to final users.
- Published
- 2007
37. High Throughput Protein Similarity Searches in the LIBI Grid Problem Solving Environment
- Author
-
Rita Casadio, Giovanni Aloisio, Ivan Rossi, Maria Mirto, Sandro Fiore, Piero Fariselli, Italo Epicoco, Mirto M., Rossi I., Epicoco I., Fiore S., Fariselli P., Casadio R., Aloisio G., P. Thulasiraman, X. He, T. Li Xu, M. K. Denko, R. K. Thulasiram, L. T. Yang, Mirto, M, Rossi, I, Epicoco, Italo, Fiore, S, Fariselli, P, Casadio, R, and Aloisio, Giovanni
- Subjects
Bioinformatics, Protein Similarity Searches ,Bioinformatics requirements, Complex applications, High computing power, Problem Solving Environment (PSE) ,Computer science ,Scale (chemistry) ,Distributed computing ,Integration platform ,Problem solving environment ,Biological database ,Grid ,Supercomputer ,Throughput (business) - Abstract
Bioinformatics applications are naturally distributed, due to the distribution of the involved data sets, experimental data and biological databases. They require high computing power, owing to the large size of data sets and the complexity of basic computations; may access heterogeneous data, where heterogeneity lies in data format, access policy, distribution, etc.; and require a secure infrastructure, because they may access private data owned by different organizations. The Problem Solving Environment (PSE) is an approach and a technology that can fulfil such bioinformatics requirements. The PSE can be used for the definition and composition of complex applications, hiding programming and configuration details from the user, who can concentrate only on the specific problem. Moreover, Grids can be used for building geographically distributed collaborative problem solving environments, and Grid-aware PSEs can search and use dispersed high performance computing, networking, and data resources. In this work, the PSE solution has been chosen as the integration platform for bioinformatics tools and data sources. In particular, a large-scale multiple sequence alignment experiment, supported by the LIBI PSE, is presented.
- Published
- 2007
38. GRelC Data Storage: A Lightweight Disk Storage Management Solution for Bioinformatics 'in silico' Experiments
- Author
-
Maria Mirto, Sandro Fiore, Massimo Cafaro, Giovanni Aloisio, Fiore, Sandro, Mirto, Maria, Cafaro, Massimo, Aloisio, Giovanni, S., Fiore, and M., Mirto
- Subjects
Bioinformatic, Middleware, Database, Computer science, business.industry, Security of data, Societies and institution, Lightweight disk storage, Information repository, computer.software_genre, Grid, Bioinformatics, Standard, Virtual reality, Shared resource, Data storage equipment, Bioinformatics data, Converged storage, Computer data storage, Publication, Grid energy storage, Disk storage, business, computer - Abstract
Data grids are middleware systems that offer secure, shared storage of massive scientific datasets over wide area networks. In this paper we describe the GRelC Data Storage, a novel grid storage service developed within the Grid Relational Catalog (GRelC) Project. The aim of this service is to manage collections of bioinformatics data from "in silico" experiments on the grid efficiently, securely and transparently, promoting flexible, secure and coordinated storage resource sharing and publication across virtual organizations while taking into account current grid standards and specifications.
- Published
- 2007
39. A services oriented system for bioinformatics applications on the grid
- Author
-
Giovanni, Aloisio, Massimo, Cafaro, Italo, Epicoco, Sandro, Fiore, and Maria, Mirto
- Subjects
Access to Information, Proteomics, Internet, Italy, Computational Biology, Medical Informatics, Problem Solving - Abstract
This paper describes the evolution of the main services of the ProGenGrid (Proteomics and Genomics Grid) system, a distributed and ubiquitous grid environment ("virtual laboratory") based on workflows and supporting the design, execution and monitoring of "in silico" experiments in bioinformatics. ProGenGrid is a Grid-based Problem Solving Environment that allows the composition of data sources and bioinformatics programs wrapped as Web Services (WS). The use of WS provides ease of use and fosters re-use. The resulting workflow of WS is then scheduled on the Grid, leveraging Grid middleware services. In particular, ProGenGrid offers a modular set of services and is currently focused on two important bioinformatics problems: prediction of the secondary structure of proteins, and sequence alignment of proteins. Both services are based on an enhanced data access service.
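The workflow idea sketched in the abstract (services composed into a dependency graph and executed in order) can be illustrated in a few lines of Python; the two step functions below are hypothetical stand-ins for wrapped Web Services, not part of ProGenGrid.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def fetch_sequence(ctx):
    """Stand-in for a data-access service returning a protein sequence."""
    ctx["seq"] = "MKTAYIAK"

def predict_secondary(ctx):
    """Stand-in for a secondary-structure prediction service (all-helix toy)."""
    ctx["ss"] = "H" * len(ctx["seq"])

def run_workflow(services, deps):
    """Execute each service after all of its predecessors, sharing a context."""
    ctx = {}
    for name in TopologicalSorter(deps).static_order():
        services[name](ctx)
    return ctx
```

A real engine would additionally schedule each step on a grid node and monitor its status, as the paper describes.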
- Published
- 2007
40. A Split & Merge Data Management Architecture for a Grid Environment
- Author
-
Massimo Cafaro, Sandro Fiore, Giovanni Aloisio, and Maria Mirto
- Subjects
Data grid, Database, Computer science, business.industry, Distributed computing, Data management, Interoperability, computer.software_genre, Grid, Supercomputer, Grid computing, Web service, business, Space-based architecture, computer - Abstract
Several applications currently produce huge amounts of data and make them available for post-processing in order to infer new knowledge. The main issues for these applications are the need for efficient mechanisms to access the data and for high performance computing to obtain results in an acceptable time. Wrapping the applications as Web services allows interoperability with other tools and, in particular, with grid computing environments, exploiting a large set of resources through a standard interface to support the requirements of so-called "data intensive" applications that handle large amounts of data. This paper presents the architecture of a complex data management system leveraging the grid computing paradigm and exploiting existing middleware developed at the University of Lecce within the ProGenGrid, GRelC and GRB projects to support high throughput applications. The architecture has been specialized for the bioinformatics domain, and a case study based on a biological application is also described.
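As a rough, hypothetical sketch of the split & merge pattern named in the title (the paper's actual architecture is service-based and grid-aware), the two phases can be reduced to:

```python
def split(records, chunk_size):
    """Split phase: cut the input into fixed-size chunks, one per node."""
    return [records[i:i + chunk_size]
            for i in range(0, len(records), chunk_size)]

def merge(partials):
    """Merge phase: concatenate per-chunk results, dropping duplicates
    while preserving first-seen order."""
    seen, out = set(), []
    for part in partials:
        for r in part:
            if r not in seen:
                seen.add(r)
                out.append(r)
    return out
```

In the grid setting each chunk would be shipped to a different node for processing before the merge step collects the partial results.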
- Published
- 2006
41. A web service-based Grid portal for Edgebreaker compression
- Author
-
Giovanni Aloisio, Maria Cristina Barba, Sandro Fiore, Massimo Cafaro, Euro Blasi, Maria Mirto, Aloisio, Giovanni, Barba, Mc, Blasi, E, Cafaro, Massimo, Fiore, S, and Mirto, M.
- Subjects
Databases, Factual; Medical Records Systems, Computerized; Teleradiology; Computer science; Interoperability; Information Storage and Retrieval; Health Informatics; computer.software_genre; Edgebreaker algorithm; Imaging, Three-Dimensional; Health Information Management; Humans; Program Development; Protocol (object-oriented programming); Advanced and Specialized Nursing; Distributed Computing Environment; Internet; Database; business.industry; Grid portal; Object (computer science); Grid; Computational Grid; Visualization; Systems Integration; Radiology Information Systems; Computer architecture; Globus toolkit; Italy; Database Management Systems; The Internet; Web service; business; computer; Web Service; Algorithms - Abstract
Background: In health applications, and elsewhere, 3D data sets are increasingly accessed through the Internet. To reduce the transfer time while keeping the 3D model unaltered, adequate compression and decompression techniques are needed. Recently, Grid technologies have been integrated with Web Services technologies to provide a framework for interoperable application-to-application interaction. Objectives: The paper describes an implementation of the Edgebreaker compression technique exploiting Web services technology and presents a novel approach for using such services in a Grid portal. The Grid portal, developed at the CACT/ISUFI of the University of Lecce, allows the processing and delivery of biomedical images (CT, computerized tomography, and MRI, magnetic resonance images) in a distributed environment, using the power and security of computational Grids. Methods: The Edgebreaker Compression Web Service has been deployed on a Grid portal and allows compressing and decompressing 3D data sets using the Globus Toolkit GSI (Grid Security Infrastructure) protocol. Moreover, the classical algorithm has been modified, extending the compression to files containing more than one object. Results and Conclusions: An implementation of the Edgebreaker compression technique and related experimental results are presented. A novel approach for using the compression Web service in a Grid portal, allowing the storage and preprocessing of huge 3D data sets and the subsequent efficient transmission of results for remote visualization, is also described.
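Edgebreaker encodes the connectivity of a triangle mesh as a string over the alphabet C, L, E, R, S, one symbol per triangle. The full mesh traversal is beyond the scope of this summary, but the prefix-free code commonly paired with the scheme illustrates why it is compact: C, which labels roughly half of the triangles, gets the single-bit codeword, keeping the average cost near 2 bits per triangle. The codeword table below is the textbook assignment, not necessarily the one used in this portal.

```python
# Prefix-free codewords for the five CLERS symbols.
CODE = {"C": "0", "L": "110", "E": "111", "R": "101", "S": "100"}
DECODE = {v: k for k, v in CODE.items()}

def encode_clers(clers):
    """Map a CLERS string to its bit representation."""
    return "".join(CODE[s] for s in clers)

def decode_clers(bits):
    """Invert encode_clers by reading prefix-free codewords."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in DECODE:
            out.append(DECODE[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)
```

The code is prefix-free ("0" vs. three-bit words all starting with "1"), so decoding needs no separators.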
- Published
- 2005
42. ProGenGrid: a grid-enabled platform for bioinformatics
- Author
-
Giovanni, Aloisio, Massimo, Cafaro, Sandro, Fiore, and Maria, Mirto
- Subjects
Proteomics, Internet, Italy, Computer Systems, Drug Design, Computational Biology, Humans, Genomics, Information Systems - Abstract
In this paper we describe the ProGenGrid (Proteomics and Genomics Grid) system, developed at the CACT/ISUFI of the University of Lecce, which aims at providing a virtual laboratory where e-scientists can simulate biological experiments: composing existing analysis and visualization tools, monitoring their execution, storing the intermediate and final output and, finally, if needed, saving the model of the experiment for later updating or reproduction. The tools we consider are software components wrapped as Web Services and composed through a workflow. Since bioinformatics applications need high performance machines, or a large number of workstations, to reduce computational time, we exploit a Grid infrastructure to interconnect widespread tools and hardware resources. As an example, we consider some algorithms and tools needed for drug design, providing them as services through easy-to-use Web and Web service interfaces built with the open source gSOAP toolkit; as Grid middleware we use the Globus Toolkit 3.2, exploiting protocols such as GSI and GridFTP.
- Published
- 2005
43. A grid-based architecture for earth observation data access
- Author
-
Sandro Fiore, Massimo Cafaro, Gianvito Quarta, Giovanni Aloisio, Aloisio, Giovanni, Cafaro, Massimo, S., Fiore, and G., Quarta
- Subjects
Service (systems architecture), Earth observation, Data access, Geospatial analysis, Grid computing, Database, Computer science, Architecture, computer.software_genre, Grid, computer - Abstract
A huge quantity of Earth Observation (EO) and geospatial data is produced daily by several organizations. These heterogeneous data are very useful in many scientific, civil, military and industrial applications. Securely and transparently storing, managing and accessing this huge quantity of data, spread over distributed systems, is a challenging problem. Grid computing today offers a way to achieve secure access to geographically dispersed storage and computational resources. In this paper we present the Distributed Earth Observation System Information Service (DEOSIS), a distributed information service developed by CACT/ISUFI at the University of Lecce, which aims at managing and accessing heterogeneous EO and geospatial data sources in a grid environment.
- Published
- 2005
44. iGrid, a Novel Grid Information Service
- Author
-
Giovanni Aloisio, Massimo Cafaro, Silvia Mocavero, Sandro Fiore, Italo Epicoco, Maria Mirto, Daniele Lezzi, Sloot P.M.A.,Hoekstra A.G.,Priol T.,Reinefeld A.,Bubak M., Aloisio, Giovanni, Cafaro, Massimo, Epicoco, Italo, S., Fiore, D., Lezzi, M., Mirto, and S., Mocavero
- Subjects
Service (systems architecture), Database, Grid Computing, Computer science, Dynamic data, Distributed computing, Mutual authentication, computer.software_genre, Grid, Information Service, Scalability, Relational model, Web service, computer - Abstract
In this paper we describe iGrid, a novel Grid Information Service based on the relational model. iGrid is developed within the European GridLab project by the ISUFI Center for Advanced Computational Technologies (CACT) of the University of Lecce, Italy. Among the iGrid requirements are security, decentralized control, support for dynamic data, the ability to handle user- and/or application-supplied information, performance, and scalability. The iGrid Information Service has been specified and carefully designed to meet these requirements.
- Published
- 2005
45. Resource and Service Discovery in the iGrid Information Service
- Author
-
Silvia Mocavero, Giovanni Aloisio, Italo Epicoco, Daniele Lezzi, Massimo Cafaro, Maria Mirto, Sandro Fiore, Gervasi, O, Gavrilova, ML, Kumar, V, Lagana, A, Lee, HP, Mun, Y, Taniar, D, Tan, CJK, Aloisio, Giovanni, Cafaro, Massimo, Epicoco, Italo, S., Fiore, D., Lezzi, M., Mirto, and S., Mocavero
- Subjects
World Wide Web, Service (systems architecture), Computer science, Testbed, Scalability, Service discovery, Information system, Resource management, Web service, computer.software_genre, Grid, computer, Data modeling - Abstract
In this paper we describe the resource and service discovery mechanisms available in iGrid, a novel Grid Information Service based on the relational model. iGrid is developed within the GridLab project by the ISUFI Center for Advanced Computational Technologies (CACT) at the University of Lecce, Italy, and is deployed on the European GridLab testbed. The GridLab Information Service provides fast and secure access to both static and dynamic information through a GSI-enabled web service. Besides publishing system information, iGrid also allows publication of user- or service-supplied information. The adoption of the relational model provides a flexible model for data, while the hierarchical distributed architecture provides scalability and fault tolerance.
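The relational approach to discovery can be illustrated with an in-memory SQLite table. The schema, example rows and `discover` helper below are hypothetical and far simpler than the real iGrid schema, but the query shape (plain SQL over resource tuples) is the point.

```python
import sqlite3

# Miniature stand-in for an information-service resource table.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE resources (
    name TEXT, kind TEXT, host TEXT, free_cpus INTEGER)""")
conn.executemany(
    "INSERT INTO resources VALUES (?, ?, ?, ?)",
    [("blast-svc", "service", "node1.example.org", 0),
     ("cluster-a", "compute", "node2.example.org", 16),
     ("cluster-b", "compute", "node3.example.org", 4)])

def discover(kind, min_cpus=0):
    """Discovery query: resources of a kind with enough free CPUs,
    best endowed first."""
    cur = conn.execute(
        "SELECT name, host FROM resources "
        "WHERE kind = ? AND free_cpus >= ? ORDER BY free_cpus DESC",
        (kind, min_cpus))
    return cur.fetchall()
```

A relational backend makes such filtered, ordered discovery queries trivial to express, which is one motivation the paper gives for the relational model.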
- Published
- 2005
46. Web services for a biomedical imaging portal
- Author
-
Giovanni Aloisio, Daniele Lezzi, Massimo Cafaro, Euro Blasi, Sandro Fiore, Maria Mirto, Aloisio, Giovanni, Cafaro, Massimo, E., Blasi, M., Mirto, S., Fiore, and D., Lezzi
- Subjects
Web standards, medicine.medical_specialty, Web development, business.industry, Computer science, WS-I Basic Profile, computer.software_genre, Web application security, World Wide Web, medicine, Web mapping, Web service, business, computer, Web modeling, Data Web - Abstract
TACWeb (TAC images on the Web) is a Web-based Grid portal, developed at the CACT/ISUFI laboratory of the University of Lecce, for the management of biomedical images in a distributed environment. TACWeb, built on top of the Globus Toolkit, is an interactive environment that deals with complex user requests regarding the acquisition of biomedical data and the processing and delivery of biomedical images, using the power and security of computational grids. Recently, Grid technologies have been integrated with Web services technologies to provide a framework for interoperable application-to-application interaction. In this paper we present an evolution of the TACWeb architecture that is compliant with the Web services approach, together with its main functionalities. In such a system, the basic capabilities are encapsulated and exposed as Web services, allowing the development of new health applications as a composition of such services.
- Published
- 2004
47. The GRelC library: a basic pillar in the grid relational catalog architecture
- Author
-
Massimo Cafaro, Giovanni Aloisio, Sandro Fiore, Maria Mirto, Aloisio, Giovanni, Cafaro, Massimo, S., Fiore, and M., Mirto
- Subjects
SQL, Database, Data grid, Relational database, computer.internet_protocol, Computer science, computer.software_genre, Grid, Technology management, World Wide Web, Grid computing, Middleware (distributed applications), computer, XML, computer.programming_language - Abstract
Today many data grid applications need to manage and process very large amounts of data distributed across multiple grid nodes and stored in relational databases. The Grid Relational Catalog Project (GRelC), developed at the CACT/ISUFI of the University of Lecce, represents an attempt to design and deploy a grid-DBMS for the Globus community. In this paper, after defining the grid-DBMS concept, we describe the GRelC library, which is layered on top of the Globus Toolkit. Users can build client applications on top of it that easily access and interact with data resources.
- Published
- 2004
48. A grid environment for diesel engine chamber optimization
- Author
-
Sandro Fiore, Silvia Mocavero, Euro Blasi, Italo Epicoco, Massimo Cafaro, Giovanni Aloisio, Aloisio, Giovanni, E., Blasi, Cafaro, Massimo, Epicoco, Italo, S., Fiore, and S., Mocavero
- Subjects
Distributed Computing Environment, Computer engineering, Grid computing, Computer science, Real-time computing, Process (computing), Fuel efficiency, Performance improvement, Grid, Diesel engine, computer.software_genre, Global optimization, computer - Abstract
The goal of this paper is to show that computer modelling techniques can be used to solve the real-life problem of improving Diesel engine performance. The purpose of our work is to achieve the lowest emission levels and improved fuel efficiency with respect to the European emission norms. In particular, we are interested in reducing NO + HC and soot emissions and in maximizing the PMI, a pressure proportional to engine power. These parameters also depend on the combustion chamber geometry, so we propose to optimize it using the micro Genetic Algorithm (micro-GA) technique. The idea consists in the automated random generation of many meshes, each representing a different chamber geometry subject to some common geometric constraints, and then in the use of the micro-GA to refine, at each iteration, the results obtained in the previous steps. The innovative feature of our work is the multiobjective nature of the optimization process; this is the main reason for choosing micro-GAs rather than simple Genetic Algorithms. Emission levels and fuel efficiency are evaluated using a modified version of the KIVA3 code that outputs three values, each related to one of the three fitness functions to be maximized. The optimization process involves the execution of many KIVA3 simulations to calculate the fitness values of the chamber geometries considered during the optimization steps. We propose the use of Grid Computing technologies to increase the performance of the KIVA-micro-GA, showing how a distributed environment reduces the computational time needed by the optimization process by taking advantage of the intrinsic parallelism of micro-GAs: their structure allows KIVA3 simulations to be executed simultaneously over the random meshes and over the geometries that populate the micro-population at each iteration. The services offered by the system are the definition of the micro-GA parameters, the submission of the optimization process and the monitoring of the process status. A trusted user can access the implemented services through a grid portal called DESGrid (Grid for Diesel Engine Simulation). The analysis of the results, obtained by executing the KIVA-micro-GA on three ES40 Compaq nodes, each equipped with four processors, shows a good reduction in both emissions and fuel consumption. In the paper we present the numerical values and the related geometry representations obtained after the first steps of the global optimization process.
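The micro-GA loop the abstract describes (a tiny population with elitism, restarted from the elite whenever it converges, with fitness evaluations farmed out in parallel) can be sketched as follows. This is a hedged illustration: `fitness` is a one-bit-counting toy standing in for a full KIVA3 combustion run, and the single objective stands in for the paper's three fitness functions.

```python
import random

POP, GENES = 5, 16  # micro-GA signature: very small population

def fitness(genome):
    """Stand-in for a KIVA3 evaluation (toy: count the 1-bits)."""
    return sum(genome)

def random_genome():
    return [random.randint(0, 1) for _ in range(GENES)]

def tournament(pop, scores):
    """Binary tournament selection."""
    a, b = random.sample(range(len(pop)), 2)
    return pop[a] if scores[a] >= scores[b] else pop[b]

def crossover(p1, p2):
    """Uniform crossover; micro-GAs typically use no mutation."""
    return [random.choice(pair) for pair in zip(p1, p2)]

def converged(pop):
    return all(g == pop[0] for g in pop[1:])

def micro_ga(generations=100):
    pop = [random_genome() for _ in range(POP)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scores = [fitness(g) for g in pop]  # evaluated in parallel on the grid
        elite = max(pop, key=fitness)
        best = max(best, elite, key=fitness)
        pop = [elite] + [crossover(tournament(pop, scores),
                                   tournament(pop, scores))
                         for _ in range(POP - 1)]
        if converged(pop):  # restart: keep the elite, re-seed the rest
            pop = [elite] + [random_genome() for _ in range(POP - 1)]
    return best
```

Because each generation's evaluations are independent, they can be run simultaneously on different nodes, which is exactly the parallelism the paper exploits.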
- Published
- 2004
49. Special Issue on Advances in High Performance Computing and Simulation
- Author
-
Waleed W. Smari, Sandro Fiore, and Mads Nygaard
- Subjects
Computational Theory and Mathematics, Computer Networks and Communications, Computer science, Supercomputer, Software, Computer Science Applications, Theoretical Computer Science - Published
- 2012
50. A semantic grid-based data access and integration service for bioinformatics
- Author
-
Italo Epicoco, Massimo Cafaro, Giovanni Aloisio, Sandro Fiore, Maria Mirto, Aloisio, Giovanni, Cafaro, Massimo, Epicoco, Italo, Sandro, Fiore, and Maria, Mirto
- Subjects
Biological data, Semantic grid, Data access, Workflow, Grid computing, Computer science, Web service, Semantic data model, computer.software_genre, Bioinformatics, computer, Data integration - Abstract
Given the heterogeneous nature of biological data and their intensive use in many tools, in this paper we propose a semantic data access and integration (DAI) service, based on the grid paradigm, for the bioinformatics domain. The service uses ontologies to correlate different data sets. The DAI proposed in this work is a fundamental component of the ProGenGrid system, a grid-enabled platform which aims at the design and implementation of a virtual laboratory where e-scientists can simulate complex "in silico" experiments by composing popular analysis and visualization tools (e.g. BLAST and RasMol), available as Web services, into a workflow. The main goal of the DAI is to provide bioinformatics tools with advanced functionalities and data integration services for heterogeneous biological data banks, such as PDB and Swiss-Prot. A case study of our specialized data access service for locating similar protein sequences is presented.