25 results on '"Jakub Moscicki"'
Search Results
2. Open Data Science Mesh: friction-free collaboration for researchers bridging High-Energy Physics and European Open Science Cloud
- Author
-
Jakub Moscicki, Armin Burger, David Antos, Frederik Orellana, Gavin Kennedy, Guido Aben, Holger Angenent, Lars Kierkegaard, Laura Castelucci, Lorenzo Posani, Maciej Brzezniak, Martin Bech, Massimo Lamanna, Pierre Soille, Renato Furter, Ron Trompert, Marcin Sieprawski, and Victoria Cochrane
- Abstract
Open Data Science Mesh (CS3MESH4EOSC) is a newly funded project to create a new-generation, interoperable federation of data and higher-level services to enable friction-free collaboration between European researchers. This new EU-funded project brings together 12 partners from the CS3 community (Cloud Synchronization and Sharing Services). The consortium partners include CERN, Danish Technical University (DK), SURFSARA (NL), Poznan Supercomputing Centre (PL), CESNET (CZ), AARNET (AUS), SWITCH (CH), University of Munster (DE), Ailleron SA (PL), Cubbit (IT), Joint Research Centre (BE) and Fundacion ESADE (ES). CERN acts as project coordinator. The consortium already operates services and storage-centric infrastructure for around 300 thousand scientists and researchers across the globe. The project will integrate these existing local sites and services into a seamless mesh infrastructure which is fully interconnected with the EOSC-Hub, as proposed in the European Commission's Implementation Roadmap for EOSC. The project will provide a framework for applications in several major areas: Data Science Environments, Open Data Systems, Collaborative Documents, On-demand Large Dataset Transfers and Cross-domain Data Sharing. Collaboration between users will be enabled by a simple sharing mechanism: a user selects a file or folder to share with other users at other sites. Such shared links will be established and removed dynamically by the users from a streamlined web interface of their local storage systems. The mesh will automatically and contextually enable different research workflow actions based on the type of content shared in the folder. One of the areas of excellence of CS3 services is access to content from all types of devices: web, desktop applications and mobile devices. The project augments this capability to access content stored on remote sites and will *in practice* introduce FAIR principles in European Science.
The project will leverage technologies developed and integrated in the research community, such as ScienceBox (CERNBox, SWAN, EOS), EGI-CheckIn, File Transfer Service (FTS), ARGO, EduGAIN and others. The project will also involve commercial cloud providers, integrating their software and services.
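The dynamic sharing mechanism described in the abstract can be sketched in a few lines (all class and method names here are hypothetical illustrations, not the real CS3MESH4EOSC API; the actual mesh builds on site-local storage systems and interoperable protocols):

```python
# Minimal sketch of the share-link mechanism: users establish and remove
# links dynamically, and the mesh contextually enables a workflow action
# based on the type of content shared. Names are illustrative only.

class Mesh:
    # content type -> workflow action enabled by the mesh (assumed mapping)
    ACTIONS = {
        ".ipynb": "open-data-science-environment",
        ".tex": "collaborative-editing",
        ".dat": "large-dataset-transfer",
    }

    def __init__(self):
        self.links = {}  # (owner, path) -> set of recipient users

    def share(self, owner, path, recipient):
        self.links.setdefault((owner, path), set()).add(recipient)

    def unshare(self, owner, path, recipient):
        self.links.get((owner, path), set()).discard(recipient)

    def action_for(self, path):
        # contextually pick a workflow action from the shared content type
        for suffix, action in self.ACTIONS.items():
            if path.endswith(suffix):
                return action
        return "plain-file-sharing"

mesh = Mesh()
mesh.share("alice@cern.ch", "/analysis/fit.ipynb", "bob@dtu.dk")
print(mesh.action_for("/analysis/fit.ipynb"))  # open-data-science-environment
```

The essential property is that links are plain dynamic state, not static site configuration, so sharing across institutions needs no administrator involvement.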
- Published
- 2019
- Full Text
- View/download PDF
3. Enabling interoperable data and application services in a federated ScienceMesh
- Author
-
Ishank Arora, Hugo Gonzalez Labrador, Pedro G. Ferreira, Samuel Alfageme Sainz, and Jakub Moscicki
- Subjects
Service (systems architecture), Physics, Interoperability, Sync, Access control, Cloud computing, Computing and Computers, World Wide Web, Collaborative editing, Workflow, User experience design
- Abstract
In recent years, cloud sync & share storage services, provided by academic and research institutions, have become a daily workplace environment for many local user groups in the High Energy Physics (HEP) community. These, however, are primarily disconnected and deployed in isolation from one another, even though new technologies have been developed and integrated to further increase the value of data. The EU-funded CS3MESH4EOSC project is connecting locally and individually provided sync and share services, and scaling them up to the European level and beyond. It aims to deliver the ScienceMesh service, an interoperable platform to easily sync and share data across institutions and extend functionalities by connecting to other research services using streamlined sets of interoperable protocols, APIs and deployment methodologies. This supports multiple distributed application workflows: data science environments, collaborative editing and data transfer services. In this paper, we present the architecture of ScienceMesh and the technical design of its reference implementation, a platform that allows organizations to join the federated service infrastructure easily and to access application services out-of-the-box. We discuss the challenges faced during the process, which include the diversity of sync & share platforms (Nextcloud, Owncloud, Seafile and others), the absence of global user identities and user discovery, the lack of interoperable protocols and APIs, and access control and protection of data endpoints. We present the rationale for the design decisions adopted to tackle these challenges and describe our deployment architecture based on Kubernetes, which enabled us to utilize monitoring and tracing functionalities. We conclude by reporting on the early user experience with ScienceMesh.
- Published
- 2021
- Full Text
- View/download PDF
4. Providing large-scale disk storage at CERN
- Author
-
Cristian I. Contescu, Massimo Lamanna, Hugo Gonzalez Labrador, Herve Rousseau, Giuseppe Lo Presti, Belinda Chan Kwok Cheong, Jakub Moscicki, Xavier Espinal Curull, Dan van der Ster, Jan Iven, and Luca Mascetti
- Subjects
Service (systems architecture), Nuclear and particle physics, Physics, Home directory, Cloud computing, Software distribution, Microsoft Office, Computing and Computers, Distributed data store, Operating system, Disk storage, Block (data storage)
- Abstract
The CERN IT Storage group operates multiple distributed storage systems and is responsible for the support of the infrastructure to accommodate all CERN storage requirements, from the physics data generated by LHC and non-LHC experiments to the personnel users' files. EOS is now the key component of the CERN Storage strategy. It operates at high incoming throughput for experiment data-taking while running concurrent complex production workloads. This high-performance distributed storage now provides more than 250 PB of raw disk and is the key component behind the success of CERNBox, the CERN cloud synchronisation service which allows syncing and sharing files on all major mobile and desktop platforms to provide offline availability to any data stored in the EOS infrastructure. CERNBox has recorded exponential growth in the last couple of years in terms of files and data stored, thanks to its increasing popularity within the CERN user community and to its integration with a multitude of other CERN services (Batch, SWAN, Microsoft Office). In parallel, CASTOR is being simplified and transitioned from an HSM into an archival system, focusing mainly on the long-term recording of the primary data from the detectors, paving the road to the next-generation tape archival system, CTA. The storage services at CERN also cover the needs of the rest of our community: Ceph as data back-end for the CERN OpenStack infrastructure, NFS services and S3 functionality; AFS for legacy home directory filesystem services and its ongoing phase-out; and CVMFS for software distribution. In this paper we summarise our experience in supporting all our distributed storage systems and the ongoing work in evolving our infrastructure, testing very dense storage building blocks (nodes with more than 1 PB of raw space) for the challenges waiting ahead.
- Published
- 2019
5. Declarative Big Data Analysis for High-Energy Physics: TOTEM Use Case
- Author
-
Prasanth Kothuri, Enrico Bocchi, Maciej Malawski, Danilo Piparo, Jan Kaspar, Jakub Moscicki, Leszek Grzanka, Valentina Avati, Enrico Guiraud, Milosz Blaszkiewicz, Massimo Lamanna, Luca Canali, Javier Cervantes, Aleksandra Mnich, Shravan Murali, Diogo Castro, and Enric Tejedor
- Subjects
Nuclear and particle physics, Computer science, Distributed computing, Big data, Cloud computing, Data set, Software, Scalability, Data file, Spark, Declarative programming
- Abstract
The High-Energy Physics community faces new data processing challenges caused by the expected growth of data resulting from the upgrade of the LHC accelerator. These challenges drive the demand for exploring new approaches to data analysis. In this paper, we present a new declarative programming model extending the popular ROOT data analysis framework, and its distributed processing capability based on Apache Spark. The developed framework enables high-level operations on the data, known from other big data toolkits, while preserving compatibility with existing HEP data files and software. In experiments with a real analysis of TOTEM experiment data, we evaluate the scalability of this approach and its prospects for interactive processing of such large data sets. Moreover, we show that the analysis code developed with the new model is portable between a production cluster at CERN and an external cluster hosted in the Helix Nebula Science Cloud, thanks to the bundle of services of Science Box.
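The declarative model the paper describes (in ROOT it is the RDataFrame interface) builds a chain of operations that is only executed when a result is requested, which is what lets the backend swap a local event loop for Spark. A toy stand-in, mimicking only the chaining style and deferred execution, might look like:

```python
# Toy illustration of a declarative, lazily-evaluated analysis chain.
# This is NOT ROOT's RDataFrame; it only mimics its Filter/Define/Count
# chaining style to show where the single deferred event loop runs.

class Frame:
    def __init__(self, rows, ops=None):
        self.rows = rows
        self.ops = ops or []  # deferred operations, run only on demand

    def Filter(self, pred):
        return Frame(self.rows, self.ops + [("filter", pred)])

    def Define(self, name, func):
        return Frame(self.rows, self.ops + [("define", name, func)])

    def Count(self):
        # the event loop runs once, here, when a result is requested
        n = 0
        for row in self.rows:
            row = dict(row)
            keep = True
            for op in self.ops:
                if op[0] == "define":
                    row[op[1]] = op[2](row)
                elif op[0] == "filter" and not op[1](row):
                    keep = False
                    break
            n += keep
        return n

events = [{"pt": p} for p in (5.0, 12.0, 25.0, 40.0)]
df = Frame(events).Define("pt2", lambda r: r["pt"] ** 2).Filter(lambda r: r["pt2"] > 150)
print(df.Count())  # 2
```

Because the chain is data, not executed code, a distributed backend can partition `rows` across workers and replay the same operations on each partition.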
- Published
- 2019
- Full Text
- View/download PDF
6. Facilitating collaborative analysis in SWAN
- Author
-
Enrico Bocchi, Massimo Lamanna, Pere Mato, Jakub Moscicki, Hugo Gonzalez, Enric Tejedor, Danilo Piparo, and Diogo Castro
- Subjects
Service (systems architecture), Interface, Software as a service, Physics, Cloud computing, Computing and Computers, World Wide Web, Outreach, Software, Cloud storage
- Abstract
SWAN (Service for Web-based ANalysis) is a CERN service that allows users to perform interactive data analysis in the cloud, in a "software as a service" model. It is built upon the widely-used Jupyter notebooks, allowing users to write - and run - their data analysis using only a web browser. By connecting to SWAN, users have immediate access to storage, software and computing resources that CERN provides and that they need to do their analyses. Besides providing an easier way of producing scientific code and results, SWAN is also a great tool to create shareable content. From results that need to be reproducible, to tutorials and demonstrations for outreach and teaching, Jupyter notebooks are the ideal way of distributing this content. In one single file, users can include their code, the results of the calculations and all the relevant textual information. By sharing them, it allows others to visualize, modify, personalize or even re-run all the code. In that sense, this paper describes the efforts made to facilitate sharing in SWAN. Given the importance of collaboration in our scientific community, we have brought the sharing functionality from CERNBox, CERN's cloud storage service, directly inside SWAN. SWAN users have available a new and redesigned interface where they can share "Projects": a special kind of folder containing notebooks and other files, such as input datasets and images. When a user shares a Project with other users, the latter can immediately see and work with the contents of that project from SWAN.
- Published
- 2019
7. Big Data Tools and Cloud Services for High Energy Physics Analysis in TOTEM Experiment
- Author
-
Enric Tejedor, Aleksandra Mnich, Valentina Avati, Enrico Bocchi, Prasanth Kothuri, Milosz Blaszkiewicz, Enrico Guiraud, Javier Cervantes, Luca Canali, Jan Kaspar, Massimo Lamanna, Leszek Grzanka, Danilo Piparo, Shravan Murali, Maciej Malawski, Jakub Moscicki, and Diogo Castro
- Subjects
Particle physics, Large Hadron Collider, Computer science, TOTEM, Big data, Cloud computing, Scalability, Single core, Rewriting
- Abstract
The High Energy Physics community has been developing dedicated solutions for processing experiment data over decades. However, with recent advancements in Big Data and Cloud Services, the question of applying such technologies to physics data analysis becomes relevant. In this paper, we present our initial experience with a system that combines the use of public cloud infrastructure (Helix Nebula Science Cloud), storage and processing services developed by CERN, and off-the-shelf Big Data frameworks. The system is completely decoupled from CERN's main computing facilities and provides an interactive web-based interface based on Jupyter Notebooks as the main entry point for the users. We run a sample analysis on 4.7 TB of data from the TOTEM experiment, rewriting the analysis code to leverage the PyROOT and RDataFrame model and to take full advantage of the parallel processing capabilities offered by Apache Spark. We report on the experience collected by embracing this new analysis model: preliminary scalability results show the processing time of our dataset can be reduced from 13 hrs on a single core to 7 mins on 248 cores.
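The quoted scaling figures can be sanity-checked with simple arithmetic: 13 hours on one core versus 7 minutes on 248 cores is a speedup of roughly 111x, i.e. a parallel efficiency of about 45%, which is plausible for an I/O-heavy analysis:

```python
# Back-of-the-envelope check of the scaling figures quoted in the abstract.
serial = 13 * 60   # serial processing time in minutes (13 hrs, single core)
parallel = 7       # parallel processing time in minutes
cores = 248

speedup = serial / parallel
efficiency = speedup / cores
print(round(speedup, 1), round(efficiency, 2))  # 111.4 0.45
```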
- Published
- 2018
- Full Text
- View/download PDF
8. SWAN: a Service for Interactive Analysis in the Cloud
- Author
-
Danilo Piparo, Luca Mascetti, Pere Mato, Massimo Lamanna, Enric Tejedor, and Jakub Moscicki
- Subjects
Service (systems architecture), Computer Networks and Communications, Computer science, Interface (computing), Cloud computing, Mass storage, Computing and Computers, Software, File sharing, Hardware and Architecture, Operating system, Web application
- Abstract
SWAN (Service for Web-based ANalysis) is a platform to perform interactive data analysis in the cloud. SWAN allows users to write and run their data analyses with only a web browser, leveraging the widely adopted Jupyter notebook interface. The user code, executions and data live entirely in the cloud. SWAN makes it easier to produce and share results and scientific code, access scientific software, produce tutorials and demonstrations as well as preserve analyses. Furthermore, it is also a powerful tool for non-scientific data analytics. This paper describes how a pilot of the SWAN service was implemented and deployed at CERN. Its backend combines state-of-the-art software technologies with a set of existing IT services such as user authentication, virtual computing infrastructure, mass storage, file synchronisation and sharing, specialised clusters and batch systems. The added value of this combination of services is discussed, with special focus on the opportunities offered by the CERNBox service and its massive storage backend, EOS. In particular, it is described how a cloud-based analysis model benefits from synchronised storage and sharing capabilities.
- Published
- 2016
9. CERNBox: the CERN cloud storage hub
- Author
-
Hugo Gonzalez Labrador, Diogo Castro, Giuseppe Lo Presti, Cristian I. Contescu, Massimo Lamanna, Georgios Alexandropoulos, Enrico Bocchi, Belinda Chan, Remy Pelletier, Paul Musset, Luca Mascetti, Roberto Valverde, Edward Karavakis, and Jakub Moscicki
- Subjects
Nuclear and particle physics, Physics, Universal design, Home directory, Cloud computing, Computing and Computers, Visualization, World Wide Web, Collaborative editing, Road map, Android (operating system), Cloud storage
- Abstract
CERNBox is the CERN cloud storage hub. It allows synchronizing and sharing files on all major desktop and mobile platforms (Linux, Windows, MacOSX, Android, iOS), aiming to provide universal access and offline availability to any data stored in the CERN EOS infrastructure. With more than 16000 users registered in the system, CERNBox has responded to the high demand in our diverse community for an easy and accessible cloud storage solution that also provides integration with other CERN services for big science: visualization tools, interactive data analysis and real-time collaborative editing. Collaborative authoring of documents is now becoming standard practice with public cloud services, and within CERNBox we are looking into several options: from the collaborative editing of shared office documents with different solutions (Microsoft, OnlyOffice, Collabora), to integrating mark-down as well as LaTeX editors, to exploring the evolution of Jupyter Notebooks towards collaborative editing, where the latter leverages the existing SWAN physics analysis service. We report on our experience managing this technology and applicable use-cases, also in a broader scientific and research context, and on its future evolution, with highlights on the current development status and future road map. In particular we highlight the future move to an architecture based on microservices to easily adapt and evolve the service to the technology and usage evolution, notably to unify CERN home directory services.
- Published
- 2019
- Full Text
- View/download PDF
10. The user-level scheduling of divisible load parallel applications with resource selection and adaptive workload balancing on the Grid
- Author
-
Valeria V. Krzhizhanovskaya, Vladimir Korkhov, Jakub Moscicki, and Computational Science Lab (IVI, FNWI)
- Subjects
Computer Networks and Communications, Computer science, Distributed computing, Real-time computing, Workload, Dynamic priority scheduling, Grid, Application profile, Computer Science Applications, Scheduling (computing), Load management, Grid computing, Control and Systems Engineering, Algorithm design, Electrical and Electronic Engineering, Information Systems
- Abstract
This paper presents a hybrid resource management environment, operating on both the application and system levels, developed to minimize the execution time of parallel applications with divisible workloads on heterogeneous grid resources. The system is based on the adaptive workload balancing algorithm (AWLB) incorporated into the DIANE (DIstributed ANalysis Environment) user-level scheduling (ULS) framework. The AWLB ensures optimal workload distribution based on the discovered application requirements and measured resource parameters. The ULS maintains the user-level resource pool, enables resource selection and controls the execution. We present the results of a performance comparison of the default self-scheduling used in DIANE with AWLB-based scheduling, evaluate the dynamic resource pool and resource selection mechanisms, and examine the dependence of application performance on the aggregate characteristics of the selected resources and the application profile.
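The core of adaptive workload balancing for a divisible load is distributing work in proportion to each resource's measured speed. A minimal sketch, with hypothetical benchmark numbers (AWLB itself also folds in discovered application requirements such as communication/computation ratio):

```python
# Sketch of proportional workload distribution for a divisible load:
# each worker gets a share of the work proportional to its measured speed.
# Speeds below are hypothetical benchmark values, not AWLB internals.

def balance(total_units, speeds):
    """Assign work units proportionally to measured resource speed,
    handing any integer-rounding remainder to the fastest worker."""
    total_speed = sum(speeds)
    shares = [int(total_units * s / total_speed) for s in speeds]
    shares[speeds.index(max(speeds))] += total_units - sum(shares)
    return shares

# three heterogeneous grid nodes benchmarked at 1x, 2x and 5x speed
print(balance(1000, [1.0, 2.0, 5.0]))  # [125, 250, 625]
```

With equal shares the 1x node would finish last and dominate the makespan; proportional shares let all nodes finish at roughly the same time, which is the effect the AWLB measurements aim for.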
- Published
- 2009
11. Dynamic workload balancing of parallel applications with user-level scheduling on the Grid
- Author
-
Valeria V. Krzhizhanovskaya, Vladimir Korkhov, Jakub Moscicki, and Faculty of Science
- Subjects
Job scheduler, Speedup, Computer Networks and Communications, Computer science, Distributed computing, Real-time computing, Workload, Load balancing (computing), Grid, Supercomputer, Scheduling (computing), Hardware and Architecture, Distributed algorithm, Resource management, Software
- Abstract
This paper suggests a hybrid resource management approach for efficient parallel distributed computing on the Grid. It operates on both the application and system levels, combining user-level job scheduling with a dynamic workload balancing algorithm that automatically adapts a parallel application to the heterogeneous resources, based on the actual resource parameters and the estimated requirements of the application. The hybrid environment and the algorithm for automated load balancing are described, the influence of the resource heterogeneity level is measured, and the speedup achieved with this technique is demonstrated for different types of applications and resources.
- Published
- 2009
- Full Text
- View/download PDF
12. Data Mining as a Service (DMaaS)
- Author
-
Massimo Lamanna, Danilo Piparo, Enric Tejedor, Jakub Moscicki, Pere Mato, and Luca Mascetti
- Subjects
History, Service (systems architecture), Computer science, Interface (computing), Cloud computing, Computer Science Applications, Education, Mass storage, Software, Added value, Data mining
- Abstract
Data Mining as a Service (DMaaS) is a software and computing infrastructure that allows interactive mining of scientific data in the cloud. It allows users to run advanced data analyses by leveraging the widely adopted Jupyter notebook interface. Furthermore, the system makes it easier to share results and scientific code, access scientific software, produce tutorials and demonstrations as well as preserve the analyses of scientists. This paper describes how a first pilot of the DMaaS service is being deployed at CERN, starting from the notebook interface that has been fully integrated with the ROOT analysis framework, in order to provide all the tools for scientists to run their analyses. Additionally, we characterise the service backend, which combines a set of IT services such as user authentication, virtual computing infrastructure, mass storage, file synchronisation, development portals or batch systems. The added value acquired by the combination of the aforementioned categories of services is discussed, focusing on the opportunities offered by the CERNBox synchronisation service and its massive storage backend, EOS.
- Published
- 2016
- Full Text
- View/download PDF
13. Recent Improvements in Geant4 Electromagnetic Physics Models and Interfaces
- Author
-
Helmut Burkhard, Stephane Chauvie, B. Mascialino, H.N. Tran, Francesco Romano, Rachel Black, Alexey Bogdanov, Alexander Bagulya, Joseph Perl, G. Russ, Ivan Petrović, Tomohiro Yamashita, Giorgio Ivan Russo, Aleksandra Ristić-Fira, Francesco Longo, Sabine Elles, L. Urbán, Hisaya Kurashige, John Apostolakis, Toshiyuki Toshito, Vladimir Ivanchenko, M. Karamitros, M. Maire, P. Gumplinger, C. Zacharatou, Jean Jacquemier, A. Ivanchenko, Giacomo Cuttone, O. Kadri, Francesco Di Rosa, P. Cirrone, Gerardo Depaola, Jakub Moscicki, Paul Gueye, Andreas Schaelicke, A. Mantero, Haifa Ben Abdelouahed, Vladimir Grichine, Nicolas A. Karakatsanis, Anton Lechner, Giovanni Santin, R. P. Kokoulin, Sebastien Incerti, Ziad Francis, Luciano Pandola, and F. Roman
- Subjects
Physics, Medical physics, Computer performance, Nuclear and particle physics, Computation, Energy loss, Monte Carlo method, Hadron, Bremsstrahlung, CPU time, Geant4, General Medicine, Multiple scattering, Pair production, Statistical physics
- Abstract
An overview of the electromagnetic (EM) physics of the Geant4 toolkit is presented. Two sets of EM models are available: the "Standard", initially focused on high-energy physics (HEP), and the "Low-energy", developed for medical, space and other applications. The "Standard" models provide faster computation but are less accurate at keV energies, while the "Low-energy" models are more CPU-time consuming. A common interface to EM physics models has been developed, allowing a natural combination of ultra-relativistic, relativistic and low-energy models in the same run, providing both precision and CPU performance. Due to this migration, additional capabilities have become available. The new developments include relativistic models for bremsstrahlung and e+e- pair production, models of multiple and single scattering, hadron/ion ionization, microdosimetry for very low energies, and improvements in existing Geant4 models. In parallel, validation suites and benchmarks have been intensively developed.
- Published
- 2011
14. Lattice QCD Thermodynamics on the Grid
- Author
-
Maciej Wos, Philippe de Forcrand, Massimo Lamanna, Jakub Moscicki, and Owe Philipsen
- Subjects
Quark, Particle physics, Strange quark, Lattice field theory, General Physics and Astronomy, High Energy Physics - Lattice, Statistical physics, Quantum chromodynamics, Lattice QCD, Grid, Distributed, Parallel, and Cluster Computing, Grid computing, Hardware and Architecture, Quark-gluon plasma
- Abstract
We describe how we have used simultaneously O(10³) nodes of the EGEE Grid, accumulating ca. 300 CPU-years in 2-3 months, to determine an important property of Quantum Chromodynamics. We explain how Grid resources were exploited efficiently and with ease, using a user-level overlay based on the Ganga and DIANE tools above the standard Grid software stack. Application-specific scheduling and resource selection based on simple but powerful heuristics made it possible to improve the efficiency of the processing and to obtain the desired scientific results by a specified deadline. This is also a demonstration of the combined use of supercomputers, to calculate the initial state of the QCD system, and Grids, to perform the subsequent massively distributed simulations. The QCD simulation was performed on a 16³ × 4 lattice. Keeping the strange quark mass at its physical value, we reduced the masses of the up and down quarks until, under an increase of temperature, the system underwent a second-order phase transition to a quark-gluon plasma. Then we measured the response of this system to an increase in the quark density. We find that the transition is smoothened rather than sharpened. If confirmed on a finer lattice, this finding makes it unlikely for ongoing experimental searches to find a QCD critical point at small chemical potential.
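The quoted resource figures are internally consistent: 300 CPU-years spread over roughly 1000 concurrent nodes corresponds to about 3.6 months of wall-clock time, matching the stated 2-3 month window once one allows for peaks above 10³ nodes and imperfect utilization:

```python
# Sanity check on the quoted numbers: ~300 CPU-years accumulated
# on O(10^3) concurrently used EGEE Grid nodes.
cpu_years = 300
nodes = 1000                    # order-of-magnitude node count

wall_years = cpu_years / nodes  # wall-clock years at full utilization
wall_months = wall_years * 12
print(round(wall_months, 1))  # 3.6
```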
- Published
- 2009
- Full Text
- View/download PDF
15. Benchmark of medical dosimetry simulation using the Grid
- Author
-
Maria Grazia Pia, Jakub Moscicki, Patricia Mendez Lorenzo, Stephane Chauvie, and Anton Lechner
- Subjects
Large Hadron Collider, Grid computing, Computer science, Middleware, Benchmark (computing), Dosimetry, Application software, Software architecture, Grid, Computational science
- Abstract
The practical capability of using Grid resources to perform high-precision dosimetry simulation in radiation oncology is evaluated, taking into account the peculiar demands of different treatment modalities. For this purpose extensive benchmark tests on the LHC Computing Grid (LCG) are performed, involving the calculation of dose distributions as required for clinical practice. The software architecture of the test is based on Geant4 for simulation, on AIDA for data analysis, on the LCG middleware for grid computing and on DIANE as an intermediate layer between the application software and the computing environment.
- Published
- 2007
- Full Text
- View/download PDF
16. Geant4 Simulation in a Distributed Computing Environment
- Author
-
Susanna Guatelli, Maria Grazia Pia, Jakub Moscicki, A. Mantero, and Patricia Mendez Lorenzo
- Subjects
Distributed design patterns, Distributed Computing Environment, Grid computing, Parallel processing, Distributed algorithms, Computer science, Distributed computing, Parallel computing, Architecture, Grid
- Abstract
A general system to perform a Geant4-based simulation in a distributed computing environment is presented. The architecture developed makes the system transparent to sequential or parallel execution, and to the environment where the simulation is run: a single machine, a local cluster or a geographically distributed grid.
- Published
- 2006
- Full Text
- View/download PDF
17. Experiences in the Gridification of the Geant4 Toolkit in the WLCG/EGEE Environment
- Author
-
P. Mendez-Lorenzo, Alberto Ribon, and Jakub Moscicki
- Subjects
Physics, High energy, Virtual organization, Monte Carlo method, Grid, Computational science, Grid computing, Regression testing, Software engineering
- Abstract
The general patterns observed in supporting the Geant4 application in the EGEE infrastructure are discussed. Regression testing of Geant4 public releases is the focus of this paper. Geant4 is a toolkit for the Monte Carlo simulation of the interaction of particles with matter, used across a wide field of research, including high energy and nuclear physics as well as medical, accelerator and space physics studies. The support required for the release regression testing of the Geant4 toolkit, including the setting up of a new, official Virtual Organization in the EGEE, is explained. Recent developments of automatic regression testing suites and the benefits of the optimization layer above the standard Grid infrastructure are presented.
- Published
- 2006
- Full Text
- View/download PDF
18. Biomedical applications on the GRID: efficient management of parallel jobs
- Author
-
Jakub Moscicki, H.C. Lee, Susanna Guatelli, S.C. Lin, and Maria Grazia Pia
- Subjects
Distributed Computing Environment, Grid computing, Computer science, Robustness (computer science), Distributed computing, Monte Carlo method, Grid resources, Grid, Computational science
- Abstract
Distributed computing based on the Master-Worker and PULL interaction model is applicable to a number of applications in high energy physics, medical physics and bioinformatics. We demonstrate a realistic medical physics use-case of a dosimetric system for brachytherapy using distributed GRID resources. We present efficient techniques for running parallel jobs in the case of BLAST, a gene-sequencing application, as well as for Monte Carlo simulation based on Geant4. We present a strategy for improving the runtime performance and robustness of the jobs, as well as for minimizing the development time needed to migrate the applications to a distributed environment.
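In the Master-Worker PULL model named above, idle workers request the next unit of work from the master rather than receiving a fixed pre-assigned share, so faster nodes naturally process more tasks on heterogeneous Grid resources. A minimal sketch (sequential stand-in for concurrently pulling workers; the squaring step is a placeholder for real Geant4 or BLAST work):

```python
# Minimal sketch of the master-worker PULL model: workers pull tasks
# whenever idle, so load balances itself across heterogeneous nodes.
from queue import Queue

def master(tasks):
    """The master only fills a shared task queue; it assigns nothing."""
    q = Queue()
    for t in tasks:
        q.put(t)
    return q

def worker(q, results, name):
    """A worker repeatedly pulls the next task until the queue is empty."""
    while not q.empty():
        task = q.get()                       # PULL the next unit of work
        results.append((name, task * task))  # placeholder for real work

tasks = list(range(8))
q = master(tasks)
results = []
# sequential stand-in for two concurrently pulling workers
worker(q, results, "w1")
worker(q, results, "w2")
print(len(results), sum(r for _, r in results))  # 8 140
```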
- Published
- 2005
- Full Text
- View/download PDF
19. DISTRIBUTED PROCESSING, MONTECARLO AND CT INTERFACE FOR MEDICAL TREATMENT PLANNING
- Author
-
S. Guatelli, Franca Foppiano, M. G. Pia, and Jakub Moscicki
- Subjects
General purpose, Medical treatment, Grid computing, Computer science, Computation, Distributed computing, Interface (computing), Dosimetry, User interface, computer.software_genre, computer - Abstract
We show how it is now possible to achieve both accuracy and fast computation response in radiotherapy dosimetry using Monte Carlo methods together with a Grid computing model. We present a complete, fully functional prototype system for brachytherapy, integrating a Geant4-based simulation, an AIDA-based dosimetric analysis, a web-based user interface, and distributed processing either on a local computing farm or on geographically distributed nodes. Thanks to the object-oriented approach adopted for the architecture, the work presented can easily be extended into a general-purpose dosimetric system capable of addressing all radiotherapy techniques.
- Published
- 2004
- Full Text
- View/download PDF
20. DIANE - distributed analysis environment for GRID-enabled simulation and analysis of physics data
- Author
-
Jakub Moscicki
- Subjects
Workflow, Database, Grid computing, Interfacing, Computer science, Distributed computing, Component (UML), Middleware (distributed applications), Interoperability, Web service, computer.software_genre, Grid, computer - Abstract
The Distributed Analysis Environment (DIANE) is the result of R&D in the CERN IT Division focused on interfacing semi-interactive parallel applications with distributed GRID technology. DIANE provides a master-worker workflow management layer above low-level GRID services. DIANE is application- and language-neutral: its component-container architecture and component adapters provide the flexibility necessary to fulfill the diverse requirements of distributed applications. The physical transport layer ensures interoperability with existing middleware frameworks based on Web services. Several distributed simulations based on Geant4 were deployed and tested with DIANE in real-life scenarios.
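The component-adapter idea can be sketched as follows: the framework sees only a neutral adapter interface, while each application plugs in its own split/run/merge logic. The interface and names below are hypothetical, not DIANE's actual API:

```python
class ApplicationAdapter:
    """Neutral contract a DIANE-style framework might expect of an
    application component (hypothetical interface)."""
    def split(self, job_spec):
        raise NotImplementedError
    def run_task(self, task):
        raise NotImplementedError
    def merge(self, results):
        raise NotImplementedError

class SumAdapter(ApplicationAdapter):
    """Toy application: sum a list of numbers in parallel chunks."""
    def split(self, job_spec):
        data, n_tasks = job_spec
        k = max(1, len(data) // n_tasks)
        return [data[i:i + k] for i in range(0, len(data), k)]
    def run_task(self, task):
        return sum(task)
    def merge(self, results):
        return sum(results)

def run(adapter, job_spec):
    """The framework drives any adapter the same way: split, run, merge."""
    tasks = adapter.split(job_spec)
    return adapter.merge(adapter.run_task(t) for t in tasks)
```

The point of the pattern is that `run` never changes: swapping in a Geant4 simulation adapter instead of `SumAdapter` leaves the workflow layer untouched.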
- Published
- 2003
- Full Text
- View/download PDF
21. From DICOM to GRID: a dosimetric system for brachytherapy born from HEP
- Author
-
Franca Foppiano, Jakub Moscicki, Maria Grazia Pia, and Susanna Guatelli
- Subjects
business.industry, Computer science, medicine.medical_treatment, Brachytherapy, Dose distribution, Grid, DICOM, Software, Computer engineering, medicine, Dosimetry, Radiation treatment planning, business, Simulation - Abstract
In brachytherapy, software defines the configuration of radioactive seeds required to achieve the desired dose distribution in the patient. We show how it is now possible to develop brachytherapy software that achieves both high accuracy and speed by combining several software toolkits: the Geant4 simulation toolkit, the AIDA analysis toolkit, the GRID and the Web. The Geant4-based brachytherapy application calculates the dose distribution in tissue with great accuracy, in a realistic experimental set-up derived from CT data. The AIDA analysis toolkit handles the processing of simulation results. The application can be run through a Web portal, sharing distributed computing resources thanks to its integration with the GRID, which makes advanced treatment planning tools accessible even to modest-size hospitals.
- Published
- 2003
- Full Text
- View/download PDF
22. Toward a petabyte-scale AFS service at CERN
- Author
-
Arne Wiebalck, Daniel van der Ster, and Jakub Moscicki
- Subjects
Unix, History, Service (systems architecture), Engineering, Large Hadron Collider, Database, business.industry, Scale (chemistry), Petabyte, computer.software_genre, Computer Science Applications, Education, File server, Work (electrical), Component-based software engineering, Operating system, business, computer - Abstract
AFS is a mature and reliable storage service at CERN, having served for more than 20 years as the provider of Unix home directories and project areas. Recently, the AFS service has grown at unprecedented rates (200% in the past year); this growth was unlocked thanks to innovations in both the hardware and software components of our file servers. This work presents how AFS is used at CERN and how the service offering is evolving with the increasing storage needs of its local and remote user communities. In particular, we describe the usage patterns for home directories, workspaces and project spaces, as well as the daily work required to rebalance data and maintain stability and performance. Finally, we highlight some recent changes and optimisations made to the AFS service, revealing how AFS manages to operate at all while being subjected to frequent, almost DDoS-like, attacks from its users.
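The daily rebalancing work mentioned above can be illustrated with a greedy planning sketch that repeatedly moves a volume from the fullest file server to the emptiest one, stopping when no move would still improve the balance. This is a toy model with hypothetical names, not the real AFS volume-move tooling:

```python
def plan_rebalance(servers, volumes, max_moves=10):
    """servers: {server: used_bytes}; volumes: {vol: (server, size_bytes)}.
    Returns a list of (volume, src_server, dst_server) moves."""
    servers = dict(servers)
    placement = {v: s for v, (s, _) in volumes.items()}
    sizes = {v: sz for v, (_, sz) in volumes.items()}
    plan = []
    for _ in range(max_moves):
        src = max(servers, key=servers.get)  # fullest server
        dst = min(servers, key=servers.get)  # emptiest server
        # only consider moves that do not overshoot the balance point
        candidates = [v for v, s in placement.items()
                      if s == src
                      and servers[src] - sizes[v] >= servers[dst] + sizes[v]]
        if not candidates:
            break
        vol = max(candidates, key=sizes.get)  # largest helpful volume first
        plan.append((vol, src, dst))
        servers[src] -= sizes[vol]
        servers[dst] += sizes[vol]
        placement[vol] = dst
    return plan
```

A production rebalancer would additionally weigh I/O load and perform the moves on live volumes, but the greedy selection shape is similar.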
- Published
- 2014
- Full Text
- View/download PDF
23. Distributed analysis in ATLAS using GANGA
- Author
-
Mark Slater, Katarina Pajchel, Johannes Elmsheuser, Benjamin Gaidioz, A. A. Maier, Jakub Moscicki, A. Soroko, Hurng Chun Lee, D.C. Vanderster, Ulrik Egede, W. Reece, Mike Williams, G. A. Cowan, Frederic Brochu, and Bjørn Hallvard Samset
- Subjects
History, Engineering, Large Hadron Collider, Database, business.industry, User analysis, Grid, computer.software_genre, Computer Science Applications, Education, Set (abstract data type), Grid computing, Management system, Batch processing, User interface, business, computer - Abstract
Distributed data analysis using Grid resources is one of the fundamental applications in high energy physics to be addressed and realized before the start of LHC data taking. The demands on resource management are very high: in every experiment, up to a thousand physicists will be submitting analysis jobs to the Grid. Appropriate user interfaces and helper applications have to be made available so that all users can use the Grid without expertise in Grid technology. These tools enlarge the number of Grid users from a few production administrators to potentially all participating physicists. The GANGA job management system (http://cern.ch/ganga), developed as a common project between the ATLAS and LHCb experiments, provides and integrates this kind of tool. GANGA provides a simple and consistent way of preparing, organizing and executing analysis tasks within the experiment analysis framework, implemented through a plug-in system. It allows trivial switching between running test jobs on a local batch system and running large-scale analyses on the Grid, hiding Grid technicalities. We report on the plug-ins and our experience of distributed data analysis using GANGA within the ATLAS experiment. All Grids presently used by ATLAS are supported, namely LCG/EGEE, NDGF/NorduGrid and OSG/PanDA. The integration and interaction of GANGA with the ATLAS data management system DQ2 is a key functionality. Intelligent job brokering is set up by combining the job-splitting mechanism with dataset and file-location knowledge. The brokering is aided by an automated system that regularly processes test analysis jobs at all ATLAS DQ2-supported sites. Large numbers of analysis jobs can be sent to the locations of the data, following the ATLAS computing model. GANGA supports, among other things, user analysis with reconstructed data and small-scale production of Monte Carlo data.
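The data-location-aware brokering described above can be sketched as follows: input files are grouped by a site holding a replica, and each site's list is split into subjobs that run close to the data. This is a toy model, not GANGA's actual broker; `replica_catalog` stands in for the DQ2 replica lookup and all names are hypothetical:

```python
def broker_by_location(input_files, replica_catalog, files_per_subjob=2):
    """Group input files by hosting site (replica_catalog: file -> [sites]),
    then split each site's file list into subjobs of bounded size."""
    by_site = {}
    for f in input_files:
        sites = replica_catalog.get(f, [])
        if not sites:
            continue  # no known replica: skip (a real broker would stage or fail)
        # crude balancing: prefer the replica site with the least work so far
        site = min(sites, key=lambda s: len(by_site.get(s, [])))
        by_site.setdefault(site, []).append(f)
    subjobs = []
    for site, files in sorted(by_site.items()):
        for i in range(0, len(files), files_per_subjob):
            subjobs.append({"site": site, "files": files[i:i + files_per_subjob]})
    return subjobs
```

Each returned subjob can then be submitted to the named site, so the data is read locally rather than over the WAN.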
- Published
- 2010
- Full Text
- View/download PDF
24. User analysis of LHCb data with Ganga
- Author
-
Bjørn Hallvard Samset, Hurng Chun Lee, Benjamin Gaidioz, Katarina Pajchel, Daniel van der Ster, C.L. Tan, Mike Williams, A. A. Maier, K. Harrison, Jakub Moscicki, W. Reece, Greg Cowan, A. Soroko, Adrian Muraru, Johannes Elmsheuser, Mark Slater, Dietrich Liko, Frederic Brochu, and Ulrik Egede
- Subjects
History ,Engineering ,Large Hadron Collider ,business.industry ,User analysis ,Grid ,computer.software_genre ,Bookkeeping ,Computer Science Applications ,Education ,Variety (cybernetics) ,Set (abstract data type) ,Software ,Operating system ,Architecture ,business ,computer - Abstract
GANGA (http://cern.ch/ganga) is a job-management tool that offers a simple, efficient and consistent user analysis experience in a variety of heterogeneous environments: from local clusters to global Grid systems. Experiment-specific plug-ins allow GANGA to be customised for each experiment. For LHCb users, GANGA is the officially supported and advertised tool for job submission to the Grid. The LHCb-specific plug-ins support end-to-end analysis, helping users to perform their complete analysis with GANGA. This starts with data selection, where a user can select data sets from the LHCb Bookkeeping system. Next comes the set-up of large analysis jobs: tailored plug-ins for the LHCb core software manage these jobs by splitting them and subsequently merging the resulting files. Furthermore, GANGA offers support for toy Monte Carlo studies to help users tune their analyses. In addition to describing the GANGA architecture, typical usage patterns within LHCb and experience with the updated LHCb DIRAC workload management system are presented.
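The split/merge pattern described above can be sketched minimally: a dataset is divided evenly across subjobs, and the subjob outputs (here, histograms) are merged bin-wise afterwards. This is illustrative only, not the LHCb plug-in's real code; the function names are hypothetical:

```python
def split_dataset(files, n_subjobs):
    """Split an input file list evenly across subjobs
    (the remainder goes to the first subjobs)."""
    q, r = divmod(len(files), n_subjobs)
    subjobs, i = [], 0
    for k in range(n_subjobs):
        size = q + (1 if k < r else 0)
        subjobs.append(files[i:i + size])
        i += size
    return subjobs

def merge_histograms(subjob_hists):
    """Bin-wise sum of equally binned histograms produced by the subjobs."""
    merged = [0] * len(subjob_hists[0])
    for h in subjob_hists:
        for i, x in enumerate(h):
            merged[i] += x
    return merged
```

In the real workflow the splitter emits subjob configurations and the merger collects output files after the subjobs complete, but the shape of the computation is the same.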
- Published
- 2010
- Full Text
- View/download PDF
25. Distributed analysis using GANGA on the EGEE/LCG infrastructure
- Author
-
A. A. Maier, Johannes Elmsheuser, Ulrik Egede, Adrian Muraru, Dietrich Liko, Hurng Chun Lee, A. Soroko, Frederic Brochu, K. Harrison, Benjamin Gaidioz, Vladimir Romanovsky, Jakub Moscicki, and C.L. Tan
- Subjects
History, Engineering, Large Hadron Collider, Database, business.industry, User analysis, computer.software_genre, Grid, Computer Science Applications, Education, Grid computing, Management system, Batch processing, Key (cryptography), User interface, business, computer - Abstract
Distributed data analysis using Grid resources is one of the fundamental applications in high energy physics to be addressed and realized before the start of LHC data taking. The need to facilitate access to the resources is very high: in every experiment, up to a thousand physicists will be submitting analysis jobs to the Grid. Appropriate user interfaces and helper applications have to be made available so that all users can use the Grid without deep expertise in Grid technology. These tools enlarge the number of Grid users from a few production administrators to potentially all participating physicists. The GANGA job management system (http://cern.ch/ganga), developed as a common project between the ATLAS and LHCb experiments, provides and integrates this kind of tool. GANGA provides a simple and consistent way of preparing, organizing and executing analysis tasks within the experiment analysis framework, implemented through a plug-in system. It allows trivial switching between running test jobs on a local batch system and running large-scale analyses on the Grid, hiding Grid technicalities. We report on the plug-ins and our experience of distributed data analysis using GANGA within the ATLAS experiment and the EGEE/LCG infrastructure. The integration of GANGA with the ATLAS data management system DQ2 is a key functionality. In combination with the job-splitting mechanism, large numbers of jobs can be sent to the locations of the data, following the ATLAS computing model. GANGA supports user analysis with reconstructed data and small-scale production of Monte Carlo data.
- Published
- 2008
- Full Text
- View/download PDF