13 results on '"Zhou, Amelie Chi"'
Search Results
2. Privacy-preserving workflow scheduling in geo-distributed data centers
- Author
- Xiao, Yao, Zhou, Amelie Chi, Yang, Xuan, and He, Bingsheng
- Published
- 2022
3. Improving the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems with Eley
- Author
- Yildiz, Orcun, Zhou, Amelie Chi, and Ibrahim, Shadi
- Published
- 2018
4. FarSpot: Optimizing Monetary Cost for HPC Applications in the Cloud Spot Market.
- Author
- Zhou, Amelie Chi, Lao, Jianming, Ke, Zhoubin, Wang, Yi, and Mao, Rui
- Subjects
- *HIGH performance computing, *CLOUD computing, *SPOT prices, *BID price, *FAULT tolerance (Engineering), *COST
- Abstract
Recently, we have witnessed many HPC applications developed and hosted in the cloud, which can benefit from the elastic and diversified resources on the cloud, while on the other hand confronting high costs for executing the long-running HPC applications. Although public clouds such as Amazon EC2 offer spot instances with dynamic and usually low prices compared to on-demand ones, the spot prices can vary significantly and sometimes can even be more expensive than on-demand prices of the same type. Previous work on reducing the monetary cost for HPC applications using spot instances focused on designing fault tolerance techniques or selecting appropriate instance types/bid prices to make good usage of the low spot prices. However, with the recent update of spot pricing model on Amazon EC2, these work may become either inefficient or invalid. In this article, we present FarSpot which is an optimization framework for HPC applications in the latest cloud spot market with the goal of minimizing application cost while ensuring performance constraints. FarSpot provides accurate long-term price prediction for a wide range of spot instance types using ensemble-based learning method. It further incorporates a cost-aware deadline assignment algorithm to distribute application deadline to each task according to spot price changes. With the assigned subdeadline of each task, FarSpot dynamically migrates tasks among spot instances to reduce execution cost. Evaluation results using real HPC benchmark show that 1) the prediction error of FarSpot is very low (below 3%), 2) FarSpot reduced the monetary cost by 32% on average compared to state-of-the-art algorithms, and 3) FarSpot satisfies the user-specified deadline constraints at all time. [ABSTRACT FROM AUTHOR]
- Published
- 2022
5. ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs.
- Author
- Chen, Zheng, Zhang, Feng, Zhou, Amelie Chi, Zhai, Jidong, Zhang, Chenyang, and Du, Xiaoyong
- Published
- 2020
6. A Taxonomy and Survey of Scientific Computing in the Cloud
- Author
- Zhou, Amelie Chi, He, Bingsheng, and Ibrahim, Shadi
- Affiliations
- Scalable Storage for Clouds and Beyond (KerData); SYSTÈMES LARGE ÉCHELLE (IRISA-D1); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Inria Rennes – Bretagne Atlantique; Institut National de Recherche en Informatique et en Automatique (Inria); CentraleSupélec; Télécom Bretagne; Université de Rennes 1 (UR1); Université de Rennes (UNIV-RENNES); École normale supérieure - Rennes (ENS Rennes); Université de Bretagne Sud (UBS); Centre National de la Recherche Scientifique (CNRS); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); National University of Singapore (NUS)
- Subjects
- [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC], [INFO] Computer Science [cs]
- Abstract
Cloud computing has evolved as a popular computing infrastructure for many applications. With (big) data acquiring a crucial role in eScience, efforts have been made recently exploring how to efficiently develop and deploy scientific applications on the unprecedentedly scalable cloud infrastructures. We review recent efforts in developing and deploying scientific computing applications in the cloud. In particular, we introduce a taxonomy specifically designed for scientific computing in the cloud, and further review the taxonomy with four major kinds of science applications, including life sciences, physics sciences, social and humanities sciences, and climate and earth sciences. Due to the large data size in most scientific applications, the performance of I/O operations can greatly affect the overall performance of the applications. We notice that, the dynamic I/O performance of the cloud has made the resource provisioning an important and complex problem for scientific applications in the cloud. We present our efforts on improving the resource provisioning efficiency and effectiveness of scientific applications in the cloud. Finally, we present the open problems for developing the next-generation eScience applications and systems in the cloud and conclude this chapter.
- Published
- 2016
7. Cost-Aware Partitioning for Efficient Large Graph Processing in Geo-Distributed Datacenters.
- Author
- Zhou, Amelie Chi, Shen, Bingkun, Xiao, Yao, Ibrahim, Shadi, and He, Bingsheng
- Subjects
- *SERVER farms (Computer network management), *APPLICATION stores, *WIDE area networks, *DATA transmission systems, *PARTITION functions
- Abstract
Graph processing is an emerging computation model for a wide range of applications and graph partitioning is important for optimizing the cost and performance of graph processing jobs. Recently, many graph applications store their data on geo-distributed datacenters (DCs) to provide services worldwide with low latency. This raises new challenges to existing graph partitioning methods, due to the multi-level heterogeneities in network bandwidth and communication prices in geo-distributed DCs. In this article, we propose an efficient graph partitioning method named Geo-Cut, which takes both the cost and performance objectives into consideration for large graph processing in geo-distributed DCs. Geo-Cut adopts two optimization stages. First, we propose a cost-aware streaming heuristic and utilize the one-pass streaming graph partitioning method to quickly assign edges to different DCs while minimizing inter-DC data communication cost. Second, we propose two partition refinement heuristics which identify the performance bottlenecks of geo-distributed graph processing and refine the partitioning result obtained in the first stage to reduce the inter-DC data transfer time while satisfying the budget constraint. Geo-Cut can be also applied to partition dynamic graphs thanks to its lightweight runtime overhead. We evaluate the effectiveness and efficiency of Geo-Cut using real-world graphs with both real geo-distributed DCs and simulations. Evaluation results show that Geo-Cut can reduce the inter-DC data transfer time by up to 79 percent (42 percent as the median) and reduce the monetary cost by up to 75 percent (26 percent as the median) compared to state-of-the-art graph partitioning methods with a low overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2020
8. Energy-Efficient Speculative Execution using Advanced Reservation for Heterogeneous Clusters.
- Author
- Zhou, Amelie Chi, Phan, Tien-Dat, Ibrahim, Shadi, and He, Bingsheng
- Published
- 2018
9. Privacy Regulation Aware Process Mapping in Geo-Distributed Cloud Data Centers.
- Author
- Zhou, Amelie Chi, Xiao, Yao, Gong, Yifan, He, Bingsheng, Zhai, Jidong, and Mao, Rui
- Subjects
- *SERVER farms (Computer network management), *NETWORK performance, *PRIVACY, *PARALLEL processing, *CLOUDS & the environment, *MACHINE learning
- Abstract
Recently, various applications including data analytics and machine learning have been developed for geo-distributed cloud data centers. For those applications, the ways of mapping parallel processes to physical nodes (i.e., "process mapping") could significantly impact the performance of the applications because of non-uniform communication cost in geo-distributed environments. What's more, the different data privacy requirements in geo-distributed data centers pose additional constraints on process mapping solutions. While process mapping has been widely studied in grid/cluster environments, few of the existing studies have considered the problem in geo-distributed cloud environment, which is a challenging task due to the multi-level data privacy constraints, heterogeneous network performance and process failures. In this paper, we introduce the special privacy requirements in geo-distributed data centers and formulate the geo-distributed process mapping problem as an optimization problem with multiple constraints. We develop a new method to efficiently find good process mapping solutions to the problem. Experimental results on real clouds (including Amazon EC2 and Windows Azure) and simulations demonstrate that our proposed approach can achieve significant performance improvement compared to the state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2019
10. A Taxonomy and Survey on eScience as a Service in the Cloud
- Author
- Zhou, Amelie Chi, He, Bingsheng, and Ibrahim, Shadi
- Subjects
- FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Distributed, Parallel, and Cluster Computing (cs.DC)
- Abstract
Cloud computing has recently evolved as a popular computing infrastructure for many applications. Scientific computing, which was mainly hosted in private clusters and grids, has started to migrate development and deployment to the public cloud environment. eScience as a service becomes an emerging and promising direction for science computing. We review recent efforts in developing and deploying scientific computing applications in the cloud. In particular, we introduce a taxonomy specifically designed for scientific computing in the cloud, and further review the taxonomy with four major kinds of science applications, including life sciences, physics sciences, social and humanities sciences, and climate and earth sciences. Our major finding is that, despite existing efforts in developing cloud-based eScience, eScience still has a long way to go to fully unlock the power of cloud computing paradigm. Therefore, we present the challenges and opportunities in the future development of cloud-based eScience services, and call for collaborations and innovations from both the scientific and computer system communities to address those challenges.
- Published
- 2014
11. A Declarative Optimization Engine for Resource Provisioning of Scientific Workflows in Geo-Distributed Clouds.
- Author
- Zhou, Amelie Chi, He, Bingsheng, Cheng, Xuntao, and Lau, Chiew Tong
- Subjects
- *VIRTUAL machine systems, *CLOUD computing, *WORKFLOW management systems, *MATHEMATICAL optimization, *GRAPHICS processing units
- Abstract
Geo-distributed clouds are becoming increasingly popular for cloud providers, and data centers with different regions often offer different prices, even for the same type of virtual machines. Resource provisioning in geo-distributed clouds is an important and complicated problem for budget and performance optimizations of scientific workflows. Scientists are facing the complexities resulted from various cloud offerings in the geo-distributed settings, severe cloud performance dynamics and evolving user requirements on performance and cost. To address those complexities, we propose a declarative optimization engine named Geco for resource provisioning of scientific workflows in geo-distributed clouds. Geco allows users to specify their workflow optimization goals and constraints of specific problems with an extended declarative language. We propose a novel probabilistic optimization approach for evaluating the declarative optimization goals and constraints to address the cloud dynamics. Additionally, we develop runtime optimizations to more effectively utilize the cloud resources at runtime. To accelerate the solution finding, Geco leverages the power of GPUs to find the solution in a fast and timely manner. Our evaluations with four common workflow provisioning problems demonstrate that, Geco is able to achieve more effective performance/cost optimizations in geo-distributed cloud environments than the state-of-the-art approaches. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
12. Simplified Resource Provisioning for Workflows in IaaS Clouds.
- Author
- Zhou, Amelie Chi and He, Bingsheng
- Published
- 2014
13. Improving Update-Intensive Workloads on Flash Disks through Exploiting Multi-Chip Parallelism.
- Author
- He, Bingsheng, Yu, Jeffrey Xu, and Zhou, Amelie Chi
- Subjects
- WORKLOAD of computer networks, MULTICHIP modules (Microelectronics), SOLID state drives, INFORMATION retrieval, RANDOM access memory, TRANSACTION systems (Computer systems)
- Abstract
Solid state drives (SSDs), or flash disks have been considered as ideal storage for various data-intensive workloads, because of the low random access latency and the intra-disk multi-chip parallelism. However, due to inherent nature of flash memories, update-intensive workloads cause the flash disk fragmented, and trigger costly internal activities such as cleaning and wear leveling. We use database transaction processing as a motivating update-intensive workload. Our studies based on a flash disk simulator as well as flash disks show that, these activities result in significant overhead to the I/O response time and system throughput. To resolve the impact of internal activities, we propose dynamic page replications to exploit the multi-chip parallelism on the flash disk. Specifically, we replicate the frequently blocked data pages to improve the data availability even when internal activities block the request. To reduce the overhead of replications, we take advantage of the idle periods in the flash chips for the I/O operations by writes to replicas or reads from replicas, and further develop a prediction model for the decisions on those I/O operations to minimize the interference to normal I/O operations. We evaluate our techniques with three public transaction benchmarks in the simulator as well as on the real flash disks. Our results demonstrate the effectiveness of our replication management on improving I/O response time and system throughput. [ABSTRACT FROM PUBLISHER]
- Published
- 2015