86 results for "Xuanhua Shi"
Search Results
2. Optimizing the copy-on-write mechanism of docker by dynamic prefetching
- Author
-
Yan Jiang, Xuanhua Shi, Wei Liu, and Weizhong Qiang
- Subjects
Instruction prefetch, Multidisciplinary, Copy-on-write, Computer science, Container (abstract data type), Operating system, Overhead (computing), Image file formats, Layer (object-oriented design)
Docker, as a mainstream container solution, adopts the Copy-on-Write (CoW) mechanism in its storage drivers. This mechanism allows different containers to share the same image. However, when a container modifies an image file, a duplicate of that file is created in the upper read-write layer, which contributes to runtime overhead. When the accessed image file is fairly large, this additional overhead becomes non-negligible. Here we present Dynamic Prefetching Strategy Optimization (DPSO), which optimizes the CoW mechanism for Docker containers on the basis of a dynamic prefetching strategy. At the beginning of the container life cycle, DPSO pre-copies up the image files that are most likely to be copied up later, eliminating the overhead of performing this operation during application runtime. The experimental results show that DPSO achieves an average prefetch accuracy greater than 78% in complex scenarios and can effectively eliminate the overhead caused by the CoW mechanism.
- Published
- 2021
- Full Text
- View/download PDF
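The entry above describes pre-copying the image files that are most likely to be copied up later. As a minimal illustrative sketch (not the paper's implementation), the snippet below ranks image files by how often past container runs triggered a copy-up on them and pre-copies the top candidates at container start; the paths, the history format, and the budget are hypothetical assumptions.

```python
import shutil
from collections import Counter
from pathlib import Path

def pre_copy_hot_files(image_root, upper_dir, copy_up_history, budget=10):
    """Pre-copy the files most frequently copied up in past runs (toy DPSO-style policy)."""
    ranked = Counter(copy_up_history).most_common(budget)  # history: list of relative paths
    for rel_path, _count in ranked:
        src = Path(image_root) / rel_path
        dst = Path(upper_dir) / rel_path
        if src.is_file() and not dst.exists():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # the "copy-up" now happens before the workload runs

# Example call with made-up paths and history (illustration only):
# pre_copy_hot_files("/var/lib/docker/image", "/var/lib/docker/upper",
#                    ["etc/app.conf", "data/db.sqlite", "etc/app.conf"])
```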
3. Semantic and Syntactic Enhanced Aspect Sentiment Triplet Extraction
- Author
-
Hai Jin, Zhexue Chen, Xuanhua Shi, Bang Liu, and Hong Huang
- Subjects
Computation and Language (cs.CL), Computer science, Natural language processing, Artificial intelligence, Inference, Mutual information, Pipeline (software), Benchmark (computing), Graph (abstract data type), Sentence, Word (computer architecture)
Aspect Sentiment Triplet Extraction (ASTE) aims to extract triplets from sentences, where each triplet includes an entity, its associated sentiment, and the opinion span explaining the reason for the sentiment. Most existing research addresses this problem in a multi-stage pipeline manner, which neglects the mutual information among the three elements and suffers from error propagation. In this paper, we propose a Semantic and Syntactic Enhanced aspect Sentiment triplet Extraction model (S3E2) to fully exploit the syntactic and semantic relationships between the triplet elements and jointly extract them. Specifically, we design a graph-sequence dual representation and modeling paradigm for ASTE: we represent the semantic and syntactic relationships between word pairs in a sentence as a graph encoded by Graph Neural Networks (GNNs), while modeling the original sentence with an LSTM to preserve sequential information. Under this setting, we further apply a more efficient inference strategy for triplet extraction. Extensive evaluations on four benchmark datasets show that S3E2 significantly outperforms existing approaches, demonstrating its superiority and flexibility as an end-to-end model.
- Published
- 2021
- Full Text
- View/download PDF
4. Vectorizing disk blocks for efficient storage system via deep learning
- Author
-
Dong Dai, Xuanhua Shi, Yong Chen, Jiang Zhou, and Forrest Sheng Bao
- Subjects
File system, Artificial neural network, Computer Networks and Communications, Computer science, Deep learning, Computer Graphics and Computer-Aided Design, Theoretical Computer Science, Scheduling (computing), Computer engineering, Artificial intelligence, Hardware and Architecture, Computer data storage, Software
Efficient storage systems depend on intelligent management of data units, i.e., disk blocks at the local file system level. Block correlations represent the semantic patterns in storage systems. These correlations can be exploited for data caching, prefetching, layout optimization, I/O scheduling, and more to realize an efficient storage system. In this paper, we introduce Block2Vec, a deep-learning-based strategy for mining block correlations in storage systems. The core idea of Block2Vec is twofold. First, it proposes a new way to abstract blocks, which are represented as multi-dimensional vectors instead of traditional block IDs. In this way, we are able to capture similarity between blocks through the distances of their vectors. Second, based on this vector representation, it trains a deep neural network to learn the best vector assignment for each block. We leverage the recently developed word embedding technique from natural language processing to train the network efficiently. To demonstrate the effectiveness of Block2Vec, we design a demonstrative block prediction algorithm based on the mined correlations. Empirical comparison based on simulation of real system traces shows that Block2Vec is capable of mining block-level correlations efficiently and accurately. This research shows that deep learning is a promising direction for optimizing storage system performance.
- Published
- 2019
- Full Text
- View/download PDF
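To make the word-embedding analogy in the Block2Vec entry concrete, the sketch below treats each block access trace as a "sentence" of block IDs and trains a skip-gram model with gensim (an assumption on our part; the paper trains its own network), then queries the nearest blocks as prefetch candidates. The traces are made up.

```python
from gensim.models import Word2Vec

# Each inner list is an access trace; block IDs play the role of words (toy data).
traces = [
    ["1024", "1025", "1026", "2048", "1024", "1025"],
    ["2048", "2049", "1024", "1025", "1026"],
]

# Skip-gram embedding of blocks: blocks accessed near each other get nearby vectors.
model = Word2Vec(sentences=traces, vector_size=32, window=3, min_count=1, sg=1, epochs=50)

# Blocks closest to block 1024 in vector space are plausible prefetch candidates.
print(model.wv.most_similar("1024", topn=3))
```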
5. Maxson: Reduce Duplicate Parsing Overhead on Raw Data
- Author
-
Hai Jin, Keyong Zhou, Hong Huang, Ruibo Li, Xuanhua Shi, Zhenyu Hu, Yipeng Zhang, Yongluan Zhou, Huan Shen, and Bingsheng He
- Subjects
Conditional random field, Distributed computing, Parsing, Database, Computer science, Reuse, JSON, Information systems, Overhead (computing), Cache, Raw data
JSON is a very popular data format in many Web and enterprise applications. Recently, many data analytics systems have added support for loading and querying JSON data. However, JSON parsing can be costly and often dominates the execution time of querying JSON data. Many previous studies focus on building efficient parsers to reduce this parsing cost, and little work has been done on reducing the number of parsing operations. In this paper, we start with a study of a real production workload at Alibaba, which consists of over 3 million queries on JSON. Our study reveals significant temporal and spatial correlations among those queries, which result in massive redundant parsing operations across queries. Instead of repetitively parsing the JSON data, we propose a cache system named Maxson that caches JSON query results (the values evaluated from JSONPaths) for reuse. Specifically, we develop an effective machine-learning-based predictor combining LSTM (long short-term memory) and CRF (conditional random field) to determine the JSONPaths to cache given a space budget. We have implemented Maxson on top of SparkSQL. We experimentally evaluate Maxson and show that 1) Maxson is able to eliminate most of the duplicate JSON parsing overhead, and 2) Maxson improves end-to-end workload performance by 1.5–6.5x.
- Published
- 2020
- Full Text
- View/download PDF
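As a toy illustration of the Maxson entry's idea of caching JSONPath results under a space budget, the sketch below replaces the paper's LSTM+CRF predictor with a simple frequency heuristic (an assumption of ours); the record format and paths are made up.

```python
import json
from collections import Counter

class JsonPathCache:
    """Caches values extracted for hot JSON paths so repeated queries skip re-parsing."""
    def __init__(self, budget=2):
        self.budget = budget          # max number of paths allowed to use the cache
        self.freq = Counter()         # stand-in for the paper's learned predictor
        self.cache = {}               # (record_id, path) -> extracted value

    def get(self, record_id, raw_json, path):
        self.freq[path] += 1
        key = (record_id, path)
        if key in self.cache:
            return self.cache[key]            # cache hit: no parsing at all
        value = json.loads(raw_json)          # cache miss: parse once
        for field in path.split("."):
            value = value[field]
        if path in {p for p, _ in self.freq.most_common(self.budget)}:
            self.cache[key] = value           # only hot paths consume the budget
        return value

cache = JsonPathCache()
row = '{"user": {"id": 7, "name": "ann"}}'
print(cache.get(1, row, "user.id"), cache.get(1, row, "user.id"))  # second call hits the cache
```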
6. Capuchin: Tensor-Based GPU Memory Management for Deep Learning
- Author
-
Fan Yang, Hai Jin, Hulin Dai, Xuanhua Shi, Weiliang Ma, Xuehai Qian, Xuan Peng, and Qian Xiong
- Subjects
Flexibility (engineering), Computer science, Deep learning, Process (computing), Machine learning, Computer hardware and architecture, Task (computing), Memory management, Memory footprint, Key (cryptography), Feature (machine learning), Artificial intelligence
In recent years, deep learning has achieved unprecedented success in various domains, and a key to this success is larger and deeper deep neural networks (DNNs) that deliver very high accuracy. On the other hand, since GPU global memory is a scarce resource, large models pose a significant challenge due to their memory requirements during training. This restriction limits the flexibility of DNN architecture exploration. In this paper, we propose Capuchin, a tensor-based GPU memory management module that reduces the memory footprint via tensor eviction/prefetching and recomputation. The key feature of Capuchin is that it makes memory management decisions based on dynamic tensor access patterns tracked at runtime. This design is motivated by the observation that tensor access patterns are regular across training iterations. Based on the identified patterns, one can exploit the total memory optimization space and offer fine-grained, flexible control of when and how to apply memory optimization techniques. We deploy Capuchin in a widely used deep learning framework, TensorFlow, and show that Capuchin can reduce the memory footprint by up to 85% across 6 state-of-the-art DNNs compared to the original TensorFlow. In particular, for the NLP task BERT, the maximum batch size that Capuchin supports outperforms TensorFlow and gradient checkpointing by 7x and 2.1x, respectively. We also show that Capuchin outperforms vDNN and gradient checkpointing by up to 286% and 55% under the same memory oversubscription.
- Published
- 2020
- Full Text
- View/download PDF
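The following sketch mimics the eviction-versus-recomputation decision described in the Capuchin entry: for each tensor we compare the estimated cost of swapping it back from host memory against the cost of recomputing it, using access times tracked in a previous iteration. The cost model, numbers, and names are illustrative assumptions, not Capuchin's actual policy.

```python
from dataclasses import dataclass

@dataclass
class TensorInfo:
    name: str
    size_mb: float          # memory freed if the tensor is released
    recompute_ms: float     # time to regenerate it from its parents
    swap_ms: float          # time to copy it back from host memory
    next_access: int        # step at which it is needed again (tracked pattern)

def plan_release(tensors, need_mb, current_step):
    """Release tensors until `need_mb` is freed, choosing swap or recompute per tensor."""
    plan, freed = [], 0.0
    # Prefer tensors that are not needed again for the longest time.
    for t in sorted(tensors, key=lambda x: x.next_access - current_step, reverse=True):
        if freed >= need_mb:
            break
        action = "swap" if t.swap_ms <= t.recompute_ms else "recompute"
        plan.append((t.name, action))
        freed += t.size_mb
    return plan

tensors = [TensorInfo("conv1_out", 512, 4.0, 9.0, 120),
           TensorInfo("fc_weights", 256, 50.0, 6.0, 30)]
print(plan_release(tensors, need_mb=600, current_step=10))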
7. PrivGuard: Protecting Sensitive Kernel Data From Privilege Escalation Attacks
- Author
-
Xuanhua Shi, Jiawei Yang, Weizhong Qiang, and Hai Jin
- Subjects
General Computer Science, Computer science, Linux kernel, Memory corruption, Access control, Computer security, System call, Non-control-data, Data integrity, Credential, Software engineering, Kernel, Privilege escalation
Operating system kernels are written in low-level, unsafe languages, which makes them inevitably vulnerable to memory corruption attacks. Most existing kernel defense mechanisms focus on preventing control-data attacks. Recently, attackers have turned to non-control-data attacks that hijack data flow in order to bypass current defense mechanisms. Previous work has shown that non-control-data attacks are a critical threat to kernels, and one important goal of these attacks is to achieve privilege escalation by overwriting sensitive kernel data. The goal of our research is to develop a lightweight protection mechanism that mitigates non-control-data attacks compromising sensitive kernel data. We propose an approach that enforces the integrity of sensitive kernel data by preventing illegal writes to these data, thereby mitigating privilege escalation attacks. The main challenge is to validate modifications of sensitive kernel data at runtime: the validation routine must verify the legitimacy of the duplicated sensitive data and ensure the credibility of the verification itself. To address this challenge, we modify the system call entry point to monitor changes to the sensitive kernel data without any change to the Linux access control mechanism. We then use stack canaries to protect the duplicates of sensitive kernel data used for integrity checking. In addition, we protect the integrity of sensitive kernel data by forbidding illegal updates to them. We have implemented a prototype for the Linux kernel on the Ubuntu Linux platform. The evaluation results demonstrate that the prototype can mitigate privilege escalation attacks with moderate performance overhead.
- Published
- 2018
- Full Text
- View/download PDF
8. Poris: A Scheduler for Parallel Soft Real-Time Applications in Virtualized Environments
- Author
-
Song Wu, Hai Jin, Like Zhou, Huahua Sun, and Xuanhua Shi
- Subjects
Distributed computing, Schedule, Computer science, Real-time computing, Preemption, Processor scheduling, Networking and telecommunications, Cloud computing, Hypervisor, Dynamic priority scheduling, Virtualization, Scheduling (computing), Computational Theory and Mathematics, Hardware and Architecture, Virtual machine, Signal Processing, Operating system
With the prevalence of cloud computing and virtualization, more and more cloud services, including parallel soft real-time applications (PSRT applications), are running in virtualized data centers. However, current hypervisors do not provide adequate support for them because of soft real-time constraints and synchronization problems, which result in frequent deadline misses and serious performance degradation. CPU schedulers in the underlying hypervisors are central to these issues. In this paper, we identify and analyze the CPU scheduling problems in hypervisors. We then design and implement a parallel soft real-time scheduler based on Xen, named Poris, that addresses soft real-time constraints and synchronization problems simultaneously. In our proposed method, priority promotion and dynamic time slice mechanisms are introduced to determine when to schedule virtual CPUs (VCPUs) according to the characteristics of soft real-time applications. Moreover, considering that PSRT applications may run in a single virtual machine (VM) or across multiple VMs, we present parallel scheduling, group scheduling, and communication-driven group scheduling to accelerate synchronization of these applications and ensure that tasks finish before their deadlines under different scenarios. Our evaluation shows that Poris significantly improves the performance of PSRT applications whether they run in a single VM or across multiple VMs. For example, compared to the Credit scheduler, Poris decreases the response time of a web search benchmark by up to 91.6 percent.
- Published
- 2016
- Full Text
- View/download PDF
9. Towards Optimized Fine-Grained Pricing of IaaS Cloud Platform
- Author
-
Song Wu, Hai Jin, Xuanhua Shi, Sheng Di, and Xinhou Wang
- Subjects
Operations research, Computer Networks and Communications, Computer science, Spot market, Workload, Cloud computing, Profit (economics), Computer Science Applications, Service-level agreement, Grid computing, Hardware and Architecture, Overhead (business), Operating system, Resource allocation, Revenue, Resource management, Software, Information Systems
Although many pricing schemes for IaaS platforms have been proposed, with pay-as-you-go and subscription/spot market policies to guarantee service level agreements, coarse-grained pricing still inevitably leads to wasteful payment. In this paper, we investigate an optimized fine-grained and fair pricing scheme. Two tough issues are addressed: (1) the profits of resource providers and customers often contradict each other; (2) VM-maintenance overhead, such as startup cost, is often too large to be neglected. Not only can we derive an optimal price in the acceptable price range that satisfies both customers and providers simultaneously, but we also find a best-fit billing cycle that maximizes social welfare (i.e., the sum of the cost reductions for all customers and the revenue gained by the provider). We carefully evaluate the proposed optimized fine-grained pricing scheme with two large-scale real-world production traces (one from the Grid Workload Archive and the other from a Google data center). We compare the new scheme to the classic coarse-grained hourly pricing scheme and find that both customers and providers benefit from our approach. The maximum social welfare can be increased by up to 72.98 and 48.15 percent with respect to the DAS-2 trace and the Google trace, respectively.
- Published
- 2015
- Full Text
- View/download PDF
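A toy worked example of the billing-cycle trade-off discussed in the pricing entry: with a coarse cycle, customers pay for unused tail time; with a very fine cycle, fixed per-job overhead such as VM startup is no longer covered. The welfare model below is a deliberately simplified assumption of ours, not the paper's formulation, and the job lengths are made up.

```python
import math

def payment(durations_min, cycle_min, price_per_min=1.0):
    """Each job pays for whole billing cycles (same rule as hourly billing, just finer)."""
    return sum(math.ceil(d / cycle_min) * cycle_min * price_per_min for d in durations_min)

def social_welfare(durations_min, cycle_min, startup_min=3.0, price_per_min=1.0):
    """Toy welfare: customers' saving vs. hourly billing, minus the provider's revenue
    shortfall when a cycle is too short to cover the fixed VM startup overhead."""
    saving = payment(durations_min, 60, price_per_min) - payment(durations_min, cycle_min, price_per_min)
    uncovered = max(0.0, startup_min - cycle_min) * price_per_min * len(durations_min)
    return saving - uncovered

jobs = [12, 47, 95, 130]                               # job lengths in minutes (made up)
best = max(range(1, 61), key=lambda c: social_welfare(jobs, c))
print(best, social_welfare(jobs, best))                # best-fit cycle and its welfare
```

In this toy setting the best-fit cycle settles at the startup overhead: shrinking it further saves customers little while leaving the provider's fixed cost uncovered, which is the tension the paper's optimization resolves rigorously.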
10. Synchronization-Aware Scheduling for Virtual Clusters in Cloud
- Author
-
Song Wu, Xuanhua Shi, Sheng Di, Zhenjiang Xie, Haibao Chen, Hai Jin, and Bing Bing Zhou
- Subjects
Schedule, Computer science, Distributed computing, Temporal isolation among virtual machines, Cloud computing, Virtualization, Fair-share scheduling, Scheduling (computing), Hybrid scheduling, Computational Theory and Mathematics, Hardware and Architecture, Virtual machine, Signal Processing, Operating system, Executable
Due to their high flexibility and cost-effectiveness, clouds are increasingly being explored as an alternative to local clusters by academic and commercial users. Recent research has already confirmed the feasibility of running tightly-coupled parallel applications on virtual clusters. However, such applications suffer from significant performance degradation, especially since overcommitment is common in the cloud; that is, the number of runnable virtual CPUs (VCPUs) is often larger than the number of available physical CPUs (PCPUs) in the system. The performance degradation is mainly due to the fact that current virtual machine monitors (VMMs) are unaware of the synchronization requirements of VMs running parallel applications. This paper makes two key contributions. (1) We propose an autonomous synchronization-aware VM scheduling (SVS) algorithm, which effectively mitigates the performance degradation of tightly-coupled parallel applications in overcommitted situations. (2) We integrate the SVS algorithm into the Xen VMM scheduler and implement a prototype. We evaluate our design on a real cluster environment with the NPB benchmark and a real-world trace. Experiments show that our solution attains better performance for tightly-coupled parallel applications than state-of-the-art approaches such as Xen's Credit scheduler, balance scheduling, and hybrid scheduling.
- Published
- 2015
- Full Text
- View/download PDF
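A minimal sketch of the co-scheduling intuition behind the synchronization-aware scheduling entry: VCPUs that belong to the same parallel VM are dispatched in the same scheduling round so that spinning peers are not left waiting. This is a toy round-robin model of the idea, not Xen's Credit scheduler or the paper's actual SVS algorithm.

```python
from collections import deque

def coschedule(vcpus, pcpu_count, rounds):
    """vcpus: list of (vm_id, vcpu_id). Gang-schedule all VCPUs of one VM per round."""
    queue = deque(sorted({vm for vm, _ in vcpus}))
    schedule = []
    for _ in range(rounds):
        vm = queue.popleft()
        gang = [v for v in vcpus if v[0] == vm][:pcpu_count]  # run the VM's VCPUs together
        schedule.append((vm, gang))
        queue.append(vm)                                      # round-robin over VMs
    return schedule

vcpus = [("vmA", 0), ("vmA", 1), ("vmB", 0), ("vmB", 1)]
for vm, gang in coschedule(vcpus, pcpu_count=2, rounds=4):
    print(vm, gang)
```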
11. FITDOC: fast virtual machine checkpointing with delta memory compression
- Author
-
Song Wu, Hai Jin, Laurence T. Yang, Yunjie Du, and Xuanhua Shi
- Subjects
Dirty data, Computer science, Parallel computing, Theoretical Computer Science, Server, Dirty bit, Overhead (computing), Virtualization, Hardware and Architecture, Virtual machine, Scalability, Operating system, Bitmap, Data center, Software, Information Systems
Virtualization provides the ability to save the entire execution state of a running virtual machine (VM), which makes checkpointing flexible and practical for HPC servers and data center servers. However, system-level checkpointing needs to save a large amount of data to disk, and the overhead grows linearly with the size of the virtual machine memory, which leads to heavy disk I/O consumption and poor system scalability. To address this, we propose a novel fast VM checkpointing approach named Fast Incremental checkpoinTing with Delta memOry Compression (FITDOC). By studying the run-time memory characteristics of different workloads, FITDOC tracks dirty data at a fine granularity (in 8-byte units) instead of at the conventional page granularity. FITDOC utilizes a dirty page logging mechanism to record dirty pages, and a delta memory compression mechanism is implemented to eliminate redundant memory data in checkpoint files. To locate the dirty data within dirty pages, FITDOC uses two mechanisms: based on the distribution characteristics of dirty pages in the dirty bitmap, we propose a fast dirty bitmap scanning method to locate dirty pages, and we use multi-threaded data comparison to locate the real dirty data within each page. The experimental results show that, compared with Xen's default system-level checkpointing algorithm, FITDOC reduces checkpointing time by 70.54% on average with a 1 GB memory size and achieves greater improvement for VMs with larger memory configurations. FITDOC reduces the size of checkpoint data by 52.88% on average compared with Remus's page-granularity incremental solution. Compared with the default dirty bitmap scanning method in Xen, the scanning time of FITDOC is decreased by 91.13% on average.
- Published
- 2015
- Full Text
- View/download PDF
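To illustrate the fine-grained (8-byte) delta tracking described in the FITDOC entry, the sketch below diffs two snapshots of a page in 8-byte chunks and stores only the changed chunks. It captures the intuition behind delta memory compression; it is not the paper's implementation.

```python
CHUNK = 8  # bytes

def page_delta(old: bytes, new: bytes):
    """Return [(offset, 8-byte chunk)] for chunks that actually changed."""
    assert len(old) == len(new)
    return [(off, new[off:off + CHUNK])
            for off in range(0, len(new), CHUNK)
            if new[off:off + CHUNK] != old[off:off + CHUNK]]

def apply_delta(old: bytes, delta):
    """Rebuild the new page from the old snapshot plus the recorded delta."""
    page = bytearray(old)
    for off, chunk in delta:
        page[off:off + CHUNK] = chunk
    return bytes(page)

old = bytes(4096)                                  # an all-zero 4 KiB page
new = bytearray(old); new[16:24] = b"ABCDEFGH"     # dirty exactly one 8-byte chunk
d = page_delta(old, bytes(new))
print(len(d), apply_delta(old, d) == bytes(new))   # 1 changed chunk, round-trips correctly
```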
12. MURS: Mitigating Memory Pressure in Service-Oriented Data Processing System
- Author
-
Xuanhua Shi, Ligang He, Zhixiang Ke, Song Wu, Hai Jin, and Xiong Zhang
- Subjects
Distributed computing, Computer science, Big data, Context (computing), Data processing system, Data modeling, Memory management, Spark, Operating system, Batch processing, Garbage collection
Although a data processing system often works as a batch processing system, many enterprises deploy such a system as a service, which we call a service-oriented data processing system. It has been shown that in-memory data processing systems suffer from serious memory pressure. The situation becomes even worse for service-oriented data processing systems for various reasons. For example, in a service-oriented system, multiple submitted tasks are launched at the same time and executed in the same context in the resources, in contrast to the batch processing mode where tasks are processed one by one. Therefore, memory pressure affects all submitted tasks, including those that incur only light memory pressure when run alone. In this paper, we find that memory pressure arises because the running tasks produce massive long-living data objects in the limited memory space. Our studies further reveal that these long-living data objects are generated by the API functions invoked by the in-memory processing frameworks. Based on these findings, we propose a method to classify the API functions based on their memory usage rate, and we design a scheduler called MURS to mitigate the memory pressure. We implement MURS in Spark and conduct experiments to evaluate its performance. The results show that, compared to Spark, MURS can 1) decrease the execution time of submitted jobs by up to 65.8%, 2) mitigate memory pressure in the server by decreasing garbage collection time by up to 81%, and 3) reduce data spilling, and hence disk I/O, by approximately 90%.
- Published
- 2017
- Full Text
- View/download PDF
13. MECOM: Live migration of virtual machines by adaptively compressing memory pages
- Author
-
Li Deng, Xiaodong Pan, Song Wu, Hai Jin, Xuanhua Shi, and Hanhua Chen
- Subjects
Computer Networks and Communications, Computer science, Distributed computing, Real-time computing, Fault tolerance, Load balancing (computing), Hardware and Architecture, Virtual machine, Page, Data diffusion machine, Software, Live migration
Live migration of virtual machines is a powerful tool for system maintenance, load balancing, fault tolerance, and power saving, especially in clusters and data centers. Although pre-copy is extensively used to migrate the memory data of virtual machines, it cannot provide quick migration with low network overhead; instead, it leads to large performance degradation of virtual machine services due to the great amount of data transferred during migration. To solve this problem, this paper presents the design and implementation of a novel memory-compression-based VM migration approach (MECOM) that uses memory compression to provide fast, stable virtual machine migration while only slightly affecting virtual machine services. Based on memory page characteristics, we design an adaptive zero-aware compression algorithm that balances the performance and the cost of virtual machine migration. Using the proposed scheme, pages are rapidly compressed in batches on the source and exactly recovered on the target. Experimental results demonstrate that, compared with Xen, our system can reduce downtime, total migration time, and total transferred data by 27.1%, 32%, and 68.8%, respectively.
- Published
- 2014
- Full Text
- View/download PDF
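A sketch of the zero-aware idea in the MECOM entry: pages that are mostly zero get a trivial sparse encoding, while other pages fall back to a general compressor (zlib here, as an assumption; the paper chooses among algorithms adaptively and works inside the hypervisor).

```python
import zlib

def compress_page(page: bytes, zero_threshold=0.9):
    """Zero-aware page compression: cheap encoding for (nearly) zero pages, zlib otherwise."""
    zero_ratio = page.count(0) / len(page)
    if zero_ratio >= zero_threshold:
        # Store only the non-zero bytes with their offsets.
        sparse = [(i, b) for i, b in enumerate(page) if b != 0]
        return ("sparse", len(page), sparse)
    return ("zlib", len(page), zlib.compress(page, 1))   # fast compression level for migration

def decompress_page(encoded):
    kind, size, payload = encoded
    if kind == "sparse":
        page = bytearray(size)
        for i, b in payload:
            page[i] = b
        return bytes(page)
    return zlib.decompress(payload)

page = bytearray(4096); page[100] = 0xFF                 # an almost-zero page
enc = compress_page(bytes(page))
print(enc[0], decompress_page(enc) == bytes(page))       # 'sparse', True
```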
14. Morpho: A decoupled MapReduce framework for elastic cloud computing
- Author
-
Lu Lu, Xuanhua Shi, Song Wu, Hai Jin, Qiuyue Wang, and Daxing Yuan
- Subjects
Speedup, Computer Networks and Communications, Computer science, Distributed computing, Locality, Cloud computing, Network topology, Resource (project management), Hardware and Architecture, Virtual machine, Operating system, Software
MapReduce as a service enjoys wide adoption in commercial clouds today [3], [23]. But most cloud providers simply deploy native Hadoop [24] systems on their cloud platforms to provide MapReduce services without any adaptation to virtualized environments [6], [25]. In cloud environments, the basic execution units for data processing are virtual machines. Each user's virtual cluster needs to deploy HDFS [26] every time it is initialized, and the user's input and output data must be transferred between HDFS and external persistent data storage for the native Hadoop to work properly. These costly data movements can lead to significant performance degradation of MapReduce jobs in the cloud. We present Morpho, a modified version of the Hadoop MapReduce framework that decouples storage and computation onto physical clusters and virtual clusters, respectively. In Morpho, map/reduce tasks still run in VMs but without ad-hoc HDFS deployments; instead, HDFS is deployed on the underlying physical machines. During MapReduce computation, map tasks can read data directly from the physical machines without extra data transfers. We design a data location perception module to improve the cooperation of the computation and storage layers, so that map tasks can intelligently fetch information about the network topology of physical machines and the VM placements. Additionally, Morpho achieves high performance through two complementary strategies for data placement and VM placement, which provide better map and reduce input locality. Furthermore, our data placement strategy can mitigate resource contention between jobs. The evaluation of our Morpho prototype shows a nearly 62% speedup in job execution time and a significant reduction in network traffic of the entire system compared with the traditional cloud computing scheme of Amazon and other cloud providers.
- Published
- 2014
- Full Text
- View/download PDF
15. Dynamic and fast processing of queries on large-scale RDF data
- Author
-
Xuanhua Shi, Changfeng Xie, Pingpeng Yuan, Guang Yang, Hai Jin, and Ling Liu
- Subjects
Information retrieval, Database, Computer science, RDF Schema, Query optimization, Query language, Human-Computer Interaction, Query plan, Query expansion, Artificial Intelligence, Hardware and Architecture, SPARQL, Sargable, Software, Information Systems, RDF query language
As RDF data continue to gain popularity, we witness a fast-growing trend in both the number of RDF repositories and the size of RDF datasets. Many known RDF datasets contain billions of RDF triples (subject, predicate, and object). One of the grand challenges in managing such huge RDF data is how to execute RDF queries efficiently. In this paper, we address the query processing problems posed by the billion-triple challenge. We first identify some causes of the problems in existing query optimization schemes, such as large intermediate results and initial query cost estimation errors. Then, we present our block-oriented dynamic query plan generation approach powered by pipelined execution. Our approach consists of two phases. In the first phase, a near-optimal execution plan for a query is chosen by identifying its processing blocks. We group the join patterns sharing a join variable into building blocks of the query plan, since executing them first provides opportunities to reduce the size of the intermediate results generated. In the second phase, we further optimize the initial pipeline for the given query plan. We employ optimization techniques such as sideways information passing and semi-joins to further reduce the size of intermediate results, improve query processing cost estimation, and speed up query execution. Experimental results on several RDF datasets of over a billion triples demonstrate that our approach outperforms existing RDF query engines that rely on dynamic-programming-based static query processing strategies.
- Published
- 2014
- Full Text
- View/download PDF
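The sketch below shows the first phase described in the RDF entry in miniature: grouping the triple patterns of a SPARQL-style query by shared join variable so that each group can be treated as one building block of the plan. The triple-pattern format is a simplified assumption.

```python
from collections import defaultdict

def group_by_join_variable(patterns):
    """patterns: (subject, predicate, object) tuples with variables written as '?x'."""
    groups = defaultdict(list)
    for pat in patterns:
        for term in pat:
            if term.startswith("?"):
                groups[term].append(pat)
    # Keep only variables shared by at least two patterns: those form join blocks.
    return {var: pats for var, pats in groups.items() if len(pats) > 1}

query = [("?p", "worksAt", "?org"),
         ("?p", "name", "?name"),
         ("?org", "locatedIn", "Wuhan")]
for var, block in group_by_join_variable(query).items():
    print(var, block)   # '?p' and '?org' each anchor a building block
```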
16. Guaranteeing QoS of media-based applications in virtualized environment
- Author
-
Song Wu, Hai Jin, Jiangfu Zhou, Like Zhou, and Xuanhua Shi
- Subjects
Computer science, Quality of service, Cloud computing, Hypervisor, Virtualization, Computer Science Applications, Network management, Bandwidth allocation, Server, Media Technology, Bandwidth (computing), Information Systems, Computer network
With the rapid development of web technology and smartphones, multimedia content spreads all over the Internet. The prevalence of virtualization technology enables multimedia service providers to run media servers in virtualized servers or rented virtual machines (VMs) in a cloud environment. Although server consolidation using virtualization can substantially increase the efficient use of server resources, it introduces resource competition among VMs running different applications. Current hypervisors do not make any Quality of Service (QoS) guarantee for media-based applications if they are consolidated with other network-intensive applications, which leads to significant performance degradation. For example, Xen only offers a static method to allocate network bandwidth. In this paper, we find that the performance of media-based applications running in VMs degrades seriously when they are consolidated with other VMs running network-intensive applications, and we argue that dynamic network bandwidth allocation is essential to guarantee the QoS of media-based applications. We then present a dynamic network bandwidth allocation system for virtualized environments, which allocates network bandwidth dynamically and effectively without interrupting the services running in VMs. The experiments show that our system can not only guarantee the QoS of media-based applications but also maximize the system's overall performance while doing so.
- Published
- 2013
- Full Text
- View/download PDF
17. SafeStack: Automatically Patching Stack-Based Buffer Overflow Vulnerabilities
- Author
-
Hai Jin, Bing Bing Zhou, Weide Zheng, Deqing Zou, Zhenkai Liang, Gang Chen, and Xuanhua Shi
- Subjects
Computer science, Service-oriented architecture, Software quality, Computer virus, Attack prevention, Embedded system, Operating system, Stack buffer overflow, Electrical and Electronic Engineering, Buffer overflow
Buffer overflow attacks still pose a significant threat to the security and availability of today's computer systems. Although a number of solutions have been proposed to provide adequate protection against buffer overflow attacks, most existing solutions terminate the vulnerable program when the buffer overflow occurs, effectively rendering the program unavailable. This impact on availability is a serious problem on service-oriented platforms. This paper presents SafeStack, a system that can automatically diagnose and patch stack-based buffer overflow vulnerabilities. The key technique of our solution is to virtualize memory accesses and move the vulnerable buffer into protected memory regions, which provides fundamental and effective protection against recurrence of the same attack without stopping normal system execution. We developed a prototype on a Linux system and conducted extensive experiments to evaluate the effectiveness and performance of the system using a range of applications. Our experimental results show that SafeStack can quickly generate runtime patches to successfully handle recurrences of the attack. Furthermore, SafeStack incurs only acceptable overhead for the patched applications.
- Published
- 2013
- Full Text
- View/download PDF
18. VRAS: A Lightweight Local Resource Allocation System for Virtual Machine Monitor
- Author
-
Song Wu, Hai Jin, Wei Gao, and Xuanhua Shi
- Subjects
Computer science, Virtual machine, Distributed computing, Resource allocation, Hypervisor, Workload, Throughput, Electrical and Electronic Engineering, Virtualization, Host (network), Computer Science Applications
Traditional computing resource allocation in virtualization environments is devoted to providing fair resource distribution when the overall workload of a host is heavy, which makes it inefficient under light workloads. To address this, we design and implement a lightweight resource allocation system, the virtual resource allocation system (VRAS). Considering that workloads can be balanced by migrating virtual machines to other hosts, we propose a request-driven mechanism that focuses on resource allocation under light workloads. We also present the allocation strategies used in VRAS to explain how it works for processor and memory resources. Our experimental results demonstrate that VRAS can improve the throughput of the RUBiS application by 28% and reduce network overhead by 81% compared with traditional allocation methods.
- Published
- 2013
- Full Text
- View/download PDF
19. A disk bandwidth allocation mechanism with priority
- Author
-
Hai Jin, Xia Xie, Wenzhi Cao, Xijiang Ke, Xibin Wang, and Xuanhua Shi
- Subjects
Bandwidth management, Dynamic bandwidth allocation, Full virtualization, Computer science, Distributed computing, Temporal isolation among virtual machines, Virtualization, Theoretical Computer Science, Scheduling (computing), Bandwidth allocation, Hardware and Architecture, Virtual machine, Resource management, Software, Information Systems, Computer network
Virtualization is a popular technology. Services and applications running in different virtual machines have to compete with each other for limited physical computer and network resources. Each virtual machine has different I/O requirements and a specific priority. Without proper scheduling and resource management, a load surge in one virtual machine may inevitably degrade the performance of others. In addition, each virtual machine may run different kinds of applications, which have different disk bandwidth demands and service priorities, so I/O resources should be assigned on demand. In this paper, we propose a dynamic virtual machine disk bandwidth control mechanism for virtualization environments. A Disk Credit Algorithm is introduced to support fine-grained disk bandwidth allocation among virtual machines: disk bandwidth is assigned according to each virtual machine's service priority/weight and its requirements. Experiments show that the mechanism improves isolation among VMs and guarantees the performance of specific virtual machines well.
- Published
- 2013
- Full Text
- View/download PDF
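A toy credit-style allocator in the spirit of the Disk Credit Algorithm mentioned in the entry above: each VM receives I/O credits per period in proportion to its weight, and a request is admitted only if the VM still has credits left. The class structure, parameters, and numbers are illustrative assumptions, not the paper's mechanism.

```python
class DiskCreditAllocator:
    """Distributes per-period disk I/O credits proportionally to VM weights."""
    def __init__(self, weights, period_bandwidth_mb):
        self.weights = weights
        self.period_bandwidth_mb = period_bandwidth_mb
        self.credits = {}
        self.refill()

    def refill(self):
        """Called at the start of each accounting period."""
        total = sum(self.weights.values())
        self.credits = {vm: self.period_bandwidth_mb * w / total
                        for vm, w in self.weights.items()}

    def admit(self, vm, request_mb):
        """Admit the request only if the VM still has credit in this period."""
        if self.credits[vm] >= request_mb:
            self.credits[vm] -= request_mb
            return True
        return False   # otherwise the request is deferred to the next period

alloc = DiskCreditAllocator({"web": 1, "db": 3}, period_bandwidth_mb=100)
print(alloc.admit("db", 60), alloc.admit("web", 30), alloc.admit("web", 10))
```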
20. TPS: An Efficient VM Scheduling Algorithm for HPC Applications in Cloud
- Author
-
Chi Zhang, Xuanhua Shi, Wei Dai, Duoqiang Wang, and Hai Jin
- Subjects
Network packet, Computer science, Distributed computing, Cloud computing, Virtualization, Supercomputer, Synchronization, Scheduling (computing), Virtual machine, Operating system, Latency (engineering)
Cloud computing platforms are becoming a viable alternative for running high performance parallel applications. However, when running these applications on cloud platforms, VMs (virtual machines) are easily affected by synchronization latency problems, which lead to serious performance degradation. There are two main reasons. First, a host is unaware of the synchronization requirements of the guest OS. Second, the synchronization requests of VMs on one physical host are unknown to VMs on other physical hosts. It is a great challenge to mitigate the negative influence of virtualization on synchronization and to accelerate the synchronization response of HPC applications. In this paper, we propose a two-phase synchronization-aware (TPS) scheduling algorithm to solve this problem. The TPS algorithm takes both intra-VM and inter-VM synchronization demands into consideration. Spin-locks and network packets are used as the metrics to detect the synchronization demands of VMs, and VMs are scheduled based on spinlock-aware and communication-aware strategies. The algorithm is implemented on top of KVM and experiments are conducted in a cluster environment. The experimental results show that our TPS algorithm achieves better performance for HPC applications while reducing the negative impact on non-HPC applications. Therefore, our approach is an effective solution to the HPC synchronization issues on cloud platforms.
- Published
- 2017
- Full Text
- View/download PDF
21. SSDUP: An Efficient SSD Write Buffer Using Pipeline
- Author
-
Yong Chen, Ming Li, Wei Liu, Xuanhua Shi, and Hai Jin
- Subjects
Computer science, Operating system, Latency (engineering), Write buffer, Supercomputer, Bottleneck
High performance computing (HPC) applications are becoming more data-intensive and place increasingly large I/O demands on storage systems. New storage devices such as SSDs, which have nearly no seek latency and high throughput, are widely used together with HDDs as hybrid storage systems. To address the I/O bottleneck problem, existing hybrid storage solutions such as burst buffers have been proposed as an intermediate layer between clients and disks to absorb bursty I/O requests and improve write performance. However, a burst buffer needs enough SSD space to absorb the maximum burst of I/O requests, which is still a costly solution. In this paper, we propose a hybrid architecture called SSDUP (an SSD write buffer Using Pipeline) for HPC storage systems, which uses NAND-flash-based SSDs as a write-back buffer for HDDs. SSDUP achieves good performance while using only limited SSD space.
- Published
- 2016
- Full Text
- View/download PDF
23. A Network I/O Load-Aware Virtual Machine Placement Algorithm in HPC Cloud Environments
- Author
-
Xuanhua Shi, Hai Jin, Fei Wang, Song Wu, and ZhiWu Wang
- Subjects
General Computer Science, I/O scheduling, Computer science, Computation, Distributed computing, Cloud computing, Load balancing (computing), Virtualization, Supercomputer, Virtual machine, Greedy algorithm, Engineering (miscellaneous), Algorithm
With the development of virtualization and cloud computing, increasingly many scientific computing applications run on cloud computing resources. In HPC Cloud, high performance computing (HPC) applications run in multiple virtual machines (VMs), which can be placed on different physical nodes. If VMs running communication-intensive jobs are placed on the same physical node, they will compete for that node's network I/O bandwidth. If the network I/O resources required by the VMs exceed the node's network I/O bandwidth upper bound, computing performance is severely affected and the calculation time increases. This paper presents a network I/O load aware VM placement algorithm called NLPA, which adopts a network I/O load balancing strategy to reduce competition for network I/O bandwidth between VMs. Experiments show that, compared with a greedy algorithm, NLPA performs better in terms of computation time, total system network I/O throughput, and the degree of network I/O load balance.
- Published
- 2012
- Full Text
- View/download PDF
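To make the placement strategy in the NLPA entry concrete, the sketch below places each VM on the physical node with the lowest current network I/O load that can still accommodate the VM's expected bandwidth. It captures the load-balancing intuition without reproducing the paper's exact algorithm; the bandwidth figures are made up.

```python
def place_vms(vm_bandwidth, node_capacity):
    """Greedy network-I/O-aware placement: least-loaded feasible node first."""
    load = {node: 0.0 for node in node_capacity}
    placement = {}
    for vm, bw in sorted(vm_bandwidth.items(), key=lambda kv: -kv[1]):  # big consumers first
        candidates = [n for n in load if load[n] + bw <= node_capacity[n]]
        if not candidates:
            raise RuntimeError(f"no node can host {vm}")
        target = min(candidates, key=lambda n: load[n])   # balance network I/O load
        placement[vm] = target
        load[target] += bw
    return placement, load

vms = {"vm1": 400, "vm2": 350, "vm3": 300, "vm4": 100}    # expected Mbps (made-up values)
nodes = {"node1": 1000, "node2": 1000}
print(place_vms(vms, nodes))
```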
23. Performance implications of non-uniform VCPU-PCPU mapping in virtualization environment
- Author
-
Hai Jin, Wei Gao, Alin Zhong, Xuanhua Shi, and Song Wu
- Subjects
Application virtualization, Computer Networks and Communications, Computer science, Full virtualization, Hypervisor, Cloud computing, Virtualization, Runtime system, Virtual machine, Server, Scalability, Operating system, Software
Virtualization technology promises better isolation and consolidation for traditional servers. However, with the VMM (virtual machine monitor) layer involved, a virtualization system changes the architecture of the traditional software stack, bringing limitations to resource allocation. Non-uniform VCPU (virtual CPU) to PCPU (physical CPU) mapping, which derives from the configuration and deployment of virtual machines as well as the dynamic runtime behavior of applications, causes VCPUs on the same physical machine to receive different shares of processor time, so the VCPUs mapped to these PCPUs exhibit asymmetric performance. The guest OS, however, is agnostic to this non-uniformity. Assuming that all VCPUs have the same performance, it can carry out sub-optimal policies when allocating virtual resources to applications, and application runtime systems can make the same mistakes. Our focus in this paper is to understand the performance implications of non-uniform VCPU-PCPU mapping in a virtualization system. Based on real measurements of a virtualization system with state-of-the-art multi-core processors running different commercial and emerging applications, we demonstrate that the presence of non-uniform mapping has a negative impact on applications' performance predictability. This study aims to provide timely and practical insights into the problem of non-uniform VCPU mapping when virtual machines are deployed and configured in emerging clouds.
- Published
- 2012
- Full Text
- View/download PDF
24. Adapting grid computing environments dependable with virtual machines: design, implementation, and evaluations
- Author
-
Wei Zhu, Li Qi, Xuanhua Shi, Song Wu, and Hai Jin
- Subjects
Computer science, Reliability (computer networking), Distributed computing, Grid, Theoretical Computer Science, Adaptive management, Grid computing, Hardware and Architecture, Virtual machine, Throughput, Software, Information Systems
Due to its potential, the use of virtual machines in grid computing is attracting increasing attention. Most research focuses on how to create or destroy virtual execution environments for different kinds of applications, while policies for managing these environments are not widely discussed. This paper proposes the design, implementation, and evaluation of an adaptive and dependable virtual execution environment for grid computing, ADVE, which focuses on policies for managing virtual machines in grid environments. To build a dependable virtual execution environment for grid applications, ADVE provides a set of adaptive policies for managing virtual machines, such as when to create and destroy a new virtual execution environment and when to migrate applications from one virtual execution environment to another. We conduct experiments over a cluster to evaluate the performance of ADVE, and the experimental results show that ADVE can improve the throughput and the reliability of grid resources through adaptive management of virtual machines.
- Published
- 2011
- Full Text
- View/download PDF
25. Optimizing the live migration of virtual machine by CPU scheduling
- Author
-
Wei Gao, Fan Zhou, Xuanhua Shi, Song Wu, Hai Jin, and Xiaoxin Wu
- Subjects
Computer Networks and Communications, Computer science, Real-time computing, Liveness, Computer Science Applications, Scheduling (computing), Hardware and Architecture, Virtual machine, Embedded system, Live migration
Live migration reduces the downtime of migrated VMs by pre-copying the run-time memory state from the original host to the destination host. However, if the rate of dirty memory generation is high, live migration may take a long time to complete because a large amount of data must be transferred. In extreme cases, when the dirty memory generation rate is faster than the pre-copy speed, live migration fails. In this work we address the problem by designing an optimization scheme for live migration in which, according to the pre-copy speed, the VCPU working frequency may be reduced so that at a certain phase of the pre-copy the remaining dirty memory shrinks to a desired small amount, limiting the VM downtime during migration. The scheme targets scenarios where the migrated application has a high memory write rate or the pre-copy speed is slow, e.g., due to low network bandwidth between the migration parties. The method improves migration liveness at the cost of application performance, and suits applications for which interruption causes much more serious problems than quality deterioration. Compared to the original live migration, our experiments show that the optimized scheme can reduce application downtime by up to 88% with acceptable overhead.
- Published
- 2011
- Full Text
- View/download PDF
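A sketch of the throttling loop implied by the entry above: if the dirty-memory generation rate exceeds what pre-copy can transfer, scale the VCPU cap down until dirtying converges below the pre-copy bandwidth. The rates, the linear dirtying model, and the target ratio are assumptions for illustration only.

```python
def throttle_for_precopy(dirty_rate_mb_s, precopy_rate_mb_s, cap=1.0,
                         min_cap=0.1, step=0.1, target_ratio=0.5):
    """Reduce the VCPU cap until dirtying is comfortably below the pre-copy bandwidth."""
    history = []
    while cap > min_cap and dirty_rate_mb_s * cap > target_ratio * precopy_rate_mb_s:
        cap = round(cap - step, 2)              # e.g., ask the hypervisor to cap VCPU time
        history.append((cap, dirty_rate_mb_s * cap))
    return cap, history

cap, history = throttle_for_precopy(dirty_rate_mb_s=300, precopy_rate_mb_s=200)
print(cap, history[-1])   # final cap and the resulting (throttled) dirtying rate
```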
26. An optimistic checkpoint mechanism based on job characteristics and resource availability for dynamic grids
- Author
-
Xuanhua Shi, Song Wu, Hai Jin, and Yongcai Tao
- Subjects
Multidisciplinary, Markov chain, Computer science, Distributed computing, Node (networking), Fault tolerance, Grid, Resource (project management), Grid computing
In this paper, based on job characteristics and resource availability, we propose an optimistic checkpoint mechanism for dynamic grids (OCM4G). It determines whether to checkpoint a given job running on a given resource node and establishes optimal aperiodic checkpoint intervals by applying knowledge of job characteristics and resource availability. We evaluate OCM4G in a real grid environment (ChinaGrid), and the results show that OCM4G achieves better performance than periodic checkpointing and the analytical method of calculating aperiodic checkpoint intervals.
- Published
- 2011
- Full Text
- View/download PDF
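As a simplified stand-in for the interval analysis described in the OCM4G entry (not the paper's derivation), the sketch below first decides whether to checkpoint at all and then picks an interval from Young's classical approximation, scaled down on less-available nodes. The skip rule and the availability scaling are assumptions of ours.

```python
import math

def checkpoint_interval(job_runtime_s, checkpoint_cost_s, node_mtbf_s, availability):
    """Return None if checkpointing is not worthwhile, else an interval in seconds."""
    # Optimistic policy: skip checkpointing for short jobs on highly available nodes.
    if job_runtime_s < checkpoint_cost_s * 10 and availability > 0.99:
        return None
    # Young's approximation sqrt(2 * C * MTBF), shortened on less-available nodes.
    return math.sqrt(2 * checkpoint_cost_s * node_mtbf_s) * availability

print(checkpoint_interval(job_runtime_s=7200, checkpoint_cost_s=30,
                          node_mtbf_s=86400, availability=0.95))
```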
27. Toward scalable Web systems on multicore clusters: making use of virtual machines
- Author
-
Xiaodong Pan, Hongbo Jiang, Bo Yu, Hai Jin, Dachuan Huang, and Xuanhua Shi
- Subjects
Multi-core processor, Computer science, Domain Name System, Design pattern, Bottleneck, Theoretical Computer Science, Hardware and Architecture, Virtual machine, Embedded system, Scalability, Operating system, Software, Server-side, Information Systems
Limited by existing design patterns, much existing software has not yet taken full advantage of multicore processing power, leading to low hardware utilization and even a bottleneck in the whole system. To address this problem, in this paper we propose a VM-based Web system on multicore clusters. The VM-based Web system is scheduled by Linux Virtual Server (LVS), and we implement the web servers with Tomcat. In the meantime, we develop VNIX, a set of VM management toolkits, to facilitate managing VMs on clusters, aiming to improve the usage of multicore CPU power. To reduce resource contention among VMs, we propose to deploy LVS schedulers distributively on different physical nodes. To evaluate our approach, we conduct extensive experiments comparing the VM-based Web system with a classical physical-machine-based Web system. Our experimental results demonstrate that the proposed VM-based Web system can improve throughput by up to three times on the same multicore clusters, with a server-side error rate as low as 20% of that of classic systems.
- Published
- 2011
- Full Text
- View/download PDF
28. DAGMap: efficient and dependable scheduling of DAG workflow job in Grid
- Author
-
Haijun Cao, Song Wu, Hai Jin, Xiaoxin Wu, and Xuanhua Shi
- Subjects
Job scheduler, Speedup, Computer science, Distributed computing, Workload, Fault tolerance, Directed acyclic graph, Grid, Theoretical Computer Science, Scheduling (computing), Workflow, Hardware and Architecture, Resource allocation, Dependability, Software, Information Systems
DAGs are extensively used in Grid workflow modeling. Since Grid resources tend to be heterogeneous and dynamic, efficient and dependable workflow job scheduling becomes essential. It poses great challenges to achieve minimum job completion time and high resource utilization efficiency while providing fault tolerance. Based on list scheduling and group scheduling, in this paper we propose a novel scheduling heuristic called DAGMap. DAGMap consists of two phases, namely Static Mapping and Dependable Execution. Four salient features of DAGMap are: (1) task grouping is based on dependency relationships and task upward priority; (2) critical tasks are scheduled first; (3) Min-Min and Max-Min selective scheduling are used for independent tasks; and (4) a checkpoint server with cooperative checkpointing is designed for dependable execution. The experimental results show that DAGMap achieves better performance than previous algorithms in terms of speedup, efficiency, and dependability.
- Published
- 2009
- Full Text
- View/download PDF
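Feature (3) in the DAGMap entry refers to Min-Min scheduling of independent tasks. The sketch below implements plain Min-Min over an estimated-completion-time matrix; the ETC values are made up, and DAGMap's task grouping, upward priorities, and Max-Min alternation are omitted.

```python
def min_min(etc, ready_time=None):
    """etc[task][resource] = estimated execution time. Returns task -> (resource, finish)."""
    ready_time = dict.fromkeys(next(iter(etc.values())), 0.0) if ready_time is None else ready_time
    unscheduled, schedule = set(etc), {}
    while unscheduled:
        # For each task, find its earliest-completion resource; then pick the overall minimum.
        best = {t: min((ready_time[r] + etc[t][r], r) for r in etc[t]) for t in unscheduled}
        task = min(unscheduled, key=lambda t: best[t][0])
        finish, resource = best[task]
        schedule[task] = (resource, finish)
        ready_time[resource] = finish          # the chosen resource is busy until `finish`
        unscheduled.remove(task)
    return schedule

etc = {"t1": {"r1": 4, "r2": 6}, "t2": {"r1": 3, "r2": 5}, "t3": {"r1": 8, "r2": 4}}
print(min_min(etc))   # t2 -> r1, t3 -> r2, then t1 -> r1
```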
29. Measuring Directional Semantic Similarity with Multi-features
- Author
-
Bo Liu, Hai Jin, and Xuanhua Shi
- Subjects
Lexical semantics, Computer science, Feature (linguistics), Symmetric relation, Identification (information), Query expansion, Semantic similarity, Similarity (network science), Artificial intelligence, Natural language processing, Word (computer architecture)
Semantic similarity measures between linguistic terms are essential in many Natural Language Processing (NLP) applications. Term similarity is most conventionally perceived as a symmetric relation. However, directional (asymmetric) semantic relations exist in lexical semantics, and symmetric similarity measures are less suitable for identifying them. Furthermore, directional similarity actually represents a more general condition and is more practical in some specific NLP applications than symmetric similarity. As the cornerstone of similarity measures, current semantic features cannot efficiently represent large-scale web text collections. Hence, we propose a new directional similarity method that considers feature representations in both linguistic and extra-linguistic dimensions. We evaluate our approach on standard word similarity, reporting state-of-the-art performance on multiple datasets. Experiments show that our directional method handles both symmetric and directional semantic relations and leads to clear improvements in entity search and query expansion.
- Published
- 2016
- Full Text
- View/download PDF
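To illustrate what a directional (asymmetric) similarity looks like, the sketch below scores how much of term a's weighted feature set is covered by term b's, an inclusion-style measure in the spirit of asymmetric distributional metrics. The features and weights are made up, and this is not the paper's multi-feature model.

```python
def directional_similarity(features_a, features_b):
    """Score in [0, 1]: how well b's features cover a's (not symmetric in a and b)."""
    covered = sum(w for f, w in features_a.items() if f in features_b)
    total = sum(features_a.values())
    return covered / total if total else 0.0

dog = {"animal": 1.0, "barks": 0.6}
animal = {"animal": 1.0, "living": 0.9, "breathes": 0.8}
print(directional_similarity(dog, animal))     # dog -> animal: most of dog's mass is covered
print(directional_similarity(animal, dog))     # animal -> dog: lower; the relation is directional
```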
30. D3-MapReduce: Towards MapReduce for Distributed and Dynamic Data Sets
- Author
-
Gilles Fedak, Bing Tang, Mircea Moca, Anthony Simonet, Heithem Abbes, Lu Lu, Gheorghe Cosmin Silaghi, Xuanhua Shi, Julio Anjos, Jose-Francisco Saray, Asma Ben Cheikh, Haiwu He, and Hai Jin
- Subjects
Emulation, Distributed database, Computer science, Dynamic data, Data management, Distributed computing, Data modeling, Middleware (distributed applications), Programming paradigm, Data-intensive computing
Since its introduction in 2004 by Google, MapReduce has become the programming model of choice for processing large data sets. Although MapReduce was originally developed for use by web enterprises in large data centers, the technique has gained a lot of attention from the scientific community for its applicability to large parallel data analysis (including geography, high energy physics, genomics, etc.). So far, MapReduce has mostly been designed for batch processing of bulk data. The ambition of D3-MapReduce is to extend the MapReduce programming model and propose an efficient implementation of this model to: i) cope with distributed data sets, i.e., data that span multiple distributed infrastructures or are stored on networks of loosely connected devices; and ii) cope with dynamic data sets, i.e., data that change over time or can be incomplete or partially available. In this paper, we draw the path towards this ambitious goal. Our approach leverages the Data Life Cycle as a key concept to provide MapReduce for distributed and dynamic data sets on heterogeneous and distributed infrastructures. We first report on our attempts at implementing the MapReduce programming model for Hybrid Distributed Computing Infrastructures (Hybrid DCIs). We present the architecture of a prototype based on BitDew, a middleware for large-scale data management, and Active Data, a programming model for data life cycle management. Second, we outline the challenges in terms of methodology and present our approaches based on simulation and emulation on the Grid'5000 experimental testbed. We conduct performance evaluations and compare our prototype with Hadoop, the industry-reference MapReduce implementation. We then present our work in progress on dynamic data sets, which has led us to implement an incremental MapReduce framework. Finally, we discuss our achievements and outline the challenges that remain to be addressed before obtaining a complete D3-MapReduce environment.
- Published
- 2015
- Full Text
- View/download PDF
31. Mammoth: gearing Hadoop towards memory-intensive MapReduce applications
- Author
-
Ming Chen, Xu Xie, Ligang He, Xuanhua Shi, Lu Lu, Yong Chen, Song Wu, and Hai Jin
- Subjects
Flat memory model ,Computer science ,Distributed computing ,Uniform memory access ,Supercomputer ,computer.software_genre ,Data structure ,QA76 ,Memory management ,Computational Theory and Mathematics ,Shared memory ,Hardware and Architecture ,Signal Processing ,Operating system ,Distributed memory ,Central processing unit ,computer ,Garbage collection - Abstract
The MapReduce platform has been widely used for large-scale data processing and analysis recently. It works well if the hardware of a cluster is well configured. However, our survey has indicated that common hardware configurations in small- and medium-size enterprises may not be suitable for such tasks. This situation is more challenging for memory-constrained systems, in which memory is the bottleneck resource compared with CPU power and thus does not meet the needs of large-scale data processing. The traditional high performance computing (HPC) system is an example of a memory-constrained system according to our survey. In this paper, we have developed Mammoth, a new MapReduce system, which aims to improve MapReduce performance using global memory management. In Mammoth, we design a novel rule-based heuristic to prioritize memory allocation and revocation among execution units (mapper, shuffler, reducer, etc.) to maximize the holistic benefit of the Map/Reduce job when scheduling each memory unit. We have also developed a multi-threaded execution engine, which is based on Hadoop but runs in a single JVM on a node. In the execution engine, we have implemented the memory scheduling algorithm to realize global memory management, on top of which we further developed techniques such as sequential disk accessing, multi-cache, and shuffling from memory, and solved the problem of full garbage collection in the JVM. We have conducted extensive experiments to compare Mammoth against the native Hadoop platform. The results show that the Mammoth system can reduce the job execution time by more than 40 percent in typical cases, without requiring any modifications of the Hadoop programs. When a system is short of memory, Mammoth can improve the performance by up to 5.19 times, as observed for I/O-intensive applications such as PageRank. We also compared Mammoth with Spark. Although Spark can achieve better performance than Mammoth for interactive and iterative applications when the memory is sufficient, our experimental results show that for batch processing applications, Mammoth can adapt better to various memory environments: it outperforms Spark when the memory is insufficient and obtains similar performance when the memory is sufficient. Given the growing importance of supporting large-scale data processing and analysis and the proven success of the MapReduce platform, the Mammoth system has promising potential and impact.
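A toy sketch of the general idea of prioritized global memory management is given below; the unit names, priorities, and preemption rule are invented and do not reproduce Mammoth's rule-based heuristic.

```python
# Toy global memory manager that prioritizes allocation among execution units
# (mapper/shuffler/reducer). A sketch of the general idea only; the priorities
# and the preemption rule are invented and differ from Mammoth's heuristic.

PRIORITY = {"reducer": 1, "mapper": 2, "shuffler": 3}   # higher = more urgent

class MemoryPool:
    def __init__(self, total_mb):
        self.free = total_mb
        self.held = {}                                   # unit -> MB held

    def request(self, unit, mb):
        # Preempt lower-priority holders while the pool is short of memory.
        for victim in sorted(self.held, key=PRIORITY.get):
            if self.free >= mb or PRIORITY[victim] >= PRIORITY[unit]:
                break
            self.free += self.held.pop(victim)           # victim spills to disk (not shown)
        granted = min(mb, self.free)
        self.free -= granted
        self.held[unit] = self.held.get(unit, 0) + granted
        return granted

pool = MemoryPool(total_mb=1024)
print(pool.request("reducer", 600))    # 600 MB granted
print(pool.request("shuffler", 800))   # preempts the reducer to reach 800 MB
print(pool.free, pool.held)
```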
- Published
- 2015
32. DRIC: Dependable Grid Computing Framework
- Author
-
Weizhong Qiang, Deqing Zou, Xuanhua Shi, and Hai Jin
- Subjects
Service (systems architecture) ,business.industry ,Computer science ,Distributed computing ,Quality of service ,Grid ,computer.software_genre ,Semantic grid ,Grid computing ,Artificial Intelligence ,Hardware and Architecture ,Scalability ,Dependability ,The Internet ,Computer Vision and Pattern Recognition ,Electrical and Electronic Engineering ,business ,computer ,Software - Abstract
Grid computing presents a new trend in distributed and Internet computing, coordinating large-scale resource sharing and problem solving in dynamic, multi-institutional virtual organizations. Due to the diverse failures and error conditions in grid environments, developing, deploying, and executing applications over the grid is a challenge, thus dependability is a key factor for grid computing. This paper presents a dependable grid computing framework, called DRIC, which provides an adaptive failure detection service and a policy-based failure handling mechanism. The failure detection service in DRIC adapts to users' QoS requirements and system conditions, and the failure-handling mechanism is optimized by a policy engine using a decision-making method. The performance evaluation results show that this framework is scalable and highly efficient, with low overhead.
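The flavor of an adaptive failure detector can be sketched as a heartbeat monitor whose timeout is estimated from recent inter-arrival times plus a QoS-driven safety margin. This is a generic illustration, not DRIC's actual detection algorithm; the margin and window parameters are invented.

```python
# Heartbeat failure detector with an adaptive timeout: the next expected
# arrival is estimated from recent inter-arrival times plus a safety margin
# acting as a QoS knob. Illustrative only; not DRIC's exact algorithm.
from collections import deque

class AdaptiveDetector:
    def __init__(self, margin, window=10):
        self.margin = margin                  # larger margin = fewer false suspicions
        self.arrivals = deque(maxlen=window)  # recent heartbeat timestamps

    def heartbeat(self, t):
        self.arrivals.append(t)

    def suspect(self, now):
        if len(self.arrivals) < 2:
            return False
        gaps = [b - a for a, b in zip(self.arrivals, list(self.arrivals)[1:])]
        expected = self.arrivals[-1] + sum(gaps) / len(gaps)
        return now > expected + self.margin

d = AdaptiveDetector(margin=0.5)
for t in [0.0, 1.0, 2.1, 3.0]:               # heartbeats roughly every second
    d.heartbeat(t)
print(d.suspect(3.8))   # False: still within the expected arrival + margin
print(d.suspect(5.2))   # True: the heartbeat is overdue
```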
- Published
- 2006
- Full Text
- View/download PDF
33. FITDOC: Fast Virtual Machines Checkpointing with Delta Memory Compression
- Author
-
Xuanhua Shi, Yunjie Du, Song Wu, and Hai Jin
- Subjects
Dirty data ,Hardware_MEMORYSTRUCTURES ,business.industry ,Computer science ,Real-time computing ,computer.file_format ,computer.software_genre ,Virtualization ,Instruction set ,Virtual machine ,Dirty bit ,Server ,Operating system ,Overhead (computing) ,Bitmap ,Data center ,business ,computer - Abstract
Virtualization provides the ability to save the entire execution environment status of a running virtual machine (VM), which makes checkpointing flexible and practical for HPC servers or data center servers. However, system-level checkpointing needs to save a large amount of data to disk. Moreover, the overhead grows linearly with the increasing size of virtual machine memory, which leads to excessive disk I/O consumption and poor system scalability. To address this, we propose a novel fast VM checkpointing approach, named Fast Incremental Checkpointing with Delta Memory Compression (FITDOC). By studying the run-time memory characteristics of different workloads, FITDOC accounts for dirty data at a fine granularity (in units of 8 bytes) instead of the conventional page granularity. FITDOC utilizes a dirty page logging mechanism to record the dirty pages; accordingly, a delta memory compression mechanism is implemented to eliminate redundant memory data in checkpointing files. To locate the dirty data within dirty pages, FITDOC utilizes two mechanisms: by analyzing the distribution characteristics of dirty pages in the dirty bitmap, we propose a fast dirty bitmap scanning method to locate the dirty pages, and we use a multi-threaded data comparison mechanism to locate the real dirty data within a page. The experimental results show that, compared with Xen's default system-level checkpointing algorithm, FITDOC can reduce checkpointing time by 70.54% on average with a 1GB memory size and achieves greater improvements for VMs with larger memory configurations. FITDOC can reduce the checkpointing data size by 52.88% on average compared with Remus's incremental solution, which operates at page granularity. Compared with the default dirty bitmap scanning method in Xen, the scanning time of FITDOC is decreased by 91.13% on average.
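The fine-granularity delta idea can be illustrated with a few lines of code that compare a dirty page against its previous checkpoint copy in 8-byte words and keep only the words that changed. This is a sketch of the concept, not FITDOC's implementation.

```python
# Fine-grained delta extraction: compare a dirty page against its previous
# checkpoint copy in 8-byte words and record only the (offset, value) pairs
# that changed. A sketch of the idea, not FITDOC's implementation.
PAGE, WORD = 4096, 8

def delta(old: bytes, new: bytes):
    """Return [(offset, 8-byte chunk)] for the words that differ."""
    assert len(old) == len(new) == PAGE
    return [(i, new[i:i + WORD])
            for i in range(0, PAGE, WORD)
            if old[i:i + WORD] != new[i:i + WORD]]

def apply_delta(old: bytes, d):
    page = bytearray(old)
    for off, chunk in d:
        page[off:off + WORD] = chunk
    return bytes(page)

old = bytes(PAGE)                              # all-zero page from the last checkpoint
new = bytearray(old)
new[128:136] = b"DIRTY!!!"                     # touch a single 8-byte word
d = delta(old, bytes(new))
print(len(d), "changed word(s) instead of a full", PAGE, "byte page")
assert apply_delta(old, d) == bytes(new)       # the delta fully restores the page
```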
- Published
- 2014
- Full Text
- View/download PDF
34. Communication-driven scheduling for virtual clusters in cloud
- Author
-
Zhenjiang Xie, Song Wu, Sheng Di, Hai Jin, Haibao Chen, Xuanhua Shi, and Bing Bing Zhou
- Subjects
business.industry ,Computer science ,Distributed computing ,Virtual cluster ,Cloud computing ,computer.file_format ,computer.software_genre ,Fair-share scheduling ,Scheduling (computing) ,Hybrid Scheduling ,Virtual machine ,Operating system ,Cluster (physics) ,Executable ,business ,computer - Abstract
Due to high flexibility and cost-effectiveness, cloud computing is increasingly being explored as an alternative to local clusters by academic and commercial users. Recent research has already confirmed the feasibility of running tightly-coupled parallel applications with virtual clusters. However, such applications suffer from significant performance degradation, especially as overcommitment is common in the cloud; that is, the number of executable Virtual CPUs (VCPUs) is often larger than the number of available Physical CPUs (PCPUs) in the system. The performance degradation mainly results from the fact that current Virtual Machine Monitors (VMMs) cannot co-schedule (or coordinate at the same time) the VCPUs that host parallel application threads/processes with synchronization requirements. We introduce a communication-driven scheduling approach for virtual clusters in this paper, which can effectively mitigate the performance degradation of tightly-coupled parallel applications running atop them in overcommitted situations. There are two key contributions. 1) We propose a communication-driven VM scheduling (CVS) algorithm, by which the involved VMM schedulers can autonomously schedule suitable VMs at runtime. 2) We integrate the CVS algorithm into the Xen VMM scheduler and rigorously implement a prototype. We evaluate our design on a real cluster environment, and experiments show that our solution attains better performance for tightly-coupled parallel applications than state-of-the-art approaches such as the Credit scheduler of Xen, balance scheduling, and hybrid scheduling.
- Published
- 2014
- Full Text
- View/download PDF
35. Iteration Based Collective I/O Strategy for Parallel I/O Systems
- Author
-
Yong Chen, Zhixiang Wang, Xuanhua Shi, Song Wu, and Hai Jin
- Subjects
File system ,Parallel processing (DSP implementation) ,I/O scheduling ,Computer science ,Server ,Distributed computing ,Bandwidth (signal processing) ,Benchmark (computing) ,Parallel computing ,Performance improvement ,computer.software_genre ,computer ,Parallel I/O - Abstract
MPI collective I/O is a widely used I/O method that helps data-intensive scientific applications gain better I/O performance. However, it has been observed that existing collective I/O strategies do not perform well due to the access contention problem. Existing collective I/O optimization strategies mainly focus on the efficiency of the I/O phase and ignore the shuffle cost that may limit the potential of their performance improvement. We observe that as the I/O size becomes larger, one I/O operation from the upper application is split into several iterations to complete. Thus, I/O requests in each file domain are not necessarily issued to the parallel file system simultaneously unless they are carried out within the same iteration step. Based on that observation, this paper proposes a new collective I/O strategy that reorganizes I/O requests within each file domain instead of coordinating requests across file domains, so that access contention is eliminated without introducing extra shuffle cost between aggregators and computing processes. Using the IOR benchmark workloads, we evaluate our new strategy and compare it with the conventional one. The proposed strategy achieves 47%-63% I/O bandwidth improvement compared to the existing ROMIO collective I/O strategy.
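The within-domain reorganization can be pictured with a toy planner that packs a file domain's requests into iteration steps bounded by the collective buffer size and issues them in offset order. This illustrates the idea only; the request sizes and buffer size are invented, and the code does not reproduce the ROMIO-level implementation.

```python
# Toy reorganization of requests *within* one file domain into iteration
# steps: each step fills at most one collective buffer, and requests inside a
# step are issued in offset order. Illustrative only.
def plan_iterations(requests, buffer_size):
    """requests: list of (offset, size) belonging to one file domain."""
    steps, current, used = [], [], 0
    for off, size in sorted(requests):          # offset order within the domain
        if used + size > buffer_size and current:
            steps.append(current)                # buffer full: close this step
            current, used = [], 0
        current.append((off, size))
        used += size
    if current:
        steps.append(current)
    return steps

reqs = [(0, 512), (2048, 512), (512, 1024), (4096, 2048)]   # invented requests
for i, step in enumerate(plan_iterations(reqs, buffer_size=2048)):
    print("iteration", i, "->", step)
```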
- Published
- 2014
- Full Text
- View/download PDF
36. Page Classifier and Placer: A Scheme of Managing Hybrid Caches
- Author
-
Song Wu, Hai Jin, Xiaoming Li, Xuanhua Shi, Xiaofei Liao, and Xin Yu
- Subjects
Hardware_MEMORYSTRUCTURES ,Computer science ,Cache coloring ,business.industry ,Cache-oblivious algorithm ,Cache pollution ,computer.software_genre ,Smart Cache ,Cache invalidation ,Embedded system ,Operating system ,Page cache ,Cache ,business ,computer ,Cache algorithms - Abstract
Hybrid cache architecture (HCA), which uses two or more cache hierarchy designs in a processor, may outperform traditional cache architectures because no single memory technology can deliver optimal power, performance, and density at the same time. The general HCA scheme has also been proposed to manage cache regions that have different usage patterns. However, previous HCA management schemes control data placement at the cache set level and are oblivious to software's differing power and performance characteristics in different hardware cache regions. This hardware-only approach may lead to performance loss and may fail to guarantee quality of service. We propose a new HCA approach that enables the OS to be aware of the underlying hybrid cache architecture and to control data placement, at the OS page level, onto different cache regions. Our approach employs a lightweight hardware profiler to monitor cache behavior at the OS page level and to capture hot pages. With this knowledge, the OS is able to dynamically select different cache placement policies to optimize data placement for higher performance, lower power consumption, and better quality of service. Our simulation experiments demonstrate that the proposed hybrid HCA achieves a 7.8% performance improvement on a dual-core system compared to a traditional SRAM-only cache architecture while also reducing area cost.
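A toy classifier below shows the page-level idea: count accesses per OS page, label pages above a threshold as hot, and assign hot and cold pages to different cache regions. The region names, threshold, and trace are invented; the paper's profiler is a hardware component.

```python
# Toy page classifier: count accesses per OS page, mark pages above a
# threshold as hot, and place hot pages in the fast region and cold pages in
# the dense/low-power region. All parameters are invented for illustration.
from collections import Counter

PAGE_SHIFT = 12                      # 4 KiB pages
HOT_THRESHOLD = 3                    # accesses needed to count as "hot"

def classify(access_trace):
    counts = Counter(addr >> PAGE_SHIFT for addr in access_trace)
    return {page: ("fast-region" if n >= HOT_THRESHOLD else "dense-region")
            for page, n in counts.items()}

trace = [0x1000, 0x1008, 0x1010, 0x1018, 0x5000, 0x9000, 0x9004, 0x9008]
for page, region in sorted(classify(trace).items()):
    print(hex(page << PAGE_SHIFT), "->", region)
```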
- Published
- 2014
- Full Text
- View/download PDF
37. A Real-Time Scheduling Framework Based on Multi-core Dynamic Partitioning in Virtualized Environment
- Author
-
Like Zhou, Danqing Fu, Song Wu, Hai Jin, and Xuanhua Shi
- Subjects
Earliest deadline first scheduling ,Multi-core processor ,Software_OPERATINGSYSTEMS ,Computer science ,business.industry ,Distributed computing ,Real-time computing ,Hypervisor ,Cloud computing ,computer.software_genre ,Virtualization ,Scheduling (computing) ,Virtual machine ,business ,computer - Abstract
With the prevalence of virtualization and cloud computing, many real-time applications are running in virtualized cloud environments. However, their performance cannot be guaranteed because current hypervisors' CPU schedulers aim to share CPU resources fairly and improve system throughput. They do not consider the real-time constraints of these applications, which results in frequent deadline misses. In this paper, we present a real-time scheduling framework for virtualized environments. In the framework, we propose a mechanism called multi-core dynamic partitioning to divide physical CPUs (PCPUs) into two pools dynamically according to the scheduling parameters of real-time virtual machines (RT-VMs). We apply different schedulers to these pools to schedule RT-VMs and non-RT-VMs respectively. Besides, we design a global earliest deadline first (vGEDF) scheduler to schedule RT-VMs. We implement a prototype in the Xen hypervisor and conduct experiments to verify its effectiveness.
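The two mechanisms can be sketched in a few lines: size the real-time PCPU pool from the RT-VMs' declared utilizations, and pick the next RT-VCPU by earliest deadline. This is only an illustration with invented parameters; the actual framework is implemented inside the Xen hypervisor.

```python
# Toy version of the two ideas above: (1) decide how many physical CPUs to
# reserve for real-time VMs from their declared utilizations, and (2) pick
# the next real-time VCPU by earliest deadline (EDF). Illustrative only.
import math

def rt_pool_size(rt_vm_utilizations, total_pcpus):
    """E.g. utilizations [0.6, 0.8, 0.3] need ceil(1.7) = 2 PCPUs."""
    need = math.ceil(sum(rt_vm_utilizations))
    return min(need, total_pcpus)

def pick_next(runnable_vcpus):
    """runnable_vcpus: list of (absolute_deadline, vm_id, vcpu_id)."""
    return min(runnable_vcpus, default=None)   # earliest deadline first

print(rt_pool_size([0.6, 0.8, 0.3], total_pcpus=8))          # 2 PCPUs reserved for RT-VMs
print(pick_next([(20.0, "rt-vm1", 0), (12.5, "rt-vm2", 1)])) # rt-vm2's VCPU runs next
```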
- Published
- 2014
- Full Text
- View/download PDF
38. Cost-Aware Client-Side File Caching for Data-Intensive Applications
- Author
-
Song Wu, Hai Jin, Xuanhua Shi, Yong Chen, and Huang Yaning
- Subjects
Snoopy cache ,Distributed database ,Cache coloring ,CPU cache ,Computer science ,Cache pollution ,Cache-oblivious algorithm ,computer.software_genre ,Supercomputer ,Cache stampede ,Smart Cache ,Cache invalidation ,Write-once ,Server ,Operating system ,Data-intensive computing ,Page cache ,Cache ,computer ,Cache algorithms - Abstract
Parallel and distributed file systems are widely used to provide high throughput in high-performance computing and cloud computing systems. To increase parallelism, I/O requests are partitioned into multiple sub-requests (or 'flows') and distributed across different data nodes. The performance of file systems is extremely poor if data nodes have highly unbalanced response times. Client-side caching offers a promising direction for addressing this issue. However, current work has primarily used client-side memory as a read cache and employed a write-through policy, which requires a synchronous update for every write and significantly under-utilizes the client-side cache when applications are write-intensive. Realizing that the cost of an I/O request depends on its straggler sub-requests, we propose a cost-aware client-side file caching (CCFC) strategy that is designed to cache the sub-requests with high I/O cost on the client end. This caching policy enables a new trade-off across the write performance, consistency guarantee, and cache size dimensions. Using the MADbench2 benchmark workload, we evaluate our new cache policy alongside conventional write-through. We find that the proposed CCFC strategy can achieve up to 110% throughput improvement compared to conventional write-through policies with the same cache size on an 85-node cluster.
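The cost-aware admission idea can be sketched as a small cache that admits a sub-request only when its observed I/O cost beats the cheapest cached entry. This is a toy illustration with invented names and costs, not CCFC's cost model or consistency handling.

```python
# Toy cost-aware cache: a sub-request ("flow") is admitted to the client-side
# cache only if its observed I/O cost is high relative to what is already
# cached; when full, the cheapest entry is evicted. Illustrative only.
class CostAwareCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}                 # key -> (cost, data)

    def admit(self, key, cost, data):
        if key in self.entries or len(self.entries) < self.capacity:
            self.entries[key] = (cost, data)
            return True
        cheapest = min(self.entries, key=lambda k: self.entries[k][0])
        if self.entries[cheapest][0] < cost:      # only evict a cheaper entry
            del self.entries[cheapest]
            self.entries[key] = (cost, data)
            return True
        return False                              # not worth caching

cache = CostAwareCache(capacity=2)
print(cache.admit("flow-a", cost=5.0, data=b"..."))   # True
print(cache.admit("flow-b", cost=1.0, data=b"..."))   # True
print(cache.admit("flow-c", cost=9.0, data=b"..."))   # True, evicts flow-b
print(sorted(cache.entries))
```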
- Published
- 2013
- Full Text
- View/download PDF
39. Virtual Machine Scheduling for Parallel Soft Real-Time Applications
- Author
-
Song Wu, Hai Jin, Xuanhua Shi, Huahua Sun, and Like Zhou
- Subjects
Multi-core processor ,Computer science ,business.industry ,Real-time computing ,Cloud computing ,Dynamic priority scheduling ,computer.software_genre ,Virtualization ,Fair-share scheduling ,Scheduling (computing) ,Fixed-priority pre-emptive scheduling ,Virtual machine ,business ,computer - Abstract
With the prevalence of multicore processors in computer systems, many soft real-time applications, such as media-based ones, use parallel programming models to better utilize hardware resources and possibly shorten response time. Meanwhile, virtualization technology is widely used in cloud data centers. More and more cloud services, including such parallel soft real-time applications, are running in virtualized environments. However, current hypervisors do not provide adequate support for them because of soft real-time constraints and synchronization problems, which result in frequent deadline misses and serious performance degradation. CPU schedulers in the underlying hypervisors are central to these issues. In this paper, we identify and analyze CPU scheduling problems in hypervisors, and propose a novel scheduling algorithm considering both soft real-time constraints and synchronization problems. In our proposed method, real-time priority is introduced to accelerate event processing of parallel soft real-time applications, and a dynamic time slice is used to schedule virtual CPUs. Besides, all runnable virtual CPUs of virtual machines running parallel soft real-time applications are scheduled simultaneously to address synchronization problems. We implement a parallel soft real-time scheduler, named Poris, based on Xen. Our evaluation shows that Poris can significantly improve the performance of parallel soft real-time applications. For example, compared to the Credit scheduler, Poris improves the performance of a media player by up to a factor of 1.35 and shortens the execution time of the PARSEC benchmark by up to 44.12%.
- Published
- 2013
- Full Text
- View/download PDF
40. Cranduler: A Dynamic and Reusable Scheduler for Cloud Infrastructure Service
- Author
-
Xuanhua Shi, Hongqing Zhu, Bo Xie, Song Wu, and Hai Jin
- Subjects
Cloud computing security ,business.industry ,Computer science ,Information technology ,Cloud computing ,computer.software_genre ,Scheduling (computing) ,Software portability ,Virtual machine ,Operating system ,Architecture ,Cyberspace ,business ,computer - Abstract
As an important future trend in cyberspace, cloud computing has attracted much attention from the IT industry. Many research institutions and companies have launched their own cloud platforms, which include virtual machine schedulers to manage the infrastructure resource pool. The virtual machine scheduling modules in these platforms are built into the platform and are hard for developers to re-program. Since developers cannot design and implement special policies in the platform, the flexibility of the virtual machine scheduler is poor. Furthermore, the scheduling architecture, which has a fixed and unchangeable interface, is designed and customized for one kind of cloud platform, which leads to poor portability. To address these problems, this paper presents a dynamic and reusable scheduling system for cloud infrastructure services, called Cranduler, which brings the advantages of cluster schedulers to virtual machine scheduling in cloud infrastructure. The scheduling policies of Cranduler can be dynamically configured by developers, who can easily insert custom policies. In addition, Cranduler provides a set of unified interfaces to the cloud platform, which allows the system to access resources from different cloud platforms and to be reused across them.
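The kind of extensibility argued for here can be sketched as a registry of scheduling policies behind one unified entry point, so new placement policies can be plugged in at runtime. The interface and policy names below are hypothetical and are not Cranduler's API.

```python
# Minimal pluggable-scheduler skeleton: policies register under a name and
# can be swapped at runtime through one unified interface. Names and the
# interface are hypothetical; they only illustrate the extensibility idea.
POLICIES = {}

def policy(name):
    def register(fn):
        POLICIES[name] = fn
        return fn
    return register

@policy("least-loaded")
def least_loaded(vm, hosts):
    return min(hosts, key=lambda h: h["load"])

@policy("most-free-memory")
def most_free_memory(vm, hosts):
    return max(hosts, key=lambda h: h["free_mem"])

def schedule(vm, hosts, policy_name):
    return POLICIES[policy_name](vm, hosts)      # unified entry point

hosts = [{"name": "node1", "load": 0.7, "free_mem": 8},
         {"name": "node2", "load": 0.2, "free_mem": 2}]
print(schedule({"name": "vm1"}, hosts, "least-loaded")["name"])       # node2
print(schedule({"name": "vm1"}, hosts, "most-free-memory")["name"])   # node1
```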
- Published
- 2013
- Full Text
- View/download PDF
41. RTRM: A Response Time-Based Replica Management Strategy for Cloud Storage System
- Author
-
Bai Xiaohu, Xiaofei Liao, Hai Jin, Zhiyuan Shao, and Xuanhua Shi
- Subjects
Service (systems architecture) ,Database ,Computer science ,Replica ,Distributed computing ,Data_MISCELLANEOUS ,Bandwidth (signal processing) ,Response time ,Graph theory ,computer.software_genre ,Condensed Matter::Disordered Systems and Neural Networks ,Reduction (complexity) ,Server ,computer ,Computer Science::Distributed, Parallel, and Cluster Computing ,Selection (genetic algorithm) - Abstract
Replica management has become a hot research topic in storage systems. This paper presents a dynamic replica management strategy based on response time, named RTRM. The RTRM strategy consists of replica creation, replica selection, and replica placement mechanisms. RTRM sets a threshold for response time; if the response time is longer than the threshold, RTRM increases the number of replicas and creates a new replica. When a new request arrives, RTRM predicts the bandwidth among the replica servers and makes the replica selection accordingly. Replica placement refers to searching for a new replica placement location, which is an NP-hard problem. Based on graph theory, this paper proposes a reduction algorithm to solve this problem. The simulation results show that the RTRM strategy performs better than the five built-in replica management strategies in terms of network utilization and service response time.
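The two decisions described above can be sketched directly: create a replica when the observed response time exceeds a threshold, and select the replica server with the highest predicted bandwidth. The threshold, servers, and bandwidth predictions below are invented for illustration and do not reproduce RTRM's algorithms.

```python
# Toy versions of two RTRM decisions: threshold-driven replica creation and
# bandwidth-driven replica selection. Parameters are invented.
THRESHOLD_MS = 200

def need_new_replica(recent_response_times_ms):
    avg = sum(recent_response_times_ms) / len(recent_response_times_ms)
    return avg > THRESHOLD_MS                 # too slow -> add another replica

def select_replica(replica_servers, predicted_bandwidth_mbps):
    return max(replica_servers, key=lambda s: predicted_bandwidth_mbps[s])

servers = ["s1", "s2", "s3"]
bandwidth = {"s1": 40.0, "s2": 95.0, "s3": 70.0}       # predicted, not measured
print(need_new_replica([180, 250, 300]))               # True -> create replica
print(select_replica(servers, bandwidth))              # "s2"
```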
- Published
- 2013
- Full Text
- View/download PDF
42. Optimizing Xen Hypervisor by Using Lock-Aware Scheduling
- Author
-
Alin Zhong, Song Wu, Hai Jin, Wei Gen, and Xuanhua Shi
- Subjects
Application virtualization ,Hardware virtualization ,Full virtualization ,business.industry ,Computer science ,Temporal isolation among virtual machines ,Hypervisor ,computer.software_genre ,Virtualization ,Fair-share scheduling ,Virtual machine ,Embedded system ,Operating system ,business ,computer - Abstract
System virtualization enables multiple isolated running environments to be safely consolidated on a physical server, achieving better physical resource utilization and power savings. The virtual machine has become an essential component in most cloud/data-center system software stacks. However, virtualization has negative impacts on synchronization in the guest operating system (guest OS) and thus dramatically degrades the performance of the virtual machine. Therefore, how to effectively eliminate or alleviate these disadvantageous impacts has become an open research issue. The Xen hypervisor is a widely used virtualization platform in both industry and research. In this work, our aim is to optimize the Xen hypervisor to minimize the impacts of virtualization on synchronization in the guest OS. We propose a lock-aware scheduling mechanism, which focuses on improving the performance of virtual machines where the spin-lock primitive is frequently invoked, while guaranteeing scheduling fairness. The mechanism adopts a flexible scheduling algorithm based on spin-lock information that is updated dynamically. We have modified Xen and Linux to implement the scheduling mechanism. Experimental results show that the optimized system can nearly eliminate the impacts of virtualization on synchronization and improve the performance of virtual machines substantially. Although the proposed mechanism is aimed at optimizing the Xen hypervisor, it can also be applied to other paravirtualized platforms.
- Published
- 2012
- Full Text
- View/download PDF
43. Assessing MapReduce for Internet Computing: a Comparison of Hadoop and BitDew-MapReduce
- Author
-
Gilles Fedak, Hai Jin, Lu Lu, and Xuanhua Shi (Cluster and Grid Computing Lab, Huazhong University of Science and Technology [Wuhan] (HUST); Algorithms and Software Architectures for Distributed and HPC Platforms (AVALON), Inria Grenoble - Rhône-Alpes; Laboratoire de l'Informatique du Parallélisme (LIP), École normale supérieure de Lyon (ENS de Lyon), Université Claude Bernard Lyon 1 (UCBL), Université de Lyon, CNRS)
- Subjects
020203 distributed computing ,Computer science ,business.industry ,Distributed computing ,Node (networking) ,Cloud computing ,02 engineering and technology ,computer.software_genre ,Grid computing ,Backup ,Middleware (distributed applications) ,Scalability ,Distributed data store ,0202 electrical engineering, electronic engineering, information engineering ,Operating system ,020201 artificial intelligence & image processing ,The Internet ,[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC] ,business ,computer - Abstract
MapReduce is emerging as an important programming model for data-intensive applications. Adapting this model to desktop grids would allow taking advantage of the vast amount of computing power and distributed storage to execute a new range of applications able to process enormous amounts of data. In 2010, we presented the first implementation of MapReduce dedicated to Internet Desktop Grids, based on the BitDew middleware. In this paper, we present new optimizations to BitDew-MapReduce (BitDew-MR): aggressive task backup, intermediate result backup, task re-execution mitigation, and network failure hiding. We propose a new experimental framework that emulates key fundamental aspects of Internet Desktop Grids. Using the framework, we compare BitDew-MR and the open-source Hadoop middleware on Grid'5000. Our experimental results show that 1) BitDew-MR successfully passes all the stress tests of the framework, while Hadoop is unable to work in a typical wide-area network topology that includes PCs hidden behind firewalls and NAT; and 2) BitDew-MR outperforms Hadoop in several aspects: scalability, fairness, resilience to node failures, and network disconnections.
- Published
- 2012
- Full Text
- View/download PDF
44. Effectively deploying services on virtualization infrastructure
- Author
-
Xuanhua Shi, Wei Gao, Song Wu, Hai Jin, and Jinyan Yuan
- Subjects
Service (systems architecture) ,Deployment time ,General Computer Science ,Computer science ,business.industry ,Distributed computing ,Management model ,computer.software_genre ,Virtualization ,Theoretical Computer Science ,Software ,Software deployment ,Virtual machine ,Operating system ,Clone (computing) ,business ,computer - Abstract
Virtualization technology provides an opportunity to achieve efficient usage of computing resources. However, the management of services on virtualization infrastructure is still at a preliminary stage, and constructing user service environments quickly and efficiently remains a challenge. This paper presents a service-oriented multiple-VM deployment system (SO-MVDS) for creating and configuring virtual appliances that run services on demand. The system provides a template management model in which all virtual machines are created from templates with the software environment pre-prepared. To improve deployment performance, we explore several strategies for incremental mechanisms and deployment. We also design a service deployment mechanism to dynamically and automatically deploy multiple services within virtual appliances. We evaluate both the deployment time and the I/O performance of the proposed incremental mechanism. The experimental results show that the incremental mechanism outperforms the clone-based one.
- Published
- 2012
- Full Text
- View/download PDF
45. Fast saving and restoring virtual machines with page compression
- Author
-
Xuanhua Shi, Song Wu, Li Deng, Jiangfu Zhou, and Hai Jin
- Subjects
File system ,Desktop virtualization ,Computer science ,Hardware virtualization ,Full virtualization ,Virtualization ,computer.software_genre ,Virtual machine ,Server ,Multithreading ,Operating system ,Network File System ,Page table ,computer - Abstract
More and more enterprises have been moving beyond server virtualization to desktop virtualization in recent years. In virtualization environments, centralized shared storage systems are generally used to take advantage of virtualization features such as VM migration. The network file system (NFS) is considered the best choice in small or medium-sized LANs due to its flexibility and low cost. But it becomes the bottleneck when many clients access the server simultaneously, especially when multiple virtual machines access a large amount of data at the same time, such as during save and restore operations. In this paper, we present a new method named ComIO to quickly save and restore virtual machines using page compression. Based on an analysis of virtual machines' memory characteristics, we design a fast enhanced characteristic-based compression (ECBC) algorithm. Combined with multi-threaded techniques, the compression tasks are parallelized to significantly shorten compression time. Page boundary alignment is proposed to enable the desired page data to be extracted directly from the compressed block. The experimental results demonstrate that, compared with Xen, our method ComIO not only greatly reduces the average time spent on saving and restoring virtual machines, but also indirectly augments the effective storage space.
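A minimal sketch of parallel, per-page compression with an index that preserves page boundaries is shown below; it uses zlib and a thread pool purely for illustration and does not reproduce the ECBC algorithm.

```python
# Toy save path: compress VM memory pages in parallel and keep each page as
# its own compressed block, recording per-page offsets so a single page can
# be decompressed without reading the whole file. Sketch only; zlib stands in
# for the paper's ECBC algorithm.
import zlib
from concurrent.futures import ThreadPoolExecutor

PAGE = 4096

def save(pages):
    with ThreadPoolExecutor(max_workers=4) as pool:
        blocks = list(pool.map(lambda p: zlib.compress(p, 1), pages))
    index, blob, off = [], bytearray(), 0
    for b in blocks:                  # page boundaries are preserved in the index
        index.append((off, len(b)))
        blob += b
        off += len(b)
    return bytes(blob), index

def load_page(blob, index, page_no):
    off, length = index[page_no]
    return zlib.decompress(blob[off:off + length])

pages = [bytes([i % 256]) * PAGE for i in range(8)]    # fake memory image
blob, index = save(pages)
print(len(blob), "compressed bytes for", len(pages) * PAGE, "bytes of memory")
assert load_page(blob, index, 5) == pages[5]           # random access to one page
```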
- Published
- 2011
- Full Text
- View/download PDF
46. LCM: A lightweight communication mechanism in HPC cloud
- Author
-
Yu Fu, Haibao Chen, Xuanhua Shi, Song Wu, and Hai Jin
- Subjects
business.industry ,Computer science ,Binary image ,Linux kernel ,Cloud computing ,computer.software_genre ,Unix domain socket ,Mechanism (engineering) ,Binary code compatibility ,Embedded system ,Benchmark (computing) ,Operating system ,File transfer ,business ,computer - Abstract
Inspired by the concept of cloud computing, building an HPC cloud from traditional HPC resources not only provides new opportunities to address the challenges of traditional HPC, but also brings many exciting research problems. One of them is how to reduce the network overhead of virtual clusters in the HPC cloud. To resolve this problem, this paper presents the design and implementation of a lightweight communication mechanism for virtual clusters in the HPC cloud, called LCM, which maintains binary compatibility for applications written against the standard socket interface. We implemented our design on Xen 3.2 with Linux kernel 2.6.18, and evaluated the speed of file transfer, the running time of the NAS Parallel Benchmarks, and binary compatibility using binary images of real socket applications. Our tests show that LCM achieves performance comparable to UNIX domain sockets and ensures full binary compatibility.
- Published
- 2011
- Full Text
- View/download PDF
47. EAPAC: An Enhanced Application Placement Framework for Data Centers
- Author
-
Xuanhua Shi, Hai Jin, Bo Yu, Ligang He, Fei Wang, Hongbo Jiang, and Chonggang Wang
- Subjects
Focus (computing) ,CPU power dissipation ,business.industry ,Computer science ,Application server ,Distributed computing ,computer.software_genre ,Concurrency control ,Resource (project management) ,Order (exchange) ,Resource allocation (computer) ,Resource allocation ,business ,Host (network) ,computer ,Computer network - Abstract
Emerging data centers may host a large number of applications that consume CPU power, memory, and I/O resources. Previous studies focus on allocating resources to perfectly satisfy the demands seen in the current cycle, and the existing application placement algorithms are all application-based. The existing application placement algorithms in the literature assume that the consumption of system resources is proportional to the level of workload submitted to the system. In this paper, we reveal that this may not be the case in some circumstances. Based on this observation, we design and implement an application placement framework, called EAPAC, for data centers. The developed framework is able to judiciously allocate to application servers a proper mixture of different types of application requests, as well as an appropriate number of requests of each type. Further, we investigate the issue of resource conflicts among different applications when there are concurrent requests in the system. We have conducted extensive experiments to evaluate the performance of the developed framework. The experimental results show that, compared with existing approaches, EAPAC can improve performance by 30% in terms of reply rate. In particular, when there are concurrent requests in the system, performance can be improved by 100%.
- Published
- 2011
- Full Text
- View/download PDF
48. An Approach to Use Cluster-Wide Free Memory in Virtual Environment
- Author
-
Like Zhou, Song Wu, Hai Jin, and Xuanhua Shi
- Subjects
Distributed shared memory ,Hardware_MEMORYSTRUCTURES ,Flat memory model ,Computer science ,business.industry ,Uniform memory access ,Registered memory ,Semiconductor memory ,computer.software_genre ,Memory map ,Embedded system ,Interleaved memory ,Operating system ,business ,computer ,Computer memory - Abstract
Memory- and I/O-intensive applications often use a huge amount of memory, and their performance decreases quickly when memory pressure arises. With the development of high-performance networks and their wide use in clusters, the latency of remote memory access is much lower than that of disk operations. In this paper, we present an approach that lets a VM (Virtual Machine) use cluster-wide free memory, which overcomes the limitation of physical memory by exploiting low-latency access to the memory of other nodes in the cluster. This approach can significantly reduce the execution time of memory- and I/O-intensive applications by utilizing cluster-wide memory, and it increases the utilization of the whole cluster.
- Published
- 2011
- Full Text
- View/download PDF
49. A Cloud Service Cache System Based on Memory Template of Virtual Machine
- Author
-
Xiaoxin Wu, Chao Liu, Song Wu, Hai Jin, Li Deng, and Xuanhua Shi
- Subjects
Cache coloring ,business.industry ,Computer science ,Temporal isolation among virtual machines ,Cloud computing ,computer.software_genre ,Virtual machine ,Virtual memory ,Operating system ,Page cache ,Cache ,business ,Data diffusion machine ,computer - Abstract
In data centers and cloud computing environments, the number of virtual machines (VMs) increases when the number of service requests increases. Since services are invoked on demand, the corresponding virtual machines are created and shut down frequently. This makes the time to start a virtual machine a crucial performance bottleneck for services in data centers. Besides, if virtual machines read image files from disk to start themselves, additional disk access overhead is generated. In this paper, we present a cloud service cache system based on memory templates of virtual machines, called VCache, to improve the response time of cloud computing services and to reduce disk access overhead. This system can create and maintain service cache VMs through memory templates, which are snapshots of running virtual machines. By creating virtual machines from cached image files, the services running in these VMs can be deployed rapidly, which greatly reduces the launching time of the service and the disk I/O load. We evaluate our system with experiments, and the experimental results show that the average time for creating a VM is reduced by about 80% and the amount of data accessed from disk decreases by more than 50%.
- Published
- 2011
- Full Text
- View/download PDF
50. Dynamic Processor Resource Configuration in Virtualized Environments
- Author
-
Song Wu, Hai Jin, Li Deng, and Xuanhua Shi
- Subjects
Computer science ,business.industry ,Distributed computing ,Temporal isolation among virtual machines ,Dynamic priority scheduling ,computer.software_genre ,Virtualization ,Virtual machine ,Operating system ,Resource allocation ,Resource management ,Data center ,business ,computer ,Live migration - Abstract
Virtualization can provide significant benefits in data centers, such as dynamic resource configuration and live virtual machine migration. Services are deployed in virtual machines (VMs), and resource utilization can be greatly improved. In this paper, we present VScheduler, a system that dynamically adjusts the processor resource configuration of virtual machines, including the amount of virtual resources and a new mapping of virtual machines to physical nodes. VScheduler implements a two-level resource configuration scheme -- local resource configuration (LRC) for an individual virtual machine and global resource configuration (GRC) for a whole cluster or data center. GRC especially takes the variation tendency of workloads into account when remapping virtual machines to physical nodes. We implement our techniques in Xen and conduct a detailed evaluation using RUBiS and dbench. The experimental results show that VScheduler not only satisfies the resource demands of services, but also reduces the number of virtual machine migrations, providing a stable VM distribution on physical nodes in data centers.
- Published
- 2011
- Full Text
- View/download PDF