Author: "Chunming Hu" / Topic: business.industry - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Chunming Hu"' showing total 47 results

1. A Collective Approach to Scholar Name Disambiguation

Author: Chunming Hu, Xiang Zhang, Jinpeng Huai, Shuai Ma, Dongsheng Luo, and Yaowei Yan
Subjects: Computational Theory and Mathematics, business.industry, Computer science, Name disambiguation, Artificial intelligence, business, computer.software_genre, computer, Natural language processing, Computer Science Applications, Information Systems
Published: 2022

2. ScaleReactor: A graceful performance isolation agent with interference detection and investigation for container‐based scale‐out workloads

Author: Xiaoqiang Yu, Tianyu Wo, Chunming Hu, and Jianyong Zhu
Subjects: Computer Networks and Communications, business.industry, Computer science, Temporal isolation among virtual machines, Computer Science Applications, Theoretical Computer Science, Computational Theory and Mathematics, Interference (communication), Scalability, Container (abstract data type), Correlation analysis, business, Software, Computer hardware
Published: 2021

3. Online Multi-Skilled Task Assignment on Road Networks

Author: Kaixin Wang, Chunming Hu, Wenjun Wu, and Yu Liang
Subjects: General Computer Science, Computer science, 0211 other engineering and technologies, 02 engineering and technology, Crowdsourcing, Machine learning, computer.software_genre, Task (project management), road networks, Constant (computer programming), Road networks, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Online algorithm, task assignment, Hierarchically separated tree, 021103 operations research, Competitive analysis, business.industry, General Engineering, Tree structure, spatial crowdsourcing, 020201 artificial intelligence & image processing, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, business, computer, lcsh:TK1-9971, Strengths and weaknesses
Abstract: With the development of smartphones and online-to-offline services, spatial crowdsourcing platforms such as TaskRabbit are becoming increasingly popular. Tasks on these platforms have three main characteristics: they arise in real-time dynamic scenarios, they are located on road networks, and some of them require multiple skills. However, existing studies do not take all three into account simultaneously. Therefore, an important issue for spatial crowdsourcing platforms is to assign workers to tasks according to their skills on road networks in a real-time scenario. In this paper, we first propose a practical problem, called the online multi-skilled task assignment on road networks (OMTARN) problem, and prove that the OMTARN problem is NP-hard and that no online algorithm can achieve a constant competitive ratio on it. Then, we design a framework using batch-based algorithms, including fixed and dynamic batch-based algorithms, and we show how the algorithms update the batch. After that, we use the hierarchically separated tree structure to accelerate our algorithms. Finally, we implement all the algorithms for the OMTARN problem and clarify their strengths and weaknesses by testing them on both synthetic and real datasets.
Published: 2019

4. Brief Industry Paper: Optimizing Memory Efficiency of Graph Neural Networks on Edge Computing Platforms

Author: Yunli Chen, Weisheng Zhao, Pengcheng Dai, Yingjie Qi, Jianlei Yang, Xiaoyi Wang, Ao Zhou, Yeqi Gao, Tong Qiao, and Chunming Hu
Subjects: Graph neural networks, business.industry, Computer science, Inference, Machine learning, computer.software_genre, Range (mathematics), Memory management, Feature (machine learning), Decomposition (computer science), Limit (mathematics), Artificial intelligence, business, computer, Edge computing
Abstract: Graph neural networks (GNN) have achieved state-of-the-art performance on various industrial tasks. However, the poor efficiency of GNN inference and frequent Out-of-Memory (OOM) problem limit the successful application of GNN on edge computing platforms. To tackle these problems, a feature decomposition approach is proposed for memory efficiency optimization of GNN inference. The proposed approach could achieve outstanding optimization on various GNN models, covering a wide range of datasets, which speeds up the inference by up to 3×. Furthermore, the proposed feature decomposition could significantly reduce the peak memory usage (up to 5× in memory efficiency improvement) and mitigate OOM problems during GNN inference.
Published: 2021

5. Hybrid Resource Orchestration and Scheduling for Cyber-Physical-Human Systems

Author: Tianyu Wo, Wang Xu, Jianyong Zhu, and Chunming Hu
Subjects: Computer science, business.industry, Distributed computing, 020208 electrical & electronic engineering, Cyber-physical system, 020206 networking & telecommunications, Cloud computing, 02 engineering and technology, Networking hardware, Scheduling (computing), Resource (project management), Server, 0202 electrical engineering, electronic engineering, information engineering, Orchestration (computing), business, Edge computing
Abstract: Recently, Cyber-Physical-Human Systems (CPHS) have attracted much attention. Unlike existing computing paradigms such as Cloud Computing and Edge Computing, CPHS usually comprise many types of hybrid resources, such as software and hardware in the cloud layer, network devices and edge servers in the network layer, as well as physical devices and human participation. Such hybrid resources make efficient resource scheduling for CPHS challenging. Previous studies on CPHS often focus on improving scheduling efficiency for specific application scenarios and are thus of limited use for other CPHS. In this work, we present a unified hybrid resource scheduling framework for CPHS, including a hybrid resource model for multiple types of resource providers and participants in CPHS, a hybrid resource orchestration tool that can express resource requirements in terms of our hybrid resource model, and a scheduling mechanism that satisfies the resource requirements, in particular by considering the resource constraints and cross-layer scheduling in CPHS. We perform a case study of typical CPHS scenarios and show that our hybrid resource model and scheduling framework are effective and provide a new view for improving resource provisioning efficiency in CPHS.
Published: 2020

6. TOPOSCH: Latency-Aware Scheduling Based on Critical Path Analysis on Shared YARN Clusters

Author: Tianyu Wo, Shiqing Xue, Xiaoqiang Yu, Jianyong Zhu, Jie Xu, Chunming Hu, Rajiv Ranjan, Hao Peng, and Renyu Yang
Subjects: 020203 distributed computing, Computer science, business.industry, Quality of service, Distributed computing, Big data, Resource Management System, 020206 networking & telecommunications, 02 engineering and technology, Yarn, Microservices, computer.software_genre, Scheduling (computing), visual_art, 0202 electrical engineering, electronic engineering, information engineering, visual_art.visual_art_medium, Batch processing, Resource allocation, Resource management, Latency (engineering), Web service, business, computer, Critical path method
Abstract: Balancing resource utilization and application QoS is a long-standing research topic in cluster resource management. Big data YARN clusters need to co-schedule diverse workloads on shared resources, including batch processing jobs, streaming jobs, and other long-running applications such as web services and database services. Current resource managers are only responsible for resource allocation among applications/jobs and are completely unaware of the runtime QoS requirements of interactive and latency-sensitive applications. Prior works that maximize the QoS of monolithic applications ignore the inherent dependencies and spatio-temporal performance variability of components, which are characteristics of distributed applications primarily driven by microservices. In this paper, we present Toposch, a new resource management system that adaptively co-locates batch tasks and microservices by harvesting runtime latency. In particular, Toposch tracks the full footprint of every request across microservices over time. A latency graph is periodically generated for identifying victim microservices through an end-to-end latency critical path analysis. We then exploit per-microservice and per-node risk assessment to gauge the resources visible to the capacity scheduler in YARN. Execution of batch tasks is adaptively throttled or delayed, thereby avoiding latency increases due to node over-saturation. TOPOSCH is integrated with YARN, and experiments show that the latency of DLRAs can be reduced by up to 39.8% against the default capacity scheduling in YARN.
Published: 2020
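
Illustrative note: the abstract above mentions an end-to-end latency critical path analysis over a latency graph. The record does not say how Toposch builds or traverses that graph, so the following is only a generic sketch of critical-path (longest-path) search on a call DAG. The function name, the input dictionaries, and the assumption that downstream calls fan out in parallel are all hypothetical.

```python
from functools import lru_cache


def critical_path(latency_ms, calls, root):
    """Return (total_latency, path) for the slowest call chain starting at root.

    latency_ms: service name -> its own processing time in milliseconds
    calls:      service name -> list of downstream services (must form a DAG)
    Both structures are assumptions made for this sketch.
    """

    @lru_cache(maxsize=None)
    def best(service):
        # Evaluate every downstream chain and keep the slowest one.
        children = [best(child) for child in calls.get(service, ())]
        tail_cost, tail = max(children, key=lambda t: t[0], default=(0.0, ()))
        return latency_ms[service] + tail_cost, (service,) + tail

    total, path = best(root)
    return total, list(path)


# Hypothetical request: the gateway calls auth and cart, and cart calls db.
latency = {"gateway": 5, "auth": 10, "cart": 20, "db": 40}
graph = {"gateway": ["auth", "cart"], "cart": ["db"]}
total, path = critical_path(latency, graph, "gateway")
```

In this toy input the chain through cart and db dominates, so those services would be the first candidates to protect from co-located batch work.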

7. Performance-aware Speculative Resource Oversubscription for Large-scale Clusters

Author: Peter Garraghan, Chao Li, Xiaoyang Sun, Tianyu Wo, Jie Xu, Hao Peng, Renyu Yang, Zhenyu Wen, and Chunming Hu
Subjects: business.industry, Computer science, Quality of service, Admission control, Scheduling (computing), Cost reduction, Computational Theory and Mathematics, Hardware and Architecture, Signal Processing, Batch processing, Resource allocation, Resource management, business, Resource utilization, Computer network
Abstract: It is a long-standing challenge to achieve a high degree of resource utilization in cluster scheduling. Resource oversubscription has become a common practice for improving resource utilization and reducing cost. However, current centralized approaches to oversubscription suffer from resource mismatch and fail to take into account other performance requirements, e.g., tail latency. In this article we present ROSE, a new resource management platform capable of conducting performance-aware resource oversubscription. ROSE allows latency-sensitive long-running applications (LRAs) to co-exist with computation-intensive batch jobs. Instead of waiting for resource allocation to be confirmed by the centralized scheduler, job managers in ROSE can independently request to launch speculative tasks within specific machines according to their suitability for oversubscription. Node agents of those machines can, however, avoid any excessive resource oversubscription by means of an admission control mechanism that uses multi-resource threshold control and performance-aware resource throttling. Experiments show that, with mixed co-location of batch jobs and latency-sensitive LRAs, the CPU utilization and the disk utilization can reach 56.34 and 43.49 percent, respectively, while the 95th percentile of read latency in YCSB workloads increases by only 5.4 percent compared with executing the LRAs alone.
Published: 2020
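
Illustrative note: the abstract above describes node agents that gate speculative tasks through multi-resource threshold control. The record gives no thresholds or data structures, so this is a minimal sketch under stated assumptions: the Usage type, the admit function, and the ceiling values are all hypothetical and are not taken from ROSE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    cpu: float   # fraction of node capacity, 0.0 to 1.0
    mem: float
    disk: float


# Hypothetical per-resource ceilings; the abstract does not state any values.
THRESHOLDS = Usage(cpu=0.85, mem=0.80, disk=0.75)


def admit(current: Usage, demand: Usage, limits: Usage = THRESHOLDS) -> bool:
    """Admit a speculative task only if every resource stays within its limit."""
    return (
        current.cpu + demand.cpu <= limits.cpu
        and current.mem + demand.mem <= limits.mem
        and current.disk + demand.disk <= limits.disk
    )


node = Usage(cpu=0.5, mem=0.4, disk=0.3)
task = Usage(cpu=0.2, mem=0.2, disk=0.2)
accepted = admit(node, task)
```

Checking every dimension, rather than CPU alone, is what keeps a task that fits on CPU from being admitted onto a node whose memory or disk is already near its ceiling.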

8. Eagle+: A fast incremental approach to automaton and table online updates for cloud services

Author: Shenghai Zhong, Erica Yang, Hao Peng, Chunming Hu, Lihong Wang, and Jianxin Li
Subjects: Eagle, Theoretical computer science, biology, Computer Networks and Communications, Computer science, business.industry, 020206 networking & telecommunications, Cloud computing, 02 engineering and technology, Oracle, Automaton, Hardware and Architecture, biology.animal, Computation complexity, Atomic operations, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, business, Algorithm, Software
Abstract: Automaton- or table-based multi-pattern matching methods have been widely used in cloud services, e.g., virtual firewall services and virtual IDS services. In the cloud, a large number of patterns in such services are frequently updated, caused by users joining or quitting and by adjustments to security and management policies. Therefore, how to quickly and accurately update the automaton and table becomes an important issue. In this paper, we propose Eagle+, an incremental approach for updating the matching automaton and table whilst avoiding recalculating the whole pattern set after each change. In Eagle+, we attain efficiency by computing only the latest update set of patterns when updating the automaton and table. Moreover, Eagle+ achieves accurate local updating based on three atomic operations (adding, updating and deleting), each of which modifies values on the classical Aho–Corasick (AC) automaton, the Set Backward Oracle Matching (SBOM) automaton and the Wu–Manber (WM) table. Compared with existing pattern updating methods, Eagle+ reduces the computational complexity from O(n^2) to O(n). The experimental results show that Eagle+ can save nearly 72%–92% of the time consumed in updating automatons and performs 100X faster on the WM table.
Published: 2018

9. Evolution of Cloud Operating System: From Technology to Ecosystem

Author: Yongwei Wu, Zuo-Ning Chen, Lu-Fei Zhang, Ao-Bing Sun, Chunming Hu, Hong Tang, Song Wu, Yuzhong Sun, Zheng-Wei Qi, Kang Chen, Jinlei Jiang, and Zi-Lu Kang
Subjects: Application programming interface, Computer science, business.industry, 020207 software engineering, Cloud computing, 02 engineering and technology, Virtualization, computer.software_genre, Computer Science Applications, Theoretical Computer Science, Scheduling (computing), Computational Theory and Mathematics, Hardware and Architecture, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Operating system, Resource management, business, computer, Software
Abstract: The cloud operating system (cloud OS) is used for managing cloud resources such that they can be used effectively and efficiently. It is also the duty of the cloud OS to provide a convenient interface for users and applications. However, these two goals are often conflicting because convenient abstraction usually needs more computing resources. Thus, the cloud OS has its own characteristics of resource management and task scheduling for supporting various kinds of cloud applications. The evolution of the cloud OS is in fact driven by these two often conflicting goals, and finding the right tradeoff between them makes each phase of the evolution happen. In this paper, we have investigated the ways of cloud OS evolution from three different aspects: enabling technology evolution, OS architecture evolution and cloud ecosystem evolution. We show that finding the appropriate APIs (application programming interfaces) is critical for the next phase of cloud OS evolution. Convenient interfaces need to be provided without sacrificing efficiency when APIs are chosen. We present an API-driven cloud OS practice, showing the great capability of APIs for developing a better cloud OS and helping build and run the cloud ecosystem healthily.
Published: 2017

10. Perphon

Author: Tianyu Wo, Jie Xu, Jianyong Zhu, Renyu Yang, Ouyang Jin, Chunming Hu, and Shiqing Xue
Subjects: Computer science, business.industry, Quality of service, Distributed computing, Offline learning, Temporal isolation among virtual machines, Performance prediction, Workload, Access control, Memory bandwidth, Cache, business
Abstract: Cluster administrators are facing great pressure to improve cluster utilization through workload co-location. Guaranteeing the performance of long-running applications (LRAs), however, is far from settled, as unpredictable interference across applications is catastrophic to QoS [2]. Current solutions such as [1] usually employ sandboxed and offline profiling for different workload combinations and leverage them to predict incoming interference. However, the time complexity restricts the applicability to complex co-locations. Hence, this issue entails a new framework to harness runtime performance and mitigate the time cost with machine intelligence: i) It is desirable to explore a quantitative relationship between allocated resources and consequent workload performance, not relying on analyzing interference derived from different workload combinations. The majority of works, however, depend on offline profiling and training, which may lead to a model aging problem. Moreover, multi-resource dimensions (e.g., LLC contention) that are not completely covered by existing works but have an impact on performance interference need to be considered [3]. ii) Workload co-location also necessitates a fine-grained isolation and access control mechanism. Once performance degradation is detected, dynamic resource adjustment will be enforced and the application will be assigned access to specific slices of each resource. Inferring a "just enough" amount of resource adjustment ensures that application performance can be secured whilst improving cluster utilization. We present Perphon, a runtime agent on a per-node basis that decouples ML-based performance prediction and resource inference from the centralized scheduler. Figure 1 outlines the proposed architecture. We initially exploit the sensitivity of applications to multiple resources to establish performance prediction. To achieve this, Metric Monitor aggregates application fingerprints and system-level performance metrics including CPU, memory, Last Level Cache (LLC), memory bandwidth (MBW) and number of running threads, etc. They are enabled by Intel-RDT and precisely obtained from the resource group manager. Perphon employs an Online Gradient Boost Regression Tree (OGBRT) approach to resolve the model aging problem. Res-Perf Model warms up via offline learning that relies on only a small volume of profiling in the early stage, but evolves with the arrival of workloads. Consequently, parameters will be automatically updated and synchronized among agents. Anomaly Detector can promptly pinpoint a performance degradation via LSTM time-series analysis and determine when and which application needs to have resources re-allocated. Once an abnormal performance counter or load is detected, Resource Inferer conducts a gradient-ascent-based inference to work out a proper slice of resources, towards dynamically recovering the targeted performance. Upon receiving an updated re-allocation, Access Controller re-assigns a specific portion of the node resources to the affected application. Eventually, Isolation Executor enforces resource manipulation and ensures performance isolation across applications. Specifically, we use the cgroup cpuset and memory subsystems to control usage of CPU and memory while leveraging Intel-RDT technology to underpin the manipulation of LLC and MBW. For fine-granularity management, we create different groups for LRA and batch jobs when the agent starts. Our prototype integration with the Node Manager of Apache YARN shows that the throughput of a Kafka data-streaming application in Perphon is 2.0x and 1.82x that of the isolation execution schemes in native YARN and the pure cgroup cpu subsystem.
Published: 2019

11. Shaready: A Resource-Isolated Workload Co-Location System

Author: Renyu Yang, Jianyong Zhu, Chunming Hu, and Shiqing Xue
Subjects: Computer science, business.industry, Quality of service, Distributed computing, 020206 networking & telecommunications, Provisioning, Cloud computing, 02 engineering and technology, Virtualization, computer.software_genre, Shared resource, Resource (project management), Virtual machine, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Resource management, business, computer
Abstract: Over the past decade, cloud and subsequent joint cloud computing have been evolving into one of the biggest disruptive technologies of the modern digital age. The rapidly maturing cloud service and system management still rely heavily on virtualization, which underpins Infrastructure as a Service (IaaS) to offer on-demand and low-cost computing services. Nevertheless, datacenters still suffer from low utilization and resource imbalance. IaaS systems and their workloads, as legacy estates, are difficult to migrate or re-plan, thereby increasing the complexity of utilization improvement. Arguably, co-locating long-running applications encapsulated in virtual machines with latency-insensitive batch jobs is an alternative way to improve overall resource utilization. However, guaranteeing the quality of long-running services is still challenging. In this context, we propose Shaready, an isolation-based cluster resource sharing system that enables workload co-residence. By means of global resource quota configuration and multi-resource isolation, long-running services in virtual machines can be prioritized with maximized resource provisioning. We implemented and validated it on OpenStack and YARN clusters, and experiments demonstrate that system CPU and memory utilization can be improved by roughly 50% and 16.67% respectively on average, with at most 7% performance degradation.
Published: 2019

12. MultiLanes

Author: Junbin Kang, Ye Zhai, Benlong Zhang, Jinpeng Huai, Chunming Hu, and Tianyu Wo
Subjects: business.industry, Computer science, Temporal isolation among virtual machines, 020206 networking & telecommunications, 02 engineering and technology, Storage virtualization, Virtualization, computer.software_genre, Hardware and Architecture, 020204 information systems, Embedded system, Scalability, Container (abstract data type), 0202 electrical engineering, electronic engineering, information engineering, Operating system, Overhead (computing), Namespace, business, computer, Host (network)
Abstract: OS-level virtualization is often used for server consolidation in data centers because of its high efficiency. However, the sharing of storage stack services among the colocated containers incurs contention on shared kernel data structures and locks within I/O stack, leading to severe performance degradation on manycore platforms incorporating fast storage technologies (e.g., SSDs based on nonvolatile memories). This article presents MultiLanes, a virtualized storage system for OS-level virtualization on manycores. MultiLanes builds an isolated I/O stack on top of a virtualized storage device for each container to eliminate contention on kernel data structures and locks between them, thus scaling them to manycores. Meanwhile, we propose a set of techniques to tune the overhead induced by storage-device virtualization to be negligible, and to scale the virtualized devices to manycores on the host, which itself scales poorly. To reduce the contention within each single container, we further propose SFS, which runs multiple file-system instances through the proposed virtualized storage devices, distributes all files under each directory among the underlying file-system instances, then stacks a unified namespace on top of them. The evaluation of our prototype system built for Linux container (LXC) on a 32-core machine with both a RAM disk and a modern flash-based SSD demonstrates that MultiLanes scales much better than Linux in micro- and macro-benchmarks, bringing significant performance improvements, and that MultiLanes with SFS can further reduce the contention within each single container.
Published: 2016

13. ROSE: Cluster Resource Scheduling via Speculative Over-Subscription

Author: Tianyu Wo, Chunming Hu, Chao Li, Peter Garraghan, Jianyong Zhu, Jie Xu, Xiaoyang Sun, and Renyu Yang
Subjects: Resource scheduling, Job shop scheduling, Request queue, Computer science, business.industry, Processor scheduling, CPU time, 020206 networking & telecommunications, Workload, 02 engineering and technology, Scheduling (computing), Idle, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Task analysis, Resource allocation, Resource management, business, Computer network
Abstract: A long-standing challenge in cluster scheduling is to achieve a high degree of utilization of heterogeneous resources in a cluster. In practice there exists a substantial disparity between perceived and actual resource utilization. A scheduler might regard a cluster as fully utilized if a large resource request queue is present, but the actual resource utilization of the cluster can be in fact very low. This disparity results in the formation of idle resources, leading to inefficient resource usage and incurring high operational costs and an inability to provision services. In this paper we present a new cluster scheduling system, ROSE, that is based on a multi-layered scheduling architecture with an ability to over-subscribe idle resources to accommodate unfulfilled resource requests. ROSE books idle resources in a speculative manner: instead of waiting for resource allocation to be confirmed by the centralized scheduler, it requests intelligently to launch tasks within machines according to their suitability to oversubscribe resources. A threshold control with timely task rescheduling ensures fully-utilized cluster resources without generating potential task stragglers. Experimental results show that ROSE can almost double the average CPU utilization, from 36.37% to 65.10%, compared with a centralized scheduling scheme, and reduce the workload makespan by 30.11%, with an 8.23% disk utilization improvement over other scheduling strategies.
Published: 2018

14. Cider: a Rapid Docker Container Deployment System through Sharing Network Storage

Author: Renyu Yang, Chunming Hu, Du Lian, and Tianyu Wo
Subjects: 020203 distributed computing, business.industry, Computer science, Distributed computing, Process (computing), Cloud computing, 02 engineering and technology, Concurrency control, Resource (project management), Software deployment, Scalability, Container (abstract data type), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, business
Abstract: Container technology has become prevalent and widely adopted in production environments because of its substantial benefits for application packaging, deployment and management. However, the deployment process is relatively slow with conventional approaches. In large-scale concurrent deployments, resource contention on the central image repository aggravates the situation. In fact, the image pulling operation is mainly responsible for the degraded performance. To this end, we propose Cider, a novel deployment system that enables rapid container deployment in a highly concurrent and scalable manner. Firstly, on-demand image data loading is achieved by replacing the local Docker storage of worker nodes with network storage shared by all nodes. In addition, the local copy-on-write layer for containers allows Cider to achieve scalability whilst improving cost-effectiveness during the holistic deployment. Experimental results reveal that Cider can shorten the overall deployment time by 85% and 62% on average when deploying one container and 100 concurrent containers respectively.
Published: 2017

15. ScalaRDF: A Distributed, Elastic and Scalable In-Memory RDF Triple Store

Author: Xixu Wang, Renyu Yang, Tianyu Wo, and Chunming Hu
Subjects: Computer science, business.industry, Distributed computing, 02 engineering and technology, computer.file_format, Query language, 020204 information systems, Scalability, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, RDF, business, Semantic Web, computer, Computer network
Abstract: The Resource Description Framework (RDF) and SPARQL query language are gaining increasing popularity and acceptance. The ever-increasing RDF data has reached a billion scale of triples, resulting in the proliferation of distributed RDF store systems within the Semantic Web community. However, the elasticity and performance issues are still far from settled in face of data volume explosion and workload spike. In addition, providers face great pressures to provision uninterrupted reliable storage service whilst reducing the operational costs due to a variety of system failures. Therefore, how to efficiently realize system fault tolerance remains an intractable problem. In this paper, we introduce ScalaRDF, a distributed and elastic in-memory RDF triple store to provision a fault-tolerant and scalable RDF store and query mechanism. Specifically, we describe a consistent hashing protocol that optimizes the RDF data placement, data operations (especially for online RDF triple update operations) and achieves an autonomously elastic data re-distribution in the event of cluster node joining or departing, avoiding the holistic oscillation of data storage. In addition, the data store is able to realize a rapid and transparent failover through replication mechanism which stores in-memory data replica in the next hash hop. The experiments demonstrate that query time and update time are reduced by 87% and 90% respectively compared to other approaches. For an 18G source dataset, the data redistribution takes at most 60 seconds when system scales out and at most 100 seconds for recovery when nodes undergo crash-stop failures.
Published: 2016
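
Illustrative note: the abstract above describes consistent hashing for data placement, with an in-memory replica stored at the next hash hop. The record does not give ScalaRDF's actual protocol, so the following is only a textbook-style sketch of that general idea. The HashRing class, its method names, the use of MD5, and the single ring position per node are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit position on the ring; MD5 is used for spread, not security.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with one replica on the next node clockwise."""

    def __init__(self, nodes=()):
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        point = _hash(node)
        bisect.insort(self._points, point)
        self._owner[point] = node

    def remove_node(self, node: str) -> None:
        point = _hash(node)
        self._points.remove(point)
        del self._owner[point]

    def locate(self, key: str):
        """Return the (primary, replica) nodes responsible for a key."""
        if not self._points:
            raise ValueError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        primary = self._owner[self._points[i]]
        replica = self._owner[self._points[(i + 1) % len(self._points)]]
        return primary, replica


ring = HashRing(["node-1", "node-2", "node-3", "node-4"])
keys = [f"triple-{i}" for i in range(200)]
before = {k: ring.locate(k) for k in keys}
ring.remove_node("node-2")   # simulate a node departing or crashing
after = {k: ring.locate(k) for k in keys}
```

In this sketch, only keys whose primary was the departed node change owner, and they move to the node that already held their replica, which is the property that makes failover and scale-in cheap.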

16. Optimizing Virtual Machine Live Migration without Shared Storage in Hybrid Clouds

Author: Chunming Hu, Tianyu Wo, Bin Shi, Bo Li, and He Shan
Subjects: Computer science, business.industry, Distributed computing, Working set, 020206 networking & telecommunications, Hypervisor, Cloud computing, 02 engineering and technology, computer.software_genre, Storage hypervisor, Virtual machine, 0202 electrical engineering, electronic engineering, information engineering, Operating system, 020201 artificial intelligence & image processing, Data diffusion machine, business, computer, Live migration
Abstract: Virtual machine live migration technology allows a running VM to migrate from one physical host to another with no impact on users. As the scale of distributed computing grows, the hybrid cloud, an integrated cloud service that uses both private and public clouds to perform distinct functions within the same organization, has become a hotspot for both academia and industry. Thus, it will be vital to migrate virtual machines in hybrid clouds. The biggest issue that VM migration in hybrid clouds faces today is how to transfer the large amount of storage data. Storage migration of virtual machines with existing mechanisms of memory migration has been studied for a long time. Unfortunately, none of these mechanisms performs perfectly without modification. It is compelling to consider a new migration model and optimization methods. In this paper, we analyze the advantages and disadvantages of classical methods and put forward a method that combines the strengths of the pre-copy and post-copy migration models; we also use the disk working set to optimize virtual machine storage migration. We have developed and implemented our approach on the QEMU/KVM hypervisor and run a series of experiments. The presented technique shows a reduction of up to 20.5% on average in the total transfer time for most of the selected scenarios. Our approach can also reduce the migration downtime in most cases.
Published: 2016

17. D^2PS: A Dependable Data Provisioning Service in Multi-tenant Cloud Environment

Author: Mingming Zhang, Tianyu Wo, Jie Xu, Chunming Hu, and Renyu Yang
Subjects: Multitenancy, business.industry, Computer science, Distributed computing, Software as a service, 020206 networking & telecommunications, Cloud computing, Provisioning, 02 engineering and technology, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Page cache, Cache, Data as a service, business, Cloud storage
Abstract: Software as a Service (SaaS) is a software delivery and business model widely used by Cloud computing. Instead of purchasing and maintaining a software suite permanently, customers only need to lease the software on-demand. The domain of high assurance distributed systems has focused greatly on the areas of fault tolerance and dependability. In a multi-tenant context, it is particularly important to store, manage and provision data services to customers in a highly efficient and dependable manner due to a large number of file operations involved in running such services. It is also desirable to allow a user group to share and cooperate (e.g., co-edit) on some specific data. In this paper we present a dependable data provisioning service in a multitenant Cloud environment. We describe a metadata management approach and leverage multiple replicated metadata caching to shorten the file access time, with the improved efficiency of data sharing. In order to reduce frequent data transmission and data access latency, we introduce a distributed cooperative disk cache mechanism that supports effective cache placement and pull-push cache synchronization. In addition, we use efficient component failover to enhance the service dependability whilst avoiding negative impact from system failures. Our experimental results show that our system can significantly reduce both unused data transmission and response latency. Specifically, over 50% network transmission and operational latency can be saved for random reads while 28.24% network traffic and 25% response latency can be reduced for random write operations. We believe that these findings are demonstrating positive results along the right direction of resolving storage-related challenges in a multitenant Cloud environment.
Published: 2016

18. CyberGuarder: A virtualization security assurance architecture for green cloud computing

Author: Jianxin Li, Bo Li, Lu Liu, Tianyu Wo, Chunming Hu, Jinpeng Huai, and K. P. Lam
Subjects: Cloud computing security, Computer Networks and Communications, business.industry, Computer science, Distributed computing, Access control, Cloud computing, Computer security model, computer.software_genre, Virtualization, Computer security, Security information and event management, Security service, Hardware and Architecture, Virtual machine, Software deployment, Software security assurance, Information technology management, Scalability, business, computer, Virtual network, Software
Abstract: As the sizes of IT infrastructure continue to grow, cloud computing is a natural extension of virtualisation technologies that enable scalable management of virtual machines over a plethora of physically connected systems. The so-called virtualisation-based cloud computing paradigm offers a practical approach to green IT/clouds, which emphasise the construction and deployment of scalable, energy-efficient network software applications (NetApp) by virtue of improved utilisation of the underlying resources. The latter is typically achieved through increased sharing of hardware and data in a multi-tenant cloud architecture/environment and, as such, accentuates the critical requirement for enhanced security services as an integrated component of the virtual infrastructure management strategy. This paper analyses the key security challenges faced by contemporary green cloud computing environments, and proposes a virtualisation security assurance architecture, CyberGuarder, which is designed to address several key security problems within the 'green' cloud computing context. In particular, CyberGuarder provides three different kinds of services; namely, a virtual machine security service, a virtual network security service and a policy based trust management service. Specifically, the proposed virtual machine security service incorporates a number of new techniques which include (1) a VMM-based integrity measurement approach for NetApp trusted loading, (2) a multi-granularity NetApp isolation mechanism to enable OS user isolation, and (3) a dynamic approach to virtual machine and network isolation for multiple NetApp's based on energy-efficiency and security requirements. 
Secondly, a virtual network security service has been developed successfully to provide an adaptive virtual security appliance deployment in a NetApp execution environment, whereby traditional security services such as IDS and firewalls can be encapsulated as VM images and deployed over a virtual security network in accordance with the practical configuration of the virtualised infrastructure. Thirdly, a security service providing policy based trust management is proposed to facilitate access control to the resources pool and a trust federation mechanism to support/optimise task privacy and cost requirements across multiple resource pools. Preliminary studies of these services have been carried out on our iVIC platform, with promising results. As part of our ongoing research in large-scale, energy-efficient/green cloud computing, we are currently developing a virtual laboratory for our campus courses using the virtualisation infrastructure of iVIC, which incorporates the important results and experience of CyberGuarder in a practical context.
Published: 2012

19. A secure collaboration service for dynamic virtual organizations

Author: Yanmin Zhu, Jianxin Li, Chunming Hu, and Jinpeng Huai
Subjects: Service (systems architecture), Information Systems and Management, Virtual organization, Computer science, business.industry, Authorization, Security policy, Computer security, computer.software_genre, Computer Science Applications, Theoretical Computer Science, Shared resource, Artificial Intelligence, Control and Systems Engineering, Distributed algorithm, Scalability, Systems architecture, The Internet, business, computer, Software
Abstract: Nowadays, various promising paradigms of distributed computing over the Internet, such as Grids, P2P and Clouds, have emerged for resource sharing and collaboration. To enable resource sharing and collaboration across different domains in an open computing environment, virtual organizations (VOs) often need to be established dynamically. However, the dynamic and autonomous characteristics of participating domains pose great challenges to the security of virtual organizations. In this paper, we propose a secure collaboration service, called PEACE-VO, for dynamic virtual organizations management. The federation approach based on role mapping has extensively been used to build virtual organizations over multiple domains. However, there is a serious issue of potential policy conflicts with this approach, which brings a security threat to the participating domains. To address this issue, we first depict concepts of implicit conflicts and explicit conflicts that may exist in virtual organization collaboration policies. Then, we propose a fully distributed algorithm to detect potential policy conflicts. With this algorithm, participating domains do not have to disclose their full local privacy policies, and the algorithm is able to withstand malicious internal attacks. Finally, we present the system architecture of PEACE-VO and design two protocols for VO management and authorization. PEACE-VO services and protocols have successfully been implemented in the CROWN test bed. Comprehensive experimental study demonstrates that our approach is scalable and efficient.
Published: 2010

20. CROWN: A service grid middleware with trust management mechanism

Author: Jianxin Li, Tianyu Wo, Jinpeng Huai, Hailong Sun, and Chunming Hu
Subjects: General Computer Science, Data grid, business.industry, Computer science, Testbed, Access control, computer.software_genre, Grid computing, Middleware (distributed applications), Trust management (information system), Resource management, Web service, business, computer, Computer network
Abstract: Based on a proposed Web service-based grid architecture, a service grid middleware system called CROWN is designed in this paper. As the two kernel points of the middleware, the overlay-based distributed grid resource management mechanism is proposed, and the policy-based distributed access control mechanism with the capability of automatic negotiation of the access control policy and trust management and negotiation is also discussed in this paper. Experience of CROWN testbed deployment and application development shows that the middleware can support the typical scenarios such as computing-intensive applications, data-intensive applications and mass information processing applications.
Published: 2006

21. dIRIEr: Distributed Influence Maximization in social network

Author: Chunming Hu, Zhou Zong, and Bo Li
Subjects: Brooks–Iyengar algorithm, Social network, business.industry, Computer science, Distributed algorithm, Computation, Distributed computing, Scalability, Overhead (computing), Maximization, business, Synchronization
Abstract: In this paper, for the first time, we tackle the scalability problem of Influence Maximization (IM) via distributed computing. First, we propose a distributed IM algorithm based on IRIE, one of the most state-of-the-art IM algorithms. Then an incremental updating method is proposed to reduce the overhead of repeated computation. Furthermore, based on some new insights, we redesign our algorithm with a strategy, which we call reservoir, to accumulate increments and delay exchange between machines. Experiments on real-world and synthetic networks show our redesigned algorithm, i.e. dIRIEr (distributed IRIE with Reservoir), reduces communication traffic dramatically and speeds up continuously as more machines are added in. dIRIEr can handle giant networks with hundreds of millions of nodes where centralized algorithms become infeasible.
Published: 2014

22. Improving utilization through dynamic VM resource allocation in hybrid cloud environment

Author: Renyu Yang, Wang Yuda, Tianyu Wo, Chunming Hu, and Wenbo Jiang
Subjects: Computer science, business.industry, Distributed computing, Quality of service, Big data, Cloud computing, Workload, Virtualization, computer.software_genre, Shared resource, Virtual machine, High availability, Operating system, Resource allocation, business, computer
Abstract: Virtualization is one of the most fascinating techniques because it can facilitate the infrastructure management and provide isolated execution for running workloads. Despite the benefits gained from virtualization and resource sharing, improved resource utilization is still far from settled due to the dynamic resource requirements and the widely-used over-provision strategy for guaranteed QoS. Additionally, with the emerging demands for big data analytics, how to effectively manage hybrid workloads such as traditional batch tasks and long-running virtual machine (VM) services needs to be dealt with. In this paper, we propose a system to combine long-running VM service with typical batch workload like MapReduce. The objective is to improve the holistic cluster utilization through a dynamic resource adjustment mechanism for VMs without violating other batch workload executions. Furthermore, VM migration is utilized to ensure high availability and avoid potential performance degradation. The experimental results reveal that the dynamically allocated memory is close to the real usage with only 10% estimation margin, and the performance impacts on VM and MapReduce jobs are both within 1%. Additionally, at most 50% increment of resource utilization could be achieved. We believe that these findings are in the right direction to solving workload consolidation issues in hybrid computing environments.
Published: 2014

23. Collaborative Design of Site-scale Green Infrastructure: A Case Study on the Ecological Restoration Design of Weiliu Wetland Park in Xianyang

Author: Xin Wang, Chunming Hu, Minwei Chai, Bo Luan, and Yueyan Jin
Subjects: geography, geography.geographical_feature_category, business.industry, Environmental resource management, Environmental science, Wetland, Collaborative design, Green infrastructure, Scale (map), business
Published: 2017

24. Demo

Author: Chunming Hu, Anran Wang, Chunyi Peng, Jinpeng Huai, Guobin Shen, and Shuai Ma
Subjects: Focus (computing), business.industry, Computer science, Reliability (computer networking), Visible light communication, Throughput, Barcode, law.invention, Transmission (telecommunications), Secure communication, law, Embedded system, File transfer, business, Computer hardware
Abstract: Visible light communication (VLC) over screen-camera links emerges as a novel form of near-field communication, and it offers a user-friendly, infrastructure-less and secure communication, which is highly competitive for one-time file transfer [1-4]. However, the limitations of smart devices and the uncertainty of user behaviors seriously impair the transmission reliability and hinder its applicability. Worse still, existing approaches [1, 2, 4] mostly focus on improving the transmission speed and ignore the transmission reliability. Hence, RDCode is proposed to boost the throughput over screen-camera links, by making use of a novel barcode design and several effective techniques to enhance the transmission reliability. In this demo, we show that our RDCode prototype system addresses many practical challenges. A short video on our prototype system is accessible from http://mashuai.buaa.edu.cn/demo/RDCode.mp4.
Published: 2014

25. VMCSnap: Taking Snapshots of Virtual Machine Cluster with Memory Deduplication

Author: Huang Yumei, Bo Li, Renyu Yang, Lei Cui, Chunming Hu, and Tianyu Wo
Subjects: Computer science, business.industry, Real-time computing, Cloud computing, Virtualization, computer.software_genre, Bottleneck, Shared memory, Virtual machine, Virtual memory, Data_FILES, Snapshot (computer storage), Data diffusion machine, business, computer
Abstract: Virtualization is one of the main technologies currently used to deploy computing systems due to the high reliability and rapid crash recovery it offers in comparison to physical nodes. These features are mainly achieved by continuously producing snapshots of the status of running virtual machines. In earlier works, the snapshot of each individual VM is performed independently, ignoring the memory similarities between VMs within the cluster. When the size of the virtual cluster becomes larger or snapshots are frequently taken, the size of snapshots can be extremely large, consuming a large amount of storage space. In this paper, we introduce an innovative snapshot approach for virtual clusters that exploits shared memory pages among all the component VMs to reduce the size of the produced snapshot and mitigate the I/O bottleneck. The duplicate memory pages are effectively discovered and stored only once when the snapshot is taken. In addition, our approach can be applied not only to the stop-copy snapshot but also to the pre-copy mechanism. Experiments on both snapshot methods are conducted and the results show that our method can reduce the total memory snapshot files by an average of 30% and reach a 63% reduction of the snapshot time compared with the default KVM approach, with little overhead of rollback time.
Published: 2014

26. CloudAP: Improving the QoS of Mobile Applications with Efficient VM Migration

Author: Renyu Yang, Lei Cui, Kang Junbin, Yunkai Zhang, Chunming Hu, and Tianyu Wo
Subjects: User experience design, business.industry, Computer science, Quality of service, Mobile station, Distributed computing, Mobile computing, Mobile search, Data center, Cloud computing, business, Mobile device, Computer network
Abstract: Mobile computing is increasingly growing in terms of massive computation as well as user demand and use of mobile devices. Remote execution techniques enrich the service experience of mobile devices by leveraging the resource pools of computation and storage capabilities of the cloud data center. However, the user experience and quality of service will be severely affected due to the inherent high latency and low bandwidth of a WAN environment. In this paper, we introduce a cloud base station "CloudAP" which is a small-scale computing infrastructure close to mobile users with local network access. In addition, we present a two-tier architecture consisting of CloudAP and Cloud center and show how to synthesize them to form a general computing environment. Furthermore, we propose a prompt execution environment migration scheme implemented by an efficient whole-system VM migration. It makes the execution environment move following the location of mobile device. Our experimental results demonstrate that the proposed architecture is effective and the execution environment migration scheme is efficient, consisting of up to 10ms and 30s for service downtime and execution environment switch time respectively. These improvements make vital contributions to user experience and QoS in mobile pervasive environment.
Published: 2013

27. A Memory Deduplication Approach Based on Group in Virtualized Environments

Author: Deng Yan, Bo Li, Chunming Hu, Tianyu Wo, and Lei Cui
Subjects: Computer science, business.industry, Distributed computing, Hash function, Cloud computing, Thread (computing), Trusted Computing, computer.software_genre, Virtualization, Virtual machine, Multithreading, Operating system, Data deduplication, business, computer
Abstract: The combination of cloud computing and virtualization technology introduces a new pattern on resource allocation and utilization. Memory scanning deduplication techniques based on eliminating duplicated pages among virtual machines can promote the resource utilization, and decrease the total cost of ownership. However, the existing memory deduplication technologies lack the supporting of isolation and trustworthiness mechanism. This paper proposes a memory sharing mechanism based on user groups. This mechanism guarantees isolation between the different users on the same host. In addition, we designed a sampling hash algorithm to make the memory scanning process more efficient. We have implemented our approach in Linux by modifying the KSM scanning mechanism and splitting the global ksmd thread into per-group ksmds. The experiment results show the work can optimize the memory-intensive VMs, and efficiently accelerate the memory scanning process.
Published: 2013

28. iScreen: A Merged Screen of Local System with Remote Applications in a Mobile Cloud Environment

Author: Qi Song, Jian Kang, Chunming Hu, Jianxin Li, and Weiren Yu
Subjects: business.industry, Computer science, Mobile computing, Cloud computing, Frame rate, computer.software_genre, Mobile cloud computing, Local system, Remote administration, Operating system, User interface, business, Mobile device, computer
Abstract: With the convergence of cloud computing and mobile computing, mobile devices can access remote applications in a cloud environment. However, existing research work mostly focused on leveraging cloud capabilities to enhance mobile clients. Particularly, in order to access different cloud platforms and applications, specific versions of clients, such as Web portal or Remote Desktop, are generally required. The original display and interaction experience on the client local system are changed. This paper presents an approach named iScreen which keeps a consistent display and interaction experience between local and remote applications in a mobile cloud computing environment. iScreen consists of a three-factor merging framework including windows merging, meta-info merging and interaction merging for applications. We developed a prototype of iScreen for Windows applications to allow thin clients to seamlessly access remote cloud windows applications. Experimental studies show that iScreen can effectively merge local desktop with remote display, and mobile clients can achieve 20 frames per second when running a remote video playback application. The bandwidth usage of iScreen is reduced by about 10% compared to Ultra VNC. It performs better especially under high-motion scenarios.
Published: 2013

29. NeTrOS: A Virtual Computing Environment towards Instant Service of Network Software

Author: Tianyu Wo, Jianxin Li, Chunming Hu, and Jinpeng Huai
Subjects: Full virtualization, Computer science, business.industry, Distributed computing, Temporal isolation among virtual machines, Local area network, Cloud computing, Virtualization, computer.software_genre, Shared resource, Virtual machine, business, computer, Host (network)
Abstract: Newly emerging requirements of network application instant service bring in the challenges of a flexible, scalable, and reliable resource environment. Virtualization based resource isolation is helpful. However, the internal structure of network software is often neglected in current virtualization based solutions. To improve resource sharing and utilization, the granularity of isolation should evolve from a single host to a network. NeTrOS, a virtual resource operating platform, is proposed. Novel core concepts of cyberlet and cyber-interrupt are introduced to express the first class runtime managed object and the evolution of virtual computing environment. Key technologies of virtual machine networks (VMNs), the implementation binding of cyberlet in NeTrOS, are discussed. A programmable VMN architecture is proposed. Efficient maintenance mechanisms are involved to support the connectivity and dynamic nature of VMNs. Migrations of VMN in both LAN and WAN environment are supported. Detailed implementation and deployment of NeTrOS are presented. A series of system evaluations is conducted to show the effectiveness and efficiency of NeTrOS. Experience of a mobile Internet application case supported by NeTrOS is also discussed.
Published: 2012

30. Radiata: Enabling Whole System Hot-mirroring via Continual State Replication

Author: Tianyu Wo, Yang Chen, and Chunming Hu
Subjects: Service (systems architecture), business.industry, Computer science, Distributed computing, Fault tolerance, computer.software_genre, Replication (computing), Asynchronous communication, Virtual machine, Server, Embedded system, Overhead (computing), State (computer science), business, computer
Abstract: Checkpoint-recovery based on system virtualization is an attractive approach for providing the transparent and economic fault tolerance service in virtualized environments. The previous approaches introduce either great performance degradation or complex implementation issues. In this work, we propose a whole system hot-mirroring platform, namely Radiata, to provide fault-tolerance for any type of service by encapsulating the service instance into a virtual machine, and hot-mirroring the state changes of the virtual machine via the continual state replication. Our approach exploits three key optimizations for further reduction of the performance overhead: the asynchronous state replication, the COW-based memory checkpoint and the dirty page prediction. Based on the KVM platform, we have implemented the prototype system. The comprehensive evaluations under a variety of workloads demonstrate that Radiata is able to effectively support rapid and transparent fail-over in case of unexpected hardware failure, and outperforms the existing mechanisms in terms of the performance degradation in failure-free condition.
Published: 2012

31. DVCE: The Virtual Computing Environment Supported by Distributed VM Images

Author: Yabing Cui, Chunming Hu, Tianyu Wo, and Hanwen Wang
Subjects: sysfs, business.industry, Computer science, Distributed computing, Temporal isolation among virtual machines, Cloud computing, Kernel virtual address space, computer.software_genre, Virtual finite-state machine, Utility computing, Virtual machine, Data_FILES, Operating system, business, Data diffusion machine, computer
Abstract: Compared to the conventional physical cluster, the virtual cluster technology has features such as flexible configuration, easy management and high system security, which provides users with an easily customized computing environment. Virtual machine image files as the most important stored objects in the virtual computing environment include all the information about the whole life cycle of virtual machine. The storage of the virtual machine images is an important part of virtual computing. There are mainly two ways to store the images, which include local storage and networking storage. This paper proposes a mechanism to manage the distributed image files to support the virtual computing environment. It uses the computing nodes' local storage as a chunk server in a distributed file system, and regards the distribution of the image files as a primary factor to choose the computing node to deploy virtual machine and to implement a backup mechanism of distributed images based on historical records of image files. Functional test and performance test results show that it can effectively support the virtual computing environment and has great values of applications and research.
Published: 2012

32. Overbooking-Based Resource Allocation in Virtualized Data Center

Author: Chunming Hu, Tianyu Wo, Bo Li, and Qian Sun
Subjects: Resource (project management), Revenue management, Computer science, business.industry, Quality of service, Bandwidth (computing), Resource allocation, Revenue, Resource management, Cloud computing, business, Computer network
Abstract: Efficient resource management in the virtualized data center is always a practical concern and has attracted significant attention. In particular, an economic allocation mechanism is desired to maximize the revenue for commercial cloud providers. This paper uses overbooking from Revenue Management to avoid resource over-provision according to its runtime demand. We propose an economic model to control the overbooking policy while providing users with a probability-based performance guarantee using risk estimation. To cooperate with the overbooking policy, we optimize the VM placement with a traffic-aware strategy to satisfy the application's QoS requirement. We design the GreedySelePod algorithm to achieve traffic localization in order to reduce network bandwidth consumption, especially the network bottleneck bandwidth, and thus to accept more requests and increase the revenue in the future. The simulation results show that our approach can greatly improve the request acceptance rate and increase the revenue by up to 87% with an acceptable level of resource conflict.
Published: 2012

33. A Virtual File System for Streaming Loading of Virtual Software on Windows NT

Author: Chunming Hu, Yabing Cui, Hanwen Wang, and Tianyu Wo
Subjects: Resource-oriented architecture, business.industry, Computer science, Software as a service, Software development, computer.software_genre, Virtual file system, Software deployment, Embedded system, Component-based software engineering, Operating system, Backporting, Software system, business, computer
Abstract: With the growing popularity and development of cloud computing and virtualization technology, Software as a Service (SaaS) has become an innovative software delivery model. In this environment, if the virtual software runs in the traditional way, that is, starting up only after it is completely downloaded, it will be time-consuming and greatly influence the users' experience. However, the streaming execution mode will enable the virtual software to start up while downloading. According to this conception, we design and implement a virtual file system for streaming delivery of software. The experimental results show that the first startup times of virtualized software have been reduced by 20% to 60% and the users' experience has been improved effectively.
Published: 2012

34. Muse

Author: Jianxin Li, Liang Zhong, Chunming Hu, and Weiren Yu
Subjects: Multimedia, Computer science, business.industry, Desktop virtualization, Real-time computing, Video quality, computer.software_genre, Mobile cloud computing, High-motion, Codec, The Internet, business, Encoder, Mobile device, computer
Abstract: In recent years we have witnessed the rapid advent of mobile cloud computing, in which remote software is delivered as a service and accessed by mobile device users over the Internet. However, most existing remote display technologies for high motion applications (e.g., movies) have defects in latency and bandwidth. In this paper, we designed an adaptive multimedia streaming enabled remote interactivity system, Muse, to utilize remote resources with reduced display update traffic and response latency. A window-aware updating mechanism is designed as an adaptation scheme, which allows users to focus on the current application in use and also enables them to switch between applications on the fly. Besides, a windowed display encoder using H.264 video codec is integrated into the remote frame buffer protocol to achieve high performance in compression to address the high latency limitation of mobile Internet. Experimental results show that the windowed display Muse mechanism can successfully reduce network traffic, loading time and response latency of remote display and interaction. Our system can achieve on average 22fps of 1024*768 desktop multimedia playbacks with good video quality under 1 Mbit/s of bandwidth limitation.
Published: 2011

35. VMDetector: A VMM-based Platform to Detect Hidden Process by Multi-view Comparison

Author: Bo Li, Chunming Hu, and Ying Wang
Subjects: Full virtualization, Hardware virtualization, Computer science, business.industry, Rootkit, Hypervisor, computer.software_genre, Virtualization, Virtual machine, Embedded system, Operating system, Malware, The Internet, business, computer
Abstract: Recently, "rootkit" has become a popular form of hacker malware on the Internet, which controls the hosts on the Internet by hiding itself, and raises a serious security threat. Existing host-based and hardware-based solutions have some disadvantages, such as hardware overhead and being discovered by rootkits, whereas the development of virtualization technology provides a better solution to avoid those. Virtual machine monitor has the highest authority on the virtual machine, and has the right to control the activities in the virtual machine without being found by rootkits in the virtual machines. We propose VMDetector based on this hardware virtualization technology, using multi-view detection mechanism, to detect hidden processes inside the virtual machine on many aspects, then to improve the virtual machine's security. Through several experiments, VMDetector carried on the process detection effectively, and introduced less than 10% performance overhead.
Published: 2011

36. CREST: Towards Fast Speculation of Straggler Tasks in MapReduce

Author: Lei Lei, Tianyu Wo, and Chunming Hu
Subjects: Web indexing, Utility computing, Computer science, business.industry, Distributed computing, Search engine indexing, Locality, Programming paradigm, The Internet, Cloud computing, Parallel computing, business, Scheduling (computing)
Abstract: Data-Intensive Computing emerges as the fourth paradigm for modern scientific discoveries. MapReduce, a programming paradigm for large-scale data-parallel applications, is widely applied to web indexing, machine learning, and scientific simulations in industries as well as in academia. Recently, the virtualized "utility computing" environments, such as campus cloud, are becoming an important scenario to run MapReduce jobs. For a MapReduce job, the straggler tasks may dominate the response time and delay the whole job. Various speculation schemes have been proposed to alleviate this problem; however, most of them implicitly assume that the time cost for data movement on launching speculative map tasks is trivial, which does not always hold for the virtualized Hadoop clusters in a campus cloud. In this paper, we propose a novel approach, CREST (Combination Re-Execution Scheduling Technology), which can achieve the optimal running time for speculative map tasks and decrease the response time of MapReduce jobs. The main idea is that re-executing a combination of tasks on a group of computing nodes may progress faster than directly speculating the straggler task on the target node, due to data locality. The evaluation validates our approach and demonstrates that CREST can reduce the running time of a speculative map task by 70% in the best cases and 50% on average, compared with LATE.
Published: 2011

37. A VMM-Based System Call Interposition Framework for Program Monitoring

Author: Chunming Hu, Liang Zhong, Bo Li, Tianyu Wo, and Jianxin Li
Subjects: Computer science, business.industry, Hypervisor, computer.software_genre, System monitoring, Virtualization, System call, Virtual machine, Embedded system, Operating system, Malware, Overhead (computing), business, computer
Abstract: System call interposition is a powerful method for regulating and monitoring program behavior. A wide variety of security tools have been developed which use this technique. However, traditional system call interposition techniques are vulnerable to kernel attacks and have some limitations on effectiveness and transparency. In this paper, we propose a novel approach named VSyscall, which leverages virtualization technology to enable system call interposition outside the operating system. A system call correlating method is proposed to identify the coherent system calls belonging to the same process from the system call sequence. We have developed a prototype of VSyscall and implemented it in two mainstream virtual machine monitors, Qemu and KVM, respectively. We also evaluate the effectiveness and performance overhead of our approach by comprehensive experiments. The results show that VSyscall achieves effectiveness with a small overhead, and our experiments with six real-world applications indicate its practicality.
Published: 2010

38. Resilient Virtual Network Service Provision in Network Virtualization Environments

Author: Jianxin Li, Wantao Liu, Yang Chen, Chunming Hu, and Tianyu Wo
Subjects: Service (systems architecture), business.industry, Computer science, Distributed computing, Network virtualization, Intelligent computer network, Bandwidth allocation, Scalability, Bandwidth (computing), Resource allocation, Resource management, business, Virtual network, Computer network
Abstract: Network Virtualization has recently emerged to provide scalable, customized and on-demand virtual network (VN) services over a shared substrate network. How to provide VN services with resiliency guarantees against network failures has become a critical issue; meanwhile, service resource usage should be minimized under strict constraints such as link bandwidth capacity and service resiliency guarantees. In this paper, we present a resource allocation algorithm to balance the tradeoff between service resource consumption and service resiliency. By exploiting a heuristic VN mapping scheme and a restoration path selection scheme based on intelligent bandwidth sharing, the algorithm simultaneously makes cost-effective use of network resources and protects VN services against network failures. We perform evaluations and find that the algorithm is near optimal in terms of network resource usage, especially the additional restoration bandwidth cost for resiliency protection.
Published: 2010

39. A Prefetching Framework for the Streaming Loading of Virtual Software

Author: Bo Li, Chunming Hu, Liang Zhong, Haibing Zheng, Junbin Kang, and Tianyu Wo
Subjects: business.industry, Computer science, Software as a service, Cloud computing, Virtualization, computer.software_genre, Software, User experience design, Server, Hit rate, Operating system, The Internet, business, computer, Block (data storage)
Abstract: In recent years, Software as a Service, largely enabled by the Internet, has become an innovative software delivery model. During the streaming execution of virtualization software, execution stalls until the missing data has been downloaded, which greatly degrades the user experience. In this paper, we present a block-level prefetching framework for the streaming delivery of software, based on an N-Gram prediction model and an incremental data mining algorithm. The prefetching framework mines the historical block access logs, then dynamically updates and refines the prefetching rules. The experimental results show that this prefetching framework reduces launch time by 10% to 50% and achieves a hit rate between 81% and 97%.
Published: 2010
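
The N-Gram prediction idea mentioned in the abstract of entry 39 can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `NGramPrefetcher`, its methods, and the use of integer block IDs are all assumptions made for illustration. The sketch counts which block tends to follow each short run of recent blocks in an access log, and suggests the most frequent successors as prefetch candidates.

```python
from collections import Counter


class NGramPrefetcher:
    """Hypothetical N-gram block predictor (illustrative only)."""

    def __init__(self, n=3):
        if n < 2:
            raise ValueError("n must be at least 2")
        self.n = n          # an n-gram is (n - 1) context blocks plus one successor
        self.table = {}     # context tuple -> Counter of successor block IDs

    def train(self, access_log):
        """Add one access log (a sequence of block IDs) to the counts.

        Calling this repeatedly with new logs updates the rules incrementally.
        """
        k = self.n - 1
        for i in range(len(access_log) - k):
            context = tuple(access_log[i:i + k])
            successor = access_log[i + k]
            self.table.setdefault(context, Counter())[successor] += 1

    def predict(self, recent_blocks, top=1):
        """Return up to `top` blocks most often seen after the recent context."""
        context = tuple(recent_blocks[-(self.n - 1):])
        counts = self.table.get(context)
        if counts is None:
            return []       # unseen context: no prefetch suggestion
        return [block for block, _ in counts.most_common(top)]


prefetcher = NGramPrefetcher(n=3)
prefetcher.train([1, 2, 3, 4, 1, 2, 3, 5, 1, 2, 3])
suggestion = prefetcher.predict([1, 2])   # blocks likely to follow (1, 2)
```

A larger `n` makes predictions more specific but needs more log data before a context has been seen; the abstract does not state which value the authors use.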

40. A Multi-agents Contractual Approach to Incentive Provision in Non-cooperative Networks

Author: Li Lin, Xianxian Li, Yanmin Zhu, Jinpeng Huai, and Chunming Hu
Subjects: Mechanism design, business.industry, Principal (computer security), Pareto principle, PMAC, Computer security, computer.software_genre, law.invention, symbols.namesake, Relay, law, Nash equilibrium, Node (computer science), Collusion, Economics, symbols, business, computer, Computer network
Abstract: Recent years have witnessed the increasing importance of exchanging information over computer networks or distributed systems. Two end nodes wishing to communicate often rely on independent intermediate nodes to relay messages. In consideration of the rational nature of both the end nodes and the intermediate nodes, we have to accommodate two inherently coexistent games: one played between the end nodes and the intermediate nodes, and the other played among the intermediate nodes. This is particularly challenging due to the well-known hidden information and hidden action issues. In this paper we propose a holistic approach, PMAC, to address the two games, exploiting the principal and multi-agents model creatively. In PMAC, the end nodes make contracts with each intermediate node. The contracts together produce good system properties, which are twofold. First, it is guaranteed that the utility of the end nodes is maximized. Second, it is proved that the cooperation of the intermediate nodes can be induced, since there exists a Nash equilibrium for the intermediate nodes. However, one serious issue is that there may be other Pareto-superior Nash equilibria, which inevitably hinders the unique implementation of the contracts. We also adopt a technique without incurring any additional cost to the end nodes. By knocking out the other redundant Nash equilibria in the intermediate nodes' game, we ensure that the equilibrium most desired by the end pair is successfully achieved.
Published: 2008

41. CIVIC: A Hypervisor Based Virtual Computing Environment

Author: Qin Li, Jinpeng Huai, and Chunming Hu
Subjects: Computer science, business.industry, Distributed computing, Hypervisor, computer.software_genre, Virtual computing, Storage hypervisor, Encapsulation (networking), Software, Virtual machine, Software deployment, Operating system, business, Virtual network, computer, Resource utilization, Live migration
Abstract: The purpose of a virtual computing environment is to improve resource utilization by providing a unified, integrated operating platform for users and applications based on the aggregation of heterogeneous and autonomous resources. With the rapid development in recent years, hypervisor technologies have become mature and comprehensive, with four features: transparency, isolation, encapsulation and manageability. In this paper, a hypervisor-based virtual computing infrastructure, named CIVIC, is proposed. Compared with existing approaches, CIVIC offers several benefits. It offers separate, isolated computing environments for end users, and realizes hardware and software consolidation and centralized management. Besides this, CIVIC provides a transparent view to upper-layer applications by hiding the dynamicity, distribution and heterogeneity of the underlying resources. The performance of the infrastructure is evaluated by an initial deployment and experiment. The results show that CIVIC can facilitate the installation, configuration and deployment of network-oriented applications.
Published: 2007

42. A Micro-Controller Based Control Unit for Motorcycle Engines to Meet Emission and OBD Requirements

Author: Tervin Tan, Mingde Hao, Nenghui Zhou, Chunming Hu, Hui Xie, and R Tan
Subjects: Engineering, Microcontroller, business.industry, Control unit, business, Automotive engineering
Published: 2006

43. Flexible resource reservation using slack time for service grid

Author: Jinpeng Huai, Tianyu Wo, and Chunming Hu
Subjects: Least slack time scheduling, Computer science, business.industry, Distributed computing, Quality of service, Reservation, Admission control, computer.software_genre, Grid, Grid computing, Middleware, Resource allocation, Resource management, business, computer, Resource utilization, Computer network
Abstract: Providing guaranteed QoS for grid services using resource reservation and allocation is an important feature for today's service grid. Reservation requests with existing mechanisms are often rejected during resource utilization peaks and lead to the resource capacity fragmentation problem. In this paper, we propose a flexible capacity reservation mechanism, called FIRST, which employs slack time-enabled request admission control with differentiated selection strategies. We implement the prototype of FIRST in our CROWN node server, the service container of the CROWN service grid middleware. The performance of the admission control algorithm for FIRST is evaluated by comprehensive simulations and implementations. Experimental results show that a better resource utilization ratio can be achieved by introducing slack time into the fixed reservation scenario, and that a min-min-based request selection strategy achieves better performance than existing strategies.
Published: 2006
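
To illustrate why slack time can help admission control, as the abstract of entry 43 reports, here is a minimal sketch under stated assumptions. It is not the FIRST algorithm: the `Request` fields, the `admit` function, the discrete time slots, and the first-fit choice of start slot are all hypothetical, and the min-min request selection strategy from the abstract is not modelled. A request with zero slack must start at its earliest slot; a request with slack may be shifted later until it fits under the capacity limit.

```python
from dataclasses import dataclass


@dataclass
class Request:
    earliest_start: int  # first time slot in which the job may start
    duration: int        # number of consecutive slots required
    slack: int           # how many slots the start may be delayed (0 = fixed)
    demand: int          # capacity units needed in every occupied slot


def admit(usage, capacity, req):
    """Reserve the first feasible start slot for `req`.

    `usage` is a list of already reserved capacity per slot and is updated
    in place on success. Returns the chosen start slot, or None if rejected.
    """
    last_start = req.earliest_start + req.slack
    for start in range(req.earliest_start, last_start + 1):
        end = start + req.duration
        if end > len(usage):
            break  # later starts would run past the planning horizon
        if all(usage[t] + req.demand <= capacity for t in range(start, end)):
            for t in range(start, end):
                usage[t] += req.demand
            return start
    return None


usage = [0] * 6                                       # six empty time slots
first = admit(usage, 2, Request(0, 3, 0, 2))          # fills slots 0..2
fixed = admit(usage, 2, Request(1, 2, 0, 1))          # no slack: rejected
flexible = admit(usage, 2, Request(1, 2, 2, 1))       # slack 2: shifted later
```

In this toy run the fixed request is rejected during the peak, while the same request with two slots of slack is admitted at a later start, which is the effect on utilization that the abstract describes.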

44. CROWN Node Server: An Enhanced Grid Service Container Based on GT4 WSRF Core

Author: Tianyu Wo, Hailong Sun, Chunming Hu, and Wantao Liu
Subjects: Service (systems architecture), business.industry, Computer science, Node (networking), Grid, computer.software_genre, Core (game theory), Trustworthiness, Grid computing, Software deployment, Container (abstract data type), Operating system, business, computer, Computer network
Abstract: WSRF core is an open-source implementation of a grid service container in Globus Toolkit 4 (GT4). However, GT4 WSRF core is far from a full-fledged grid service container. Building on the Globus work, we develop CROWN Node Server, an enhanced grid service container. Besides the basic functions of a WSRF grid service container, CROWN Node Server encompasses features including remote and hot service deployment with trustworthiness, monitoring, logging and management, which are of paramount importance for building applications using service grid technologies. CROWN Node Server has been successfully adopted and widely deployed in the CROWN Grid environment to support a wide range of service grid applications. We conduct comprehensive experiments to evaluate the performance of Node Server, and a comparison with GT4 WSRF core is presented as well.
Published: 2006

45. IPR: Automated Interaction Process Reconciliation

Author: Chunming Hu, Jinpeng Huai, Lei Lei, Zongxia Du, and Yunhao Liu
Subjects: Matching (statistics), SIMPLE (military communications protocol), Business process, Computer science, business.industry, Process (engineering), Data mining, Petri net, Software engineering, business, computer.software_genre, Semantic Web, computer
Abstract: Inter-organizational business processes usually require more complex and time-consuming interactions between partners than the simple interactions supported by WSDL. Automated reconciliation is essential to enable dynamic inter-organizational business collaboration. To the best of our knowledge, however, no practical automated reconciliation algorithm is available. In this paper, we propose a practical automated reconciliation algorithm, called IPR (interaction process reconciliation), based on Petri nets, which is able to effectively facilitate dynamic interactions among trading partners in a peer-to-peer fashion. We implement a prototype IPR server in our lab, and evaluate our design by comprehensive experiments. Results show that IPR significantly outperforms existing approaches in terms of matching success rate, response time, and matching efficiency.
Published: 2005

46. Introduction to ChinaGrid Support Platform

Author: Song Wu, Yongwei Wu, Chunming Hu, and Huashan Yu
Subjects: Service (systems architecture), Computer science, business.industry, media_common.quotation_subject, Data management, computer.software_genre, Domain (software engineering), Container (abstract data type), Operating system, Security management, Public service, Architecture, Function (engineering), business, computer, media_common
Abstract: ChinaGrid aims at building a public service system for Chinese education and research. ChinaGrid Support Platform (CGSP) is a grid middleware developed for the construction of ChinaGrid. The function modules of CGSP for system running are Domain Manager, Information Center, Job Manager, Data Manager, Service Container and Security Manager. The development tools for grid constructors and application developers consist of Service Packaging Tool, Job Defining Tool, Portal Constructor and Programming API. The CGSP architecture is introduced first. Then, the CGSP function modules and development tools are described. Finally, the job execution flow in CGSP is presented.
Published: 2005

47. Grid middleware in China

Author: Chunming Hu, Song Wu, Yongwei Wu, and Li Zha
Subjects: Data grid, Database, Computer Networks and Communications, Computer science, business.industry, Interoperability, computer.software_genre, Grid, National Grid, Semantic grid, Grid computing, Middleware (distributed applications), Systems engineering, The Internet, business, computer, Software
Abstract: Grids aim at constructing a virtual single image of heterogeneous resources and providing a uniform interface for distributed internet applications. China has also devoted considerable effort to the evolution of grid projects. Based on the experience of building and enhancing Chinese grids in collaboration with colleagues from around the globe, it is important to choose grid technologies that support and work on a wide variety of resources. Grid middleware research and development in China is described in this paper. First we give an overview of the three government-sponsored grid programmes, namely China National Grid, ChinaGrid and NSFCGrid. Then three representative grid middleware systems, ChinaGrid Support Platform (CGSP), China National Grid Operation System (GOS), and China Research and Development Environment Over Wide-area Network (CROWN), are introduced in detail from six aspects: design motivation, architecture, function modules, main features, interoperability, and current status and applications. Finally, we summarize the main characteristics of these three middleware systems and put forward a comparison among them.
Published: 2007
