Fast VMM-based overlay networking for bridging the cloud and high performance computing

Lei Xia · Zheng Cui · John Lange · Yuan Tang · Peter Dinda · Patrick Bridges

Received: 1 October 2012 / Accepted: 1 May 2013
© Springer Science+Business Media New York 2013

Abstract  A collection of virtual machines (VMs) interconnected with an overlay network with a layer 2 abstraction has proven to be a powerful, unifying abstraction for adaptive distributed and parallel computing on loosely-coupled environments. It is now feasible to allow VMs hosting high performance computing (HPC) applications to seamlessly bridge distributed cloud resources and tightly-coupled supercomputing and cluster resources. However, to achieve the application performance that the tightly-coupled resources are capable of, it is important that the overlay network not introduce significant overhead relative to the native hardware, which is not the case for current user-level tools, including our own existing VNET/U system. In response, we describe the design, implementation, and evaluation of a virtual networking system that has negligible latency and bandwidth overheads in 1–10 Gbps networks. Our system, VNET/P, is directly embedded into our publicly available Palacios virtual machine monitor (VMM). VNET/P achieves native performance on 1 Gbps Ethernet networks and very high performance on 10 Gbps Ethernet networks. The NAS benchmarks generally achieve over 95% of their native performance on both 1 and 10 Gbps. We have further demonstrated that VNET/P can operate successfully over more specialized tightly-coupled networks, such as Infiniband and Cray Gemini. Our results suggest it is feasible to extend a software-based overlay network designed for computing at wide-area scales into tightly-coupled environments.

Keywords  Overlay networks · Virtualization · HPC · Scalability

L. Xia (✉) · P. Dinda
Northwestern University, Evanston, IL, USA

Z. Cui · P. Bridges
University of New Mexico, Albuquerque, NM, USA

J. Lange
University of Pittsburgh, Pittsburgh, PA, USA

Y. Tang
University of Electronic Science and Technology of China, Chengdu, China

1 Introduction

Cloud computing in the infrastructure as a service (IaaS) model has the potential to provide economical and effective on-demand resources for high performance computing. In this model, an application is mapped into a collection of virtual machines (VMs) that are instantiated as needed, and at the scale needed. Indeed, for loosely-coupled applications, this concept has readily moved from research [8, 44] to practice [39]. As we describe in Sect. 3, such systems can also be adaptive, autonomically selecting appropriate mappings of virtual components to physical components to maximize application performance or other objectives. However, tightly-coupled scalable high performance computing (HPC) applications currently remain the purview of resources such as clusters and supercomputers. We seek to extend the adaptive IaaS cloud computing model into these regimes, allowing an application to dynamically span both kinds of environments.

The current limitation of cloud computing systems to loosely-coupled applications is not due to machine virtualization limitations. Current virtual machine monitors (VMMs) and other virtualization mechanisms present negligible overhead for CPU and memory intensive workloads [18, 37]. With VMM-bypass [34] or self-virtualizing devices [41], the overhead for direct access to network devices can also be made negligible. Considerable effort has also gone into achieving low-overhead network virtualization and traffic segregation within an individual data center through extensions or changes to the network hardware layer [12, 25, 38].
While these tools strive to provide uniform performance across a cloud data center (a critical feature for many HPC applications), they do not provide the same features once an application has migrated outside the local data center, spans multiple data centers, or involves HPC resources. Furthermore, they lack compatibility with the more specialized interconnects present on most HPC systems. Beyond the need to support our envisioned computing model across today's and tomorrow's tightly-coupled HPC environments, we note that data center network design and cluster/supercomputer network design seem to be converging [1, 13]. This suggests that future data centers deployed for general purpose cloud computing will become an increasingly better fit for tightly-coupled parallel applications, and therefore such environments could potentially also benefit.

The current limiting factor in the adaptive cloud- and HPC-spanning model described above for tightly-coupled applications is the performance of the virtual networking system. Current adaptive cloud computing systems use software-based overlay networks to carry inter-VM traffic. For example, our VNET/U system, which is described in more detail later, combines a simple networking abstraction within the VMs with location independence, hardware independence, and traffic control. Specifically, it exposes a layer 2 abstraction that lets the user treat his VMs as being on a simple LAN, while allowing the VMs to be migrated seamlessly across resources by routing their traffic through the overlay. By controlling the overlay, the cloud provider or adaptation agent can control the bandwidth and the paths between VMs over which traffic flows. Such systems [43, 49], and others that expose different abstractions to the VMs [56], have been under continuous research and development for several years.
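The overlay routing just described can be sketched as a per-MAC forwarding table that is updated when a VM migrates. This is only an illustrative sketch: the class, method names, and table layout below are assumptions for exposition, not VNET's actual interface.

```python
# Sketch of per-destination-MAC overlay routing. All names here are
# illustrative assumptions, not VNET's real API.

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"

class OverlayRouter:
    """Forward guest Ethernet frames by destination MAC, independent of
    where each VM is physically hosted."""

    def __init__(self):
        # dst MAC -> (host_ip, udp_port) of the overlay endpoint for that VM
        self.table = {}

    def add_route(self, mac, endpoint):
        self.table[mac] = endpoint

    def migrate(self, mac, new_endpoint):
        # VM migration only updates the overlay table; the guest's own
        # network configuration is untouched (location independence).
        self.table[mac] = new_endpoint

    def lookup(self, dst_mac):
        if dst_mac == BROADCAST_MAC:
            # Broadcast frames are flooded to every known endpoint.
            return sorted(set(self.table.values()))
        return [self.table[dst_mac]] if dst_mac in self.table else []

router = OverlayRouter()
router.add_route("52:54:00:00:00:01", ("10.0.0.1", 8192))
router.add_route("52:54:00:00:00:02", ("10.0.0.2", 8192))
assert router.lookup("52:54:00:00:00:02") == [("10.0.0.2", 8192)]

# Migrating VM 2 to another site changes only the overlay route; to the
# guests, both VMs still appear to share one LAN.
router.migrate("52:54:00:00:00:02", ("192.168.1.9", 8192))
assert router.lookup("52:54:00:00:00:02") == [("192.168.1.9", 8192)]
```

The point of the sketch is that the adaptation agent can rewrite this table at runtime, which is exactly the control knob the systems above expose for bandwidth and path management.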
Current virtual networking systems have sufficiently low overhead to effectively host loosely-coupled scalable applications [7], but their performance is insufficient for tightly-coupled applications [40]. In response to this limitation, we have designed, implemented, and evaluated VNET/P, which shares its model and vision with VNET/U, but is designed to achieve near-native performance in the 1 Gbps and 10 Gbps switched networks common in clusters today, as well as to operate effectively on top of even faster networks, such as Infiniband and Cray Gemini. VNET/U and our model are presented in more detail in Sect. 3.

VNET/P is implemented in the context of our publicly available, open source Palacios VMM [30], which is in part designed to support virtualized supercomputing. A detailed description of VNET/P's design and implementation is given in Sect. 4. As a part of Palacios, VNET/P is publicly available. VNET/P could be implemented in other VMMs, and as such provides a proof of concept that overlay-based virtual networking for VMs, with performance overheads low enough to be inconsequential even in a tightly-coupled computing environment, is clearly possible.

The performance evaluation of VNET/P (Sect. 5) shows that it is able to achieve native bandwidth on 1 Gbps Ethernet with a small increase in latency, and very high bandwidth on 10 Gbps Ethernet with a similar, small latency increase. On 10 Gbps hardware, the kernel-level VNET/P system provides on average 10 times the bandwidth and one-seventh the latency of the user-level VNET/U system. In a related paper from our group [5], we describe additional techniques, specifically optimistic interrupts and cut-through forwarding, that bring bandwidth to near-native levels for 10 Gbps Ethernet. Latency increases are predominantly due to the lack of selective interrupt exiting in the current AMD and Intel hardware virtualization extensions.
We expect that latency overheads will be largely ameliorated once such functionality becomes available, or, alternatively, when software approaches such as ELI [11] are used.

Although our core performance evaluation of VNET/P is on 10 Gbps Ethernet, VNET/P can run on top of any device that provides an IP or Ethernet abstraction within the Linux kernel. The portability of VNET/P is also important to consider, as the model we describe above would require it to run on many different hosts. In Sect. 6 we report on preliminary tests of VNET/P running over Infiniband via the IPoIB functionality, and on the Cray Gemini via the IPoG virtual Ethernet interface. Running on these platforms requires few changes to VNET/P, but creates considerable flexibility. In particular, using VNET/P, existing, unmodified VMs running guest OSes with commonplace network stacks can seamlessly run on top of such diverse hardware. To the guest, a complex network of commodity and high-end networks looks like a simple Ethernet network.

Also in Sect. 6, we describe a version of VNET/P that has been designed for use with the Kitten lightweight kernel as its host OS. Kitten is quite different from Linux; indeed, the combination of Palacios and Kitten is akin to a type-I (unhosted) VMM, resulting in a different VNET/P architecture. This system and its performance provide evidence that the VNET/P model can be successfully brought to different host/VMM environments.

Our contributions are as follows:

- We articulate the benefits of extending virtual networking for VMs down to clusters and supercomputers with high performance networks. These benefits are also applicable to data centers that support IaaS cloud computing.
- We describe the design and implementation of a virtual networking system, VNET/P, that does so. The design could be applied to other VMMs and virtual network systems.
- We perform an extensive evaluation of VNET/P on 1 and 10 Gbps Ethernet networks, finding that it provides performance with negligible overheads on the former, and manageable overheads on the latter. VNET/P generally has little impact on performance for the NAS benchmarks.
- We describe our experiences with running the VNET/P implementation on Infiniband and Cray Gemini networks. VNET/P allows guests with commodity software stacks to leverage these networks.
- We describe the design, implementation, and evaluation of a version of VNET/P for lightweight kernel hosts, particularly the Kitten LWK.

Through the use of low-overhead overlay-based virtual networking in high-bandwidth, low-latency environments such as current clusters and supercomputers, and future data centers, we seek to make it practical to use virtual networking at all times, even when running tightly-coupled applications on such high-end environments. This would allow us to seamlessly and practically extend the already highly effective adaptive virtualization-based IaaS cloud computing model to such environments.

This paper is an extended version of a previous conference publication [57]. Compared to the conference paper, it provides an extended presentation of the core design aspects of VNET/P as well as descriptions of implementations of VNET/P for Infiniband and Cray Gemini. It also includes initial performance evaluations on these platforms.

2 Related work

VNET/P is related to NIC virtualization, overlays, and virtual networks, as we describe below.

NIC virtualization: There is a wide range of work on providing VMs with fast access to networking hardware, where no overlay is involved. For example, VMware and Xen support either an emulated register-level interface [47] or a paravirtualized interface to the guest operating system [36]. While purely software-based virtualized network interfaces have high overhead, many techniques have been proposed to support simultaneous, direct-access network I/O.
For example, some work [34, 41] has demonstrated the use of self-virtualized network hardware that allows direct guest access, thus providing high performance to untrusted guests. Willmann et al. have developed a software approach that also supports concurrent, direct network access by untrusted guest operating systems [45]. In addition, VPIO [59] can be applied to network virtualization to allow virtual passthrough I/O on non-self-virtualized hardware. Virtual WiFi [58] is an approach to provide the guest with access to wireless networks, including functionality specific to wireless NICs. In contrast with such work, VNET/P provides fast access to an overlay network, which includes encapsulation and routing. It makes a set of VMs appear to be on the same local Ethernet regardless of their location anywhere in the world and their underlying hardware. Our work shows that this capability can be achieved without significantly compromising performance when the VMs happen to be very close together.

Overlay networks: Overlay networks implement extended network functionality on top of physical infrastructure, for example to provide resilient routing (e.g., [3]), multicast (e.g., [17]), and distributed data structures (e.g., [46]) without any cooperation from the network core; overlay networks use end-systems to provide their functionality. VNET is an example of a specific class of overlay networks, namely virtual networks, discussed next.

Virtual networking: Virtual networking systems provide a service model that is compatible with an existing layer 2 or 3 networking standard. Examples include VIOLIN [21], ViNe [52], VINI [4], SoftUDC VNET [23], OCALA [22], WoW [10], and the emerging VXLAN standard [35]. Like VNET, VIOLIN, SoftUDC, WoW, and VXLAN are specifically designed for use with virtual machines.
Of these, VIOLIN is closest to VNET (and contemporaneous with VNET/U), in that it allows for the dynamic setup of an arbitrary private layer 2 and layer 3 virtual network among VMs. The key contribution of VNET/P is to show that this model can be made to work with minimal overhead even in extremely low latency, high bandwidth environments.

Connections: VNET/P could itself leverage some of the related work described above. For example, effective NIC virtualization might allow us to push encapsulation directly into the guest, or to accelerate encapsulation via a split scatter/gather map. Mapping unencapsulated links to VLANs would enhance performance on environments that support them. There are many options for implementing virtual networking, and the appropriate choice depends on the hardware and network policies of the target environment. In VNET/P, we make the choice of minimizing these dependencies.

3 VNET model and VNET/U

The VNET model was originally designed to support adaptive computing on distributed virtualized computing resources within the Virtuoso system [6], and in particular to support the adaptive execution of a distributed or parallel computation executing in a collection of VMs potentially spread across multiple providers or supercomputing sites. The key requirements, which also hold for the present paper, were as follows:

- VNET would make within-VM network configuration the sole responsibility of the VM owner.
- VNET would provide location independence to VMs, allowing them to be migrated between networks and from site to site, while maintaining their connectivity, without requiring any within-VM configuration changes.
- VNET would provide hardware independence to VMs, allowing them to use diverse networking hardware without requiring the installation of specialized software.
- VNET would provide minimal overhead, compared to native networking, in the contexts in which it is used.
The VNET model meets these requirements by carrying the user's VM traffic via a configurable overlay network. The overlay presents a simple layer 2 networking abstraction: a user's VMs appear to be attached to the user's local area Ethernet network, regardless of their actual locations or the complexity of the VNET topology/properties. Further information about the model can be found elsewhere [49].

The VNET overlay is dynamically reconfigurable, and can act as a locus of activity for an adaptive system such as Virtuoso. Focusing on parallel and distributed applications running in loosely-coupled virtualized distributed environments (e.g., IaaS clouds), we demonstrated that the VNET layer can be effectively used to:

1. monitor application communication and computation behavior [14, 15],
2. monitor underlying network behavior [16],
3. formulate performance optimization problems [48, 51], and
4. address such problems through VM migration and overlay network control [50], scheduling [32, 33], network reservations [31], and network service interposition [27].

These and other features that can be implemented within the VNET model have only marginal utility if carrying traffic via the VNET overlay has significant overhead compared to the underlying native network.

The VNET/P system described in this paper is compatible with, and compared to, our previous VNET implementation, VNET/U. Both support a dynamically configurable general overlay topology with dynamically configurable routing on a per MAC address basis. The topology and routing configuration is subject to global or distributed control (for example, by the VADAPT [50] part of Virtuoso). The overlay carries Ethernet packets encapsulated in UDP packets, TCP streams with and without SSL encryption, TOR privacy-preserving streams, and others.
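The UDP transport mentioned above, in which whole Ethernet frames travel as datagram payloads, can be sketched as follows. The minimal two-byte length header used here is an illustrative assumption for the sketch, not VNET's actual wire format.

```python
import struct

# Illustrative sketch of Ethernet-in-UDP encapsulation. The tunnel header
# layout is an assumption, not VNET's real wire format.

def build_frame(dst_mac, src_mac, ethertype, payload):
    """Assemble a raw Ethernet frame: 6B dst MAC, 6B src MAC, 2B ethertype,
    then the payload bytes."""
    return dst_mac + src_mac + struct.pack("!H", ethertype) + payload

def encapsulate(frame):
    """Prefix the frame with a minimal tunnel header (length only) so it can
    travel as the payload of an ordinary UDP datagram."""
    return struct.pack("!H", len(frame)) + frame

def decapsulate(datagram):
    """Recover the original Ethernet frame from a tunnel datagram."""
    (length,) = struct.unpack("!H", datagram[:2])
    return datagram[2:2 + length]

frame = build_frame(bytes.fromhex("525400000001"),   # dst MAC
                    bytes.fromhex("525400000002"),   # src MAC
                    0x0800,                          # ethertype: IPv4
                    b"guest packet bytes")
datagram = encapsulate(frame)            # would be sent via a UDP socket
assert decapsulate(datagram) == frame    # receiver recovers the frame intact
```

Because the guest's frame crosses the wide area untouched, the receiving end can hand it to the destination VM's virtual NIC as if it had arrived on a local Ethernet segment, which is what makes the layer 2 illusion location-independent.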
Because Ethernet packets are used, the VNET abstraction can also easily interface directly with most commodity network devices, including virtual NICs exposed by VMMs in the host, and with fast virtual devices (e.g., Linux virtio network devices) in guests.

While VNET/P is implemented within the VMM, VNET/U is implemented as a user-level system. As a user-level system, it readily interfaces with VMMs such as VMware Server and Xen, and requires no host changes to be used, making it very easy for a provider to bring it up on a new machine. Further, it is easy to bring up VNET daemons when and where needed to act as proxies or waypoints. A VNET daemon has a control port which speaks a control language for dynamic configuration. A collection of tools allows for the wholesale construction and teardown of VNET topologies, as well as dynamic adaptation of the topology and forwarding rules to the observed traffic and conditions on the underlying network.

The last reported measurement of VNET/U showed it achieving 21.5 MB/s (172 Mbps) with a 1 ms latency overhead communicating between Linux 2.6 VMs running in VMware Server GSX 2.5 on machines with dual 2.0 GHz Xeon processors [27]. A current measurement, described in Sect. 5, shows 71 MB/s with a 0.88 ms latency. VNET/U's speeds are sufficient for its purpose of providing virtual networking for wide-area and/or loosely-coupled distributed computing. They are not, however, sufficient for use within a cluster at gigabit or greater speeds. Making this basic VM-to-VM path competitive with hardware is the focus of this paper. VNET/U is fundamentally limited by the kernel/user space transitions needed to handle a guest's packet send or receive. In VNET/P, we move VNET directly into the VMM to avoid such transitions.

4 Design and implementation

We now describe how VNET/P has been architected and implemented in the context of Palacios as embedded in a Linux host.
Section 6.3 describes how VNET/P is implemented in the context of a Kitten embedding. The nature of the embedding affects VNET/P primarily in how it interfaces to the underlying networking hardware and networking stack. In the Linux embedding, this interface is accomplished directly in the Linux kernel. In the Kitten embedding, the interface is done via a service VM.

4.1 Palacios VMM

VNET/P is implemented in the context of our Palacios VMM. Palacios is an OS-independent, open source, BSD-licensed, publicly available embeddable VMM designed as part of the V3VEE project. The V3VEE project is a collaborative community resource development project involving Northwestern University, the University of New Mexico, Sandia National Labs, and Oak Ridge National Lab. Detailed information about Palacios can be found elsewhere [28, 30]. Palacios is capable of virtualizing at large scale (4096+ nodes) with 5% overheads [29]. Palacios's OS-agnostic design allows it to be embedded into a wide range of different OS architectures.

Fig. 1  VNET/P architecture

The Palacios implementation is built on the