DATA LOGISTIC TOOLKIT (DLT)
The escalation of data intensive research, and the proliferation of highly distributed collaboration in nearly every field of science have combined to make data management and movement increasingly problematic. The goal of this project was the production of an easy-to-install, easy-to-use software system for data logistics, called the Data Logistics Toolkit (DLT). Logistics is defined as “the time sensitive positioning of goods and services” or the “detailed coordination of a complex operation involving many people, facilities, or supplies”. The basis of our data logistics approach is to consider network-centric cyberinfrastructure in which the “goods” are data. As such, we have developed models and software infrastructure for the coordination, positioning and distribution of data for various applications, packaged in a set of building blocks that comprise the DLT.
Various data logistics software components were integrated to become the DLT and the system was packaged for distribution in "shrink-wrapped" form. The Toolkit has guided installers and can configure itself to be usable as a platform for high-performance data sharing over high-speed research and education networks, as well as providing components that can be leveraged by other science applications and frameworks. Instances of the Toolkit will boot, automatically configure themselves, register themselves with a global discovery service, and become immediately useable as logistical data resources.
The Data Logistics Toolkit (DLT) combines software technologies for shared storage, network monitoring, enhanced control signaling and efficient use of dynamically allocated circuits. Its main components are network storage server (“depot") technology based on the Internet Backplane Protocol (IBP), services and libraries to provide data and file services, perfSONAR-based services for system and network performance measurement and monitoring, a resource discovery and configuration service, services and libraries to provide data and file services, and Phoebus for optimizing the use of network resources in wide-area data transfers.
DLT-enabled services (e.g. a global "drop box" for research that supports automatic data positioning) enable the automation and optimization of the timely placement of data across a wide range of scenarios, using policy-controlled sharing, replication, caching, control loop optimization and overlay multicast. Offering familiar user and programming interfaces, and backed by a solid but scalable security framework, the DLT transforms global data-intensive collaboration by providing a generic interface to an essential logistical resource - storage - and by giving campuses with DLT-enabled nodes the network awareness and adaptivity needed to maximize the productivity of their wide area network connections for dealing with distributed data and computing resources.
Data Logistics Toolkit components
The Internet Backplane Protocol / REDDnet
The Internet Backplane Protocol (IBP) is middleware for managing and using remote data. It provides persistent storage with buffer duration from minutes to months. It implements "working storage" for use in data-centric distributed applications. It enables network logistics (global scheduling and optimization of data movement, storage and computation in a model that takes into account all the underlying physical resources) in large scale, distributed systems and applications. IBP makes it possible for applications of all kinds to use logistical networking to exploit data locality and effectively manage buffer resources.
The IBP Storage Service is a minimal, generic, best-effort service that allocates, reads, writes and manages storage in variable-sized chunks known as allocations. The server that implements the IBP service is called a depot. In order to separate access policy from the mechanism that implements it, access to an allocation is governed by three independent keys or capabilities, which are long, random numbers: the read, write and manage keys. An allocation has no externally visible address or name which it can be referred to by IBP clients other than these three capabilities, which are returned by the depot as the result of a successful allocate operation.
The design of IBP is modeled after the Internet Protocol (IP), in that it implements the basic functionality of data storage in the most general way possible, enabling a variety of higher level services to make use of this basic functionality to share storage resources in an interoperable manner. The objective is for a community that is provisioned with shared storage on the IBP model to be able to use it for a wide variety of purposes related to the time-sensitive movement and placement of data, obviating the need to deploy a separate storage infrastructure for each class of applications. IBP storage allocations take the form of a renewable, time-limited lease, with the maximum duration settable by the depot operator, along with the maximum size of an allocation. This serves the purpose of enabling the depot to force the release of an allocation by denying a request for lease renewal.
The IBP Depot
- A depot can be one storage device or a collection located in one physical location.
- When a file is uploaded, it is split into "slices" of a fixed size that can be specified.
- Slices can all be put on one depot or can be spread out across several, and this can be user specified or policy driven.
- Slices are stored with an expiration date set by the application or the depot.
- An exNode stores file metadata, including slice location.
- IBP services are implemented by the IBP Server.
The exNode and LoRS
The exNode (modeled on the Unix inode) is the data structure used in the components of the DLT to describe the aggregation of IBP allocations into a file-like unit of storage. The exNode implements many, but not all, of the attributes that are associated with a file, leaving some, such as naming and permissions, to a directory service. While the exNode omits some file attributes, it also implements file-like functionality with greater generality than a typical file descriptor. For example, the exNode can express wide-area distribution of data replicas, whereas file systems are typically restricted to a local or enterprise network. Further, exNodes have an XML or JSON serialization that allows them to be expressed and transported in a portable form and operated on by a variety of data management operations and directory services.
The Logistical Runtime System (LoRS) library implements higher level operations that the IBP client can perform on exNodes and storage allocations. These operations (in particular upload, download and augment) have the effect of moving data stored to, from and between multiple allocations on sets of depots at. Services such as robustness through redundancy, and performance through dynamic scheduling are implemented in the LoRS library.
Policy Engines for Replication / Movement
The DLT provides support for policy and management engines including L-Store, LoDN and the IDMS. Each of these systems provides a flexible logistical storage framework for distributed, scalable, and secure access to data for a wide spectrum of users. They can replicate and/or migrate files based on various policies. Replication information is stored in same exNode as the original file and can be used to leverage proximity by choosing copies of allocations that are closer. If a depot (or its network) goes down, a user can still access the file by reading other available copies of allocations. The DLT supports policy driven replication – one can set replication and movement policies for a directory, sub-directory, or file. One use of the model is to upload to a depot that is close to a remote computing resource and then have the file moved cross-country to a user’s university. Another is to replicate the data, placing copies on depots near several collaborators. All the replication and movement occurs in the background, and is automated based on policy. The file can be accessed during the process of replication.
The directory service is responsible for storing and managing exNodes, associating them with administrative metadata such as a name in a hierarchical namespace, a concept of ownership by a specific user and/or group, and goals or policies as to where replicas should be placed. Because of the possibility of time limits on IBP allocation leases and the dynamic nature of distributed storage infrastructures, the DLT must be somewhat more active than a conventional directory service. For example, the dynamic management of exNodes includes making periodic requests for the renewal of allocation leases and making new allocations and moving data between allocations in order to take account of allocations that fail, due to depot or network problems or to non-renewal of a lease.
L-Store, the Logistical Storage system, is targeted at high performance data warehousing, or archiving. L-Store deployments are generally large storage clusters with high-performance networking support. The L-Store deployment at Vanderbilt is capable of using significant amounts of Vanderbilt’s CC-NIE supported 100G network. L-Store utilizes additional metadata to manage slicing and sharding of storage to balance load for maximum parallelism and maximum performance.
LoDN, the Logistical Depot Network, is implemented in two parts: a back-end server that is accessed programmatically by clients using the LoDN library, and a Web-based front end that exposes an interface similar to a typical FTP client. In addition to the conventional file management operations, the Web interface includes high performance parallel upload and download clients in Java. LoDN is designed to function as a set of distributed and replicated services to ensure its scalability. This includes distributing and replicating content and directory metadata across both local sites and in the wide area to provide the level of performance and fault tolerance necessary for these scales. LoDN was successfully used to multicast HD video over the Internet2 network, as well as to sustain coordinated data movement between Internet2 sites using large fractions of the available network capacity.
The Intelligent Data Movement Service (IDMS) was developed as part of this project with features designed for use in dynamic environments like the NSF-supported Global Environment for Network Innovation (GENI.) The IDMS provides functionality similar to LoDN with additional extensibility of data distribution policies. The IDMS is able to dynamically allocate new depots by allocating virtual machines on the fly, providing elastic storage services based on demand and network availability. In addition, the IDMS dispatcher enables rich, user-defined policies for prefetching and positioning data based on the content expressed in exNode metadata.
perfSONAR and Periscope
To help improve the system’s performance and to enhance network awareness of DLT, it also includes perfSONAR ("Performance focused Service Oriented Network monitoring ARchitecture"), a collection of software packages used to monitor, measure, and characterize networks that is widely deployed in Research and Education (R&E) networks worldwide. It enables the DLT data movement/replication tools to adapt dynamically and automatically to network topology and conditions.
perfSONAR is an infrastructure for network performance monitoring, facilitating the ability to solve end-to-end performance problems on paths crossing several networks and to enable network-aware applications. It contains a set of services delivering performance measurements in a federated environment. These services act as an intermediate layer between the performance measurements tools and the diagnostic or visualization applications. This layer is aimed at making and exchanging performance measurements between networks, using well-defined protocols. It allows for the easy retrieval of the same metrics from multiple vantage points and administrative domains.
Periscope is a perfSONAR derived system that is designed to work alongside and augment perfSONAR installations. BLiPP is a tool that runs on DLT hosts that collects various host and network data. It includes provisions for automated, remote configuration via the directory service infrastructure. In addition, Periscope features HELM, a system for coordinating and scheduling measurement activities to eliminate conflicting measurements and measurement overlap. The Measurement Store (MS) is a service for storing measurement data. It holds timestamp, value pairs that are associated with a metadata id, which refers to a metadata object in the directory service, which describes the context of the measurement and the measured entity. These tools monitor network and end systems for healthy operation, but also provides the traffic-sensitive network “roadmap” necessary to optimize the logistics of data distribution.
UNIS (Unified Network Information Service)
The Unified Network Information Service, UNIS, provides service discovery and metadata, describing and enabling the discovery of IBP depots, perfSONAR and Periscope services as well as Phoebus gateways. It is a distributed service with a REST interface which can hold a variety of information about a network. This includes information about the topology of the network including hosts, switches, routers, network links, ports, etc. In addition, UNIS stores and serves information about services running in the network, including things like instances of BLiPP, measurement stores, and perfSONAR measurement points. UNIS stores measurement metadata and information about the “performance topology” of the network that is derived from ongoing measurements. Finally, UNIS stores DLT exNodes describing the data distribution of file objects.
Phoebus is a network optimization "inlay" used to improve data transfer throughput and to optimize use of network resources. Phoebus is based on the eXtensible Session Protocol (XSP) and implements “store and forward" data relaying along the end-to-end path for improved throughput. The Phoebus system can dramatically improve end-to-end throughput by forwarding data via intermediaries, called Phoebus Gateways (PGs), placed at strategic locations in the network. Phoebus acts to reduce the negative effects of TCP In long-distance, high performance network connections and can also perform protocol translation, using better performing protocols over backbone links.
Phoebus supports transparent dynamic network resource allocation and can serve as an on-ramp to dynamic networks like Internet2’s AL2S. Phoebus embeds "intelligence'' in the network that allows a connection to become articulated and adapt to the environment on a segment by segment basis. The system includes a protocol and software infrastructure that addresses many of the fundamental issues in long distance data movement and allows the Internet infrastructure to evolve. The integration of perfSONAR and Phoebus enable us to automatically tune protocol and network settings and to dynamically rebalance active data flows or seek alternate paths to maximize throughput, allowing overlay multicast and streaming for massive data flows to be scheduled along highly efficient paths.