Data Services Newsletter

Volume 7 : No 1 : March 2005

Uniform Product Distribution System (UPDS) Overview

The Uniform Product Distribution System (UPDS) provides a single system to manage the contribution, archiving, searching, and access of USArray Informational and Data Products. While targeting USArray products, the system will be able to manage products from other components of EarthScope, provided that those products conform to the UPDS requirements and have a compatible XML Schema. The scope and intent of the UPDS is to provide the mechanisms to submit products to an archive site, to query archive sites for available products by selected metadata, and to access or download selected products.

In the system, Producers create and submit XML-encoded Products to the UPDS Archive, and Consumers query the system to find what Products are available, refine their searches as necessary, and then access (download) their selected Products. Consumers will be able to access the system in several ways, including interactively through the Web and with stand-alone programs that communicate over the Internet using Web Services and the UPDS Application Programming Interface (API).

Figure 1: UPDS XML Message.

A UPDS XML document consists of a base XML element containing the specific product data, wrapped in an element containing generalized common product metadata, which is in turn wrapped in system-specific XML. The product-specific information and structure will be determined by the Producer and the DMC data product analyst, using available standards where possible. The generalized product metadata is a combination of auto-generated and producer-supplied information.
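The three-layer nesting described above can be illustrated with a short Python sketch. The element names (`UPDSMessage`, `ProductMetadata`, `EventSummary`) and the metadata fields are hypothetical placeholders chosen for illustration; the actual names would be defined by the UPDS XML Schemas.

```python
import xml.etree.ElementTree as ET


def wrap_product(product: ET.Element, metadata: dict[str, str]) -> ET.Element:
    """Wrap product-specific XML in a metadata layer and a system layer."""
    # Outermost layer: system-specific XML (hypothetical element name).
    message = ET.Element("UPDSMessage")
    # Middle layer: generalized common product metadata (hypothetical name).
    wrapper = ET.SubElement(message, "ProductMetadata")
    for name, value in metadata.items():
        ET.SubElement(wrapper, name).text = value
    # Innermost layer: the base element holding the product data.
    wrapper.append(product)
    return message


# Hypothetical product with a single data field.
product = ET.Element("EventSummary")
ET.SubElement(product, "Magnitude").text = "5.4"

message = wrap_product(
    product,
    {"ProductType": "EventSummary", "Producer": "ExampleProducer"},
)
xml_text = ET.tostring(message, encoding="unicode")
```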

UPDS will be structured such that new and as-yet unforeseen Product Types can be added to the Archive with minimal effort. New Product Types will have to be registered with the system and will likely require some manual configuration and/or (hopefully minimal) additional development. The system will also support the evolution of the definition of a Product Type, and therefore the evolution of the Product’s XML Schema, over time. In addition, the system will support Product instance versioning, i.e., updating or replacing a submitted product with a newer, revised instance.
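As a rough illustration of Product instance versioning, the following Python sketch keeps every revision of a product in memory and returns the newest one on request. The class and method names are hypothetical, and retaining older revisions (rather than overwriting them) is an assumption; the article does not say which policy UPDS will adopt.

```python
class VersionedArchive:
    """Toy in-memory archive that keeps every revision of a product instance."""

    def __init__(self) -> None:
        # Maps (product type, product id) to the list of submitted revisions.
        self._versions: dict[tuple[str, str], list[str]] = {}

    def submit(self, product_type: str, product_id: str, xml_text: str) -> int:
        """Store a new revision and return its 1-based version number."""
        revisions = self._versions.setdefault((product_type, product_id), [])
        revisions.append(xml_text)
        return len(revisions)

    def latest(self, product_type: str, product_id: str) -> str:
        """Return the most recent revision; raises KeyError if none exists."""
        return self._versions[(product_type, product_id)][-1]


archive = VersionedArchive()
v1 = archive.submit("EventSummary", "evt-001", "<EventSummary rev='a'/>")
v2 = archive.submit("EventSummary", "evt-001", "<EventSummary rev='b'/>")
current = archive.latest("EventSummary", "evt-001")
```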

To submit a new instance of a Product, a Producer simply runs a small client application that transfers the XML-encoded Product to the UPDS Archive system. Optionally, a monitoring application (daemon) running at the Producer site can be set up to watch for new Product instances and automatically execute the submission process. The client application wraps the Product XML document in a UPDS message envelope and sends that message to a UPDS Ingestion Server running at an Archive site. The server extracts searchable metadata fields into the UPDS metadata database and stores the Product instance in the local Archive.
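The ingestion step can be sketched in Python as follows, using an in-memory SQLite database to stand in for the UPDS metadata database. The envelope element names, the searchable field names, and the table layout are all assumptions made for illustration, not the actual UPDS design.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical searchable metadata fields.
SEARCHABLE_FIELDS = ("ProductType", "Producer", "Created")


def ingest(conn: sqlite3.Connection, message_xml: str) -> int:
    """Extract searchable metadata, store the document, return the row id."""
    root = ET.fromstring(message_xml)
    meta = root.find("ProductMetadata")  # hypothetical element name
    if meta is None:
        raise ValueError("message has no ProductMetadata element")
    values = [meta.findtext(field, default="") for field in SEARCHABLE_FIELDS]
    cur = conn.execute(
        "INSERT INTO products (product_type, producer, created, document) "
        "VALUES (?, ?, ?, ?)",
        (*values, message_xml),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, product_type TEXT, "
    "producer TEXT, created TEXT, document TEXT)"
)

sample = (
    "<UPDSMessage><ProductMetadata>"
    "<ProductType>EventSummary</ProductType>"
    "<Producer>ExampleProducer</Producer>"
    "<Created>2005-03-01</Created>"
    "<EventSummary><Magnitude>5.4</Magnitude></EventSummary>"
    "</ProductMetadata></UPDSMessage>"
)
row_id = ingest(conn, sample)
```

In this sketch the full message is stored alongside its extracted fields, so a later query on metadata can return the original document unchanged.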

The XML encoding for a Product must conform to the accepted format for that Product Type as defined in an XML Schema document. Any steps required to convert an existing Product format to XML are outside the scope of the UPDS, although the monitoring daemon may provide hooks to run the 2xml conversion step. The USArray Data Product Analyst will assist with the development of the XML Schema specification (XML encoding) for new Product Types, and may be able to help develop code to convert existing Product formats into the XML-encoded representations (the 2xml conversion step).
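To give a sense of what a 2xml conversion step might look like, the Python sketch below converts a simple "key = value" text file into XML. The input format, the function name, and the element names are invented for illustration; real Product formats would need their own converters, and the result would still have to be validated against the Product Type's XML Schema.

```python
import xml.etree.ElementTree as ET


def keyvalue_to_xml(text: str, root_tag: str) -> ET.Element:
    """Convert 'key = value' lines into child elements of a root element."""
    root = ET.Element(root_tag)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'key = value', got: {line!r}")
        # Assumes each key is already a valid XML element name.
        ET.SubElement(root, key.strip()).text = value.strip()
    return root


legacy = """
# hypothetical legacy product file
Station = ABC
Latitude = 47.6
"""
converted = keyvalue_to_xml(legacy, "StationInfo")
converted_text = ET.tostring(converted, encoding="unicode")
```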

The UPDS Web Interface provides an interactive method for searching and accessing the archived Products. Web forms will be presented that will allow users to browse and query the archived Products by their associated metadata. The forms will be customized for each Product type but may necessarily be limited in the degree to which users can query.

Figure 2: UPDS Component Structure.

An Application Programming Interface (API) will be defined that will allow client programs to be written that can access the UPDS system. This interface will support the development of interactive GUI applications, command-line clients suitable for use in scripted environments, and background monitoring processes such as Standing Order-type applications. Applications written to the API will likely support more advanced querying and access patterns than will be available through the Web interface. It is expected that this interface will be built using Web Service technologies, provided that the performance of those technologies is adequate.
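A Standing Order-type client could be structured roughly as in the Python sketch below: it repeatedly runs the same metadata query and reports only Products it has not seen before. The `StandingOrder` class, the record fields, and the in-memory `catalog` that stands in for the archive are all hypothetical; a real client would call the UPDS API instead of `search_catalog`.

```python
from typing import Callable, Iterable

Record = dict[str, str]


class StandingOrder:
    """Re-runs a metadata query and yields only newly available products."""

    def __init__(
        self,
        search: Callable[[Record], Iterable[Record]],
        criteria: Record,
    ) -> None:
        self._search = search
        self._criteria = criteria
        self._seen: set[str] = set()

    def poll(self) -> list[Record]:
        """Return matching products not delivered by an earlier poll."""
        fresh = [
            r for r in self._search(self._criteria) if r["id"] not in self._seen
        ]
        self._seen.update(r["id"] for r in fresh)
        return fresh


# Stand-in for the archive's query interface (hypothetical).
catalog: list[Record] = [{"id": "p1", "ProductType": "EventSummary"}]


def search_catalog(criteria: Record) -> list[Record]:
    """Return catalog records whose fields equal every criterion."""
    return [r for r in catalog if all(r.get(k) == v for k, v in criteria.items())]


order = StandingOrder(search_catalog, {"ProductType": "EventSummary"})
first = order.poll()
catalog.append({"id": "p2", "ProductType": "EventSummary"})
second = order.poll()
```

Passing the search function in as a parameter keeps the polling logic independent of how the archive is actually reached, whether through a Web Service or some other transport.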

The system will allow numerous Producers, although the total number is not expected to be very high. New Producers must register with the system. This may involve a manual step, perhaps to validate or assign the Producer’s credentials within the system, but it is hoped that registration will be largely automatic. The USArray Data Product Analyst will be available to assist with this process.

Figure 3: Multiple independent archives and producers.

UPDS will be structured to support multiple independent Archives. The level at which these independent Archives cooperate is still to be determined. That is, will there be a centralized access point, independent access, or some combination thereof? Will there be a unified catalog of available Products? These and other questions relating to multiple independent archives will be addressed during the course of development. The initial implementation, however, will focus on a single Archive.

by Linus Kamb (IRIS Data Management Center)
